Decoding motor responses from the EEG during altered states of consciousness induced by propofol

Objective. Patients undergoing general anesthesia may awaken and become aware of the surgical procedure. Due to neuromuscular blocking agents, patients could be conscious yet unable to move. Using brain–computer interface (BCI) technology, it may be possible to detect movement attempts from the EEG. However, it is unknown how an anesthetic influences the brain response to motor tasks. Approach. We tested the offline classification performance of a movement-based BCI in 12 healthy subjects at two effect-site concentrations of propofol. For each subject a second classifier was trained on the subject’s data obtained before sedation, then tested on the data obtained during sedation (‘transfer classification’). Main results. At concentration 0.5 μg ml−1, despite an overall propofol EEG effect, the mean single trial classification accuracy was 85% (95% CI 81%–89%), and 83% (79%–88%) for the transfer classification. At 1.0 μg ml−1, the accuracies were 81% (76%–86%), and 72% (66%–79%), respectively. At the highest propofol concentration for four subjects, unlike the remaining subjects, the movement-related brain response had been largely diminished, and the transfer classification accuracy was not significantly above chance. These subjects showed a slower and more erratic task response, indicating an altered state of consciousness distinct from that of the other subjects. Significance. The results show the potential of using a BCI to detect intra-operative awareness and justify further development of this paradigm. At the same time, the relationship between motor responses and consciousness and its clinical relevance for intraoperative awareness requires further investigation.

(e.g. in [1]). Patients under general anesthesia are often temporarily paralysed with a neuromuscular blocking agent. If they awake during surgery, they may find themselves in a situation where they have a certain degree of consciousness but are nevertheless unable to move or speak. This experience, known as 'unintended awareness with postoperative explicit recall', has an estimated incidence of 0.1%-0.2% [2]. Although several (commercial) monitors of anesthetic depth have been developed, they are not often used [3], which may be due to concerns about their reliability [4][5][6][7]. Finding accurate methods for detecting awareness in patients is thus still an ongoing challenge within anaesthesia research. Therefore we propose to extend BCI research into the domain of anesthesia awareness.
One of the best known and most successful BCI paradigms is detection of changes in sensorimotor rhythms from the EEG during attempted and imagined movement [8]. For instance, it has been shown that attempted movements can be detected from patients with tetraplegia [9]. Likewise, motor tasks may be used as a diagnostic tool in determining states of altered consciousness in patients recovering from coma [10]. This evidence shows the potential of using patients' movement attempts during general anesthesia as an indicator of awareness. Intentions of movement could replace or complement the features currently used in anesthesia monitoring, such as entropy or bispectral analysis [11].
Our proposed BCI paradigm proved successful in awake volunteers intending gross movement [12] and also in awake volunteers trying to move one isolated forearm temporarily paralyzed by a neuromuscular blocking agent [13]. However, hypnotics are known to change EEG characteristics [14,15]. Sensorimotor rhythm modulations normally occurring when a person is engaged in a motor task may be altered or disappear altogether.
In this study we therefore investigated the influence of low doses of propofol on sensorimotor rhythms. Healthy participants performed a motor task in a baseline state as well as in altered states of consciousness induced by propofol. For each state, offline classification accuracies of movement as compared to rest were determined. If the specific brain response normally seen during movement is retained after administration of hypnotic drugs, it may be used for BCI-based communication.

Participants
Twelve right-handed healthy volunteers (aged 18-28, 5 females) participated in this study. None had any known neurological or motor impairments, nor contraindications for the use of propofol. All participants gave written informed consent prior to the experiment. Measurements took place in an operating room at the Radboud University Medical Centre in Nijmegen, the Netherlands.

Experimental design
All procedures were in accordance with the Declaration of Helsinki and were approved by the local Medical Ethics Committee.
The experiment consisted of three blocks. The first experimental block was a baseline block in which the subjects performed the movement tasks without administration of propofol (block 0). In the subsequent blocks, propofol was administered via a target controlled infusion (TCI) pump in steps of 0.5 μg ml−1 (target concentration). Thus, the target concentration was 0.5 μg ml−1 for the second block (block 0.5) and 1.0 μg ml−1 for the third block (block 1.0). An Alaris PK infusion pump (Carefusion, Basingstoke, UK) was used in the TCI mode (Schnider model, effect-site targeting). When the target concentration had been reached, participants waited for another 10 min before proceeding with the experiment, to ensure equilibration between the body compartments. Only subjects S1, S2 and S3 received an additional dose increase, to a target concentration of 1.5 μg ml−1. During the entire procedure heart rate, blood pressure and oxygen saturation (pulse oximetry) were monitored. After the end of the experiment participants remained in the OR complex until they were fully recovered.
In each block, sequences of nine movement trials were presented to the subjects. Each trial consisted of an auditory 3 s cue, with a 4 s silence interval between consecutive trials. At the start of each sequence, an auditory instruction was given explaining the task for the upcoming trials: either 'move' (continuous hand tapping) or 'do not move'. The participants had to perform the instructed task during the auditory cues, and rest during the silence intervals (figure 1). Participants were asked to keep their eyes closed throughout the entire sequence. Between sequences participants could have a short rest, then start the next sequence by pressing a button. Per block, between 54 and 63 trials were presented for each of the two task conditions. Within each block, presentation of the sequences was randomized. A short practice block to get the participants acquainted with the task preceded the actual measurements.
The experiment was programmed in and run on the BrainStream platform (version 1.0), a Matlab (MathWorks Inc., MA, USA) toolbox developed especially for online BCI experiments, using Psychtoolbox for stimulus presentation.

EEG, EMG and BIS recording and analyses
EEG was recorded with a 32-channel actiCAP system (Brain Products), based on the international 10/20 system. Impedances were kept below 25 kΩ before starting the measurement, and the sampling rate during recording was 2500 Hz. After recording, signals were downsampled to 128 Hz.
Two electrodes were removed from the EEG cap and instead used to record the left forearm electromyogram (EMG). Muscle outputs as recorded by EMG were used to determine if and when participants had executed the wrong task. EMG signals were rereferenced using a bipolar reference for the two channels and high-pass filtered at 10 Hz to reduce the effect of artifacts such as electrode drift. Then, the signals were converted to power over time by taking the absolute magnitude of the analytic signal as found using a Hilbert transform, and the mean power per subject and movement condition was determined for the period between 0.1 and 3.5 s (task onset is at 0). Trials for which the EMG power deviated more than 3 times the standard deviation from the mean for that subject and condition were excluded from further analysis. For the remaining trials, the mean amplitude per subject per condition was determined, as well as the mean movement onset time and standard deviation for the movement tasks by identifying the first rising edge of the EMG amplitude increase.
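The envelope and trial-rejection steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes NumPy, the function names `envelope` and `keep_mask` are hypothetical, the test signal and per-trial values are made up, and the bipolar re-referencing and 10 Hz high-pass filter are left out.

```python
import numpy as np

def envelope(x):
    """Magnitude of the analytic signal of a 1-D real signal (FFT-based Hilbert transform)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC component once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist component once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))

def keep_mask(trial_values, n_sd=3.0):
    """True for trials within n_sd sample standard deviations of the mean."""
    v = np.asarray(trial_values, dtype=float)
    return np.abs(v - v.mean()) <= n_sd * v.std(ddof=1)

# Hypothetical example: a 20 Hz test signal with amplitude 2, sampled at 128 Hz for 1 s.
t = np.arange(128) / 128.0
amp = envelope(2.0 * np.sin(2 * np.pi * 20 * t))

# Hypothetical per-trial mean EMG values: 19 ordinary trials and one outlier.
trial_means = np.array([1.0] * 19 + [10.0])
mask = keep_mask(trial_means)
print(amp.mean(), int(mask.sum()))
```

In the study the rejection criterion was applied per subject and per condition, so in practice `keep_mask` would be called once for each such group of trials.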
Additionally, Bispectral index (BIS) was measured using the Philips M1034AX (BIS) Solution plug-in module (Philips Medical Systems, Eindhoven, The Netherlands). BIS is a commercial depth of anesthesia monitor, providing a number between 0 (no brain activity) and 100 (completely awake) [16]. While no true gold standard is currently available, BIS is generally considered to be one of the most important monitors of anesthetic depth in clinical use and it is relatively well known among anesthesiologists. Its straightforward output gave an indication, both during the experiment and when interpreting the results, about the overall awareness reduction in our participants. Values were recorded manually every 1-2 min during the experimental blocks.

Classification.
To test the feasibility of detecting movement during propofol sedation, offline classification analyses were performed separately for each of the three experimental blocks. Data obtained at a propofol effect-site concentration of 1.5 μg ml−1 were not used for analysis, as explained below. The parameter settings used have been validated for this paradigm in a previous study [12]. Specifically, the classifier used information from only nine EEG channels, as this would be more practical in clinical settings than using a full standard EEG cap. Moreover, frequencies above 24 Hz were disregarded. Even though they may contain useful information, in the current setup involving actual movements these higher frequencies may be prone to class-related artifacts [17].
The typical brain response to be seen during motor tasks (actual, attempted or imagined movements) is a power decrease in mu rhythm (8-12 Hz) and beta rhythm (18-25 Hz) activity in the sensorimotor cortex, with a short rebound period in roughly the same frequencies after movement has stopped. These changes are commonly referred to as event-related desynchronization (ERD) and event-related synchronization (ERS) [18]. Thus, these were the main features the classifier used in this study was expected to use for its decisions.
Each trial consisted of 3 s of movement (or no movement) followed by 3 s of rest. For the classification procedure, the data were first linearly detrended to minimize analysis artifacts due to large DC offsets. After calculating the surface Laplacian reference per channel using Perrin's spherical spline interpolation method [19], the power spectral density was computed for 8-24 Hz using Welch's method [20] with a resolution of 4 Hz and a Hanning taper applied to 50% overlapping windows (i.e. windows of 250 ms with an overlap of 125 ms were used), using separate features for ERD (data obtained during movement, i.e. 0-3 s) and ERS (post-movement, i.e. 3.5-6 s). This subset of power spectral features for each channel was then used to train a quadratically regularized linear logistic regression classifier [21] to distinguish between each subject's specific pattern of spatial and spectral activation for the movement condition as compared to the 'no movement' condition. Validation set performance was estimated using ten-fold cross-validation: for each condition the trials were distributed over ten subsets (folds), with each fold used for testing once while the remaining nine folds were used for training the classifier.
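The window arithmetic behind the spectral features can be checked with a short sketch. At the 128 Hz sampling rate, a 250 ms window is 32 samples, which gives the stated 4 Hz resolution, and a 50% overlap is 16 samples. The code below is an assumption-laden illustration using NumPy only: the function name `welch_band_power` is hypothetical, the test signal is synthetic, detrending and the surface Laplacian are omitted, and the feature count at the end is an inference from the numbers in the text rather than a figure the study reports.

```python
import numpy as np

FS = 128          # Hz, sampling rate after downsampling
WIN = 32          # samples: 250 ms at 128 Hz, giving 128 / 32 = 4 Hz resolution
STEP = WIN // 2   # 50% overlap between consecutive windows (16 samples)

def welch_band_power(x, fmin=8.0, fmax=24.0):
    """Average Hann-tapered periodogram over overlapping windows, restricted to a band."""
    x = np.asarray(x, dtype=float)
    taper = np.hanning(WIN)
    starts = range(0, len(x) - WIN + 1, STEP)
    segments = np.array([x[s:s + WIN] * taper for s in starts])
    psd = np.mean(np.abs(np.fft.rfft(segments, axis=1)) ** 2, axis=0)
    psd /= FS * np.sum(taper ** 2)        # density scaling (one-sided doubling omitted)
    freqs = np.fft.rfftfreq(WIN, d=1.0 / FS)
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band], psd[band]

# Hypothetical test signal: 3 s of a 12 Hz oscillation, matching the 0-3 s movement window.
t = np.arange(3 * FS) / FS
freqs, power = welch_band_power(np.sin(2 * np.pi * 12 * t))

# Inferred feature count: 9 channels x frequency bins x 2 windows (ERD and ERS).
n_features = 9 * len(freqs) * 2
print(freqs, n_features)
```

Under these assumptions the 8-24 Hz band contains five frequency bins, so the classifier would see a comparatively small feature vector per trial, which suits the limited number of trials per block.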
Additionally, we calculated the performance of the classifier when it was trained on block 0 (baseline: no propofol), then tested on the data from block 0.5 (propofol effect-site concentration 0.5 μg ml−1) and block 1.0 (propofol effect-site concentration 1.0 μg ml−1), respectively. The rationale behind this lies in the eventual clinical application, where it would make sense to train a classifier before general anesthesia, then apply it after drug administration. Here, no cross-validation was required for performance estimation. Inspection of the power spectra revealed a β-increase in the sedation conditions, a known effect of anesthetic drugs [22,23]. In order to cancel out these dose-dependent shifts, which are unrelated to the movement task, trials were first baselined by estimating the average spectrum of the entire trial (−1 to 6 s) and then dividing by this estimate. After that the classifier could be trained on the baseline (no propofol) data and then transferred to the data obtained during sedation.
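Why dividing by the trial-average spectrum removes a dose-dependent shift can be illustrated with a small sketch. It assumes NumPy and, importantly, assumes that the drug effect acts as a constant multiplicative gain per frequency bin within a trial; the function name `baseline_trial`, the power values and the gain values are all hypothetical.

```python
import numpy as np

def baseline_trial(tf_power):
    """Divide a (time windows x frequencies) power array by its time-averaged spectrum."""
    tf_power = np.asarray(tf_power, dtype=float)
    return tf_power / tf_power.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
awake = rng.uniform(1.0, 2.0, size=(10, 5))    # hypothetical power, 10 windows x 5 bins
gain = np.array([1.0, 1.0, 1.5, 2.0, 2.0])     # hypothetical drug-related gain per bin
sedated = awake * gain                         # same task modulation, shifted spectrum

same_after_baselining = np.allclose(baseline_trial(awake), baseline_trial(sedated))
print(same_after_baselining)
```

The gain appears in both the numerator and the trial average, so it cancels and only the relative modulation over time remains. A shift that is not purely multiplicative, or a task response that itself changes under sedation, would not cancel in this way.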

Statistical analyses.
To test whether the classifier can make any meaningful decision at all, it is important to compare its results to those of a 'random' classifier. For a binary problem with balanced classes, such as in this study, the theoretical chance level performance is 50%. Using the binomial distribution for proportional data, taking into account the number of trials per condition, confidence intervals (CI) for a random classifier can be calculated [24]. Individual classification accuracies were compared to the upper limit of the 95% CI of a random classifier.
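As a rough illustration of such a chance-level threshold, the sketch below takes the upper limit of a two-sided 95% interval directly from the exact binomial distribution. This is one plausible reading of the procedure; the method in [24] may use a different interval construction, and the function name `chance_upper_bound` is hypothetical. The trial count of 46 is the minimum number of trials mentioned in the figure caption.

```python
from math import comb

def chance_upper_bound(n_trials, p_chance=0.5, alpha=0.05):
    """Smallest accuracy k/n with P(X <= k) >= 1 - alpha/2 for X ~ Binomial(n, p_chance)."""
    cumulative = 0.0
    for k in range(n_trials + 1):
        cumulative += comb(n_trials, k) * p_chance**k * (1 - p_chance)**(n_trials - k)
        if cumulative >= 1 - alpha / 2:
            return k / n_trials
    return 1.0

threshold_46 = chance_upper_bound(46)
print(f"{threshold_46:.3f}")
```

With more trials the threshold moves closer to 50%, which is why a single line drawn for the smallest trial count is conservative for the other subjects and conditions.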
Additionally, for each condition the 95% CI for the mean classification accuracy was calculated, using GraphPad Prism version 5.03, GraphPad Software, San Diego California, USA. A 95% CI lower limit above 50% for a given condition means that the classifier performs better than chance, i.e. the true mean in the population is higher than 50% (p=0.05).
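A confidence interval for a mean accuracy over 12 subjects can be sketched as below, assuming the standard interval based on Student's t distribution with 11 degrees of freedom. The per-subject accuracies are invented for illustration and are not the study's data; the function name `mean_ci95` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_DF11 = 2.201  # two-sided 95% critical value of Student's t, 11 degrees of freedom

def mean_ci95(values, t_crit=T_CRIT_DF11):
    """Return (lower, upper) limits of the t-based 95% CI for the mean."""
    half_width = t_crit * stdev(values) / sqrt(len(values))
    centre = mean(values)
    return centre - half_width, centre + half_width

# Hypothetical accuracies for 12 subjects (not the values from this study).
accuracies = [0.95, 0.90, 0.85, 0.80, 0.92, 0.88, 0.75, 0.83, 0.90, 0.86, 0.79, 0.91]
lower, upper = mean_ci95(accuracies)
print(lower > 0.5)   # lower limit above chance level
```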

Results
All participants performed the tasks well. During the administration of propofol, however, participants started to show signs of sleepiness, and gradually needed to increase their efforts to stay alert and perform the required task. Table 1 shows that the mean BIS values decreased from 92.1 during baseline to 90.5 and 83.1 during propofol administration. As the experiment progressed, reaction times increased and some participants started making a few errors. Based on the EMG responses, 0.6%, 1.7% and 4.1% of trials were judged to have been wrongly executed in blocks 0, 0.5 and 1.0 respectively. For the movement conditions, the mean EMG amplitude as a percentage of each subject's baseline EMG amplitude was 86% and 74% at propofol effect-site concentrations of 0.5 and 1.0 μg ml−1, respectively. The mean movement onset time increased by 65 ms (from 273 to 338 ms) between block 0 and block 1.0, but there was no difference between block 0 and block 0.5 (see table 1).
Only the first three participants received propofol aiming at 1.5 μg ml−1 effect-site concentration. For all three, awareness levels were reduced so much that they were unable to perform the task. Therefore the data obtained at this concentration were not analysed and the final nine participants did not receive this dose. Figure 2 shows the paired data for single trial classification accuracies for each subject. Mean accuracies were 87.5% (95% CI 82.4%-92.5%) for block 0, 84.9% (80.9%-88.9%) for block 0.5 and 80.9% (76.1%-85.8%) for block 1.0. For each subject and condition the performance was significantly higher than chance level (p<0.05). After correcting for dose-dependent EEG shifts (figure 3), a classifier was trained on data from the baseline block and then applied on data from blocks 0.5 and 1.0. The mean accuracy for this transfer classification was 83.4% (79.3%-87.5%) for block 0.5, and 72.4% (65.7%-79.1%) for block 1.0. The transfer classification performance was significantly higher than chance level (p<0.05) for all subjects at 0.5 μg ml−1, but only for 8 out of 12 subjects at 1.0 μg ml−1. All transfer classification accuracies are shown in figure 4.
To find possible indicators as to why the transfer classification was not significantly better than chance in subjects S1, S2, S3 and S6, the propofol-associated changes in the BIS and EMG measures were reanalysed post hoc for the effect-site concentration of 1.0 μg ml−1. The mean BIS value for these four subjects was 80.6, as compared to 84.4 for the remaining subjects. Regarding the EMG, S1, S2, S3 and S6 not only had the highest movement onset times (range 370-448 ms versus range 264-348 ms in the other subjects), but also the highest standard deviations of the movement onset times (range 181-292 ms versus range 64-158 ms), meaning their responses were both slower and more erratic.
[Figure caption: The dashed line shows the binomial confidence interval (α=0.05) for the minimum number of trials used for performance estimation (46 trials for S1 at 1.0 μg ml−1). For the remaining subjects and conditions the line would be slightly lower. The error bar gives an indication of the standard error of the performance estimates using 10-fold cross-validation (single example shown).]
In figure 5 details are shown for one participant for whom the transfer worked (S7) and for one participant for whom it did not (S2). The plots reveal that the desynchronization in 8-24 Hz (ERD) remains constant in S7 after propofol administration, whereas for S2 the effect is greatly reduced at 1.0 μg ml−1 target concentration.

Principal findings
This study showed that motor responses could be detected from the EEG of volunteers during altered states of consciousness, with an average single trial classification accuracy of 85% at a propofol effect-site concentration of 0.5 μg ml−1 and 81% at a propofol effect-site concentration of 1.0 μg ml−1. Single trial 'transfer' classification accuracies of 83% and 72% were obtained at propofol effect-site concentrations of 0.5 and 1.0 μg ml−1, respectively. Adding this to previous findings showing the possibility of detecting attempted movement during neuromuscular block [13], we conclude that further development of the proposed BCI is justified. During various conditions of drug administration, including both hypnotics (propofol) and neuromuscular blocking agents (rocuronium), movement can be distinguished from rest with high accuracy.
For eight of the 12 subjects it was possible to 'transfer' between the baseline state and the highest level of sedation. In other words, a classifier trained on the subjects' data obtained prior to administering propofol was able to detect the movements at the target concentration of 1.0 μg ml−1 with an accuracy above chance level (p<0.05). This fact is useful for subsequent steps in development of the paradigm, specifically for determining the most efficient way of system calibration. The transfer of the BCI from a baseline condition to the sedation conditions is relevant because of the inter- and intrasubject variabilities in the brain signal. Currently, most BCIs require a calibration phase for each individual user and session. Especially in the developmental phase of BCI paradigms, system calibration is an essential step. In the near future however, end-user applications may no longer require this phase, as novel methods are being developed in which a generic classifier can be applied to every user's data. This means there is not only a transfer between different states in an individual user, but also between users. Promising results have recently been reported on such so-called zero-training BCIs for a spelling paradigm [25], and also movement-related BCIs may be feasible without (or with very limited) user-specific calibration [26][27][28][29].
The successful transfer was partly based on adequate compensation of propofol-induced effects. Most conventional EEG monitors in anesthesia, such as BIS or the entropy module, use only frontal electrodes to detect drug-induced EEG changes. However, hypnotic drugs like propofol have substantial effects on the EEG measured at other electrode locations as well [30]. Accordingly, we found a clear propofol effect with a β-increase at the central electrodes, where the main EEG effect of (attempted) movement is located. By baselining each individual trial this increase was cancelled out. As a result, the classifier only took into account relative motor response effects. Because the exact change of the background EEG does not have to be known, this crucial compensation seems to be a large benefit of using a BCI algorithm during general anesthesia. However, the baselining procedure may introduce other issues, as discussed further on.
Unexpectedly, for four subjects (33%), a classifier trained on the baseline data could not distinguish between movement and rest at a propofol effect-site concentration of 1.0 μg ml−1. If these subjects were excluded, the transfer classification performance would increase from 72% to 79% (95% CI 74%-85%). Studying the properties of these trials may shed light on how propofol-induced sedation interferes with sensorimotor integration. However, for a full understanding, future studies will be required that directly address these issues. Nevertheless, careful analysis of the available data showed that a few patterns emerged. Speculating on the reason for the low classification performance for these four subjects, we point out a few aspects.
First, visual inspection of the individual time-frequency spectra of these subjects revealed a large reduction or even absence of the ERD/ERS pattern at the effect-site concentration of 1.0 μg ml−1. Second, the individual differences in the propofol-induced β-increase should be considered. For S2, the largest relative increase in power between 14 and 22 Hz was found (power at 1.0 μg ml−1 target concentration was more than 200% of the power at 0.0 μg ml−1 target concentration). A large increase was also seen in S1 and S3, but the same was true for S7, S10 and S11 (all between 150% and 190%). While overall the performance was largely increased by adding the baselining procedure, in some cases it may in fact have attenuated the ERD/ERS response. Third, remarkably, the first three subjects entering the study all belonged to the group of four subjects with deviating results. They were the only participants in whom we attained the highest propofol concentration, i.e. 1.5 μg ml−1. At this effect-site concentration, all three subjects became unable to follow commands. Afterwards, none of these subjects exhibited recall of events from after administration of this third and final dose. Fourth, the four subjects for whom the transfer classification did not perform significantly above chance level were the four with the largest increase in mean movement onset time as well as the largest spread in onset times. For two of these subjects, S2 and S6, the mean EMG amplitude at 1.0 μg ml−1 target concentration was less than 50% of the amplitude at 0.0 μg ml−1 target concentration, which may point to a change in task intention (note that a reduction or absence of motor output itself does not necessarily mean a reduced brain response; the intention itself seems to be the most important factor [13]).
Fifth, S1, S2 and S3 made relatively many errors in executing the movement task as compared to the remaining subjects at 1.0 μg ml−1 target concentration, while S6 made the most mistakes at 0.5 μg ml−1 target concentration. This may indicate a misinterpretation of sensory information and therefore a lower level of awareness and command following. Finally, the mean BIS value at 1.0 μg ml−1 target concentration was lower for these four subjects (80.6) than the mean of the other eight subjects (84.4).
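The reported group means also allow a rough back-calculation of how the four deviating subjects performed on average in the transfer classification at 1.0 μg ml−1. The sketch below uses only the means given in the text (72.4% for all 12 subjects, 79% for the remaining eight); because the 79% figure is rounded, the result is approximate and is an inference, not a value reported by the study.

```python
n_all, mean_all = 12, 72.4     # all subjects, transfer accuracy at block 1.0 (%)
n_rest, mean_rest = 8, 79.0    # remaining subjects after exclusion (%), rounded in the text

n_excluded = n_all - n_rest
# The total over all subjects minus the total over the remaining subjects
# gives the total over the excluded subjects.
implied_mean_excluded = (n_all * mean_all - n_rest * mean_rest) / n_excluded
print(round(implied_mean_excluded, 1))
```

Under these numbers the implied mean for the four subjects is a little under 60%, which is consistent with their individual accuracies not being significantly above chance.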

Models of consciousness
The current study was based on the assumption that a patient under general anesthesia would either be unconscious and hence not move while being stimulated, or the patient would be conscious and move, unless paralysed. Our findings indicate that instead a more detailed model of consciousness should be adopted.
While the above observations may not be sufficient to draw any hard conclusions, they do indicate that the group of four volunteers may have been close to a state described by Pandit [31] as 'dysanesthesia' at the target concentration of 1.0 μg ml−1. In this state, one may respond to simple commands but not to surgical stimuli. There is a certain degree of consciousness, but perception and sensory input are uncoupled such that memory formation is unlikely. This state is, according to Pandit, the minimum requirement for satisfactory general anesthesia [32]. Our findings seem to be in line with the functional model of consciousness proposed by Pandit as well as his view on the isolated forearm technique (IFT, [33,34]).
While reviewing results from IFT, Pandit has suggested that patients may have retained some limited capacity for responsiveness to simple command, but that this does not always mean consciousness. The method of IFT is simple but its interpretation is controversial. One arm of the patient is isolated from the circulation so that it remains unaffected when a neuromuscular blocker is administered. The patient is asked to respond to command by moving the unparalysed hand. An awake patient would thus be able to communicate his/her state of awareness. According to the Global Workspace Theory specialized neuronal networks can execute movements on verbal command without the subject being conscious. This could explain why in some IFT studies patients responded to command, but did not show any spontaneous response to surgery [31]. In addition, movements during IFT are not correlated with any recall of events in most of the studies on IFT [35].
It might be hypothesized that in our study the movements were executed on a subconscious level at a certain propofol target concentration. For the four volunteers showing (to a certain extent) deviating results, the progressive loss of the respective functions constituting consciousness could have been more rapid than in the other participants. If Pandit's model is right and if the group of four shows signs of dysanesthesia, then they would no longer belong to the primary target population of our research project.
As the BCI detects real cortical involvement during movement, it may be an even better measure of intraoperative awareness than the BIS or the IFT. Nevertheless, we must recognize that the state of dysanesthesia might represent a precursor for awareness [32].

Limitations and future research
To find more conclusive answers on the matters discussed here, future studies on our proposed paradigm could be expanded with behavioral measures to track memory formation after drug administration. Moreover, it would be highly interesting to measure the EEG above the motor cortex during the IFT and to correlate this with memory formation. A difficulty presented by that type of research setting, however, is that it is subject to a marked observer effect: every behavioural measurement of consciousness may itself alter the state of consciousness by acting as an arousal stimulus.
Despite this study being conducted in an operating room, the question still remains to what extent this controlled research setting mirrors the real clinical situation. The focus of a patient awakening during surgery who follows movement commands for seconds to escape this situation is different from that of a paid volunteer performing a non-demanding repeating task in a non-stimulating environment for more than an hour. This may explain the marked sedation of the volunteers at relatively low propofol concentrations, as the actual state of sedation is always the result of the balance between sedating drug effect and environmental arousal. For example, Röpcke and colleagues [36] studied the effect of surgical stimulation on the EEG. They found a shifted dose-response relationship for the effect of anesthetic drugs on the EEG depending on the presence or absence of surgical stimulation.
This study was a first exploration as to whether motor response detection after hypnotics administration may be feasible. Meanwhile, BCI research is continually bringing forth further advancements, of which many may be applied to the paradigm proposed here. For example, improvement of motor detection performance could be gained by taking into account multi-trial classification (i.e. increasing the amount of information the classifier uses for making its decision) and by adapting the system to achieve a certain true positive/false positive trade-off rate. Moreover, with the development of more advanced EEG systems a reduction in setup time and signal noise can be expected in the near future. The introduction of wireless and dry electrodes will also mean an improvement in comfort and user-friendliness, which is important for clinical use [37][38][39].
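The potential gain from multi-trial classification can be illustrated with a simple calculation. The sketch assumes that single-trial decisions are independent and equally accurate and that they are combined by majority vote; neither assumption comes from the study, which does not specify a combination rule, and the function name `majority_vote_accuracy` is hypothetical. The single-trial accuracy of 0.72 is the mean transfer accuracy reported for block 1.0.

```python
from math import comb

def majority_vote_accuracy(p_single, n_trials):
    """Probability that more than half of n independent decisions are correct (n odd)."""
    if n_trials % 2 == 0:
        raise ValueError("use an odd number of trials to avoid ties")
    needed = n_trials // 2 + 1
    return sum(
        comb(n_trials, k) * p_single**k * (1 - p_single)**(n_trials - k)
        for k in range(needed, n_trials + 1)
    )

acc_1 = majority_vote_accuracy(0.72, 1)
acc_3 = majority_vote_accuracy(0.72, 3)
acc_5 = majority_vote_accuracy(0.72, 5)
print(acc_1 < acc_3 < acc_5)
```

The price of combining trials is a longer detection time, so the number of trials per decision is itself part of the true positive/false positive trade-off mentioned above. Correlated errors between successive trials would reduce the gain.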

Conclusions
To conclude, despite a clear effect of propofol on the EEG, changes in sensorimotor rhythms could still be detected in sedated volunteers. These findings are encouraging for the further development of a BCI for detecting attempted movements during intraoperative awareness. Importantly, in contrast to existing monitors, it is based on active communication by the patient, rather than a passive interpretation of the brain signal. However, alongside further technical development of the proposed system, a more precise model of the relationship between motor responses and consciousness is required. Because some volunteers moved without a clear correlated EEG response, future studies are needed to investigate the exact state of consciousness in these cases, as well as its clinical relevance for intraoperative awareness. This provides an opportunity for deeper insights in this challenging field of research. At the very least, anesthesiology research could benefit from BCI technology in general. Much insight could be gained by connecting with this field of research as anesthesiologists and BCI experts pursue a similar goal: communication by patients in a conscious locked-in state.