Decoding hand movements from human EEG to control a robotic arm in a simulation environment

Objective. Daily life tasks can become a significant challenge for motor-impaired persons. Depending on the severity of their impairment, they require increasingly complex solutions to retain an independent life. Brain-computer interfaces (BCIs) aim to provide an intuitive form of control for advanced assistive devices such as robotic arms or neuroprostheses. In the current study we aim to decode three different executed hand movements in an online BCI scenario from electroencephalographic (EEG) data. Approach. Immersed in a desktop-based simulation environment, 15 non-disabled participants interacted with virtual objects from daily life via an avatar's robotic arm. In a short calibration phase, participants performed executed palmar grasps, lateral grasps and wrist supinations. Using these data, we trained a classification model on features extracted from the low-frequency time domain. In the subsequent evaluation phase, participants controlled the avatar's robotic arm and interacted with the virtual objects in case of a correct classification. Main results. On average, participants scored 48% of all movement trials correctly online (3-condition scenario, adjusted chance level 40%, alpha = 0.05). The underlying movement-related cortical potentials (MRCPs) of the acquired calibration data show significant differences between conditions over contralateral central sensorimotor areas, which are retained in the data acquired during online BCI use. Significance. We could show the successful online decoding of two grasps and one wrist supination movement using low-frequency time-domain features of the human EEG. These findings can potentially contribute to the development of a more natural and intuitive BCI-based control modality for upper limb motor neuroprostheses or robotic arms for people with motor impairments.


Introduction
Motor impairment has a significant effect on a person's daily life. Depending on the severity of their impairment, persons may no longer be able to walk, eat, drink or even brush their teeth without the help of a caregiver. Motor impairment can have a broad variety of causes, ranging from severe spinal cord injury (SCI) and neuropathological conditions to stroke. Naturally, affected persons seek interventions to cushion the resulting effects, such as muscle and tendon transfers for tetraplegic SCI persons [1][2][3][4], extensive stroke rehabilitation [5] or, in the case of motor neuron diseases, treatments to delay and reduce symptoms [6,7]. When surgical or physiotherapeutic interventions reach their limits, assistive devices attempt to bridge the gap towards a comparatively independent life. The more severe the grade of impairment, the higher the need for customized assistive devices becomes.
Non-invasive brain-computer interfaces (BCIs), though still in a prototype stage, can potentially provide a customized control modality for even the most severe cases of motor impairment. They attempt to decode brain signals acquired in real time ('online') using the electroencephalogram (EEG). With state-of-the-art machine learning methods [8,9], a control signal can be generated for controlling assistive devices [10], e.g. a robotic arm [11] or an upper limb motor neuroprosthesis [12][13][14][15].
So far, BCIs intended for control of assistive devices often relied on repetitive mental imagery (MI) and oscillation based features for generating control signals [12][13][14][15].
In the field of stroke rehabilitation, Mrachacz-Kersting et al have already successfully integrated an MRCP-based BCI in lower limb rehabilitation for stroke survivors: the BCI decodes in real time the EEG correlates of stroke patients performing lower limb movements, which in turn triggers non-invasive transcranial magnetic stimulation (TMS). Their results show neuroplastic changes in both chronic and subacute patients, as well as significant improvements in regaining movement functionality (clinical scales) [31][32][33][34]. In the same field of research, this approach is already being investigated for upper limb movements [35].
MRCPs are also investigated for the purpose of control. Especially in the case of persons with high spinal cord injury, BCIs are primarily intended to control artificial limbs, such as robotic arms [11] or upper limb motor neuroprostheses [36]. Studies conducted in non-disabled populations have shown offline that MRCPs hold sufficient information to decode upper limb movements [25], including complex reach-and-grasp movements [23,24,37]. However, to our knowledge only one proof-of-concept study with one participant has applied this online in a BCI [38]. Recently, Ofner et al showed offline that MRCPs of tetraplegic end users (n = 10) still retain sufficient information for decoding upper limb movements [38]. Additionally, they showed in a proof-of-concept study the asynchronous online decoding of hand open vs. palmar grasp attempts in one participant with tetraplegia.
Their offline analysis further revealed that the EEG potentials associated with the motor task in a cue-locked paradigm are contaminated by potentials related to the processing of the cue itself. This effect can be problematic: if one wants to develop an online classifier for asynchronous use, the EEG potentials around the movement onset in a cue-free scenario consist solely of the MRCPs themselves, without time-locked influences of visual cues (see also [25,39]). It is therefore imperative to study new possibilities to gather calibration data that is equally properly labelled, but in which movement-related features are not masked by the presentation of cues.
Hence, the aims of our current study were twofold: first, while most studies investigating MRCPs for upper limb decoding rely on offline analysis, we wanted to assess the feasibility of MRCPs in an online system, i.e. allowing for BCI control. Our second goal was to minimize the influence of discrete visual cues in the EEG signals, since such cues could mask discriminable information in the low-frequency time domain (LFTD). Therefore, we measured 15 healthy participants who performed three different hand movements of daily life: (i) palmar grasp, (ii) lateral grasp and (iii) wrist supination. We presented the instructions in a realistic simulation environment, engaging study participants in daily life actions (e.g. grasping a glass with a palmar grasp). After recording data for calibration (calibration phase), we used features extracted from the LFTD to train a classification model. In a subsequent evaluation phase, we gave discrete feedback based on the participants' hand movements and evaluated the performance of the three-class online classifier.

Participants
Fifteen healthy participants aged between 21 and 35 years (median 26, eight male, seven female) took part in the experiment. The study was approved by the local ethics committee of the Medical University of Graz. Participants were briefed about the aims of the study and gave written informed consent to participate. They also received monetary compensation for their efforts. To evaluate their handedness, we performed the three-stage hand dominance test developed by Steingrübler [40]. The test assesses individual hand dominance by quantifying the results of three exercises: (i) draw a line within a prescribed path, (ii) dot unaligned circles and (iii) dot horizontally aligned squares. Results show that 13 participants were right-handed and two left-handed (see stacks.iop.org/JNE/17/036010/mmedia supplementary table 1 for detailed results).

Experimental setup and paradigm: simulation of daily activities
We conducted all recordings at the BCI-Lab of the Institute of Neural Engineering at Graz University of Technology. Participants were seated in a noise and electromagnetically shielded room to facilitate a stable measurement environment. A monitor was placed in front of them which showed the paradigm. Participants positioned their right hand in an upright position comfortably on the armrest of the chair.
We designed a simulation environment for presenting instructions in a daily life setting: a motor-impaired avatar with a robotic right arm reaches for objects of daily life presented on a table. In front of the avatar we showed one of three objects in random order: (i) a glass, (ii) a bowl of soup with a spoon and (iii) a radio with knobs (figure 1(A)). At the beginning of each trial the robotic right arm of the avatar started moving towards the designated object, but stopped shortly before interaction (CUE). We instructed the participants to finish the designated movement with their own right hand (see figure 1(B)): for the (i) glass, a palmar grasp; for the (ii) bowl of soup, a lateral grasp; and for the (iii) radio, a wrist supination. Participants held the movement until the end of the trial (time = 3 s) and then returned to the starting position (start of the inter-trial interval).

[Figure 1. (B) Experimental paradigm: each trial started with the robotic arm moving towards the presented object in the center of the screen (2 s duration). Shortly before the hand interacted with the presented object (0 s), it stopped (CUE) and the study participant was tasked to finish the interaction (e.g. grasping the glass in the palmar grasp condition) and to hold the final position until the inter-trial interval (second 3, Break). In trials with feedback (evaluation phase), the feedback designated to the object was given; for an incorrect classification, the robotic arm performed a waving movement in the horizontal plane. (C) Experimental timeline: starting with a practice run, we recorded 4 runs à 15 trials per condition (TPC) without giving feedback. After the break, we evaluated the classification model in 3 runs à 15 TPC. In total, each experiment lasted about two hours. The simulation environment is used with permission from the Institute of Neural Engineering, Graz University of Technology, Stremayrgasse 16/IV, 8010 Graz, Austria.]
The object on the desk vanished (time = 3 s) and an inter-trial-interval of random length between 2 and 3 s followed. Before the start of the actual recording, each participant performed a practice run for performing the movements correctly and to avoid artifacts in subsequent runs. This training run was not part of any subsequent analysis.
We organized the experiment in two consecutive phases: calibration and evaluation (see figure 1(C)). For the calibration phase, no feedback was given to the participants and the trial ended 3 s after the robotic arm of the avatar stopped before the object. In the evaluation phase, however, participants received feedback based on their actions online. Whenever a participant's movement was recognized correctly, the avatar's robotic arm completed the designated movement. In case of the (i) glass and (ii) spoon, the hand grasped them and brought them towards the avatar's mouth; in case of the (iii) radio, the robotic arm turned the knob on the radio. In case of an incorrect recognition, the avatar's arm performed a repetitive shaking movement in the horizontal plane.
In this manner we recorded 4 runs with 15 trials per condition (TPC) for the calibration phase (in total 60 TPC). At the beginning, halfway point and end of the calibration phase we recorded 3 min of rest as well as 2 min of eye movements and blinks using a cue-guided paradigm presented in [41,42].
Using the data acquired in the calibration phase, we trained a classification model. In the subsequent evaluation phase, we recorded 3 runs à 15 TPC, during which we gave feedback to the participants.

Data recording
We recorded EEG with 57 active electrodes covering frontal, central, parietal and temporal areas according to the 5% layout described by Oostenveld and Praamstra [43]. Additionally, 6 electrodes positioned at the outer canthi and infra- and superior-orbitally to the left and right eye were used for recording ocular activity (EOG). However, EOG recordings were not part of the analysis described in this work. EEG and EOG were recorded using four biosignal amplifiers (g.USBamp) and a g.GAMMAsys/g.LADYbird active electrode system (g.tec medical engineering GmbH, Austria). Signals were recorded with a sampling rate of 512 Hz and prefiltered using an 8th order Chebyshev filter in the range of 0.01 to 200 Hz. A photodiode was positioned on the screen to measure the exact cue onset (the stopping of the hand). In addition, we recorded hand movements during the experiment using a data glove (5DT Technologies, Orlando, CA, USA). Data recording and synchronization were achieved via the TOBI Signal Server [44] and MATLAB 2015b (Mathworks, Natick, MA, USA). The online evaluation was implemented in Simulink (Mathworks, Natick, MA, USA). For sending commands and receiving timed triggers between the online evaluation and the paradigm, we used a customized protocol based on TCP/IP.

Movement detection, artefact avoidance and rejection strategies
For determining a reliable single-trial movement onset, we used the participant-specific movement data recorded by the data glove. We evaluated 15 sensors positioned at the joints of the finger phalanges. We epoched all movement trials of the calibration dataset from −3 to 3 s with respect to the movement onset. To reduce the dimensionality of the data, we performed principal component analysis (PCA) on the movement data for each condition and used the first component to extract the movement onset.
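The onset-extraction step can be sketched as follows. This is an illustrative Python reconstruction (the original pipeline was implemented in MATLAB); the function name, the glove sampling rate and the threshold criterion are assumptions, not the authors' exact implementation:

```python
import numpy as np

def detect_movement_onset(glove_epochs, fs=64.0, threshold=0.2):
    """Estimate per-trial movement onset from data-glove recordings (sketch).

    glove_epochs: array (n_trials, n_samples, n_sensors) of one condition,
    epoched around the expected movement. fs and threshold are illustrative.
    Returns the onset time (s, relative to epoch start) per trial.
    """
    n_trials, n_samples, n_sensors = glove_epochs.shape
    # Pool all samples of the condition and center the sensor channels
    X = glove_epochs.reshape(-1, n_sensors)
    X = X - X.mean(axis=0)
    # PCA via SVD: the first right-singular vector spans the dominant
    # direction of joint flexion across the 15 glove sensors
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    pc1 = glove_epochs @ Vt[0]                 # (n_trials, n_samples) scores
    onsets = np.empty(n_trials)
    for i, score in enumerate(pc1):
        s = np.abs(score - score[0])           # deviation from rest posture
        s /= s.max() + 1e-12                   # normalize to [0, 1]
        onsets[i] = np.argmax(s > threshold) / fs  # first threshold crossing
    return onsets
```

Projecting onto the first principal component collapses the 15 correlated joint signals into one flexion trace per trial, on which a simple threshold crossing is robust to single-sensor noise.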
To avoid movement-related artifact contamination of the calibration data, our strategy in this experiment was twofold: first, we carefully instructed participants to fixate their gaze on the object presented on the table and to avoid any unnecessary body and eye movements during the trial phase. As a second step, we took measures to exclude potentially artifact-contaminated trials from the calibration set [45][46][47]. We rejected contaminated trials using statistical methods. Concretely, we filtered all available EEG data between 0.3 and 35 Hz and epoched each trial from [−1 2] s with respect to the movement onset. Thereafter we rejected trials based on an amplitude threshold (exceeding limits of ±125 µV), channel variance, abnormal joint probability and abnormal kurtosis. For the latter three, we used four times the standard deviation as a threshold for trial rejection. On average we retained 52 trials per condition of the calibration data.
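A simplified sketch of the statistical rejection step is given below. It covers the amplitude, variance and kurtosis criteria; the joint-probability criterion is omitted for brevity, and all names and array shapes are illustrative assumptions:

```python
import numpy as np
from scipy.stats import kurtosis

def reject_artifact_trials(epochs, amp_limit=125.0, n_std=4.0):
    """Flag artifact-contaminated trials (simplified sketch).

    epochs: (n_trials, n_channels, n_samples) EEG in microvolts,
    band-pass filtered 0.3-35 Hz and epoched [-1, 2] s around onset.
    Returns a boolean mask of trials to KEEP. The joint-probability
    criterion of the original pipeline is not reproduced here.
    """
    # Criterion 1: absolute amplitude threshold (+/- 125 microvolts)
    amp_ok = np.all(np.abs(epochs) <= amp_limit, axis=(1, 2))

    def within_n_std(stat):
        # Keep trials whose statistic lies within n_std SDs of the mean
        z = (stat - stat.mean()) / (stat.std() + 1e-12)
        return np.abs(z) <= n_std

    # Criterion 2: maximum channel variance per trial
    var_ok = within_n_std(epochs.var(axis=2).max(axis=1))
    # Criterion 3: maximum channel kurtosis per trial
    kurt_ok = within_n_std(kurtosis(epochs, axis=2).max(axis=1))
    return amp_ok & var_ok & kurt_ok
```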

Offline single-trial multiclass classification and calibration
We used the data of the calibration phase to train a classification model for the subsequent online evaluation. After excluding any potentially artifact-contaminated trials, we causally filtered the raw EEG using a 4th order Butterworth filter in the range between 0.3 and 3 Hz. Additionally, we applied common average reference (CAR) filtering and resampled the signal to 16 Hz to ease the computational load. Previous studies [24,25,46,48] have shown that the most discriminant features for decoding upper limb movements in the low frequency time domain can be found within the first second after the movement onset. Therefore, we defined for each trial a window of interest (WOI) from [0 2] s with respect to the movement onset calculated from the data glove (for offline analysis, we extended the WOI to [−1 2] s). For each participant we epoched trials according to the WOI and divided them into a training and an evaluation set using a 5 × 5 cross validation procedure. For each time point within the WOI we calculated a shrinkage linear discriminant analysis (sLDA) classification model [49] using the training set and evaluated its performance on the evaluation set. As features, we used the amplitude values from each channel extracted in steps of 0.125 s over the preceding second with respect to the currently investigated time point ([−0.975:0.125:0] s). In this way, we extracted 8 features per channel, resulting in a total of 8 × 57 = 456 features per trial (observation). As a measure of performance, we used the average accuracy on the evaluation folds of the cross validation. The classification model of the time point yielding the highest classification accuracy was then used in the online BCI evaluation.
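The calibration procedure can be sketched in Python as follows. scikit-learn's LDA with Ledoit-Wolf shrinkage stands in for the sLDA of [49]; the function names, the reduced channel count and the set of candidate time points are illustrative assumptions:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.metrics import accuracy_score

STEP = 2     # 0.125 s steps at 16 Hz
N_LAGS = 8   # 8 amplitude values per channel from the preceding second

def lagged_features(epochs, t_idx):
    """Amplitude features of the preceding second relative to sample t_idx.

    epochs: (n_trials, n_channels, n_samples) at 16 Hz.
    Returns (n_trials, n_channels * N_LAGS), i.e. [-0.975:0.125:0] s lags.
    """
    idx = t_idx - STEP * np.arange(N_LAGS)[::-1]
    return epochs[:, :, idx].reshape(len(epochs), -1)

def best_time_point_model(epochs, y, candidate_idx):
    """5 x 5 CV of an sLDA per time point; return best model, accuracy, index."""
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=7)
    best = (None, 0.0, None)
    for t_idx in candidate_idx:
        X = lagged_features(epochs, t_idx)
        accs = []
        for tr, te in cv.split(X, y):
            clf = LinearDiscriminantAnalysis(solver='lsqr', shrinkage='auto')
            clf.fit(X[tr], y[tr])
            accs.append(accuracy_score(y[te], clf.predict(X[te])))
        if np.mean(accs) > best[1]:
            # Refit on all calibration trials for subsequent online use
            final = LinearDiscriminantAnalysis(solver='lsqr',
                                               shrinkage='auto').fit(X, y)
            best = (final, float(np.mean(accs)), t_idx)
    return best
```

With 57 channels this yields the 8 × 57 = 456 features per observation described above.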

Online evaluation
The online BCI model was implemented in Simulink (Mathworks, Natick, MA, USA). Communication between the BCI and the Unity-based paradigm was done via a customized protocol based on TCP/IP. The incoming EEG was causally filtered using a 4th order Butterworth bandpass filter in the range of 0.3-3 Hz and resampled to 16 Hz. Thereafter, we applied CAR filtering to the signal. Features were again extracted in 0.125 s steps from the preceding one second ([−0.975:0.125:0] s, where 0 s is the current sample).
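For online use, the causal filter must keep its internal state between incoming sample blocks so that block-wise processing matches a single pass over the continuous signal. A minimal Python sketch of this preprocessing stage (the actual system ran in Simulink; class name, block interface and plain decimation are assumptions) could look like this:

```python
import numpy as np
from scipy.signal import butter, sosfilt, sosfilt_zi

class OnlinePreprocessor:
    """Causal 0.3-3 Hz band-pass + CAR for block-wise EEG (sketch).

    Keeps the IIR filter state between incoming blocks so that online
    filtering matches one causal pass over the whole recording.
    Downsampling from 512 to 16 Hz is done by simple decimation after
    filtering; block lengths are assumed to be multiples of the
    decimation factor.
    """
    def __init__(self, n_channels, fs=512.0, band=(0.3, 3.0), fs_out=16.0):
        self.sos = butter(4, band, btype='bandpass', fs=fs, output='sos')
        zi = sosfilt_zi(self.sos)                      # (n_sections, 2)
        # One filter state per channel
        self.zi = np.repeat(zi[:, None, :], n_channels, axis=1)
        self.decim = int(fs // fs_out)

    def process(self, block):
        """block: (n_channels, n_samples) raw EEG chunk -> filtered, CAR'd,
        decimated chunk."""
        y, self.zi = sosfilt(self.sos, block, axis=1, zi=self.zi)
        y = y - y.mean(axis=0, keepdims=True)          # common average reference
        return y[:, ::self.decim]
```

Carrying the `zi` state across calls is the essential detail: without it, each block would be filtered from a zero initial condition and exhibit transients at every block boundary.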
The previously calculated sLDA classification model was used to continuously discriminate the input between conditions. Shrinkage-based LDA is widely used in the field of BCI research [9,49]. However, so far it has not been applied in combination with MRCPs as features to decode individual arm movements online in a larger population.
Final discrimination between conditions was achieved by averaging the linear distances of the last three classified samples and selecting the condition with the maximum distance. The final classification was made at a discrete time point in the trial. To determine this time point, we deliberately did not use any movement data potentially provided by the data glove to detect the movement onset. Instead, we used the time point where the robotic arm of the avatar stopped its movement (CUE) as a reference. Additionally, we appended a participant-specific delay which was calculated from the calibration data: with respect to the CUE we added (i) the mean difference between the movement onset and the CUE onset, (ii) the network delay and (iii) the time of maximum performance of the classification model.
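The smoothing of the classifier output can be sketched as a small ring buffer over the sLDA decision values (class name and interface are illustrative):

```python
import numpy as np
from collections import deque

class SmoothedDecision:
    """Average the sLDA decision values of the last k classified samples
    and pick the condition with the maximum mean distance (sketch)."""
    def __init__(self, n_classes, k=3):
        self.buffer = deque(maxlen=k)   # keeps only the last k samples
        self.n_classes = n_classes

    def update(self, decision_values):
        """decision_values: (n_classes,) signed linear distances for one
        classified sample. Returns the currently winning class index."""
        self.buffer.append(np.asarray(decision_values, dtype=float))
        return int(np.argmax(np.mean(np.stack(list(self.buffer)), axis=0)))
```

Averaging over three consecutive outputs trades a small decision latency (here 3 × 0.0625 s at 16 Hz) for robustness against single-sample classifier jitter.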
We gave immediate feedback based on the output of the classifier at this time point. In correctly classified trials the avatar completed the movement; otherwise the avatar's hand performed a shaking movement in the horizontal plane.
Additionally, we implemented this BCI as an offline simulation. Using the evaluation data set, we replaced the estimated onset point with the actual movement onset extracted from the data glove data and compared the achieved performances between real and estimated movement onsets.

Analysis of the movement-related cortical potentials (MRCPs)
We analyzed the low-frequency EEG correlates of both calibration and evaluation datasets. We filtered the EEG using a causal 4th order Butterworth bandpass filter in the range between 0.3 and 3 Hz and resampled it to 16 Hz to reduce computational load. Thereafter we applied CAR filtering and epoched the EEG into trials from [−2 2] s with respect to the movement onset acquired from the data glove. We were interested in the differences between conditions as well as the differences between the data acquired from the calibration and the evaluation phase (non-feedback vs. feedback). For each participant, we calculated the participant-specific averages for each condition and their 95% confidence intervals using t-percentile bootstrap statistics (alpha = 0.05). We then calculated the group average over all participants.
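The t-percentile (bootstrap-t) confidence interval for the condition mean can be sketched per channel as follows; the function name and the number of bootstrap resamples are illustrative assumptions:

```python
import numpy as np

def bootstrap_t_ci(trials, alpha=0.05, n_boot=2000, rng=None):
    """t-percentile bootstrap CI of the mean over trials (sketch).

    trials: (n_trials, n_samples) single-channel epochs of one condition.
    Returns (lower, upper) bounds, each of shape (n_samples,).
    """
    rng = np.random.default_rng(rng)
    n = len(trials)
    mean = trials.mean(axis=0)
    se = trials.std(axis=0, ddof=1) / np.sqrt(n)
    t_stats = np.empty((n_boot, trials.shape[1]))
    for b in range(n_boot):
        # Resample trials with replacement and studentize the mean
        resample = trials[rng.integers(0, n, n)]
        se_b = resample.std(axis=0, ddof=1) / np.sqrt(n)
        t_stats[b] = (resample.mean(axis=0) - mean) / (se_b + 1e-12)
    t_lo, t_hi = np.percentile(t_stats, [100 * alpha / 2,
                                         100 * (1 - alpha / 2)], axis=0)
    # Invert the studentized pivot: note t_hi sets the LOWER bound
    return mean - t_hi * se, mean - t_lo * se
```

Unlike a plain percentile bootstrap, the studentized variant adapts to trial-to-trial variance and gives more accurate coverage for the small trial counts typical of EEG averages.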
Additionally, we calculated topographical maps of the grand averages for each condition and their differences. This approach closely follows the analysis described in [46]: differences were calculated by subtraction (e.g. cond(A)-cond(B)) and visualized using the EEGLAB toolbox [50]. To assess significant differences between conditions we used non-parametric paired-sample two-tailed permutation tests based on t-statistics (alpha = 0.05) [51]: in steps of 0.125 s we performed individual tests per time point and channel. Over 5000 permutations, we applied t-statistics, extracted the maximum t-statistic (t-max) for each permutation and generated a t-max reference distribution which is inherently adjusted for false discoveries [52,53]. We eventually visualized significantly different channels in the topographical difference plots between conditions.
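The t-max procedure can be sketched as follows: paired differences are randomly sign-flipped per permutation, and the maximum absolute t-value across all channel/time-point tests forms the reference distribution whose upper quantile serves as the family-wise threshold. Function name and argument layout are illustrative:

```python
import numpy as np

def tmax_permutation_test(diff, n_perm=5000, alpha=0.05, rng=None):
    """Paired two-tailed permutation test with t-max correction (sketch).

    diff: (n_subjects, n_tests) paired differences (cond(A) - cond(B)),
    one column per channel/time-point combination. Returns a boolean mask
    of significant tests, family-wise corrected via the max statistic.
    """
    rng = np.random.default_rng(rng)
    n = len(diff)

    def t_values(x):
        return x.mean(axis=0) / (x.std(axis=0, ddof=1) / np.sqrt(n) + 1e-12)

    t_obs = t_values(diff)
    t_max = np.empty(n_perm)
    for p in range(n_perm):
        # Under H0 the sign of each subject's difference is exchangeable
        signs = rng.choice([-1.0, 1.0], size=(n, 1))
        t_max[p] = np.abs(t_values(diff * signs)).max()
    threshold = np.percentile(t_max, 100 * (1 - alpha))
    return np.abs(t_obs) > threshold
```

Because the threshold is taken from the distribution of the maximum over all tests, any channel/time point exceeding it is significant at the chosen alpha with built-in multiple-comparison control.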

Single trial classification
The analysis of the single-trial classification results followed two consecutive steps: first, we evaluated for each participant the results of the 5 × 5 cross validation on the calibration set. Second, we evaluated the online results (evaluation set). Figure 2 (left) shows the grand average of the best performing classification model and its time point of maximum accuracy, which was 56.5% at around 1 s after the movement onset. The confusion matrix in figure 2 (middle) depicts the grand average of the participant-specific peak accuracy (row-wise normalized). On average, true positive rates (normalized true positives in percent, TPR) are between 54% and 64% (supination highest with 63.8%). False positive and false negative rates (normalized false positives/negatives in percent, FPR/FNR) between grasps (finger joints versus finger joints) are around 25%, whereas they are lower for grasp versus wrist supination comparisons (finger joints versus wrist joints), at around 19%. On the right side of figure 2 we show the confusion matrix for the performance of the BCI. In comparison to the calibration phase, the TPRs decreased, leading to a decreased classification performance for all conditions, most notably for the lateral grasp condition, which decreased by more than 20% in TPR. Table 1 lists all classification results of both calibration and evaluation phases on the participant level. In the calibration phase, all participants scored better than the chance level of 44.4% (adjusted Wald interval, alpha = 0.05 [54,55], corrected for multiple comparisons, n = 48). Peak accuracies ranged from 47% (e.g. participant S13) up to 76.5% (participant S05) and were achieved in the first second after the movement onset (STD ± 0.35 s, table 1, 2nd column).
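The adjusted-Wald chance level used here [54,55] can be sketched as below. This computes a one-sided upper confidence bound on chance-level performance; whether a one- or two-sided bound and which multiple-comparison correction the authors applied for a given n is an assumption of this sketch:

```python
import numpy as np
from scipy.stats import norm

def adjusted_chance_level(n_trials, n_classes=3, alpha=0.05):
    """Upper confidence bound of chance performance via the adjusted
    Wald ("add two successes and two failures") interval (sketch).

    A classifier is considered better than chance if its accuracy
    exceeds this bound. One-sided bound, no multiple-comparison
    correction; both are assumptions of this illustration.
    """
    x = n_trials / n_classes               # expected correct trials by chance
    p = (x + 2) / (n_trials + 4)           # adjusted proportion
    se = np.sqrt(p * (1 - p) / (n_trials + 4))
    return p + norm.ppf(1 - alpha) * se

# 135 online trials, 3 classes -> approximately 0.404 (40.4%)
```

With 135 evaluation trials and three classes this reproduces the reported online chance level of about 40.4%.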
For the online classification of the evaluation phase we no longer relied on the movement onset, but rather on a combination of the visual stopping cue of the paradigm, the participant's individual reaction time and a technical network delay. The overall delay (participant's reaction time to the CUE plus the technical delay) and the final classification time point for the online evaluation can be found in columns 4 and 5 of table 1. While the technical delay was 0.11 s ± 0.01 s, the reaction time to the cue was participant dependent. The last two columns of table 1 show the results of the online evaluation. With the exception of S13, all participants scored significantly better than chance (chance level 40.4%, adjusted Wald interval, alpha = 0.05). In an additional analysis we created an offline BCI simulation and used as the time-locking point not the estimated movement onset as in the online BCI, but the real movement onset calculated from the data glove data. Results indicate that when time-locking on the real movement onset, a significant (Wilcoxon rank sum test, p < 0.05) performance increase of about 4.5% could be reached. Detailed results can be found in supplementary section 3.

Figure 3 depicts the MRCPs in the low frequency range from 0.3 to 3 Hz. We show the grand average MRCPs for each condition over all participants as well as the 95% confidence interval of the mean calculated using non-parametric t-percentile bootstrap tests. We show the MRCPs on the channels over the central motor cortex (C1, Cz, C2). We defined the time window of interest as [−2 2] s with respect to the movement onset for both calibration and evaluation data sets. Furthermore, we investigated the evaluation data set further when time-locking to the visual CUE, with a time window of interest of [−2 2] s.

Movement-related cortical potentials (MRCPs)
For both data sets and time-locking points, a negative deflection (Bereitschaftspotential) [17] can be observed starting before the movement onset (strongest for the lateral grasp condition), followed by a positive rebound around 1 to 1.5 s after the movement onset. This rebound is more pronounced in the evaluation set. On a grand average basis, no significant differences between conditions can be observed. Apart from that, we found a strong lateralization effect towards the side contralateral (left) to the executing (right) hand.
For the evaluation data set in particular (figure 3, rows 2 & 3), the confidence intervals for all conditions are broader, especially around 0.8 s after the movement onset, which falls in line with the time period in which feedback was presented to the participants. When time-locking on the visual CUE rather than the real movement onset, the negative deflection of the Bereitschaftspotential shifts by 0.3-0.4 s, which is explained by the reaction time of the participants, and its amplitude is diminished by around 1 µV. The positive rebound effect remains the same. Figure S1 (see supplementary material) shows the grand average for each condition on the topographical level for the calibration and evaluation data sets. Time = 0 s represents the movement onset acquired using data of the data glove.
Additionally, we investigated both calibration and evaluation data sets for differences between conditions on a topographical level. We calculated these differences by subtraction of two conditions (e.g. cond(A)-cond(B)). Figures 4 and 5 show these condition-based differences for the calibration and evaluation data sets in the range of [−0.5 1.5] s with respect to the movement onset. Black dots on the topographical plots denote channels which show significant differences between conditions (assessed using permutation tests based on t-statistics, p < 0.05 [51]). We also analysed each condition on a topographical level separately (see supplementary figure S1).
For the calibration data set, significant differences can be found in all condition combinations, especially between the palmar and lateral grasp conditions (row 1): before the movement onset (−0.25 s), significant differences emerge over central-parietal areas (channels CCP3h, CP2). After the movement onset (0.125 to 1 s), a lateralized pattern emerges at the primary motor cortex at channel locations C1 and C3. For combinations of either grasp with wrist supination, we found a pattern around 0.5 s after the movement onset over central/central-parietal areas. These differences become significant for both grasping conditions versus wrist supination on the contralateral side at location CP3h.
Looking at the topographical difference plots of the evaluation data set, the difference patterns are in general similar to those of the calibration data set, but less pronounced. Hence, for the palmar versus lateral grasp comparison, the differences in the contralateral areas of the motor cortex are no longer significant. In contrast, for the grasp conditions versus wrist supination, the differences found in the evaluation set are similar to the findings in the calibration set in both pattern and timing. For palmar grasp versus wrist supination, additional significant differences in central frontal channels (Fz, FFC2h) can be found.

Discussion
In this study we could show the successful online decoding of three upper limb movements (palmar grasp, lateral grasp and wrist supination) using low-frequency time-domain features of the human EEG. For all 15 study participants we gathered a set of calibration data to determine the best performing time point and classification model. Offline analysis of these data yielded a peak accuracy of about 60% (±7.2%) (three-condition problem, adjusted significance threshold 44%) about 1 s after the detected movement onset. When using the obtained classification model in the subsequent online BCI scenario, 14 out of 15 participants retained better-than-chance performance, with an average of 65 correctly classified trials out of 135 (48% correct trials, adjusted chance level 40%, alpha = 0.05). The underlying movement-related cortical potentials show no indications of being masked by visually evoked potentials (VEPs) at the movement onset. Moreover, significant differences in the calibration data between conditions in the first 0.5 s after the movement onset are mainly located over contralateral sensorimotor areas. These differences are retained to a large extent in the data gathered from the evaluation phase. In either case, these differences lie within the same time period which was used to train the participant-specific classification models.

Movement-related cortical potentials
Contrary to our initial approaches [24,25,46,48], we refrained in this study from using non-causal (zero-phase) filtering in order to keep the preprocessing homogeneous between offline and online application. However, when plotting the EEG potentials, one needs to be aware that this processing does not account for additional filter effects such as phase shifts, which have a potential influence on the signal.
Analysis of the grand average of the MRCPs shows a similar morphology for all three investigated conditions: shortly before the movement onset a negative deflection from the baseline starts, culminating in a negative peak which is characteristic for the Bereitschaftspotential [16,17]. In this case, the negative peak occurs after the movement onset rather than before, which we attribute to a delayed onset detection by the data glove.
The peak negative deflection is lateralized (lateralized readiness potential (LRP)) [46,56], meaning that the negative deflection is stronger on the side contralateral to the executing (right) hand. Following the negative deflection, a strong positive swing can be observed, which peaks around 1 s after the movement onset and is more pronounced in both grasp conditions (see supplementary figure 1) than in the wrist supination condition. Furthermore, this positive swing is more pronounced in the evaluation data set than in the calibration data set.
Though we did not encounter this positive swing in previous works [24,25,46], we attribute it to an effect of the visual paradigm and feedback presentation as well as the causal filtering approach. Looking at figure 3, 2nd row, the confidence interval becomes considerably broader around 1 s after the movement onset, especially for channels Cz and C1, which we also attribute to the feedback presentation.
We were also interested in changes in the MRCP morphology when time locking on the CUE (the robotic arm stops before the interaction with the objects) rather than the calculated movement onset from the data glove (see figure 3 rows 2 and 3 for comparison): our analysis shows, apart from a delayed negative peak of the Bereitschaftspotential (due to reaction time to the CUE), that the morphology of the MRCPs is still preserved, with only a minimal decrease in grand average amplitude.
Naturally, we were also interested in the differences between conditions. Our analysis of the calibration data in channel space shows that the main differences can be found within the first 0.5 s after the movement onset, mainly over the contralateral primary motor cortex (locations C3, C1). Only for the grasp versus grasp comparison can significant differences already be found 0.25 s before the movement onset. These findings are in line with the results of Ofner et al [25] and Iturrate et al [23], who both report similar findings regarding effect timing and location. Moreover, we could show that these differences are still present in the online experiment, though the patterns are diminished. Especially for the grasp versus grasp comparison, no significant differences can be found. On the other hand, we see additional differences over the frontal area (Fz) for the grasp conditions versus supination around 0.5 s after the movement onset. Though they only become significant for palmar grasp versus supination, these differences can also be observed in the lateral grasp versus supination comparison.
In summary, we found significant differences between the different grasp conditions within the first 0.5 s after the movement onset, mainly over the contralateral sensorimotor areas. This is in line with the findings of Agashe et al [28] for grasps (five grasping tasks, information content peaking around 250 ms), as well as those of Ofner et al [25], who investigated a set of six upper limb movements.

Single trial classification
Our offline results for the calibration data show a movement decoding performance of about 60% over all participants (chance level ~44%, adjusted Wald interval, alpha = 0.05). These findings are within the same range as the performances achieved in [23-25, 46, 57, 58]. However, a direct comparison is difficult since the number of conditions, the number of trials per condition and especially the paradigms differ greatly.
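To make the chance-level criterion concrete, the adjusted Wald (Agresti-Coull) upper confidence bound on chance accuracy can be sketched as follows. This is a minimal illustration assuming a one-sided bound at the given alpha; the exact parameterization used in the study may differ:

```python
from statistics import NormalDist

def adjusted_chance_level(n_trials, n_classes, alpha=0.05):
    """Upper confidence bound of chance accuracy via the adjusted
    Wald (Agresti-Coull) interval: a classifier is considered better
    than chance only if its accuracy exceeds this bound."""
    z = NormalDist().inv_cdf(1 - alpha)      # one-sided critical value
    hits = n_trials / n_classes              # expected hits at chance
    n_adj = n_trials + z ** 2                # adjusted trial count
    p_adj = (hits + z ** 2 / 2) / n_adj      # adjusted hit rate
    half_width = z * (p_adj * (1 - p_adj) / n_adj) ** 0.5
    return p_adj + half_width

# 135 online trials, 3 conditions -> bound close to 0.40
print(round(adjusted_chance_level(135, 3), 2))
```

With 135 trials and three conditions this reproduces the ~40% threshold quoted above; with fewer calibration trials the bound rises towards the ~44% reported for the offline analysis.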
Peak accuracies were found on average one second after the movement onset. Our classification model was trained on features of the preceding 1 s time window, which includes the time frame in which we found significant differences between conditions over the contralateral sensorimotor areas. Analysis of the grand average confusion matrix at the participants' offline peak accuracies showed that the false positive and false negative rates between grasps (finger joints versus finger joints) were higher than for grasps versus wrist supination (finger joints versus wrist joints). This confirms the findings of Ofner et al [25], who also found these error rates to be highest for conditions involving the same joint (e.g. hand open versus hand close; wrist pronation versus wrist supination).
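As an illustration of this windowing, low-frequency time-domain features can be obtained by sub-sampling the band-pass filtered EEG within the 1 s window ending at the classification time point. The sketch below is not the study's exact pipeline; the sampling rate, sub-sampling step and filtering are placeholder assumptions:

```python
def window_features(eeg, end_idx, fs=256, win_s=1.0, step=16):
    """Sub-sampled amplitudes from the 1 s window ending at end_idx.

    eeg     -- list of per-channel sample lists (assumed already
               band-pass filtered to the low-frequency range)
    end_idx -- sample index of the classification time point
    Returns a flat feature vector (list of floats).
    """
    n = int(win_s * fs)                       # samples in the window
    feats = []
    for channel in eeg:
        feats.extend(channel[end_idx - n:end_idx:step])
    return feats

# Two channels, 2 s of data at 256 Hz -> 16 sub-samples per channel
eeg = [list(range(512)), list(range(512))]
print(len(window_features(eeg, end_idx=512)))  # 32 features
```

Sliding `end_idx` over the trial and classifying each resulting feature vector yields the time-resolved accuracy curve from which the peak accuracy is read off.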
In the online evaluation, participants scored on average 65 out of 135 trials correctly (~48%, adjusted chance level 40%, alpha = 0.05). Fourteen out of fifteen participants scored higher than chance; however, compared to the offline results, performance decreased by about 11%. Looking at the true positive rates (TPRs) of the confusion matrix, we see that the TPRs for the lateral grasp and supination conditions dropped by 10% to 20%. Furthermore, true positive and false negative rates for all conditions are now in the same range. When transferring an offline calibrated classification model to online use, a certain drop in performance is to be expected [59]. However, in this cue-based online scenario, several additional factors have to be taken into account: (i) MRCPs are a time- and phase-locked phenomenon [16,17]. For the online BCI scenario, we estimated the time-locking point using participant-specific behavioural data (the timing between the stopping point of the robotic arm and the participant's actual movement onset in the calibration data), an estimate subject to a certain variance. Although we attempted to compensate by smoothing the classification output, the classification remained prone to deviations in the exact timing of the task execution.
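Such output smoothing can be illustrated with a causal moving average over the frame-wise class probabilities. This is a minimal sketch: the study does not specify the smoother, so the window length and averaging scheme here are illustrative assumptions:

```python
def smooth_probabilities(probs, window=5):
    """Causal moving-average smoothing of frame-wise class
    probabilities: each frame is replaced by the mean of the last
    `window` frames, damping single-frame misclassifications.

    probs -- list of per-frame probability lists, one entry per class
    """
    n_classes = len(probs[0])
    smoothed = []
    for t in range(len(probs)):
        lo = max(0, t - window + 1)           # only past frames (causal)
        frames = probs[lo:t + 1]
        smoothed.append([sum(f[c] for f in frames) / len(frames)
                         for c in range(n_classes)])
    return smoothed

# A single outlier frame (t = 5) no longer flips the decision
probs = [[1.0, 0.0]] * 5 + [[0.0, 1.0]] + [[1.0, 0.0]] * 5
out = smooth_probabilities(probs)
print(out[5])  # [0.8, 0.2] -> argmax is still class 0
```

Because the average is causal, the smoother adds no look-ahead latency beyond the window itself, at the cost of blurring the decision across neighbouring frames when the time-locking estimate is off.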
To fully understand the impact of using this estimated onset, we performed an offline BCI simulation using the evaluation data set (see supplementary chapter 3 and table S2): we replaced the estimated onset with the real movement onset extracted from the data glove and recalculated the classification accuracy using the same classification model. The results indicate that the overall classification improved significantly (Wilcoxon rank sum test, p < 0.05) from the previous 65 to 71 out of 135 correctly classified trials (45 trials per condition), an average performance increase of about 4.5%. We realize that this offline simulation cannot account for feedback-dependent effects, such as showing more positive feedback due to improved classification, or improved motivation; however, it underlines the importance of an adequate time-locking point for BCI classification.
(ii) With the presentation of feedback to the participants, we introduced an additional variable potentially influencing the performance of the participant-specific classification model. The analysis of the MRCPs for the evaluation data set shows that the positive deflection starting around 0.5 s after the movement onset is more pronounced than in the calibration phase. Additionally, channels over central frontal areas (Fz, FFC2h) show increased activity; both factors potentially influence the classification performance. Further studies need to investigate whether this effect can be attributed to, e.g., a change in state of mind (excitement, pressure to perform) or to the feedback presentation.
In either case, the BCI implemented in this study relies on a fixed classification model based on the calibration data. There is evidence that co-adaptive training approaches can remedy the performance loss from offline to online BCI models [45,47,60-62]. In a co-adaptive BCI concept, not only the machine learning algorithm is acknowledged as a 'learner' but also the users operating the BCI: both parties are engaged in a closed-loop mutual learning environment. A co-adaptive BCI collects data online and adapts its classification models during operative use, while users adapt to the feedback received from the BCI. In this way, performance loss due to changes in brain patterns (e.g. caused by feedback presentation or EEG non-stationarities) could be attenuated [63,64]. However, to our knowledge, the co-adaptive training approach has only been applied to BCIs using non-phase-locked, oscillation-based features, and it remains to be seen whether this concept can be seamlessly translated to MRCPs.
So far, only a few non-invasive EEG studies have successfully shown the online decoding of upper limb movements/grasps using MRCPs as features for discrimination.
Ofner et al [38] showed, in a self-paced proof-of-concept online approach with one SCI end user, the successful discrimination between opening and closing the hand. Unfortunately, a direct comparison is not possible due to substantial differences in approach and paradigm (e.g. self-paced versus cue-paced). When comparing more generally with the online performance of BCIs, e.g. the oscillation-based approaches using repetitive mental tasks presented in [63,65-67], the results of this study lie below the reported average peak of 75% for two conditions (see [59]).

Study limitations
In our current study, we show in a cue-based scenario that the online decoding of grasp and hand movements is possible. However, the approach still entails considerable constraints and challenges before a stable BCI control for robotic arms or upper limb motor neuroprostheses is conceivable.
For training the classification model, we still relied on the real movement onset, a parameter which is not necessarily available for the targeted end user population. While we compensated for this in the evaluation phase by using the CUE as the time-locking point, this contributed to the decreased performance.
The main challenge remains improving the decoding performance of the BCI, especially when exploiting the low-frequency time domain for discriminable features. Though the results of this study confirm that discriminable information can be found in MRCPs and transferred to an online BCI, the performance is rather low. Studies by Iturrate et al [23], Vuckovic et al [68] and Jochumsen et al [26,37] have already shown in offline analyses that additional discriminable information between grasp and hand movements can also be found in the alpha and beta bands [69]. We have previously investigated the combination of time domain features extracted from MRCPs with frequency domain features from the alpha and beta bands [48]: though it did not have a substantial effect on grasp versus grasp classification, it led to an increased decoder performance in detecting movement against the rest condition. In the current study, we used a cue-guided protocol, which allowed us to have a fixed time-locking point rather than detecting the occurrence of the grasp in an asynchronous way. In a daily life scenario, these reference points would be absent, and any applied classification model would have to continuously process the data to detect upper limb movement intention (e.g. a continuous classification of movement versus rest). However, this was not the subject of the current study, since our goal was to show the feasibility of grasp discrimination using EEG signals.

Transfer to end users
We conducted this study as a precursor to investigating MRCP-based BCI control for severely motor impaired end users (e.g. users with a high spinal cord injury). Therefore, one of our main interests was to determine whether the discrimination of hand/arm movements is possible in an online BCI control scenario with healthy participants. Having shown the feasibility of the approach in healthy participants, we now want to discuss its transfer to the final target population.
Firstly, it is imperative to assess the movement capabilities of the potential neuroprosthesis users, since their residual upper limb functions vary [36]. In case of no residual grasp function, we believe that using low-frequency time-domain EEG as a control signal could offer a possibility for intuitive robotic arm or neuroprosthesis control.
Secondly, while in our study we instructed the participants to execute the movements, this is not possible for the targeted end user group. Recent findings suggest that attempted movements exhibit a neural representation similar to that of executed movements and can likewise be decoded from EEG [38,70,71]. It is therefore necessary to evaluate the performance of the online decoder while end users attempt to perform the upper limb movements.
Additionally, combinations of movement execution and movement attempts could be explored, depending on the residual functions of the user: for instance, combining a non-functional hand/grasp movement with a movement the end user is still capable of, e.g. a reaching movement. A number of studies in healthy participants have already shown offline that different reach-and-grasp actions can be discriminated using EEG [23,24,26,46,58]. In this way, end users would execute the reach and attempt to perform the designated grasp/supination.
Thirdly, the simulation environment presented in this study can be useful for end users since, compared to the presentation of abstract cues, it allows a smoother transition between virtual and daily life scenarios. In the simulation environment, participants interact with virtual objects to perform daily life actions, which we consider to be more immersive. While we did not investigate the effect of training over several sessions, it would be interesting to use this simulation environment for multi-session training with end users and to test whether such training has an impact on the overall performance in free control of, e.g., a neuroprosthesis. It is also relevant to mention that the simulation is not limited to the three movements investigated in this study: it encompasses more objects for a larger set of upper limb movements (including additional grasps and elbow movements), which allows adaptation to the users' own needs and final application.
Despite these challenges and limitations, we have already started to assess the feasibility of our findings in a group of tetraplegic participants: within the MoreGrasp (www.moregrasp.eu) feasibility study, we assess their capabilities of using a BCI to control an upper limb motor neuroprosthesis in several stages [36,40,72]. Analogous to the current study, they perform singular, attempted hand movements to generate control signals for the BCI. In the last stage, study participants will train with their mobile, customized BCIs at home, using a tablet version of the simulation environment evaluated in the current study. Our initial findings confirm that attempted movements can also be used for decoding ([36,72]; analogous to Ofner et al [38]).

Conclusion
In this study, we have successfully shown the online decoding of two grasps and one wrist supination movement using low-frequency time domain features of the human EEG. In the BCI scenario, 14 out of 15 healthy participants achieved a decoding accuracy higher than chance level (three conditions, 40%, adjusted Wald interval, alpha = 0.05 [54,55]), with an average accuracy of 48%. The underlying EEG correlates of the acquired calibration data show significant differences over the contralateral central sensorimotor areas, which are retained to a large extent in the data acquired during online BCI use. These findings can potentially contribute to the development of a more natural and intuitive BCI-based control modality for assistive devices such as upper limb motor neuroprostheses for people with motor impairments.