
Towards psychologically adaptive brain–computer interfaces


Published 14 November 2016 © 2016 IOP Publishing Ltd
Citation: A Myrden and T Chau 2016 J. Neural Eng. 13 066022. DOI: 10.1088/1741-2560/13/6/066022


Abstract

Objective. Brain–computer interface (BCI) performance is sensitive to short-term changes in psychological states such as fatigue, frustration, and attention. This paper explores the design of a BCI that can adapt to these short-term changes. Approach. Eleven able-bodied individuals participated in a study during which they used a mental task-based EEG-BCI to play a simple maze navigation game while self-reporting their perceived levels of fatigue, frustration, and attention. In an offline analysis, a regression algorithm was trained to predict changes in these states, yielding Pearson correlation coefficients in excess of 0.45 between the self-reported and predicted states. Two means of fusing the resultant mental state predictions with mental task classification were investigated. First, single-trial mental state predictions were used to predict correct classification by the BCI during each trial. Second, an adaptive BCI was designed that retrained a new classifier for each testing sample using only those training samples for which predicted mental state was similar to that predicted for the current testing sample. Main results. Mental state-based prediction of BCI reliability exceeded chance levels. The adaptive BCI exhibited significant, but practically modest, increases in classification accuracy for five of 11 participants and no significant difference for the remaining six despite a smaller average training set size. Significance. Collectively, these findings indicate that adaptation to psychological state may allow the design of more accurate BCIs.


1. Introduction

Brain–computer interfaces, or BCIs, provide a potential means of communication and environmental control for individuals with disabilities [1]. Most current BCIs employ electroencephalography (EEG) to record electrical activity on the cerebral cortex, providing a means by which changes in cognitive activity can be monitored [2]. EEG-BCIs can detect fluctuations in electrical activity that are characteristic of certain cognitive events. For example, P300 and SSVEP BCIs detect the modulation of attention [3, 4] while task-based BCIs detect the performance of mental tasks such as motor imagery [5]. When these cognitive events are reliably detected, BCI users can employ them to control spellers [6], mobility devices [7], and other systems [8].

Practical usage of BCIs has been hindered by their instability. EEG signals are non-stationary both between and within sessions [9]. The former type of non-stationarity can occur due to small variations in electrode impedance and positioning from day to day or changes in the experimental protocol between training and testing sessions, while the latter may be caused by changes in underlying mental state during BCI operation (e.g. the fatigue and attention levels of the BCI user) [9, 10]. This combination of between- and within-session factors can cause inconsistent BCI performance. However, adaptive BCI algorithms may be capable of mitigating the effects of such non-stationarities.

Previous adaptive BCI research has focused primarily on between-session non-stationarity. For example, Shenoy et al explored several approaches for adapting an LDA classifier based on data recorded during each new testing session [9]. Li et al investigated the usage of covariate shift adaptation to pursue a similar goal without the need for labeled data from the new testing session [11]. More recently, Nicolas-Alonso et al have used kernel discriminant analysis to gradually enhance classification during testing sessions [12]. Comparatively little attention has been devoted to methods of adapting to within-session non-stationarity, although McFarland et al have demonstrated the potential value of within-session adaptation for task-based BCIs [13, 14].

It has been hypothesized that psychological conditions such as fatigue and attention influence BCI performance [15, 16]. These latent mental states can change rapidly during BCI usage and can affect BCI performance both directly, by causing EEG signal fluctuations that increase the difficulty of detecting the required control task (e.g. motor imagery), and indirectly, by inhibiting the BCI user's ability to effectively complete the required control task. Recent work suggests that at least three mental states—fatigue, frustration, and attention—have statistically measurable effects on BCI performance [17]. However, adaptation to these states is complicated by their subjective nature and the difficulty of monitoring them with high accuracy.

Unlike the intentional cognitive tasks typically detected by BCIs, changes in mental states such as fatigue are involuntary. Systems that are used to monitor such involuntary changes in cognitive activity have been defined as passive BCIs to differentiate them from the more traditional active and reactive BCIs [18]. Some recent examples of passive BCI research include monitoring mental workload [19, 20] and affective state [21]; detecting the perceived loss of control over a system [22]; and detecting interaction errors by active BCIs [23]. In our own work, we have demonstrated the ability to detect coarse modulations in fatigue, frustration, and attention during the performance of mental tasks similar to those used by many BCIs [24]. Combining this passive detection of mental state with a traditional task-based BCI may allow the design of a hybrid BCI that adapts to changes in psychological state.

A number of research studies have investigated between-session BCI adaptation. Sun and Zhang adaptively updated feature extractors based on new data from each session [25]. Shenoy et al investigated several methods of rebiasing and retraining classifiers based on online data [9]. Sugiyama et al derived an importance-weighted covariate shift method to compensate for shifts in the feature distributions between training and testing data [26], a method replicated by numerous other research groups [11, 27, 28]. Vidaurre et al used each sample from a testing session to update an adaptive linear discriminant analysis (LDA) classifier [29], and Llera et al applied similar algorithms to multi-class BCIs (i.e. those that use more than two control tasks) [30]. Arvaneh et al used both supervised and unsupervised data space adaptation to transform EEG data from the target space (i.e. the testing session) to the source space (i.e. the training session) [31]. Samek et al investigated the development of a stationarity common spatial patterns algorithm to extract features that are invariant to signal non-stationarity [32]. However, these approaches mainly focus on the macroscopic differences in EEG signals recorded on different days and using different experimental protocols. Comparatively less attention has been devoted to within-session adaptation [29].

Between-session adaptation typically assumes that the separability between BCI classes (e.g. left and right hand motor imagery) is not lost between sessions, but rather, the distributions of each class migrate within the feature space, necessitating an updated classifier [9]. It is not clear that this assumption holds for within-session adaptation, as certain psychological conditions (e.g. complete loss of attention) may preclude accurate BCI classification. This motivates two approaches to psychological BCI adaptation. In the first, predicted mental state can be used to assess the risk that a given BCI trial will be classified incorrectly by a static, or non-adaptive, BCI. This approach assumes that separability is irreversibly lost under certain conditions. In the second, predicted mental state can be used to directly adapt the BCI, implicitly assuming that separability is retained but the classifier must be retrained to recognize it.

This paper proposes and evaluates a psychologically adaptive two-class EEG-BCI based on personalized mental tasks. This BCI encompasses both a passive BCI that predicts the current levels of fatigue, frustration, and attention and an active BCI that differentiates between two mental tasks. The effects of using predicted mental state to assess the likelihood of task misclassification and to directly increase the accuracy of task classification are independently investigated. These approaches were compared to a typical non-adaptive BCI that ignores psychological state.

2. Methods

2.1. Protocol

Eleven able-bodied participants (two male, average age 27.4 years) completed five sessions, each approximately one hour in duration. Each session occurred on a different day. The first two sessions were offline training sessions while the remaining three sessions were online testing sessions. Participants had normal or corrected-to-normal vision and refrained from consuming caffeine for four hours prior to each session. Participants provided written informed consent, and the experiment was approved by the Holland Bloorview Research Ethics Board.

2.2. Training sessions

During each training session, participants completed 30 five-second trials of each of five mental tasks—four active tasks (motor imagery, mental arithmetic, music imagery, and word generation) and an unconstrained rest task. For each participant, two-class BCIs were trained to differentiate each of the four active tasks from the unconstrained rest task. One active task was selected for each participant on the basis of participant preference and the accuracy of each BCI. Eight participants selected the word generation task, two selected the motor imagery task, and one selected the music imagery task. The BCI that differentiated the selected task from the unconstrained rest task was used during the testing sessions. Full details regarding the protocol of the training sessions can be found in [17].

2.3. Testing sessions

During each testing session, participants used the selected BCI to play a simple maze navigation game. Participants were required to navigate from intersection to intersection within a simple maze by performing their active task when the direction they wanted to move was highlighted. Likewise, participants performed the rest task when any other direction was highlighted. BCI performance was intentionally hindered by omitting any between-session adaptation to ensure that significant changes in psychological state were induced. Participants self-reported their perceived levels of fatigue, attention, and frustration on a continuous scale between 0 and 1 immediately prior to each BCI decision. Full details regarding the protocol of the testing sessions, the online performance of the BCI, and the observed relationships between BCI performance and mental state can be found in [17].

2.4. Data acquisition, signal processing, and feature extraction

During each session, electrical activity on the cerebral cortex was recorded from 15 locations (Fz, F1, F2, F3, F4, Cz, C1, C2, C3, C4, CPz, Pz, P1, P2, and POz by the international 10-20 system [33]) using a B-Alert X24 wireless EEG headset (Advanced Brain Monitoring, Carlsbad, CA). Each recorded signal was band-pass filtered between 2 and 30 Hz. Independent component analysis was used to remove signal artefacts caused by eye movements and blinking [34].
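The band-pass filtering step can be sketched as below. This is a minimal illustration, not the authors' pipeline: the 256 Hz sampling rate and fourth-order Butterworth design are assumptions (neither is stated in the paper), and the ICA artefact-removal stage is omitted.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(eeg, fs, lo=2.0, hi=30.0, order=4):
    """Zero-phase band-pass filter applied channel-wise.

    eeg: (n_channels, n_samples) array; fs: sampling rate in Hz.
    Filter order and design are illustrative assumptions.
    """
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, eeg, axis=-1)

fs = 256                                       # assumed sampling rate
t = np.arange(fs * 5) / fs                     # one five-second trial
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 50 * t)
y = bandpass(x[np.newaxis, :], fs)             # 50 Hz component is attenuated
```

The 10 Hz component survives while the 50 Hz component, outside the 2–30 Hz pass band, is suppressed.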

Each five-second trial during which either the rest or the active task was performed was extracted from the recorded EEG signals. Only data from the testing sessions were analyzed, as the data from the training sessions did not have corresponding mental state data. A frequency-domain feature set was used to characterize each trial due to the short trial duration and the importance of cortical activity within the four major EEG frequency bands (i.e. delta, theta, alpha, and beta) for monitoring mental state [35, 36]. For each trial, a fast Fourier transform was used to convert the recorded EEG signals into the frequency domain. The frequency spectra for each signal were compressed by computing the total power within each non-overlapping 1 Hz frequency range from 0–1 Hz to 29–30 Hz. Each of these powers was used as a feature, yielding a feature set with 450 features (the power within 30 frequency ranges from each of the 15 electrodes). This feature set was further compressed during cross-validation using a feature clustering algorithm that derived participant-specific frequency bands for each electrode [37]. Note that this feature set was used for mental state prediction, BCI reliability prediction, and the adaptive BCI. However, different features were chosen from the feature set for each of these applications.
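The 1 Hz band-power feature set described above can be sketched as follows. The use of raw FFT power with no windowing is an assumption (the paper does not specify the estimator), and the function name is illustrative.

```python
import numpy as np

def band_power_features(trial, fs):
    """Total power in each non-overlapping 1 Hz bin from 0-1 Hz to 29-30 Hz.

    trial: (n_channels, n_samples) EEG segment.
    Returns a (n_channels * 30,) feature vector.
    """
    n = trial.shape[-1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(trial, axis=-1)) ** 2   # raw power spectrum
    feats = []
    for ch in range(trial.shape[0]):
        for f in range(30):
            mask = (freqs >= f) & (freqs < f + 1)    # one 1 Hz bin
            feats.append(psd[ch, mask].sum())
    return np.array(feats)

fs = 256                                   # assumed sampling rate
trial = np.random.randn(15, fs * 5)        # 15 channels, five-second trial
x = band_power_features(trial, fs)         # 15 electrodes x 30 bins = 450 features
```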

2.5. Mental state prediction

Data from the online sessions were analyzed offline to investigate the ability to predict self-reported levels of fatigue, frustration, and attention based on the recorded EEG data. Previous work on detection of these states focused on binary differentiation between low and high levels of each state [24]. In contrast, during this study each state was self-reported on a continuous scale between 0 and 1. This provided a more nuanced measure of mental state but also necessitated a supervised regression approach.

A 10 × 10 (runs × folds) repeated cross-validation was performed for each participant. Least-squares regression was performed using lasso regularization [38]. The quality of mental state prediction was then quantified by computing the Pearson correlation coefficients between the self-reported and predicted values of each mental state.
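The cross-validated lasso regression can be sketched with synthetic data standing in for the EEG features and self-reports; the regularization strength, data dimensions, and noise level here are illustrative, not the paper's.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 45))             # surrogate band-power features
w = np.zeros(45); w[:5] = 1.0              # a few informative features
y = X @ w + 0.5 * rng.normal(size=200)     # surrogate self-reported state

# One run of 10-fold cross-validation; the paper repeats this 10 times.
preds = np.zeros_like(y)
for tr, te in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    model = Lasso(alpha=0.1).fit(X[tr], y[tr])   # lasso-regularized least squares
    preds[te] = model.predict(X[te])

r, _ = pearsonr(y, preds)   # quality metric used in the paper
```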

2.6. Adaptive brain–computer interface

Two methods of adapting to changes in mental state were investigated: mental state-based reliability prediction and mental state-based classifier adaptation. For each approach, five features were selected for classification using a fast correlation-based filter (FCBF), while an LDA classifier was used to differentiate between rest and active tasks [39, 40].

2.6.1. Prediction of BCI reliability

This approach assumes that the predicted mental state contains predictive information regarding the success of mental task classification. An inner cross-validation was performed on the training data to predict the class of each training sample using LDA. Let ${x}_{n}\in {{\mathbb{R}}}^{d}$, d = 5, $n=1,\cdots ,N$ denote the nth training sample, where N represents the total number of samples. Projections ${p}_{n}\in {\mathbb{R}}$ for each sample were computed as:

Equation (1): ${p}_{n}={{\bf{w}}}^{{\rm{T}}}{x}_{n}+c$

Equation (2): $P=[{p}_{1}\ {p}_{2}\ \cdots \ {p}_{N}]$

where w and c, respectively, are the weight vector and constant specified by Fisher's linear discriminant [40]. Each training sample xn was classified as reliable (${r}_{n}=0$) or unreliable (${r}_{n}=1$) according to the following:

Equation (3): ${r}_{n}=0$ if ${C}_{n}=1$ and ${p}_{n}\geqslant h\cdot \mathrm{median}(|P|)$

Equation (4): ${r}_{n}=1$ if ${C}_{n}=1$ and ${p}_{n}\lt h\cdot \mathrm{median}(|P|)$

Equation (5): ${r}_{n}=0$ if ${C}_{n}=0$ and ${p}_{n}\leqslant -h\cdot \mathrm{median}(|P|)$

Equation (6): ${r}_{n}=1$ if ${C}_{n}=0$ and ${p}_{n}\gt -h\cdot \mathrm{median}(|P|)$

(The sign convention here assumes that active-task trials project positively and rest trials negatively under Fisher's discriminant.)

where ${C}_{n}=0$ indicates that the sample represented a rest task and ${C}_{n}=1$ indicates that the sample represented an active task. The threshold $h\in {\mathbb{R}}$ controlled the aggressiveness of the algorithm. For h = 0, training samples were classified as unreliable only if they were misclassified during the inner cross-validation. For $h\gt 0$, training samples were also classified as unreliable if they were classified correctly but were projected near the boundary between classes. For $h\lt 0$, training samples were only classified as unreliable if they were both misclassified and projected far from the boundary between classes. The threshold h was scaled by the median of the absolute magnitude of the projection vector P to achieve invariance to the magnitude of projections between participants.
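The reliability-labelling rule can be sketched as below: with h = 0 only misclassified trials are flagged, and with h > 0 near-boundary trials are flagged as well. The sign convention (rest trials projecting negatively) and the function name are assumptions.

```python
import numpy as np

def label_reliability(p, c, h):
    """Label each training sample reliable (0) or unreliable (1).

    p: LDA projections (assumed: rest projects negative, active positive);
    c: true class labels (0 = rest, 1 = active);
    h: threshold, scaled by the median absolute projection as in the paper.
    """
    margin = h * np.median(np.abs(p))
    signed = np.where(c == 1, p, -p)      # projection onto the "correct" side
    return (signed < margin).astype(int)  # r_n = 1 means unreliable

p = np.array([2.0, 0.1, -1.5, -0.05, 1.0])
c = np.array([1, 1, 0, 0, 0])
r0 = label_reliability(p, c, 0.0)   # flags only the misclassified rest trial
rh = label_reliability(p, c, 0.5)   # also flags the two near-boundary trials
```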

Due to the previously observed nonlinear relationships between mental state and BCI performance [17], a support vector machine (SVM) classifier was used to predict BCI reliability [40]. The radial basis function kernel was used, with an inner cross-validation to set appropriate values of σ and C. This classifier was trained using the set of three predicted mental states as a feature set and the training sample reliabilities $R=[{r}_{1}\ {r}_{2}\ \cdots \ {r}_{N}]$ as the set of target labels. The resultant classifier was used to predict BCI reliability for the testing set. The relationships between the three predictive algorithms (i.e. mental state detection; BCI classification; and prediction of BCI reliability) involved in this analysis are depicted conceptually in figure 1.
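The reliability predictor can be sketched as an RBF-kernel SVM over the three predicted mental states. The surrogate data and the parameter grid are illustrative only; note that scikit-learn parameterizes the kernel width as gamma rather than σ.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(3)
# Surrogate data: three predicted mental states per trial, with unreliability
# concentrated at high fatigue and low attention (an illustrative pattern only).
states = rng.uniform(size=(300, 3))     # [fatigue, frustration, attention]
r = ((states[:, 0] > 0.7) & (states[:, 2] < 0.4)).astype(int)

# RBF-kernel SVM; gamma and C set by an inner cross-validation, as in the paper.
grid = {"C": [0.1, 1, 10], "gamma": [0.1, 1, 10]}
clf = GridSearchCV(SVC(kernel="rbf"), grid, cv=5).fit(states, r)
acc = clf.score(states, r)              # training-set accuracy of the best model
```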


Figure 1. Conceptual overview of how each prediction algorithm is trained. For each learning algorithm, the input feature set is represented by the arrow arriving at the top of the associated rectangle (yellow lines), the labels by the arrow arriving on the side of the associated rectangle (green lines), and the outputs by the arrows departing the associated rectangle (orange lines). For each training sample, the mental task performed is predicted using LDA and mental state is estimated using regularized regression. The mental task predictions are used to estimate BCI reliability (see equations (3) through (6)) and the estimated mental state is used to predict these reliabilities. All three learning algorithms are then used for the testing data within each fold of cross-validation.


For the testing set, the task label targets are referred to as T, the task label predictions as $\hat{T}$, and the predictions of BCI reliability as $\hat{R}$. Each testing sample can be classified as either a true positive (TP), false positive (FP), true negative (TN), or false negative (FN) based on:

Equation (7): TP if ${\hat{r}}_{i}=1$ and ${\hat{t}}_{i}\ne {t}_{i}$

Equation (8): FP if ${\hat{r}}_{i}=1$ and ${\hat{t}}_{i}={t}_{i}$

Equation (9): TN if ${\hat{r}}_{i}=0$ and ${\hat{t}}_{i}={t}_{i}$

Equation (10): FN if ${\hat{r}}_{i}=0$ and ${\hat{t}}_{i}\ne {t}_{i}$

where ${t}_{i}$, ${\hat{t}}_{i}$, and ${\hat{r}}_{i}$ denote the entries of T, $\hat{T}$, and $\hat{R}$, respectively, for the ith testing sample, so that a correctly flagged unreliable (i.e. misclassified) trial counts as a true positive.

From these categorizations, six criteria were used to evaluate the efficacy of the reliability predictions—the sensitivity (Se), specificity (Sp), balanced classification accuracy (BCA), positive predictive value (PPV), negative predictive value (NPV), and overall predictive value (OPV), defined as:

Equation (11): $\mathrm{Se}=\mathrm{TP}/(\mathrm{TP}+\mathrm{FN})$

Equation (12): $\mathrm{Sp}=\mathrm{TN}/(\mathrm{TN}+\mathrm{FP})$

Equation (13): $\mathrm{BCA}=(\mathrm{Se}+\mathrm{Sp})/2$

Equation (14): $\mathrm{PPV}=\mathrm{TP}/(\mathrm{TP}+\mathrm{FP})$

Equation (15): $\mathrm{NPV}=\mathrm{TN}/(\mathrm{TN}+\mathrm{FN})$

Equation (16): $\mathrm{OPV}=\mathrm{NPV}-(1-\mathrm{PPV})=\mathrm{NPV}+\mathrm{PPV}-1$

We note that the OPV represents the difference between the classification accuracy of the BCI for 'reliable' samples and the classification accuracy of the BCI for 'unreliable' samples. A 10 × 10 repeated cross-validation was used to estimate BCA and OPV. Chance levels for each metric were computed using a permutation test by randomly permuting the training set reliability predictions R prior to training the SVM classifier that predicted BCI reliability [41]. This procedure was repeated 1000 times to establish a probability distribution for both BCA and OPV.
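The permutation test can be sketched generically as follows; here a fixed predictor is scored against repeatedly permuted labels to build the null distribution. The toy statistic is illustrative and stands in for retraining the SVM reliability classifier on permuted labels.

```python
import numpy as np

def permutation_p_value(observed_stat, stat_fn, labels, n_perm=1000, seed=0):
    """One-sided p-value: fraction of label permutations whose statistic
    meets or exceeds the observed value (with the +1 continuity correction)."""
    rng = np.random.default_rng(seed)
    null = np.array([stat_fn(rng.permutation(labels)) for _ in range(n_perm)])
    return (np.sum(null >= observed_stat) + 1) / (n_perm + 1)

# Toy illustration: accuracy of a fixed prediction vector against true labels.
y = np.array([0, 1] * 50)
yhat = y.copy(); yhat[:10] = 1 - yhat[:10]       # a 90%-accurate predictor
acc = lambda labels: np.mean(yhat == labels)
p = permutation_p_value(acc(y), acc, y, n_perm=200)
```

Because the observed 90% accuracy sits far above the permuted-label null distribution (centered near 50%), the resulting p-value is small.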

2.6.2. Adaptive classification

This approach assumes that the separability of the active and rest tasks is retained throughout mental state space, but that the class distributions of each task move within the feature space as the BCI user moves within mental state space. To address this, a new classifier was trained for each sample in the testing set using only the training samples closest to that test sample within mental state space. Mental state space can be visualized as a three-dimensional space within which each training or test sample can be located using the associated predictions of fatigue, frustration, and attention. However, previous work indicates that the importance of each of these states varies between individuals [17]. Consequently, it was necessary to individualize the concept of mental state space by removing mental states that did not affect classification for each participant. This was performed within each fold of cross-validation based on the training data alone.

A random subset of data, balanced between the rest and active classes, was sampled from the training set. A FCBF [39] was used to reduce the feature set to five features, and an LDA classifier was trained to differentiate the two classes. The weight vector for the classifier was used to project the training data to the single dimension used for classification. For a set of feature vectors ${\bf{X}}$ with corresponding class labels C, projections P, and mental state predictions for any arbitrary state M, scores for each class were computed as:

Equation (17): ${S}_{\mathrm{rest}}=|\rho ({P}_{C=0},{M}_{C=0})|$

Equation (18): ${S}_{\mathrm{active}}=|\rho ({P}_{C=1},{M}_{C=1})|$

where $\rho (\cdot ,\cdot )$ denotes the Pearson correlation coefficient and the subscripts restrict P and M to the samples of the indicated class.

These scores represent the degree to which the projection of each class varies with the predicted mental state. The relationship between each mental state and the separability of the two classes was also quantified by randomly sampling 250 pairs of one rest task and one active task and, for each pair, computing the Euclidean distance between their projections and the distance between their predicted mental states. Concatenating these variables across all pairs to construct vectors of projection distance DP and mental state distance DM, another score was computed as:

Equation (19): ${S}_{\mathrm{diff}}=|\rho ({D}_{P},{D}_{M})|$

Each score was computed independently for all three mental states. Bootstrapping was used to enhance stability. The size of the randomly sampled subset was varied between 40% and 100% of the size of the entire training set, and 50 iterations were completed at each size. The average value of each score was computed for each mental state across all runs at all sizes. Scores were scaled by dividing by the maximum values of Srest, Sactive, and Sdiff observed for any state. For each state, a composite score was computed by summing the three scaled scores. An optimal set of states was identified by choosing the two states with the highest and second highest composite scores. The third state was included only if its score exceeded 70% of the maximum value. Subsequent analysis used only the states that were included in this optimal set.
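Under the assumption that the three scores are absolute Pearson correlations (the excerpt describes what they measure but not their exact form), the per-state scoring could be sketched as:

```python
import numpy as np

def state_scores(P, C, M, rng, n_pairs=250):
    """Score one mental state's influence on classifier projections.

    P: LDA projections; C: class labels (0 = rest, 1 = active); M: predicted
    values of one mental state. Returns (S_rest, S_active, S_diff).
    """
    corr = lambda a, b: abs(np.corrcoef(a, b)[0, 1])
    s_rest = corr(P[C == 0], M[C == 0])      # projection drift within rest
    s_active = corr(P[C == 1], M[C == 1])    # projection drift within active
    # 250 random rest/active pairs: does class separability track state distance?
    rest_idx = rng.choice(np.flatnonzero(C == 0), n_pairs)
    act_idx = rng.choice(np.flatnonzero(C == 1), n_pairs)
    d_p = np.abs(P[rest_idx] - P[act_idx])   # projection distances
    d_m = np.abs(M[rest_idx] - M[act_idx])   # mental state distances
    return s_rest, s_active, corr(d_p, d_m)

# Synthetic check: projections that drift linearly with the state score highly.
rng = np.random.default_rng(0)
M = rng.uniform(size=400)
C = np.arange(400) % 2
P = np.where(C == 1, 1 + M, -1 - M) + 0.1 * rng.normal(size=400)
s_rest, s_active, s_diff = state_scores(P, C, M, rng)
```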

The proportion of the training set used to train the BCI within each fold was determined through a repeated inner cross-validation on the training data only. For each fold of this inner cross-validation, a classifier was trained for each test sample using 40%–100% of the data from the remaining folds, sampled based on their proximity to that test sample. At each of these proportions, both the classification accuracy and the Fisher score between the projections of the rest task and those of the active task were computed. The resultant accuracies and Fisher scores were smoothed and the proportions that yielded the maximum values of both variables were located. The average of these two proportions was used to train classifiers for each testing point in the outer cross-validation. Figure 2 provides a conceptual overview of the stages of the adaptive classification algorithm while figure 3 depicts the locations of one test sample and all training samples in mental state space, along with visualizations of three possible settings for the proportion of the training set used for classification.
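The core retraining step, selecting the training samples nearest the test sample in mental state space, can be sketched as follows. The function name and toy data are illustrative, and the proportion would be chosen by the inner cross-validation described above.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def adaptive_predict(X_tr, y_tr, S_tr, x_te, s_te, proportion):
    """Classify one testing sample with a classifier retrained on the
    training samples nearest to it in mental state space.

    S_tr / s_te hold predicted mental states for training/testing samples;
    proportion is the fraction of the training set retained.
    """
    d = np.linalg.norm(S_tr - s_te, axis=1)          # distance in state space
    keep = np.argsort(d)[: max(2, int(proportion * len(d)))]
    clf = LinearDiscriminantAnalysis().fit(X_tr[keep], y_tr[keep])
    return int(clf.predict(x_te[np.newaxis, :])[0])

# Toy data whose class means drift with a single mental state, so that
# retraining on state-local samples recovers the locally correct boundary.
rng = np.random.default_rng(2)
S = rng.uniform(size=(200, 1))                       # predicted state per trial
y = rng.integers(0, 2, 200)                          # 0 = rest, 1 = active
X = (2 * y[:, None] - 1) * (1 + S) + 0.1 * rng.normal(size=(200, 1))
pred = adaptive_predict(X, y, S, np.array([1.5]), np.array([0.5]), 0.5)
```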


Figure 2. Conceptual overview of the adaptive classification algorithm. The optimal mental states and proportion are chosen using an inner cross-validation on the training data and then the training data are resampled for each testing sample to train a new classifier.


Figure 3. Visualization of 25%, 50%, and 75% cutoffs for training set inclusion. The filled circle represents the predicted mental state for the testing sample while the open circles represent the predicted mental state for each point in the training set.


The classification accuracy attained by this adaptive LDA algorithm was compared to those attained by two non-adaptive BCIs—one that randomly sampled the training set to match the size of the training set used by the adaptive BCI, and one that used the entire training set. A 10 × 10 repeated cross-validation was performed for the adaptive BCI. A permutation test was used to estimate the performance of the non-adaptive BCI by performing 100 iterations of a 10 × 10 repeated cross-validation [41]. Each run was randomly initialized to ensure a unique set of samples was selected for each fold. All participants had more examples of the rest class than the active class, but classifiers were provided with balanced data for training. For the adaptive BCI, this was achieved by randomly sampling, from the rest class, a subset of training samples that were at least as close to each testing point as the most distant retained training sample from the active class.

3. Results

3.1. Mental state prediction

The correlation coefficients between the self-reported and predicted values of each mental state are presented in table 1. The average correlation ranged between 0.46 for attention and 0.56 for frustration, indicating that the mental state prediction algorithm was moderately accurate for the population as a whole.

Table 1.  Correlation between predicted and self-reported values of each mental state for all participants.

Participant Fatigue Frustration Attention
1 0.25 ± 0.13 0.39 ± 0.12 0.42 ± 0.13
2 0.61 ± 0.13 0.51 ± 0.13 −0.01 ± 0.16*
3 0.52 ± 0.11 0.69 ± 0.09 0.43 ± 0.14
4 0.87 ± 0.03 0.78 ± 0.09 0.72 ± 0.09
5 0.67 ± 0.08 0.32 ± 0.13 0.33 ± 0.12
6 0.71 ± 0.06 0.63 ± 0.08 0.49 ± 0.10
7 0.46 ± 0.12 0.45 ± 0.11 0.59 ± 0.08
8 0.31 ± 0.12 0.47 ± 0.11 0.29 ± 0.11
9 0.51 ± 0.10 0.47 ± 0.12 0.65 ± 0.15
10 0.06 ± 0.14* 0.80 ± 0.05 0.67 ± 0.07
11 0.53 ± 0.09 0.62 ± 0.07 0.46 ± 0.16
Avg 0.50 ± 0.23 0.56 ± 0.16 0.46 ± 0.21

Note: Asterisks indicate the combinations of participant and mental state for which p > 0.05.

3.2. Reliability prediction

Figure 4 depicts the effects of the threshold value h on both BCA and OPV across all participants. Performance on both metrics was relatively stable until the threshold exceeded 0.5, after which both began to decrease. A threshold value of 0.4 was selected for all subsequent analyses. This choice falls within the stable region for both statistics while also ensuring that a significant number of samples are flagged as unreliable.


Figure 4. Effects of threshold value on balanced classification accuracy and overall predictive value.


Figure 5 presents the balanced classification accuracies and OPVs for each participant for the aforementioned threshold. The average values of BCA and OPV across the entire population were 54.2% and 7.9%, respectively. The observed results for both statistics significantly exceeded chance levels, as shown by figure 6.


Figure 5. Balanced classification accuracy (left) and overall predictive value (right) for all participants. The average BCA across participants was 54.24% and the average OPV across participants was 7.9%.


Figure 6. Comparison between observed BCA and OPV and random results from the permutation test. The observed values for both statistics significantly exceeded chance values. Note that this would also hold true for any point on either curve in figure 4.


Although BCA exceeded chance levels on average, it remains quite low at only 54%. Consequently, it is unlikely that mental state estimation can be used in isolation for error detection and correction. However, mental state clearly does contain information pertinent to BCI reliability, as evidenced by the average OPV of nearly 8%. This suggests that predicted mental state can be used online to provide some indication of BCI reliability.

3.3. Adaptive classification

Figure 7 compares the classification accuracies obtained by the adaptive BCI for all combinations of mental states and proportions to those obtained by the non-adaptive BCI when random sampling was used to extract identical proportions of the training set. For most participants, the adaptive BCI outperformed the non-adaptive BCI for smaller training set sizes and approached the performance of the non-adaptive BCI when the entire training set was used, as would be expected. This suggests that there are benefits to sampling training data based on mental state rather than at random.


Figure 7. Comparison between the non-adaptive BCI and the adaptive BCI as the size of the training set is varied. The dashed red line represents performance of the non-adaptive BCI while the thin black lines each represent one of the seven possible combinations of mental states used by the adaptive BCI. When the entire set is used, there is little difference between the two algorithms. When the size of the training set is reduced through resampling, the adaptive BCI displays superior performance for two thirds of participants.


Practically, it is more interesting to compare the performance of the adaptive BCI at all training set sizes to that of the non-adaptive BCI when the entire training set is used. A 95% confidence interval for the non-adaptive BCI classification accuracy under the latter condition was computed for each participant based on the results of the permutation test. Figure 8 compares these confidence intervals to the performance of the adaptive BCI using only the combination of mental states that had the highest average classification accuracy across all proportions. It is clear that some participants (i.e. Participants 1, 2, 4, 6, and 10) exhibit performance that exceeds the upper bounds of the confidence interval for a wide range of training set sizes. Again, this suggests that psychological BCI adaptation is useful for some participants.


Figure 8. Accuracy of the most effective adaptive BCI for each participant compared to that of the non-adaptive BCI using the entire training set. Each graph depicts the 95% confidence interval for the accuracy of the non-adaptive BCI (dotted lines) and the accuracy of the adaptive BCI for a range of different training set sizes (blue line). The adaptive BCI either exceeded or approached the upper limit of the non-adaptive confidence interval for the majority of participants. The legend for each plot identifies the combination of fatigue (Fa), frustration (Fr), and attention (At) used for the adaptive BCI.


Finally, the mental state and proportion selection algorithms were used to identify an ideal combination of mental states and training set size for each participant. The frequency with which each combination of mental states was selected for each participant is presented in table 2. On average, each mental state was selected approximately 80% of the time (78.4%, 80.0%, and 82.2% for fatigue, frustration, and attention, respectively).

Table 2.  Frequency with which each combination of mental states was chosen for each participant.

Participant Fa–Fr Fa–At Fr–At Fa–Fr–At Fa Fr At
1 0 0 0.02 0.98 0.98 1 1
2 0.2 0 0.08 0.72 0.92 1 0.8
3 0 0.1 0 0.9 1 0.9 1
4 0.02 0.9 0.06 0.02 0.94 0.1 0.98
5 0.2 0.8 0 0 1 0.2 0.8
6 0.6 0.4 0 0 1 0.6 0.4
7 0.74 0 0 0.26 1 1 0.26
8 0.06 0 0.8 0.14 0.2 1 0.94
9 0 0 0.34 0.66 0.66 1 1
10 0.14 0 0.58 0.28 0.42 1 0.86
11 0 0 0.5 0.5 0.5 1 1

Note: Fa refers to fatigue, Fr to frustration, and At to attention. The overall frequencies with which each mental state was selected for each participant, either independently or in combination with any other state, are presented in the final three columns.

The adaptive classification accuracy and average training set size for each participant are summarized in table 3. Five of 11 participants exhibited adaptive classification accuracies that exceeded the upper limit of the 95% confidence interval for the non-adaptive classification accuracy. Overall, the adaptive BCI exhibited an accuracy of 73.2% while the non-adaptive BCI exhibited an accuracy of 72.6%. All participants exhibited adaptive classification accuracies that exceeded the lower limit of the 95% confidence interval, although Participants 9 and 11 were close to this limit.

Table 3.  Adaptive and non-adaptive classification accuracies for each participant.

Participant Non-adaptive Adaptive Training set size
1* 90.2 91.2 0.80
2* 70.5 72.6 0.62
3 62.6 63.5 0.75
4* 73.2 74.9 0.64
5* 68.8 71.2 0.62
6 77.2 77.2 0.61
7 65.3 65.3 0.89
8 73.4 73.2 0.77
9 84.6 84.0 0.78
10* 67.1 67.8 0.59
11 65.4 64.5 0.90
Average 72.6 73.2 0.72

Note: The performance of the adaptive BCI exceeded the limits of the 95% confidence interval for the performance of the non-adaptive BCI for participants denoted with an asterisk. The proportion of the training set used by the adaptive BCI, as determined by the algorithm detailed in section 2.6.2, is listed in the final column.

Figure 9 compares the performance of the adaptive BCI with automatic mental state and proportion selection to the 95% confidence interval for the non-adaptive BCI and the performance of the most frequently selected combination of states for all proportions of training set size.


Figure 9. Adaptive BCI performance for each participant. The solid horizontal and vertical lines identify respectively the mean classification accuracy and training set size for the adaptive BCI. The performance of the most frequently selected combination of mental states is represented by the blue dashed line and the 95% confidence interval for the non-adaptive classification accuracy by the two dashed horizontal lines. Participants 1, 2, 4, 5, and 10 exhibited significantly better performance with the adaptive BCI while no participants exhibited significantly worse performance. The settings of the adaptive BCI indicate that mental state and proportion selection effectively identified favorable operating conditions for most participants.


4. Discussion

4.1. Predictive value of mental state for BCIs

The results presented here suggest that predictions of mental state can be used both to estimate BCI reliability and to directly improve the accuracy of classification. When mental state was used to classify the reliability of BCI decisions for unseen trials, we observed an 8% difference in BCI classification accuracy between trials classified as reliable and trials classified as unreliable. When mental state was used to directly adapt the BCI classifier, nearly half of participants exhibited a significant increase in classification accuracy. These findings represent an initial step towards the design of BCIs that are capable of adapting to short-term changes in psychological state within a session.

One implication of this analysis is that under some circumstances a small set of relevant BCI training data is preferred over a larger set that also includes less relevant data. This emphasizes the importance of classification algorithms that match each new testing sample to the most relevant data in the training set, such as the dynamically weighted ensemble of classifiers detailed by Liyanage et al [10]. During BCI development, it is important to focus not just on collecting a substantial amount of training data, but also on identifying optimal ways to leverage these data for each new sample.
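
This matching idea can be sketched in code. The following is a minimal illustration in Python with scikit-learn, using synthetic stand-ins for the EEG features, task labels, and predicted mental states; the Euclidean distance metric, logistic regression classifier, and `proportion` parameter are illustrative assumptions, not the study's exact implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins (hypothetical data, not the study's recordings):
# EEG features, binary mental task labels, and a per-trial predicted
# mental state vector (e.g. fatigue, frustration, attention).
X_train = rng.normal(size=(200, 8))
y_train = (X_train[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)
state_train = rng.normal(size=(200, 3))
X_test = rng.normal(size=(20, 8))
state_test = rng.normal(size=(20, 3))

def adaptive_predict(x, state, proportion=0.7):
    """Retrain on the training trials whose predicted mental state is
    closest (Euclidean) to that of the current test trial."""
    dists = np.linalg.norm(state_train - state, axis=1)
    k = max(2, int(proportion * len(X_train)))
    idx = np.argsort(dists)[:k]  # keep only the most state-similar trials
    clf = LogisticRegression().fit(X_train[idx], y_train[idx])
    return clf.predict(x.reshape(1, -1))[0]

preds = [adaptive_predict(x, s) for x, s in zip(X_test, state_test)]
```

Because a fresh classifier is fit per test sample, caching or incremental updates would likely be needed for online use.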

The importance of relevant training data also has obvious implications for between-session adaptation. When retraining a BCI during a new session, it is common to use a limited subset of the previously collected data [42]. This limits the size of the training set and prevents the new data from being overwhelmed by pre-existing data from previous sessions. Figure 7 suggests that the method by which these pre-existing data are sampled may affect BCI performance. For small training set sizes, mental state-based sampling clearly outperformed random sampling for most participants. Likewise, when discarding previous data to retrain a classifier, care should be taken to do so systematically so that the relevance of the retained data is maximized.

Offline analyses such as those performed here are often over-optimistic regarding their conclusions. However, in this case, there is reason to believe that the merits of psychological adaptation in BCIs have been understated. Short-term modulations in mental state are inevitable during BCI usage and have a significant impact on performance [17]. However, it is unclear to what extent these changes in mental state are also driven by BCI performance. The relationship between these variables is likely to be cyclical, as fluctuations in mental state may elicit a deterioration in BCI performance that, in turn, evokes further fluctuations in mental state. As a result, the true efficacy of psychological adaptation may only be apparent in online analyses, where this adaptation can potentially be used to suppress this positive feedback cycle between BCI performance and mental state.

4.2. Contrast with error correction using the error-related potential (ERP)

The ERP has previously been used to detect and correct errors during BCI interaction [43]. While the mental state-based prediction of BCI reliability detailed here cannot match the effectiveness of ERP-based error detection, several characteristics of this approach distinguish it from ERPs. Accurate detection of ERPs requires a participant to remain focused on the BCI so that they realize when it has erred, while mental state-based reliability prediction should remain effective when participants are inattentive. In fact, the approach detailed here is particularly effective when participant focus is inconsistent, rendering it complementary to traditional ERPs. Furthermore, while ERP detection and the associated error correction require a BCI trial to be terminated and a classification decision to be issued, mental state-based reliability prediction can be performed online while the BCI trial continues. Consequently, intermediate actions can be taken to reduce the risk of an error occurring at all. For example, in an online test, mental state-based classifier reliability could be computed online and a BCI trial could be lengthened in real-time if the estimated mental state indicates that classifier performance is uncertain. Alternatively, in the specific case that disengagement from the BCI is detected, the BCI could either shut down to prevent unintentional input or produce an attentional cue to recapture attention. This provides more versatility than the straightforward error correction offered by ERPs.
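
The trial-lengthening strategy above can be illustrated as a small control loop. All of the hooks (`get_features`, `predict_reliability`, `classify`) and the threshold and timing values are hypothetical placeholders, not an interface defined in this study:

```python
def run_trial(get_features, predict_reliability, classify,
              base_len=4.0, max_len=8.0, step=1.0, threshold=0.6):
    """Extend a BCI trial in real time while the mental state-based
    reliability estimate remains low, up to a maximum duration."""
    t = base_len
    while t < max_len:
        feats = get_features(t)  # features accumulated over t seconds
        if predict_reliability(feats) >= threshold:
            break                # state suggests a reliable decision
        t += step                # otherwise keep collecting evidence
    return classify(get_features(t)), t
```

The same loop could branch to a shutdown or attentional cue when the reliability estimate indicates disengagement rather than merely an uncertain decision.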

4.3. Variation between reliability prediction and classifier adaptation

The individual differences in the performance of the reliability prediction and classifier adaptation approaches are intriguing. For example, reliability prediction was most accurate for Participant 4 and mostly uninformative for Participant 2. However, adaptive classification provided statistically significant improvements over non-adaptive classification for both participants. Meanwhile, some participants exhibited no significant differences between the adaptive and non-adaptive BCIs despite excellent reliability prediction (e.g. Participant 9). Overall, there does not appear to be a significant relationship between the effectiveness of reliability prediction and that of the adaptive BCI. Although effective reliability prediction does not necessarily translate to adaptive BCI success due to the possibility that class distributions may simply be inseparable under some psychological conditions, it is less clear why the converse does not hold. Regardless, these results suggest that different adaptation strategies are effective for different individuals, highlighting the importance of flexible BCI design.

4.4. Limitations and future directions

Several limitations are apparent within this study. The first is the offline and cross-validated nature of the analysis. During online usage, it is possible that the cortical correlates of changes in mental state undergo the same inter-session non-stationarity that often characterizes mental task performance. As a consequence, it is unlikely that mental state data from one session will be able to accurately predict mental state data from another session without the use of previously identified methods of adapting to between-session non-stationarity, such as covariate shift adaptation or online classifier retraining. We intend to focus on this problem in further research and identify the degree to which mental state detection itself is affected by inter-session non-stationarity.

Second, the accuracy of mental state prediction clearly imposes some limits on the efficacy of adaptation. Although all three mental states were predicted with moderate accuracy, more accurate predictions may allow greater improvement in adaptive BCI performance. Finally, the impact of mental state on BCI performance appears to be highly individualized. As a result, hyperparameters such as the selection of mental states and the training set size must be set individually rather than globally, at some computational cost. Further testing will investigate how consistent these values are over time and whether they can be fixed at the beginning of BCI usage, avoiding further computation.
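
Such per-participant hyperparameter selection amounts to a small grid search over calibration data. The sketch below assumes a hypothetical `evaluate(states, proportion)` callback that returns cross-validated adaptive BCI accuracy for one participant; the candidate proportions are illustrative values, not those used in the study:

```python
from itertools import combinations

STATES = ("fatigue", "frustration", "attention")
PROPORTIONS = (0.5, 0.6, 0.7, 0.8, 0.9)  # candidate training set proportions

def select_hyperparameters(evaluate):
    """Return the (state subset, proportion) pair that maximizes a
    participant-specific score. `evaluate` is a hypothetical callback
    returning cross-validated accuracy on calibration data."""
    grid = [(set(combo), p)
            for r in range(1, len(STATES) + 1)
            for combo in combinations(STATES, r)
            for p in PROPORTIONS]
    return max(grid, key=lambda cfg: evaluate(*cfg))
```

Since each configuration requires rerunning the adaptive classifier, caching intermediate results or coarsening the grid would reduce the computational burden noted above.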

Given a passive BCI that measures mental state and an active BCI that detects mental task performance, psychological adaptation requires the interactions between these systems to be defined. This is a non-trivial task. Two approaches have been investigated here, one of which limits the passive BCI to providing a supplementary measure of BCI reliability and one of which directly intertwines the two systems by using the output of the passive BCI to control the active BCI. However, these are not the only potential architectures for psychological adaptation, and they are not mutually exclusive. Combining the two approaches may allow all users to benefit from psychological adaptation, and there is no doubt that there remain many other ways to fuse passive and active BCIs.

5. Conclusions

This study investigated two methods by which an EEG-BCI can adapt to changes in mental state. Fatigue, frustration, and attention levels during BCI usage were predicted using least-squares regression with lasso regularization. The Pearson correlation coefficients between self-reported mental state and the predicted values approached or exceeded 0.5. These predictions of mental state were used to estimate BCI reliability with an accuracy exceeding chance levels. An 8% difference in classification accuracy was uncovered between trials flagged as reliable and those deemed unreliable. Mental state estimates were also used to directly adapt an active BCI; the classifier was retrained using a mental state-guided resampling of the training set. Five of 11 participants exhibited significant, but practically modest, improvements in classification accuracy using the adaptive algorithm, while no participants exhibited significant decreases. These results suggest that psychological adaptation may provide a means of improving online EEG-BCI performance.
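
As a rough illustration of the prediction pipeline summarized above, lasso-regularized least-squares regression and Pearson correlation can be combined as follows. The synthetic features and ratings, regularization strength, and train/test split are placeholders rather than the study's configuration:

```python
import numpy as np
from sklearn.linear_model import Lasso
from scipy.stats import pearsonr

rng = np.random.default_rng(1)

# Hypothetical stand-ins: EEG-derived features and continuous
# self-reported mental state ratings (e.g. fatigue).
X = rng.normal(size=(120, 20))
y = X[:, :3].sum(axis=1) + 0.5 * rng.normal(size=120)

# Least-squares regression with lasso regularization,
# fit on the first 80 trials.
model = Lasso(alpha=0.1).fit(X[:80], y[:80])

# Pearson correlation between self-reported and predicted state
# on the 40 held-out trials.
r, _ = pearsonr(y[80:], model.predict(X[80:]))
```

In practice the regularization strength would be tuned by cross-validation within each participant's training data, and the split would respect trial ordering to avoid temporal leakage.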
