A visual brain-computer interface as communication aid for patients with amyotrophic lateral sclerosis

OBJECTIVE
Brain-Computer Interface (BCI) spellers that make use of code-modulated Visual Evoked Potentials (cVEP) may provide a faster and more accurate alternative to existing visual BCI spellers for patients with Amyotrophic Lateral Sclerosis (ALS). However, so far the cVEP speller has only been tested on healthy participants.


METHODS
We assess the brain responses, BCI performance and user experience of the cVEP speller in 20 healthy participants and 10 ALS patients. All participants performed a cued and free spelling task, and a free selection of Yes/No answers.


RESULTS
27 out of 30 participants could perform the cued spelling task with an average accuracy of 79% for ALS patients, 88% for healthy older participants and 94% for healthy young participants. All 30 participants could answer Yes/No questions freely, with an average accuracy of around 90%.


CONCLUSIONS
With ALS patients typing on average 10 characters per minute, the cVEP speller presented in this paper outperforms other visual BCI spellers.


SIGNIFICANCE
These results support a general usability of cVEP signals for ALS patients, which may extend far beyond the tested speller to control e.g. an alarm, automatic door, or TV within a smart home.


Introduction
The ability to communicate thoughts, intentions and desires to others, either verbally or behaviorally, depends greatly on muscle control. We need muscles to speak and to perform communicative gestures. But what if you are not (or only partially) in control of your muscles? People suffering from Amyotrophic Lateral Sclerosis (ALS) gradually lose control of the muscles in their body, due to the loss of central and peripheral motor neurons. With that, they may eventually lose their ability to speak (Borasio and Miller, 2001). These people need an augmentative and alternative way of communication. Brain-Computer Interfaces (BCIs) may provide a solution by allowing a user to control an external device using brain activity (Van Gerven et al., 2009). A BCI translates brain activity into computer actions that allow a user to perform actions, such as selecting letters on a digital keyboard to type words. These words may be synthetically vocalized, giving an ALS patient the possibility to regain a voice. BCI spellers may greatly enhance a patient's quality of life by enabling an alternative form of augmentative communication (Mckelvey et al., 2012).
A variety of BCI spellers has been designed and tested. The most well known is the visual P300 speller (Farwell and Donchin, 1988), which detects the brain response to an attended oddball stimulus. With this speller, healthy users can reach a spelling rate of about 6 characters (out of 30 to 36 possible characters) per minute (Treder and Blankertz, 2010;Lin et al., 2018). A variant of this speller, the 72-item Wadsworth BCI home system, allows full keyboard control for ALS patients (Vaughan et al., 2006). A group of 42 ALS patients used this system independently at their home for several months. With this system, they reached an average performance of 73%, typing about 2.9 characters per minute. Although such P300 spellers allow for fairly accurate and stable communication, patients report them to be mentally demanding, slow and uncomfortable to use (Fazel-Rezai et al., 2012).
An alternative to the P300 speller is one that uses code-modulated Visual Evoked Potentials (cVEPs) (Sutter, 1992). The cVEP speller consists of a digital keyboard on which each letter flashes with a pseudo-random noise-code: a unique pattern of flashes. This approach has the advantage that a generative model exists that predicts the brain response to any given binary pattern of flashes. This is done by learning the brain responses to the events that constitute the sequence: the transient responses (Thielen et al., 2015). Because the brain response to these flashes is modulated by eye gaze and attention, the attended noise-code can be decoded and the corresponding letter can be selected. Not only does this method outperform other spellers in terms of speed (spelling up to 21.3 characters per minute), it is also highly robust to noise and reaches spelling accuracies up to 96% (with 32 possible commands) (Spüler et al., 2012). Recently, even higher communication rates were achieved by adopting higher stimulation rates (Başaklar et al., 2019), a language model (Gembler et al., 2020) and deep learning (Nagel and Spüler, 2019). However, to the best of our knowledge, the cVEP BCI speller has only been tested on healthy users.
ALS is known to affect the brain to some degree. Changes in functional brain connectivity have been found in terms of changes in alpha, beta, gamma, theta and delta activity in ALS patients compared to healthy people (Iyer et al., 2015). Moreover, ALS patients often show a weaker overall brain connectivity in resting state compared to healthy people (Mohammadi et al., 2009). In addition, some studies speculate that due to the loss of motor neurons, an overcompensation is present at more frontal regions of the brain, leading to differences in neural connectivity in this region. This seems to lead to more decentralized activity in ALS patients compared to healthy people, which becomes more and more pronounced as the disease progresses (Iyer et al., 2015). Due to these general changes, ALS may affect the brain response to the noise-codes, e.g. causing a higher or lower amplitude, shifted onset, or different width of the main cVEP peaks compared to those of healthy users. A lower amplitude signal may be more difficult to detect and may decrease BCI performance due to a lower signal-to-noise ratio (SNR). However, because the other ALS-induced changes in brain activity described above are not time-locked to visual stimuli, the cVEP speller is expected to work for both healthy and ALS users.
To examine whether ALS patients can control the cVEP speller as well as healthy users, we assess the transient responses, BCI performance and user opinion of the speller designed by Thielen et al. (2015) and patented by Desain et al. (2019) in both a healthy and patient population. The BCI performance is assessed across a cued spelling task, a free spelling task and a free selection of Yes/No answers to simple questions. The Yes/No task can be expected to have a higher classification accuracy than the spelling task as classification is easier with only two possible classes. Additionally, a two-class BCI might be less gaze-dependent than a full speller BCI. To gain more insight into possible causes for differences in spelling accuracy and speed between the different participant groups, the latency, width and amplitude of the brain's transient responses to the short and long flashes of the speller are assessed and compared between groups. Because ALS patients depend more on this system than healthy participants, their opinion may differ from that of healthy participants: on the one hand, they may like it more, on the other hand they may be more critical of the system. In addition, the user experience may depend on the experienced BCI performance. However, as we do not expect a low BCI performance for a particular participant group, this effect should not be group dependent. If cVEPs can be accurately detected in ALS patients, the tested BCI speller is just a first step in a range of possible BCI applications. As cVEPs are evoked by light flashes, a simple LED light may already be sufficient to allow control of any external device in a home environment. An LED light can replace the need for a computer screen to provide the noise-code stimulation, and can be easily added at a low cost to any controllable device in a home environment. Examples include the control of an alarm button, automatic door, TV, phone or a cursor on a computer screen.
Such a BCI supports a patient's autonomy and independence by its potential for easy integration into their environment: it only requires a visual stimulator (e.g. an LED light), an electroencephalography (EEG) headset that is worn directly on a user's head, and a decoder. Because the communication between these three components can be established wirelessly, such a BCI system allows a user to move around in their home while using it. The range of patients who may benefit from these applications extends far beyond the pool of ALS patients that are tested in this study. Locked-in patients, (partially) paralyzed patients, severely spastic patients and any other patients with very limited motor control may all benefit from this type of BCI. Although the suggested applications still depend on functional eye gaze due to the required visual attention, applications in the tactile or auditory domain may also be feasible in the future.

Participants
This study was conducted in accordance with the Medical Research Involving Human Subjects Act (WMO) and was approved by and performed in accordance with the local Ethics Committee of the Faculty of Social Sciences of the Radboud University Nijmegen and the local Medical Ethics Committee (2016-3007). A total of 30 participants volunteered to participate in this study, of which 20 were healthy participants and 10 were patients diagnosed with ALS. In this paper, we define "healthy" participants as participants without a history of neurological, psychiatric or medical disorders known to influence the outcomes of this study.
The ALS patients were 37 to 75 years old (with a mean age of 60 ± 12 years). The ALS Functional Rating Scale-Revised (ALSFRS-R) was used to indicate the current stage of ALS (Cedarbaum et al., 1999). It consists of 12 questions on the current status of daily life routines and specific abilities such as speaking, swallowing, breathing and walking. These ALSFRS-R scores and other patient details are provided in Table 1.
All participants had normal or corrected to normal vision. People suffering from epilepsy or other (non-ALS) neurological diseases, except concomitant frontotemporal dementia (see Table 1), were excluded from the study. Neuropsychological testing was not systematically performed. All participants gave their written informed consent prior to the experiment.
To examine whether ALS has an effect on the transient responses and BCI performance, results are compared between ALS patients and older (≥ 35 years) healthy participants. In addition, to check for differences due to healthy aging (van der Waal et al., 2017; Wang et al., 2016), results are compared between young (< 35 years) and older healthy participants. Out of the 20 tested healthy participants, 12 were younger and 8 were older than 35 years (see Table 2).

Experimental setup
All healthy participants were tested in the BCI lab of the Radboud University, whereas all ALS patients were tested in a dedicated room at the outpatient rehabilitation clinic of the Radboud University Medical Center. Participants were seated at a distance of 50 cm to a mounted 17 inch iPad Pro (Apple Inc.), which was positioned at eyesight level (see Fig. 1). The electroencephalogram (EEG) data were recorded at 2048 Hz with 7 water-based electrodes placed in a custom-built headset designed by MindAffect. The electrodes covered the occipital lobe at positions P7, POz, P8, O1, Oz, O2, and Iz and the ground electrode was placed at T7 (see Fig. 1). The EEG data were amplified by a TMSi Porti amplifier. The EEG data were high-pass filtered in real time at 2 Hz using a 2nd order Butterworth filter, low-pass filtered at 50 Hz using a 4th order Chebyshev type II filter with 50 dB stop-band attenuation, and downsampled to 180 Hz. Stimulus presentation, data recording and online data analysis were controlled via BrainStream, running on a laptop. An opto-sensor was used to record the presented stimulus flashes on the iPad. This timing information was used to synchronize the EEG data and stimulus presentation.

Stimuli
Two types of keyboards were presented on the iPad: a full keyboard of 29 keys and a binary Yes/No keyboard (see Fig. 2). Each button on either of the keyboards flickered between black (value 0) and white (value 1) according to a modulated Gold code (a binary sequence of values 1 and 0). Gold codes are pseudo-random noise-codes, designed to minimize cross-correlation between the noise-codes of all keys. We used one set of Gold codes to train and test the classifier. Similar to Thielen et al. (2015), the set of Gold codes was created with a feedback shift register of length $m = 6$, with feedback taps at positions 6, 5, 2, 1 and at 6, 1 (for more information see: Gold (1967), Golomb (2017)). Furthermore, the set of Gold codes was modulated (i.e. multiplied) with a double bit-clock to retain the correlation properties, but remove low-frequency spectral content. The modulated Gold codes contained $2^m + 1 = 65$ bit-sequences that all had a length of $2 \times (2^m - 1) = 126$ bits and took $126/60 = 2.1$ seconds to complete at a refresh rate of 60 Hz on the iPad. The modulation of the Gold codes caused the flickering of each button to be made up of two types of events: long (2 bit: "110" or "1100") and short (1 bit: "10" or "100") flashes. A short flash took $1/60 \times 1000 \approx 16.67$ milliseconds, and a long flash took $2/60 \times 1000 \approx 33.33$ milliseconds to complete.
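To make the code construction concrete, the following is a minimal sketch of how such a set of modulated Gold codes could be generated. It is an illustration, not the authors' implementation: the function names, the all-ones start state, the Fibonacci register layout, and the choice to implement the bit-clock multiplication as an XOR with an alternating 1, 0 clock are assumptions.

```python
from itertools import groupby


def lfsr_sequence(taps, m=6):
    """One period (2**m - 1 bits) of a Fibonacci shift register (assumed layout)."""
    state = [1] * m  # assumption: any non-zero start state may be used
    bits = []
    for _ in range(2 ** m - 1):
        bits.append(state[-1])           # output the oldest bit
        feedback = 0
        for tap in taps:                 # taps are 1-based register positions
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]  # shift the register by one position
    return bits


def gold_codes(taps_a, taps_b, m=6):
    """Both base sequences plus their XOR at every cyclic shift: 2**m + 1 codes."""
    a = lfsr_sequence(taps_a, m)
    b = lfsr_sequence(taps_b, m)
    codes = [a, b]
    for shift in range(len(a)):
        shifted = b[shift:] + b[:shift]
        codes.append([x ^ y for x, y in zip(a, shifted)])
    return codes


def modulate(code):
    """Hold each bit for two frames and XOR it with an alternating 1, 0 clock."""
    out = []
    for bit in code:
        out.extend([bit ^ 1, bit ^ 0])
    return out


def longest_run(bits):
    """Length of the longest run of identical consecutive bits."""
    return max(len(list(group)) for _, group in groupby(bits))


codes = [modulate(c) for c in gold_codes([6, 5, 2, 1], [6, 1])]
print(len(codes), len(codes[0]), max(longest_run(c) for c in codes))
```

In this sketch the modulation guarantees that no run of identical bits is longer than two frames, which is what produces exactly two event types (short and long flashes).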
A typical trial consisted of the following sequence of events that lasted no more than 12.7 s (see Fig. 2(c)):
1. Target: during calibration and cued spelling, a single key is highlighted by a slight shaking movement and then turning green for 3 s. The participant should attend this key during the next phase. During free spelling, no letter is highlighted. Instead, the participant has time to select the key (i.e. letter) of their choice.
2. Prepare: a stable (non-flickering) keyboard is presented for 1.5 s. This time can be used to get ready. Especially during testing, this additional time can be needed to reflect on what has been typed and what key to select next.
3. Stimulation: all keys on the keyboard flicker with their dedicated noise-code. During calibration, this time is fixed to 4.2 s, including 2 full repetitions of the modulated Gold codes. During testing, dynamic stopping is used. The maximum noise-code stimulation duration is 4.2 s. However, if the classifier detects an attended key with a confidence level of at least 96% before that time, it will stop immediately (Thielen et al., 2015). The confidence level is based on the difference in correlation between the best and second best predicted brain response and the incoming EEG. Based on the calibration data (see Speller calibration and Yes/No calibration described in Section 2.4), a minimum difference was determined such that the predicted classifier accuracy is at least 96%.
4. Selection: the selected key makes a typical keyboard 'click' sound and is highlighted in white for 2 s. The corresponding letter is typed in the top part of the screen and remains there for the full calibration or test block so a sentence can be typed.
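The dynamic stopping rule of the stimulation phase can be sketched as follows. This is a simplified illustration under stated assumptions: the EEG is taken to be a single, already spatially filtered channel, the predicted responses (`templates`) and the margin `threshold` are hypothetical inputs that would come from calibration, and the synthetic data only serve to exercise the code. The step of 18 samples corresponds to 100 ms at 180 Hz and 756 samples to 4.2 s.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def dynamic_stop(eeg, templates, threshold, step, max_len):
    """Return (selected template index, number of samples used)."""
    t = step
    while True:
        t = min(t, max_len)
        scores = [pearson(eeg[:t], tpl[:t]) for tpl in templates]
        ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
        margin = scores[ranked[0]] - scores[ranked[1]]
        # Stop early when the margin is large enough, or force a decision
        # at the end of the trial by emitting the best candidate.
        if margin >= threshold or t == max_len:
            return ranked[0], t
        t += step


# Synthetic example (assumption: random templates, target is key 7).
rng = random.Random(0)
templates = [[rng.gauss(0, 1) for _ in range(756)] for _ in range(29)]
eeg = [v + 0.3 * rng.gauss(0, 1) for v in templates[7]]
selected, used = dynamic_stop(eeg, templates, threshold=0.4, step=18, max_len=756)
print(selected, used / 180.0)
```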

Experimental tasks
The experiment consisted of 6 tasks:
1. Practice: explanation of the BCI system, and mock trials during which participants get acquainted with the task.
2. Speller calibration: participants were asked to focus their eyes on one of the keys for the full duration of a trial. The target letter was selected and highlighted in green at the start of a trial. A total of 20 trials was presented. At the end of this task, the temporal and spatial components of the brain responses to short and long flashes are estimated. These patterns are used to train a classifier to decode an attended noise-code during the cued and free spelling.
3. Cued spelling: participants were instructed to spell out the sentence 'bci geeft mij een stem' (translated: BCI gives me a voice), giving a total of 22 trials. At the start of each trial, the targeted character was highlighted in green. When the classifier was (at least 96%) certain of its choice or when the maximum stimulation period (4.2 s) was reached, the decoded character was selected and typed.
4. Free spelling: similar to the cued spelling task. However, in this case there was no highlighted character at the start of a trial as participants were free to spell out a message. There was no limit on the number of trials. In case of an error, participants could correct it by focusing their eyes on the 'backspace' key. When the participant finished their sentence, they could shift their eye gaze to the 'speak' key. Once selected, the typed sentence was pronounced using a synthesized voice and the task ended.
5. Yes/No calibration: similar to the speller calibration task. However, in this case the keyboard had only two buttons: Yes and No. A total of 10 trials was collected. At the end of the task, a classifier was trained on the collected data.
6. Free Yes/No answers: a question was shown on the screen and pronounced by the system, which the participant was asked to answer by focusing their eyes on one of the two possible answers. The attended answer was selected when the classifier reached a confidence level of at least 96%, or when the maximum trial length of 4.2 s was reached, similar to the cued and free spelling tasks (tasks 3 and 4). The full list of questions can be found in Appendix A.

Table 2. Overview of participant details. Twenty healthy participants and ten Amyotrophic Lateral Sclerosis (ALS) patients participated in this study. The ALS patients are on average the oldest of all tested participants. To control for potential age-related differences in Brain-Computer Interface (BCI) performance and user experience, the healthy participants are divided in two groups: young (< 35 years old) and older (≥ 35 years old) healthy participants.
At the start of the experiment, ALS patients were asked about their current means of communication (see Appendix B). At the end of the experiment, all participants were asked to report on their current level of fatigue and their opinion of the BCI speller (see Appendix B). After that, they filled in the system usability score (SUS) questionnaire, assessing the user experience with the BCI system. This scale has been used since 1996 in order to make improvements in user-system interfaces, mainly focusing on effectiveness, efficiency and satisfaction (Brooke, 1996). The 10 questions of the SUS questionnaire are answered on a 5 point scale (1 = strongly disagree, 5 = strongly agree). See Appendix C for the full questionnaire. In total, the experiment lasted approximately 75 min, including instructions and cap fitting. Four 1-min resting state measurements were recorded in between the experimental tasks: 3 with eyes open and 1 with eyes closed. The resting state data were collected for another investigation and are not analyzed in this paper.

Online analysis
Classifier training and testing were performed according to Thielen et al. (2015) and Thielen et al. (2017). The classifier accuracy, defined as the percentage of correct predictions, was determined using 10-fold cross validation on the training data collected in the calibration tasks. After collecting training data, the brain's temporal responses to individual short and long flashes were estimated by decomposing the recorded responses to individual responses to the events within the noise-codes. This deconvolution was implemented using Canonical Correlation Analysis (CCA). With CCA, the deconvolution estimates both the temporal features as well as the spatial features of the brain response (Thielen et al., 2017; Thielen et al., 2021). The temporal features consist of the learned transient responses to the individual events (short and long flash), and the spatial features represent a learned spatial filter. Once the transient responses were known, the brain response to any sequence of short and long flashes (i.e. a noise-code, see Section 2.3) could be predicted. This was computed for each flashing letter by convolving the learned temporal responses with the noise-codes. The letter that gave the highest correlation between its corresponding spatially filtered actual and predicted brain response was selected. To do so, the difference in correlation between the actual brain response and the best (highest correlation) and second best predictions (i.e. the margin) was calculated every 100 ms (Thielen et al., 2015). This difference was then compared to a trained threshold, i.e. the learned difference between the best and second best predictions at which a classifier accuracy of 96% could be achieved on the basis of the training data. Once the online difference was larger than this threshold, a prediction was made and a letter was selected. If this did not happen before the end of the trial (i.e. after 4.2 s of stimulation), the most likely candidate was emitted (i.e., the one with the highest correlation).
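The prediction step described above (convolving learned transient responses with a noise-code and selecting the best-correlating key) can be illustrated with the sketch below. It is a simplified reading under stated assumptions: the transient responses `r_short` and `r_long` are made-up numbers rather than CCA estimates, spatial filtering is assumed to have happened already, the code is treated as non-repeating, and runs of ones longer than two bits (which do not occur in the modulated codes) are simply counted as long flashes.

```python
import math


def flash_events(code):
    """Onset bit indices of short (one-bit) and long (two-bit) flashes."""
    short, long_ = [], []
    i = 0
    while i < len(code):
        if code[i] == 1:
            j = i
            while j < len(code) and code[j] == 1:
                j += 1
            (short if j - i == 1 else long_).append(i)
            i = j
        else:
            i += 1
    return short, long_


def predict_response(code, r_short, r_long, samples_per_bit=3):
    """Superimpose the transient responses at their event onsets."""
    n = len(code) * samples_per_bit
    out = [0.0] * n
    short, long_ = flash_events(code)
    for onsets, resp in ((short, r_short), (long_, r_long)):
        for onset in onsets:
            start = onset * samples_per_bit
            for k, value in enumerate(resp):
                if start + k < n:
                    out[start + k] += value
    return out


def correlation(x, y):
    """Pearson correlation (0.0 if either input has no variance)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


# Toy example: three hypothetical 8-bit codes and made-up transient responses.
toy_codes = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1, 0, 1],
]
r_short = [0.0, 1.0, -0.5]
r_long = [0.0, 2.0, -1.0, 0.5]
predictions = [predict_response(c, r_short, r_long) for c in toy_codes]
measured = predictions[1]  # pretend the user attended the second key
scores = [correlation(measured, p) for p in predictions]
best = max(range(len(scores)), key=scores.__getitem__)
print(best, [round(s, 2) for s in scores])
```

At 180 Hz EEG and a 60 Hz display, three samples per bit is the natural default, which is why `samples_per_bit=3` is used here.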

Offline analysis
In order to analyze the transient responses to the long and short flashes, three features were extracted from the EEG data of the speller and Yes/No calibration tasks. These features capture the latency, amplitude and width of the two main peaks that are typically observed in response to the flashes (Thielen et al., 2015): see Fig. 3. The latency was measured from stimulus onset to the first main peak of the transient response. The width was measured as the time interval between the two main (negative and positive) peaks. The amplitude was calculated as the difference in amplitude between the most negative and positive peaks. Note that during the calculation of the transient responses using CCA as described in the previous section, both the amplitude and polarity of the resulting signal are arbitrary: they do not influence the correlations between the single-trial EEG data, and thus the scale of predicted brain responses is chosen freely by the CCA. To provide the transient of each participant group with a peak-to-peak amplitude on the appropriate absolute measurement scale, we set the norm of the spatial filter and transient response to 1, calculate the predicted EEG and regress those responses against the true single-trial EEG measurements. No bias term was taken into account. This provides a scale factor that captures the amplitude of the actual brain response within the transient response. This was done on an individual subject level.
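As an illustration of the three features and the bias-free regression, a minimal sketch is given below. It assumes that sample 0 of the transient response coincides with stimulus onset, that the two main peaks are the global minimum and maximum, and that the regression reduces to a least-squares slope through the origin; the function names and example values are hypothetical.

```python
def peak_features(response, fs):
    """Latency, width and peak-to-peak amplitude of a transient response."""
    i_min = min(range(len(response)), key=response.__getitem__)
    i_max = max(range(len(response)), key=response.__getitem__)
    first, second = sorted((i_min, i_max))
    return {
        "latency_s": first / fs,             # onset to first main peak
        "width_s": (second - first) / fs,    # time between the two peaks
        "amplitude": response[i_max] - response[i_min],
    }


def scale_factor(predicted, measured):
    """Least-squares slope through the origin (no bias term)."""
    num = sum(p * y for p, y in zip(predicted, measured))
    den = sum(p * p for p in predicted)
    return num / den


features = peak_features([0.0, -1.0, -3.0, 0.0, 2.0, 1.0, 0.0], fs=100)
scale = scale_factor([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(features, scale)
```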
The total of six features (i.e. amplitude, width and latency of the transient response to short and long flashes) was compared between the older and young healthy participants, and the older healthy participants and ALS patients. For each of these 12 comparisons, a permutation test was used with 10000 permutations (Golland and Fischl, 2003;Maris and Oostenveld, 2007). The permutation test was performed across the calibration data of both the binary and full keyboards (tasks 2 and 5 as described in Section 2.4) using a two-sample Kolmogorov-Smirnov test statistic (Massey, 1951). The data were randomly resampled within each task across the different participant groups, thereby keeping the experimental structure intact while shuffling group assignments between participants. If there is a significant difference between groups, the Kolmogorov-Smirnov test statistic of the recorded data should lie within the outer 2.5% of the generated null distribution (adhering to the standard chance of 2.5% of incorrectly rejecting the null hypothesis in a two-sided statistical test). To correct for multiple comparisons, the false discovery rate was used to determine a correct significance level (Benjamini and Hochberg, 1995).
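A permutation test with a two-sample Kolmogorov-Smirnov statistic can be sketched as below. This is a simplified illustration rather than the analysis used in the paper: it shuffles group labels in a single pool (the study resampled within each task), it counts permutations whose statistic is at least as large as the observed one, and the example data and function names are made up.

```python
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    d = 0.0
    for x in sorted(set(a) | set(b)):
        fa = sum(v <= x for v in a) / len(a)
        fb = sum(v <= x for v in b) / len(b)
        d = max(d, abs(fa - fb))
    return d


def permutation_test(a, b, n_perm=10000, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


group_a = [0, 1, 2, 3, 4, 5, 6, 7]          # hypothetical feature values
group_b = [10, 11, 12, 13, 14, 15, 16, 17]
observed, p_value = permutation_test(group_a, group_b, n_perm=2000)
print(observed, p_value)
```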
The BCI performance was assessed in terms of accuracy and speed. The accuracy was calculated as the number of correctly classified trials divided by the total number of trials in the cued spelling and Yes/No answer tasks. The classifier detection time (T) consisted of the average time it took the classifier to detect a single character with 96% confidence. To get a measure of the BCI speed, the number of Characters Per Minute (CPM) was calculated as:

$$\mathrm{CPM} = \frac{60}{T + \mathrm{ITI}}$$

The inter-trial interval (ITI) encompasses the time in between two subsequent noise-code stimulations. During our online experiment, we have used an ITI of 8.5 seconds, which includes the "target", "prepare", "selection" and "break" phases as described in Section 2.3.
While the accuracy and speed of the BCI depend upon each other as a trade-off, the capacity of the BCI can be captured by a single abstract characteristic, the Information Transfer Rate (ITR) in bits per minute (Wolpaw et al., 2000):

$$\mathrm{ITR} = \mathrm{CPM} \cdot \left[ \log_2 N + P \log_2 P + (1 - P) \log_2 \frac{1 - P}{N - 1} \right]$$

Here, N is the total number of possible commands ($N = 29$ for the spelling task, see footnote 5, and $N = 2$ for the Yes/No task) and P is the classification accuracy on the cued spelling and Yes/No tasks ($0 \le P \le 1$). Including an ITI in the calculation of the CPM yields a realistic measure of the ITR that is more ecologically valid than the inflated ITRs obtained when ignoring ITIs. The ITRs (including an ITI of 8.5 s) were compared between young and older healthy participants and between older healthy participants and ALS patients using a permutation test with 10000 permutations employing a Kolmogorov-Smirnov test statistic. This comparison was made separately for the Yes/No answer task (task 6 as described in Section 2.4) and the cued spelling task (task 3 as described in Section 2.4). In addition, we explicitly tested whether the 10-fold cross-validated accuracy of the Yes/No calibration task (task 5) was higher than that of the speller calibration task (task 2). This was done with a Wilcoxon signed-rank test (Wilcoxon, 1992). Again, the false discovery rate was used to correct for multiple comparisons.

Fig. 3 (caption, partial). The amplitude was calculated as the difference in amplitude between the most negative and positive peaks (the order of which is arbitrary, but coupled to the polarity of the learned spatial filter). The width was measured as the time interval between the two main (negative and positive) peaks. Note that both the amplitude and polarity of the transient response calculated by Canonical Correlation Analysis (CCA) are arbitrary. Therefore the amplitude of the transient signal is measured in arbitrary units (a.u.).

Footnote 5. Due to a mistake in the settings of the classifier, the classifier matched the trials against $N = 36$ possible classes rather than 29. In other words, a few theoretically possible classes never occurred in training or testing. If they occurred as classifier output, they were counted as incorrect. This means that the classifier was trying to solve a more difficult problem than necessary during online testing. As a result, the ITRs that we provide in this paper are on the pessimistic side. Had the classifier solved the intended 29-class problem, the predictions would likely have happened earlier rather than later. Because our experiment was conducted using online analysis, we could not easily correct our model after the fact. During the experiment, a trial ended as soon as a specific confidence level was achieved. This means that the noise-tag stimuli did not necessarily complete the full scheduled 4.2 s. Because some trials ended before this time, we cannot use this data to re-analyze our results and generate new predictions. Theoretically, a classifier with a different model may have reached the desired confidence level at a later point in time (of which we have no data). For this reason, we believe that reporting the results as they are, rather than reanalyzing them offline (which would also lose the connection with the system performance as experienced by the participants during our experiment), is best.
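The speed and capacity measures of this section can be sketched in a few lines. The sketch assumes that CPM is 60 divided by the sum of the detection time T and the ITI (both in seconds), and that the ITR is CPM multiplied by the bits per selection from Wolpaw et al. (2000); the function names and example numbers are illustrative only.

```python
import math


def characters_per_minute(t_detect, iti):
    """Selections per minute for detection time and ITI given in seconds."""
    return 60.0 / (t_detect + iti)


def bits_per_selection(n, p):
    """Wolpaw bits per selection for n classes and accuracy p in [0, 1]."""
    if n < 2 or not 0.0 <= p <= 1.0:
        raise ValueError("need n >= 2 and 0 <= p <= 1")
    bits = math.log2(n)
    if p > 0.0:  # the limit of p * log2(p) is 0 as p goes to 0
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n - 1))
    return bits


def itr_bits_per_minute(n, p, t_detect, iti):
    """Information transfer rate including the inter-trial interval."""
    return characters_per_minute(t_detect, iti) * bits_per_selection(n, p)


# Hypothetical example: perfect binary selection, 1.5 s detection, 8.5 s ITI.
example_cpm = characters_per_minute(1.5, 8.5)
example_itr = itr_bits_per_minute(2, 1.0, 1.5, 8.5)
print(example_cpm, example_itr)
```

Because the ITI enters the denominator directly, a long ITI dominates the result when detection times are short, which is why ITRs that ignore the ITI appear inflated.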
The SUS scores were calculated and compared between the young and older healthy participants and the older healthy participants and ALS patients. The SUS scores (ranging from 0 to 100) capture the user ratings of the usability of the system in a single number (for the full calculation, see (Brooke, 1996)). Scores higher than 68 are considered higher than average. Again, the scores were compared between participant groups using a permutation test with 10000 permutations employing a Kolmogorov-Smirnov test statistic. Multiple comparisons were corrected using the false discovery rate.
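For reference, the standard SUS scoring rule (Brooke, 1996) can be written as a short function. The sketch assumes the usual item order, with odd-numbered items positively worded and even-numbered items negatively worded; the function name is illustrative.

```python
def sus_score(answers):
    """SUS score (0 to 100) from ten answers on a 1 to 5 scale."""
    if len(answers) != 10 or any(a not in (1, 2, 3, 4, 5) for a in answers):
        raise ValueError("expected ten answers between 1 and 5")
    total = 0
    for item, answer in enumerate(answers, start=1):
        # Odd items contribute (answer - 1), even items (5 - answer).
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


neutral = sus_score([3] * 10)
best_case = sus_score([5, 1] * 5)
print(neutral, best_case)
```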
Lastly, the answers to the general questionnaire (see Appendix B) were tested for significant differences between the young and older healthy participants, and the older healthy participants and ALS patients. For both between group comparisons, a Wilcoxon signed-rank test (Wilcoxon, 1992) was used to check each of the 8 questions that all participants completed. Again, the false discovery rate was used to correct for multiple comparisons.
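The false discovery rate correction used throughout this section follows the Benjamini-Hochberg step-up procedure, which can be sketched as below. The level `q = 0.05` and the example p-values are assumptions for illustration, not values from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=p_values.__getitem__)
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value lies below its critical value.
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


decisions = benjamini_hochberg([0.02, 0.02, 0.03, 0.5])
print(decisions)
```

Note the step-up behaviour: a p-value that misses its own critical value is still rejected when a larger p-value further down the ranking passes.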

Results
Healthy participant H20 and patients P05 and P08 could not reach a classifier accuracy above 50% on the speller calibration task and therefore only completed the Yes/No calibration and Free Yes/No answers task. For this reason, these participants were excluded from the analyses of the transient responses and BCI performance.
The young healthy participants reached an average spelling accuracy of 94.3% (±10.5%) on the cued spelling task and 92.5% (±7.5%) on the Yes/No answer task (see Table 3). Older healthy participants reached an average spelling accuracy of 88.3% (±11.1%) on the cued spelling task and 90% (±10.7%) on the Yes/No answer task. ALS patients reached an average accuracy of 79.3% (±11.1%) on the cued spelling task and 89% (±11.0%) on the Yes/No answer task. No significant differences were found between the accuracies on the speller and Yes/No calibration tasks for the young (p = .936) or older (p = .188) healthy participants, or ALS patients (p = .422).
Young healthy participants had an average ITR (including an 8.5 s inter-trial interval) of 24.8 bpm (±5.14 bpm) on the cued spelling task (see Fig. 4) and [...] bpm (±2.0 bpm) on the Yes/No answer task. Older healthy participants had an average ITR of 21 bpm (±6.3 bpm) on the cued spelling and 4.1 bpm (±2.3 bpm) on the Yes/No answer tasks. ALS patients reached an average ITR of 20.3 bpm (±5.0 bpm) on the cued spelling and 4 bpm (±2.4 bpm) on the Yes/No answer tasks. The ITR of young healthy participants was significantly higher than that of older healthy participants for the cued spelling task (see Table 4). No significant differences were found between the ITR of older healthy participants and ALS patients, nor between any of the participant groups for the Yes/No answer task. Note that the detection times of the Yes/No answer task appear quite consistent (see Table 3). This is due to a minimum stimulation period of 0.5 s that was fixed for each experimental task. Without this limitation, the classification accuracy quickly drops below 70% due to early (< 0.5 s) predictions (see Appendix D).
During the free spelling task, young healthy participants typed 22 letters on average, needing about 2.2 s (±0.8 s) of stimulation per letter (see Table 5). Older healthy participants typed 12 letters on average, needing about 2.7 s (±0.9 s) of stimulation. ALS patients typed 20 letters on average, needing 3.1 s (±0.7 s) of stimulation. As an indication of the classifier accuracy, the number of times that the backspace key was selected is provided in Table 5: a user can select the backspace key to correct an incorrect classifier prediction, i.e. remove a letter that the user did not intend to type.

Table 3. Overview of the Brain-Computer Interface (BCI) performance. ACC = accuracy (%), T = average detection time of stimulation until classification (seconds per character), ITR = information transfer rate including an 8.5 s inter-trial interval (bits per minute), AVG = average across all participants in a certain group, ALS = Amyotrophic Lateral Sclerosis. Participants H20, P05 and P08 could not reach a classifier accuracy above 50% on the speller calibration task and therefore did not complete the cued spelling task.
All participant groups show a clear transient response to the short and long flashes of the BCI speller (see Fig. 5). Although the responses look similar across groups, some differences can be detected (see Table 4 and Fig. 5). The average amplitude of transient responses to the short and long flashes was significantly larger for the young healthy participants compared to the older healthy participants (see Table 4). In addition, the width of the transient response to a long flash was significantly shorter for the healthy young participants compared to the older healthy participants (see Table 4). This was also true for the ALS patients compared to the older healthy participants (see Table 4). In addition, the average width of the transient response to a short flash was larger for ALS patients compared to older healthy participants. Lastly, the latency of the transient response to the long flashes was significantly longer for ALS patients compared to older healthy participants (see Table 4).
The healthy young participants rated the BCI speller as 'excellent' with an average SUS score of 84.6 (±9.4) (for a mapping from SUS scores to verbal labels, see Appendix C). The older healthy participants rated it as 'good' with an average score of 77.2 (±6.5). Although large variations exist in the SUS scores of the ALS patients, on average they also rated the cVEP speller as 'good' with an average score of 75.3 (±24.6). The SUS scores differed significantly between the young and older healthy participants, and between the older healthy participants and ALS patients (see Table 4). Both ALS patients and healthy young participants rated the BCI speller with a higher average SUS score than the older healthy participants (see Fig. 4).
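For reference, a SUS score on the 0 to 100 scale is conventionally computed from ten items that are each rated from 1 to 5. The sketch below assumes this standard scoring scheme (odd items positively worded, even items negatively worded); the function name and the example responses are hypothetical, and the questionnaire details of this study are not restated in this section.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Standard SUS scoring (assumed): ten items, each rated 1..5.

    Odd-numbered items contribute (rating - 1), even-numbered items
    contribute (5 - rating); the sum (0..40) is scaled to 0..100.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 item ratings")
    total = 0
    for item, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5


if __name__ == "__main__":
    # Hypothetical responses of one participant (not data from this study).
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 2]))
```

Because every item contributes between 0 and 4 points, a participant who answers neutrally (3) on all items obtains a score of 50.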
A strong positive correlation (r = 0.84) is found between the ITR and SUS scores of older healthy participants (see Fig. 4(c)). A moderate negative correlation (r = −0.49) is found between the ITR and SUS scores of the healthy young participants. In addition, the ALS patients show a weak positive correlation (r = 0.30) between their SUS scores and ITR. In addition to the SUS scores, the participants were also asked about their general opinion of the BCI speller (see Appendix B). The results of these questions are provided in Fig. 6. No significant differences were found between the answers of the young and older healthy participants, nor between the older healthy participants and ALS patients.
Fig. 4. Overview of the System Usability Score (SUS) and Information Transfer Rate (ITR) results. The SUS (a) and ITR (including an 8.5 s inter-trial interval or ITI) (b) results. The grey SUS score labels in (a) are based on (Bangor et al., 2009). The ITR results in (b) are based on the cued spelling task. Significant differences are indicated with a *. (c) Shows a scatter plot of the SUS scores against the ITR of the cued spelling task, including the correlation coefficients (r) of each participant group.
Table 4. Overview of the statistics as described in Section 2.6. The analysis of the brain responses to the short and long flashes includes data of both the speller calibration and Yes/No calibration tasks. The False Discovery Rate is based on the Benjamini-Hochberg critical value (BH): p-values below this critical value are considered significant (*). ALS = Amyotrophic Lateral Sclerosis, ITR = Information Transfer Rate, SUS = System Usability Score.
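Table 4 marks a comparison as significant when its p-value falls below the Benjamini-Hochberg critical value. A minimal sketch of that step-up procedure is given below. The false discovery rate `q = 0.05` and the example p-values are assumptions for illustration only; neither is taken from this study, and the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Return, per test, whether it is significant under Benjamini-Hochberg.

    The p-values are ranked in ascending order; the critical value for
    rank i out of m tests is (i / m) * q. All tests up to the largest rank
    whose p-value does not exceed its critical value are significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank that passes its own critical value.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank

    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


if __name__ == "__main__":
    # Hypothetical p-values (assumed, not results from Table 4).
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(example, q=0.05))
```

Note that the procedure is step-up: a p-value that exceeds its own critical value is still counted as significant if a larger p-value further down the ranking passes its critical value.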

Discussion
In this paper, we set out to test whether ALS patients can control a visual cVEP speller as well as healthy users. To do so, we assessed the performance and user opinion of the BCI speller in young (<35 years) and older (≥35 years) healthy participants and ALS patients. Our results show that all three participant groups were able to control the speller significantly above chance level: 27 out of 30 tested participants reached a minimum classification accuracy of 64%, with an average classification accuracy on the cued spelling task of 79% for the ALS patients and 94% for the healthy young participants. Although 3 participants could not reach an accuracy above 50% on the speller calibration task, all participants could freely answer Yes/No questions with a minimum accuracy of 70% and an average accuracy of 89% (for the ALS patients) to 92.5% (for the healthy young participants). No differences in BCI performance (i.e. ITR) were found between ALS patients and healthy older participants, suggesting that they can indeed control the BCI to a similar degree.
Table 5. Classification results of the free spelling task. The average detection time (T) is provided in seconds per character. As a rough estimate of the number of spelling errors, the number of times that "backspace" is classified is provided (Back). In addition, the total number of classified keys is provided (Total). AVG refers to the average across all participants in a certain group. ALS = Amyotrophic Lateral Sclerosis.
During our online experiment, we used an ITI of 8.5 s, which includes the "target", "prepare", "selection" and "break" phases as described in Section 2.3. However, during free spelling, the ITI need not be that long, as the "target" and "break" phases are not necessary in that case. For everyday use of the BCI speller, the ITI should be shortened to prevent fatigue in the user.
To reflect a realistic speed of the BCI speller in a user application, we have recalculated the ITR with an ITI of 3 s: see Table 6. We believe that this ITI accurately reflects the minimum time that a user would need to think about the next letter and have a little break in between two subsequent letter selections. In our experiment, each character selection took about 2 to 3 s of noise-code stimulation during the cued and free spelling tasks, and about 0.5 s during the Yes/No answer task. With an ITI of 3 s, this translates to typing about 10 to 12 characters per minute during the cued and free spelling tasks and about 17 responses per minute during the selection of Yes/No answers, which is well above the average speed of a P300 speller.
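The conversion from stimulation time to selections per minute used above can be sketched as follows. This is a minimal illustration, not the analysis code of this study: the function names are hypothetical, and the bits-per-selection formula is the commonly used Wolpaw definition of ITR, which is assumed here because the definition used in this paper is not restated in this section. The number of selectable classes, `n_classes`, is likewise a free parameter of the sketch.

```python
import math


def selections_per_minute(stim_s: float, iti_s: float) -> float:
    """Selections per minute when each selection costs stimulation time plus ITI."""
    return 60.0 / (stim_s + iti_s)


def wolpaw_bits_per_selection(accuracy: float, n_classes: int) -> float:
    """Bits per selection under the Wolpaw ITR definition (assumed here)."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if n_classes < 2:
        raise ValueError("at least two classes are required")
    bits = math.log2(n_classes)
    # Skip terms whose probability is zero to avoid log2(0).
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_minute(accuracy: float, n_classes: int,
                        stim_s: float, iti_s: float) -> float:
    """ITR in bits per minute: bits per selection times selections per minute."""
    return (wolpaw_bits_per_selection(accuracy, n_classes)
            * selections_per_minute(stim_s, iti_s))


if __name__ == "__main__":
    # Stimulation times quoted in the text, combined with an ITI of 3 s.
    for stim_s in (3.0, 2.0, 0.5):
        print(f"{stim_s} s stimulation: "
              f"{selections_per_minute(stim_s, 3.0):.1f} selections per minute")
```

With 3 s, 2 s and 0.5 s of stimulation and an ITI of 3 s, `selections_per_minute` gives 10, 12 and about 17 selections per minute, matching the rates quoted above.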
Table 6. Estimated Information Transfer Rates (ITR) for a user application. The ITRs (bits per minute) were calculated based on the accuracy and average detection time of stimulation until classification as reported in this study: see Table 3. However, in this case, we used an inter-trial interval of 3 s. AVG = average across all participants in a certain group, ALS = Amyotrophic Lateral Sclerosis, Spelling = cued spelling task, Yes/No = Yes/No answer task.
Although no significant differences in ITR were found between older healthy participants and ALS patients, the ITR of healthy young participants was significantly higher than that of older healthy participants. This difference may be due to the significant difference in the amplitude of the transient responses between the two: the amplitude of the transient responses to short and long flashes was larger for young participants compared to older participants. A larger amplitude affects the signal-to-noise ratio positively, making it easier for the BCI to detect the brain responses to the presented noise-codes. As the amplitude difference was not found between ALS patients and older healthy participants, it seems to be age-related. In addition to the difference in amplitude, another difference was found between the healthy young and older participants: the transient response to a long flash of young participants was narrower than that of older participants. However, similar to the young healthy participants, ALS patients also show a narrower transient response to long flashes compared to older participants. This suggests that this difference is not solely due to age-related differences and cannot explain the difference in ITR between the young and older healthy participants. Furthermore, the transient response to long flashes of ALS patients has a later onset compared to older healthy participants. Additionally, ALS patients show a wider transient response to short flashes compared to older participants.
However, these differences did not translate into different ITRs for older healthy participants and ALS patients, suggesting that they do not affect BCI performance to a large degree.
Interestingly, the SUS scores of healthy young participants and ALS patients were on average both significantly higher than those of older healthy participants. However, these scores do not seem to be highly affected by the BCI performance; participants with a low BCI performance do not clearly rate the BCI system with a lower SUS score. A combination of the experienced BCI performance and the personal relevance of the BCI could possibly explain the difference in SUS scores. Healthy young participants have the best BCI performance, which may explain their 'excellent' SUS scores. ALS patients and older healthy participants have a lower BCI performance, but differ in the level of personal relevance of the BCI. Older healthy participants seem to have a lower overall impression of the BCI speller: compared to ALS patients and young healthy participants, they find the BCI more difficult to learn and use, more uncomfortable, and slower (see Fig. 6). ALS patients may be bothered less by these features, as the BCI is more important for them compared to older healthy users.
ALS patients are very heterogeneous: the experienced symptoms may vary greatly among patients with an equal ALSFRS-R score. Five out of 10 tested ALS patients had an ALSFRS-R score below 40, indicating that they have more ALS symptoms compared to the others. Three of these patients reached a performance of 82 to 95% on the cued spelling task, which means that even for these more progressed patients, the cVEP speller can provide a good means of communication. The other two patients could not reach a performance above 50% on the speller calibration task. Although both these patients had good eye function, they were easily distracted and reported having trouble focusing on a single character at a time. Both patients found the Yes/No answer task easier to perform. This was also apparent from the results, as all patients were able to freely select Yes/No answers with a performance of 80 to 100%, making it a good alternative to the full keyboard speller.
The ALS patients in our study could realistically have achieved a typing speed of 10 characters per minute with our cVEP BCI speller (including an ITI of 3 s). This would outperform other BCI spellers for ALS patients (Mainsah et al., 2015; McCane et al., 2015; Speier et al., 2017; Wolpaw et al., 2018). Still, some other cVEP methods perform better in a healthy population (Nagel and Spüler, 2018; Nagel and Spüler, 2019). Importantly, we recently showed that our cVEP BCI can be operated in a calibration-free mode without loss of performance as compared to a calibrated alternative as used in the current work (Thielen et al., 2021). This allows plug-and-play BCI use, without the passive training and calibration phase that prevents users from using the BCI directly. Especially for patients who may have a limited attention span or become fatigued quickly, such a zero-training mode may be of substantial importance.
The cVEP speller also has its limitations. Its main limitation is that it, like most visual BCI spellers, is eye gaze dependent. As ALS patients may lose eye control in later stages of their disease, the cVEP speller may not be a long-term solution. As such, the added benefit of the cVEP speller over an eyetracker may seem limited. However, in contrast to an eyetracker (which tracks a user's eye gaze on a digital screen), the cVEP speller is independent of the type of eyelashes or eyes, is less dependent on the exact angle between the screen and eyes, and can be implemented in any electronic device (e.g. TV, alarm bell, light switch). Moreover, the eye gaze limitation could be completely resolved by implementing the noise-code principle in a different modality, e.g. auditory or tactile. Alternatively, the speller could be controlled via covert attention, relaxing the demands on eye control compared to overt attention. However, as with other gaze-dependent BCIs, we would expect a performance reduction for these gaze-independent approaches. As the ALS patients tested in this study could all still communicate verbally (see Table 1), the usability of the cVEP BCI speller for patients with more advanced stages of ALS remains unknown and should be determined in future research.
Although the full keyboard cVEP speller did not work for all ALS patients due to its required level of focus, the binary Yes/No option seems to provide a good alternative. This option may be especially relevant for ALS patients in a locked-in state, when often only limited vertical eye movements are possible. Furthermore, future research may simplify the cVEP stimulation: by increasing the stimulus presentation speed (i.e. bitrate), the presented noise-codes may be subliminally perceivable, making the stimulation less demanding as the digital keys will appear to be stable rather than flickering. For periodic stimuli, the feasibility of stimulation above the perceptual fusion rate was demonstrated by Molina et al. (2009). All in all, the majority of tested participants rate the cVEP speller as useful and would use it themselves if they needed it (see Fig. 6). Together with its speed and future potential, we believe the cVEP speller is a good step towards an alternative form of effective communication for ALS patients.
[...] time and energy to this experiment. Furthermore, we are grateful to the International ALS Association for supporting this research.