Evaluating the clinical benefit of brain-computer interfaces for control of a personal computer

Brain-computer interfaces (BCIs) enabling the control of a personal computer could provide myriad benefits to individuals with disabilities including paralysis. However, to realize this potential, these BCIs must gain regulatory approval and be made clinically available beyond research participation. Therefore, a transition from engineering-oriented to clinically oriented outcome measures will be required in the evaluation of BCIs. This review examined how to assess the clinical benefit of BCIs for the control of a personal computer. We report that: (a) a variety of patient-reported outcome measures can be used to evaluate improvements in how a patient feels, and we offer some considerations that should guide instrument selection; (b) activities of daily living can be assessed to demonstrate improvements in how a patient functions, although new instruments that are sensitive to increases in functional independence via the ability to perform digital tasks may be needed; and (c) benefits to how a patient survives have not previously been evaluated, but establishing patient-initiated communication channels using BCIs might facilitate quantifiable improvements in health outcomes.


Introduction
A brain-computer interface (BCI) is a system that records and processes brain activity to control a computer or other device. BCIs might restore, replace, enhance, supplement, or improve functions naturally achieved by the central nervous system [1]. Thus, the potential uses of BCIs are wide-ranging. Previous applications include physical rehabilitation [2][3][4], control of prosthetic limbs [5,6], and treatment of drug-resistant epilepsy [7]. But perhaps the most widely studied use for BCIs is the control of a personal computer or communication device.
Using a BCI to control a personal computer may enable users to interact with their physical, digital, and social environments, for example, controlling smart home devices, browsing the internet, or using apps for personal communication. This may be of benefit to people with a range of disabilities including paralysis, which contributes to a large burden of disease [8]. However, only a few BCIs for controlling a personal computer have permitted independent use outside of supervised research sessions [9][10][11][12][13][14] and none are clinically approved. To realize their potential, these BCIs must gain regulatory approval and be made clinically available beyond research participation.
To date, BCI development has predominantly been approached as an engineering rather than a clinical endeavor. This has been for good reason, as many manufacturing and computational challenges remain before the widespread adoption of BCIs may be realized [15,16]. However, this has also led to a predominance of BCI assessment methods focused on engineering feats rather than patient needs. Common performance metrics such as information transfer rate or characters typed per minute are useful for assessing BCI capability, making comparisons between different systems, and quantifying progress [17], but they do not directly demonstrate clinical benefit. Similarly, while user-centered design frameworks can facilitate a holistic assessment of user experience (e.g. [18]), these approaches still evaluate the engineering of the BCI and not the potential clinical benefit it may provide.
The BCI research community has recognized that for BCIs to 'gain traction' they must appreciably improve patients' lives. Research efforts have been directed from the achievement of greater speed and accuracy (First International BCI meeting [19]) towards collaborative efforts to translate laboratory success into practical devices that serve the needs of the end-user (Seventh International BCI Meeting [20]). As BCI technologies mature, there is an emergent need to transition from engineering-oriented to clinically oriented outcome measures. Ultimately, clinical outcomes will determine whether a BCI technology will be granted approval as a medical device by regulatory bodies such as the Food and Drug Administration (FDA) in the United States.
Which clinical outcome measures to select as trial endpoints will depend on the intended use of the BCI technology under investigation. One potential application of BCIs is rehabilitation of motor function. While there have been calls for greater standardization of clinical outcome measurements in trials of BCI-based rehabilitation [21], evaluations of clinical benefit in these trials need not vary considerably from clinical trials involving non-BCI-based rehabilitation programs. For example, previous studies involving BCI-based rehabilitation for post-stroke motor recovery have used prevailing measures such as the Fugl-Meyer assessment [2,4]. Moreover, outcome measures common to clinical trials of physical rehabilitation such as the Action Research Arm Test have also been utilized to assess function achieved via BCI-controlled prosthetics [6], where physical function is replaced rather than restored. Clinical endpoints in trials of BCIs for the treatment of psychiatric disorders or epilepsy may also be mostly consistent with previous trials not involving BCIs. For example, BCI-based interventions for medically intractable partial epilepsy have measured seizure reduction [7,22]. However, how to measure clinical benefit from BCI control of a personal computer (where there may be no expectation of restoring physical or psychiatric function) is not well established. This represents a substantial barrier for developers seeking regulatory and payor approval for a BCI offering control of a personal computer.
'Clinical benefit' represents a positive, clinically meaningful effect on how a patient feels, functions, or survives [23]. Future pivotal trials of BCI technologies might therefore include outcome measures addressing each of these three components of clinical benefit. The goal of this review was to examine how previous studies have assessed the effects of BCIs on how a patient feels, how a patient functions, and how a patient survives. We also consider some other outcome measures that may be included in future clinical trials. We hope this review will be relevant to clinical trials of all BCI technologies; however, we focus on how clinical benefit may be measured in trials of BCIs for independent use of a personal computer.

Patient-reported outcomes (PROs)
Following the World Health Organization's definition of health as 'a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity', improvements in health gained from BCIs may extend beyond measures of mobility and physical function or biological markers. In fact, BCIs providing control of personal computers may not attempt to directly alter physiology at all. In this case, clinical benefit might be derived from clinically meaningful improvements in how a patient feels.
PROs can provide information on the status of a patient's functioning, symptoms, side-effects, or health condition as perceived by the patient, as well as how these change in response to a clinical intervention. Patient reporting also enables the assessment of states known only to the patient, such as pain intensity or feelings of depression. PRO instruments range from a single-item rating such as a visual analog scale to multidomain questionnaires and may involve surveys, interviews, or patient diaries.
PROs may serve as the primary or secondary outcome measures in clinical trials involving BCIs. They can provide information on a BCI's safety and effectiveness that may be used to support regulatory and healthcare decision-making processes. The United States FDA encourages the collection of patient perspectives using well-defined and reliable PRO instruments to provide valid supporting evidence for new medical devices [24].
Guidelines for the general use of PROs in clinical trials have been communicated previously [25,26]. The following discussion addresses the use of PROs for evaluating BCIs for the control of a personal computer.

Quality of life (QOL) and health-related quality of life (HRQOL)
Among the most common PROs are those that evaluate a patient's QOL or HRQOL. QOL and HRQOL are related but distinct concepts. QOL is a general concept that represents an individual's perception of their overall well-being, including non-health-related aspects of their life. HRQOL is a multi-dimensional concept that represents a patient's perception of the impact of their health status on their QOL. BCIs for the control of a personal computer could impact QOL and HRQOL differently. Therefore, capturing both during clinical trials of BCIs could differentiate the impact of the BCI on health-related and non-health-related factors [27]. However, those designing clinical trials of BCIs should be aware that QOL measures may be considered too general and ill-defined to support labeling claims for medical products [28]. HRQOL measures might therefore be preferable for evaluating the clinical benefit of BCIs in clinical trials for the control of a personal computer.
HRQOL encompasses the physical, mental, and social aspects of health, which may be impacted by both disease symptoms and treatment side effects. Dobkin [29] previously presented a framework of HRQOL that involved five dimensions relevant to BCIs, as follows:
• Physical well-being, including mobility, activity, self-care dependence, disease symptoms, and treatment side-effects.
• Social well-being, including social support, social integration, participation in family/social roles, participation in work and hobbies, loneliness, and isolation.
• General health, including fatigue, life satisfaction, and perception of overall quality of life.
• Caregiver quality of life, including physical, mental, and social well-being, financial stresses, and work-life balance.
HRQOL instruments available for use in clinical studies include generic instruments that cover multiple domains or provide brief summaries of HRQOL, and instruments that are specific to a single patient group, symptom, or function. Each has its own advantages and disadvantages for use in clinical trials of BCI technologies. Generic instruments such as the SF-36 [30] are highly validated and applicable for use with a heterogeneous population of BCI users; however, these measures often contain inappropriate items. For example, the SF-36 asks participants about their limitations in 'vigorous activities such as running, lifting heavy objects, or participating in strenuous sports', which is unsuitable and potentially insensitive to individuals with severe paralysis of the limbs. Consequently, the responsiveness, and therefore utility, of these generic measures may be limited in individuals with paralysis. Neurological condition-specific instruments such as the amyotrophic lateral sclerosis specific quality of life [31], spinal cord injury-quality of life [32], and multiple sclerosis quality of life-54 [33] may include more targeted and meaningful items than generic questionnaires. These may be useful in clinical studies of BCI technologies where all subjects have the same neurological condition. However, using these instruments could make cross-condition comparisons of HRQOL responses impossible, and assessing the clinical benefit of a BCI for a heterogeneous set of user groups may be more difficult. This is an important consideration if BCI technology aims to assist individuals with a common symptom without regard to etiology.

Previous use of PRO measures in BCI research
After evaluating several quality-of-life instruments for their utility in BCI studies involving individuals with amyotrophic lateral sclerosis (ALS) and assistive needs for communication [9], Wolpaw and colleagues used the McGill Quality of Life Questionnaire (MQOL) to periodically evaluate QOL over the course of an extended period of independent at-home use of a surface electroencephalography (EEG) based BCI [13]. The MQOL [34] is a validated method of assessing QOL in patients with life-limiting illnesses such as ALS. One advantage of the MQOL for use in clinical trials of many BCIs is that the items are not heavily weighted towards physical function. This instrument may therefore be appropriate when investigating the benefits of BCI technologies that do not attempt to restore or replace physical function, such as BCIs for the control of a personal computer. Another feature of the MQOL is that it contains multiple items within the 'existential domain', which might evaluate a patient's spiritual well-being. Evaluation of this domain is absent from many QOL instruments even though it might be a significant contributor to overall QOL in many individuals with severe or life-limiting disease [34].
Wolpaw et al [13] found that average scores on the MQOL did not decrease despite a decline in neuromuscular function with disease progression in ten BCI users with ALS. This result suggests that the benefit of a home-use BCI system may not be an improvement in QOL but instead the maintenance of QOL despite disease progression and decreased function.
Other previous reports involving independent use of a BCI evaluated QOL using the psychosocial impact of assistive devices scale (PIADS) [10][11][12]. The PIADS is a 26-item, self-report questionnaire for evaluating the perceived benefits of an assistive technology [35]. The PIADS comprises three subscales that measure the subject's perceived impact on feelings of competence (12 items), adaptability (6 items) and self-esteem (8 items) using a seven-point Likert scale ranging from −3 (maximum negative impact) to 3 (maximum positive impact). There are some potential advantages of the PIADS for use in clinical trials of BCIs. Firstly, as its name suggests, the PIADS is designed to assess changes occurring in response to the use of assistive technology. Therefore, the individual items may be relevant to studies of many BCIs, and it might be possible to use this instrument across populations with various neurological conditions that might benefit from a BCI technology without compromising the validity of the scale. Additionally, the respondent provides an impression of change score, which might be beneficial with individuals who have communication difficulties at baseline of a clinical trial. Finally, the items on the PIADS are not weighted towards physical function; thus, the responsiveness of the instrument is not dependent on improvements in physical function.
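The subscale structure described above can be illustrated with a minimal scoring sketch. It assumes that each subscale score is the mean of its item ratings, which is consistent with the fractional ratings reported in studies using the PIADS. The assignment of items to subscales is not given in this review, so the index ranges below are placeholders, and any reverse-scored items defined in the official scoring instructions are not handled. The function name `score_piads` is hypothetical.

```python
from statistics import mean

# Subscale sizes (12, 6, 8) come from the text. Which item belongs to which
# subscale is NOT given there, so these index ranges are placeholders only.
SUBSCALES = {
    "competence": range(0, 12),     # 12 items
    "adaptability": range(12, 18),  # 6 items
    "self_esteem": range(18, 26),   # 8 items
}


def score_piads(ratings):
    """Return the mean rating per subscale for one 26-item response.

    Assumption: a subscale score is the plain mean of its item ratings;
    reverse-scored items, if any, are not handled in this sketch.
    """
    if len(ratings) != 26:
        raise ValueError("expected 26 item ratings")
    if any(r not in range(-3, 4) for r in ratings):
        raise ValueError("each rating must be an integer from -3 to 3")
    return {
        name: mean(ratings[i] for i in indices)
        for name, indices in SUBSCALES.items()
    }
```

A real analysis would replace the placeholder ranges with the item assignment from the instrument's scoring manual.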
Vansteensel et al [12] recorded positive ratings on all three dimensions of the PIADS (competence: 1.1, adaptability: 2.2, and self-esteem: 1.0) in their prospective case study involving a 58-year-old woman in a locked-in state due to ALS who received a fully implanted subdural-electrode based BCI. The participant provided ratings 7-9 months after surgery, by which time the BCI enabled her to independently control computer typing software. These ratings were favorable in comparison to the ratings provided in relation to her existing eye-tracking system. Holz et al [10,11] also found positive effects on QOL as measured by the PIADS in their studies involving at-home use of a surface EEG-based BCI-controlled Brain Painting computer application. Use of the painting software had a positive impact on happiness, self-esteem, QOL, usefulness, self-confidence, productivity, and ability to participate. However, one participant also reported a negative impact on independence, which was due to the user being dependent on caregivers to set up the system to use the application [10].
One further study involving two individuals with ALS used the EQ-5D, Hospital Anxiety and Depression Scale, and a visual analogue scale of pain distress to examine different components of HRQOL following 4-12 months of independent use of an endovascular BCI to operate a tablet computer [14]. The EQ-5D is a widely used, standardized HRQOL measure [36,37]. The EQ-5D includes a 'descriptive system' comprising five single-item dimensions (mobility, self-care, usual activities, pain and discomfort, and anxiety and depression) and a single visual analogue scale that measures the patient's perception of their current health. One advantage of the EQ-5D is that it is short and simple to complete. This may be beneficial when collecting repeated measurements at multiple time points, particularly with individuals with communication difficulties. As a generic instrument, it may be applied across patient populations. However, some of the items (e.g. mobility and self-care) may not be pertinent for BCI interventions targeting the control of a personal computer, which may impair responsiveness of the instrument.
The Hospital Anxiety and Depression Scale (HADS) is a well validated tool to assess anxiety and depression in various populations and patient settings [38,39]. The HADS is a symptom-specific scale and may therefore be useful when anxiety and depression are of substantial importance to the targeted BCI user group and there is reason to believe the BCI technology under investigation may have meaningful benefit to these symptoms.
Oxley et al found scores on the EQ-5D and HADS varied non-linearly with repeated measurements [14]. Accurately illustrating the effects of a BCI intervention over time might therefore require the collection of HRQOL outcomes at numerous time points over the course of a study and follow-up period.

Single-item scales
Single-item scales such as visual analog scales, numerical rating scales, and Likert scales may be used to quickly measure the state of a patient's symptoms, treatment side-effects, functioning, overall HRQOL, or a change in any of these measures, e.g. a comparison to before BCI implantation. The speed and simplicity of single-item PROs are beneficial in BCI research where practical constraints such as respondent burden due to communication difficulties may limit study procedures. The short completion time of single-item scales also lends itself to a higher temporal resolution of outcome monitoring than can be achieved using multi-dimensional instruments without undue burden on participants. BCI researchers might therefore consider a more frequent collection of single-item ratings in addition to administering multi-dimensional instruments at fewer, select timepoints during a study.
Several previous studies have investigated the psychometric properties of single-item scales. For example, a single-item QOL scale was found to demonstrate excellent reliability, good validity, and good responsiveness compared to multi-dimensional questionnaires [40]. Single-item measures may also be suitable for assessments of both anxiety and depressive symptom severity and psychological functioning [41,42], satisfaction with life [43], and general self-rated health [44]. However, not all constructs can be appropriately measured using single items in place of multi-item questionnaires [45]. If a construct is conceptually narrow, a single-item scale may be appropriate to measure it, whereas if a construct is complex, a single item may not represent it as well as multiple items [46]. Moreover, if a single-item scale is used to measure multi-dimensional constructs such as pain, mood, or overall HRQOL, it is impossible to identify which dimension of the construct is being evaluated by the respondent [47]. This issue is exacerbated when the scale is used for longitudinal assessments, where a respondent may be considering different dimensions of the construct at different time points. Therefore, single-item scales may better approximate responses to multi-item questionnaires or subscales that assess unidimensional constructs [48]. Consequently, previous efforts have been made to reduce longer multidimensional questionnaires to short forms including only one item for each dimension (e.g. the item demonstrating the highest factor loading for each dimension). For example, the SF-8 includes just one question for each of the eight domains covered by the 36-item SF-36. These shorter versions may therefore be more attractive to BCI researchers where instrument brevity provides considerable practical advantage, while still preserving the dimensionality of the construct being assessed.
(Note: Although the SF-8 may have practical advantages compared to the longer SF-36 it does not include all the items of the SF-36 often used in economic evaluations of healthcare technologies, which might be important to BCI developers aiming to bring products to market).
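The one-item-per-dimension reduction mentioned above (keeping the item with the highest factor loading in each dimension) can be sketched as follows. This is an illustration only: the function name `pick_short_form`, the item identifiers, and the loadings are all made up, loadings are assumed to be positive, and the sketch does not describe how any published short form such as the SF-8 was actually derived.

```python
def pick_short_form(loadings):
    """For each dimension, keep the item with the highest factor loading.

    `loadings` maps dimension name -> {item_id: factor loading}.
    Assumption: loadings are positive, so the largest value is the strongest.
    """
    return {
        dimension: max(items, key=items.get)
        for dimension, items in loadings.items()
    }


# Made-up loadings for illustration; not taken from any real instrument.
example_loadings = {
    "physical": {"p1": 0.62, "p2": 0.81, "p3": 0.74},
    "social": {"s1": 0.70, "s2": 0.55},
}
short_form = pick_short_form(example_loadings)
```

In practice, item selection would also weigh content validity and responsiveness in the target population, not factor loadings alone.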

Neuro-QoL
One further set of PRO measures worth considering for clinical trials of BCI technologies is Neuro-QoL [49]. Neuro-QoL is a comprehensive HRQOL measurement system designed to be psychometrically sound and clinically relevant for individuals with a neurological condition or disorder. Neuro-QoL encompasses physical, mental, and social health domains, and currently contains more than 20 item banks covering targeted sub-domains. This enables BCI researchers to selectively measure the constructs that are most appropriate for their patient population and study design.
Neuro-QoL has been tested within clinical populations including ALS, multiple sclerosis, muscular dystrophy, stroke, Parkinson's disease, epilepsy, spinal cord injury, and traumatic brain injury [50,51]. The short forms corresponding to the 13 original Neuro-QoL item banks have been validated and found to demonstrate both high reliability and an ability to discriminate between patients grouped by disease burden [49].
Neuro-QoL was developed using item response theory. Item response theory-based measures identify the precise association between each individual questionnaire item and the latent construct being measured, such that the participant's response to an item reflects a measurable 'amount' of the construct [52]. This allows for highly flexible, customized administration. Neuro-QoL measures may be administered using 'pen and paper' short forms or electronically via computer adaptive testing, in which the selection of items is individually tailored based upon responses to the previously administered items. Moreover, custom measures can be created containing any number and combination of items from the same bank [53]. Items might be selected to be highly precise around a cutoff score or based on their sensitivity in a known subgroup. These measures can then be scored and compared to other measures derived from the same item bank. Measures are scored using a standardized T-score metric (US population-based mean of 50 and standard deviation of 10), which facilitates interpretation and comparisons. Neuro-QoL also provides normative scoring for a variety of clinical and demographic populations. Overall, Neuro-QoL may allow BCI researchers to construct valid, study-specific questionnaires that contain few irrelevant items, reducing respondent burden.

Further considerations

Patient priorities
The selection of both primary and secondary outcome measures should be driven by the priorities of the patient population the BCI aims to accommodate. For example, developers of a BCI with a stated aim to improve QOL/HRQOL should identify what has the greatest impact on QOL/HRQOL for their target user group. For individuals with ALS, QOL is more strongly related to social disability than physical disability [54]. Similarly, individuals with tetraplegia often report high QOL if they have good social support and are free from chronic pain [29]. A BCI for the control of a personal computer might aim to increase social support via improved communication channels or access to online communities. Therefore, selecting an appropriate PRO measure of social support might be more of a priority than measures of typing speed and accuracy for a clinical trial involving individuals with ALS or tetraplegia (or both).

Practical limitations
Practical limitations in communicating with participants should also be considered when selecting outcome measures [55]. Longer instruments such as the SF-36 may be poorly tolerated by individuals with communication difficulties [56]. Communication difficulties may also make the collection of PRO measures particularly challenging in patients with locked-in syndrome [57]. Although some outcome measures may be collected by proxy, results may not be consistent with patient scoring. As an example, caregivers may rate a patient's QOL significantly lower than the patient themselves [58].

Regulatory guidelines
The FDA has published guidelines for the use of PRO measures in the evaluation of medical devices [24]. This document states that both single- and multi-item PRO instruments can provide valuable evidence for benefit-risk assessments and can be used to support labeling claims providing the instruments are validated in the population of interest. The document presents key considerations which should be addressed prior to the inclusion of a PRO in a study protocol:
• Is the concept being measured by the PRO meaningful to patients?
• Does the PRO represent a primary, secondary, or exploratory outcome measure?
• Is there sufficient evidence to support the use of the PRO to measure the construct of interest in the context of the study design (including patient population)?
• If conducting a multinational trial, is the PRO valid across cultures and languages?

Activities of daily living (ADL)
ADLs comprise the fundamental skills required for independent day-to-day living. Basic ADLs (BADLs), sometimes referred to as physical ADLs, include actions required for self-care such as bathing, dressing, toileting, transferring, continence, and feeding. Instrumental ADLs (IADLs) involve more complex activities related to independent living within the community such as using the telephone, shopping, preparing meals, housekeeping, taking medications, and managing finances. The capacity to perform both BADLs and IADLs is required for functional independence. Functional disability in BADLs and IADLs has been measured in many patient populations including stroke and spinal cord injury [60,61]. There are numerous general and disease-specific instruments that record BADL and IADL capabilities either in combination or separately. Widely used instruments include the Katz Index of Independence in ADL [62], the Barthel Index [63], the Lawton IADL Scale [64], and the Functional Independence Measure [65]. Many ADL measures can be collected by self-report, caregiver report, or observational assessment, and are therefore feasible for subjects with communication difficulties. For a review on assessing ADLs, see Mlinac and Feng [66].
Increasing functional independence is a commonly stated goal of BCI technologies. Recording changes in ADL status might therefore provide a measure of a BCI's ability to increase functional independence. As motor neuroprostheses, BCI technologies can restore or replace motor functions essential to ADLs including the ability to grasp and manipulate objects [6,67]. Regaining these motor abilities might immediately improve ADL status. However, BCIs for the control of a personal computer may not aim to restore physical function. The use of BADL or IADL instruments in clinical trials of these technologies is therefore more challenging.
One previous study assessed changes in IADL status in two individuals with upper limb paresis due to ALS who received an endovascular BCI system [14]. The participants were trained to control a tablet computer using eye-tracking software to control cursor position and the BCI system to generate clicks in lieu of a computer mouse or trackpad. The study targeted three items from the widely used Lawton IADL Scale that could be achieved using a computer: using a [smart] telephone, shopping, and financial management. Specifically, the telephone task involved opening a text messaging application, searching for and selecting the recipient, typing 'hello', and clicking the 'send' button. (A similar email task was also completed.) The shopping task involved opening a web browser, navigating to an online store, searching for two specific items, adding the items to the cart, and clicking on 'check out'. The finance task involved navigating to their internet banking website using a web browser, logging on, and checking their balance. Both participants, who were reliant on caregiver assistance to perform these activities prior to the study, successfully completed each of the tasks (qualitatively assessed as either successful or unsuccessful) 106-238 days following implantation.
It is important to note that although the assessed tasks corresponded to three items on the Lawton IADL scale, Oxley et al [14] did not claim a score increase of three points on this scale. Computerized means of completing IADL tasks may not have been envisaged by the authors of the original Lawton IADL scale, or by studies assessing its validity and reliability. Previously established psychometric properties, including established norms or clinically important differences, may no longer apply when using the computerized tasks, and caution should be applied when comparing these results to previous findings. Instead, Oxley and colleagues reported that the computerized tasks were designed to meet the participants' real-world needs and represented a demonstration of the minimal level of functionality required for task independence. As such, the completion of three tasks considered important to daily living by patients who were previously unable to perform them independently still offers a valid demonstration of increased functional independence.
This example demonstrates some of the limitations arising from the bias of existing instruments towards physical capability. Many of our daily activities are now performed digitally, often using a personal computer with internet access. In other words, using a computer is a 21st century activity of daily living. This reliance on digital technologies may be increased with physical disabilities, making the ability to perform digital tasks even more fundamental to daily living in these populations. However, instruments to measure the increases in functional independence enabled via control of a computer are lacking. The validation of new instruments, or perhaps the creation of a Digital ADL scale, is therefore urgently required. Gaining regulatory and payor acceptance of digital ADLs as a core component of functional independence may present a further hurdle, but this may be a necessary step in bringing a variety of efficacious digital health technologies to market.

Employment
Sellers et al [9] presented a case study of a 51-year-old with ALS who, despite being unable to use conventional assistive devices, was able to run an NIH-funded research laboratory using his BCI. This individual case demonstrated that the ability to control a personal computer using a BCI increased functional independence in an employment capacity. Pivotal trials may not be able to present this outcome in the same level of detail as single-subject case studies; however, the ability to maintain employment represents an index of functional independence of substantial socioeconomic importance to many individuals. Measuring vocational and employment outcomes, such as employment status or hours of work per week, might therefore be considered where applicable.

Further considerations

Patient priorities
As with PROs, patient priorities should be considered when assessing changes in functioning. Huggins and colleagues investigated the importance of various BCI-related tasks to individuals with ALS and spinal cord injury [68,69]. Tasks commonly rated as high in importance included operating a computer, controlling room temperature and lighting, and emergency communication. However, ratings varied widely between individuals, and 'other' received the highest overall ratings. As control of a computer enables countless different tasks to be completed, how patients choose to use a BCI enabling the control of a personal computer will differ between individuals. This should be considered before assessing improvements in function using standard instruments or pre-set task lists, which may contain tasks that an individual has not attempted to perform and omit meaningful tasks that the BCI enables them to perform well. Moreover, the tasks individuals perform with computers may increase as technology continues to be integrated into daily life.
Goal attainment scaling is a method for rating individually set targets during an intervention [70]. In contrast to traditional measures that include a uniform set of tasks each rated according to pre-set criteria, goal attainment scaling includes tasks that are individually identified as most important to the patient, with criteria pre-set according to each individual's current and expected levels of task performance. As such, each patient has an individualized outcome measure, which is scored using a five-point scale from −2 (achieved much less than expected) to 2 (achieved much more than expected). This standardized method of scoring allows the composite goal score to be transformed into a T-score with a mean of 50 and standard deviation of 10 [70]. Bias in goal setting can also be checked by comparing the mean T-score to the expected value of 50.
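The transformation of goal attainment ratings into a T-score can be sketched as below. The text above does not give the formula; this sketch assumes the commonly cited Kiresuk-Sherman form, T = 50 + 10 Σ(w·x) / sqrt((1 − ρ) Σw² + ρ (Σw)²), with an assumed inter-goal correlation of ρ = 0.3 and optional goal weights w. The function name `gas_t_score` and its parameters are hypothetical.

```python
import math


def gas_t_score(attainment, weights=None, rho=0.3):
    """Composite goal attainment T-score for one patient.

    attainment: one rating per goal, each from -2 to 2.
    weights: optional importance weight per goal (defaults to equal weights).
    rho: assumed correlation between goal scores (0.3 by convention; an
         assumption of this sketch, not a value stated in the review).
    """
    if not attainment:
        raise ValueError("at least one goal is required")
    if weights is None:
        weights = [1.0] * len(attainment)
    if len(weights) != len(attainment):
        raise ValueError("weights and attainment must have the same length")
    if any(x not in (-2, -1, 0, 1, 2) for x in attainment):
        raise ValueError("each rating must be an integer from -2 to 2")

    weighted_sum = sum(w * x for w, x in zip(weights, attainment))
    sum_w = sum(weights)
    sum_w_squared = sum(w * w for w in weights)
    denominator = math.sqrt((1 - rho) * sum_w_squared + rho * sum_w ** 2)
    return 50.0 + 10.0 * weighted_sum / denominator
```

Under this formula, a patient who achieves exactly the expected level on every goal (all ratings 0) scores 50, which is why a group mean far from 50 can indicate goals that were set too easy or too hard.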
Overall, methodological approaches that consider each individual's priorities prior to BCI intervention may best capture meaningful increases in functioning.
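The T-score transformation used in goal attainment scaling can be illustrated with a short sketch. It assumes the commonly cited Kiresuk-Sherman formula, T = 50 + 10 Σ(w_i x_i) / sqrt((1 − ρ) Σ w_i² + ρ (Σ w_i)²), with the conventional assumed inter-goal correlation ρ = 0.3. The function name, the equal default weights, and the example goals are hypothetical and are not taken from the studies reviewed here.

```python
import math

def gas_t_score(scores, weights=None, rho=0.3):
    """Goal attainment scaling T-score (Kiresuk-Sherman form).

    scores  : attainment level for each goal, from -2 to +2
    weights : optional importance weight for each goal (equal if omitted)
    rho     : assumed average correlation between goal scores (assumption: 0.3)
    """
    if weights is None:
        weights = [1.0] * len(scores)
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal length")
    if any(s < -2 or s > 2 for s in scores):
        raise ValueError("each score must lie between -2 and +2")

    weighted_sum = sum(w * s for w, s in zip(weights, scores))
    sum_w = sum(weights)
    sum_w_sq = sum(w * w for w in weights)
    denominator = math.sqrt((1 - rho) * sum_w_sq + rho * sum_w ** 2)
    return 50 + 10 * weighted_sum / denominator

# Hypothetical participant with three individually set goals
print(round(gas_t_score([0, 0, 0]), 1))  # every goal met exactly as expected
print(round(gas_t_score([1, 0, 2]), 1))  # goals exceeded on balance
```

A participant who meets every goal exactly as expected scores 50, so a cohort mean well above or below 50 may indicate that goals were set too leniently or too ambitiously.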

Conclusion
BCI control of a personal computer may enable increases in functional independence without improvements in physical function or mobility. This includes functional independence in ADLs and employment, which can be included as trial outcomes (table 2). While evaluating changes in ADL status following independent use of BCIs is possible, existing instruments are heavily dependent on physical capabilities. The development of new scales may therefore improve future evaluations of the impact of a BCI on 'how a patient functions'. These scales should be sensitive to the fact that a significant level of functional independence can now be achieved with digital technologies that were not widely available when the original BADL and IADL scales were developed (more than half a century ago in the case of the Lawton IADL Scale). Moreover, including tasks important to each individual user in these assessments would also contribute to a meaningful evaluation.

Health outcomes
The effects of BCIs on health outcomes were not evaluated in any of the publications we considered within this review. This is unsurprising given that only a few early feasibility studies have included BCIs for the independent control of a personal computer outside of supervised research sessions [9][10][11][12][13][14]. However, it is plausible that using a BCI to control a personal computer could influence health outcomes, and future pivotal trials could aim to capture these potential benefits where applicable.

BCI for health-related communication
People with severe communication difficulties, such as those with locked-in syndrome, represent a target population for some BCIs. For these individuals, the restoration of 'direct personal communication' may be considered the most important function of a BCI [71]. Ineffective means of communication can lead to negative affect associated with a loss of control, frustration due to not being understood, and an unmet desire for individualized care [72]. Therefore, BCIs that offer a reliable method of communication could provide a voice to these patients and might increase their autonomy over their healthcare as well as their daily lives.
It is morally imperative to involve patients in major healthcare decisions including end-of-life preferences, yet such decisions are often made by others on behalf of fully alert patients with severe communication difficulties. A key aim of some BCIs is to restore a patient's ability to communicate important healthcare preferences [73,74], which might be particularly meaningful for patients without advance directives. However, criteria for meeting this outcome are unclear. Fins and Schiff [75] have argued that a patient's inability to initiate questions, give nuanced responses, or demonstrate understanding commensurate with the significance of a decision might leave doubt over their capacity to consent. For example, 'yes/no' responses are unlikely to meet a 'clear and convincing' standard of evidence and might amount to assent at best [75]. The ability to initiate questions and give full responses should therefore be reflected in the outcome measures of trials of BCIs for communicating healthcare decisions.
Of course, healthcare decisions do not have to be life-or-death to be meaningful to patients. Also important are day-to-day healthcare choices such as pain management. Previous studies on intensive care patients with communication difficulties following respiratory tract intubation have shown pain to be the most common topic of attempted communication as well as the most difficult topic to communicate successfully [76,77]. Happ et al [77] found that 38% of communications about pain were unsuccessful, compared to 25% of communications about other topics. Moreover, communications initiated by the patient were typically less successful than those initiated by a care provider [77]. Effective communication with regard to pain and use of pain medication is important as undertreatment can lead to discomfort and distress, whereas inappropriate administration of pain medications including opiates can pose risks such as organ damage and addiction [78,79]. Future clinical trials might quantify the number of successful attempts at communication on different health-related topics (e.g. pain management) or record changes in perceived efficiency of these communications from both the patient's and caregiver's perspective.
The ability to initiate communication may be particularly impactful to patient health. For individuals with severe communication difficulties, establishing patient-initiated communication channels using a BCI might enable the user to inform caregivers of early symptoms of secondary medical complications before they are evident from vital signs or routine monitoring [29]. For example, being able to communicate shortness of breath may enable patients to alert care providers to the onset of respiratory complications (e.g. pneumonia), which are among the most common causes of hospitalizations and mortality in individuals with paralysis following ALS or spinal cord injury [80][81][82]. Earlier identification of potential medical complications may also permit preventative or early-stage treatment options resulting in improved outcomes. For instance, early identification of respiratory failure and initiation of noninvasive ventilation can improve survival in individuals with ALS [83,84]. Moreover, in spinal cord injury populations, it has been shown that effective communication between patients and health professionals can optimize prevention of pressure ulcers [85]. BCIs may be instrumental in facilitating this level of collaboration for patients with severe communication difficulties.
BCI for control of a computer may also facilitate direct communication between patients and clinicians using digital platforms. Enhancing the efficiency of this direct communication may have several benefits that would not be limited to individuals with severe communication difficulties. For example, patients may be embarrassed to discuss certain healthcare issues with family or aides. Direct communication between patients and clinicians might therefore reduce the embarrassment felt by patients or enable the disclosure of health-relevant information that would otherwise not have been reported via an intermediary.
Establishing the efficacy of BCIs for control of a computer in improving specific health outcomes (e.g. incidence of pressure ulcers) may not be feasible in heterogeneous clinical trial cohorts; however, more general measures of health outcomes could be captured. The incidence of any secondary medical complication, the number and duration of hospital stays/days out of hospital, and even differences in mortality are quantifiable during longitudinal studies of independent home use of BCIs. Importantly, demonstrating these potential health benefits would require that the number of participants be sufficiently high, which is a major challenge for clinical trials of novel invasive BCIs. Sufficient follow-up periods must also be included within study designs to reliably capture these metrics. Finally, new channels for patient-initiated communication might improve health outcomes without needing to meet formal standards of an actual 'alarm' system.
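As a rough illustration of how such general health metrics could be summarized, the sketch below computes a complication rate per person-year and the proportion of follow-up days spent out of hospital. The record structure, field names, and example values are hypothetical and chosen only for illustration; a real trial would follow its own pre-specified statistical analysis plan.

```python
from dataclasses import dataclass

@dataclass
class FollowUp:
    """Hypothetical per-participant follow-up record for a home-use BCI study."""
    days_observed: int   # total days of follow-up
    complications: int   # secondary medical complications recorded
    hospital_days: int   # days spent in hospital during follow-up

def summarize(records):
    """Return the complication rate per person-year and proportion of days out of hospital."""
    total_days = sum(r.days_observed for r in records)
    if total_days == 0:
        raise ValueError("no follow-up time recorded")
    person_years = total_days / 365.25
    rate = sum(r.complications for r in records) / person_years
    days_out = total_days - sum(r.hospital_days for r in records)
    return {
        "complications_per_person_year": rate,
        "proportion_days_out_of_hospital": days_out / total_days,
    }

# Illustrative values only, not data from any study
cohort = [FollowUp(365, 1, 5), FollowUp(365, 0, 0), FollowUp(180, 2, 12)]
print(summarize(cohort))
```

Expressing complications per unit of follow-up time, rather than as raw counts, allows participants with different observation periods to be pooled.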
Secondary medical complications can also be expensive to treat [86,87] and additional complications arising during rehospitalizations can further inflate medical costs [88,89]. A reduction in incidence or severity of secondary medical complications via early identification could therefore decrease the total financial cost of a patient's medical care. Although cost-benefit analyses may not be within the purview of many BCI researchers, achieving a positive benefit-to-cost ratio can be important for health technologies to obtain health insurance coverage. Early and regular collection of economic data alongside effectiveness data has therefore been recommended to facilitate the translation of BCI prototypes to clinically available products [90].

Further considerations
4.3.1. Safety
Primary and secondary safety endpoints might also contribute to an assessment of 'how a patient survives'. Reported safety outcomes should include a detailed description of all adverse events for all participants. Scheduled assessments of physical or cognitive ability might also be considered to ensure BCI placement does not cause a worsening of disability (accounting for progressive disease states). Anticipated adverse events and safety monitoring will vary depending on the invasiveness of the BCI modality. Novel invasive BCIs will require a different approach to non-invasive BCIs. The collection of adverse event data is necessary to determine the risk associated with a BCI, which can be weighed against the potential benefits.
Adverse event monitoring has been reported by previous studies involving invasive BCIs for the control of a computer. Vansteensel et al [12] reported that their 58-year-old patient experienced one serious adverse event: a fever arising five days following the implantation of the subdural electrode array and subcutaneous transmitter device, which led to readmission following discharge from the hospital earlier on the same day. This serious adverse event resolved without treatment, allowing discharge from the hospital the following day. A postsurgical feeling of numbness in the skin around the left ear and increased tiredness were also reported, but these adverse events were not serious. Oxley et al [14] reported no serious adverse events for either of their two participants following implantation of an endovascular electrode array and subcutaneous telemetry unit. One participant experienced a post-procedural-related episode of syncope one day following surgery, although it was suggested that this was not device-related.

Conclusion
Establishing channels for patient-initiated personal communication is a sought-after BCI function, which may increase the level of autonomy a patient has over their healthcare. The discussion above identifies two potential benefits to this increase in autonomy that can be evaluated in pivotal clinical trials of BCIs for the control of a personal computer/communication device: (a) patients are actively involved in decision making such that they receive the healthcare they request, and (b) patients inform care providers of symptoms of secondary medical complications, leading to reductions in the incidence or severity of secondary medical complications. Quantifying healthcare use (e.g. incidence of secondary medical complications, volume of hospital stays, etc) may be used to demonstrate clinical benefit in sufficiently large clinical trials.

Interdependence
Changes in how a patient feels, functions, and survives may not be independent of each other. HRQOL and ADL status may be positively correlated [91,92], and both have been suggested to be predictive of morbidity and mortality [93,94]. Including separate measures of how a patient feels, functions, and survives may lead to a greater understanding of how these constructs are related to each other in patients with BCIs.

Non-clinical measures
While the focus of this review was restricted to assessments of clinical benefit, we remain advocates for holistic, user-centered evaluations of BCI technologies. Previous research has examined patient preferences and performance requirements for BCI adoption. For example, a telephone survey of 61 people with ALS found that 84% of respondents reported they would be satisfied with a command accuracy of 90%, and 72% of respondents reported they would be satisfied with a typing speed of 15-19 characters per minute using a BCI [68]. Accuracy, typing speed, and numerous other performance metrics (see [17,95]) are important for demonstrating that BCI performance meets acceptable standards for different BCI user groups. The external validity of clinical trial outcomes may be compromised if BCI performance limits uptake or adherence once the trial has finished. Patient follow-up focusing on whether participants continue to use their BCI and the reasons for stopping may be important to better understand informed patient preferences.
As BCI performance improves, more emphasis may be placed on other aspects of user experience. Thompson et al [95] have proposed the 'uFEEL' framework for BCI assessment, which includes four user experience factors: usability, affect, ergonomics, and quality of life. 'Usability' includes measurements of effectiveness, efficiency, learnability, and satisfaction. 'Affect' concerns participant feelings and might include how comfortable a system is or the unpleasantness of any audio/visual stimuli. 'Ergonomics' corresponds to human-computer interactions and might include level of control and cognitive workload. Finally, the 'quality of life' factor was proposed to measure an overall quality of experience and perceived return on investment. We do not discourage investigators from applying this framework; each of these factors includes worthwhile measures of BCI engineering. However, we caution readers not to confuse these measures of user experience with measures of clinical benefit.
Finally, control of a personal computer allows for a wide variety of tasks to be performed. Both BCI performance and user experience may influence the way in which a patient chooses to use a BCI for the control of a personal computer. This could influence the effects of a BCI on how a patient feels, functions, and survives both during and after a clinical trial. Engineering-oriented measures will therefore remain as important outcomes of BCI research alongside clinical outcomes.

Study design
Conventional randomized controlled trials may not always be feasible or ethical for studies of BCIs. Providing a sham condition for implantable BCIs might be impossible and there may be no obvious comparator. Furthermore, due to the potential risks associated with novel implantable BCI devices, subject numbers in clinical trials may be limited, particularly in feasibility trials. In these cases, single-subject or small group analytical approaches such as regression discontinuity may be more methodologically valid than group comparisons. As many recipients of BCI technologies may have progressive conditions or worsening health, the multiple baseline measurements associated with these approaches are useful to establish trends in outcome measures prior to the start of a BCI intervention.
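One way to make use of multiple baseline measurements is to fit a trend to the pre-intervention scores and compare later observations with its projection. The sketch below is illustrative only: it applies ordinary least squares to hypothetical data for a single participant, it is not the regression discontinuity analysis mentioned above, and it ignores autocorrelation and uncertainty, which a real analysis would need to address.

```python
def fit_line(times, values):
    """Ordinary least-squares slope and intercept for a single series."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t

def deviations_from_baseline_trend(baseline, post):
    """Observed minus projected score for each post-intervention point.

    baseline, post : lists of (time, score) pairs
    """
    slope, intercept = fit_line([t for t, _ in baseline], [v for _, v in baseline])
    return [score - (intercept + slope * t) for t, score in post]

# Hypothetical monthly scores for one participant with a declining baseline
baseline = [(0, 40.0), (1, 38.0), (2, 36.0), (3, 34.0)]
post = [(4, 35.0), (5, 36.0)]
print(deviations_from_baseline_trend(baseline, post))
```

Positive deviations indicate scores above the level projected from the declining baseline, which a simple pre-post comparison could miss when the underlying condition is progressive.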
Researchers might also consider retrospective pretest-posttest designs where participants rate PRO items twice during the same posttest measurement occasion; first corresponding to their current feelings following the intervention, and second corresponding to how they felt at a specific time prior to the intervention (i.e. retrospectively) [96]. This approach addresses validity concerns associated with PROs assessed via traditional pre-post evaluation designs, which can result in weaker than expected effect sizes. For example, a participant's internal frame of reference when rating HRQOL may not be consistent over time. This approach may also have practical benefits for participants with severe communication difficulties at baseline. Importantly, this approach can be used in combination with conventional pretest-posttest assessments. Alternatively, researchers could consider using a self-anchoring rating scale such as Bernheim's Anamnestic Comparative Self-Assessment to measure subjective well-being. Here, self-reported ratings of well-being are scored between −5 and +5, which correspond to the worst and best periods within the respondent's own life experience. Using empirical scale anchors provides an internal reference instead of abstract scale anchors that encourage external reference for determining subjective well-being [97]. This may improve both sensitivity and responsiveness to intervention compared with conventional questions [98]. Anamnestic Comparative Self-Assessment has previously been used within target populations of BCI including individuals with locked-in syndrome [99]. Detailed discussions of these 'non-traditional' methodological approaches are beyond the scope of this review, but we highlight them here for consideration alongside the selection of outcome measures.
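The difference between conventional and retrospective change scores can be sketched as follows. The function and variable names are hypothetical, the ratings are invented values on a −5 to +5 scale, and simple means are used purely for illustration rather than as a recommended analysis.

```python
def change_scores(pre, post, then):
    """Compare conventional and retrospective change for one set of PRO ratings.

    pre  : ratings collected before the intervention
    post : ratings collected after the intervention
    then : retrospective ratings of the pre-intervention state, collected
           at the same posttest occasion as `post`
    """
    if not pre or not (len(pre) == len(post) == len(then)):
        raise ValueError("rating lists must be non-empty and equal length")
    n = len(pre)

    def mean(xs):
        return sum(xs) / n

    return {
        "conventional_change": mean(post) - mean(pre),
        "retrospective_change": mean(post) - mean(then),
        # A non-zero value suggests the frame of reference moved between occasions
        "frame_shift": mean(then) - mean(pre),
    }

# Hypothetical ratings from three participants on a -5 to +5 scale
print(change_scores(pre=[1, 0, 2], post=[2, 1, 3], then=[-1, -2, 0]))
```

In this invented example the retrospective change is larger than the conventional change, which is the pattern expected if participants reappraise their pre-intervention state after experiencing the intervention.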

Other resources
The FDA has recently published a draft guidance document that puts forward many considerations for clinical trials of implanted BCIs in the United States [100]. This document included numerous recommendations for pre-clinical testing as well as considerations regarding patient populations, informed consent, testing in the home environment, and investigational and statistical analysis plans for clinical studies. The document emphasized that primary and secondary effectiveness endpoints in pivotal trials must be validated for the intended patient populations, but indicated feasibility studies may be used to validate desired metrics where there is strong justification for the use of novel endpoints. We recommend BCI researchers consult this helpful documentation prior to designing clinical trials of any BCI technology. Moreover, clinical trials of novel BCI technologies should be designed in cooperation with regulatory bodies and payors where possible. Discussing which outcome measures would be appropriate to provide supporting evidence is essential before commencing trials.

Conclusion
BCIs that enable the independent control of a personal computer may provide myriad benefits to multiple patient populations. However, for these benefits to be realized, BCI technologies must become clinically available beyond participation in research studies. The path to market for these technologies may involve approval from both health insurance providers and regulatory bodies requiring demonstration of a clinical or financial benefit associated with adoption of the technology. Valid outcome measures for demonstrating clinical benefit are therefore fundamental to the translation of BCI research. However, selecting appropriate measures for clinical trials of BCIs for the control of a personal computer is challenging as there may be no expectation of restoring physical function, which many existing instruments are biased towards. This review presented the measures that have been used in previous trials involving independent use of a BCI to control a personal computer and made some suggestions for future research. Overall, the careful selection of measures of how a patient feels, how a patient functions, and how a patient survives can provide a comprehensive overview of clinical benefit.
Although we have focused on outcome measures suitable to clinical trials of BCIs for the control of a personal computer, we hope this review will be applicable to trials of many other BCI technologies. Similarly, we have been deliberately unspecific regarding BCI sensor type (surface EEG, electrocorticography, functional near-infrared spectroscopy, etc) as our discussions should apply to a variety of approaches to BCIs. However, we acknowledge that invasive and non-invasive BCI users may have different priorities/expectations that could influence the selection of outcome measures. Finally, we have not included studies demonstrating BCI control of a personal computer that did not allow for independent use (i.e. outside of supervised research sessions) as these studies were not well suited to assess clinical benefit and did not typically report attempts to do so. The small number of remaining studies included within this narrative review highlights the need for further attention to this topic. We hope this review will stimulate discussion in this area among a broad coalition of stakeholders including patients, caregivers, clinicians, researchers, ethicists, advocates, payors, regulators, and BCI companies.

Data availability statement
No new data were created or analysed in this study.