Advances in Neuroprosthetic Learning and Control

This essay summarizes recent advances in the field of brain-machine interfaces, with a focus on the learning and acquisition of neuroprosthetic skills.


Introduction
The goal of cortically controlled motor neuroprosthetics [1-14] is to reliably, accurately, and robustly convey enough motor control intent from the central nervous system (CNS) to let patients with amputated, paralyzed, or otherwise immobilized limbs drive multi-degree-of-freedom (DOF) prosthetic devices for long periods of time (decades). To achieve this goal, two main challenges remain: 1) making viable neural interfaces that last a lifetime, and 2) achieving skillful control and dexterity of a multi-DOF prosthetic device comparable to natural movements. In a brain-machine interface (BMI) system, neural signals recorded from the brain are fed into a machine that transforms these signals into a motor plan. This is the subject's "intention of movement," which is then streamed to the prosthetic device. A closed control loop is established by providing the subject with visual and sensory feedback of the prosthetic device.
The first challenge is to build a neural interface that remains viable for a lifetime. At the front end, the physical substrate should withstand the variety of biotic and abiotic effects that presumably degrade performance at the electrode-tissue interface [15]. At the back end, the system should be wireless, require minimal power, and support bidirectional dataflow, i.e., "reading" from and "writing" to the brain. Ideally, these systems would be fully implantable in the intracranial space and operate without batteries. They should also be modular enough to allow the measurement and stimulation of different types of neural signals, such as the electrical activity of individual neurons or groups of neurons, as well as other physiological parameters such as glucose, brain pulsation, etc. that may become important for powering the implanted device in future generations of this technology [16].
The second challenge is getting the brain to recognize an "actuator," or prosthetic device, that is not part of the body and to control it without enacting overt physical movements (as in the case of a paralyzed patient). This challenge has two distinct components: motor and sensory. On the sensory side, the goal is to provide realistic sensory feedback from the prosthetic device by directly stimulating sensory areas of the brain in a way that mimics the lost or damaged inputs. This should allow the user to feel the environment through the prosthetic device, an idea supported by recent examples using electrical microstimulation [17,18]. Future BMI systems may incorporate optical stimulation in lieu of electrical stimulation [19]. On the motor side, we suggest that, to boost the performance of current BMI systems, neural adaptation (brain plasticity) and artificial adaptation (machine learning) should be combined in a coadaptive way. Ultimately, the goal is to achieve a quantum-leap increase in neurally controllable degrees of freedom that should allow a patient to effortlessly perform tasks of daily living.
Next we focus on recent advances, mostly from our laboratory, that are relevant to the acquisition and retention of skills to control disembodied effectors such as computer cursors and prosthetic limbs. These advances center on the concept of "prosthetic motor memory" facilitated through brain plasticity, and on the "tuning" of decoding algorithms while the subject is using the BMI.

Decoding Natural Actions vs. Learning to Perform New Ones
When thinking about BMI design, there are at least two different approaches one could take for converting thought into action, a choice also known as the decoding vs. learning argument [20,21]. One approach aims to decode (or read out) the natural motor plan to control the missing, impaired, or intact limb. In this approach, a mathematical model or decoder that relates neural activity to natural limb movements is generated and then used to predict these movements from the recorded neural activity alone. The other approach requires the brain to learn a transform in order to control the new actuator, irrespective of physical movement of the natural limb. This approach treats a BMI system as a "modified CNS" that has to be learned.
But why should we approach BMI as a modified CNS? When interfacing the brain with a machine, we are effectively creating a de novo circuit for action. The neuroprosthetic system under control in this new circuit is fundamentally different from the natural system used to control the native arm. For instance, our musculoskeletal systems have little in common with robotic limbs in how they function and how they are controlled. The same applies to the spinal cord, which in the neuroprosthetic system is approximated by a set of mathematical rules called the transform. This transform projects from a high-dimensional space of dozens to hundreds of neurons to a subspace of a few control signals (e.g., position and velocity of the end effector). This is particularly important because the motor and sensory pathways will be compromised in patients with spinal cord injury or other neurological disorders. Hence, if we are trying to control a prosthetic device that is different from our native arm, why should we aim to decode the brain signals related to this arm in the first place? Instead, could the brain learn to control a prosthetic device that is not part of the natural body and generate novel actions with it?
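To make the idea of the transform concrete, here is a minimal sketch that assumes the simplest possible form: a fixed linear map from baseline-subtracted firing rates to a two-dimensional velocity command. The essay does not specify the form of the transform; the function name `apply_transform`, the weight matrix, and all numerical values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def apply_transform(weights: Sequence[Sequence[float]],
                    baseline: Sequence[float],
                    firing_rates: Sequence[float]) -> List[float]:
    """Project an N-neuron firing-rate vector onto K control signals (K << N).

    ASSUMPTION: the transform is linear; real decoders may be more elaborate.
    """
    if len(firing_rates) != len(baseline):
        raise ValueError("firing_rates and baseline must have the same length")
    # Remove each neuron's baseline rate so that resting activity maps to zero output.
    centered = [rate - base for rate, base in zip(firing_rates, baseline)]
    controls = []
    for row in weights:
        if len(row) != len(centered):
            raise ValueError("each weight row needs one entry per neuron")
        controls.append(sum(w * c for w, c in zip(row, centered)))
    return controls


# Hypothetical example: 4 neurons drive a 2-D cursor velocity (vx, vy).
W = [[0.5, -0.5, 0.0, 0.0],     # weights producing vx
     [0.0, 0.0, 0.25, -0.25]]   # weights producing vy
baseline = [10.0, 10.0, 10.0, 10.0]   # resting rates (spikes/s)
rates = [14.0, 10.0, 12.0, 8.0]       # rates observed in one time bin
velocity = apply_transform(W, baseline, rates)
print(velocity)  # [2.0, 1.0]
```

In this toy setting, "learning the transform" corresponds to the brain changing `rates` so that the fixed `W` yields the desired output, whereas decoder adaptation corresponds to changing `W` itself.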
One important aspect relevant to this discussion is the type of experimental model used for BMI experiments. The bulk of the invasive BMI work (i.e., that which uses implantable technology to record from populations of neurons) is currently done in able-bodied animal subjects [22], with a few exceptions: animal models of spinal cord injury [23], temporary models of paralysis induced via nerve blocks [9,11,12], and a few clinical trials in humans [7,13,14]. In addition, there is an increasing number of studies in epileptic and stroke patients that involve BMI tasks using electrocorticographic (ECoG) signals [24,25]. So, how can able-bodied subjects learn these circuits for neuroprosthetic control? Our approach is to change the rules of the instrumental learning task previously learned under manual control.
Take for example a standard center-out reaching task in which the subject manually controls an actuator (a robotic manipulandum, an exoskeleton, or its own natural limb) to reach for instructed targets in order to obtain a reward. The performance feedback received by the subject typically consists of a visual observation from a computer screen displaying the controlled actuator. Upon switching to neural control (or "BMI mode"), the experimenter swaps the visual feedback from the actuator controlled manually with that of the actuator controlled through the BMI. At this point, the manner in which the subject is instructed to perform the BMI task is key. By physically removing the actuator from the experimental rig during BMI mode (or restraining the arm to the primate chair when the natural arm is the actuator used to reach for targets), we are effectively changing the rules of the task. It is no longer a "move joystick to center target" type of rule but one that requires learning to mentally steer the actuator using biofeedback. This triggers a learning process that we call "transform learning" [26], in which the brain modifies the tuning properties of the neurons incorporated into the BMI (and therefore causally linked to behavior) to minimize error in the motor output through a process of plasticity [10,27,28].
Alternatively, not changing the rules of the task upon switching to BMI mode (that is, keeping the experimental setup and task the same as during manual control) typically leads the able-bodied subject to continue behaving as in the manual task (i.e., overtly moving the natural limb, unaware that the mode of operation has changed from manual to BMI). Thus, as the afferent and efferent pathways in the subject remain intact, the patterns of neural activity evoked during BMI mode will be very similar to those evoked during manual control. This mode of BMI operation predicts few plastic changes in the brain, since the same circuits for motor control are being used.

Circuit Stability Facilitates Prosthetic Motor Memory
As noted, in the learning approach to BMI control, the subject has to learn the "spinal cord" (i.e., the transform) for neuroprosthetic function in order to perform the actions required to achieve the desired goals. For practical reasons, this learning process is typically initialized with a biomimetic transform (i.e., generated from natural arm movement data). However, this is not a requirement, as previous work has shown that primates and even rodents can also learn arbitrary transforms trained with nonbiomimetic data [9,10,28,29], demonstrating the capacity of the brain to create de novo circuits to perform novel (neuroprosthetic) actions.
Regardless of the way in which the transform is trained (biomimetic or not), the complexity of both the prosthetic device to be controlled (e.g., degrees of freedom of the apparatus) and the task to be performed plays a crucial role in transform learning. For instance, in early studies [2-4], new transforms were trained at the beginning of every session. In this approach, the subject has to effectively learn a new transform every day before being able to perform the task proficiently. If the task is simple enough, the brain can learn the transform in a single day (intrasession). However, as task complexity increases, it becomes more difficult for the subjects to learn the daily trained transforms. This results in variable performance from day to day that prevents consolidation and retention of prosthetic skill [10,26]. Hence, intrasession transform learning alone becomes impractical for learning skillful neuroprosthetic control. So, how can we achieve consolidation of the learned skill?
We hypothesized that pairing stable neural recordings with a fixed, static transform (as opposed to retraining the transform every day) would lead to retention of the learned skill across time and therefore facilitate the consolidation of a prosthetic motor memory. The key element here is the stability of the circuit: the neural input to the transform and the parameters of the transform remain unchanged throughout learning. This is what we tested in previous work [10,27], in which we showed that the primate brain can achieve and consolidate skilled control of a prosthetic device in a way that resembles natural motor learning, i.e., a motor skill that is retained. Specifically, when a fixed transform algorithm was applied to stable recordings from an ensemble of primary motor cortex (M1) neurons across days, there was dramatic long-term consolidation of prosthetic motor skill. This process created a motor map for prosthetic function that was readily recalled and remarkably stable across days. Surprisingly, the same set of neurons could learn and consolidate a second motor map without interference with the first map, highlighting another attribute of transform learning that is similar to natural motor learning: the ability to learn new motor skills without interfering with previously acquired skills.
Hence, transform learning leads to the formation of a stable cortical map that has the putative attributes of a memory trace; namely, it is stable across time, readily recalled, and resistant to interference. We believe such a prosthetic motor memory will be critical for the skillful control of multi-DOF prosthetic limbs, and that these devices could eventually be controlled through the nominally effortless recall of motor memory in a manner that mimics natural skill acquisition and motor control.

The Role of Machine Learning in the BMI Loop
What about adaptation taking place in the machine instead of the brain? This process, known as closed-loop decoder adaptation (CLDA) [30], is an emerging paradigm for achieving rapid performance improvements in BMI control (Figure 1). CLDA consists of adapting the transform's or decoder's parameters during closed-loop BMI operation (i.e., while the subject is using the BMI) to more accurately represent the mapping between the user's neural activity and their intended movements [3,31-36]. The error signals required to adapt the decoder can be estimated in a variety of different ways, including using the task goals to infer the subject's intention [33], Bayesian methods to self-train the decoder [34], and extracting error signals directly from the brain [35].
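As an illustration of the first of these options, the sketch below infers an intended velocity from a known task goal. It assumes, purely for illustration, that the subject always intends to move straight toward the current target at the decoded speed; this is one simple heuristic, not necessarily the rule used in [33]. The function name `infer_intended_velocity` and the 2-D cursor setting are hypothetical.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def infer_intended_velocity(cursor: Vec2, target: Vec2, decoded_velocity: Vec2) -> Vec2:
    """Estimate the intended velocity from the task goal.

    ASSUMPTION: the subject intends to move directly toward the target,
    so the decoded speed is kept and only the direction is corrected.
    """
    speed = math.hypot(decoded_velocity[0], decoded_velocity[1])
    dx, dy = target[0] - cursor[0], target[1] - cursor[1]
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        # Cursor is already on the target: assume the intention is to stop.
        return (0.0, 0.0)
    return (speed * dx / distance, speed * dy / distance)


# Hypothetical example: the decoder moves the cursor straight up,
# but the target lies up and to the right.
decoded = (0.0, 5.0)
intended = infer_intended_velocity(cursor=(0.0, 0.0), target=(3.0, 4.0),
                                   decoded_velocity=decoded)
# The difference between intended and decoded velocity is the error signal
# that a CLDA algorithm could use to adjust the decoder's parameters.
error = (intended[0] - decoded[0], intended[1] - decoded[1])
```

Pairs of recorded neural activity and inferred intended velocity can then serve as training data for re-estimating the decoder while the subject keeps using the BMI.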
The design process of a CLDA algorithm requires important decisions not only about which parameters of the decoder should be adapted and how they should be adapted, but also about when (i.e., how often), as the rate at which the decoder changes can influence performance. Also important is the way in which the decoder is initialized. Movement disorders such as paralysis and stroke prevent patients from making the types of natural movements that are often used to initialize the decoder. As a result, less favorable methods of decoder initialization, such as motor imagery, must be used, typically resulting in low initial performance. To address the problem of accelerating learning and boosting BMI performance in these settings, we recently developed SmoothBatch, a CLDA algorithm whose main feature is that it improves performance in a relatively short time, independent of the decoder initialization conditions and of the subject's initial closed-loop BMI performance [36]. This method infers the subject's intended movement goals during online control [33] and updates the decoder on an intermediate (1-2 min) time-scale. This could be particularly useful in clinical applications in which the patient cannot move the limbs.
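The essay does not spell out the update rule behind SmoothBatch. The sketch below therefore assumes one plausible way to update a decoder on a batch time-scale: after each batch of data (e.g., 1-2 min), the decoder parameters are refit and then blended with the previous parameters through an exponentially weighted average. The function names, the half-life parameterization, and all values are assumptions made for illustration and are not taken from [36].

```python
from typing import List

Matrix = List[List[float]]


def alpha_from_half_life(batch_seconds: float, half_life_seconds: float) -> float:
    """Weight on the old parameters such that their influence halves every half-life."""
    if batch_seconds <= 0 or half_life_seconds <= 0:
        raise ValueError("durations must be positive")
    return 0.5 ** (batch_seconds / half_life_seconds)


def smooth_batch_update(current: Matrix, batch_estimate: Matrix, alpha: float) -> Matrix:
    """Blend the current decoder with a fresh batch estimate.

    ASSUMPTION: new = alpha * current + (1 - alpha) * batch_estimate, elementwise.
    alpha close to 1 means slow, smooth adaptation; alpha = 0 means full replacement.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(current) != len(batch_estimate):
        raise ValueError("parameter matrices must have the same shape")
    updated = []
    for cur_row, new_row in zip(current, batch_estimate):
        if len(cur_row) != len(new_row):
            raise ValueError("parameter matrices must have the same shape")
        updated.append([alpha * c + (1.0 - alpha) * n for c, n in zip(cur_row, new_row)])
    return updated


# Hypothetical example: 90 s batches with a 90 s half-life give alpha = 0.5.
decoder = [[1.0, 0.0], [0.0, 1.0]]     # current decoder weights
batch_fit = [[0.0, 1.0], [1.0, 0.0]]   # weights refit on the latest batch
alpha = alpha_from_half_life(batch_seconds=90.0, half_life_seconds=90.0)
decoder = smooth_batch_update(decoder, batch_fit, alpha)
print(decoder)  # [[0.5, 0.5], [0.5, 0.5]]
```

Smoothing of this kind keeps the decoder from changing abruptly between batches, which is one way to address the question of how quickly the decoder should change.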
While CLDA algorithms can readily improve performance in a relatively short time, the brain still faces a "moving-target" problem in having to learn a decoder that is itself adapting. Can we facilitate the coadaptation between the brain and the machine so that the motor memory-like properties emerging through transform learning can be preserved while adapting the transform? One possible avenue for future studies could be starting with an early CLDA phase, in which the transform is adapted until a certain level of performance is achieved, followed by a prolonged period with a static transform, allowing the brain to optimize its control.
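The two-phase proposal can be expressed as a simple schedule. The sketch below is only an illustration: the performance measure (fraction of successful trials per session), the threshold, and the function name `two_phase_schedule` are hypothetical, and the essay does not prescribe a specific switching criterion.

```python
from typing import List, Sequence


def two_phase_schedule(session_performance: Sequence[float], threshold: float) -> List[str]:
    """Label each session as 'adapt' (CLDA active) or 'fixed' (static transform).

    ASSUMPTION: the decoder is frozen permanently after the first session
    in which performance reaches the threshold.
    """
    modes = []
    frozen = False
    for performance in session_performance:
        modes.append("fixed" if frozen else "adapt")
        if not frozen and performance >= threshold:
            # Criterion met: every later session uses the same static transform.
            frozen = True
    return modes


# Hypothetical success rates over five sessions, with an 80% criterion.
modes = two_phase_schedule([0.30, 0.60, 0.85, 0.80, 0.90], threshold=0.80)
print(modes)  # ['adapt', 'adapt', 'adapt', 'fixed', 'fixed']
```

Note that the fourth session stays in the fixed phase even though performance touches the threshold from above; in this sketch the decoder is never unfrozen, leaving further improvement to brain plasticity.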

Conclusion
Achieving skillful control of a multi-DOF prosthetic will entail synergizing two different types of adaptation processes: natural (brain plasticity) and artificial (machine learning). In addition, providing realistic sensory feedback from the prosthetic device should allow the user to feel the environment and achieve more natural control. Transform learning facilitates the formation and retention of a prosthetic motor memory through a process of neuroplasticity. CLDA techniques expedite the learning process by adapting the transform during online performance. We believe that BMI systems capable of exploiting both neuroplasticity and CLDA will be able to boost learning, generalize well to novel movements and environments, and ultimately achieve a level of control and dexterity comparable to that of natural arm movements.

Figure 1. Closed-loop decoder adaptation (CLDA) accelerates learning and improves performance by updating a BMI decoder's parameters in closed-loop operation (i.e., while the subject is using the BMI). The gray arrows point to the main elements of a closed-loop BMI: sensing (neural activity), estimation (decoding algorithm or transform), control of the actuator, and feedback. The red arrows represent the CLDA component. BMI errors are analyzed online with respect to inferred or known task goals, and/or evaluative feedback. These errors are used to modify the decoder's parameters. Overall, CLDA improves BMI performance by making the decoder more accurately represent the true underlying mapping between the user's neural activity and their intended movements (adapted from [30] with permission). doi:10.1371/journal.pbio.1001561.g001