Topographic Somatosensory Imagery for Real-Time fMRI Brain-Computer Interfacing

Real-time functional magnetic resonance imaging (fMRI) is a promising non-invasive method for brain-computer interfaces (BCIs). BCIs translate brain activity into signals that allow communication with the outside world. Visual and motor imagery are often used as information-encoding strategies, but can be challenging if not grounded in recent experience in these modalities, e.g., in patients with locked-in-syndrome (LIS). In contrast, somatosensory imagery might constitute a more suitable information-encoding strategy as the somatosensory function is often very robust. Somatosensory imagery has been shown to activate the somatotopic cortex, but it has been unclear so far whether it can be reliably detected on a single-trial level and successfully classified according to specific somatosensory imagery content. Using ultra-high field 7-T fMRI, we show reliable and high-accuracy single-trial decoding of left-foot (LF) vs. right-hand (RH) somatosensory imagery. Correspondingly, higher decoding accuracies were associated with greater spatial separation of hand and foot decoding-weight patterns in the primary somatosensory cortex (S1). Exploiting these novel neuroscientific insights, we developed—and provide a proof of concept for—basic BCI communication by showing that binary (yes/no) answers encoded by somatosensory imagery can be decoded with high accuracy in simulated real-time (in 7 subjects) as well as in real-time (1 subject). This study demonstrates that body part-specific somatosensory imagery differentially activates somatosensory cortex in a topographically specific manner; evidence which was surprisingly still lacking in the literature. It also offers proof of concept for a novel somatosensory imagery-based fMRI-BCI control strategy, with particularly high potential for visually and motor-impaired patients. The strategy could also be transferred to lower MRI field strengths and to mobile functional near-infrared spectroscopy. Finally, given that communication BCIs provide the BCI user with a form of feedback based on their brain signals and can thus be considered as a specific form of neurofeedback, and that repeated use of a BCI has been shown to enhance underlying representations, we expect that the current BCI could also offer an interesting new approach for somatosensory rehabilitation training in the context of stroke and phantom limb pain.


INTRODUCTION
Brain-computer interfacing refers to the use of a signal originating from neural (cerebral) activity to control an external device (Wolpaw et al., 2002;Daly and Wolpaw, 2008;van Gerven et al., 2009). While the first brain-computer interfaces (BCIs) used electrophysiological signals recorded by electrodes penetrating the brain (Fetz, 1969) or by means of electroencephalography (EEG, Farwell and Donchin, 1988), more recently, functional magnetic resonance imaging (fMRI) and functional near-infrared spectroscopy (fNIRS), sensitive to hemodynamic changes induced by active neurons, have also been used in the context of BCI communication and control (Yoo et al., 2004;Naito et al., 2007;Lee et al., 2009;Sorger et al., 2009Sorger et al., , 2012Sorger et al., , 2018Monti et al., 2010;Bardin et al., 2011;Gallegos-Ayala et al., 2014;Chaudhary et al., 2017;Nagels-Coune et al., 2017). Compared to fMRI, EEG has a higher temporal resolution, is more mobile and cheaper. However, fMRI has the advantage of providing higher spatial resolution and higher sensitivity to subcortical brain regions. Also, hemodynamic BCIs generally allow for almost instant control, requiring less preparation and training of the subject Nagels-Coune et al., 2017). This means that hemodynamic BCIs can be especially useful for applications in a hospital setting that require high accuracy in a short amount of time and, in the case of fMRI, are used on an incidental basis. Such applications can be found in the realm of state-of-consciousness diagnostics (Naci et al., 2012), communication on important medical issues with ''locked-in''-patients  or auxiliary treatment in psychotherapy or rehabilitation.
With hemodynamic BCIs, it is possible for users to apply an explicit strategy or task to control their brain signal and thereby the BCI output. The requirements for the control strategy vary depending on the intended application. In fMRI-neurofeedback applications, users attempt to reach a certain target brain activation profile by practicing and improving a certain mental strategy based on feedback on their performance. In this case, it is important to establish that the mental strategy is in principle feasible and activates the brain regions that underlie the symptoms that the BCI is intended to improve. For diagnostic or communication applications, on the other hand, there is more flexibility with respect to the applied mental strategies. However, the control strategy should preferably be intuitive and allow almost instant control. It should not tax the possibly limited attentional and cognitive resources of patients (Naci et al., 2012). Ideally, the selected control strategy should be tailored to the patient, relying only on those mental capacities that are preserved (Kaufmann et al., 2013), to increase proficiency in BCI control. It is therefore important to develop and test a range of different mental strategies with corresponding experimental and data-analysis protocols to ensure that each patient can be offered the most optimal control strategy.
An especially easy-to-implement category of hemodynamic BCI control strategies is mental imagery, because it does not require any particular stimulation device and can be tailored to the preserved mental capacities of the intended user of the communication BCI. After instructing the participant before the scanning session, start and stop instructions during scanning can be given in any sensory stimulus modality (visual, auditory, tactile), as long as they are technically available and compatible with the user's faculties. Presently, motor and visual imagery are among the most frequently used mental-imagery strategies in the context of fMRI-BCIs. Both have been shown to generate robust brain activation patterns that can be used as a control signal in the context of fMRI-BCIs for communication (Boly et al., 2007;Naci et al., 2012;Sorger et al., 2012), consciousness diagnostics (Monti et al., 2010) and neurofeedback (Subramanian et al., 2011;Scharnowski et al., 2012). Other mental imagery strategies that have been tested in the context of BCI control are spatial navigation (EEG: Cabrera and Dremstrup, 2008;fMRI: Monti et al., 2010), inner speech and mental calculation (fMRI: Sorger et al., 2012;fNIRS: Naito et al., 2007), emotional imagery (fMRI: Johnston et al., 2011;Sulzer et al., 2013), mental drawing (fNIRS: Nagels-Coune et al., 2017), thinking ''yes '' or ''no'' (fNIRS: Gallegos-Ayala et al., 2014;Chaudhary et al., 2017) and auditory imagery (EEG: Curran et al., 2004;fMRI: Yoo et al., 2001).
However, many of the above-mentioned imagery strategies could be suboptimal in patients who suffer impairments in the respective sensory or cognitive modalities. The current understanding is that ''the nature of mental representations is formed by our perceptual apparatus and experience'' (Schmidt et al., 2014). Therefore, it is important to extend the range of feasible imagery strategies available for operating a BCI in a clinical setting with imagery grounded in modalities in which intended BCI users have vivid memories based on recent experience. In this respect, somatosensory imagery could be an ideal candidate strategy, especially suited for older patients as well as patients suffering from locked-in-syndrome (LIS). The somatosensory modality is relatively well-preserved across the life span compared to, e.g., vision and audition. Also, in ''locked-in'' patients vision is often impaired due to problems with eye-muscle control and drying of the eyeball related to lack of orbital movement, whereas the somatosensory modality is often one of the few sensory channels that are preserved in the complete ''locked-in'' state (Murguialday et al., 2011). However, to our knowledge, a somatosensory imagery strategy has not yet been tested for hemodynamic brain-computer interfacing.
Therefore, the goal of the current study was to: (1) investigate whether somatosensory imagery is a suitable strategy for operating a hemodynamic BCI, and provide a proof-of-concept that it is suitable for online BCI-based communication; and (2) explore to what extent its classification is driven by somatotopic information. For this proof-of-concept study, we exploited the high signal to noise ratio of ultra-high-field 7T fMRI.
Somatosensory imagery recruits the core imagery-generating network (McNorgan, 2012;Schmidt et al., 2014;de Borst and de Gelder, 2017), but also involves primary and secondary somatosensory cortex (Yoo et al., 2003;Newman et al., 2005;Olivetti Belardinelli et al., 2009;McNorgan, 2012;Schmidt et al., 2014;Wise et al., 2016;de Borst and de Gelder, 2017). In visual mental imagery, the topographic nature of representations has been shown to be preserved in the primary visual cortex, even allowing for decoding of imagery content (Klein et al., 2004;Slotnick et al., 2005;Thirion et al., 2006). The primary somatosensory cortex (S1) also contains topographic representations (hand: Nelson and Chen, 2008;Schweizer et al., 2008;Sanchez-Panchuelo et al., 2010;Kolasinski et al., 2016;Pfannmöller et al., 2016;leg/foot: Akselrod et al., 2017;body: Penfield and Rasmussen, 1950;Zeharia et al., 2015), but although imagery-induced activation has been shown in S1 and S2, it is unknown whether similar somatotopic activity can be induced using somatosensory mental imagery of the body surface. If somatosensory imagery would induce somatotopically specific activation patterns, this would probably simplify (online) classification, and open up the possibility to use it to activate and train particular subregions of the somatosensory map.

Participants
Ten neurologically healthy participants (six males, mean age = 29 years, SD = 5 years, one left-handed) were recruited among students and staff members of the Faculty of Psychology and Neuroscience at Maastricht University (Maastricht, The Netherlands). Participants were screened for MRI counterindications and signed an informed consent form prior to their participation. Students received monetary compensation or credit points after the experiment. The study was conducted in accordance with the Declaration of Helsinki and approved by the ethics committee of the Faculty of Psychology and Neuroscience.

Procedure and Experimental Design
Two to five days before the actual scanning took place, subjects participated in a training session in which they performed one or more imagery runs in a mock scanner. The purpose of this training session was to instruct them about the task and possible imagery strategies, and to encourage them to try several imagery strategies and select the one that worked best for them, evoking the most stable and vivid somatosensory experience, as assessed by introspection. Two imagery strategies were suggested as possible starting points. The first strategy was to imagine someone gently touching the inner surface of the hand or foot. The second strategy was to apply techniques from autogenic training, imagining a heavy and warm feeling in the hand or foot. The session lasted at least 30 min or until the participant indicated (s)he had found a subjectively effective imagery strategy.
During the scanning session, the participants' heads were stabilized with foam padding to minimize head movement. They were instructed not to move and to keep their eyes closed during the functional runs. In a pilot experiment, participants reported that somatosensory imagery was easier to perform with eyes closed. As the eyes were closed throughout the entire run this did not constitute a confounding factor. Participants received auditory instructions indicating when they had to start imagining touch to the right hand (''hand'') and the left foot (''foot''), and when they could stop (''rest''). Each participant completed five (participant 7) to six somatosensory-imagery classifier-training runs. Every classifier-training run contained a pseudorandom sequence of nine left-foot (LF) and nine right-hand (RH) somatosensory mental imagery trials of 18 s duration, each followed by a 16, 18 or 20 s rest period (Figure 1). Seven participants also completed one or more answer-encoding runs in which they answered a binary question using imagined touch to the right hand to encode ''yes'' and imagined touch to the left foot to encode a ''no'' answer. Each answer-encoding run included five tactile-imagery trials (encoding the same answer). The timing of the imagery trials and rest intervals was identical to the training runs. If possible, each question was followed by FIGURE 1 | Diagram illustrating the timing and exemplary pseudorandom temporal order of imagery trials and rest intervals for the classifier-training and answer runs, as well as the sequence and most common number of training (n = 6) and answer (n = 2) runs. two answer runs. In the first answer run, subjects were asked to encode the actual answer to the question using the appropriate imagery strategy. In the second answer run, they were asked to encode the opposite answer (in order to obtain approximately the same number of ''yes' ' and ''no'' answers-encoding data). The correct answers to the questions were unknown to the researcher responsible for data analysis until after completion of the offline analyses. In one subject, the answer runs were also analyzed online and the results were reported back to the participant right after the run.
After scanning, participants were asked to fill out a questionnaire in which they had to rate the experienced quality of their somatosensory imagery (i.e., whether it was a robust mental experience) and the extent to which it evoked a tactile sensation (i.e., to what extent did they experience it as inducing tactile input coming from the target limb) on a ten-point scale (one being the worst/least applicable, ten being the best/most applicable). They also rated the percentage of time that they were successful at maintaining the imagery and described the particular imagery strategy they used.
All subjects reported that they were able to evoke somatosensory imagery. The type of strategies used ranged from imagining a warm feeling (four subjects), imagining vibratory stimulation to the finger or toe (two subjects), to imagining touch (massage, brush strokes, somatosensory sensation; four subjects; Supplementary Table S1). On a scale of 1-10, the average score subjects gave for the quality of their hand imagery was 6.9 (SD = 0.65) and of their foot imagery 6.6 (SD = 0.49). The average score in response to the question to what extent they experienced a tactile sensation during imagery was 5.3 (SD = 2.3) for the hand and 4.9 (SD = 1.9) for the foot. The reported average percentage of time that somatosensory imagery worked during the hand imagery epochs was 75% (SD = 4.5%, N = 6) and 73% (SD = 8.2%, N = 6) for the foot imagery epochs. Paired t-tests did not reveal any significant differences between foot-and hand-related ratings at p < 0.05.

MRI Data Acquisition
Data were acquired at Maastricht Brain Imaging Centre (Maastricht University) on a 7T scanner (Siemens Healthcare, Germany) with a 32 receiver channel (Nova) head coil.
The scanning session included the acquisition of an anatomical scan (8 min) and five to six classifier training runs (12 min/run). In seven subjects, there was sufficient time to end the session with one to seven additional answer-encoding runs (1 answer run: 2 subjects, 2 answer runs: 4 subjects, 7 answer runs: 1 subject) in which the participant answered one or more binary questions using somatosensory imagery (4 min/run).
In the functional runs, a gradient echo sequence was used with 78 slices of 1.5 mm thickness, 1.5 mm in-plane resolution, 198 mm × 198 mm field-of-view, TR/TE = 2,000/26.1 ms, 60 • flip angle, multiband acceleration factor 2 and GRAPPA acceleration factor 3 (340 volumes for the training runs, 96 volumes for the answer runs). In each session, five volumes were acquired in opposite phase encoding direction to be used for (offline) correction of EPI distortions.

Preprocessing for Offline Multivariate Analyses
Offline data analysis was performed using Brainvoyager QX 2.8.4 (BrainInnovation, Maastricht, the Netherlands). The anatomical scan was down-sampled to 1 mm 3 . The first two volumes of each functional run were discarded. The remaining volumes were adjusted based on slice-scan time, realigned to the first functional image of the session using rigid body transformations, temporally filtered to remove linear trends and nonlinear periodic signals with a frequency below 0.01 Hz, corrected for EPI distortions, coregistered to the anatomical scan interpolating to 2 mm 3 isotropic, and normalized to Talairach space. This procedure includes the definition of subject-specific anatomical landmarks (anterior commissure, AC; posterior commissure, PC and the borders of the cerebrum) that are used to rotate each brain in the AC-PC plane followed by piecewise, linear transformations to fit each brain in the common Talairach ''proportional grid'' system (Talairach and Tournoux, 1988).
The multivariate analyses were limited to the voxels included in anatomical masks. Masks were obtained by creating anatomical regions of interest (ROIs) in Talairach space based on the Jülich probability maps for the primary (PSC3a, PSC3b, PSC2 and PSC2; Geyer et al., 1999Geyer et al., , 2000Grefkes et al., 2001) and secondary somatosensory cortex (SIIOP1, SIIOP2, SIIOP3 and SIIOP4; Eickhoff et al., 2006a,b) at different probability levels (see below).
Features (t-values per voxel) were extracted by linearly fitting each voxel within the mask with a design matrix consisting of a constant term (intercept) and a modeled hemodynamic response, within a temporal window of 30 s, spanning from one volume (2 s) before the auditory cue to 14 volumes (28 s) after the auditory cue. The modeled hemodynamic response was obtained by convolving a canonical hemodynamic response function (hrf) with a box-car predictor whose value was one in the volumes corresponding to the imagery blocks, and zero otherwise. The fit was done separately for each trial, and the t-estimates of the hemodynamic model were used as features per voxel. This procedure resulted in 45 (participant 7)/54 feature examples per class (nine per run).
Data were imported in Matlab (R2013b, Mathworks Inc., Natick, MA, USA) using the NeuroElf (v1.0, neuroelf.net, Jochen Weber) for feature extraction and Spider toolboxes for multivariate classification. A linear support vector machine (SVM; Vapnik, 1995) with a constraint parameter C = 1 was used in several ''leave-run-out cross-validation'' procedures which differed depending on the research question, and are described in more detail below. Before learning a model, training features were z-scored to ensure the magnitude of weights would be interpretable. To test whether classification accuracies were significantly above chance, 2,000 permutations (Golland et al., 2005) were performed in which condition labels were randomly reassigned to each block in each separate run. The p-value was calculated as the ratio between the number of permutations where the error was lower or equal to the observed one and the total number of permutations (adding one to both numerator and denominator to avoid zero values). Subsequently, proportions correct classifications were computed with edge correction [(number of correctly classified trials − 0.5)/number of trials] and logit transformed before statistically testing the hypotheses of interest using SPSS (IBM SPSS Statistics 24 for Windows, Armonk, NY: IBM Corp.).

Effect of the Amount of Training Data and Imagery Class
To test the hypothesis that the amount of training data would increase decoding accuracy independent of the imagery class, the number of runs used for SVM training was increased from one to five, each time testing on one of the remaining runs, using all possible combinations. Features were extracted from the largest, most inclusive anatomical masks of the primary and secondary somatosensory cortex. To test singlesubject significance, permutation tests were performed. The edge corrected and logit transformed classification accuracies were subsequently entered in a two-way repeated-measures ANOVA in with imagery class (2) and the number of training runs (5) as within-subject factors. To test whether accuracy increased with the number of training runs, within-subject planned polynomial contrasts were used.

Assessment in the Context of BCI Communication
To test the suitability of the current approach for (online) communication, a simulated real-time classification was performed using Turbo-BrainVoyager (version 3.2, BrainInnovation B.V., Maastricht, the Netherlands) to emulate the procedure that would be performed during the scanning session. In the last participant, a proof-of-concept online communication experiment was performed in which the answer runs were also decoded online during the scanning session using an SVM based on data from all available training runs.
The online and simulated real-time analyses were methodologically identical. Both analyses were based on slice data in native resolution (1.5 mm 3 ), which were 3D-motion corrected using the standard TBV incremental procedure spatially realigning each volume to the first recorded volume by rigid body transformation. The lowest probability (largest) S1 and S2 mask was back-transformed from Talairach space into native space. After spatially smoothing the functional data with a 4 mm full-width-at-half-maximum (FWHM) kernel, t-values were computed by fitting a general linear model with the same predictors as used in the offline procedure, within the same temporal window of 30 s, spanning from one volume (2 s) before the auditory cue to 14 volumes (28 s) after the auditory cue. T-values from the S1 and S2 mask were used as features for SVM training and testing using LIBSVM (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009) Chih-Chung Chang and Chih-Jen Lin).
In the simulated real-time analyses, the effect of the number of training runs was tested by using a classifier based on an increasing number of runs, starting from the run closest to the answer run to emulate the temporal adjacency that is characteristic for the online situation, each time adding a preceding run for each additional level for SVM training, and subsequently testing on the available answer run(s). The resulting edge-corrected and logit transformed accuracies were entered into a repeated-measures ANOVA with the number of training runs (1-6) as within factor was used to statistically assess the effect of the amount of training data.

Comparing Accuracy in Primary and Secondary Somatosensory Cortex Masks
The effect on decoding accuracy of the anatomical region chosen for feature extraction was evaluated by repeating the cross-validation procedure with five runs (four for subject 7) for training and one run for testing with features extracted from the anatomical masks of the primary (PSC3a, PSC3b, PSC2 and PSC2) and secondary (SIIOP1, SIIOP2, SIIOP3 and SIIOP4) somatosensory cortex at different probability levels (Geyer et al., 1999(Geyer et al., , 2000Grefkes et al., 2001;Eickhoff et al., 2006a,b). The probability level (a percentage) of a mask for a particular anatomical area indicates that all voxels in that mask were classified as part of that anatomical area in at least the corresponding percentage of subjects in the database used to construct the anatomical probability map. For example, an anatomical mask with a probability level of 30% contains voxels that were classified as belonging to that particular anatomical region in at least 30% of the subjects in the database used to construct these anatomical probability maps. The higher the probability level, the smaller the mask size (Supplementary Table S2). To test single-subject significance, permutation tests were performed.
Accuracy values were entered in a two-way repeatedmeasures ANOVA, with anatomical region (2, S1 or S2) and probability level (8, 10%-80%) as within-subject factors, to test whether there was a difference between S1 and S2 in terms of decoding accuracy, and whether this was possibly affected by the probability level, which was further assessed using withinsubject planned polynomial contrasts. For the ANOVA, only probability levels of 80% or lower were taken into account which corresponded to masks with more than 500 voxels, to ensure stable classification results.

Assessment of Somatotopy in Discriminative SVM Weights
The extent to which the classification was based on somatotopically different weight patterns in the S1 was assessed using the average discriminative weights across the different training splits. Discriminative weights were extracted from single-subject maps (obtained when testing the overall singlesubject classification accuracies) that represented the average weights across the different training splits, using SVM classifiers trained on features from the most inclusive anatomical mask, with all possible combinations of 4/5 runs for training. Because the representations of hand and foot are most clearly separated on the x-axis (left-to-right), we computed the subject-specific median x-coordinates for the 200 largest positive (foot imagery) and 200 largest negative weights (hand imagery) and compared them using a Wilcoxon signed-rank test. Finally, the Spearman rank correlation was computed between the subject-specific differences in median x-coordinates for foot and hand and the leave-one-run-out classification accuracy.

RESULTS
Ten healthy participants were instructed by auditory cues to perform alternating somatosensory imagery of the hand or foot (Figure 1) while ultra-high-field 7T fMRI data were collected. Each participant completed five (participant 7) to six somatosensory-imagery classifier-training runs. Seven participants also completed one or more answer-encoding runs in which they answered a binary question using imagined somatosensory stimulation to the right hand to encode ''yes'' and imagined somatosensory stimulation to the left foot to encode a ''no'' answer.

Suitability of Somatosensory Imagery for BCI
After establishing that all participants subjectively experienced somatosensory imagery (see ''"Materials and Methods'' section), we went on to test the suitability of this strategy to induce activation that can be reliably decoded for the purpose for BCI communication. We hypothesized that using a machinelearning algorithm, the contents of somatosensory hand or foot imagery can be accurately decoded on the single-trial level with similar accuracy for each imagery class, with accuracy increasing when increasing numbers of runs are used to train the classifier (hypothesis 1a). Our second hypothesis (1b) was that somatosensory imagery would yield accurate decoding of answer runs in simulated real-time as well as during a proof-of-concept online brain-based communication session.
Testing the first hypothesis (1a), we found that significant above-chance classification accuracy (p < 0.05, based on permutation testing) was achieved in eight out of 10 subjects with one training run, and for all subjects for two or more trainings runs (Figure 2). The average proportion correct classifications increased with the number of runs included for training (Figure 3), up to 0.82 (SD = 0.12) when using five runs (four in one subject) for training and one run for testing. The effect of imagery class and the number of training runs were tested in an imagery class (2, foot or hand) by training runs (5) repeated measures ANOVA on the edge corrected and logit transformed accuracy data. This revealed a significant effect of the number of training runs on the decoding accuracy (F (1.103,8.112) = 8.823, p < 0.001, Greenhouse-Geisser (GG) corrected for non-sphericity with ε = 0.276). Planned within-subjects contrasts showed significant effects for the linear (F (1,8) = 40.869, p < 0.001) and quadratic contrasts (F (1,8) = 7.808, p < 0.023). There was no effect of the imagery class (F (1,8) = 1.728, p = 0.225), nor an interaction between imagery class and number FIGURE 2 | Individual offline classification accuracies. Individual single-trial classification accuracies were obtained using a leave-one-run out cross-validation scheme based on spatial features extracted from the mask combining S1 and S2 at the most lenient probability threshold. The black dots indicate the average individual classification accuracies. The gray dots and shaded area respectively represent the average of the empirical null distribution and 95% confidence interval obtained by permutation testing (2,000 permutations).
FIGURE 3 | Effect of the amount of training data on the offline-obtained single-trial classification accuracy using a leave-one-run out cross-validation scheme based on spatial features extracted from the mask combining primary somatosensory cortex (S1) and S2 at the most lenient probability threshold. The dots indicate the average classification accuracies across the nine participants who completed a total of six training runs and the shaded area indicates the standard deviation across participants.
To test the second hypothesis (1b) and provide a proof-ofconcept for brain-based communication, we used the answer runs that were available from seven subjects (17 runs in total, 7 ''no,'' 10 ''yes''; Table 1). A simulated real-time analysis revealed no significant effect of the number of runs used for training (F (5,30) = 1.470, p = 0.229; Figure 4). Using all training runs, the average proportion correct for these seven subjects was 0.86 (SD = 0.17), mounting to 0.92 (SD = 0.08) when disregarding one subject who, after the session, indicated he misunderstood the instructions for the answer runs. In the proof-of-concept, online communication session with the last subject all answers to the questions were accurately decoded. Note that methodologically the simulated real-time and real-time analyses were identical, except for the fact that the simulated real-time analysis was performed after the scanning session.

Assessment of Somatotopy
Second, we wanted to assess whether classification success would be based on somatotopic information. We hypothesized (2a) that classification accuracy would be higher in the S1, where somatotopy is most pronounced, compared to secondary somatosensory cortex and (2b) that the discriminative weights obtained from the SVM would show a somatotopic distribution.
Next, we tested somatotopy of the discriminative SVM weights (hypothesis 2b). Visual inspection of discriminative SVM weight maps revealed a pattern that roughly corresponded to the expected somatotopy in 4 out of 10 subjects (2, 8, 9 and 10; Figure 6). A weaker somatotopic pattern was observed in one subject (4). A reversed pattern was noticed in two subjects (3 and 6), and a mixed pattern was seen in three final subjects (1, 5 and 7). In a more quantitative evaluation of the extent to which there was a somatotopic distribution in the discriminative SVM weights, we tested whether there was a distinction in the median coordinates of the voxels associated with the 200 largest positive (foot) weights and the largest negative (hand) weights. We focused on the x-coordinates only because somatotopy is expected No answer-encoding data were obtained for subjects 4, 7 and 9 due to running short of MR scanning time. The values in the bottom row correspond to the average percentage correct obtained by first averaging the answer runs of individual subjects, and subsequently averaging across subjects.
to be most pronounced on the medial-lateral axis of the brain. A Wilcoxon signed-rank test revealed that there was a significant within-subject difference (z = 2.09, p = 0.0365) between the median x-coordinates of the voxels associated with the 200 largest positive (foot) weights and the median x-coordinate of the voxels associated with the 200 largest negative (hand) weights. The median x-coordinate across subjects was 8.50 (SD = 14.5) for the 200 largest positive weights and −21.5 (SD = 18.4) for the 200 largest negative weights (Figure 7). Also, we found a significant correlation between the difference in median × coordinates and the overall classification accuracy, i.e., subjects with a larger value for the foot-hand median x-coordinate difference (larger ''hand-foot distance'') also showed a higher classification accuracy (Spearman's rho 0.948, p < 0.001).

DISCUSSION
The current study: (1) investigated whether somatosensory imagery is a suitable encoding strategy for a hemodynamic (fMRI-based) BCI, aiming to provide a proof-of-concept of its suitability for online BCI-based communication; and (2) tested to what extent its decoding is based on somatotopic information.
In accordance with our first set of hypotheses, we found that the contents of somatosensory imagery could be decoded above chance in each of our participants with similar accuracy for hand or foot, and that decoding accuracy increased with an increasing number of training runs. We also found that answers to binary questions encoded by seven healthy participants using the somatosensory imagery BCI could be successfully decoded in simulated real-time as well as during a proof-of-concept online communication session tested in one subject. In accordance with our second set of hypotheses, we found that decoding accuracy was significantly higher in the S1 compared to the secondary somatosensory cortex and that discriminative weight patterns generated by healthy participants through somatosensory mental imagery showed a somatotopic distribution.

Suitability for Use in fMRI Brain-Computer-Interfacing
The average classification success was 82% for the offline leave-one-run out cross-validation procedure on the training runs, and 92% for the classification of the answer runs in simulated real-time (86% when including S03 who indicated afterward that he misunderstood the instructions). The fact that the classification success was higher for simulated real-time classification of the answer runs could be explained by the additional training run that was used for training the classifier in simulated real-time, whereas in the offline leave-one-run out procedure one run necessarily had to be left out for testing. However, the evaluation of the effect of additional training runs in the offline condition suggested that the beneficial effect of adding training runs leveled off after four runs. Another possible explanation might lie in the slight differences in the analysis pipelines for the simulated real-time and offline analyses regarding slice-scan-time-correction and data transformation FIGURE 4 | Effect of the number of training runs on average classification success in the simulated real-time procedure. The dots indicate the average classification accuracy across all participants who completed one or more answer runs and the shaded area the standard deviation.
FIGURE 5 | Offline single-trial classification accuracies for S1 vs. S2. Accuracies were obtained using a leave-one-run out cross-validation scheme based on spatial features extracted from separate masks of S1 (red line) and S2 (purple line) at different probability levels ranging from 10% to 80%. Higher probability levels were excluded because the masks included less than 500 voxels for the S2 mask. The shaded areas indicate the standard deviation for each case.
(Talairach transformation and downsampling to 2 mm 3 offline vs. smoothing with 4 mm FWHM Gaussian kernel in native space in simulated-real-time). On the other hand, the increased accuracy for the answer run classification might also be related to enhanced imagery quality during the answer run, due to extended practice, focus on a single imagery strategy (in contrast to the alternating imagery strategies in the training runs) and possibly increased motivation because of the social interaction component of answering a genuine question. Taken together, these findings inspire confidence that somatosensory imagery will be a viable additional control strategy for BCI communication procedures. Somatosensory imagery might be especially suited for locked-in patients with reduced visual or auditory faculties who lack muscle control. These patients might not be able to evoke stable imagery involving their most affected modalities, as they lack recent memories of sensory experiences which can increase the level of detail and vividness of mental representations generated by imagination (Hassabis and Maguire, 2009). The current study was performed at 7T, but given the relatively large distance between hand and foot representations in S1 similar results could be expected at a field strength of 3T which is more commonly available in clinical settings. Our novel somatosensory imagery control strategy might also be applied in binary fNIRS-FIGURE 6 | Z-scored average individual weight maps obtained from the combined S1 and S2 mask at the most lenient probability threshold, projected on individual cortical surface reconstructions. Individual subject data were ordered by average classification accuracy in the leave-one-run out cross-validation procedure (accuracy increases from left to right and top to bottom). The map of participant 7 is shown separately, because for this, only four training runs could be used for testing instead of the five available for the other participants. Warm colors indicate positive weights (one could say voting for "foot"), cold colors for negative weights (a vote for "hand"). The white dotted lines indicate the central sulci. Hypothetical somatotopic support vector machine (SVM) weight pattern with similar color coding is shown at the bottom right of the figure, based on the approximate location reported for the hand and foot representations in the S1. Note that the foot representation extends onto the medial surface.
FIGURE 7 | Weight distribution on the medial-lateral brain axis for hand and foot imagery. Median x-coordinate of the 200 largest positive and negative weights for individual subjects, ordered from left to right according to increasing classification accuracy. The shaded region indicates the expected location for foot (orange) and hand (blue) on the somatotopic map based on values reported in the literature (Nelson and Chen, 2008;Akselrod et al., 2017). The dotted line indicates the brain midline.
based communication BCIs. Of course, taking into account the considerably lower spatial resolution of fNIRS, optode placement and data analysis procedures would have to be optimized.

Topographic Nature of Somatosensory Imagery
Our results indicate that decoding accuracy was higher within the primary than in the secondary somatosensory cortex mask and that it is possible to induce somatotopic discriminative weight patterns by mental imagery. The median x-coordinates of the subjects with highest offline classification accuracies roughly correspond to values reported in the literature for finger (Nelson and Chen, 2008;Schweizer et al., 2008;Sanchez-Panchuelo et al., 2010;Kolasinski et al., 2016;Pfannmöller et al., 2016) and foot representations (Akselrod et al., 2017 ; Figure 7). The slightly more medial imagery-related weight coordinates could be due to the fact that, in the current study, imagery involved the whole hand whereas stimulation is usually delivered to the fingers only. For the foot, imagery coordinates were more inferior than reported stimulation coordinates. Similar observations of corresponding sites for imagery and stimulation in primary sensory cortices have also been made in previous studies in the somatosensory domain (Yoo et al., 2003;Newman et al., 2005;Schmidt et al., 2014;Wise et al., 2016;de Borst and de Gelder, 2017), however, to our knowledge, the current study is the first to report a somatotopic pattern of discriminative weights in the (primary) somatosensory cortex during imagery. Previous studies have reported topographic recruitment of visual cortex during imagery (Klein et al., 2004;Slotnick et al., 2005;Thirion et al., 2006).
The presence of a somatotopic pattern in the discriminative weights, that is a more medial and right-sided location for the highest positive weights corresponding to the imagery class ''foot'' and a more lateral and left-sided location for the highest negative weights associated with ''hand,'' was associated with higher classification success. This suggests that subjects might increase classification success by increasing the level of somatotopy in their imagery-induced brain activation patterns. If this finding can be extrapolated to subjects with maladaptive somatosensory representations, we speculate that the current BCI might be suitable to use for training the somatotopy of these representations. Somatosensory dysfunction and aberrant plasticity have been linked to pain severity in chronic pain (Flor et al., 1997;Pleger et al., 2005;Wrigley et al., 2009;Di Pietro et al., 2013Kim et al., 2017; but see Makin et al., 2013 for an alternative view). Neurofeedback based on somatosensory mental imagery could be especially suitable for clinical interventions in patients for whom actual somatosensory stimulation is difficult or impossible due to their physical limitations or technical limitations of the scanner environment.
Somatotopic patterns could in theory also be evoked using motor imagery or even actual movement. However, our subjects were carefully instructed and trained in a mock scanner to exclusively use somatosensory imagery, and it was emphasized that they should not move during imagery runs. Subjects' compliance to these instructions is supported by the fact that we did not observe hand or foot movement during training or scanning. Also, during the training runs, subjects did not receive any feedback or reward based on their performance, so there was no incentive to try out non-somatosensory imagery strategies. In line with this, subjects' reports indeed indicated that they used purely somatosensory strategies (Supplementary Table S1).
Based on the current data, we cannot establish a link between classification accuracy and the use of a specific somatosensory imagery strategy: the two subjects with the largest classification success used touch and warmth imagery, as did the two subjects with the lowest classification success. It would be interesting to investigate whether imagery targeting different somatosensory submodalities induces different patterns of activation in the S1. However, such pattern differences might be most apparent within a limb-specific subregion, whereas the general somatotopy of hand and foot is most likely preserved across somatosensory submodalities (Sur et al., 1984).
We instructed our subjects to stick with the same strategy for hand and foot. Whereas some subjects only reported using imagery based on one somatosensory submodality (e.g., vibrotactile stimulation), others used imagery of more naturalistic somatosensory stimulation involving a combination of somatosensory submodalities. Such imagery, e.g., combining warmth and touch might increase vividness and evoke stronger activation. Work by Kosslyn and Thompson in the visual domain suggests that the level of vividness and detail is related to primary visual cortex activation during imagery (Kosslyn and Thompson, 2003). We, therefore, propose to leave the choice of strategy to the participant and have them choose the strategy that subjectively evokes most vivid and detailed imagery and that they feel most comfortable with.
There are some limitations to this study. Somatosensory imagery involved only two contralateral body parts, which are relatively far apart in physical and somatotopic space, and whose representations reside on contralateral sides of the body. Thus, classification could have been based solely on less specific imagery related to the body side. However, the fact that classification success was highest within the S1 mask and linked to the degree of somatotopy in the discriminative weight maps indicates that subjects used a limb-specific somatosensory imagery strategy rather than a body-side related strategy. This is also in accordance with the introspective description of imagery strategies reported by the subjects. However, note that the question whether the brain signal is body-part or body-side specific is irrelevant in the context of the suggested two-choice communication BCI application, as long as it can be used at will.
Second, we used a limited number of subjects and online answer runs to provide a proof of concept for the possibility of using topographic somatosensory for basic fMRI BCI-based control/communication. That is, our aim was to show that this strategy is in principle feasible in individual subjects. The current study does not claim that this strategy is feasible in all subjects of a particular (patient) population. This would require further research. However, based on the high accuracies during training and during answer runs, we deem the prospects for such a future study will be positive.
A third limitation of this study is that only relatively young healthy subjects without neurological impairments participated in the current study. The average patient will probably be older, and might have altered somatosensory representations due to motor impairment, which could lead to different imagery representations. Also, patients might have shorter attentional spans leading to more variable imagery quality. However, the current approach does not rely on a particular imagery representation, it only relies on the presence of differential activation profiles. Moreover, we found higher classification rates during answer runs than during training runs, possibly due to increased motivation. This effect might positively extrapolate to patients in a BCI communication setting.
Finally, it is important to acknowledge that fMRI-based brain-computer interfacing is unlikely to ever become a viable technique for bedside use. However, due to its high reliability and the fact that little pre-training is needed, it can be a very useful technique to communicate with patients about important questions, e.g., related to the continuation of treatment. This is especially relevant in hospital settings where fMRI is usually readily available. Also, it could be a crucial adjuvant method to identify and train patients on promising strategies for subsequent use with fNIRS, a less expensive and more mobile imaging technique that is also based on sampling of hemodynamic signals, albeit at a lower spatial resolution. By first using real-time fMRI-based brain-computer interfacing, clinicians can be faster at weeding out the unsuccessful strategies to subsequently implement those that were shown to be successful in fNIRS brain-computer interfacing. Finally, real-time fMRI communication/control BCIs could also be useful to assess the level of consciousness of patients (Monti et al., 2010;Naci et al., 2012).

CONCLUSION
Taken together, we conclude that somatosensory imagery is a successful information-encoding strategy. We provide proof of concept for the usability of somatosensory imagery in the context of hemodynamic BCI-based communication. The current results are a first step towards future application of somatosensory imagery for clinical applications, e.g., for brainbased communication in patients with LIS, and neurofeedbackbased training to support the rehabilitation of somatosensory representations in stroke and chronic pain patients suffering from somatosensory deficits.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethics Review Committee Psychology and Neuroscience (ERCPN), Maastricht University, Maastricht, the Netherlands. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
AK, RG, and BS contributed to the conception and design of the study. AK performed the preprocessing. AK and GV performed the multivariate analyses. AK wrote the first draft of the manuscript. GV and BS wrote sections of the manuscript. All authors contributed to manuscript revision, and read and approved the submitted version.

FUNDING
This manuscript has been released as a Pre-Print at BioRxiv (Kaas et al., 2019). The work was supported by the BrainGain NWO (Dutch Research Council) SmartMix grant (AK, BS and GV) and European Research Council (ERC) Grant 269853 (RG). This work was also supported by the European Commission (7th Framework Programme 2007-2013, DECODER project, BS and RG).