Features Selection and Pattern Classification of Electroencephalography Motor Imagery Tasks of Right Hand

This study presents a Brain Computer Interface (BCI) approach to detect the motor intents of disabled people with right hand amputation. Electroencephalography (EEG) Motor Imagery (MI)-based BCI systems have recently been used to improve the quality of life of disabled people. However, to naturally trigger particular applications (e.g., upper limb prostheses), independent BCIs call for further paradigms that involve realistic motor imagery tasks. This study proposes an approach to classifying imagined hand gesture tasks, including the water glass gesture and the index pointer gesture of the right hand, using OPENBCI as a consumer-grade EEG acquisition device. For three subjects, the data recorded by OPENBCI were sampled at a rate of 250 Hz. The Minimum Redundancy Maximum Relevance (MRMR) technique was implemented as a feature selection method along with the Support Vector Machine (SVM) algorithm for classification. With a maximum classification accuracy of 91.7%, the results showed the feasibility of such BCI systems to detect different motor imagery tasks of the right hand. Consequently, upper limb prostheses could be manipulated using the intended motor imagery tasks.


INTRODUCTION
In recent decades, diverse neuroscience-related studies have been conducted to attain a robust perception of the human brain activities that reflect the motor imagery tasks of the amputated limb. Among these studies, EEG-based BCI systems are the most widely investigated due to their noninvasiveness and portability (Liu et al., 2012). Recent investigations have shown promising results, empowering upper limb amputees to interact with the external world (Edelman et al., 2014). Non-invasive BCIs are based on decoding the changes in oscillatory activities caused by motor imagery tasks, exploiting two phenomena: Event-Related Desynchronization (ERD), a suppression of α and β rhythm power, and Event-Related Synchronization (ERS), an increase in α and β rhythm power (Pfurtscheller and Lopes da Silva, 1999; Soman et al., 2013).
It is recognized that EEG suffers from the volume conduction effect, which confuses and distorts neuro-electric signals on their way from the firing source in the brain to the scalp (He and Ding, 2013). Consequently, the ERD and ERS generated in response to MI tasks lack clarity due to the intervention of adjacent somatosensory regions. Because of this low spatial resolution, investigations have mostly been able to discriminate only between spatially distant MI tasks (i.e., right hand and left hand MI tasks) (Edelman et al., 2014; Dharmasena et al., 2013; Elasuty and Eldawlatly, 2015; Wang et al., 2014). Thus, the aforementioned MI tasks cannot be considered realistic or relevant to the intended motor tasks for many rehabilitative and prosthetic BCI systems. Nevertheless, recent studies have successfully demonstrated the possibility of manipulating prosthetic hands through the classification of the intended MI tasks (Edelman et al., 2014; Edelman et al., 2016; Liao et al., 2014; Bhagat et al., 2014).
In order to decode the intended MI tasks, it is crucial to achieve pattern recognition concerning the selection of discriminative features (Shiratori et al., 2015); using a classifier to process a large feature vector would be computationally expensive and produce only moderate returns. For this reason, feature selection algorithms are applied to reduce the feature vector size as well as to raise the classification performance. Among these algorithms, Principal Component Analysis (PCA) is a statistical approach that linearly projects high-dimensional, typically correlated input data onto a lower-dimensional subspace. Although PCA has been widely applied in BCI-based studies, it is best suited to correlated data that have a large variance.
In contrast, the Minimum Redundancy Maximum Relevance (MRMR) technique is a feature selection method in which features are ranked according to two criteria:
• Calculating the correlation between every feature and the corresponding class
• Calculating the mutual information between every feature and every other feature
Based on these two criteria, a feature can be considered discriminative if it has maximum relevance with regard to the target class and low redundancy with regard to the other features. MRMR has produced powerful results in bioinformatics applications, e.g., predicting the characteristics of genes and phenotypes (Ding and Peng, 2005). MRMR selects the features most relevant to the target classes using the mutual information between the features and those classes (Peng et al., 2005).
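The ranking described above can be sketched as a greedy search. The following is a minimal illustration, not the implementation used in this study: it assumes that both relevance and redundancy are measured by mutual information on quantile-binned features, and that each new feature is chosen to maximize relevance minus mean redundancy with the features already selected. The function names (`mutual_information`, `discretize`, `mrmr_select`) and the bin count are hypothetical choices made for this example.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two non-negative integer label arrays."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1)          # joint histogram of co-occurrences
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                        # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def discretize(X, n_bins=5):
    """Replace each feature column by equal-frequency bin labels 0..n_bins-1."""
    out = np.empty(X.shape, dtype=int)
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
        out[:, j] = np.searchsorted(edges, X[:, j])
    return out

def mrmr_select(X, y, k, n_bins=5):
    """Greedily pick k feature indices by relevance minus mean redundancy."""
    Xd = discretize(X, n_bins)
    n_features = Xd.shape[1]
    k = min(k, n_features)
    relevance = np.array([mutual_information(Xd[:, j], y) for j in range(n_features)])
    selected = [int(np.argmax(relevance))]   # start with the most relevant feature
    redundancy_sum = np.zeros(n_features)
    while len(selected) < k:
        last = selected[-1]
        redundancy_sum += np.array(
            [mutual_information(Xd[:, j], Xd[:, last]) for j in range(n_features)]
        )
        score = relevance - redundancy_sum / len(selected)
        score[selected] = -np.inf            # never pick a feature twice
        selected.append(int(np.argmax(score)))
    return selected
```

In this sketch, an exact copy of an already selected feature receives a large redundancy penalty, so a less informative but non-redundant feature is ranked ahead of it.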
This study aims to improve the ability of realistic control signals to trigger BCIs by discriminating between two MI tasks of the right hand, the water glass gesture and the index pointer gesture, utilizing OPENBCI as a consumer-grade EEG acquisition device. For each recorded pattern, features are extracted by using the Band Power (BP) and the Power Spectral Density (PSD) estimates. The Minimum Redundancy Maximum Relevance (MRMR) method is used to select the most discriminative features. Finally, SVM is applied as a classifier. The aforementioned approach is outlined in Fig. 1 and described in the following sections.

EXPERIMENTAL SETUP
In the present study, three subjects (all male, mean age 30 ± 8 years) with no previous BCI experience participated voluntarily. After obtaining the respective ethical approval from each subject, the experiments were conducted in a comfortable and environmentally shielded room. The subjects were seated in front of a computer screen and instructed to continuously produce right hand motor imagery for either the water glass gesture or the index pointer gesture. Each subject was asked to complete five sessions. Each session was composed of 40 trials (20 trials for the water glass gesture MI task and 20 trials for the index pointer gesture MI task of the right hand). Trials were randomized within a run for each experimental session. As Fig. 2 shows, trials were designed as follows: a "cross fixation" icon appeared on the screen for 3 sec, accompanied by a beep sound which lasted for 0.2 sec, followed by a "target" cue for 1.2 sec indicating which MI task to perform. This was followed by an "imagination" cue for 3 sec, during which the subject performed the specified MI task. Finally, a "rest" icon was shown on the monitor for 3 sec.
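The trial timing above determines which samples belong to the imagery period. The sketch below shows one way to compute that window at the 250 Hz acquisition rate. It assumes that the 0.2 sec beep overlaps the start of the fixation phase rather than lengthening the trial, and that trial onsets are available as sample indices; `PHASES`, `phase_window` and `extract_imagery` are hypothetical names used only for illustration.

```python
import numpy as np

FS = 250  # acquisition sampling rate in Hz, as stated in the text

# Phase durations in seconds, in presentation order. Assumption: the 0.2 s
# beep plays during the fixation phase and adds no time of its own.
PHASES = [("fixation", 3.0), ("target", 1.2), ("imagination", 3.0), ("rest", 3.0)]

def phase_window(name, fs=FS):
    """Return (start, stop) sample offsets of a phase relative to trial onset."""
    t = 0.0
    for phase, duration in PHASES:
        if phase == name:
            return round(t * fs), round((t + duration) * fs)
        t += duration
    raise KeyError(name)

def extract_imagery(recording, trial_onsets, fs=FS):
    """Cut the imagery period of each trial out of a (channels, samples) array."""
    start, stop = phase_window("imagination", fs)
    return [recording[:, onset + start:onset + stop] for onset in trial_onsets]
```

Under these assumptions, each trial lasts 10.2 sec and each imagery epoch spans 750 samples per channel before down-sampling.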
EEG signals were acquired using an OPENBCI board kit at a sampling frequency of 250 Hz. OPENBCI is capable of recording up to 8 channels with low-noise input channels (CMRR = -110 dB), programmable gain (1 to 24) and fast processing speed (32-bit processor). Interfacing with OPENBCI, the OPENVIBE platform was used for developing the scenarios and protocols as well as for recording the EEG signals (Renard et al., 2010). As Fig. 3 shows, channel locations F3, C3, P3, O1, F4, C4, P4 and O2 were involved in recording the dataset. These locations were selected to identify variations in the EEG signal produced by the sensorimotor cortex during MI tasks. Additionally, two electrodes were located at A2 and FP2 for signal referencing and Electrooculography (EOG) activity removal respectively.

METHODS
Data preprocessing: Once the data had been acquired, they were preprocessed in order to de-noise the signals and to reveal their underlying features. With the help of the EEGLAB toolbox (Delorme and Makeig, 2004), EEG recordings were down-sampled to 100 Hz and band-pass filtered between 7 and 30 Hz using a second-order Butterworth filter. Additionally, a notch filter was applied to eliminate the noise caused by the 50/60 Hz power source.
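A comparable preprocessing chain can be sketched in Python with SciPy. This is an illustration under stated assumptions rather than the EEGLAB pipeline used in the study: the notch is applied at the original 250 Hz rate (a 50 Hz notch cannot be designed at a 100 Hz sampling rate, where 50 Hz is the Nyquist frequency), the notch quality factor is an arbitrary choice, and zero-phase filtering is used. The function name `preprocess` is hypothetical.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, resample_poly

def preprocess(eeg, fs=250, mains_hz=50.0):
    """eeg: (n_channels, n_samples) sampled at fs Hz. Returns (filtered, new_fs)."""
    # Notch out mains interference at the original rate (Q=30 is an assumed value).
    b, a = iirnotch(mains_hz, Q=30.0, fs=fs)
    eeg = filtfilt(b, a, eeg, axis=-1)

    # 250 Hz -> 100 Hz is the rational factor 2/5; resample_poly applies
    # its own anti-aliasing low-pass filter during the rate change.
    eeg = resample_poly(eeg, up=2, down=5, axis=-1)
    new_fs = fs * 2 // 5

    # Butterworth band-pass, 7-30 Hz. With N=2, SciPy designs a band-pass of
    # order 4 overall, and filtfilt applies it forward and backward.
    b, a = butter(2, [7.0, 30.0], btype="bandpass", fs=new_fs)
    return filtfilt(b, a, eeg, axis=-1), new_fs
```

Whether the "second-order" filter in the text refers to the prototype order or the overall band-pass order is not stated, so the order used here should be treated as a guess.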

Spatial filtering using ICA:
ICA is a statistical method that decomposes a set of mixed signals into its sources with no prior information on the nature of the signal. In ICA, it is assumed that the unknown implicit sources are mutually and spatially independent in statistical terms. ICA assumes that the observed EEG signal is a set of several independent source signals coming from multichannel cognitive activities. Prior to the feature extraction stage, Independent Component Analysis (ICA) was employed to specify the most active channels among the eight. Figure 4 demonstrates the contribution of each channel to scalp activity by mapping the Independent Components (ICs) for each subject. Cortical mapping over all trials showed that channels F3, C3 and P3 were the most active and representative of the locations, as shown by Fig. 5.
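For reference, the standard linear ICA model behind this decomposition can be written as follows. The symbols X, A, S and W are introduced here for illustration and do not appear in the original text; X is taken to be the 8-channel recording arranged as channels by samples.

```latex
X = A\,S, \qquad \hat{S} = W\,X, \qquad W \approx A^{-1}
```

Here S holds the statistically independent source activations, A is the unknown mixing matrix and W is the estimated unmixing matrix. Each column of W^{-1} gives the projection of one component onto the electrodes, which is the quantity usually plotted in topographic component maps.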
Feature extraction: Due to the complex and stochastic nature of EEG signals, they cannot be used directly to control external devices. Nevertheless, ERD/ERS-related underlying information can be extracted from the preprocessed dataset and, where the different motor imagery tasks are well represented, they can be clearly differentiated. The Event-Related Desynchronization/Synchronization (ERD/ERS) phenomena arise in the sensorimotor rhythms during motor and motor imagery tasks. ERD/ERS can be quantified by calculating the varying power of the α and β bands. The band power and the power spectral density estimates (Kamousi et al., 2007) were calculated for four frequency ranges (7-13, 13-20, 21-27 and 28-35 Hz) for each of the selected channels, with a window size ranging between 250 and 1800 msec and 8 window increments with a moving step of 250 msec at a time. As a result, each pattern was embodied by a feature vector of 399 feature elements. With regard to the tempo-frequential domain, the analysis showed that both accuracy and computational time are substantially affected by the size of the window.
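The sliding-window band power computation can be illustrated as below. This is a simplified sketch: it uses a plain periodogram, a single window length per call and the 100 Hz rate after down-sampling, and it does not attempt to reproduce the exact 399-element feature layout, which the text does not specify. The names `BANDS`, `band_power` and `sliding_band_features` are hypothetical.

```python
import numpy as np

BANDS = [(7, 13), (13, 20), (21, 27), (28, 35)]  # Hz, as listed in the text

def band_power(segment, fs, band):
    """Mean periodogram power of a 1-D segment between band[0] and band[1] Hz."""
    freqs = np.fft.rfftfreq(segment.size, d=1.0 / fs)
    # Raw periodogram of the mean-removed segment (no one-sided doubling).
    psd = np.abs(np.fft.rfft(segment - segment.mean())) ** 2 / (fs * segment.size)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return float(psd[mask].mean())

def sliding_band_features(epoch, fs=100, win_ms=1000, step_ms=250):
    """epoch: (n_channels, n_samples). Returns a flat band-power feature vector."""
    win = int(win_ms * fs / 1000)
    step = int(step_ms * fs / 1000)
    feats = []
    for ch in epoch:
        # Slide the analysis window along the channel in fixed steps.
        for start in range(0, ch.size - win + 1, step):
            seg = ch[start:start + win]
            feats.extend(band_power(seg, fs, b) for b in BANDS)
    return np.array(feats)
```

For a 3 sec imagery epoch from three channels, a 1000 msec window with a 250 msec step yields 9 windows per channel and therefore 3 × 9 × 4 = 108 features in this sketch. Very short windows give a coarse frequency resolution, which is one reason why window size can affect accuracy.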

Feature selection method:
Minimum Redundancy Maximum Relevance (MRMR): Regarding the so-called "Curse of Dimensionality" issue, feature selection is still considered to be one of the basic problems in pattern recognition. Given the enormous amount of information underneath the human scalp, which characterizes all of the human states, together with the background noise, which cannot be totally removed, the extracted features need to be ranked for optimal classification. Having chosen MRMR for feature selection, the criteria of MRMR are discussed in the following paragraphs.
Following Peng et al. (2005), the maximal relevance (Max-Relevance) condition searches for a feature subset $S$ whose features $x_i$ have the largest mean mutual information $I(x_i; c)$ with the target class $c$:

$$\max D(S, c), \quad D = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c) \qquad (2)$$

Since the features selected according to Max-Relevance are likely to be redundant, the dependency among these features could be large. Thus, the respective class-discriminative power would not change considerably if highly redundant features were replaced by a single feature. Therefore, the following minimal redundancy (Min-Redundancy) condition can be added to select mutually exclusive features:

$$\min R(S), \quad R = \frac{1}{|S|^2} \sum_{x_i, x_j \in S} I(x_i; x_j) \qquad (3)$$

By combining the criteria in Eq. (2) and Eq. (3), "Minimal-Redundancy-Maximal-Relevance" (MRMR) could be approximated by defining the operator $\Phi(D, R)$ to combine $D$ and $R$ and considering the following simplest form to optimize $D$ and $R$ simultaneously:

$$\max \Phi(D, R), \quad \Phi = D - R \qquad (4)$$

Fig. 6: Flowchart of the approach for optimizing the number of features and the classifier parameters using K-fold cross validation

Classification: With the help of the LIBSVM toolbox in MATLAB (Chang and Lin, 2011), the Support Vector Machine was implemented for the classification stage. By applying SVM, the feature data were mapped to a higher-dimensional space in which the data could be linearly separated by optimizing a hyperplane.
Since the present study was a two-class classification problem, a binary classifier was applied, i.e., the Support Vector Machine (SVM) method with the Radial Basis Function (RBF) kernel from the LIBSVM package. Briefly, the method maps the input feature data into a high-dimensional space and seeks an optimal separating hyperplane that has the maximal margin between the two classes of data samples. The penalty parameter and the gamma value of the RBF kernel were determined by a grid-search approach.
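For readers unfamiliar with the two tuned parameters, the standard soft-margin SVM formulation with an RBF kernel, as used by LIBSVM's C-SVC, is reproduced here. The notation (C, γ, ξ, φ) is the conventional one and is not taken from the original text.

```latex
K(x_i, x_j) = \exp\!\left(-\gamma \,\lVert x_i - x_j \rVert^2\right), \qquad \gamma > 0

\min_{w,\, b,\, \xi} \;\; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i} \xi_i
\quad \text{subject to} \quad
y_i \left( w^{\top} \phi(x_i) + b \right) \ge 1 - \xi_i, \;\; \xi_i \ge 0
```

The penalty parameter C trades margin width against training errors, while γ controls how quickly the kernel similarity decays with distance, so the two values are usually searched jointly.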
Validation: To validate the effectiveness of the number of selected features, a ten-fold cross validation was applied. As Fig. 6 illustrates, the selected feature matrix was first split into ten chunks so that, for each iteration of the middle loop (N <= 10), only one chunk was used for error prediction while the other chunks were used to train the SVM classifier. In the innermost loop, another ten-fold cross validation was applied for SVM parameter selection. For each iteration of the grid search, a new SVM classifier was built, followed by computation of the respective error rate. At the end of the process, a voting technique was used in order to obtain the best SVM classifier parameters.
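The nested validation scheme can be approximated with scikit-learn, whose `SVC` class wraps LIBSVM. The sketch below is an assumption-laden stand-in for the MATLAB procedure: the parameter grid is an arbitrary exponential grid, features are standardized inside the pipeline, and the best parameters of each outer fold are chosen by mean inner-fold accuracy instead of the voting technique described above. The function name `nested_cv_accuracy` is hypothetical.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def nested_cv_accuracy(X, y, n_outer=10, n_inner=10, seed=0):
    """Return one held-out accuracy per outer fold of a nested cross validation."""
    # Assumed search grid over the penalty parameter C and the RBF gamma.
    param_grid = {
        "svc__C": 2.0 ** np.arange(-5, 16, 4),
        "svc__gamma": 2.0 ** np.arange(-15, 4, 4),
    }
    inner = StratifiedKFold(n_splits=n_inner, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=n_outer, shuffle=True, random_state=seed + 1)
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    # Inner loop: grid search on the training part of each outer fold.
    search = GridSearchCV(model, param_grid, cv=inner)
    # Outer loop: score the tuned classifier on the held-out chunk.
    return cross_val_score(search, X, y, cv=outer)
```

Keeping the parameter search inside the outer training folds prevents the held-out chunk from influencing the chosen C and gamma. The same reasoning suggests running feature selection inside each training fold as well, although the text does not state whether this was done.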

RESULTS AND DISCUSSION
Table 1 shows the window size and computational time at which the peak classification accuracy was obtained for each subject. These results demonstrate the potential of the proposed approach to discriminate between two adjacent MI tasks of the right hand. However, each of these accuracies was observed at a different time window size. Thus, the wide variance between the three subjects in the window size selected for optimal feature extraction still presents a challenge (Fig. 7a). In other words, a constant window size cannot be generalized for all subjects. It can be seen that the respective neural activities were dominant in the α and β bands. Regarding the number of selected features, Fig. 7b demonstrates the clear distinction between MRMR and PCA in terms of maximum classification accuracy (91% and 75% respectively). On the other hand, the maximum accuracies of MRMR and PCA were obtained with feature vectors of 58 and 70 features respectively. Therefore, a computational time-based preference for MRMR over PCA was achievable. As a result, Fig. 7c illustrates the overall accuracies for MRMR and PCA as per each MI task. Moreover, Fig. 8 shows variations in EEG power, for all of the subjects, across different temporal and frequential ranges for the finger pointing and glass of water MI gestures respectively. However, the sample size should be increased in order to enhance the classification rates using smaller window sizes.
Recent studies have been conducted to decode various motor imagery tasks by exploiting the relevant EEG recordings. As the challenges are associated with the whole architecture of EEG-based BCI applications, recent studies have shown promising results by investigating feature extraction from MI-based EEG datasets and the classification of different MI tasks. Elasuty and Eldawlatly (2015) proposed an approach for discriminating between right hand and left hand MI tasks using Dynamic Bayesian Networks as a feature extraction method. That study showed an increase in the accuracy of classifying right and left hand MI tasks. For the same classification task, Dharmasena et al. (2013) achieved a considerable classification accuracy by using a consumer-grade device (EMOTIV headset). Like most of the previous work, the aforementioned studies were proposed to discriminate only between spatially distant MI tasks. Thus, the proposed approach can be considered promising for the classification of MI tasks of the same body part. Furthermore, the present study proposed utilizing a mobile EEG signal acquisition device (OPENBCI), which is a benchmark for real-time applications.
Among the few studies that have investigated the discrimination of different MI tasks of the same body part, Edelman et al. (2014, 2016) proposed approaches by which complex right hand MI tasks can be distinguished by transforming the EEG source signals into their spatial cortical images. These studies successfully classified different MI tasks of the right hand by using EEG source imaging rather than the EEG signals. However, none of these studies has addressed the problem of feature dimensionality reduction. To our knowledge, MRMR has not been used for feature selection in motor imagery tasks. In contrast to the most widely applied feature selection methods, i.e., Principal Component Analysis and Mahalanobis distance, MRMR identifies the features that are most relevant to the corresponding targets and least redundant with one another. On this basis, the proposed approach increased the classification accuracy of the right hand MI tasks compared to the PCA method.

CONCLUSION
In the present study, an approach was proposed to discriminate neural traces of different right hand gesture MI tasks. The proposed technique employed the temporal, frequential and spatial task specificity of cortical behavior in order to characterize the brain processes involved in the glass of water and index pointer gesture MI tasks of the right hand. Using the consumer-grade device OPENBCI and MRMR as a feature selection method, the proposed approach resulted in increased classification accuracies for the respective motor imagery tasks. Moreover, this study highlighted the influence of window size on classification accuracy. Since the present study focuses only on validating the discrimination of different MI tasks of the same hand, with no novelty in feature extraction methods, future work will investigate feature learning by applying a deep learning approach.

Fig. 1: A block diagram of the proposed approach

Fig. 2: Trial time scheme for Motor Imagery task data. Only the imagery period is selected for generating the labelled dataset

Fig. 4: Topographic 2-D map of the eight Independent Components (IC's) for all of the subjects

Fig. 7: (a) Classification performance of the two motor imagery tasks using the MRMR and PCA feature selection methods, plotted as a function of the number of best features used. The maximum accuracy for the MRMR and PCA methods was achieved when using 58 and 70 of the top features respectively; (b) Mean classification performance of each window size for the three subjects using only MRMR; (c) Confusion matrices for the MRMR and PCA methods at their respective peak overall accuracy (Water glass gesture, Index pointer gesture)

Table 1: The window size and the computational time at the maximum accuracy for each subject when applying MRMR as a feature selection method