An Improved Refined Composite Multivariate Multiscale Fuzzy Entropy Method for MI-EEG Feature Extraction

Feature extraction of motor imagery electroencephalogram (MI-EEG) has shown good application prospects in the field of medical health. Multivariate entropy-based feature extraction methods have also been gradually applied to analyze complex multichannel biomedical signals, such as EEG and electromyography. Compared with traditional multivariate entropies, refined composite multivariate multiscale fuzzy entropy (RCmvMFE) overcomes the defect of unstable entropy values caused by an increasing scale factor and helps to obtain richer feature information. However, the coarse-grained process of RCmvMFE is a mean filter, which weakens Gaussian noise but is powerless against random impulse noise interference. This yields poor-quality feature information and low classification accuracy. In this paper, RCmvMFE is improved (IRCmvMFE) by using composite filters in the coarse-grained procedure to enhance filter performance. Median filters are employed to remove the impulse noise interference from multichannel MI-EEG signals, and the filtered MI-EEGs are further smoothed by mean filters. The multiscale IRCmvMFEs are calculated for all channels of the composite-filtered MI-EEGs, forming a feature vector, and a support vector machine is used for pattern classification. On two public datasets with different motor imagery tasks, the recognition results of 10 × 10-fold cross-validation reached 99.43% and 99.86%, respectively, and a statistical analysis of the experimental results also showed the effectiveness of IRCmvMFE. The proposed IRCmvMFE-based feature extraction method is superior to the compared entropy-based and traditional methods.


Introduction
Brain-computer interface (BCI) is a new type of human-computer interaction technology that enables the brain to control external devices [1,2]. Motor imagery electroencephalogram (MI-EEG)-based BCI has great prospects in the field of rehabilitation medical engineering. One of the key technologies of BCI is the ability to effectively extract features from complex multichannel MI-EEG signals.
Although the above univariate methods have shown good performance, they are only suitable for single-channel recording analyses. They fail to measure multichannel data synchronously and ignore the dynamic characteristics across channels [25]. Therefore, SampEn was extended to produce multivariate SampEn (mvSE) [26] and multivariate MSE (mvMSE) [26-29] to analyze multichannel signals more effectively. Considering the disadvantages of SampEn in mvSE and mvMSE, multivariate FE (mvFE) and multivariate MFE (mvMFE) [30,31] were obtained by replacing SampEn with FE. Recently, as an improvement of mvMFE, a refined composite mvMFE (RCmvMFE) was proposed to analyze fault signals and biomedical signals [32,33]. In RCmvMFE, the entropy stability is improved and the sensitivity to signal length is reduced. However, the coarse-grained process of RCmvMFE is a mean filter that smooths signals but does not eliminate random impulse noise interference. High-amplitude electrooculogram and electromyography interference is inevitably produced during the acquisition of MI-EEG. This is not conducive to extracting valid feature information from multichannel MI-EEG signals. In this paper, an improved RCmvMFE (IRCmvMFE) is developed by combining median [34] and mean filters in the coarse-grained process to further improve the filter effect, i.e., the median filter is first applied to each channel to remove impulse interference, and then the mean filter is used for further smoothing. Subsequently, IRCmvMFE is used to extract features from multichannel MI-EEG signals. The experimental research shows the effectiveness of IRCmvMFE. The rest of the paper is organized as follows: Section 2 introduces the process of extracting MI-EEG features using IRCmvMFE, Section 3 describes the experiments performed, Section 4 discusses the results, and Section 5 provides the conclusions.

Feature Extraction with IRCmvMFE
By combining median filters and mean filters in the coarse-grained process, RCmvMFE is improved to produce IRCmvMFE, which is applied to extract features of MI-EEG. The main steps are as follows: preprocessing, selecting the optimal channels, performing multivariate coarse-grained analysis of the preprocessed MI-EEG data, calculating IRCmvMFE, and constructing a feature vector. A support vector machine (SVM) is used to classify the feature vector. The block diagram of the proposed method is displayed in Figure 1.

Preprocessing MI-EEG Signals.
For two-class motor imagery tasks, assume that $X^0_{T,C_i} = [x^0_{T,C_i}(1), x^0_{T,C_i}(2), \ldots, x^0_{T,C_i}(e)]^T$ represents the ith channel MI-EEG sequence of the Tth task, where $T \in \{1, 2\}$ and $i = 1, 2, \ldots, p$; $e$ and $p$ represent the number of sample points and the total number of channels, respectively. $X^0_{T,C_i}$ is bandpass filtered to the frequency band associated with the tasks, and the result is expressed as $X^1_{T,C_i}$. The motor imagery time period $[a, b]$ is taken as the optimal sampling interval, and the MI-EEG signals in this segment are summarized as $X^2_{T,C_i}$, where $N$ represents the number of sampled MI-EEG points within the optimal sampling interval.

Channel Selection.
When the brain is engaged in motor imagery, only some of the channels are activated, as reflected in the power spectrum. Extracting the features of all channels not only increases the computational complexity but also increases the redundancy of the feature information and reduces the classification accuracy [35]. Therefore, the choice of optimal channels is important. In this paper, the Fisher score of the average power spectrum of $X^2_{T,C_i}$ is calculated to select channels according to equation (1), where $P_1(i)$ and $P_2(i)$ represent the average power spectrum on the ith channel of class 1 and class 2 motor imagery tasks, respectively, $\mathrm{var}(\cdot)$ is the variance, and $F(i)$ represents the Fisher score of the ith channel. The larger $F(i)$ is, the greater the contribution of the ith channel. The signals of the $p'$ channels with the top $F(i)$ are selected for subsequent research and are denoted $X_{T,C_i}$, $i = 1, 2, \ldots, p'$, in which $p'$ stands for the number of selected channels.
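The channel ranking described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one common two-class form of the Fisher score, the squared difference of class means divided by the sum of class variances, which may differ in detail from equation (1). The function names `fisher_score` and `top_channels` and the toy power values are hypothetical.

```python
from statistics import mean, pvariance

def fisher_score(p1, p2):
    """Two-class Fisher score for one channel.

    p1, p2: per-trial average band power for class 1 and class 2.
    Assumed form: (mean(p1) - mean(p2))^2 / (var(p1) + var(p2)).
    """
    return (mean(p1) - mean(p2)) ** 2 / (pvariance(p1) + pvariance(p2))

def top_channels(power_class1, power_class2, n_keep):
    """Rank channels by Fisher score and return the n_keep best channel names."""
    scores = {ch: fisher_score(power_class1[ch], power_class2[ch])
              for ch in power_class1}
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]

# Toy per-trial power values (hypothetical, not taken from the datasets).
class1 = {"C3": [4.0, 4.2, 3.8], "Cz": [2.0, 2.5, 1.5], "C4": [1.0, 1.1, 0.9]}
class2 = {"C3": [1.0, 1.2, 0.8], "Cz": [2.1, 2.4, 1.6], "C4": [3.0, 3.1, 2.9]}
best = top_channels(class1, class2, n_keep=2)
print(best)
```

In this toy data, Cz has nearly identical class means and therefore the lowest score, so it is the channel that gets dropped.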

Coarse-Graining of IRCmvMFE
Step 1. In the coarse-grained process of IRCmvMFE, the median filter is first applied to $X_{T,C_i}$. Supposing the filter size is $j = 2k$ or $j = 2k + 1$, the data in the window are sorted in ascending order, and the filter output $y_{T,C_i}(s)$ is the median of the sorted values, where $s \in \{1, \ldots, N\}$ and $X_{T,C_i}(k)$ means the kth maximum value in the window.
Step 2. For the scale factor $\tau$, the kth coarse-grained sequence on channel $C_i$ of the Tth class task, $y^{k,\tau}_{T,C_i}$, is obtained by applying the mean filter to $y_{T,C_i}$, where $N' = \mathrm{int}[N/\tau]$ represents the number of sample points of the coarse-grained sequence. Therefore, $\tau$ multivariate coarse-grained sequences are obtained and described as $Y^{k,\tau}_{T,C_i}$, $k = 1, 2, \ldots, \tau$.
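The two coarse-graining steps can be sketched in a few lines. This is a hedged illustration under stated assumptions: the median window shrinks at the signal edges, the kth coarse-grained sequence averages non-overlapping windows of length τ starting at sample k (the usual refined composite construction), and the function names `median_filter`, `coarse_grain`, and `composite_coarse_grain` are hypothetical.

```python
from statistics import median

def median_filter(x, size=3):
    """Sliding-window median; the window shrinks at the edges (an assumption)."""
    half = size // 2
    return [median(x[max(0, s - half): s + half + 1]) for s in range(len(x))]

def coarse_grain(y, tau, k):
    """k-th (1-based) coarse-grained sequence at scale tau.

    Averages non-overlapping windows of length tau, starting at sample k.
    """
    start = k - 1
    n_out = (len(y) - start) // tau
    return [sum(y[start + j * tau: start + (j + 1) * tau]) / tau
            for j in range(n_out)]

def composite_coarse_grain(x, tau, size=3):
    """Median filter first, then the tau mean-filtered sequences."""
    filtered = median_filter(x, size)
    return [coarse_grain(filtered, tau, k) for k in range(1, tau + 1)]

# Toy single-channel signal with one impulse (value 50) at the third sample.
x = [1, 2, 50, 3, 4, 5, 6, 7]
sequences = composite_coarse_grain(x, tau=2)
print(sequences)
```

The impulse never reaches the coarse-grained sequences because the median filter removes it before the averaging step.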

IRCmvMFE Calculation
Step 1. The multivariate coarse-grained sequence $Y^{k,\tau}_{T,C_i}$ undergoes multivariate embedded reconstruction, which yields the multivariate composite delay vectors $Z^{k,\tau}_{T,m}(i)$; each vector collects the delayed samples of every selected channel, beginning with $y^{k,\tau}_{T,C_1}(i)$.
Step 2. The distance between any two multivariate composite delay vectors $Z^{k,\tau}_{T,m}(i)$ and $Z^{k,\tau}_{T,m}(j)$ is computed by equation (5).
Step 3. Given a threshold $r$, the fuzzy membership grade (similarity) of $Z^{k,\tau}_{T,m}(i)$ and $Z^{k,\tau}_{T,m}(j)$ is computed by equation (6).
Step 4. The average membership grade $\phi^{k,\tau}_{T,m}(r)$ is obtained using equation (7).
Step 5. Repeat the above steps, extending the dimension of the multivariate composite delay vector from $m$ to $m + 1$, and derive $\phi^{k,\tau}_{T,m+1}(r)$. Over the $\tau$ coarse-grained sequences, we get $\tau$ values of $\phi^{k,\tau}_{T,m}(r)$ and $\tau$ values of $\phi^{k,\tau}_{T,m+1}(r)$, and the averages of $\phi^{k,\tau}_{T,m}(r)$ and $\phi^{k,\tau}_{T,m+1}(r)$ over $k$ are calculated. IRCmvMFE is then defined from these two averages by equation (8).
The procedure for calculating IRCmvMFE is summarized in Algorithm 1.
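The final steps can be illustrated with a small sketch. It does not reproduce equations (5) to (8); it assumes the forms commonly used for fuzzy entropies: a Chebyshev (maximum) distance, an exponential membership function $\exp(-d^2/r)$, and the refined composite definition $-\ln(\bar{\phi}_{m+1}/\bar{\phi}_m)$ taken over the averaged membership grades. All function names and the toy values are hypothetical.

```python
import math

def chebyshev(u, v):
    """Maximum absolute coordinate difference between two delay vectors."""
    return max(abs(a - b) for a, b in zip(u, v))

def membership(d, r, power=2):
    """Assumed exponential fuzzy membership grade for a distance d."""
    return math.exp(-(d ** power) / r)

def average_membership(vectors, r):
    """Mean membership grade over all ordered pairs of distinct vectors."""
    n = len(vectors)
    total = sum(membership(chebyshev(vectors[i], vectors[j]), r)
                for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))

def refined_composite_entropy(phi_m, phi_m1):
    """Entropy from the tau averaged grades at dimensions m and m + 1.

    The grades are averaged over k first and the logarithm is taken once,
    which avoids undefined values when a single phi is zero.
    """
    mean_m = sum(phi_m) / len(phi_m)
    mean_m1 = sum(phi_m1) / len(phi_m1)
    return -math.log(mean_m1 / mean_m)

# Toy membership grades for tau = 2 coarse-grained sequences (hypothetical).
value = refined_composite_entropy([0.5, 0.4], [0.25, 0.2])
print(round(value, 4))
```

Averaging the membership grades before taking the logarithm, rather than averaging per-sequence entropies, is the design choice that gives refined composite methods their stability at large scale factors.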

Determination of a Maximum Scale Factor.
As the scale factor increases, multivariate coarse-grained sequences become smoother. Scale factors that are too large omit useful information and reduce classification accuracy. Therefore, the impact on sequence smoothness and classification accuracy should be considered comprehensively to determine the maximum scale factor $\tau_{\max}$.

Construction of a Feature Vector.
For $\tau \in [1, \tau_{\max}]$, the IRCmvMFE at scale $\tau$ in the Tth class task, i.e., $\mathrm{IRCmv}^{\tau}_T$, is estimated, and the values are combined to form the feature vector $F_T = [\mathrm{IRCmv}^{1}_T, \mathrm{IRCmv}^{2}_T, \ldots, \mathrm{IRCmv}^{\tau_{\max}}_T]$. The feature vectors of the two tasks are fused in parallel to obtain the feature vector of MI-EEG.

Data Description and Preprocessing.
MI-EEG data were obtained from dataset III in the BCI Competition II [36] and dataset IVa in the BCI Competition III [37]. In dataset III of BCI Competition II, MI-EEG signals were recorded on channels C3, Cz, and C4 from a healthy subject who imagined left-hand and right-hand movements. Left-hand and right-hand motor imagery tasks were each performed 140 times for a total of 280 experimental trials. The signals were sampled at 128 Hz and filtered to 0.5-30 Hz. The MI-EEG collection timing scheme is shown in Figure 2(a). The subject was at rest for the first 2 s, and the corresponding motor imagery task was completed according to the screen prompts from 3 s to 9 s. To better distinguish the two-class tasks, this paper used the sampling interval [451, 900]. Dataset IVa of BCI Competition III recorded the MI-EEG signals of five healthy subjects using 118 channels during right-hand (RH) and right-foot (RF) motor imagery tasks. The original sampling rate was 1000 Hz, but we downsampled these data to 100 Hz. The subjects performed the corresponding imagined movement according to the prompts in the first 3.5 s and then rested for a random period between 1.75 s and 2.25 s. The timing scheme of MI-EEG collection during the right-hand-foot motor imagery task is shown in Figure 2(b). Each subject performed 280 trials, with 140 each of the RH and RF motor imagery tasks. In this paper, the MI-EEG components related to the mu rhythm (8-13 Hz) and the beta rhythm (14-32 Hz), which are associated with motor imagery tasks, were selected, i.e., the original MI-EEG signals were preprocessed by a bandpass filter covering these rhythms. The data between 0.5 s and 3.5 s were used for subsequent experimental research.
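To relate the sampling interval [451, 900] to the timing scheme, the sample indices can be converted to seconds. The sketch below assumes 1-based, inclusive sample indices, which the text does not state explicitly; the helper name `interval_seconds` is hypothetical.

```python
def interval_seconds(first, last, fs):
    """Convert a 1-based inclusive sample interval to (start_s, end_s, n_samples)."""
    return (first - 1) / fs, last / fs, last - first + 1

# Dataset III of BCI Competition II: interval [451, 900] at 128 Hz.
start_s, end_s, n_samples = interval_seconds(451, 900, fs=128)
print(start_s, end_s, n_samples)
```

Under this assumption the interval covers 450 samples, from about 3.52 s to about 7.03 s, which lies inside the 3 s to 9 s motor imagery period.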

Channel Selection.
Channel selection directly affects the quality of feature information and classification accuracy. It is essential to select the optimal channels before extracting MI-EEG features. There was a close relationship between the signals on channels C3, Cz, and C4 in the left-right-hand motor imagery task, so the data of these three channels were used for feature extraction. For the RH and RF motor imagery tasks in dataset IVa from BCI Competition III, the Fisher score of each channel was calculated by equation (1). The scores of different subjects are shown in Figure 3.
For each subject, the score of each channel is different, and for different subjects, the scores of the same channel are different. Thus, the optimal channels for each subject are different due to individual differences. The channels with the top three Fisher scores can be used as the optimal channels. The detailed information is shown in Table 1.

Comparison of Coarse-Grained Sequences between IRCmvMFE and RCmvMFE Methods.
To confirm the effectiveness of IRCmvMFE in extracting MI-EEG features, the coarse-grained processes of RCmvMFE and IRCmvMFE were compared. The relevant parameters were selected as follows: $m_k = 2$, $\lambda_k = 1$, $r = 0.2\,\mathrm{SD}$, and $\tau = 10$, where SD represents the standard deviation of $X_{T,C_i}$. According to Table 1, the channel $k$ with the highest Fisher score of each subject was selected. The experimental process was as follows: when a motor imagery task was performed, at scale $\tau$, the first $j$ ($0 \le j \le N - \tau$) points of $X_{T,k}$ were removed in turn.
The RCmvMFEs of the remaining points were calculated separately, and they were averaged over the $n_e$ experiments to obtain the average time series $RC_{T,k}$. The average time series of IRCmvMFE, $IRC_{T,k}$, was obtained in the same way. For left-right-hand motor imagery, training set data were used for analysis, i.e., $n_e = 70$. Similarly, $n_e$ was selected as 140 when the RH and RF motor imagery tasks were performed. The amplitudes of the original MI-EEG signals and the coarse-grained sequences of RCmvMFE and IRCmvMFE during left-right-hand motor imagery are shown in Figure 4. Similarly, the experimental results for right-hand-foot motor imagery are shown in Figure 5. It can be seen from Figure 4 that the original MI-EEG signals had larger fluctuations, which were clearly reduced by the coarse-grained process of both RCmvMFE and IRCmvMFE, and the smoothness of IRCmvMFE was better than that of RCmvMFE. In Figure 5, the intensity of the impulse noise interference differs between subjects. The coarse-grained sequences of both RH and RF motor imagery tasks using the RCmvMFE and IRCmvMFE of each subject followed the fluctuations of the original MI-EEG but oscillated more smoothly. For subject "aw," the impulse noise is not obvious, and the coarse-grained sequence of IRCmvMFE had larger fluctuations than that of RCmvMFE. However, the intensity of impulse noise interference is higher for the other subjects. Both $RC_{T,k}$ and $IRC_{T,k}$ showed better smoothness, and $IRC_{T,k}$ was superior to $RC_{T,k}$ for rapid MI-EEG changes.
The reason is that the coarse-grained process of RCmvMFE is equivalent to a mean filter, which has the effect of low-pass filtering and smoothing and can remove some random interference. However, it is helpless against impulse noise caused by sudden factors such as eye movements, blinks, and motion. In the coarse-grained process of IRCmvMFE, the median filter is used to remove the impulse noise interference, and then the filtered signals are smoothed by a mean filter.

Algorithm 1: Procedure for calculating IRCmvMFE.
Input: Channel-selected data $X_{T,C_i}$
(1) Coarse-graining of IRCmvMFE
  Step 1. Calculate the output of the median filter on $X_{T,C_i}$: $y_{T,C_i}$
  Step 2. Calculate the output of the mean filter on $y_{T,C_i}$: $y^{k,\tau}_{T,C_i}$
(2) IRCmvMFE calculation
  Step 1. Construct the multivariate composite delay vectors $Z^{k,\tau}_{T,m}(i)$
  Step 2. For $k = 1$ to $\tau$
    Calculate the distance and the similarity of $Z^{k,\tau}_{T,m}(i)$ and $Z^{k,\tau}_{T,m}(j)$ using equations (5) to (6)
    Calculate $\phi^{k,\tau}_{T,m}(r)$ by equation (7)
  End
  Step 3. Repeat the above steps, extend the dimension from $m$ to $m + 1$, and calculate $\phi^{k,\tau}_{T,m+1}$
  Step 4. Average $\phi^{k,\tau}_{T,m}$ and $\phi^{k,\tau}_{T,m+1}$ over $k$ and calculate IRCmvMFE by equation (8)
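The different behavior of a mean filter and a median filter toward impulse noise, as discussed in this section, can be seen in a small sketch. The signal is a toy example, and the helper `sliding` with its edge handling (the window shrinks at the boundaries) is an assumption made for illustration only.

```python
from statistics import mean, median

def sliding(x, size, reducer):
    """Apply reducer (mean or median) over a centered sliding window."""
    half = size // 2
    return [reducer(x[max(0, s - half): s + half + 1]) for s in range(len(x))]

# Flat toy signal with a single impulse of amplitude 10.
signal = [1.0, 1.0, 1.0, 10.0, 1.0, 1.0, 1.0]
mean_out = sliding(signal, 3, mean)
median_out = sliding(signal, 3, median)
print(mean_out)    # the impulse is spread over three samples
print(median_out)  # the impulse is removed
```

The mean filter lowers the impulse amplitude but smears it across neighboring samples, whereas the median filter discards it entirely, which is why the composite filter applies the median filter first.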

Selection of Parameters in IRCmvMFE.
The parameter selection affects the estimate of IRCmvMFE. According to equation (8), the estimation of IRCmvMFE is not only related to the preprocessed MI-EEG but also involves selecting an embedding dimension vector $M = [m_1, m_2, \ldots, m_{p'}]$, a time delay vector $\lambda = [\lambda_1, \lambda_2, \ldots, \lambda_{p'}]$, a threshold $r$, and a scale factor $\tau$. The selection of parameter $M$ followed reference [32], i.e., $m_k = 2$. There is no established standard for parameter $\lambda$, so for simplicity, $\lambda_k$ was selected as 1. The threshold was set to $r = 0.2\,\mathrm{SD}$.
In addition, the selection of $\tau$ influenced the filter effect in the coarse-grained process of MI-EEG and in turn affected the extracted features and the classification results. The larger $\tau$ is, the greater the computational load and the better the recognition. In contrast, a smaller $\tau$ resulted in poor filter performance [34]. For $\tau \in [1, 75]$, the IRCmvMFEs for left-right-hand motor imagery were estimated and then classified by SVM. A Gaussian kernel function was employed in this paper, and the SVM was optimized by grid search. For $\tau \in [1, 45]$, the same experiment was performed with the right-hand-foot motor imagery tasks. The 10 × 10-fold cross-validation (CV) was used to eliminate the contingency in the feature extraction process of MI-EEG. The average classification accuracy of the 10 × 10-fold CV is shown in Figure 6.
In Figure 6(a), the classification results gradually increased as the scale factor $\tau$ increased. When $\tau$ ranged from 55 to 75, the classification accuracy tended to be stable and close to 100%, and the highest recognition rate was obtained at scale 65. Therefore, the maximum $\tau$ for left-right-hand motor imagery was selected as 65. In Figure 6(b), as $\tau$ increased, the average recognition rate of each subject first increased and then decreased. In this paper, the $\tau_{\max}$ values of subjects "aa," "al," "av," "aw," and "ay" during right-hand-foot motor imagery were chosen as 41, 37, 33, 38, and 39, respectively. In addition, $\tau_{\max}$ is related to the mathematical model of the coarse-grained process of IRCmvMFE. There is a significant difference in $\tau_{\max}$ between different types of two-class motor imagery tasks, while the difference between multiple subjects during the same type of task is not obvious.
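The 10 × 10-fold CV protocol can be sketched as ten repetitions of a freshly shuffled 10-fold split. The sketch below only generates the index partitions; the SVM training, the grid search, and any class stratification are omitted, and the function name `repeated_kfold` and the fixed seed are assumptions, since the paper does not describe how the folds were drawn.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) for n_repeats shuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for f in range(n_folds):
            test = order[f::n_folds]          # every n_folds-th shuffled index
            test_set = set(test)
            train = [i for i in order if i not in test_set]
            yield train, test

# 280 trials, as in each dataset used in this paper.
splits = list(repeated_kfold(280))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

With 280 trials, each of the 100 splits trains on 252 trials and tests on 28, and the reported accuracy is the average over all 100 test folds.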

Comparison of Multiple Entropy-Based Feature Extraction Methods.
In this section, a comparative study of IRCmvMFE and various entropy-based feature extraction methods was conducted. To make the comparison more objective, the same dataset as in references [13,22] was selected, i.e., dataset III from BCI Competition II, and SVM was used for classification. The classification result of IMFE was taken from [22], and the related parameters of the other entropy-based methods were selected as in reference [13]. The average recognition results of 10 × 10-fold CV and the standard deviations are displayed in Figure 7.
In Figure 7, the classification result of MFE was higher than those of SampEn, FE, and MSE, because the fuzzy membership function enhanced the stability of MFE and richer feature information was collected from multiple scales. At the same scale, the information of multiple coarse-grained sequences was integrated by CMFE, yielding a slightly better result. Based on the independent parameter optimization strategy, the preferred parameters were used by IMFE to extract features from the MI-EEG, and the recognition accuracy was further improved. Although the results of mvSE, mvFE, and mvMSE were poor, mvMFE, RCmvMFE, and IRCmvMFE showed the advantages of multivariate entropy methods over traditional univariate entropies, both in terms of classification accuracy and standard deviation. This was mainly because these feature extraction methods evaluated the multivariate complexity of multichannel data and expressed the dynamic relationships and synchronizations across channels.
The IRCmvMFE, RCmvMFE, and mvMFE methods displayed superiority on dataset III from BCI Competition II. To further illustrate the improvement of IRCmvMFE, a comparative study of these three methods was performed on dataset IVa, using SVM for classification. The classification results with 10 × 10-fold CV are shown in Table 2. For each subject, the recognition rates obtained by using RCmvMFE to extract features of MI-EEG were higher than those of mvMFE because the multivariate feature of RCmvMFE was considered at the same scale, and the defect of unstable entropy values, i.e., the coarse-grained time series shortening as the scale factor increases, was overcome. Moreover, a composite filter technique was applied in the coarse-grained process of IRCmvMFE to eliminate burst-like impulse noise and the Gaussian noise of the MI-EEG, which produced better-quality information. For different subjects, IRCmvMFE achieved better recognition accuracy and a smaller standard deviation than RCmvMFE, illustrating the stability and superiority of IRCmvMFE. Further, according to Figure 5, the impulse noise interference was not obvious for subject "aw," and the recognition result of IRCmvMFE was only slightly better than that of RCmvMFE. However, there was greater impulse noise interference for most subjects ("aa," "al," "av," "ay"); after using IRCmvMFE to enhance the filter effect, the recognition results were clearly improved.

Figure 5: Comparison of MI-EEG signals and the coarse-grained sequences by RCmvMFE and IRCmvMFE during RH and RF motor imagery tasks. (a) Subject "aa" with RH motor imagery. (b) Subject "aa" with RF motor imagery. (c) Subject "al" with RH motor imagery. (d) Subject "al" with RF motor imagery. (e) Subject "av" with RH motor imagery. (f) Subject "av" with RF motor imagery. (g) Subject "aw" with RH motor imagery. (h) Subject "aw" with RF motor imagery. (i) Subject "ay" with RH motor imagery. (j) Subject "ay" with RF motor imagery.

Statistical Analysis.
In this section, statistical analysis was performed to further describe the improvement of IRCmvMFE. The kappa coefficient was designed to measure classification precision, and it makes the comparison of performance in multiclass tasks fairer. It is a common indicator for evaluating the performance of BCI systems [38,39]. The $\kappa$ coefficient is calculated as $\kappa = (p_0 - p_e)/(1 - p_e)$ (11), where $p_0$ represents the classification accuracy and $p_e$ is the probability of chance agreement. For a two-class task, if the number of samples is equal across classes, then the value of $p_e$ is 0.5. Using equation (11), the mean kappa coefficients of IRCmvMFE, RCmvMFE, and mvMFE with 10 × 10-fold CV were calculated. The results are shown in Table 3.
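As a worked illustration of the kappa coefficient, the sketch below applies the standard formula $\kappa = (p_0 - p_e)/(1 - p_e)$ with $p_e = 0.5$ to the two average accuracies reported in the abstract. The function name `kappa` is hypothetical, and the printed values are simple arithmetic on those accuracies, not entries copied from Table 3.

```python
def kappa(p0, pe=0.5):
    """Kappa coefficient from accuracy p0 and chance agreement pe."""
    return (p0 - pe) / (1 - pe)

# Average 10 x 10-fold CV accuracies quoted in the abstract.
for accuracy in (0.9943, 0.9986):
    print(accuracy, round(kappa(accuracy), 4))
```

For a balanced two-class task the formula reduces to $\kappa = 2p_0 - 1$, so chance-level accuracy gives 0 and perfect accuracy gives 1.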

Computational Intelligence and Neuroscience
Comparing the mean kappa values, the result of MI-EEG feature extraction for each subject was highest when using IRCmvMFE; this revealed that IRCmvMFE had better consistency than RCmvMFE and mvMFE.

Comparison of Multiple Traditional Feature Extraction Methods.
A variety of traditional feature extraction methods [3-9] were compared with the method presented in this paper, using SVM as the classifier. In Table 4, the top classification results and the average classification results of 10 × 10-fold CV of the referenced feature extraction methods [3-7] on BCI Competition II are displayed. IRCmvMFE achieved the highest classification accuracy among the referenced methods, and its 10 × 10-fold CV results were also better. This shows the ability of IRCmvMFE to quantify the complexity of multichannel signals and implies its superiority in extracting features from MI-EEG signals.
The CSP-based feature extraction methods have been extensively studied on BCI Competition III. The experimental results of 10 × 10-fold CV with the CSP, filter bank CSP (FBCSP), discriminant FBCSP (DFBCSP), sparse FBCSP (SFBCSP), and spectrally weighted CSP (SWCSP) methods were taken from references [8,9]. The method presented in this paper was compared with these methods, and the recognition rates are shown in Table 5. The results of CSP-based feature extraction were lower than those of IRCmvMFE. CSP-based methods only considered the spatial characteristics of MI-EEG signals, ignoring the features in other domains. IRCmvMFE effectively extracted nonlinear dynamic features of MI-EEG, correctly analyzed multichannel signals, and had good applicability to multiple subjects.

Discussion
In this paper, IRCmvMFE was proposed as a feature extraction method for MI-EEG signals. In IRCmvMFE, a composite filter technique was applied to improve the coarse-grained process of RCmvMFE, which eliminated impulse noise interference due to random factors, produced smoother MI-EEG time series, and enhanced the filter results. The optimal channels and the optimal parameters were selected to calculate IRCmvMFE for each subject during left-right-hand or right-hand-foot motor imagery. Multiscale IRCmvMFEs were combined into a feature vector. Entropy-based and traditional feature extraction methods from the literature were compared on two public datasets. The kappa coefficients of IRCmvMFE, RCmvMFE, and mvMFE were calculated for statistical analysis. The results implied the superiority and applicability of IRCmvMFE for the analysis of two-class motor imagery tasks. In the future, we will continue to focus on the research of multiclass motor imagery tasks.

Note: "-" indicates that the average recognition rate of 10 × 10-fold CV is not given in the reference.

Conclusions
A novel nonlinear dynamics method based on RCmvMFE, called IRCmvMFE, was introduced in this study. This method provides a potential tool for the nonlinear dynamic analysis of multichannel MI-EEG signals. RCmvMFE was improved by using a composite filter technique in the coarse-grained process, which effectively removes impulse noise interference, better reflects the dynamic correlations both within and across channels, more closely matches the nonlinear and time-varying characteristics of MI-EEG, and produces better features and classification accuracy. IRCmvMFE was applied to the analysis of multichannel MI-EEG signals and was compared to other commonly used feature extraction methods. IRCmvMFE yielded the highest classification results and improved stability, which demonstrates its applicability to MI-EEG feature extraction and suggests that it is a useful tool for the analysis of other complex, two-class biological signals.

Data Availability
Two previously reported datasets were used to support this study and are available at http://bbci.de/competition/ii/ and http://www.bbci.de/competition/iii. These datasets are cited at relevant places within the text as references [36,37].

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.