A Method of Feature Extraction for EEG Signals Recognition Using ROC Curve

Feature extraction from electroencephalogram (EEG) signals plays an important role in mental task recognition for brain-computer interaction (BCI). In this study, a novel method of EEG signal feature extraction is proposed using the fast Fourier transform (FFT) and the receiver operating characteristic (ROC) curve. In the proposed method, the raw EEG data are first transformed into FFT power spectra, and the frequencies used as features are then selected according to the area under the curve (AUC) of the ROC computed between the spectrum values of different classes of mental tasks. Experimental results on benchmark EEG data showed the effectiveness of the proposed feature extraction method when a support vector machine (SVM) was used as the classifier.


Introduction
The electric potentials at electrodes arrayed on the surface of the head are measured by an electroencephalograph, and the recorded signal is called an "electroencephalogram" (EEG). When a subject performs different mental tasks such as motor imagery, calculation, number counting, article considering, and so on, different patterns of EEG signals can be observed. The analysis and recognition of EEG signals is therefore one way to realize a brain-computer interface (BCI), or brain-machine interface (BMI), through which people can control machines or robots by their imagination [1] [2]. Feature extraction from the EEG signal plays a very important role in improving the accuracy of mental task recognition, and there have been many studies on this theme [3] [4]. In [5], Obayashi et al. proposed using nonlinear normalized feature spaces of the fast Fourier transform (FFT) of EEG signals. The method was improved in our previous work [6] by selecting the most characteristic phases of the raw EEG signals.
In this paper, we propose a novel method that uses the FFT and the area under the curve (AUC) of receiver operating characteristic (ROC) curves to extract features of EEG signals, and we verify its effectiveness with a powerful classifier, the kernel support vector machine (k-SVM). Benchmark data for EEG classification [7] and the BCI Competition II data [8] were used in the experiments.

Method
To extract the feature vector space of EEG signals for mental task recognition, we propose to use the FFT and ROC to find a limited number of frequencies whose AUC values are high when EEG data of two classes are compared in the training process; the power spectra at these frequencies are then used as the input vector of classifiers. A function provided by the free software package R [9] was used to process the raw EEG signals.

Receiver operating characteristic (ROC) curve
Assume that two classes of data, class A and class B, have probability density functions as shown in Fig. 1. For a given threshold, the true positive rate of class A is the shaded area α on the left, and the false positive rate is the more darkly shaded area 1 - β. When α and 1 - β are plotted while sliding the threshold along the x axis, a graph as shown in Fig. 2 is obtained. The curve in this graph is called the ROC curve, and it shows the separability of the two probability density functions. If the two distributions completely overlap, then α = 1 - β for any position of the threshold.

Fig. 2 AUC of a ROC curve
In Fig. 2, the area below the ROC curve is called the "area under curve" (AUC). Its value ranges from 0.0 to 1.0, and it is an indicator of the separability of the two distributions. An AUC of 0.5 means that the two distributions completely overlap. Conversely, when the AUC reaches 1.0 (or 0.0), the two distributions are completely separated.
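The threshold sweep and the resulting AUC can be illustrated with a small Python sketch. It is not the R package used in the experiments; the function names `roc_points` and `auc` are hypothetical, and the rule "assign to class A when the value is at or below the threshold" is an assumption taken from the layout of Fig. 1.

```python
def roc_points(class_a, class_b):
    """Sweep a threshold over all observed values and collect ROC points.

    Assumption for this sketch: a sample is assigned to class A when its
    value is <= the threshold (class A lies on the left, as in Fig. 1).
    Each point is (1 - beta, alpha), i.e. (false positive rate,
    true positive rate).
    """
    thresholds = sorted(set(class_a) | set(class_b))
    points = [(0.0, 0.0)]  # threshold below every sample
    for t in thresholds:
        alpha = sum(v <= t for v in class_a) / len(class_a)
        one_minus_beta = sum(v <= t for v in class_b) / len(class_b)
        points.append((one_minus_beta, alpha))
    return points


def auc(points):
    """Area under the ROC curve, integrated with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Completely separated samples give an AUC close to 1.0,
# identical samples give an AUC close to 0.5.
separated = auc(roc_points([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
overlapping = auc(roc_points([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

Swapping the two classes in the separated case gives an AUC close to 0.0, which is why both extremes indicate complete separation.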
ROC has been used successfully to classify cDNA microarray data [10] and was recently applied to EEG classification with the wavelet transform (WT) [11]. Here, we propose to use the AUC of ROC together with the FFT to distinguish EEG signals of two classes.

Method of using FFT
Suppose that an EEG signal $x_{m,n}$ (m=1, 2, …, M, n=1, 2, …, N) is given, where m indicates the channel number and n indicates the sample number in channel m. The procedure of the conventional method using FFT to extract the features of EEG [6] is as follows.
Step 1 Divide the EEG signal of every channel m into multiple windows l (l=1, 2, …, L) along the time domain.
Step 2 Perform FFT on all of the divided windows, and let the result be $F_{m,l}$.
Step 3 Compare the power spectrum of window l of channel m with that of its adjacent window l + 1, and calculate the difference.
Step 4 From Step 3, the maximum difference in channel m is determined, and the power spectrum in the 4 to 45 Hz band of the corresponding window l is used as the feature space for classifiers such as SVM, multi-layer perceptron (MLP), and so on.
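The four steps can be sketched for a single channel as follows. This is only an illustration under stated assumptions: the difference measure in Step 3 is not given in the text, so the sum of absolute differences between adjacent power spectra is assumed here; non-overlapping windows are assumed; the sampling rate `fs` and the function names are hypothetical; and a direct DFT is used in place of an FFT routine (it yields the same spectrum, only more slowly) so that the sketch needs no external library.

```python
import cmath
import math


def power_spectrum(window):
    """Power spectrum |X_k|^2 for k = 0 .. N/2, computed by a direct DFT."""
    n = len(window)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(window))) ** 2
            for k in range(n // 2 + 1)]


def select_window_features(signal, window_len, fs, f_lo=4.0, f_hi=45.0):
    """Return (index of the selected window, its 4-45 Hz power spectrum).

    Needs at least two complete windows in the signal.
    """
    # Step 1: split the channel into non-overlapping windows (assumed).
    windows = [signal[i:i + window_len]
               for i in range(0, len(signal) - window_len + 1, window_len)]
    # Step 2: transform every window.
    spectra = [power_spectrum(w) for w in windows]
    # Step 3: difference between adjacent windows (assumed measure:
    # sum of absolute differences of the power spectra).
    diffs = [sum(abs(a - b) for a, b in zip(spectra[l], spectra[l + 1]))
             for l in range(len(spectra) - 1)]
    # Step 4: window l with the maximum difference, restricted to the band.
    best = max(range(len(diffs)), key=lambda l: diffs[l])
    band = [k for k in range(window_len // 2 + 1)
            if f_lo <= k * fs / window_len <= f_hi]
    return best, [spectra[best][k] for k in band]


# Synthetic channel: two windows of a 10 Hz tone followed by one 20 Hz window.
fs = 64  # hypothetical sampling rate in Hz
signal = ([math.sin(2 * math.pi * 10 * i / fs) for i in range(128)]
          + [math.sin(2 * math.pi * 20 * i / fs) for i in range(64)])
best, features = select_window_features(signal, window_len=64, fs=fs)
```

In this synthetic example the largest change occurs between the second and third windows, so the second window (index 1) is selected and its band-limited spectrum peaks at 10 Hz.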

Method of using FFT and ROC Curve
Let the input signals be

(c) AUC of FFT of 2 kinds of mental tasks
Fig. 3 The procedure to obtain AUC

The procedure for obtaining the AUC of the FFT of 2 kinds of EEG data patterns is shown in Fig. 3. Frequencies with sufficiently high AUC values in Fig. 3 (c) are chosen as input features to classifiers.
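The selection of high-AUC frequencies can be sketched for one channel as below. Several points are assumptions of this sketch rather than details given in the text: the AUC is computed through its pairwise-comparison form (the fraction of class-1/class-2 pairs in which the class-1 value is smaller, which equals the area under the ROC curve); a frequency is scored by max(AUC, 1 - AUC), since an AUC near 0.0 indicates separation just as one near 1.0 does; a direct DFT stands in for the FFT; and the synthetic data, function names, and sampling rate are all hypothetical.

```python
import cmath
import math
import random


def power_spectrum(signal):
    """Power spectrum |X_k|^2 for k = 0 .. N/2, computed by a direct DFT."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))) ** 2
            for k in range(n // 2 + 1)]


def auc(class_1, class_2):
    """AUC as the fraction of pairs with class-1 value < class-2 value.

    Ties count one half.
    """
    wins = sum(1.0 if a < b else 0.5 if a == b else 0.0
               for a in class_1 for b in class_2)
    return wins / (len(class_1) * len(class_2))


def select_frequencies(spectra_1, spectra_2, num_points):
    """Score every frequency bin and return the num_points best bins."""
    n_bins = len(spectra_1[0])
    scores = []
    for p in range(n_bins):
        a = auc([s[p] for s in spectra_1], [s[p] for s in spectra_2])
        scores.append(max(a, 1.0 - a))  # assumed separability score
    ranked = sorted(range(n_bins), key=lambda p: scores[p], reverse=True)
    return sorted(ranked[:num_points]), scores


# Synthetic training data: class 1 carries an 8 Hz tone, class 2 a 12 Hz tone.
rng = random.Random(0)
fs, n = 32, 32  # hypothetical sampling rate (Hz) and samples per signal


def trial(freq):
    return [math.sin(2 * math.pi * freq * i / fs) + rng.gauss(0.0, 0.1)
            for i in range(n)]


class_1 = [power_spectrum(trial(8)) for _ in range(5)]
class_2 = [power_spectrum(trial(12)) for _ in range(5)]
selected, scores = select_frequencies(class_1, class_2, num_points=4)
```

Repeating the scoring for every channel gives one score per channel and frequency, and the power spectrum of an unknown signal at the selected bins then forms the classifier input.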

Recognition Experiment
Mental task recognition experiments were performed using EEG data and the methods introduced in Section 3. Two kinds of EEG data were used: (i) benchmark data provided by Colorado State University [7] and (ii) the BCI Competition II data provided by Birbaumer [8]. In the case of (i), subjects were asked to perform 5 kinds of mental tasks (see Table 1), and their EEG data were recorded in 10 trials with 7 channels. In (ii), there are 2 data sets named "Ia" and "Ib", each containing EEG data obtained for 2 kinds of tasks: in Ia, subjects were asked to recognize visual objects displayed at the top or bottom of a monitor and move a cursor up or down; in Ib, the tasks were the same, but the presentation was auditory as well as visual. Details of the EEG data used in our experiments are listed in Table 2. Kernel SVM (k-SVM) was used as the classifier; its package and the ROC package in R [9] were used in our experiments.
As shown in Fig. 4, for the 5-task EEG data the average classification rate of the conventional method [6] (FFT, 140 Data) was 76.0%, whereas that of the proposed method (FFT+ROC, 140 Data) was 100.0%. "140 Data" indicates that the input dimension of the SVM was 140, which came from 20 FFT points (P in Step 5 of the proposed procedure) in each of 7 channels.
For the BCI Competition II data Ia and Ib [8] (2 classes), the proposed method also showed its superiority over the conventional method, as shown in Fig. 5 and Fig. 6, respectively. Additionally, higher dimensionality gave a higher recognition rate in the experiments. For Ia, "120 Data" means 20 points per channel with 6 channels used, and "216 Data" means 36 points per channel. For Ib, "140 Data" was obtained as 20×7, and "315 Data" from 45 points per channel. The highest recognition rates of our method for Ia and Ib were 91.23% and 77.65%, higher than the best classification rates of 90.10% and 56.67% reported by T. Nguyen et al. [11], respectively.
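The input dimensions quoted above follow from multiplying the number of selected frequency points per channel by the number of channels. Writing D for the classifier input dimension (a symbol introduced here only for illustration), P for the points per channel, and M for the channels used:

```latex
D = P \times M, \qquad
20 \times 7 = 140, \quad
20 \times 6 = 120, \quad
36 \times 6 = 216, \quad
45 \times 7 = 315
```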

Conclusion
In this paper, we proposed a feature extraction method for EEG pattern recognition. The proposed method uses the AUC values of ROC curves computed on FFT results for EEG signals of two classes to decide the feature vector space for classifiers. Classification experiments were performed using the Gaussian kernel SVM (k-SVM) to confirm the proposed method. In the experiments, the proposed method showed higher recognition ability than conventional methods, including the conventional FFT feature extraction method and others. We also used the ROC of raw EEG data and the ROC of the wavelet transform (WT) as the input of k-SVM, and the experimental results showed that the method described in Section 3, i.e., features extracted by the ROC of FFT, was superior to the others.
Future work is to confirm the performance of the proposed method with other classifiers such as MLP, K-nearest neighbors (KNN), convolutional neural networks (CNN), and so on.

Fig. 1 Overlapping probability distributions of two classes of data.

(c=1, 2, k=1, 2, …, K), where k indicates the kth EEG signal of a set of EEG data, and c indicates the class of mental task. m and n are the same as in Section 3.1. The procedure of the novel method is as follows.
Step 1 Perform FFT on all the EEG signals of classes c = 1 and 2 (K signals each) of channel m.
Step 3 Calculate the ROC curve and its AUC.
Step 4 Repeat Step 2 and Step 3 on all channels; a set $A_{m,p}$ for each frequency p in channel m is obtained.
Step 5 Find the P frequency points at which $A_{m,p}$ is high.
Step 6 The power spectrum $E_{m,p}$ (p=1, 2, …, P) of the unknown EEG signal is used as the input feature vector of a classifier.
(a) The raw EEG data (b) FFT results

Fig. 4 Results of different feature extraction methods for the benchmark EEG data [7] (5 classes)

Fig. 5 Results of different feature extraction methods for the BCI competition II data Ia [8] (2 classes)
Fig. 6

Table 1 Mental tasks in a benchmark database [7].

Table 2 EEG data used in the experiments.