Optimal Number of Electrode Selection for EEG Based Emotion Recognition using Linear Formulation of Differential Entropy

Anxiety, nervousness and stress are daily challenges for mankind. These challenges are particularly severe for students between 14 and 25 years of age. It is therefore very important to develop a simple, low-cost, accurate and handy process that helps society gauge anxiety levels and take the necessary corrective actions to avoid health and psychological issues. Regular checks on changes in behavior, together with correct emotion analysis, are essential for taking such corrective action. This article elaborates a feature extraction method called “linear formulation of differential entropy”. With this method, the number of electroencephalography (EEG) channels needed for emotion detection is significantly reduced. The work presents a new approach in neuroscience: it shows that single-channel EEG contains sufficient information for emotion recognition. The performance of the proposed technique is evaluated on the “Database for Emotion Analysis using Physiological Signals” (DEAP) benchmark database using the single channel FP1, the prefrontal channels [FP1, AF3, FP2, AF4], and all 32 channels. A bidirectional long short-term memory (BiLSTM) network is used as the classifier. The results show that the accuracy achieved with the proposed method on a single channel (FP1) is almost equivalent to the accuracy obtained with 32 channels.

Day-to-day challenges, successes, failures, and personal and professional relationships have a huge impact on emotions. Small deviations from routine activities have either a positive or a negative impact on the human mindset. Such deviations generate positive, negative or neutral emotions. Every emotion has an impact, which can be noticed through body language, voice modulation and facial changes [7]. Human beings can hide such visible effects, so a scientific, economical and easy way to recognise emotions is needed in order to provide the necessary support to the subject for his or her own behavioural development. Such support is extremely useful for handling pressure arising from fear, competition and stressful situations. Scientific research on this subject is helpful for analysing human mental health.
In this research work, a portable and wearable single-channel device is proposed. This approach differs from the conventional practice of using bulky multichannel EEG devices.
Zhong Min et al. explained the significance of using fewer electrodes for EEG-based emotion detection and used normalized mutual information (NMI) to enhance accuracy [15]. W. Zheng proposed group sparse canonical correlation analysis for EEG-based emotion recognition and addressed the importance of reducing the number of channels to lower complexity [20]. K. Ansari Asl et al. highlighted the problem of the large number of electrodes in EEG-based emotion detection and used the synchronization likelihood method for emotion detection [1]. J. Zhang et al. proposed a method known as Relief to analyze emotions [17]. They identified issues with using a massive set of EEG electrodes and the complexity of handling the data in arithmetic computations. They also highlighted the great inconvenience to the user caused by the time-consuming process of recording various signals from a bulky multichannel device. V. Gupta et al. worked on an EEG-based emotion recognition method called the flexible analytic wavelet transform (FAWT), and their findings were in line with the research work referred to above [4].
While the studies above discuss the use of fewer channels for better accuracy, various researchers have also suggested a number of feature extraction and classification methods. A few of them are summarised here. Ruo Nan Duan et al. suggested a feature extraction method known as differential entropy (DE). DE is further divided into the differential asymmetry (DASM) and rational asymmetry (RASM) methods. Their conclusions suggested that the differential entropy method generates much better results than the traditional energy spectrum method [3]. L.C. Shi et al. worked further on the differential entropy method and proved that differential entropy is equal to the logarithm of the power spectrum [12]. D.W. Chen et al. presented a fusion of differential entropy with linear discriminant analysis (LDA) and explained how to reduce time complexity using the DE method [2]. W.L. Zheng et al. proposed a deep neural network method in which precise frequency bands were selected [18].
Deep learning is an advanced technique used in machine learning, and different deep learning architectures have been introduced in the literature. Models with deep architectures attain excellent results that are better than those of older methods [8, 9, 16, 19]. Figure 1 presents the workflow. There are four stages, viz. database collection, band separation, feature extraction and emotion classification.

Collection of the EEG signal database
The established DEAP benchmark [6], an open database, has been used for the analysis. The database was generated through an experiment on 32 subjects using 40 channels, of which 32 were EEG channels and the remaining were peripheral channels. Each subject was asked to watch 40 one-minute clips. The recording followed the 10-20 system with a sampling frequency of 512 Hz, which was then downsampled to 128 Hz. The EEG signals were further band-pass filtered to 4-45 Hz.
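The three electrode configurations compared in this work (single channel FP1, prefrontal channels, all 32 channels) can be sketched as a simple channel-selection step. This is a minimal illustration and not the authors' code: the array layout (trials x channels x samples), the position of the EEG channels in the first 32 rows, and the channel indices in `CHANNEL_INDEX` are assumptions about the preprocessed recordings and should be checked against the dataset documentation.

```python
import numpy as np

# ASSUMPTION: indices of the prefrontal electrodes in the preprocessed
# recordings; verify against the dataset documentation before use.
CHANNEL_INDEX = {"FP1": 0, "AF3": 1, "FP2": 16, "AF4": 17}

# ASSUMPTION: the first 32 of the 40 recorded channels are the EEG channels.
N_EEG = 32

# The three electrode configurations compared in this paper.
CONFIGURATIONS = {
    "single": ["FP1"],
    "prefrontal": ["FP1", "AF3", "FP2", "AF4"],
    "all32": None,  # None means: keep every EEG channel
}


def select_channels(trials, configuration):
    """Return the EEG channels belonging to one configuration.

    trials: array of shape (n_trials, n_channels, n_samples) (assumed layout).
    """
    names = CONFIGURATIONS[configuration]
    if names is None:
        return trials[:, :N_EEG, :]
    idx = [CHANNEL_INDEX[name] for name in names]
    return trials[:, idx, :]


# Synthetic stand-in for one subject: 40 trials x 40 channels x 2 s at 128 Hz.
demo = np.zeros((40, 40, 256))
for name in CONFIGURATIONS:
    print(name, select_channels(demo, name).shape)
```

Keeping the configuration as a named lookup makes it easy to run the same feature extraction and classifier on each electrode subset.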

Feature Extraction Method
The feature extraction process plays an important role in developing an EEG-based brain-computer interface (BCI) system. The available literature reports different feature extraction methods, e.g. Hjorth parameters, wavelet entropy, mean, variance, kurtosis, power spectral density (PSD) and differential entropy.
In this research paper, features are extracted instead of using raw EEG signals. This removes the disadvantage of handling incredibly large amounts of data and the associated memory requirement. Three types of features are used in the analysis model: Hjorth parameters, differential entropy and linear formulation of differential entropy. The features are extracted in each of the frequency ranges, i.e. theta, alpha, beta and gamma. As Figure 2 shows, the EEG signals are consistent with the frequency ranges theta (θ): 4-8 Hz, alpha (α): 8-14 Hz, beta (β): 14-30 Hz, and gamma (γ): 30-44 Hz.

Hjorth Parameters [5]
Hjorth parameters are used in signal processing for time-domain analysis. These parameters were described by B. Hjorth in 1970. There are three parameters: Activity, Mobility and Complexity.

Activity
The activity parameter measures the signal power, i.e. the variance of a time function. For a signal k(t), it is computed using the following equation [10]:

$$\mathrm{Activity} = \operatorname{var}\bigl(k(t)\bigr) \qquad \ldots (1)$$

Mobility
The mobility parameter indicates the mean frequency of the power spectrum.
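Mobility is the square root of the ratio of the variance of the first derivative of the signal k(t) to the variance of the signal k(t) [10]:

$$\mathrm{Mobility} = \sqrt{\frac{\operatorname{var}\bigl(k'(t)\bigr)}{\operatorname{var}\bigl(k(t)\bigr)}} \qquad \ldots (2)$$

Complexity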
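The complexity parameter represents the change in frequency. It is given by the following equation [10]:

$$\mathrm{Complexity} = \frac{\mathrm{Mobility}\bigl(k'(t)\bigr)}{\mathrm{Mobility}\bigl(k(t)\bigr)} \qquad \ldots (3)$$

Hjorth parameters are helpful for analyzing signals in the time domain.

Differential entropy [3]
Differential entropy is used to measure the complexity of a continuous random variable. For a normal distribution N(µ, σ²), the differential entropy can be described as:

$$h(z) = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(z-\mu)^{2}}{2\sigma^{2}}} \log\!\left(\frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(z-\mu)^{2}}{2\sigma^{2}}}\right) dz = \frac{1}{2}\log\left(2\pi e \sigma^{2}\right) \qquad \ldots (4)$$

where z is a random variable, and π and e are constants. In a particular band, for a fixed-length EEG signal, DE is equivalent to the logarithmic spectral energy [12].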
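The Hjorth parameters and the Gaussian differential entropy described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: derivatives are approximated by finite differences, the input to `differential_entropy` is assumed to be already band-filtered, and the function names and the 10 Hz test signal are hypothetical.

```python
import numpy as np


def hjorth_parameters(k, fs):
    """Activity, Mobility and Complexity of a sampled signal k(t).

    Derivatives are approximated by finite differences scaled by the
    sampling rate fs, so Mobility is expressed in rad/s.
    """
    k = np.asarray(k, dtype=float)
    dk = np.diff(k) * fs    # estimate of the first derivative k'(t)
    ddk = np.diff(dk) * fs  # estimate of the second derivative k''(t)
    activity = np.var(k)
    mobility = np.sqrt(np.var(dk) / activity)
    # Complexity = Mobility(k') / Mobility(k)
    complexity = np.sqrt(np.var(ddk) / np.var(dk)) / mobility
    return activity, mobility, complexity


def differential_entropy(k):
    """DE of a signal under the Gaussian assumption: 0.5 * log(2*pi*e*var)."""
    return 0.5 * np.log(2.0 * np.pi * np.e * np.var(k))


# Hypothetical test signal: 4 s of a 10 Hz (alpha-band) sine sampled at 128 Hz.
fs = 128
t = np.arange(0, 4, 1 / fs)
k = np.sin(2 * np.pi * 10 * t)

activity, mobility, complexity = hjorth_parameters(k, fs)
print("activity:", activity)
print("mobility (Hz):", mobility / (2 * np.pi))
print("complexity:", complexity)
print("differential entropy:", differential_entropy(k))
```

For a pure sine wave the mobility recovers approximately the sine frequency and the complexity is close to 1, which is a convenient sanity check before applying the functions to EEG segments.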

Linear Formulation of Differential entropy (LF-DE) using sigmoid functions for fourth order spectral moment of EEG signal
The main purpose of this work is to propose a novel feature for EEG signal analysis that describes the EEG trace in terms of slope spread. The fourth-order spectral moment is used because it captures the excess detail of the trace relative to a sine wave, i.e. the "softest" possible curve.

Fig. 6. EEG based Emotion Detection Using Prefrontal Channels
Eq. 10 shows the linear formulation of differential entropy using a sigmoid function for the fourth-order spectral moment of the EEG signal. The fourth-order spectral moment effectively detects nonlinearity and non-Gaussianity in the EEG signal. Linear formulations are also more easily merged across the frequency and time domains [5].
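To make the notion of a fourth-order spectral moment concrete, the sketch below estimates the n-th spectral moment of a sampled signal from its FFT, optionally restricted to one frequency band. It illustrates only the spectral moment itself; the sigmoid-based linear formulation of Eq. 10 is not reproduced here. The function name, the one-sided power normalisation and the two-tone test signal are assumptions made for the illustration. In Hjorth's framework the zeroth, second and fourth moments correspond to the variances of the signal, its first derivative and its second derivative.

```python
import numpy as np

# Frequency ranges used in this paper (Hz).
BANDS = {"theta": (4, 8), "alpha": (8, 14), "beta": (14, 30), "gamma": (30, 44)}


def spectral_moment(x, fs, order, band=None):
    """n-th spectral moment m_n = sum_k (2*pi*f_k)**n * P_k.

    P_k is the one-sided power per FFT bin, so m_0 equals the variance
    of x. band is an optional (low_hz, high_hz) pair limiting the sum.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x - x.mean())
    power = np.abs(spectrum) ** 2 / n**2
    # Double every bin that has a mirrored negative-frequency partner
    # (all bins except DC and, for even n, the Nyquist bin).
    if n % 2 == 0:
        power[1:-1] *= 2.0
    else:
        power[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    if band is not None:
        keep = (freqs >= band[0]) & (freqs < band[1])
        power, freqs = power[keep], freqs[keep]
    return float(np.sum((2.0 * np.pi * freqs) ** order * power))


# Hypothetical test signal: a 10 Hz (alpha) tone plus a weaker 20 Hz (beta) tone.
fs = 128
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)

for name, band in BANDS.items():
    m0 = spectral_moment(x, fs, 0, band)
    m4 = spectral_moment(x, fs, 4, band)
    print(f"{name}: m0={m0:.4g}, m4={m4:.4g}")
```

Restricting the sum to a band gives the band energy for order 0, which connects this frequency-domain view to the band-wise differential entropy discussed earlier.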

Classification
In this paper, BiLSTM is used as the classifier to examine the performance of the proposed feature extraction method.

Long-Short Term Memory (LSTM) [11]
Long-Short Term Memory networks (LSTMs) are a particular class of recurrent neural networks (RNNs). They were introduced by Hochreiter and Schmidhuber (Figure 3). Long sequences are difficult for a standard RNN to learn from and cause the vanishing/exploding gradient problem, whereas LSTMs can memorize lengthy sequences. LSTMs can very efficiently extract temporal and spatial features from the EEG signal. Figure 4 depicts the structure of a BiLSTM layer, comprising a forward and a reverse LSTM layer. The LSTM comprises three control units at time t, namely the input gate (i_t), the output gate (o_t) and the forget gate (f_t). A bidirectional LSTM network is used to improve system performance, since it can store more information than a unidirectional network. The proposed design is implemented using BiLSTM.
Bi-LSTMs have two hidden layers, one in the forward and the other in the backward direction [14]. The forward hidden layer processes the input sequence in the same way as a conventional unidirectional LSTM, and the backward hidden layer processes the input sequence in reverse order. The purpose of using backward and forward hidden layers is to observe past and future data and learn the weights accordingly. Hence, a Bi-LSTM can use the past and future context of a given input sequence [14].
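The forward and backward passes described above can be sketched with a plain NumPy forward computation. This is only an illustration of the mechanism (input, forget and output gates, and concatenation of the two directions); it uses untrained random weights, and the layer sizes, parameter layout and function names are hypothetical rather than taken from the network used in this paper.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def lstm_forward(x, W, U, b):
    """Run one LSTM layer over x of shape (T, n_in); return states (T, H).

    W: (4H, n_in), U: (4H, H), b: (4H,), stacked in the order
    [input gate, forget gate, output gate, candidate cell state].
    """
    H = U.shape[1]
    h = np.zeros(H)  # hidden state
    c = np.zeros(H)  # cell state
    states = []
    for x_t in x:
        z = W @ x_t + U @ h + b
        i_t = sigmoid(z[:H])           # input gate
        f_t = sigmoid(z[H:2 * H])      # forget gate
        o_t = sigmoid(z[2 * H:3 * H])  # output gate
        g_t = np.tanh(z[3 * H:])       # candidate cell state
        c = f_t * c + i_t * g_t
        h = o_t * np.tanh(c)
        states.append(h)
    return np.stack(states)


def bilstm_forward(x, fwd_params, bwd_params):
    """Concatenate a forward pass and a time-reversed backward pass."""
    h_fwd = lstm_forward(x, *fwd_params)
    # Process the reversed sequence, then flip back to the original time order.
    h_bwd = lstm_forward(x[::-1], *bwd_params)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)


def init_params(n_in, H, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    return (rng.normal(0, 0.1, (4 * H, n_in)),
            rng.normal(0, 0.1, (4 * H, H)),
            np.zeros(4 * H))


rng = np.random.default_rng(0)
n_in, H, T = 4, 8, 10  # hypothetical: 4 features per step, 8 units, 10 steps
x = rng.normal(size=(T, n_in))
out = bilstm_forward(x, init_params(n_in, H, rng), init_params(n_in, H, rng))
print(out.shape)  # (10, 16)
```

At every time step the first H outputs summarise the past of the sequence and the last H outputs summarise its future, which is the property a classification layer placed on top would exploit.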

RESULTS AND DISCUSSION
From the research work, it is found that there is currently no robust single-channel feature extractor available with performance equivalent to that of a multichannel EEG device. In addition, conventional classifiers do not address the temporal properties of the EEG signal when extracting features. The BiLSTM deep learning algorithm provides a solution to this issue: it can learn both spatial and temporal features and therefore enhances performance and reduces noise. Before arriving at this conclusion, three feature extraction methods were evaluated, namely LF-DE, DE and Hjorth parameters. The findings of the proposed method with the BiLSTM classifier for the single channel, the prefrontal channels and all 32 channels are shown in Figures 5, 6 and 7, respectively. A comparison of the results of the proposed LF-DE feature extraction method with other existing methods is shown in Table 1. The average accuracy achieved using a single channel (FP1) is 71.87%, which is very close to the 32-channel average accuracy of 74%. The same can be seen graphically in Figures 5, 6 and 7.

CONCLUSION
In this paper, a new feature extraction method, "Linear Formulation of Differential Entropy" (LF-DE), is proposed. The results obtained with the proposed method for a single channel are almost equivalent to those of multichannel devices. The proposed work can help society to a great extent: frequent hospital visits can be reduced, and diagnosis can be performed by the individual using a single-channel EEG device with a comparable level of accuracy. Single-channel EEG devices with a dry electrode at FP1 are commercially available at economical prices and are very convenient to use. The traditional approach of using full-channel EEG signals has a major disadvantage: it requires handling excessive and unwanted data, which leads to higher memory requirements and hence costly hardware and firmware. The use of BiLSTM to classify the emotional state helps to reduce the number of EEG channels handled from 32 to 1. The results shown above substantiate that the proposed single-channel method, with the help of BiLSTM, achieves emotion recognition accuracy close to that of the 32-channel configuration.

ACKNOWLEDGEMENT
Not applicable