An Adaptive EEG Feature Extraction Method Based on Stacked Denoising Autoencoder for Mental Fatigue Connectivity

Mental fatigue is a common psychobiological state elicited by prolonged cognitive activity. Although the manifestations and drawbacks of mental fatigue are well known, its connectivity among multiple brain areas has not yet been thoroughly studied. Such connectivity is important for clarifying the mechanism of mental fatigue. However, the common methods of EEG-based connectivity analysis cannot get rid of the interference from strong noise. In this paper, an adaptive feature extraction model based on the stacked denoising autoencoder is proposed. The signal-to-noise ratio of the extracted feature is analyzed. Compared with principal component analysis, the proposed method can significantly improve the signal-to-noise ratio and suppress the noise interference. The proposed method is applied to the analysis of mental fatigue connectivity. The causal connectivity among the frontal, motor, parietal, and visual areas under the awake, fatigue, and sleep deprivation conditions is analyzed, and different patterns of connectivity between conditions are revealed. The connectivity directions under the awake condition and the sleep deprivation condition are opposite. Moreover, under the fatigue condition there is a complex, bidirectional connectivity relationship, from the anterior areas to the posterior areas and from the posterior areas to the anterior areas. These results imply that there are different brain patterns under the three conditions. This study provides an effective method for EEG analysis. It may help to disclose the underlying mechanism of mental fatigue by connectivity analysis.


Introduction
Mental fatigue is a key factor that threatens traffic safety, and it is very common in daily life. Mental fatigue is defined as difficulty in initiating or maintaining voluntary activity [1]. It can lead to a decline in alertness and vigor, accompanied by tiredness, drowsiness, and difficulty in concentrating. These manifestations are very dangerous for drivers after a long period of driving. Reports have shown that 16% of traffic accidents are related to the mental fatigue of drivers [2].
Recently, many researchers have studied the effects of mental fatigue [3][4][5], mental fatigue classification [6,7], and fatigue countermeasures [8,9]. From a biological point of view, mental fatigue is related to a reduction in neuron energy and a decrease in glutamate transmission [10]. Fatigue is also a comprehensive representation that includes both physiological and psychological elements [11]. A study of the fatigue mechanism elicited by auditory stimuli reports that mental fatigue is associated with changes of brain activation in the dorsal pathway [12]. It should be noted that the dorsal pathway has a strong relationship with attention. Mental fatigue has also been shown to be related to the cognitive task, but not to be restricted to the performance of the stimulus-related brain areas [1]. Therefore, exploring the connectivity among multiple brain areas is favorable for illuminating the mechanism of mental fatigue.
Many kinds of measurements, such as face gestures [13] and neural signals [14], have been used to study mental fatigue. The electroencephalogram (EEG), as a direct and noninvasive measurement of brain neuron activity, has been regarded as one of the most applicable and reliable manifestations of mental fatigue [15]. EEG primarily represents the excitatory and inhibitory postsynaptic potentials of a mass of neurons and offers a high temporal resolution [16]. EEG signals can be divided into delta (0.5 Hz-4 Hz), theta (4 Hz-7 Hz), alpha (8 Hz-13 Hz), beta (13 Hz-30 Hz), and gamma (30 Hz-80 Hz) activities, etc. Among these activities, the delta and theta activities have been proven to be related to the fatigue condition [17]. These two activities have been used for the analysis of visual attention level [18], mental fatigue evaluation [19,20], and fatigue prediction [21]. Methods such as correlation analysis [22] and the small-world network algorithm [23] are the common ways to analyze brain connectivity. However, due to the attenuation of the brain structure, EEG manifests a low signal-to-noise ratio (SNR) and a low spatial resolution. Noise may result in false connections between network nodes during connectivity analysis and thus blur the true connectivity relationship. Therefore, the effectiveness of these methods is limited. The brain source localization (BSL) algorithm [24] improves the spatial resolution and SNR by reconstructing the brain activity at the source level. However, the low SNR of EEG still affects the solution of the ill-posed inverse problem of BSL. The power spectral density (PSD) of delta and theta activities has been widely employed for the quantitative analysis of fatigue [25,26]. However, the low SNR of EEG limits its application to connectivity analysis. Hence, a feature extraction algorithm is urgently needed for the analysis of mental fatigue.
Murata and Uetake applied event-related potentials and principal component analysis to extract the main fatigue feature from the blurred EEG [27]. Principal component analysis (PCA) is a common dimensionality reduction algorithm for improving SNR. It applies a linear transformation to obtain a set of linearly uncorrelated components and thereby extracts the principal feature. However, this algorithm may be limited when processing nonlinear EEG. The autoencoder is a novel nonlinear dimensionality reduction method. The stacked denoising autoencoder (SDAE) is a feedforward neural network which consists of multiple autoencoders. Its input signals are corrupted by noise. The output of the hidden layer of the SDAE, which is restrained to be a narrow bottleneck, can be considered as a reconstruction of the original clean input signals. In this paper, in order to suppress the interference from noise, to decrease false connections between multiple brain areas, and then to explore the underlying mechanism of mental fatigue, a novel model establishment method based on SDAE is proposed. This model is applied to extract features of mental fatigue under different fatigue conditions. Causal analysis of the extracted feature is applied to explore the connectivity among multiple brain areas. This work provides a novel way to quantitatively analyze mental states. It may also help to reveal the underlying synergistic effect between multiple brain areas.

Dataset
The mental fatigue experiment includes fifteen subjects. The average age is 23.5 with a deviation of 1.37. All subjects are from Northeastern University and have no visual impairment or disease of the central nervous system. All subjects give their informed consent for inclusion before they participate in the study. The study is conducted in accordance with the Declaration of Helsinki, and the protocol is approved by the ethical review board of Northeastern University. The EEG data are recorded using a g.HIamp system (g.tec Inc., Austria) with a sampling rate of 1200 Hz from 126 active Ag/AgCl electrodes placed according to the 10-5 electrode location system [28]. A unilateral earlobe is chosen as the reference, and the frontal position (Fpz) is adopted as the ground. To study the connectivity of multiple brain areas, the electrode distribution is subdivided into four areas, the frontal area (area 1), the motor area (area 2), the parietal area (area 3), and the visual area (area 4), according to the major functional division of the Brodmann areas, as shown in Figure 1.
The data are bandpass-filtered between 0.5 Hz and 100 Hz and notch-filtered from 48 Hz to 52 Hz to suppress noise. During the whole experiment process, all electrode impedances are kept below 30 kΩ.
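The two filtering steps can be illustrated with a minimal sketch. It uses plain FFT-domain masking, which is swapped in here for simplicity and is not necessarily the filter design used during recording; the function name `fft_bandpass_notch` and the synthetic test signal are hypothetical, and only the band edges (0.5 Hz to 100 Hz, 48 Hz to 52 Hz) and the 1200 Hz sampling rate come from the text.

```python
import numpy as np

def fft_bandpass_notch(x, fs, band=(0.5, 100.0), notch=(48.0, 52.0)):
    """Zero FFT bins outside `band` and inside `notch` (illustrative stand-in)."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    spectrum = np.fft.rfft(x)
    keep = (freqs >= band[0]) & (freqs <= band[1])          # passband
    keep &= ~((freqs >= notch[0]) & (freqs <= notch[1]))    # remove mains band
    return np.fft.irfft(spectrum * keep, n=x.size)

fs = 1200.0                       # sampling rate stated in the text
t = np.arange(int(fs)) / fs       # one second of synthetic data
wanted = np.sin(2 * np.pi * 10 * t)             # 10 Hz component to keep
mains = 0.5 * np.sin(2 * np.pi * 50 * t)        # 50 Hz interference
clean = fft_bandpass_notch(wanted + mains, fs)
```

In this toy case the 50 Hz component falls inside the notch and is removed, while the 10 Hz component passes unchanged. A practical pipeline would more likely use a causal or zero-phase IIR/FIR design to avoid the edge effects of hard spectral masking.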
Each subject is seated in an armchair in a dark, electromagnetically shielded laboratory. The computer screen on the desk is about 1 meter away from the tip of the subject's nose. To suppress eyeball movement, subjects are instructed to focus on the screen center and to reduce body movements. To record the EEG under the awake condition, experiments are carried out at about 9 am. The data collection on each subject lasts for about one minute. Afterwards, a consecutive P300 training session which continues for at least an hour is executed. Subjects are required to concentrate on the computer screen center and to type words from an English passage with the P300 system. The rows and columns of the P300 system light up randomly. The subjects focus on the character that they should type. Then, about one minute of EEG data is recorded as the fatigue condition. After that, subjects need to stay awake for a whole night. They are not allowed to take part in any entertainment activities. On the following day, about one minute of EEG is recorded at about 8 am from the same subjects as the sleep deprivation condition.

Methodology
The flowchart of the EEG processing method is illustrated in Figure 2. Before data analysis, to reduce the influence of volume conduction and to improve SNR and spatial resolution, the surface Laplacian algorithm is applied to the EEG recordings. It is formulated as Equation (1).
where V_C is the signal after the surface Laplacian filter and V_CO is the original signal. V_1, V_2, V_3, and V_4 are the signals around the original signal. The locations of V_1, V_2, V_3, and V_4 are symmetrical in pairs, and the center of symmetry is the location of V_CO. The angle between two adjacent locations among V_1, V_2, V_3, and V_4 is 90 degrees.
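As an illustration of the geometry described above, the following minimal sketch assumes the common equal-weight four-neighbor form of the surface Laplacian, in which the mean of the four surrounding signals is subtracted from the center signal. This weighting is an assumption and may differ in detail from Equation (1); the function name and the synthetic signals are hypothetical.

```python
import numpy as np

def surface_laplacian(v_center, v_neighbors):
    """Center signal minus the mean of its four neighbors (assumed 1/4 weights).

    v_center:    array of shape (n_samples,)
    v_neighbors: array of shape (4, n_samples), the signals V_1..V_4
    """
    v_center = np.asarray(v_center, dtype=float)
    v_neighbors = np.asarray(v_neighbors, dtype=float)
    return v_center - v_neighbors.mean(axis=0)

# Synthetic example: activity shared by all five electrodes (volume conduction)
# cancels out, and only the activity local to the center electrode remains.
t = np.linspace(0.0, 1.0, 200)
shared = np.sin(2 * np.pi * 2 * t)
local = 0.3 * np.sin(2 * np.pi * 10 * t)
v_co = shared + local
neighbors = np.tile(shared, (4, 1))
v_c = surface_laplacian(v_co, neighbors)
```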
The Stacked Denoising Autoencoder.
The stacked autoencoder is an artificial neural network architecture comprising multiple autoencoders and trained by greedy layer-wise training. Each autoencoder includes an input layer, a middle layer, and an output layer. The output of the middle layer acts as the input of the next autoencoder in the stacked autoencoder. The SDAE is an extension of the stacked autoencoder in which the input signals are corrupted by noise. To decode and recover the blurred original input EEG X = [x^(1), x^(2), ⋯, x^(c)] from noise, a brief SDAE model with two autoencoders is applied in this study, where c is the channel number of the input signal. The corrupted signals in the input layer are X̃ = [x̃^(1), x̃^(2), ⋯, x̃^(c)]. These corrupted input signals are mapped to a hidden layer with n units by the sigmoid function as Equation (2) [29].
where Y_1 is the signal on the middle layer of the first autoencoder; W, b, and f are the weight matrix, bias, and activation function of the encoder of the first autoencoder, respectively.
Figure 2: The flowchart of the EEG processing method (raw data, Laplacian filtering, area division, input feature X corrupted by noise, the first and second autoencoders with the node number constraint and the constraint function, output feature, Granger causality analysis).
The weight and bias matrices are randomly assigned at the initialization stage. The uncorrupted input Z = [z^(1), z^(2), ⋯, z^(c)], the estimation of X, can be reconstructed by the decoder of the first autoencoder as Equation (4).
where W′, b′, and g are the weight matrix, bias, and nonlinear function of the decoder of the first autoencoder, respectively.
To minimize the average reconstruction error, the squared error loss function L is applied during the training of the structure parameters θ = (W, b) and θ′ = (W′, b′) of the SDAE model. The optimization of these parameters is expressed in Equation (5).
After the optimization, the first autoencoder is established. The middle layer output of the first autoencoder is regarded as the input of the next autoencoder to further train the model of the second autoencoder. The output of the middle layer of the second autoencoder, Y_2 = [y_2^(1), y_2^(2), ⋯, y_2^(m)], is considered as the deeply denoised feature extracted from the original input signals X.
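The greedy layer-wise procedure described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: additive Gaussian corruption, a sigmoid decoder, plain full-batch gradient descent, inputs scaled into [0, 1], and all hyperparameters (`noise_std`, `lr`, `epochs`, the layer sizes 8 and 4) are hypothetical choices, as is the synthetic data.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_denoising_autoencoder(X, n_hidden, noise_std=0.1, lr=0.5,
                                epochs=200, seed=0):
    """Train one denoising autoencoder on X with shape (n_samples, n_inputs)."""
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = X.shape
    W = rng.normal(0.0, 0.1, (n_inputs, n_hidden))      # encoder weights
    b = np.zeros(n_hidden)                              # encoder bias
    W_dec = rng.normal(0.0, 0.1, (n_hidden, n_inputs))  # decoder weights
    b_dec = np.zeros(n_inputs)                          # decoder bias
    losses = []
    for _ in range(epochs):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the input
        Y = sigmoid(X_noisy @ W + b)                       # middle layer
        Z = sigmoid(Y @ W_dec + b_dec)                     # reconstruction
        err = Z - X                                        # compare with clean X
        losses.append(np.mean(np.sum(err ** 2, axis=1)))   # squared error loss
        # Backpropagate the mean squared reconstruction error.
        dZ = 2.0 * err * Z * (1.0 - Z) / n_samples
        dY = (dZ @ W_dec.T) * Y * (1.0 - Y)
        W_dec -= lr * (Y.T @ dZ)
        b_dec -= lr * dZ.sum(axis=0)
        W -= lr * (X_noisy.T @ dY)
        b -= lr * dY.sum(axis=0)
    return (W, b, W_dec, b_dec), losses

def encode(X, params):
    """Middle-layer output of a trained autoencoder for clean input X."""
    W, b = params[0], params[1]
    return sigmoid(X @ W + b)

# Synthetic stand-in for c = 15 channels driven by two latent rhythms.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 500)
sources = np.stack([np.sin(2 * np.pi * 3 * t), np.cos(2 * np.pi * 5 * t)], axis=1)
X = np.clip(0.3 + 0.05 * (sources @ rng.normal(size=(2, 15))), 0.0, 1.0)

params1, loss1 = train_denoising_autoencoder(X, n_hidden=8)   # n = 8 (assumed)
Y1 = encode(X, params1)
params2, loss2 = train_denoising_autoencoder(Y1, n_hidden=4)  # m = 4 (assumed)
Y2 = encode(Y1, params2)   # feature from the second middle layer
```

Each autoencoder is trained on its own and then frozen, so the second one only ever sees the middle-layer output of the first, which is what makes the training greedy and layer-wise.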

Model Selection.
In this study, to extract EEG features with a high signal-to-noise ratio under the three conditions, the node numbers of the middle layers of the two autoencoders should be constrained. n and m are the node numbers of the middle layers of the first and second autoencoders, respectively. For dimensionality reduction, n and m should be less than the number of nodes in the corresponding input layer, so that the middle layer can be regarded as a dimensionality reduction of the input signals. The proposed constraints on n and m are formulated as Equation (6). The constraint function is formulated as Equation (7). A higher c_f indicates a better balance between model error and feature extraction performance and is therefore the better selection.
where U_def, C_def, and C_com are positive integers. C_def is smaller than the number of input channels. ŷ_1 = [ŷ_1^(1), ŷ_1^(2), ⋯, ŷ_1^(n)] is the output of the second autoencoder. c_STFT(i, j) denotes the short-time Fourier transform coefficients. q is the number of time samples. d is the number of frequency samples in the concerned range.
In this study, let U_def = 30, C_def = 15, C_com = 5, and λ = 0.2. For the awake, fatigue, and sleep deprivation conditions, the main features of EEG lie mostly under 20 Hz. Therefore, in this study, d is the number of frequency samples under 20 Hz. The signals from the different areas are processed separately by the SDAE model above. The average of the output features is used as the feature extracted from the input layer.
The results of c_f under the awake condition are illustrated in Figure 3. The pair of m and n that obtains the higher c_f should be selected as the node numbers. Taking the three conditions into consideration, the fine-tuned parameters for training the SDAE model on the experimental data are listed in Table 1.
To study the features extracted by the SDAE model, the features extracted by the first and second autoencoders and the original signal are analyzed by short-time Fourier transform. The time-frequency images of the average original signal, the feature extracted by the first autoencoder, and the feature extracted by the second autoencoder are shown in Figure 4. Figure 4 indicates that the brain activities with high amplitude are highlighted by the second autoencoder.
The short-time Fourier transform coefficients of the extracted features and the original signal are normalized as Equation (8). The average c_STFT under 20 Hz is analyzed by t-test. Statistical results show that the features extracted by the second autoencoder obtain significantly greater coefficients than the original signal (P < 0.001) and the features extracted by the first autoencoder (P < 0.001). Therefore, the output of the proposed model can significantly highlight EEG activity with high amplitude under 20 Hz.
To evaluate the performance of the proposed model selection method on EEG feature extraction, the PCA algorithm is applied for comparison. Figure 5 illustrates the power spectra of the original signal averaged across channels, the feature extracted by PCA, and the feature extracted by SDAE from the four areas under the awake condition. Figures 6 and 7 illustrate the fatigue condition and the sleep deprivation condition, respectively. A previous study indicates that alpha and beta frequencies dominate common human brain activity during the awake condition [30], and delta and theta activities have been proven to reflect mental fatigue. In Figure 5, under the awake condition, the original signal is polluted by the low-frequency interference which is common in EEG, and the alpha and beta frequencies are not obvious. The proposed model can extract the alpha frequency activity from the blurred signal and suppress the interference from other frequencies, while PCA retains the low-frequency interference in the extracted feature. Under the fatigue and sleep deprivation conditions, the delta and theta activities dominate the brain activity with a noticeable first primary frequency, as shown in Figures 6 and 7. It is hypothesized that the first primary frequency contains important information about the brain activity. The ratio of the power at the first primary frequency to the average power (RPFA) over the concerned frequencies (awake: alpha and beta; mental fatigue: delta and theta) is analyzed by t-test and listed in Table 2. A higher RPFA indicates a greater signal-to-noise ratio. Statistical results show that the proposed model obtains a significantly greater RPFA than the original signal (P < 0.01) and PCA (P < 0.05). These results indicate that the proposed model performs better than PCA in extracting features from the blurred original EEG data, highlighting the first primary frequency and improving SNR.
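The RPFA measure can be sketched as follows. The sketch assumes that the first primary frequency is the frequency bin with the largest power inside the concerned band and that the average power is the mean over the bins of the same band; the exact definition behind Table 2 may differ. The function name `rpfa`, the band limits, and the synthetic signals are hypothetical.

```python
import numpy as np

def rpfa(x, fs, band):
    """Ratio of the peak power to the mean power within `band` (assumed definition)."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2   # raw periodogram
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    band_power = power[in_band]
    return band_power.max() / band_power.mean()

fs = 1200.0
t = np.arange(int(2 * fs)) / fs                 # two seconds of synthetic data
rng = np.random.default_rng(0)
rhythm = np.sin(2 * np.pi * 6 * t)              # 6 Hz theta-like rhythm
weak_noise = rhythm + 0.1 * rng.normal(size=t.size)
strong_noise = rhythm + 5.0 * rng.normal(size=t.size)

delta_theta = (0.5, 7.0)                        # band limits taken from the text
rpfa_clean = rpfa(weak_noise, fs, delta_theta)
rpfa_noisy = rpfa(strong_noise, fs, delta_theta)
```

With the same underlying rhythm, the heavily contaminated signal spreads more power across the band and therefore yields a lower ratio, which is the sense in which a higher RPFA is read as a higher SNR.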

Granger Causality Analysis.
The features extracted by the proposed model are applied to Granger causality analysis (GCA) to explore the connectivity among multiple brain areas under the awake, fatigue, and sleep deprivation conditions. GCA is a statistical method based on time series forecasting. A causal relationship from one sequence to another means that prior knowledge of the first sequence improves the prediction accuracy of the second. In order to explore the connectivity among multiple brain areas, the multiple vector autoregressive model is employed in this study [31]. Y_2 = [y_2^(1)(t), y_2^(2)(t), ⋯, y_2^(m)(t)] denotes the extracted feature, and m is the number of vectors in Y_2. The mutual prediction model is formulated as Equation (9).
where p is the maximum number of lagged observations; ζ_1, ζ_2, and ζ_m are the prediction errors; C is the coefficient matrix of the multiple vector autoregressive model.
The maximum number of lagged observations p is estimated by the ratio of the Akaike information criterion and the Bayesian information criterion. The noise covariance matrix is shown in Equation (10).
where var(ζ_1r) is the upper-left component of the noise covariance matrix of the restricted model that omits y_2^(2)(t).
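The comparison between the full and the restricted model can be sketched for the simplest case of two series. The sketch assumes the usual log-ratio form, ln(var(ζ_1r) / var(ζ_1)), fits both autoregressive models by ordinary least squares with a fixed lag order p, and uses simulated data; the multivariate conditional form and the lag selection used in the study are not reproduced. All function names are hypothetical.

```python
import numpy as np

def lagged_design(series_list, p):
    """Columns are lags 1..p of each series, aligned with targets at t = p..T-1."""
    T = len(series_list[0])
    cols = [s[p - k:T - k] for s in series_list for k in range(1, p + 1)]
    return np.column_stack(cols)

def residual_variance(target, predictors, p):
    """Variance of the least-squares prediction error of an AR model of order p."""
    X = lagged_design(predictors, p)
    X = np.column_stack([np.ones(len(X)), X])    # intercept term
    y = target[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ coef)

def granger_causality(source, target, p):
    """ln(restricted / full error variance): influence of `source` on `target`."""
    var_restricted = residual_variance(target, [target], p)         # source omitted
    var_full = residual_variance(target, [target, source], p)
    return np.log(var_restricted / var_full)

# Simulated pair in which x drives y with a one-sample delay, but not the reverse.
rng = np.random.default_rng(0)
T = 2000
x = np.zeros(T)
y = np.zeros(T)
for i in range(1, T):
    x[i] = 0.5 * x[i - 1] + rng.normal()
    y[i] = 0.5 * y[i - 1] + 0.8 * x[i - 1] + rng.normal()

gc_x_to_y = granger_causality(x, y, p=2)
gc_y_to_x = granger_causality(y, x, p=2)
```

A clearly positive value in one direction together with a value near zero in the other corresponds to a unidirectional causal flow; in practice the significance of each value is assessed with a statistical test rather than by inspection.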

Results
To study the connectivity between multiple brain areas, the features extracted by the proposed model and by PCA from area 1, area 2, area 3, and area 4 are analyzed by Granger causality analysis. The connectivity between areas is illustrated in Figures 8 and 9, respectively. In Figures 8 and 9, a, b, and c represent the awake condition, the fatigue condition, and the sleep deprivation condition, respectively. The connections shown between areas are significant (P < 0.01).

Discussion
The SDAE is a novel feature extraction method. In this study, the proposed model based on the SDAE algorithm is applied to the analysis of EEG data about mental fatigue. Figures 5-7 indicate that the proposed model performs well on feature extraction under the three conditions. It should be noted that the concerned frequency range differs among the three conditions. To study this, the short-time Fourier transform results of the original signal and of the features extracted by the first and second autoencoders are analyzed. They indicate that the proposed model is sensitive to brain activities with higher amplitude. Compared with the other conditions, the awake condition shows higher brain activity in the mu and beta rhythms. The first autoencoder may focus on the contrast of light and shade. The second autoencoder may focus on amplitude differences. Therefore, in the awake condition, the information with high amplitude in the mu and beta rhythms is extracted and highlighted by the second autoencoder. Similarly, in the fatigue and sleep deprivation conditions, the information with high amplitude in the delta and theta rhythms is extracted and highlighted. Therefore, the proposed model is an efficient and adaptive method for the analysis of EEG data about mental fatigue. Figure 9 shows more bidirectional connections between areas than Figure 8, and most of the connections in Figure 8 also appear in Figure 9. These results demonstrate the capability of the proposed model to extract the main features from the blurred EEG, avoiding false connections and improving SNR compared with PCA. In Figure 8, the connectivity based on the features extracted by the proposed model under the awake condition presents a significant connection from area 1 to its posterior areas in a vertical view.
The connectivity under the fatigue condition reveals a complex trajectory, from area 1 to its posterior areas and from the posterior areas to the anterior areas. For the connectivity under the sleep deprivation condition, there is a causal flow from area 4 to its anterior areas. Different mental states thus show different connectivity patterns. In a paired connection, the starting node contains important information that can be used to forecast the information of the ending node. Therefore, the connection relationship may reflect the process of information transmission in the brain. The frontal area dominates attention [32]. It has been shown to be more activated as task complexity increases [33]. In this study, the EEG connectivity results under the awake condition indicate that area 1 plays an important role and may dominate brain activity. However, the awake condition does not contain any external task demanding mental concentration. Therefore, the awake condition may be not just an idling state, but an internal state requiring highly concentrated attention. Dimitrakopoulos et al. reported a causal flow from the anterior areas to the posterior areas and the reverse under a one-hour simulated driving task and a half-hour sustained attention task [34]. In this study, a complex bidirectional causal flow is uncovered under the fatigue condition. Compared with the obvious unidirectional flow under the awake and sleep deprivation conditions, the complex connectivity between multiple brain areas under the fatigue condition suggests that there may be a synergy or cross influence among brain areas after a long, attention-demanding task. Under the sleep deprivation condition, there is a causal flow from area 4 to its anterior areas.
Kar and Routray report that there are strong connections between electrodes over the visual area during sleep deprivation [35]. Sleep deprivation has been shown to slow visual processing and to compromise the ability to process visual stimuli [36]. Therefore, the connectivity under the sleep deprivation condition indicates that area 4 dominates the mental state, with visual processing suppressed. This suppression may affect the other areas of the brain.

Conclusions
Fatigue is a common phenomenon during the performance of cognitive tasks. In this study, to overcome the influence of noise and to study the underlying mechanism of fatigue, a model establishment method based on SDAE is proposed. The proposed model is applied to extract EEG features. The results indicate that the proposed method can significantly improve the SNR of the extracted feature. The causal connectivity of the extracted feature between multiple brain areas under the awake, fatigue, and sleep deprivation conditions is studied, and different directions of causal flow are revealed. The causal flows under the awake condition and the sleep deprivation condition are unidirectional but opposite. The connectivity under the fatigue condition exhibits the most complex trajectory between areas. It reveals a bidirectional causal flow, from the anterior areas to the posterior areas and from the posterior areas to the anterior areas. These results suggest that each condition has its own underlying pattern of synergy between brain areas. This work provides a novel way to quantitatively analyze mental states. It may help to disclose the underlying mechanism of mental fatigue.

Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.