Communication

Steady-State Visual Evoked Potential Classification Using Complex Valued Convolutional Neural Networks

Department of Computer and Network Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan
* Author to whom correspondence should be addressed.
Sensors 2021, 21(16), 5309; https://doi.org/10.3390/s21165309
Submission received: 28 June 2021 / Revised: 24 July 2021 / Accepted: 3 August 2021 / Published: 6 August 2021
(This article belongs to the Special Issue Biomedical Signal Acquisition and Processing Using Sensors)

Abstract
The steady-state visual evoked potential (SSVEP), a kind of event-related potential in electroencephalograms (EEGs), has been applied to brain–computer interfaces (BCIs). Among the various BCI implementation methods, SSVEP-based BCIs currently achieve the highest information transfer rate (ITR). Canonical correlation analysis (CCA), spectrum estimation methods such as the Fourier transform, and their extensions have been used to extract SSVEP features. However, these feature extraction methods restrict the usable stimulation frequencies and thus limit the number of available commands. In this paper, we propose a complex valued convolutional neural network (CVCNN) to overcome this limitation of SSVEP-based BCIs. The experimental results demonstrate that the proposed method overcomes the limitation on the stimulation frequency and outperforms conventional SSVEP feature extraction methods.

1. Introduction

The brain–computer interface (BCI) or brain–machine interface (BMI) is a direct communication pathway for controlling external devices by discriminating brain signals [1,2]. BCIs have been widely researched and applied in many practical systems [3,4,5,6]. There are various methods for acquiring brain signals for BCIs, such as electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI). In this paper, we focus on EEG-based BCIs because EEG is a noninvasive and simple signal acquisition method for BCIs [7].
There are various ways to issue commands for BCIs. For example, event-related potentials (ERPs) such as P300 [8], sensorimotor rhythms (SMRs) such as μ -rhythm [9,10], the steady-state visual evoked potential (SSVEP), and auditory or tactile evoked responses [11,12] have been used for BCIs. A recent review of BCI technologies and applications appeared in [13].
SSVEP is a response to periodic flicker stimulation with a frequency of 5–30 Hz that is usually presented by an LED or LCD [14,15]. The response EEG has peak frequencies at the fundamental and harmonic frequencies of the flicker stimulation. An SSVEP-based BCI is realized by assigning commands or letters to flickers with different frequencies [16,17]. The BCI can determine the intended command of the user by detecting the peak frequency in the observed EEG. SSVEP-based BCIs achieve the highest information transfer rate (ITR) among state-of-the-art BCI implementations [16,17,18]. There are many extensions and hybrid designs for SSVEP-based BCIs. For example, [19,20] extended the SSVEP-based BCI by utilizing the phase of the stimulus in addition to frequency information.
The command of an SSVEP-based BCI is determined by a frequency analysis of the observed EEG. In the early days, Fourier analysis or spectrum estimation methods were used [21,22]. Later, canonical correlation analysis (CCA)-based approaches were developed [23]. There are various extensions of CCA-based methods for determining the command of SSVEP-based BCIs. Multi-way CCA (mwayCCA) was developed for tensor EEG data [24]. Phase-constrained CCA (p-CCA) was proposed for phase-modulated SSVEP BCIs [25]. Filter bank canonical correlation analysis (FBCCA) decomposes EEGs using a sub-band filter bank and then applies CCA to each sub-band signal [26]. Individual template-based CCA (IT-CCA) combines the standard CCA with the canonical correlation between individual templates and the test data [27].
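For concreteness, the following is a minimal sketch of the standard CCA approach to SSVEP frequency recognition [23]: the EEG segment is correlated with sine–cosine reference signals at each candidate frequency and its harmonics, and the frequency with the largest canonical correlation is selected. The function names, the use of scikit-learn's CCA, and the number of harmonics are illustrative assumptions, not the exact implementation used in this paper.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_reference(freq, fs, n_samples, n_harmonics=2):
    """Sine/cosine reference signals at the stimulus frequency and its harmonics."""
    t = np.arange(n_samples) / fs
    refs = []
    for h in range(1, n_harmonics + 1):
        refs.append(np.sin(2 * np.pi * h * freq * t))
        refs.append(np.cos(2 * np.pi * h * freq * t))
    return np.stack(refs, axis=1)                  # (n_samples, 2 * n_harmonics)

def cca_classify(eeg, candidate_freqs, fs):
    """eeg: (n_samples, n_channels). Returns the candidate frequency with the largest canonical correlation."""
    scores = []
    for f in candidate_freqs:
        ref = cca_reference(f, fs, eeg.shape[0])
        x_c, y_c = CCA(n_components=1).fit_transform(eeg, ref)
        scores.append(np.corrcoef(x_c[:, 0], y_c[:, 0])[0, 1])
    return candidate_freqs[int(np.argmax(scores))]
```

Because the reference for a 6 Hz target already contains the 12 Hz harmonic, a 12 Hz response also correlates strongly with it, which is exactly the ambiguity discussed in the next paragraph.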
Although the SSVEP-based BCI performs well among state-of-the-art BCI implementations, it and its signal processing methods still have drawbacks to be addressed. An LCD is often used to present SSVEP stimuli, and it normally has a fixed refresh rate of 60 Hz. Thus, the stimulus frequency is essentially limited to sub-multiples of the refresh rate. Although an approximation method for generating a stimulus whose frequency is not a sub-multiple of the monitor refresh rate has been proposed [28], it is not a fundamental solution, and it requires a longer analysis time window. Another problem arises from Fourier- or CCA-based feature extraction. These methods extract not only the stimulus flicker frequency but also its harmonic components. Therefore, they cannot discriminate two flicker frequencies when one is twice (or three times) the other, and such stimulus frequency pairs are not available for current SSVEP-based BCIs. These two problems prevent an increase in the number of BCI commands and in the ITR.
To address these problems, we propose a complex valued convolutional neural network (CVCNN) for SSVEP classification. To the best of our knowledge, this paper is the first to apply a CVCNN to the SSVEP classification problem, and it demonstrates that the CVCNN overcomes the stimulus frequency limitation. The combination of feature extraction using the discrete Fourier transform (DFT) and convolutional neural networks (CNNs) has been applied to SSVEP classification and has shown high ITR performance [29]. However, the biggest advantage of CNNs is the data-driven learning of optimal feature extraction, called representation learning [30]. Feature extraction using the DFT may not be optimal for SSVEP data, and the combination of DFT and CNN cannot extract optimal features directly from the EEG data. Any DFT coefficient is the complex inner product between an input signal and a discrete Fourier basis vector; hence, we propose using a complex valued neural network for feature extraction. The DFT is a fixed basis for representing an input signal, whereas a complex weight vector of a neural network is a variable parameter that is optimized with respect to a criterion expressed by a cost or loss function. In other words, by virtue of complex valued convolutional neural networks and representation learning, the network learns an optimal feature extraction that includes the DFT as a special case. We introduce an activation function based on the complex absolute value, which, for a fixed complex Fourier basis, is equivalent to computing the amplitude spectrum [31].
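The equivalence between a fixed-basis complex layer and the DFT can be checked directly. The toy signal below is arbitrary; the point is only that a complex linear layer whose weights are frozen to the Fourier basis, followed by the absolute value activation, reproduces the amplitude spectrum, so a trainable complex weight matrix strictly generalizes this fixed feature extractor.

```python
import numpy as np

n = 256
x = np.random.randn(n)                             # arbitrary toy input signal

# Fixed complex Fourier basis: W[k, t] = exp(-2j * pi * k * t / n)
k = np.arange(n)[:, None]
t = np.arange(n)[None, :]
W_fourier = np.exp(-2j * np.pi * k * t / n)

# Complex linear layer with fixed Fourier weights, then the absolute value activation
amplitude_via_layer = np.abs(W_fourier @ x)

# This equals the amplitude spectrum obtained from the FFT
assert np.allclose(amplitude_via_layer, np.abs(np.fft.fft(x)))
```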
We conducted an SSVEP-based BCI experiment to demonstrate the performance of the proposed network and compared it with conventional methods. In our experiment, the flicker stimuli included pairs in which one frequency was twice the other. We also report results obtained using an open SSVEP-based BCI dataset. The experiments demonstrate that the proposed method can discriminate such paired commands and that it outperforms the conventional methods for SSVEP-based BCIs.
The rest of the paper is organized as follows. Section 2 explains complex valued neural networks and their learning procedure. Section 3 describes the experimental setting of our SSVEP-based BCI and the open dataset. Section 4 shows the experimental results, comparing the proposed method with conventional methods. Section 5 concludes the paper.

2. Method: Complex Valued Convolutional Neural Networks (CVCNN)

2.1. Activation Function for CVCNN

Complex feature vectors naturally represent periodic or cyclic data, such as oscillations and waves, e.g., electromagnetic phenomena, electric circuits, acoustic and biomedical signals, and imaging radar. Complex valued neural networks (CVNNs) have been used for array signal analysis [32,33], radar image processing [34,35], and EEG analysis [36,37,38].
Unlike ordinary real valued neural networks (RVNNs), the activation function of a CVNN has to be chosen carefully because the differentiability of complex functions is restricted by the Cauchy–Riemann equations. The activation functions of CVNNs fall into two groups: the split type $f(z) = \phi_1(\Re(z)) + j\,\phi_1(\Im(z))$ and the amplitude-phase type $f(z) = \phi_2(|z|)\exp(j \arg z)$, where $j = \sqrt{-1}$ is the imaginary unit, and $\Re(z)$ and $\Im(z)$ are the real and imaginary parts of $z \in \mathbb{C}$, respectively. For the split type, the rectified linear unit (ReLU) $\phi_1(x) = \max(x, 0)$ or the hyperbolic tangent $\phi_1(x) = \tanh(x)$ is often used. For the amplitude-phase type, $\phi_2(x) = \tanh(x)$ is used [39,40]. A review of CVNNs and their applications in signal processing can be found in [41].
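For reference, a minimal NumPy sketch of the two activation types, applied element-wise to a complex array (the test values are arbitrary):

```python
import numpy as np

def split_relu(z):
    """Split-type activation: ReLU applied separately to the real and imaginary parts."""
    return np.maximum(z.real, 0) + 1j * np.maximum(z.imag, 0)

def amplitude_phase_tanh(z):
    """Amplitude-phase activation: tanh applied to the magnitude, phase preserved."""
    return np.tanh(np.abs(z)) * np.exp(1j * np.angle(z))

z = np.array([1.0 - 2.0j, -0.5 + 0.3j])
print(split_relu(z))            # [1.+0.j  0.+0.3j]
print(amplitude_phase_tanh(z))  # magnitudes squashed by tanh, angles unchanged
```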
Phase information in EEGs or other signals is often unnecessary because it depends on the onset of the data. Some SSVEP-based BCIs utilize absolute or relative phase information to increase the number of commands; however, such BCI systems must carefully synchronize EEG recording and stimulus presentation. Here, we consider discrimination of the stimulus frequency only; therefore, phase information should be removed. For this purpose, the authors previously proposed the complex absolute value function $\phi(z) = |z|$ as an activation function [31]. In the proposed method, we use three activation functions: the amplitude-phase type, the split type, and the absolute value function. The structure of the proposed network is shown in Table 1 and Figure 1. The complex version of batch normalization proposed in [40] is used.
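The following is a minimal NumPy sketch of one forward pass through the structure in Table 1, written for readability rather than efficiency. The layer sizes are taken from the original dataset (nine channels, ten classes); the random weights, the per-feature-map temporal convolution, and the omission of batch normalization and dropout are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ch, N_t, n_filters, n_classes = 9, 512, 10, 10   # channels, time samples, filters, targets

def amp_phase_tanh(z):
    return np.tanh(np.abs(z)) * np.exp(1j * np.angle(z))

def split_relu(z):
    return np.maximum(z.real, 0) + 1j * np.maximum(z.imag, 0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

x = rng.standard_normal((N_ch, N_t)) + 0j           # complex input (channels x time)

# Layers 2-4: complex spatial filtering over channels (kernel (N_ch, 1)), amplitude-phase activation
W_sp = rng.standard_normal((n_filters, N_ch)) + 1j * rng.standard_normal((n_filters, N_ch))
h = amp_phase_tanh(W_sp @ x)                        # (n_filters, N_t)

# Layers 5-8: complex temporal filtering (kernel (1, 10)), here applied per feature map, split ReLU
W_t = rng.standard_normal((n_filters, 10)) + 1j * rng.standard_normal((n_filters, 10))
conv = np.stack([np.convolve(h[f], W_t[f], mode="valid") for f in range(n_filters)])
h2 = split_relu(conv).ravel()                       # flatten

# Layers 9-10: complex dense layer with the absolute value activation, then softmax
W_d = rng.standard_normal((n_classes, h2.size)) + 1j * rng.standard_normal((n_classes, h2.size))
y = softmax(np.abs(W_d @ h2))                       # class probabilities
```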

2.2. Forward and Backward Propagation

Let $W^{(l)}$ be the weight matrix and $o^{(l)}$ be the output vector of the $l$th layer of the network. Then, the forward propagation of an $L$-layer CVCNN is calculated by the following equations:
$$u^{(l)} = W^{(l)} o^{(l-1)}, \qquad (1)$$
$$o^{(l)} = \phi^{(l)}\big(u^{(l)}\big), \quad l = 2, \ldots, L, \qquad (2)$$
where $\phi^{(l)}$ is the element-wise activation function of the $l$th layer and $o^{(1)}$ denotes the complex valued input vector of the network.
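Equations (1) and (2) translate directly into code. A minimal sketch with arbitrary layer sizes and the activations introduced in Section 2.1:

```python
import numpy as np

def forward(weights, activations, x):
    """weights: list of complex matrices W^(l); activations: element-wise functions phi^(l); x: complex input o^(1)."""
    o = x
    for W, phi in zip(weights, activations):
        u = W @ o                                  # u^(l) = W^(l) o^(l-1)
        o = phi(u)                                 # o^(l) = phi^(l)(u^(l))
    return o

rng = np.random.default_rng(0)
W2 = rng.standard_normal((8, 16)) + 1j * rng.standard_normal((8, 16))
W3 = rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))
x = rng.standard_normal(16) + 1j * rng.standard_normal(16)
amp_phase = lambda z: np.tanh(np.abs(z)) * np.exp(1j * np.angle(z))
out = forward([W2, W3], [amp_phase, np.abs], x)    # final layer uses the absolute value activation
```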
The set of weight matrices $\mathcal{W} = \{W^{(l)}\}_{l=2,\ldots,L}$ is optimized to minimize a loss function by back-propagation (BP) for CVCNNs. The square-error, logistic, or softmax loss is often used for classification problems. Let $E_n(\mathcal{W})$ be the loss function for the $n$th training sample. Each weight matrix is iteratively updated by
$$W^{(l)} \leftarrow W^{(l)} - \eta \frac{\partial E_n}{\partial W^{(l)}}, \quad l = 2, \ldots, L, \qquad (3)$$
where $\eta > 0$ is the learning rate.

2.2.1. Split-Type Activation Function

Suppose that we use a split-type activation function. Let $u_r^{(l)}$ be the $r$th element of $u^{(l)}$, i.e., $u^{(l)} = [u_1^{(l)}, u_2^{(l)}, \ldots, u_m^{(l)}]^\top$, let $w_{rp}^{(l)}$ be the $(r,p)$ element of $W^{(l)}$, and define
$$\delta_r^{(l)} = \frac{\partial E}{\partial \Re[u_r^{(l)}]} + j \frac{\partial E}{\partial \Im[u_r^{(l)}]}. \qquad (4)$$
Then, applying the chain rule to the real and imaginary parts independently, we have
$$\frac{\partial E}{\partial w_{rp}^{(l)}} = \frac{\partial E}{\partial \Re[w_{rp}^{(l)}]} + j \frac{\partial E}{\partial \Im[w_{rp}^{(l)}]} = \delta_r^{(l)}\, \overline{o_p^{(l-1)}}, \quad l = 2, \ldots, L, \qquad (5)$$
where $\bar{z}$ denotes the complex conjugate of $z$ and $o_p^{(l-1)}$ is the $p$th element of $o^{(l-1)}$. For simplicity, we omit the training-sample index $n$. Then, $\delta_r^{(l)}$ is calculated by the chain rule:
$$\Re[\delta_r^{(l)}] = \Re\!\left[\sum_q \delta_q^{(l+1)}\, \overline{w_{qr}^{(l+1)}}\right] \frac{\partial \Re[\phi^{(l)}(u_r^{(l)})]}{\partial \Re[u_r^{(l)}]}, \qquad (6)$$
$$\Im[\delta_r^{(l)}] = \Im\!\left[\sum_q \delta_q^{(l+1)}\, \overline{w_{qr}^{(l+1)}}\right] \frac{\partial \Im[\phi^{(l)}(u_r^{(l)})]}{\partial \Im[u_r^{(l)}]}. \qquad (7)$$
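A sketch of Equations (4)–(7) for one layer, assuming a split ReLU whose real and imaginary derivatives are simple indicator functions (the derivative functions are passed in explicitly so that other split activations can be substituted):

```python
import numpy as np

def backprop_split_layer(delta_next, W_next, u, dphi_real, dphi_imag):
    """Back-propagate delta^(l+1) through layer l with a split-type activation.
    delta_next: delta^(l+1); W_next: W^(l+1); u: pre-activation u^(l)."""
    s = np.conj(W_next).T @ delta_next             # sum_q delta_q^(l+1) conj(w_qr^(l+1))
    return s.real * dphi_real(u) + 1j * (s.imag * dphi_imag(u))   # delta^(l), Eqs. (6)-(7)

def loss_grad_W(delta, o_prev):
    """Gradient w.r.t. W^(l): element (r, p) is delta_r^(l) * conj(o_p^(l-1)), Eq. (5)."""
    return np.outer(delta, np.conj(o_prev))

# Split ReLU derivatives
drelu_real = lambda u: (u.real > 0).astype(float)
drelu_imag = lambda u: (u.imag > 0).astype(float)
```

The weight update of Equation (3) is then `W -= eta * loss_grad_W(delta, o_prev)` for each layer.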

2.2.2. Amplitude-Phase Type

In the case of the amplitude-phase type activation function $f(z) = \tanh(|z|)\exp(j \arg z)$, the absolute value and phase of the entries of $W^{(l)}$ are updated independently by the following rule [42,43]:
$$|w_{rp}^{(l)}| \leftarrow |w_{rp}^{(l)}| - \eta\, \delta |w_{rp}^{(l)}|, \qquad \arg w_{rp}^{(l)} \leftarrow \arg w_{rp}^{(l)} - \eta\, \delta \arg w_{rp}^{(l)},$$
$$\delta |w_{rp}^{(l)}| = \big(1 - |o_r^{(l)}|^2\big)\big(|o_r^{(l)}| - |d_r^{(l)}| \cos(\arg o_r^{(l)} - \arg d_r^{(l)})\big)\, |o_p^{(l-1)}| \cos \theta_{rp}^{(l)} - |o_r^{(l)}|\, |d_r^{(l)}| \sin(\arg o_r^{(l)} - \arg d_r^{(l)})\, \frac{|o_p^{(l-1)}|}{|u_r^{(l)}|} \sin \theta_{rp}^{(l)},$$
$$\delta \arg w_{rp}^{(l)} = \big(1 - |o_r^{(l)}|^2\big)\big(|o_r^{(l)}| - |d_r^{(l)}| \cos(\arg o_r^{(l)} - \arg d_r^{(l)})\big)\, |o_p^{(l-1)}| \sin \theta_{rp}^{(l)} + |o_r^{(l)}|\, |d_r^{(l)}| \sin(\arg o_r^{(l)} - \arg d_r^{(l)})\, \frac{|o_p^{(l-1)}|}{|u_r^{(l)}|} \cos \theta_{rp}^{(l)}, \qquad (8)$$
where $\theta_{rp}^{(l)} = \arg o_r^{(l)} - \arg o_p^{(l-1)} - \arg w_{rp}^{(l)}$, and $d^{(l-1)} = \phi^{(l)}\big(\overline{d^{(l)}}\, \overline{W^{(l)}}\big)$.
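In code, the rule above amounts to updating the magnitude and phase of each weight separately and recombining them; the sketch below assumes that the deltas have already been computed from Equation (8), and the toy values are arbitrary.

```python
import numpy as np

def update_polar(W, delta_abs, delta_arg, eta):
    """Apply the amplitude-phase update: step the magnitude and phase separately, then recombine."""
    mag = np.abs(W) - eta * delta_abs
    ang = np.angle(W) - eta * delta_arg
    return mag * np.exp(1j * ang)

W = np.array([[1.0 + 1.0j, 0.5 - 0.2j]])
W = update_polar(W, delta_abs=np.full(W.shape, 0.1), delta_arg=np.zeros(W.shape), eta=0.01)
```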

2.2.3. Absolute Activation Function

Since the output of the absolute value activation function $\phi(z) = |z|$ is real valued, the layers after this activation are treated as an ordinary RVNN. Suppose that the activation function of the $l$th layer is the absolute value function; then, the chain rule is given by Equations (6) and (7).
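For completeness, a sketch of the absolute value activation and the partial derivatives of $|z|$ with respect to the real and imaginary parts, which are what the real-valued chain rule of the subsequent RVNN layers requires (the small epsilon guarding against division by zero is our own addition):

```python
import numpy as np

def abs_activation(z):
    """Absolute value activation: complex input -> real output, discarding the phase."""
    return np.abs(z)

def abs_activation_grads(z, eps=1e-12):
    """Partial derivatives of |z| with respect to Re(z) and Im(z)."""
    r = np.abs(z) + eps
    return z.real / r, z.imag / r
```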

3. Experimental Setting and Dataset

We used two datasets: (i) our original SSVEP dataset (original dataset) and (ii) an open SSVEP benchmark dataset (open dataset) [44]. The original dataset was designed to test the discriminability of harmonic stimulus flicker frequencies. The stimulus flicker frequencies were 6.0, 6.5, 7.0, 7.5, and 8.0 Hz, and their second harmonics were 12, 13, 14, 15, and 16 Hz. With CCA-based methods, it is difficult to classify these harmonic pairs.

3.1. Original Dataset

We used a 27-inch LCD monitor with a refresh rate of 60 Hz and Psychtoolbox-3 for stimulus presentation [45]. For EEG recording, we used the MATLAB Data Acquisition Toolbox, g.tec active EEG equipment (g.GAMMAcap2, g.GAMMAbox, and g.LADYbird), a TEAC BA1008 amplifier, and a Contec AI-1664 LAX-USB A/D converter. The sampling frequency was 512 Hz. The reference electrode was placed on the right earlobe, and the ground electrode was placed on FPz. We used nine electrodes for recording: O1, O2, Oz, P3, P4, Pz, PO3, PO4, and POz.
Five healthy subjects (four males and one female, 26.8 ± 7.43 years old) voluntarily participated in the experiment. The experimental procedure was approved by the ethics committee of the University of Electro-Communications, and it was conducted in accordance with the approved research procedure and with the relevant guidelines and regulations. Informed consent was obtained from all the subjects.
Figure 2 depicts the stimulus presentation setting of our experiment. The distance from the subject's eyes to the monitor was 70–80 cm. The target cue was displayed in random order for two seconds, and the flicker stimuli were then presented for five seconds. Each stimulus was a box flickering between white and black. One trial consisted of ten randomly ordered target stimuli, and ten trials were conducted for each subject. The subjects rested between trials as needed and were asked not to blink during the stimulus presentation. Owing to the latency of the SSVEP, we extracted EEG from 0.14 s after the start of stimulation. The length of the EEG data for one target was five seconds. We then segmented the data into non-overlapping windows of 1000 ms and 500 ms. For each subject, the numbers of segments were 500 and 1200 for the 1000 ms and 500 ms window lengths, respectively. We applied band-pass filtering from 4 Hz to 45 Hz. The classification accuracy and ITR were evaluated using ten-fold cross-validation.
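A hedged sketch of this preprocessing (band-pass filtering and non-overlapping segmentation) is given below; the filter order and the use of SciPy's butter/filtfilt are our assumptions, since the text only specifies the 4–45 Hz pass band, the 0.14 s latency, and the window lengths.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 512                                            # sampling frequency (Hz)

def bandpass(eeg, low=4.0, high=45.0, fs=FS, order=4):
    """Zero-phase band-pass filter; eeg has shape (n_samples, n_channels)."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, eeg, axis=0)

def segment(eeg, win_ms, fs=FS, latency_s=0.14):
    """Skip the SSVEP latency, then cut non-overlapping windows of win_ms milliseconds."""
    start = int(latency_s * fs)
    win = int(win_ms * fs / 1000)
    data = eeg[start:]
    n_seg = data.shape[0] // win
    return np.stack([data[i * win:(i + 1) * win] for i in range(n_seg)])
```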

3.2. Open Dataset

The open dataset consists of SSVEP data from ten subjects [44]. There were twelve targets (9.25, 9.75, …, 14.25, and 14.75 Hz). Four-class phase modulation was also used (0, 0.5π, 1.0π, and 1.5π). There were 15 trials for each target, and each trial lasted four seconds.
To utilize the phase modulation, we generated training and test data as follows: (i) we applied band-pass filtering; (ii) for each trial, we extracted a segment of 500 ms or 1000 ms starting at an offset $t_{\mathrm{off}}$ from the stimulus onset; (iii) we generated training and test data by five-fold cross-validation, i.e., for each target, 12 trials for training and 3 trials for testing; and (iv) we calculated the classification accuracy and ITR and averaged them over several offset values $t_{\mathrm{off}}$. For the 500 ms data, we used $t_{\mathrm{off}} \in \{0, 500, \ldots, 3500\}$ ms, and for the 1000 ms data, we used $t_{\mathrm{off}} \in \{0, 1000, \ldots, 3000\}$ ms.
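A small sketch of step (ii), cutting one window per offset from a band-pass-filtered trial; the sampling rate below is an assumption and should be replaced by the actual rate of the open dataset.

```python
import numpy as np

FS = 256                                            # assumed sampling rate; adjust to the dataset

def extract_offsets(trial, win_ms, offsets_ms, fs=FS):
    """trial: (n_samples, n_channels); cut one window of win_ms at each offset in offsets_ms."""
    win = int(win_ms * fs / 1000)
    starts = [int(o * fs / 1000) for o in offsets_ms]
    return [trial[s:s + win] for s in starts]

# e.g. 500 ms windows at offsets {0, 500, ..., 3500} ms of a four-second trial
windows = extract_offsets(np.zeros((4 * FS, 8)), 500, range(0, 4000, 500))
```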

3.3. Methods for Comparison

We compared the proposed method with (i) CCA [23], (ii) combined CCA [44], and (iii) CCNN [29].
For the CCA and combined CCA, we used the fundamental and second harmonics of the stimulus flicker frequency for the reference signal. We implemented them using Python 3.6.8, Numpy 1.19.1, and Scikit-learn 0.23.1.
For the CCNN, we used the same network structure as in [29]: we applied the DFT with zero-padding, and the complex spectrum from 3 Hz to 35 Hz was input to the network in a similar manner to [29]. The frequency resolution was set to 0.293 Hz. For the CCNN and the proposed CVCNN, the network parameters were either fixed or determined by the grid search listed in Table 2, and the networks were implemented using TensorFlow 1.14.0 and Keras 2.2.4 in addition to the software above.
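A sketch of the DFT feature extraction used as the CCNN input: each channel is zero-padded so that the bin spacing is approximately 0.293 Hz, and the complex coefficients between 3 Hz and 35 Hz are kept. The padding length is derived here from the sampling rate and the target resolution as an assumption; how the complex coefficients are then arranged as network input follows [29].

```python
import numpy as np

def dft_features(segment, fs, f_lo=3.0, f_hi=35.0, resolution=0.293):
    """Zero-padded DFT of each channel; keep complex coefficients between f_lo and f_hi.
    segment: (n_samples, n_channels)."""
    n_fft = int(round(fs / resolution))             # padding length giving ~0.293 Hz bins
    spec = np.fft.rfft(segment, n=n_fft, axis=0)    # zero-pads each channel to n_fft samples
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return spec[band], freqs[band]
```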

4. Results

The classification performance of the SSVEP-based BCI was evaluated in terms of classification accuracy and ITR using cross-validation for each subject. The ITR $I$ (bit/min) was obtained by the following equation:
$$I = \frac{60}{T}\left[\log_2 M + P \log_2 P + (1 - P)\log_2 \frac{1 - P}{M - 1}\right], \qquad (9)$$
where $T$ (s) is the time to output one command, $P$ is the mean classification accuracy, and $M$ is the number of targets [2]. We assumed that the latency for a subject to shift attention to the target stimulus was 0.5 s when computing the ITR.
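Equation (9) in code, with $T$ including the assumed 0.5 s gaze-shifting latency; the example numbers are illustrative.

```python
import numpy as np

def itr_bits_per_min(P, M, T):
    """Information transfer rate (bit/min) for mean accuracy P (0 < P <= 1), M targets, T seconds per command."""
    if P >= 1.0:                                    # avoid log(0) in the (1 - P) term
        bits = np.log2(M)
    else:
        bits = np.log2(M) + P * np.log2(P) + (1 - P) * np.log2((1 - P) / (M - 1))
    return 60.0 / T * bits

# e.g. 90% accuracy, 10 targets, 1.0 s analysis window + 0.5 s latency
print(itr_bits_per_min(0.90, 10, 1.0 + 0.5))        # about 101 bit/min
```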

4.1. Original Dataset

We excluded subject 2 from the analysis because some electrodes were not set up properly and the recorded EEG data were therefore unusable. Figure 3 and Figure 4 show the classification accuracies and ITRs. The proposed method exhibited the best classification accuracy and ITR for all data lengths. Since the stimulus flicker frequencies included harmonic pairs, the CCA-based methods performed worse than the CNN-based methods.
To examine the classification performance for the harmonic pairs, Figure 5 shows the rate at which each target was misclassified as the other member of its harmonic pair for the 1000 ms data length. For example, the bar for 12 Hz represents the rate at which the 12 Hz target was misclassified as the 6 Hz target. The figure shows that the CCA-based methods had higher misclassification rates than the CNN-based methods.

4.2. Open Dataset

Figure 6 and Figure 7 show the classification accuracy and ITR for the open dataset. Although this dataset did not contain harmonic flicker frequency stimuli, the proposed method exhibited the best classification accuracy and ITR for the 500 ms data length. For the 1000 ms data, the proposed method showed results almost comparable to those of the state-of-the-art method, the combined CCA, because the accuracy saturated at 100% for some subjects.

5. Conclusions

We proposed a complex valued convolutional neural network (CVCNN) structure and an absolute value activation function for the classification of SSVEP-based BCIs. The SSVEP-based BCI is often realized with an LCD display; however, the available stimulation frequencies are limited by the refresh rate. Moreover, the conventional CCA-based SSVEP classification methods (e.g., [23,27]) have the disadvantage that it is difficult to discriminate harmonic frequency stimuli (e.g., 6 Hz and 12 Hz). These problems hinder improvement of the SSVEP-based BCI in terms of the number of available commands and the ITR. The proposed method overcomes these problems and outperforms state-of-the-art SSVEP classification methods by combining frequency-domain feature extraction with complex valued networks and the representation learning of convolutional neural networks.
Our original dataset is based on a small number of participants (five subjects, four of whom provided usable data); therefore, further investigation is needed. Future work includes extending the proposed method to phase-modulated SSVEP applications and self-paced BCI applications.

Author Contributions

Data curation, A.I. (Original dataset); Formal analysis, A.I. and Y.W.; Investigation, A.I. and Y.W.; Methodology, A.I. and Y.W.; Software, A.I. and Y.W.; Visualization, A.I. and Y.W.; Writing—original draft, Y.W. Both authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by JSPS KAKENHI Grant Number 20H04206.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the ethics committee of the University of Electro-Communications (No. 20038).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The open SSVEP dataset described in Section 3.2 can be downloaded from https://github.com/mnakanishi/12JFPM_SSVEP (accessed on 1 June 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Rezeika, A.; Benda, M.; Stawicki, P.; Gembler, F.; Saboor, A.; Volosyak, I. Brain-computer interface spellers: A review. Brain Sci. 2018, 8, 57.
2. Wolpaw, J.R.; Birbaumer, N.; McFarland, D.J.; Pfurtscheller, G.; Vaughan, T.M. Brain-computer interfaces for communication and control. Clin. Neurophysiol. 2002, 113, 767–791.
3. Chaudhary, U.; Birbaumer, N.; Ramos-Murguialday, A. Brain-computer interfaces for communication and rehabilitation. Nat. Rev. Neurol. 2016, 12, 513–525.
4. Sunny, T.; Aparna, T.; Neethu, P.; Venkateswaran, J.; Vishnupriya, V.; Vyas, P. Robotic arm with brain-computer interfacing. Proc. Technol. 2016, 24, 1089–1096.
5. Casey, A.; Azhar, H.; Grzes, M.; Sakel, M. BCI controlled robotic arm as assistance to the rehabilitation of neurologically disabled patients. Disabil. Rehabil. Assist. Technol. 2019, 16, 525–537.
6. Deng, X.; Yu, Z.; Lin, C.; Gu, Z.; Li, Y. A Bayesian shared control approach for wheelchair robot with brain machine interface. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 328–338.
7. Lotte, F.; Bougrain, L.; Cichocki, A.; Clerc, M.; Congedo, M.; Rakotomamonjy, A.; Yger, F. A review of classification algorithms for EEG-based brain–computer interfaces: A 10 year update. J. Neural Eng. 2018, 15, 031005.
8. Farwell, L.; Donchin, E. Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials. Electroencephalogr. Clin. Neurophysiol. 1988, 70, 510–523.
9. Yuan, H.; He, B. Brain-computer interfaces using sensorimotor rhythms: Current state and future perspectives. IEEE Trans. Biomed. Eng. 2014, 61, 1425–1435.
10. Sannelli, C.; Vidaurre, C.; Müller, K.R.; Blankertz, B. A large scale screening study with a SMR-based BCI: Categorization of BCI users and differences in their SMR activity. PLoS ONE 2019, 14, e0207351.
11. Sugi, M.; Hagimoto, Y.; Nambu, I.; Gonzalez, A.; Takei, Y.; Yano, S.; Hokari, H.; Wada, Y. Improving the performance of an auditory brain-computer interface using virtual sound sources by shortening stimulus onset asynchrony. Front. Neurosci. 2018, 27, 108.
12. Brouwer, A.M.; van Erp, J. A tactile P300 brain-computer interface. Front. Neurosci. 2010, 4, 19.
13. Bonci, A.; Fiori, S.; Higashi, H.; Tanaka, T.; Verdini, F. An Introductory Tutorial on Brain–Computer Interfaces and Their Applications. Electronics 2021, 10, 560.
14. Kuś, R.; Duszyk, A.; Milanowski, P.; Łabȩcki, M.; Bierzyńska, M.; Radzikowska, Z.; Michalska, M.; Żygierewicz, J.; Suffczyński, P.; Durka, P. On the quantification of SSVEP frequency responses in human EEG in realistic BCI conditions. PLoS ONE 2013, 8, e77536.
15. Herrmann, C. Human EEG responses to 1–100 Hz flicker: Resonance phenomena in visual cortex and their potential correlation to cognitive phenomena. Exp. Brain Res. 2001, 137, 346–353.
16. Zhang, Y.; Xie, S.; Wang, H.; Zhang, Z. Data analytics in steady-state visual evoked potential-based brain-computer interface: A review. IEEE Sens. J. 2021, 21, 1124–1138.
17. Li, M.; He, D.; Li, C.; Qi, S. Brain-computer interface speller based on steady-state visual evoked potential: A review focusing on the stimulus paradigm and performance. Brain Sci. 2021, 11, 450.
18. Nakanishi, M.; Wang, Y.; Chen, X.; Wang, Y.T.; Gao, X.; Jung, T.P. Enhancing detection of SSVEPs for a high-speed brain speller using task-related component analysis. IEEE Trans. Biomed. Eng. 2018, 65, 104–112.
19. Wilson, J.; Palaniappan, R. Augmenting a SSVEP BCI through single cycle analysis and phase weighting. In Proceedings of the 2009 4th International IEEE/EMBS Conference on Neural Engineering, Antalya, Turkey, 29 April–2 May 2009; pp. 371–374.
20. Chen, X.; Wang, Y.; Nakanishi, M.; Jung, T.P.; Gao, X. Hybrid frequency and phase coding for a high-speed SSVEP-based BCI speller. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC) 2014, 2014, 3993–3996.
21. Cheng, M.; Gao, X.; Gao, S.; Xu, D. Design and implementation of a brain-computer interface with high transfer rates. IEEE Trans. Biomed. Eng. 2002, 49, 1181–1186.
22. Chen, Y.J.; See, A.; Chen, S.C. SSVEP-based BCI classification using power cepstrum analysis. Electron. Lett. 2014, 50, 735–737.
23. Lin, Z.; Zhang, C.; Wu, W.; Gao, X. Frequency recognition based on canonical correlation analysis for SSVEP-based BCIs. IEEE Trans. Biomed. Eng. 2007, 54, 1172–1176.
24. Zhang, Y.; Zhou, G.; Jin, J.; Wang, M.; Wang, X.; Cichocki, A. L1-regularized multiway canonical correlation analysis for SSVEP-based BCI. IEEE Trans. Neural Syst. Rehabil. Eng. 2013, 21, 887–896.
25. Pan, J.; Gao, X.; Duan, F.; Yan, Z.; Gao, S. Enhancing the classification accuracy of steady-state visual evoked potential-based brain-computer interfaces using phase constrained canonical correlation analysis. J. Neural Eng. 2011, 8, 036027.
26. Chen, X.; Wang, Y.; Gao, S.; Jung, T.P.; Gao, X. Filter bank canonical correlation analysis for implementing a high-speed SSVEP-based brain-computer interface. J. Neural Eng. 2015, 12, 046008.
27. Wang, Y.; Nakanishi, M.; Wang, Y.T.; Jung, T.P. A high-speed brain speller using steady-state visual evoked potentials. Int. J. Neural Syst. 2014, 24, 1450019.
28. Wang, Y.; Wang, Y.T.; Jung, T.P. Visual stimulus design for high-rate SSVEP BCI. Electron. Lett. 2010, 46, 1057–1058.
29. Ravi, A.; Beni, N.; Manuel, J.; Jiang, N. Comparing user-dependent and user-independent training of CNN for SSVEP BCI. J. Neural Eng. 2020, 17, 026028.
30. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828.
31. Ikeda, A.; Washizawa, Y. Spontaneous EEG classification using complex valued neural network. In Proceedings of the 26th ICONIP: International Conference on Neural Information Processing, Sydney, NSW, Australia, 12–15 December 2019.
32. Yang, W.; Chan, K.; Chang, P. Complex-valued neural-network for direction-of-arrival estimation. Electron. Lett. 1994, 30, 574–575.
33. Mishra, R.; Patnaik, A. Designing rectangular patch antenna using the neurospectral method. IEEE Trans. Antennas Propag. 2003, 51, 1914–1921.
34. Zhang, Z.; Wang, H.; Xu, F.; Jin, Y. Complex-valued convolutional neural network and its application in polarimetric SAR image classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 7177–7188.
35. Hirose, A.; Motoi, M. Complex-valued region-based-coupling image clustering neural networks for interferometric radar image processing. IEICE Trans. Electron. 2001, E84-C, 1932–1938.
36. Aizenberg, I.; Khaliq, Z. Analysis of EEG using multilayer neural network with multi-valued neurons. In Proceedings of the IEEE Second International Conference on Data Stream Mining & Processing, Lviv, Ukraine, 21–25 August 2018.
37. Peker, M.; Sen, B.; Delen, D. A novel method for automated diagnosis of epilepsy using complex-valued classifiers. IEEE J. Biomed. Health 2016, 20, 108–118.
38. Wu, R.; Huang, H.; Huang, T. Learning of phase-amplitude-type complex-valued neural networks with application to signal coherence. In Proceedings of the 2017 International Conference on Neural Information Processing, Long Beach, CA, USA, 4–9 December 2017; pp. 91–99.
39. Kim, T.; Adalı, T. Approximation by fully complex multilayer perceptrons. Neural Comput. 2003, 15, 1641–1666.
40. Trabelsi, C.; Bilaniuk, O.; Serdyuk, D.; Subramanian, S.; Santos, J.; Mehri, S.; Rostamzadeh, N.; Bengio, Y.; Pal, C. Deep Complex Networks. arXiv 2018, arXiv:1705.09792.
41. Hirose, A. Neural System Learning on Complex-Valued Manifolds. In Complex-Valued Neural Networks: Advances and Applications; Wiley-IEEE Press: Piscataway, NJ, USA, 2013; pp. 33–57.
42. Sunaga, T.; Natsuaki, R.; Hirose, A. Proposal of complex-valued convolutional neural networks for similar land-shape discovery in interferometric synthetic aperture radar. In Proceedings of the 25th International Conference on Neural Information Processing, Siem Reap, Cambodia, 13–16 December 2018.
43. Hirose, A. Continuous complex-valued back-propagation learning. Electron. Lett. 1992, 28, 1854–1855.
44. Nakanishi, M.; Wang, Y.; Wang, Y.; Jung, T. A comparison study of canonical correlation analysis based methods for detecting steady state visual evoked potentials. PLoS ONE 2015, 10, e0140703.
45. Brainard, D. The psychophysics toolbox. Spatial Vision 1997, 10, 433–436.
Figure 1. Network structure of the proposed CNN: the first stage is the complex weighted averaging over channels (rows 2–4 in Table 1). The second stage is the filtering over the time index (rows 5–8 in Table 1). The data are then flattened and pass through the complex dense layer with the absolute value activation function. Finally, the output is given by the softmax layer.
Figure 2. Stimulus presentation (unit: mm); 1: 6.0 Hz, 2: 6.5 Hz, 3: 7.0 Hz, 4: 7.5 Hz, 5: 8.0 Hz, 6: 15 Hz, 7: 16 Hz, 8: 12 Hz, 9: 13 Hz, and A: 14 Hz.
Figure 3. Classification accuracy and ITR for the original dataset with a data length of 1000 ms.
Figure 4. Classification accuracy and ITR for the original dataset with a data length of 500 ms.
Figure 5. Misclassification rate for the other target of each harmonic pair.
Figure 6. Classification accuracy and ITR for the open dataset with a data length of 1000 ms.
Figure 7. Classification accuracy and ITR for the open dataset with a data length of 500 ms.
Table 1. Network structure of the proposed CNN: $N_{\mathrm{ch}}$ is the number of channels, and BN is batch normalization.

Layer             | Filters | Size                      | Activation
1  Input          |         |                           |
2  ComplexConv2D  | 10      | ($N_{\mathrm{ch}}$, 1)    |
3  ComplexBN      |         |                           | $\tanh(|z|)\exp(j \arg(z))$
4  Dropout        |         |                           |
5  ComplexConv2D  | 10      | (1, 10)                   |
6  Flatten        |         |                           |
7  BN             |         |                           | $\mathrm{ReLU}(\Re(z)) + j\,\mathrm{ReLU}(\Im(z))$
8  Dropout        |         |                           |
9  ComplexDense   |         |                           | $|z|$
10 Softmax        |         |                           |
Table 2. Parameter settings of the neural networks.

Parameter          | Original Dataset              | Open Dataset
No. of epochs      | 250                           | 250
Mini-batch size    | 30                            | 32
Dropout rate       | {0.25, 0.30, ..., 0.50}       | {0.25, 0.30, ..., 0.60}
Learning rate      | {0.0005, 0.001, 0.005, 0.01}  | {0.001, 0.005, 0.01}
L2 regularization  | {1e-4, 1e-3, 1e-2}            | {1e-4, 1e-3, 1e-2}
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

