Steady-State Motion Visual Evoked Potential (SSMVEP) Enhancement Method Based on Time-Frequency Image Fusion

The steady-state motion visual evoked potential (SSMVEP) collected from the scalp suffers from strong noise and is contaminated by artifacts such as the electrooculogram (EOG) and the electromyogram (EMG). Spatial filtering methods can fuse the information of different brain regions, which is beneficial for the enhancement of the active components of the SSMVEP. Traditional spatial filtering methods fuse electroencephalogram (EEG) in the time domain. Based on the idea of image fusion, this study proposed an SSMVEP enhancement method based on time-frequency (T-F) image fusion. The purpose is to fuse SSMVEP in the T-F domain and improve the enhancement effect of the traditional spatial filtering method on SSMVEP active components. Firstly, two electrode signals were transformed from the time domain to the T-F domain via short-time Fourier transform (STFT). The transformed T-F signals can be regarded as T-F images. Then, two T-F images were decomposed via two-dimensional multiscale wavelet decomposition, and both the high-frequency coefficients and low-frequency coefficients of the wavelet were fused by specific fusion rules. The two images were fused into one image via two-dimensional wavelet reconstruction. The fused image was subjected to mean filtering, and finally, the fused time-domain signal was obtained by inverse STFT (ISTFT). The experimental results show that the proposed method has better enhancement effect on SSMVEP active components than the traditional spatial filtering methods. This study indicates that it is feasible to fuse SSMVEP in the T-F domain, which provides a new idea for SSMVEP analysis.


Introduction
To improve the comfort of light-flashing stimulation, our previous study proposed a steady-state motion visual evoked potential (SSMVEP) method that replaces light-flashing stimulation with motion stimulation [1]. In this study, the SSMVEP signal was selected as the research object. The SSMVEP collected from the scalp is contaminated by a variety of artifacts, such as the electrooculogram (EOG) and electromyogram (EMG). Consequently, the requirements for subsequent signal processing are high. In electroencephalogram (EEG) signal processing, using multichannel EEG is beneficial and effective [2,3]. In the EEG literature, the method of linearly fusing multilead signals into single-channel or multichannel signals is called spatial filtering. Spatial filtering combines the EEG information from the selected electrode signals. Dennis et al. studied the enhancement effects of the standard ear reference, common average reference (CAR) fusion, and Laplacian fusion on the active components of EEG.
The results show that Laplacian fusion achieved the best fusion effect on EEG active components [9]. Friman et al. proposed the minimum energy fusion and maximum contrast fusion methods [5]. Minimum energy fusion is achieved by minimizing the noise energy, and maximum contrast fusion is achieved by maximizing the EEG signal-to-noise ratio (SNR). In that study, Friman et al. compared the effects of average fusion, native fusion, bipolar fusion, Laplacian fusion, minimum energy fusion, and maximum contrast fusion on steady-state visual evoked potential (SSVEP) signals. Among these, average fusion had the worst enhancement effect and minimum energy fusion had the best. Canonical correlation analysis (CCA) fusion, which studies the linear relationship between two groups of multidimensional variables, is also a commonly used spatial filtering method [6].
Because Friman et al. demonstrated that minimum energy fusion and maximum contrast fusion had better fusion effects but did not compare them with CCA fusion, this study only compared the enhancement effects of minimum energy fusion, maximum contrast fusion, CCA fusion, and the proposed method on the active components of the SSMVEP. Minimum energy fusion, maximum contrast fusion, and CCA fusion analyze multidimensional EEG signals in the time domain. This study aims to investigate whether electrode signals can be fused in the time-frequency (T-F) domain. Image fusion can fuse multiple images into one, thus improving the image quality [10][11][12].
This study transformed EEG from the time domain to the T-F domain and analyzed multidimensional EEG with the idea of image fusion, to improve the enhancement effect of existing spatial filtering methods on SSMVEP active components. Firstly, two electrode signals were transformed from the time domain to the T-F domain via short-time Fourier transform (STFT). Then, the two T-F images were decomposed by two-dimensional multiscale wavelet decomposition, and both the high-frequency coefficients and low-frequency coefficients of the wavelet were fused by specific fusion rules. After two-dimensional wavelet reconstruction, the two images were fused into one image. The fused image was filtered with a mean filter, and the fused time-domain signal was obtained by inverse STFT (ISTFT). The experimental results show that the proposed method based on T-F image fusion can enhance the SSMVEP better than the traditional time-domain analysis methods.

Subjects.
Six males and four females (20-28 years old) were recruited as subjects for this study. The participants were healthy and had normal colour and visual perception.

Experimental Equipment.
A g.USBamp (g.tec, Austria) was used to collect the EEG at a sampling frequency of 1200 Hz. The g.USBamp EEG amplifier and a g.GAMMAbox active electrode system were combined to form the experimental platform. Prior to the experiment, the reference electrode was placed at the left ear of the subject, and the ground electrode was placed at Fpz on the forehead. EEGs were collected from the following six channels: PO7, Oz, PO8, PO3, POz, and PO4.

Experimental Step.
A checkerboard with radial contraction-expansion motion was used as the stimulation paradigm; details of the paradigm can be found in Reference [1]. The stimuli were presented on a screen at a refresh rate of 144 Hz, and the subjects were positioned 0.6-0.8 m from the screen. The experiment included six blocks, each containing 20 trials corresponding to all 20 targets. The stimuli were arranged in a 5×4 matrix, and the horizontal and vertical intervals between two neighboring stimuli were 4 cm and 3 cm, respectively. The stimulus frequencies were 7-10.8 Hz with a frequency interval of 0.2 Hz, and the radius of each stimulus was 60 pixels. Each trial lasted 5 s, and trials were separated by an interval of 2 s. The subjects were allowed to rest between blocks. The twenty targets were presented simultaneously on the screen and numbered 1 to 20. Before the stimulus began, one of the 20 serial numbers appeared below the corresponding target, indicating the focus target.

Preprocessing of EEG Data.
The EEG data segments were extracted according to the trial start and end times. The MATLAB library function detrend was used to remove the linear trend from each channel. A Chebyshev bandpass filter of 0.5-50 Hz was used to remove low-frequency drifts and high-frequency interference.
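The linear-trend removal step can be sketched as follows. This is a minimal NumPy illustration, not the MATLAB detrend call used in the study; the function name `detrend_linear` and the synthetic drift are assumptions made for the example, and the Chebyshev bandpass step is not reproduced because the text does not give the filter order or ripple.

```python
import numpy as np

def detrend_linear(segment):
    """Remove a least-squares straight line from each channel (samples x channels)."""
    segment = np.asarray(segment, dtype=float)
    t = np.arange(segment.shape[0])
    # Fit slope and intercept for every channel at once, then subtract the line.
    slope, intercept = np.polyfit(t, segment, deg=1)
    return segment - (np.outer(t, slope) + intercept)

fs = 1200                                   # sampling rate stated in the text
time = np.arange(5 * fs) / fs               # one 5 s trial
drift = 3.0 * time + 2.0                    # synthetic linear drift (assumption)
eeg = np.sin(2 * np.pi * 8.0 * time)[:, None] + drift[:, None] * np.ones((1, 6))
clean = detrend_linear(eeg)
```

After this step, a straight line fitted to any channel of `clean` has zero slope and zero intercept up to rounding error.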

Significance Test.
The data were expressed as mean values. A paired-sample t-test was used to determine the significance. Statistical significance was defined as p < 0.05.

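Spatial Filtering Methods
A spatial filter fuses the signals $y_i(t)$ recorded at $N$ electrodes into one signal by linear summation with coefficients $w_i$:
$$s(t) = \sum_{i=1}^{N} w_i\, y_i(t). \tag{1}$$

Minimum Energy Fusion.
For a stimulus of frequency $f$, the signal recorded at the $i$th electrode is modeled as
$$y_i(t) = \sum_{j=1}^{n} a_{i,j} \sin\left(2\pi j f t + \phi_{i,j}\right) + e_i(t), \tag{2}$$
where $n$ represents the number of harmonics and $a_{i,j}$ and $\phi_{i,j}$ represent the amplitude and phase of the $j$th harmonic component, respectively. The model decomposes the signal into the sum of the SSMVEP induced by the visual stimulus and the noise $e_i(t)$ generated by the electromyogram (EMG), electrooculogram (EOG), and other components. Thus, equation (2) can be expressed as
$$y_i = S_i + e_i, \tag{3}$$
where $S_i = X a_i$, $X = [X_1, X_2, \ldots, X_n]$, the submatrix $X_j$ is composed of $\sin(2 j \pi f t)$ and $\cos(2 j \pi f t)$, and $a_i$ represents the amplitudes of the SSMVEP at its stimulus frequency and harmonics. We assume that the signal matrix recorded by $N$ electrodes is
$$Y = [y_1, y_2, \ldots, y_N], \tag{4}$$
where each column corresponds to an electrode channel. $Y$ can be projected to the SSMVEP space through a projection matrix $Q$, namely,
$$Y_S = QY. \tag{5}$$
Reference [13] gives the solution of $Q$ as
$$Q = X\left(X^{T}X\right)^{-1}X^{T}. \tag{6}$$
Thus, the noise signal can be expressed as
$$Y' = Y - QY. \tag{7}$$
Minimum energy fusion obtains the spatial filter coefficients $W$ by minimizing the noise energy, namely,
$$\min_{W} \left\| Y'W \right\|^{2} = \min_{W} W^{T} Y'^{T} Y' W, \quad W^{T}W = 1. \tag{8}$$
This minimization problem can be solved by eigenvalue decomposition of the positive definite matrix $Y'^{T}Y'$. After decomposition, $N$ eigenvalues and the corresponding eigenvectors are obtained. The spatial filter coefficient $W$ is the eigenvector corresponding to the smallest eigenvalue (MATLAB code for minimum energy fusion and test data is provided in Supplementary Materials).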
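The minimum energy procedure can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' supplementary MATLAB code; the helper names, the two-channel synthetic data, and the choice of two harmonics are assumptions. The projection $QY$ is computed by least squares rather than by forming $Q$ explicitly, which gives the same result.

```python
import numpy as np

def harmonic_basis(f, n_harmonics, fs, n_samples):
    """Columns sin(2*pi*j*f*t), cos(2*pi*j*f*t) for j = 1..n_harmonics."""
    t = np.arange(n_samples) / fs
    cols = []
    for j in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * j * f * t))
        cols.append(np.cos(2 * np.pi * j * f * t))
    return np.column_stack(cols)

def minimum_energy_filter(Y, X):
    """Y: samples x electrodes. Returns unit-norm W and the noise estimate Y'."""
    # Y' = Y - QY, with QY obtained as the least-squares fit of Y on X.
    Y_noise = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    # eigh sorts eigenvalues in ascending order, so column 0 minimises W^T Y'^T Y' W.
    _, eigvecs = np.linalg.eigh(Y_noise.T @ Y_noise)
    return eigvecs[:, 0], Y_noise

rng = np.random.default_rng(0)
fs, n_samples, f = 1200, 1200, 8.0          # 1 s of synthetic data (assumption)
X = harmonic_basis(f, 2, fs, n_samples)
ssmvep = np.sin(2 * np.pi * f * np.arange(n_samples) / fs)
common_noise = rng.standard_normal(n_samples)
Y = np.column_stack([
    1.0 * ssmvep + common_noise + 0.1 * rng.standard_normal(n_samples),
    0.2 * ssmvep + common_noise + 0.1 * rng.standard_normal(n_samples),
])
W, Y_noise = minimum_energy_filter(Y, X)
fused = Y @ W                                # single fused channel
```

In this toy case the two channels share most of their noise, so the filter weights come out with opposite signs and the fused channel has less noise energy than either electrode alone.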

Maximum Contrast Fusion.
The goal of maximum contrast fusion is to maximize the SSMVEP energy while minimizing the noise energy. The SSMVEP energy can be approximated as $W^{T}Y^{T}YW$. Therefore, maximum contrast fusion can be achieved by the following maximization:
$$\max_{W} \frac{\left\| YW \right\|^{2}}{\left\| Y'W \right\|^{2}} = \max_{W} \frac{W^{T}Y^{T}YW}{W^{T}Y'^{T}Y'W}. \tag{9}$$
Here, $Y$ and $Y'$ are the same as in Section 2.6.1. Equation (9) can be solved by generalized eigenvalue decomposition of $Y^{T}Y$ and $Y'^{T}Y'$. The spatial filter coefficient $W$ is the eigenvector corresponding to the largest eigenvalue (MATLAB code for maximum contrast fusion and test data is provided in Supplementary Materials).
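A minimal NumPy sketch of the generalized eigenvalue solution is given below. It is not the authors' MATLAB code; the function names and synthetic three-channel data are assumptions. The generalized problem is reduced to an ordinary symmetric one by Cholesky whitening of $Y'^{T}Y'$, which requires that matrix to be positive definite.

```python
import numpy as np

def maximum_contrast_filter(Y, X):
    """Y: samples x electrodes, X: harmonic basis. Returns unit-norm W, A, B."""
    Y_noise = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]   # Y' = Y - QY
    A = Y.T @ Y                  # approximates SSMVEP-plus-noise energy
    B = Y_noise.T @ Y_noise      # noise energy
    # Solve A w = lambda B w via B = L L^T and the symmetric matrix L^-1 A L^-T.
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    _, eigvecs = np.linalg.eigh(L_inv @ A @ L_inv.T)
    W = L_inv.T @ eigvecs[:, -1]             # eigenvector of the largest eigenvalue
    return W / np.linalg.norm(W), A, B

def contrast(W, A, B):
    """Ratio maximised in the maximum contrast criterion."""
    return (W @ A @ W) / (W @ B @ W)

rng = np.random.default_rng(1)
fs, n_samples, f = 1200, 1200, 8.0           # synthetic example (assumption)
t = np.arange(n_samples) / fs
X = np.column_stack([np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)])
Y = np.column_stack([
    a * np.sin(2 * np.pi * f * t + 0.3) + rng.standard_normal(n_samples)
    for a in (1.0, 0.6, 0.2)
])
W, A, B = maximum_contrast_filter(Y, X)
best_ratio = contrast(W, A, B)
```

The resulting ratio is never smaller than the ratio obtained from any single electrode, since a single electrode corresponds to one particular choice of W.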

Canonical Correlation Analysis Fusion.
CCA fusion is used to study the linear relationship between two groups of multidimensional variables. Given two sets of signals $Y$ and $X$, the goal is to find two linear projection vectors $w_Y$ and $w_X$ such that the linear combinations $w_Y^{T}Y$ and $w_X^{T}X$ have the largest correlation coefficient:
$$\rho = \max_{w_Y, w_X} \frac{E\left[w_Y^{T} Y X^{T} w_X\right]}{\sqrt{E\left[w_Y^{T} Y Y^{T} w_Y\right] E\left[w_X^{T} X X^{T} w_X\right]}}. \tag{10}$$
The reference signals were constructed at the stimulation frequency $f_d$:
$$X = \begin{bmatrix} \sin(2\pi f_d t) \\ \cos(2\pi f_d t) \\ \vdots \\ \sin(2\pi k f_d t) \\ \cos(2\pi k f_d t) \end{bmatrix}, \quad t = \frac{1}{f_s}, \frac{2}{f_s}, \ldots, \frac{l}{f_s}, \tag{11}$$
where $k$ is the number of harmonics, which depends on the number of frequency harmonics present in the SSMVEP; $f_s$ is the sampling rate; and $l$ represents the number of sample points. The optimization problem of equation (10) can be transformed into an eigenvalue decomposition problem, and the spatial filter coefficient $W$ is the eigenvector corresponding to the largest eigenvalue (MATLAB code for CCA fusion and test data is provided in Supplementary Materials).
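The first canonical pair can be computed as sketched below. This is an illustrative NumPy version that uses the QR/SVD formulation of CCA instead of an explicit eigenvalue decomposition (the two are equivalent for the leading pair); the function names, the synthetic three-channel data, and the choice of two harmonics are assumptions.

```python
import numpy as np

def reference_signals(f_d, k, fs, l):
    """Sine/cosine references at f_d and its harmonics, shape (2k, l)."""
    t = np.arange(1, l + 1) / fs
    rows = []
    for h in range(1, k + 1):
        rows.append(np.sin(2 * np.pi * h * f_d * t))
        rows.append(np.cos(2 * np.pi * h * f_d * t))
    return np.vstack(rows)

def cca_first_pair(Y, X):
    """Y: channels x samples, X: references x samples. Returns rho, w_Y, w_X."""
    Yc = (Y - Y.mean(axis=1, keepdims=True)).T       # samples x channels, centred
    Xc = (X - X.mean(axis=1, keepdims=True)).T
    Qy, Ry = np.linalg.qr(Yc)
    Qx, Rx = np.linalg.qr(Xc)
    # Singular values of Qy^T Qx are the canonical correlations.
    U, s, Vt = np.linalg.svd(Qy.T @ Qx)
    w_Y = np.linalg.solve(Ry, U[:, 0])
    w_X = np.linalg.solve(Rx, Vt[0])
    return s[0], w_Y, w_X

rng = np.random.default_rng(2)
fs, l = 1200, 1200                                   # 1 s of synthetic data (assumption)
t = np.arange(1, l + 1) / fs
Y = np.vstack([np.sin(2 * np.pi * 8.0 * t) + rng.standard_normal(l) for _ in range(3)])

rho_true, w_Y, w_X = cca_first_pair(Y, reference_signals(8.0, 2, fs, l))
rho_other, _, _ = cca_first_pair(Y, reference_signals(9.2, 2, fs, l))
fused = w_Y @ Y                                      # spatially filtered signal
```

Here the correlation with the 8 Hz references is clearly higher than with the 9.2 Hz references, which is the property that frequency recognition relies on.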

Two-Dimensional T-F Image Representation and Reconstruction of EEG Signals.
T-F analysis transforms the signal from the time domain to the T-F domain, and the T-F domain signal can then be regarded as an image for analysis. In this study, the T-F transform was performed on the time-domain EEG signal using the STFT. Let $h(t)$ be a time window function centered at $\tau$, with a height of 1 and a finite width. The portion of the signal $x(t)$ seen through the window is $x(t)h(t - \tau)$. The STFT is generated by the fast Fourier transform (FFT) of the windowed signal $x(t)h(t - \tau)$:
$$\mathrm{STFT}_x(\tau, f) = \int_{-\infty}^{\infty} x(t)\, h(t - \tau)\, e^{-j 2\pi f t}\, dt. \tag{12}$$
Equation (12) maps the signal $x(t)$ onto the two-dimensional T-F plane $(\tau, f)$, where $f$ can be regarded as the frequency in the STFT. After fusing multiple EEG T-F images, the fused two-dimensional image signal needs to be transformed into a one-dimensional time-domain signal for subsequent analysis. The STFT has an inverse transform that transforms the T-F domain signal into a time-domain signal:
$$x(t) = \frac{1}{\int h^{2}(u)\, du} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \mathrm{STFT}_x(\tau, f)\, h(t - \tau)\, e^{j 2\pi f t}\, df\, d\tau. \tag{13}$$
In this study, the T-F transform of the SSMVEP signal was computed with MATLAB's tfrstft function. After fusing two T-F images, the T-F signal was inversely transformed into a one-dimensional time-domain signal with MATLAB's tfristft function.
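The round trip between a time-domain signal and its T-F image can be sketched in NumPy as follows. This is not a port of tfrstft or tfristft; it is a simple discrete STFT with one column per time sample, a Gaussian window of height 1, and a basic inverse that reads back the zero-lag sample of each column. The window length (33), number of frequency bins (64), and Gaussian width are hypothetical values chosen for the example.

```python
import numpy as np

def stft(x, window, n_bins):
    """Return a complex T-F image of shape (n_bins, len(x)); odd window length assumed."""
    if len(window) > n_bins:
        raise ValueError("window must not be longer than the number of frequency bins")
    half = len(window) // 2
    padded = np.concatenate([np.zeros(half), x, np.zeros(half)])
    # Row ti holds x[ti - half .. ti + half], i.e. the signal seen through the window at ti.
    frames = np.lib.stride_tricks.sliding_window_view(padded, len(window)) * window
    buf = np.zeros((len(x), n_bins), dtype=complex)
    buf[:, np.arange(-half, half + 1) % n_bins] = frames   # lag tau stored at tau mod n_bins
    return np.fft.fft(buf, axis=1).T

def istft(tfr, window):
    """Invert the image above by reading the zero-lag sample of each column."""
    half = len(window) // 2
    frames = np.fft.ifft(tfr.T, axis=1)
    return frames[:, 0] / window[half]

fs = 1200
t = np.arange(600) / fs
x = np.sin(2 * np.pi * 8.0 * t)
half = 16
window = np.exp(-0.5 * ((np.arange(2 * half + 1) - half) / 6.0) ** 2)   # height 1 at centre
image = stft(x, window, n_bins=64)
x_rec = istft(image, window).real
```

For an unmodified image the reconstruction is exact; for a fused or filtered image this inverse is only one of several possible synthesis choices.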

T-F Image Fusion Based on Two-Dimensional Multiscale Wavelet Transform.
Among the many image fusion technologies, image fusion methods based on the two-dimensional multiscale wavelet transform have become a research hotspot [14][15][16]. Let $f(x, y)$ denote a two-dimensional signal, where $x$ and $y$ are its abscissa and ordinate, respectively. Let $\psi(x, y)$ denote a two-dimensional basic wavelet and $\psi_{a;b_1,b_2}(x, y)$ denote the scale expansion and two-dimensional displacement of $\psi(x, y)$:
$$\psi_{a;b_1,b_2}(x, y) = \frac{1}{a}\, \psi\left(\frac{x - b_1}{a}, \frac{y - b_2}{a}\right). \tag{14}$$
Then, the two-dimensional continuous wavelet transform is
$$WT_f(a; b_1, b_2) = \frac{1}{a} \iint f(x, y)\, \psi^{*}\left(\frac{x - b_1}{a}, \frac{y - b_2}{a}\right) dx\, dy. \tag{15}$$
The factor $1/a$ in equation (15) is a normalization factor that ensures that the energy remains unchanged before and after wavelet expansion. The two-dimensional multiscale wavelet decomposition and reconstruction process is shown in Figure 1.
The decomposition process can be described as follows. First, one-dimensional discrete wavelet decomposition is performed on each row of the image, which yields the low-frequency component L and the high-frequency component H of the original image in the horizontal direction. The low-frequency and high-frequency components are the low-frequency and high-frequency wavelet coefficients obtained after wavelet decomposition, respectively. Then, one-dimensional discrete wavelet decomposition is performed on each column of the transformed data to obtain the component LL, which is low frequency in both the horizontal and vertical directions; the component LH, which is low frequency in the horizontal direction and high frequency in the vertical direction; the component HL, which is high frequency in the horizontal direction and low frequency in the vertical direction; and the component HH, which is high frequency in both directions. The reconstruction process can be described as follows. First, one-dimensional discrete wavelet reconstruction is performed on each column of the transform result.
Then, one-dimensional discrete wavelet reconstruction is performed on each row of the transformed data to obtain the reconstructed image.
The T-F image obtained by the STFT is decomposed into $N$ layers by two-dimensional multiscale wavelet decomposition, as shown in Figure 1, and $3N + 1$ frequency components are obtained. In this study, the $LL_N$ obtained at the highest decomposition level is defined as the low-frequency component, and the $HL_M$, $LH_M$, and $HH_M$ ($M = 1, 2, \ldots, N$) obtained at each level are defined as high-frequency components. The active components of the EEG are mainly contained in the low-frequency component. The fusion process of two T-F images based on the two-dimensional multiscale wavelet transform is shown in Figure 2. First, T-F images 1 and 2 are decomposed by two-dimensional wavelet decomposition, and then the corresponding low-frequency components and high-frequency components are fused according to their respective fusion rules. Finally, two-dimensional wavelet reconstruction is performed on the fused low-frequency and high-frequency components to obtain the fused image. In this study, the two-dimensional wavelet decomposition was performed using MATLAB's wavedec2 function, and the two-dimensional wavelet reconstruction was performed using MATLAB's waverec2 function.
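The row-then-column decomposition and its inverse can be sketched with the Haar wavelet, which is the simplest member of the Daubechies family. This is an assumption-laden stand-in for wavedec2 and waverec2, not a reimplementation of them: the function names are hypothetical, only the Haar basis is supported, and the image sides must be divisible by 2 to the power of the number of levels.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(img):
    """One decomposition level: rows first (horizontal), then columns (vertical)."""
    lo = (img[:, 0::2] + img[:, 1::2]) / SQRT2       # horizontal low-pass
    hi = (img[:, 0::2] - img[:, 1::2]) / SQRT2       # horizontal high-pass
    LL = (lo[0::2] + lo[1::2]) / SQRT2
    LH = (lo[0::2] - lo[1::2]) / SQRT2               # low horizontal, high vertical
    HL = (hi[0::2] + hi[1::2]) / SQRT2               # high horizontal, low vertical
    HH = (hi[0::2] - hi[1::2]) / SQRT2
    return LL, (LH, HL, HH)

def haar_idwt2(LL, highs):
    """Inverse of haar_dwt2: columns first, then rows."""
    LH, HL, HH = highs
    rows, cols = LL.shape
    lo = np.empty((2 * rows, cols), dtype=LL.dtype)
    hi = np.empty_like(lo)
    lo[0::2], lo[1::2] = (LL + LH) / SQRT2, (LL - LH) / SQRT2
    hi[0::2], hi[1::2] = (HL + HH) / SQRT2, (HL - HH) / SQRT2
    img = np.empty((2 * rows, 2 * cols), dtype=LL.dtype)
    img[:, 0::2], img[:, 1::2] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return img

def wavedec2_haar(img, levels):
    """Multiscale decomposition: keep splitting the LL band."""
    details, LL = [], img
    for _ in range(levels):
        LL, highs = haar_dwt2(LL)
        details.append(highs)
    return LL, details

def waverec2_haar(LL, details):
    for highs in reversed(details):
        LL = haar_idwt2(LL, highs)
    return LL

rng = np.random.default_rng(3)
image = rng.standard_normal((16, 32)) + 1j * rng.standard_normal((16, 32))
LL_N, details = wavedec2_haar(image, levels=2)
n_components = 1 + sum(len(h) for h in details)      # 3N + 1 sub-bands
restored = waverec2_haar(LL_N, details)
```

With two levels the decomposition yields seven sub-bands, and reconstruction from the unmodified sub-bands returns the original complex image.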

Image Mean Filtering.
For the current pixel to be processed, a template is selected that is composed of several pixels adjacent to the current pixel. Replacing the value of the original pixel with the mean of the template is called mean filtering. Defining $f(x, y)$ as the pixel value at coordinates $(x, y)$ and $T(x, y)$ as the set of template pixels around $(x, y)$, the filtered pixel $g(x, y)$ can be calculated by
$$g(x, y) = \frac{1}{D} \sum_{(u, v) \in T(x, y)} f(u, v), \tag{16}$$
where $D$ represents the square of the template size, that is, the number of pixels in the template.
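A minimal NumPy sketch of the mean filter is shown below. The border handling (edge replication) is an assumption, since the text does not say how pixels near the image border are treated, and the function name is hypothetical.

```python
import numpy as np

def mean_filter(img, size):
    """Replace each pixel by the mean of a size x size template (odd size)."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")            # border handling is an assumption
    out = np.zeros(img.shape, dtype=np.result_type(img.dtype, np.float64))
    # Sum the size*size shifted copies of the image, then divide by D = size**2.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

img = np.arange(9, dtype=float).reshape(3, 3)
smoothed = mean_filter(img, 3)
flat = mean_filter(np.full((4, 5), 2.0 + 1.0j), 3)   # complex images are supported
```

The centre pixel of the 3×3 example becomes the mean of all nine values, and a constant image is left unchanged.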

Summary of the Proposed Method for EEG Enhancement Based on T-F Image Fusion.
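Here, the T-F image fusion method for the electrode signals $x_1(t)$ and $x_2(t)$ is introduced. The fusion process is shown in Figure 3.
(1) STFT is performed on the two electrode signals $x_1(t)$ and $x_2(t)$ to obtain T-F images 1 and 2.
(2) T-F images 1 and 2 are each decomposed by two-dimensional multiscale wavelet decomposition.
(3) The wavelet coefficients of the two images are fused according to the following rules.
(a) The low-frequency components of T-F images 1 and 2 are subtracted to obtain the low-frequency component of the fused image F:
$$LL_{N,F} = LL_{N,1} - LL_{N,2}. \tag{17}$$
The low-frequency components represent the active components of the SSMVEP, and the subtraction of the low-frequency components removes the common noise in the two electrode signals.
(b) For each high-frequency component $C_M \in \{HL_M, LH_M, HH_M\}$, the larger of the two coefficients is taken as the coefficient of the fused image F:
$$C^{i}_{M,F} = \begin{cases} C^{i}_{M,1}, & \mathrm{Re}\left(C^{i}_{M,1}\right) \geq \mathrm{Re}\left(C^{i}_{M,2}\right), \\ C^{i}_{M,2}, & \text{otherwise}. \end{cases} \tag{18}$$
The two-dimensional signal after the STFT is a complex signal, and the two-dimensional signal after two-dimensional wavelet decomposition is still a complex signal. The value used for the comparison is the real part of the complex signal. Here, $i$ indexes the frequency components of the $M$th decomposition layer (i.e., the wavelet coefficients).
(4) Two-dimensional wavelet reconstruction is performed on the fused coefficients to obtain the fused image.
(5) The fused image is subjected to mean filtering, and the fused time-domain signal is obtained by ISTFT.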
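The two coefficient fusion rules can be sketched as below, assuming the rules as summarized in the Discussion: subtraction of the low-frequency coefficients, and selection of the high-frequency coefficient whose real part is larger. The function names and the toy coefficient arrays are hypothetical.

```python
import numpy as np

def fuse_low(ll_1, ll_2):
    """Low-frequency rule: subtract the LL_N coefficients of the two images."""
    return ll_1 - ll_2

def fuse_high(c_1, c_2):
    """High-frequency rule: keep the coefficient with the larger real part."""
    return np.where(c_1.real >= c_2.real, c_1, c_2)

ll_1 = np.array([[2.0 + 1.0j, 1.0 + 0.0j]])
ll_2 = np.array([[0.5 + 1.0j, 1.5 - 2.0j]])
hh_1 = np.array([[1.0 + 5.0j, -2.0 + 0.0j]])
hh_2 = np.array([[0.5 - 1.0j, 3.0 + 2.0j]])

ll_fused = fuse_low(ll_1, ll_2)
hh_fused = fuse_high(hh_1, hh_2)
```

In the toy example the first high-frequency coefficient is taken from image 1 and the second from image 2, while the full complex value (not only the real part) is carried into the fused image.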

Parameter Selection
In this study, the EEG time-domain signals were subjected to STFT and ISTFT using the MATLAB library functions tfrstft and tfristft. The parameters that need to be set are the number of frequency bins and the frequency smoothing window. If the number of frequency bins is too large, the whole process becomes slow. In this study, the candidate values of this parameter were [32 54 76 98 120]. Since the Fourier transform of a Gaussian function is also a Gaussian function, the window function with optimal time localization for the STFT is a Gaussian function. The EEG is a nonstationary signal [17] and requires a window function with high time resolution; that is, the window length should not be too long.
In this paper, a Gaussian window was selected, and the candidate values of the window length were [37 47 57 67 77 87 97].

Two-Dimensional Multiscale Wavelet Transform Parameters.
In this study, the MATLAB library functions wavedec2 and waverec2 were used to perform two-dimensional multiscale wavelet decomposition and reconstruction on T-F images. Here, the number of layers of wavelet decomposition and the wavelet basis function need to be set. The sampling frequency of the EEG acquisition device used in this paper was 1200 Hz, and the stimulation frequency of the SSMVEP was 7-10.8 Hz.

Parameter Selection.
The T-F image fusion method proposed in this study requires the following five parameters to be set: the number of frequency bins, the frequency smoothing window, the vanishing moments of the db N wavelet functions, the mean filter template size, and the value of $LL_{N,F}$. In this study, the grid search method was used to traverse all parameter combinations to find the combination that best enhances the SSMVEP active components. Six rounds of experiments were conducted. The first three rounds of experimental data were used to select the optimal combination of parameters for T-F image fusion. With the selected combination of parameters, the last three rounds of experimental data were used to compare the enhancement effects of minimum energy fusion, maximum contrast fusion, CCA fusion, and T-F image fusion on SSMVEP active components. The disadvantage of grid search is its large computational cost; however, considering the difficulty of hyperparameter selection, grid search is still a reliable method.
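The exhaustive traversal can be sketched as follows. The candidate values for the number of frequency bins and the window length are the ones listed in the text; the candidate lists for the wavelet order and the template size, and the scoring function, are hypothetical placeholders. In the study the score would be the recognition accuracy on the first three rounds of data, whereas `toy_score` below exists only to exercise the loop and does not represent any experimental result.

```python
from itertools import product

grid = {
    "n_freq_bins": [32, 54, 76, 98, 120],             # from the text
    "window_length": [37, 47, 57, 67, 77, 87, 97],    # from the text
    "db_order": [1, 2, 3],                            # hypothetical candidates
    "template_size": [3, 5],                          # hypothetical candidates
}

def toy_score(params):
    """Placeholder objective with an arbitrary optimum; not real accuracy data."""
    return -(abs(params["n_freq_bins"] - 76) + abs(params["window_length"] - 47)
             + abs(params["db_order"] - 2) + abs(params["template_size"] - 3))

def grid_search(grid, score):
    """Evaluate every combination and keep the best-scoring one."""
    names = list(grid)
    best_params, best_score, n_evaluated = None, float("-inf"), 0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        n_evaluated += 1
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score, n_evaluated

best_params, best_score, n_evaluated = grid_search(grid, toy_score)
```

The number of evaluations is the product of the list lengths, which is why the cost grows quickly as parameters or candidate values are added.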

Electrode Signal Selection for T-F Image Fusion.
The T-F image fusion method analyzes two electrode signals. Thus, two suitable electrodes need to be selected from the available electrodes.
The bipolar fusion signal is obtained by subtracting two original EEG signals. In equation (17), we subtracted the wavelet low-frequency components of the two T-F images to obtain the wavelet low-frequency components of the fused image. The subtraction of the low-frequency components removes the common noise in the electrode signals, which is similar to the principle of bipolar fusion. Therefore, the two electrode signals with the best bipolar fusion effect were used in the T-F image fusion. For SSVEP or SSMVEP stimulation, the Oz electrode usually has the strongest response [18] and is most commonly used to analyze SSVEP or SSMVEP [19]. The Oz electrode was therefore used as one of the analysis electrodes for bipolar fusion. In this study, the Oz electrode was fused with each of the other five electrode signals so that the best combination of electrodes could be selected. In bipolar fusion, whether the Oz electrode is the minuend or the subtrahend does not affect the experimental results, because reversing the order only changes the sign of the fused signal and leaves its power spectrum unchanged. For example, Oz-POz and POz-Oz have the same fusion effect. In this study, the Oz electrode was used as the minuend. Twenty trials per subject were used for electrode selection. Welch power spectrum analysis [20] was performed on the bipolar fusion signals of each subject.
A Gaussian window with a length of 5 s was used. For the Welch periodogram, the overlap was 256 samples and the number of discrete Fourier transform (DFT) points was 6000. The 20 focused targets correspond to 20 stimulus frequencies, and the frequency with the highest power spectrum amplitude among the 20 stimulus frequencies was identified as the focused target frequency. A high recognition accuracy indicates that the amplitude of the power spectrum at the stimulation frequency is prominent, which in turn indicates that the fusion method effectively enhances the SSMVEP active component. Table 1 shows the recognition accuracy of all 10 subjects under bipolar fusion. Fusing the Oz electrode with different electrodes gave different fusion effects. Moreover, owing to individual differences, the electrode combination with the highest recognition accuracy differed between subjects. The asterisk (*) marks the bipolar fusion electrode group with the highest recognition accuracy for each subject, and the corresponding electrode combination was used in T-F image fusion.
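The target identification rule can be sketched as follows. This is a simplified stand-in for the Welch analysis: it uses a single Gaussian-windowed periodogram of one 5 s trial with 6000 DFT points, so that the frequency resolution is 0.2 Hz and every stimulus frequency falls exactly on a DFT bin. The Gaussian width, the function name, and the synthetic bipolar signal are assumptions.

```python
import numpy as np

FS = 1200                                     # sampling rate from the text
NFFT = 6000                                   # DFT points from the text
STIM_FREQS = np.round(7.0 + 0.2 * np.arange(20), 1)   # 7.0, 7.2, ..., 10.8 Hz

def identify_target(x, fs=FS, nfft=NFFT):
    """Return the stimulus frequency with the largest power spectrum amplitude."""
    n = len(x)
    # Gaussian window; the width n/6 is an assumption, the text gives only the length.
    window = np.exp(-0.5 * ((np.arange(n) - (n - 1) / 2) / (n / 6)) ** 2)
    spectrum = np.abs(np.fft.rfft(x * window, nfft)) ** 2
    bins = np.round(STIM_FREQS * nfft / fs).astype(int)
    return STIM_FREQS[np.argmax(spectrum[bins])], spectrum[bins]

rng = np.random.default_rng(4)
t = np.arange(5 * FS) / FS
oz = np.sin(2 * np.pi * 9.4 * t) + rng.standard_normal(t.size)
poz = 0.3 * np.sin(2 * np.pi * 9.4 * t) + rng.standard_normal(t.size)
bipolar = oz - poz                            # Oz as the minuend

target, amplitudes = identify_target(bipolar)
target_reversed, _ = identify_target(poz - oz)
```

Reversing the subtraction order gives the same identified frequency, since the power spectrum does not depend on the sign of the signal.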

Comparison of Enhancement Effects of Minimum Energy Fusion, Maximum Contrast Fusion, CCA Fusion, and T-F Image Fusion on SSMVEP Active Components.
T-F image fusion used two channel signals, whereas CCA fusion, maximum contrast fusion, and minimum energy fusion used all six channel signals. In this study, the recognition accuracy was used as the evaluation standard of the fusion effect. A high online accuracy indicates that the amplitude at the stimulus frequency is prominent, which shows the effectiveness of the fusion method. For T-F image fusion, Welch power spectrum analysis (with the same parameters as in Section 3.2) was performed on the signal obtained after T-F image fusion, and the frequency with the highest amplitude among the twenty stimulus frequencies was identified as the focused target frequency. Minimum energy fusion, maximum contrast fusion, and CCA fusion require prior knowledge of the frequency of the signal to be fused. However, this frequency cannot be determined beforehand because the user's focused target is unknown. For the current test signal, minimum energy fusion, maximum contrast fusion, and CCA fusion therefore first perform the fusion analysis at all stimulus frequencies, and the frequency with the highest amplitude is then identified as the focused target frequency. Take CCA fusion as an example, and assume that the current test signal is Y. First, 20 template signals are constructed according to equation (11), and then 20 sets of spatial filter coefficients are obtained according to equation (10). The test signal is fused with each of the 20 sets of spatial filter coefficients. The resulting 20 fused signals are separately analyzed by the Welch power spectrum (with the same parameters as in Section 3.2) to obtain the amplitude at the corresponding frequency. Finally, the frequency with the highest amplitude is identified as the focused target frequency.
The first three rounds of experimental data were used to select the parameters of the T-F image fusion (see details in Section 3.1), and the last three rounds of experimental data were used to test the online fusion effects of the four methods. Figures 4(a)-4(d) show the analysis results plotted using the data of subject 4 in the fourth round of the experiment. In Figure 4, a red flag f indicates that the target was misidentified, and a blue flag f indicates that the target was correctly identified. Figures 4(a)-4(d) show that the T-F image fusion method has the best online performance and the minimum energy fusion method has the worst online performance. Maximum contrast fusion and CCA fusion have similar fusion effects.
In the online analysis, the power spectrum amplitudes of the test signal at all stimulus frequencies were calculated, and the frequency corresponding to the maximum amplitude was identified as the focused target frequency. A recognition error occurs when the amplitude at the focused target frequency is lower than the amplitude at a nonfocused target frequency. Therefore, the power spectrum amplitude at the focused target frequency can be regarded as the active component of the signal, and the power spectrum amplitudes at the remaining nonfocused target frequencies can be regarded as noise. The SNR in this study was defined as the ratio of the power spectrum amplitude at the focused target frequency to the mean of the power spectrum amplitudes at the remaining nonfocused target frequencies. The SNRs of the ten subjects at all stimulus frequencies were averaged, and the results are shown in Figure 5. Figure 5 shows that the T-F image fusion method obtained the highest SNR, which indicates that the proposed T-F image fusion method can effectively enhance the SSMVEP SNR. The online accuracies of each subject in the three rounds of experiments were averaged, and the results are shown in Figure 6. Most subjects obtained their highest online accuracy under T-F image fusion, and some subjects obtained their highest online accuracy under CCA fusion or maximum contrast fusion. Only subject 9 obtained the highest online accuracy under minimum energy fusion. The online accuracies of all the subjects were then averaged, and the results are shown in Figure 7. The results show that the online performance of the T-F image fusion method is better: its online accuracy is 6.17%, 6.06%, and 30.50% higher than that of CCA fusion, maximum contrast fusion, and minimum energy fusion, respectively.
The paired t-test shows a significant difference between the accuracies of T-F image fusion and minimum energy fusion (p = 0.0174). The online accuracy analysis shows that the proposed method is better than the traditional time-domain fusion methods.
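The SNR definition used in this comparison can be written as a short function. The sketch below follows the definition in the text; the function name and the five-value example are hypothetical and are not taken from the experimental data.

```python
import numpy as np

def spectral_snr(amplitudes, target_index):
    """Amplitude at the focused target frequency divided by the mean of the others."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    others = np.delete(amplitudes, target_index)
    return amplitudes[target_index] / others.mean()

# Toy power spectrum amplitudes at five candidate frequencies (illustrative values).
amplitudes = [2.0, 1.0, 1.0, 4.0, 2.0]
snr = spectral_snr(amplitudes, target_index=3)
correctly_identified = int(np.argmax(amplitudes)) == 3
```

In this example the target amplitude is 4.0 and the mean of the other four amplitudes is 1.5, so the SNR is 8/3 and the target would be identified correctly.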

Discussion
The SSMVEP collected from the scalp contains a lot of noise, and an effective signal processing method is required to enhance its active components. Spatial filtering methods utilize the EEG information of multiple channels and help to enhance the active components of the SSMVEP. At present, the commonly used spatial filtering methods include average fusion, native fusion, bipolar fusion, Laplacian fusion, CAR fusion, CCA fusion, minimum energy fusion, and maximum contrast fusion. The first five spatial filtering methods can only process preselected electrode signals. Minimum energy fusion, maximum contrast fusion, and CCA fusion overcome this limitation and can fuse any number of electrode signals with better fusion effects. Minimum energy fusion, maximum contrast fusion, and CCA fusion fuse the SSMVEP in the time domain. This study put forward the idea of fusing EEG in the T-F domain and proposed a T-F image fusion method. We compared the enhancement effects of minimum energy fusion, maximum contrast fusion, CCA fusion, and T-F image fusion on the SSMVEP. The test data verify that the proposed method is better than the traditional time-domain fusion methods.
T-F analysis is a good choice for transforming EEG data from one dimension to two dimensions. EEG transformed into the T-F domain can be analyzed as an image. The STFT has an inverse transform, which can transform the fused T-F image back into a time-domain signal. Therefore, the STFT was used to transform the EEG from the time domain to the T-F domain. The two-dimensional wavelet transform can fuse two images into one and thereby achieve multichannel EEG fusion. After two-dimensional wavelet decomposition, the low-frequency components of the two subimages were subtracted to obtain the low-frequency components of the fused image, and the maximum value of the high-frequency components of the two subimages was taken as the high-frequency component of the fused image. The low-frequency components after wavelet decomposition represent the active components of the SSMVEP. The subtraction of the low-frequency components can remove the common noise in the electrode signals.
This is similar to bipolar fusion, which subtracts two electrode signals in the time domain. We also compared the enhancement effects of bipolar fusion and the proposed method on SSMVEP active components. The electrodes used for bipolar fusion were the same as those used in T-F image fusion. The results show that the proposed method is better than bipolar fusion (the online accuracy of bipolar fusion is 71.52%). In this study, the fused image was filtered with a mean filter to further achieve low-pass filtering. We also tried removing the mean filtering step after completing the T-F image fusion.
The results show that removing the mean filtering step reduced the fusion effect. In this study, the mean filtering step was performed after T-F image fusion. We also tested the fusion effect when the mean filtering step was performed before T-F image fusion (that is, on the subimages after the STFT). For some subjects, it was better to perform the mean filtering step after T-F image fusion. Therefore, we recommend that researchers follow the analysis steps in this study. The wavelet low-frequency coefficients of the fused image (see equation (17)) have an important influence on the fusion effect. If T-F image fusion is performed on six electrode signals at the same time, six sets of wavelet low-frequency coefficients are obtained. Here, each set of low-frequency coefficients is a two-dimensional signal. Referring to equation (1), we can assign a coefficient to each two-dimensional signal and obtain the fused two-dimensional signal (i.e., the wavelet low-frequency coefficients of the fused image) by linear summation. This study explored the feasibility of applying the spatial filter coefficients of CCA fusion and maximum contrast fusion to T-F image fusion when six electrode signals are fused simultaneously. Since the fusion effect of minimum energy fusion was poor, its spatial filter coefficients were not used here. The spatial filter coefficients of maximum contrast fusion and CCA fusion at the frequency $f$ were obtained by equations (9) and (10) and are denoted $V_{max}$ and $V_{cca}$, respectively. $V_{max}$ and $V_{cca}$ are two vectors with dimensions of 6 × 1, where $V_{max}(1, 1)$ and $V_{cca}(1, 1)$ represent the first elements of the vectors $V_{max}$ and $V_{cca}$. Since the T-F image fusion method was performed on six electrode signals, equation (17) is transformed into equation (19), where $V$ corresponds to the spatial filter coefficient $V_{max}$ or $V_{cca}$:
$$LL_{N,F} = \sum_{k=1}^{6} V(k, 1)\, LL_{N,k}. \tag{19}$$
The rest of the analysis process is the same as that in Section 2.10. The same analysis steps were performed for all 20 targets (7-10.8 Hz). Welch spectrum analysis (with the same parameters as in Section 3.2) was performed on the signal obtained after T-F image fusion. Figures 8(a) and 8(b) show the Welch power spectra plotted using the data of subject 4 in the fourth round of the experiment. Figure 8(a) shows the power spectrum obtained with the spatial filter coefficients of CCA fusion, and Figure 8(b) shows the power spectrum obtained with the spatial filter coefficients of maximum contrast fusion, where f represents the stimulus frequency and the red circle indicates the amplitude at the stimulus frequency. Figures 8(a) and 8(b) show that the amplitudes at the stimulation frequencies are prominent. The spatial filter coefficients of CCA fusion and maximum contrast fusion are therefore effective for T-F image fusion. Thus, the T-F image fusion method proposed in this study focuses on the fusion of the wavelet low-frequency coefficients of multiple images (by assigning a coefficient to each two-dimensional signal and obtaining the fused two-dimensional signal by linear summation). The premise of the above test results is that the frequency of the signal to be fused is known. When the online test method was used (see details in Section 3.3), the same online fusion results as those of CCA fusion and maximum contrast fusion were obtained. T-F image fusion with only two electrode signals achieved a better online fusion effect than CCA fusion and maximum contrast fusion, which shows the effectiveness of the proposed method. Next, we will explore fusion methods for multiple T-F images (more than two images), that is, methods for finding suitable spatial filter coefficients for the wavelet low-frequency components of multiple T-F images. The parameters affect the fusion effect of the proposed method.
We listed the parameters that need to be selected and their possible values in Section 3.1. In this study, the grid search method was used to traverse all parameter combinations, and the best combination was found. In the experiment, we found that the number of frequency bins and the Gaussian window length can be fixed to 54 and 57, respectively. We recommend that researchers determine the parameters according to the parameter selection principles and ranges given in Section 3.1. In this study, the SSMVEP active component was enhanced by fusing multichannel signals into a single-channel signal, and the focused target frequency was then identified by performing spectrum analysis on the fused signal. This is beneficial for SSMVEP studies based on spectrum analysis. For example, Reference [21] proposed a frequency and phase mixed coding method for the SSVEP-based brain-computer interface (BCI), which increases the number of BCI coding targets by making one frequency correspond to multiple different phases. In that study, FFT analysis of the test signal is required to find the possible focused target frequency and then calculate the phase value at that frequency. The method proposed in this study helps to find the focused target frequency in the spectrum accurately. Moreover, some EEG feature extraction algorithms for the BCI also require spectrum analysis of the test signals [22,23]. Therefore, the method proposed in this study has potential application value.

Conclusion
To explore whether T-F domain analysis can achieve better fusion effects than time-domain analysis, this study proposed an SSMVEP enhancement method based on T-F image fusion. The parameters of the T-F image fusion algorithm were determined by the grid search method, and the electrode signals used for T-F image fusion were selected by bipolar fusion. The analysis results show that the key to the T-F image fusion algorithm is the fusion of the wavelet low-frequency components. This study compared the enhancement effects of minimum energy fusion, maximum contrast fusion, CCA fusion, and T-F image fusion on the SSMVEP. The experimental results show that the online performance of the T-F image fusion method is better than that of the traditional spatial filtering methods, which indicates that it is feasible to fuse the SSMVEP in the T-F domain with the proposed method.

Data Availability
The analytical data used to support the findings of this study are included within the article, and the raw data are available from the corresponding author upon request.

Ethical Approval
The subjects provided informed written consent, in accordance with the protocol approved by the Institutional Review Board of Xi'an Jiaotong University.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
Wenqiang Yan worked on the algorithm and wrote the paper. Guanghua Xu was involved in discussions and gave suggestions throughout this project. Longting Chen and Xiaowei Zheng worked on the experimental design. All authors read and approved the final manuscript.