A deep learning framework with multi-perspective fusion for interictal epileptiform discharges detection in scalp electroencephalogram

Objective. Interictal epileptiform discharges (IEDs) are an important and widely accepted biomarker used in the diagnosis of epilepsy based on scalp electroencephalography (EEG). Because the visual detection of IEDs has various limitations, including high time consumption and high subjectivity, a faster, more robust, and automated IED detector is strongly in demand. Approach. Based on deep learning, we proposed an end-to-end framework with multi-scale morphologic features in the time domain and correlation in sensor space to recognize IEDs from raw scalp EEG. Main Results. Based on a balanced dataset of 30 patients with epilepsy, the results of the five-fold (leave-6-patients-out) cross-validation show that our model achieved state-of-the-art detection performance (accuracy: 0.951, precision: 0.973, sensitivity: 0.938, specificity: 0.968, F1 score: 0.954, AUC: 0.973). Furthermore, our model maintained excellent IED detection rates in an independent test on three datasets. Significance. The proposed model could be used to assist neurologists in clinical EEG interpretation of patients with epilepsy. Additionally, this approach combines multi-level output and correlation among EEG sensors and provides new ideas for epileptic biomarker detection in scalp EEG.


Introduction
Epilepsy, caused by abnormally high discharges from neurons, is a chronic brain disease and is characterized by recurrent seizures [1]. Patients often suffer from uncontrollable movement, loss of consciousness, or temporary confusion during seizures. These symptoms adversely affect or threaten daily activities and work, and even the lives, of patients. According to the Epilepsy Foundation and the World Health Organization [2], approximately 65 million people worldwide are diagnosed with epilepsy.
Scalp electroencephalography (EEG) is a noninvasive signal acquisition method for clinically recording brain activity. It is characterized by high temporal resolution and is very important in the diagnosis, treatment, therapeutic evaluation, and pathological research of epilepsy [3][4][5]. Seizure prediction based on scalp EEG is a topic of great interest in the epilepsy community [6]. However, seizure events are infrequent. Interictal epileptiform discharges (IEDs) that appear in the EEG are a hallmark of epilepsy and are widely adopted in clinical evaluations [7]. Unlike seizure events, IED events are more frequent and appear in raw scalp EEG signals in various forms, such as spikes, sharp waves, spike and slow wave complexes, and polyspike wave complexes [8]. Because these discharges are related to a high likelihood of recurrent seizures [9], IED detection based on scalp EEG is significant for the diagnosis of epilepsy and the evaluation of seizure risk. Visual analysis by a human electroencephalographer is the reliable, primary method for IED detection in clinical diagnosis [10]. However, this work is very tiring, tedious, and time-consuming. Moreover, because of the variety of and individual differences among IEDs, detection accuracy is susceptible to several subjective factors, such as the experience and working state of the reviewer. A large-scale study showed significant variability in IED annotation even among experts [10]. Therefore, it would be useful to design a computer-assisted method for the automatic detection of IEDs.
In the past few decades, several algorithms, such as template matching [11,12], subband decomposition [13], and morphological filtering [14], have been proposed for the detection of IEDs. In these methods, the similarity between a waveform of interest and an IED template is measured, and it is then determined whether the waveform is an IED. Some researchers have also established feature engineering methods based on time domain, frequency domain, or nonlinear features of scalp EEG, detecting IEDs via classifiers applied to one or more features, such as decision trees [15], artificial neural networks [16], AdaBoost [17], and clustering [18]. Although the aforementioned automated IED detection methods have achieved good results, several obstacles remain to their clinical application. The main challenges are as follows: (a) Complicated preprocessing. There are numerous artifacts (caused by eye movement and chewing, among others) in scalp EEG that cannot be removed by filters. IEDs are difficult to distinguish from these artifacts. In the aforementioned approaches, a number of hand-crafted or automatic methods for removing the artifacts have been designed. However, these methods require large amounts of work and may damage the epileptiform transients in the signal. (b) Poor performance in cross-group tests. Differences in seizure type, focus, age, and even gender may result in IEDs with different morphologies or patterns [19,20]. Typical IEDs are shown in figure 1; they vary widely in form and size. In addition, the scalp EEGs of some subjects may have one or more bad channels. Because of these factors, the features of IEDs cannot be extracted deeply and fully via feature engineering methods designed from subjective experience. Thus, automatic IED recognition currently lacks robustness and is difficult to apply in clinical settings.
With the rapid development of machine learning, deep learning has become the focus of research in many fields. Compared to classical methods, deep structures with nonlinear mapping, driven by big data, are able to find more generalized sample features, which improve performance in cross-group and cross-center situations. Two of the main architectures currently being explored in research related to deep learning are convolutional neural network (CNN) and recurrent neural network (RNN) [21].
A CNN is an architecture inspired by the natural visual-perception mechanism and can identify visual patterns in an original image without large amounts of preprocessing [22]. Several CNN architectures have exhibited good performance in medical signal processing [23,24]. The multi-layer structure of a CNN can independently learn multiple levels of features and helps extract sample features regardless of size, shape, and position. Therefore, CNNs have been used for IED detection by several researchers. Combining a 2D CNN, dropout, and max pooling layers, Tjepkema-Cloostermans et al [25] proposed a model for detecting IEDs in 2 s scalp EEG. The model had a cross-group sensitivity of 47.4% (specificity was 98.0% and the false positive rate (FPR) was 0.6 min−1). Lourenço et al [26] also aimed to detect IEDs in 2 s scalp EEG but used VGG, one of the classical CNN architectures. Their model exhibited a cross-group sensitivity of 79% (specificity was 99%). Thomas et al [27] improved a 1D CNN IED detector and validated it on scalp EEG datasets recorded at different institutions. Their detector achieved a mean cross-validation area under the precision-recall curve of 0.833 and a false detection rate of 0.2 min−1 at a sensitivity of 80%. In work parallel to theirs, Jing et al [28] designed an IED detector consisting of a 2D CNN and achieved an area under the curve (AUC) of 0.98. However, these CNN architectures depend only on top-level features (i.e. the last layer of the convolution stack) and may ignore small objects (such as the single spike shown in figure 1(c)) because of the lack of location information in the top layer.
In contrast to CNNs, which are sensitive to local features, RNNs comprise memory cells and perform better in analyzing the correlation between contexts of sequential samples [29]. An RNN utilizes context when processing the input elements of a sequence; thus, each output is determined by all previous inputs. Scalp EEG is a sequential signal from multiple sensors. The chain structure of an RNN can contribute both to analyzing the temporal sequence and to analyzing the correlation among EEG sensors. Therefore, RNNs have been used for seizure detection [30], seizure prediction [31], and IED detection on intracranial EEG [32], among other tasks. However, the use of RNNs for IED detection on scalp EEG has not been reported.
In this study, an end-to-end structure that utilizes the correlation among sensors and multi-level morphologic features of scalp EEG is proposed. For this structure, the process of detecting IEDs involves the extraction of multi-scale morphologic features, multi-channel information fusion, and multi-level decision information fusion. The proposed model was validated on a balanced dataset of 30 patients with epilepsy from different institutions and tested independently on three other datasets. The results show that our method exhibits satisfactory performance and has the potential to help neurologists achieve diagnosis and treatment of epilepsy more efficiently.
The rest of the paper is organized as follows: section 2 introduces the scalp EEG datasets, data preparation, and the architecture of our model. Section 3 illustrates the results and observations. Section 4 discusses the effectiveness of our model and proposes future directions. Section 5 summarizes the conclusions of the study.

Datasets
We randomly selected 36 patients with focal epilepsy from Xuanwu Hospital of Capital Medical University, Beijing, China and Fengtai Youanmen Hospital, Beijing, China. One set of recordings came from an epilepsy monitoring unit, and the other came from an outpatient clinic. The recordings were acquired under Institutional Review Board-approved protocols at the two hospitals. This work is part of a project supported by the Beijing Natural Science Foundation (Grant Number Z200024) and was approved by the biological and medical ethics committee of Beihang University (Approval Number: BM20200155). Tables 1 and 2 show the related information. All patients underwent video-EEG monitoring ranging from 1 to 13 h (in 1 h cycles). During the recording, patients were asked to lie in a supine position and remain awake. The scalp EEG data were recorded according to the international 10-20 electrode system at a 256 Hz sampling rate. The electrode impedances were maintained below 10 kΩ. Informed written consent was obtained from all the patients. A total of 3686 IED events, including spikes and sharp waves, spike (sharp) and slow wave complexes, polyspike complexes, and polyspike and slow wave complexes, were independently annotated by neurologists from the two hospitals.
Additionally, to further test the applicability of our model, a dataset of healthy people and a public dataset were used. In the healthy dataset, scalp EEGs from six healthy volunteers (age: 12-43, three male and three female), with 1 h of monitoring each, were recorded as the control group. The recording protocol was the same as for the patients above. Written informed consent was obtained from each subject. This recording was also part of the Beijing Natural Science Foundation project (Grant Number Z200024) and was approved by the biological and medical ethics committee of Beihang University (Approval Number: BM20200155). A clinical neurophysiologist reviewed the recordings to confirm that they contained no epileptiform transients. The public dataset [33] contains 100 persons (54 people with epilepsy and 46 people without epilepsy, age: 2-77, 60 female and 40 male). In set B, scalp EEGs with sharp transients were recorded according to the international 10-20 electrode system at a 500 Hz sampling rate. A segment with sharp transients from each person was selected by two authors, and the details were provided in (doi.org/10.5061/dryad.xsj3tx99w).

Data preparation
The data preparation consisted of signal preprocessing, dataset division, signal segmentation, and data enhancement.

Signal preprocessing
First, noise in the raw EEG signals, such as baseline drift and 50 Hz power line interference, was attenuated using a bandpass filter between 0.5 and 49 Hz. We then selected 'A1' and 'A2' as reference electrodes, according to the experience of neurologists, and calculated the re-referenced signal. Finally, the EEG signals were normalized channel by channel using min-max normalization:

X′_w = (X_w − min) / (max − min)

where X_w is the EEG signal from channel w, w ∈ [1, 19], max is the maximum voltage in channel w, and min is the minimum voltage.
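A minimal sketch of this preprocessing chain is shown below, assuming the reference channels A1 and A2 are stored as the last two rows of the array; the Butterworth filter order (4) is an assumption, as the paper does not state the filter design.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 256  # sampling rate (Hz), as in the recording protocol

def preprocess(eeg, ref_idx=(19, 20)):
    """0.5-49 Hz bandpass, re-reference to the mean of A1/A2
    (assumed to be rows `ref_idx`), then channel-wise min-max
    normalization of the remaining 19 channels."""
    b, a = butter(4, [0.5, 49.0], btype="bandpass", fs=FS)
    filtered = filtfilt(b, a, eeg, axis=1)
    ref = filtered[list(ref_idx)].mean(axis=0, keepdims=True)
    data = np.delete(filtered, ref_idx, axis=0) - ref  # re-reference
    mn = data.min(axis=1, keepdims=True)
    mx = data.max(axis=1, keepdims=True)
    return (data - mn) / (mx - mn + 1e-12)  # map each channel to [0, 1]

rng = np.random.default_rng(0)
out = preprocess(rng.standard_normal((21, 10 * FS)))
```

After this step, every channel lies in [0, 1], so amplitude differences between recordings do not dominate the learned features.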

Dataset division
Six of the 36 patients with epilepsy were randomly selected for independent testing and clinical effect evaluation. These data were assigned to the 'real dataset'. The other 30 patients were used for training the model and adjusting its parameters. These data were assigned to the 'balanced dataset', where the 30 patients were randomly divided into five groups of six patients each for five-fold cross-validation. The dataset containing the six healthy volunteers was set as the control group and named set A. The public dataset was named set B.
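The patient-wise grouping above can be sketched as follows; the helper name and seed are illustrative, but the key property is that each patient appears in exactly one fold, so no subject leaks between training and evaluation.

```python
import random

def patient_folds(patient_ids, n_folds=5, seed=42):
    """Shuffle the 30 training patients and partition them into
    5 disjoint groups of 6 (the leave-6-patients-out folds)."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    size = len(ids) // n_folds
    return [ids[i * size:(i + 1) * size] for i in range(n_folds)]

folds = patient_folds(range(30))
```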

Signal segmentation
For the balanced dataset, we selected a time window of 1 s (256 samples) for segmenting the EEG signal. EEG segments with IED annotations were extracted as positive samples. To present each IED as completely as possible, we manually realigned the EEG segments containing IED waveforms, as shown in figure 4. Negative samples, equal in number to the positive samples, were extracted from the EEG background (i.e. EEG segments without IED annotations). The model can be constructed and optimized better with balanced samples. In addition, the real dataset was built using a different method of signal segmentation for independently testing our model. For the real dataset, the scalp EEGs were cut into 1 s segments using a sliding window with an overlap of 50%, as shown in figure 4. Sets A and B were segmented in the same way as the real dataset.
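The sliding-window segmentation used for the real dataset (and sets A and B) reduces to a few lines; this sketch assumes a (channels, samples) array and drops any trailing partial window.

```python
import numpy as np

def segment(eeg, win=256, overlap=0.5):
    """Cut a (19, n_samples) recording into 1 s windows
    (256 samples at 256 Hz) with 50% overlap."""
    step = int(win * (1 - overlap))
    starts = range(0, eeg.shape[1] - win + 1, step)
    return np.stack([eeg[:, s:s + win] for s in starts])

epochs = segment(np.zeros((19, 256 * 10)))  # 10 s of signal
```

With a 128-sample step, a 10 s recording yields 19 overlapping epochs, which is why one IED event can receive two or more labels in the real dataset.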

Data enhancement
To increase the diversity of the samples, and to teach the network the desired invariance and robustness properties during the training phase, we augmented the data segments by translating them left or right and by covering channels, to simulate realities in clinical recording such as bad channels and data segment loss. In detail, translating left or right means that entire segments were randomly moved left or right by 0 to 50 data points. Any deficiencies in length caused by the translation were padded with 0 within the 1 s window. Covering channels means that a few channels were selected randomly among the 19 channels and the values in these channels were replaced with 0.
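The two augmentations can be sketched as below; the number of covered channels (`n_cover=2`) is an assumption, as the text only says "a few channels".

```python
import numpy as np

def augment(seg, rng, max_shift=50, n_cover=2):
    """Randomly shift a (19, 256) segment left/right by up to
    `max_shift` samples (zero-padding the gap) and zero out
    `n_cover` randomly chosen channels."""
    out = np.zeros_like(seg)
    shift = rng.integers(-max_shift, max_shift + 1)
    if shift >= 0:  # shift right, pad the left edge with zeros
        out[:, shift:] = seg[:, :seg.shape[1] - shift]
    else:           # shift left, pad the right edge with zeros
        out[:, :shift] = seg[:, -shift:]
    covered = rng.choice(seg.shape[0], size=n_cover, replace=False)
    out[covered] = 0.0  # simulate bad channels
    return out

rng = np.random.default_rng(1)
aug = augment(np.ones((19, 256)), rng)
```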

IED detection algorithm for scalp EEG
The goal of the model is to perform a judgment (IED or non-IED) on an assigned scalp EEG segment with a size of 19 × 256. The construction of the proposed model, illustrated in figure 2, integrates morphological analysis in the time domain, correlation analysis in sensor space, and decision fusion according to multi-scale features.
The varied morphological features of IEDs in the time domain, such as amplitude, duration, and slope, are typical features for detecting IEDs. CNNs have been proven to learn such sophisticated features [27]. However, most CNN architectures depend only on top-level features and thus focus on large objects while ignoring small ones [34] (such as a single spike, only 70-100 ms within a 1 s segment). Considering this, we designed the morphological analysis module based on feature pyramid networks (FPNs) [35]. FPN is a framework for multi-target detection that fuses multi-scale features, combining shallow, content-descriptive features with deep semantic features. The multi-level output of the FPN can capture comprehensive shape features regardless of their size. The architecture of the module is shown in figure 2(a). On the left, five convolution blocks with boundary padding are stacked; each convolution block consists of a 2D-convolution layer with a 1 × 3 kernel, a leaky rectified linear unit (leaky ReLU) layer, a batch normalization (BN) layer, and a max pooling layer. The multi-level outputs supply integrated features of objects of different sizes for further analysis. On the right, three up-sampling blocks are stacked. The input of each up-sampling block is the sum of the output of the prior block and that of the convolution block at the same level on the left. Before the addition, the output of the convolution block at the same level on the left is convolved with a 1 × 1 kernel to ensure consistency in size. The skip connection can transfer location information from the shallow layers to the top layer and alleviate optimization problems, such as vanishing and exploding gradients, caused by increases in network depth. Finally, each layer on the right outputs feature matrices at a different level.
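The pyramid fusion at the heart of this module can be illustrated with a toy numpy sketch: coarse features from the bottom-up (pooling) pathway are upsampled and added to the same-level lateral features. The convolutions, leaky ReLU, and BN of the actual module are omitted here; only the multi-scale arithmetic is shown.

```python
import numpy as np

def max_pool(x, k=2):
    """1-D max pooling along the time axis (stride = kernel = k)."""
    n = x.shape[-1] // k
    return x[..., :n * k].reshape(*x.shape[:-1], n, k).max(axis=-1)

def upsample(x, k=2):
    """Nearest-neighbour upsampling along the time axis."""
    return np.repeat(x, k, axis=-1)

# Bottom-up pathway: repeated pooling yields coarser feature maps.
c1 = np.arange(16.0).reshape(1, 16)
c2 = max_pool(c1)
c3 = max_pool(c2)
# Top-down pathway: upsample the coarse map and add the lateral
# same-level map, so deep semantics meet shallow localization.
p3 = c3
p2 = upsample(p3) + c2
p1 = upsample(p2) + c1
```

Each `p` level retains the temporal resolution of its lateral input, which is why small transients such as single spikes stay visible in the fused output.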
The signal of each channel in the scalp EEG is considered the sum of neural discharges. Additionally, evidence has shown that IEDs do not occur in isolation and that multi-channel co-occurrences are very common [36,37]. Inspired by this, we designed the correlation analysis module in sensor space based on the gated recurrent unit (GRU) [38]. This chain structure, a type of RNN, is able to fuse sequences and analyze their dependencies. In our model, this sequence-modeling capacity is used to fuse EEG channels and extract the distinctive distribution of IEDs, different from that of artifacts, in sensor space. The architecture is shown in figure 2(b). Each level output in module A is connected with an independent correlation analysis module. In module B, each GRU cell state h_t is matched with the signal from one channel x_t and the prior cell state h_(t−1), t ∈ [1, 19]. In other words, the output y depends on all 19 channels: channel 19 directly, and channels 1-18 through the recurrent state, calculated using the reset gate r_t and update gate z_t. Figure 3 shows the structure of the GRU cell, and the specific calculation process is as follows:

r_t = σ(W_r * [h_(t−1), x_t])
z_t = σ(W_z * [h_(t−1), x_t])
h̃_t = tanh(W_h̃ * [r_t ⊙ h_(t−1), x_t])
h_t = (1 − z_t) ⊙ h_(t−1) + z_t ⊙ h̃_t
y = σ(W_y * h_19)

where * represents matrix multiplication, ⊙ represents element-wise multiplication, σ(·) is the sigmoid function, W_r, W_z, W_h̃, and W_y are the weight matrices of the corresponding gates, and h̃_t is the candidate state for x_t.

The four level features are sensitive to IEDs of different sizes. Simple dense layers cannot highlight the features most valuable for recognizing IEDs. The attention mechanism dynamically quantifies the importance of the preliminary results at different scales and of their combinations, relying on the more informative results to enhance feature representation without prior expert knowledge [39]. Aimed at a variety of IEDs, this module can help the model concentrate on the feature scales and combinations relevant to the IEDs at hand, and therefore has the potential to improve detection performance.
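The GRU update above, stepped once per EEG channel, can be written as a minimal numpy sketch (biases omitted for brevity); the dimensions are toy values for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU update following the standard equations: reset gate,
    update gate, candidate state, then a convex combination of the
    previous and candidate states."""
    xh = np.concatenate([h_prev, x_t])
    r = sigmoid(W_r @ xh)                      # reset gate r_t
    z = sigmoid(W_z @ xh)                      # update gate z_t
    h_cand = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]))
    return (1 - z) * h_prev + z * h_cand       # new state h_t

rng = np.random.default_rng(0)
d_in, d_h = 8, 4
W_r, W_z, W_h = (rng.standard_normal((d_h, d_h + d_in)) for _ in range(3))
h = np.zeros(d_h)
for t in range(19):  # one step per EEG channel, t = 1..19
    h = gru_step(rng.standard_normal(d_in), h, W_r, W_z, W_h)
```

Because each h_t is a convex combination of the previous state and a tanh-bounded candidate, the fused sensor-space state stays bounded regardless of the number of channels.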
Thus, we designed a decision fusion module that combines the attention mechanism and a GRU. The architecture of the module is shown in figure 2(c). The four level preliminary results regarding IEDs were obtained from module B. These preliminary results, arranged according to scale, were input into module C. Thereafter, the outputs of the GRU were weighted via attention weights. Finally, the weighted cell outputs were input into a summation layer, and the judgment (IED or non-IED) was obtained via a sigmoid. The specific calculation process is as follows:

h_t = GRU(x_t), t ∈ [1, 4]
a_l = softmax(V_a * tanh(W_a * h_l)), l ∈ [1, 4]
y = σ(W_g * Σ_(l=1)^4 a_l h_l)

where x_t represents the different level features; h_t is the cell state of the GRU; GRU(·) represents the calculation process from x_t to h_t; W_g, W_a, and V_a are weight matrices; a_l is the attention weight; and y is the final judgment (IED or non-IED).

Performance evaluation
Figure 4 illustrates the flow of performance evaluation. First, the recorded raw data were preprocessed, including filtering, re-referencing, and normalization. Note that we did not use any hand-crafted or automatic methods to remove artifacts. In particular, the public dataset was down-sampled to 256 Hz. Second, the data were divided into four datasets. Third, the signals in the four datasets were segmented into 19 × 256 windows using the corresponding method. Fourth, in the balanced dataset only, the data were tripled via random translation or channel covering and were used in an ablation experiment and a comparison experiment via five-fold cross-validation. Finally, the other datasets (the real dataset, set A, and set B) were used to test the applicability of our method in clinical use. In other words, we tested unseen data using a sliding window rather than the pre-segmented epochs of the balanced dataset.
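The attention-based fusion of the four level outputs can be sketched as below. This is a hedged reading of the text: the exact placement of the weight matrices W_g, W_a, and V_a in the original model is not fully specified, so this sketch uses a standard additive-attention form.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse(level_states, W_a, V_a, w_g):
    """Score each of the four level states, normalize the scores
    into attention weights, and squash the weighted sum into a
    single IED probability."""
    scores = np.array([V_a @ np.tanh(W_a @ h) for h in level_states])
    a = softmax(scores)                       # attention weights, sum to 1
    fused = sum(al * h for al, h in zip(a, level_states))
    return a, sigmoid(w_g @ fused)            # final judgment in (0, 1)

rng = np.random.default_rng(0)
states = [rng.standard_normal(4) for _ in range(4)]  # h_1..h_4
a, y = fuse(states, rng.standard_normal((4, 4)),
            rng.standard_normal(4), rng.standard_normal(4))
```

The softmax guarantees the level weights form a probability distribution, so the model can smoothly shift emphasis between coarse and fine scales per input.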

Environment and hyper-parameters
The proposed model was trained and tested on a server equipped with an NVIDIA Tesla K80 GPU and running an Ubuntu 16.04 LTS system. The model was implemented using Python 3.7.9 and the TensorFlow 2.1 framework.
In each phase of five-fold cross-validation, four folds of data were used to train the model, and one fold was used for evaluation. The four folds were further divided into two sets, 80% for training and 20% for validation. The parameters of the proposed model were optimized via minimization of a loss function. A binary cross-entropy cost function was applied because of the sigmoid function in the model. The loss function was:

L = −(1/n) Σ_(i=1)^n [y_i log(ŷ_i) + (1 − y_i) log(1 − ŷ_i)]

where ŷ_i is the output of the model, y_i is the corresponding label, and n is the number of training samples. We selected adaptive moment estimation (Adam) with mini-batches as the optimizer. The batch size was set to 128 based on experience, and the moments beta1 and beta2 were set to 0.91 and 0.999, respectively. The validation loss was obtained after each batch using the validation set. The initial learning rate was set to 10−3 and was shrunk to one-tenth of its value whenever the validation loss did not decrease over five training iterations. Finally, the optimizer saved the model with the lowest validation loss over 50 epochs.
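The loss and learning-rate policy above can be sketched in a few lines; the `schedule` helper is an illustrative re-implementation of the shrink-on-plateau rule, not the paper's exact training code.

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy, matching the loss above."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred))

def schedule(val_losses, lr0=1e-3, patience=5):
    """Shrink the learning rate to one tenth whenever the best
    validation loss has not improved for `patience` consecutive
    evaluations; return the final rate."""
    lr, best, stall = lr0, np.inf, 0
    for loss in val_losses:
        if loss < best:
            best, stall = loss, 0
        else:
            stall += 1
            if stall >= patience:
                lr, stall = lr / 10.0, 0
    return lr

loss = bce(np.array([1, 0, 1]), np.array([0.9, 0.1, 0.8]))
```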

Ablation experiment on balanced dataset
In our study, three modules were designed to achieve end-to-end IED detection in scalp EEG. To verify the contributions of the modules, we regarded module A as the baseline and designed an ablation experiment. Details of the methods and the average results of five-fold cross-validation are outlined in table 3. As shown in table 3, the best results were obtained with the combination of all three modules, and significant improvements were achieved using module A with B or C rather than module A alone. In other words, a superior improvement can be gained by employing correlation features in sensor space and decision fusion according to multi-scale features on top of the morphological analysis.

Performance comparison on balanced dataset
Past studies have shown that deep learning yields better performance for IED detection than traditional methods [25][26][27][28]. Thus, in this study, we compared our model only with classifiers based on deep learning, again using five-fold cross-validation. Module A of our model was used as the baseline. Furthermore, the popular ResNet structure [40], which is similar in structure to FPN, was used to validate the utility of multi-scale features in IED detection; the decision of ResNet depends only on the top-level feature, in contrast to that of FPN, which depends on multi-level features. Because of differences in the datasets used in the respective studies, the performance of previous deep learning methods could not be directly compared. The two most representative types of networks for IED detection were therefore selected for performance evaluation on our dataset. Their hyper-parameters were set according to [25] and [26]. The hyper-parameters of ResNet and the baseline were comparable to those of our model. The performance comparison is shown in table 4.
According to table 4, our proposed model achieved the best performance with the fewest parameters in IED detection compared to the other common deep learning methods. In particular, the accuracy, sensitivity, and F1 score of our model are markedly higher than those of the other methods. Meanwhile, figure 5 shows the ROC curves and their corresponding AUCs for our model and the other methods. These two evaluation indicators illustrate the trade-off between the true positive rate (sensitivity) and the false positive rate (i.e. 1 − specificity) and show that our model is the best among the tested methods. These results demonstrate that our model can accurately distinguish IEDs from background signals and detect as many IEDs in scalp EEG as possible while maintaining the accuracy of background signal detection.

Applicability test on three datasets
It is well known that IEDs are rare in clinical recordings; approximately 90% of segments in scalp EEGs are background signals without IEDs. Therefore, we evaluated the performance of our model under clinical conditions using the real dataset, set A, and set B. Prior to this, our model was trained using the balanced dataset. Because the patients in the real dataset are entirely different from those in the balanced dataset and were never involved in the training process, the test in this section is closer to clinical application. The results for the six patients with epilepsy are outlined in table 5, and the confusion matrices are shown in figure 6. Because of the 50% overlap in signal segmentation on the real dataset, an IED event may generate two or more IED labels. Thus, in table 5, the number of IED events is not equal to the number of IED labels. The results of the evaluation on the real dataset are as satisfactory as those on the balanced dataset except for precision and F1 score. This is a sensible result because the real dataset, which contains more complicated and more varied background signals, was not used to train the model. Moreover, the samples were severely unbalanced: there were only approximately a hundred or fewer IED events in an hour of scalp EEG. The greater proportion of background signals also caused an increase in false positives. Thus, the precision and F1 score were slightly inferior. However, the average precision was still 85.9%. Furthermore, our model required only 7.17 s to interpret an hour of scalp EEG; by contrast, for neurologists, this task requires 1-3 h. Therefore, the results on the real dataset suggest that our model is able to automatically detect IEDs in scalp EEG and has the potential to relieve neurologists of the overwhelming majority of their scalp EEG interpretation workload.

Discussion
In this study, we developed an end-to-end detector to automatically detect IEDs in scalp EEGs and demonstrated its effectiveness on balanced and real datasets. Combining the modules for morphological analysis in the time domain, correlation analysis in sensor space, and decision fusion based on the attention mechanism, the proposed method was able to extract discriminative IED features directly from raw scalp EEG. For module A, the stack of pooling layers was able to obtain comprehensive features of IEDs within windows of different sizes. At the same time, several convolutional layers for extracting deep-level features relevant to IEDs were also stacked in this module, because this structure increases nonlinear mapping at the cost of a slight parameter increase. However, the problems of vanishing and exploding gradients, caused by increasing depth, would inevitably have occurred. To address these possibilities, BN and skip connections from shallow to deep layers were used in our model. BN maintains similar distributions for each layer during network training and thus enlarges the gradient and improves training efficiency. Skip connections, on the other hand, are a characteristic structure of FPN and ResNet. Shallow-level features were connected to deep-level features by this structure, providing an efficient path for gradient propagation [35,40]. Moreover, skip connections enrich the deep-level features with shallow-level features that contain more location information.
The performance of our model and of the baseline was better than that of the other methods, as shown in table 4, and this advantage can be observed more clearly in figure 5. The likely reason is that the outputs of our model and of the baseline depend on multi-level features, whereas those of the other methods depend only on top-level features. The output from multi-level features can comprehensively capture both large and small morphological features of IEDs. The attention mechanism in module C can assign a weight to each level's decision according to their differences and dynamically combine these decisions. In other words, information that is beneficial to the judgment (IED or non-IED) is highlighted, and useless information is downplayed. This contributes to IED detection in scalp EEGs obtained from patients with epilepsy who differ in seizure type, focus, age, and even gender, and improves the accuracy and robustness of the model. The results in figures 7(a) and (b) provide strong evidence of this: IEDs of different sizes within the 1 s time window were detected by our model.
A scalp EEG, which is recorded using macroelectrodes, is the sum of the electrical activities of neurons in a certain region [41]. Abnormal neuronal discharges due to epilepsy are recorded simultaneously by multiple sensors. Hence, IEDs are characterized by particular patterns in sensor space, in contrast to the artifacts in scalp EEGs that are easily confused with them [35]. The results of the ablation experiment, outlined in table 3, showed that performance, especially sensitivity, improved significantly after the addition of the correlation analysis module (i.e. module B). Thus, analyzing and learning correlation features in sensor space helped the model improve its IED detection ability, particularly its ability to distinguish between IEDs and artifacts. Furthermore, the distribution information of IEDs in sensor space has the potential to aid neurologists in exploring the pathogenic mechanisms of epilepsy, because several studies have shown that the distribution of IEDs may contribute to locating the epileptic focus and to performing qualitative analyses of patients with epilepsy [27,42].
Furthermore, because of module B, we extracted only single-channel features via 2D-convolution with a 1 × 3 kernel in the morphological analysis module and did not use any hand-crafted or automatic methods to remove artifacts. In addition, the results in table 4 demonstrate that this parameter reduction did not sacrifice the learning ability of the model. The connection between convolutional layers and fully connected layers was shown to be the major cause of parameter inflation in the other models. By contrast, the chain structures in modules B and C achieve heavy parameter reuse via iteration of GRU cells. Hence, our proposed model achieved the best performance with the fewest parameters in IED detection. A more parameter-efficient model structure is less prone to overfitting and superior in terms of generalizability and memory consumption [43].
This study has a few limitations. The examples in figure 7 show that incorrectly classified epochs are centered on single sharp transients. There are probably two reasons for this. First, small, isolated transients are easily obscured by complex background within a 1 s window. Although our model detected most single spikes, the output probabilities P are not high, and some transients were left out or incorrectly classified. Thus, an adaptive window with variable step size may be more suitable for IED detection than a fixed 1 s window. Second, the performance degradation in table 6 shows the limitations in terms of data size. The nature of a deep learning framework is to learn useful representations for a specific task from large amounts of raw data with fine-grained labels. Because of the lack of high-quality data, our model cannot yet achieve the best possible performance in IED detection. In addition, because of the lack of standardized public datasets for IEDs in scalp EEGs, it is difficult to objectively compare our model with all existing methods. Nonetheless, we compared our model with several representative deep learning methods.

Conclusions
Based on deep learning, we have proposed an end-to-end, automated IED detection algorithm with multi-scale morphological features in the time domain and correlation features in sensor space. The proposed algorithm was evaluated on a balanced dataset of 30 subjects and achieved state-of-the-art detection performance. Further, its clinical effectiveness was tested using a real dataset, a dataset of healthy people, and a public dataset. The results suggest that our algorithm could assist neurologists in the diagnosis of epilepsy.

Data availability statement
The data generated and/or analyzed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.