A Low Complexity Estimation Method of Entropy for Real-Time Seizure Detection

In recent years, many studies have proposed seizure detection algorithms, but most of them require substantial computing resources and a large amount of memory, which makes them difficult to implement in wearable devices. This paper proposes a seizure detection algorithm that uses a small number of features to reduce memory requirements. For feature extraction, it proposes an entropy estimation method that replaces logarithmic operations with bitwise operations to reduce the demand for computing resources. Experimental results show that this reduces the entropy computing time by about 81.58%. The seizure detection algorithm is implemented on an ultra-low power embedded system and evaluated on seven classification tasks from the Bonn dataset. The average classification performance is: Accuracy (97.13%), Specificity (97.57%) and Sensitivity (98.42%). Compared with previous studies, the proposed algorithm has comparable classification performance but needs only 0.23 seconds from feature extraction to classification result. To the best of our knowledge, it is the seizure detection algorithm with the least computing time currently applied to wearable devices.


I. INTRODUCTION
Epilepsy is the second most common neurological disorder in the world. About 50 million people worldwide suffer from epilepsy, and 2.4 million people are diagnosed with it every year. About 80% of the patients live in low- and middle-income countries [1]. During an epileptic seizure, the abnormal discharge of brain cells causes symptoms such as convulsions, fainting, and loss of consciousness. Seizures in daily life may not only cause physical harm to the patient but also make psychological problems more likely in the long run [2]. Electroencephalography (EEG) is a method of recording electrical activity in the brain through electrodes. Traditionally, neurologists diagnose epilepsy by visually analyzing the EEG to determine whether there are signs of epilepsy. However, the accuracy of this approach depends on the ability and experience of the examiner. Manual long-term EEG analysis is very important, but it is cumbersome and time-consuming, which can lead to delayed diagnosis and deterioration of the condition. Therefore, it is necessary to develop a seizure detection system to increase the efficiency and accuracy of diagnosis, so that early treatment can reduce the mortality caused by sudden unexpected death in epilepsy (SUDEP) [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Md. Kafiul Islam.
In the last decade, seizure detection has been regarded as a classification problem, and many related technologies have been proposed. Those methods can be roughly divided into two types: typical machine learning and deep learning. The algorithm based on typical machine learning usually includes data preprocessing, feature extraction, screening, classifiers, etc., and deep learning algorithms mainly use a large amount of data to make the model converge for classification.
The features used in typical machine learning algorithms can be divided into four categories [4]. The most commonly used deep learning model is the convolutional neural network (CNN). For example, [5] uses a 13-layer one-dimensional convolutional network to analyze the raw EEG signal, [6] uses the time spectrum and fine-tunes VGG16 [7] through transfer learning, and [8] uses multi-scale convolutional neural networks to analyze raw EEG signals at different time scales.
Generally, deep learning algorithms achieve better accuracy, but they require a large amount of memory and have high computational complexity [9]. Such algorithms are therefore only suitable for environments with high computational performance and are difficult to implement in wearable or battery-powered mobile devices. Moreover, most epilepsy patients live in low- and middle-income countries, where algorithms that require high computing performance are difficult to deploy.
To implement an effective seizure detection algorithm in a low computing resource environment, this paper proposes the use of Local Binary Pattern Mean Absolute Deviation (LBPMAD) [9], entropy, variance of local entropy, and logistic regression. In particular, the proposed entropy estimation method is well suited to low computing resource environments. Compared with previous algorithms based on typical machine learning, the proposed algorithm has two advantages: 1. it uses very few features; 2. its feature extraction uses only basic arithmetic operations (addition, subtraction, multiplication and division) and bitwise operations. It is therefore more suitable for low-cost and low-power wearable devices.

II. METHODOLOGY
The algorithm proposed in this paper is shown in Fig. 1. It can be divided into three parts: preprocessing, feature extraction and classification. A previous study has shown that the active frequency band of epileptic seizures is mainly 3-30 Hz [10]. Thus, during preprocessing, we use a 0.5-40 Hz band-pass filter to remove signals outside the epileptic activity band. In the feature extraction part, we extract three features: the mean absolute deviation of the local binary pattern, the entropy, and the variance of the local entropy. Based on the training set, we obtain the mean and standard deviation of each feature and normalize the feature through (1) before it is input to the classifier.
$$\mathrm{feature}_{norm} = \frac{\mathrm{feature} - \mu_{feature}}{\sigma_{feature}} \tag{1}$$
where $\mathrm{feature}_{norm}$ is the normalized feature, $\mu_{feature}$ is the mean of the feature, and $\sigma_{feature}$ is the standard deviation of the feature. Finally, the normalized features are fed into a third-order logistic regression, which decides whether the current data segment is epileptic or not.
VOLUME 11, 2023
A. DATASET
This paper uses the public dataset provided by Bonn University [11] to evaluate the performance of the algorithm. The Bonn dataset is one of the most commonly used datasets for evaluating seizure detection algorithms. There are two main reasons for choosing it. First, the Bonn dataset contains different EEG states of healthy subjects and epileptic patients, which allows the performance of the algorithm to be verified under different conditions. Second, the EEG signals in the Bonn dataset are already segmented, so the bias resulting from different data segmentation approaches is eliminated. The Bonn dataset has a total of 500 EEG records, each 23.6 seconds long, divided into five sets from A to E. Sets A and B were collected from five healthy awake subjects, with eyes open for set A and eyes closed for set B. Sets C, D and E were collected from five epilepsy patients. Sets C and D are pre-epileptic (inter-ictal) data: set D was recorded from the epileptogenic zone, and set C from the hippocampal formation of the opposite hemisphere of the brain. Set E contains the brain wave signals recorded during epileptic seizures.
The EEG signals in the Bonn dataset were digitized by a 12-bit analog-to-digital converter at a sampling rate of 173.61 Hz.
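As a quick sanity check on the figures above, the following Python sketch converts between duration and sample count at the stated sampling rate. The constant and function names are illustrative assumptions and do not come from the dataset's documentation.

```python
# Illustrative helpers; the names are assumptions, the values come from the text.
SAMPLING_RATE_HZ = 173.61   # sampling rate stated for the Bonn dataset
RECORD_SECONDS = 23.6       # stated duration of each record


def samples_in(seconds: float, rate_hz: float = SAMPLING_RATE_HZ) -> int:
    """Number of whole samples captured in `seconds` at `rate_hz`."""
    return int(seconds * rate_hz)


def duration_of(n_samples: int, rate_hz: float = SAMPLING_RATE_HZ) -> float:
    """Duration in seconds covered by `n_samples` at `rate_hz`."""
    return n_samples / rate_hz


if __name__ == "__main__":
    print(samples_in(RECORD_SECONDS))     # samples in one 23.6 s record
    print(round(duration_of(500), 2))     # seconds covered by a 500-sample window
```

Such helpers are mainly useful for translating window sizes given in sampling points into seconds when comparing latency figures.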

B. LOCAL BINARY PATTERN MEAN ABSOLUTE DEVIATION
Yazid et al. combined the local binary pattern (LBP) [12] with the mean absolute deviation [13], [14], [15] to propose LBPMAD, a simple feature extraction method [9]. It combines the LBP code selection method with a local average value: the data points are selected as in the LBP code, their local average is calculated, and the mean absolute deviation from that average is taken.
LBPMAD is calculated through (2), in which $f(x, i, P, step)$ represents the local average value, $x$ is the EEG signal, $i$ is the current position, $P$ is the number of data points required for the calculation of LBPMAD, and $step$ is the spacing between data points.
With $step = 2$ and $P = 8$, the LBPMAD of the EEG is calculated, and its average value is taken as the feature and normalized through (1). Fig. 2 shows the box-and-whisker plot of the normalized feature for each category of the Bonn dataset. It can be seen that LBPMAD can effectively distinguish normal brain waves from those during epileptic seizures.
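Since (2) is not reproduced in the text, the following Python sketch shows only one plausible reading of the description above: the local average is taken over $P$ points spaced $step$ apart starting at position $i$, and the feature is the mean absolute deviation of those points from that average, averaged over all valid positions. The indexing convention and the function names `local_mean` and `lbpmad` are assumptions, not the definition given in [9].

```python
from typing import Sequence


def local_mean(x: Sequence[float], i: int, P: int, step: int) -> float:
    """Assumed form of f(x, i, P, step): mean of P points spaced `step` apart from i."""
    return sum(x[i + k * step] for k in range(P)) / P


def lbpmad(x: Sequence[float], P: int = 8, step: int = 2) -> float:
    """Average, over all valid positions, of the mean absolute deviation of
    each P-point neighbourhood from its local mean (assumed reading of LBPMAD)."""
    span = (P - 1) * step                 # offset of the last point in a neighbourhood
    n_positions = len(x) - span           # positions where the neighbourhood fits
    if n_positions <= 0:
        raise ValueError("signal too short for the chosen P and step")
    total = 0.0
    for i in range(n_positions):
        f = local_mean(x, i, P, step)
        total += sum(abs(x[i + k * step] - f) for k in range(P)) / P
    return total / n_positions
```

Only additions, subtractions, absolute values and divisions by fixed constants are involved, which is in line with the low-complexity goal stated in the Introduction.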

C. ENTROPY
Entropy measures the degree of dispersion of data. In general, the higher the entropy, the more scattered and unpredictable the data; conversely, the lower the entropy, the more concentrated the data. Many previous studies have used entropy as a feature for seizure detection [16], [17], [18], [19], [20]. Shannon entropy can be calculated through (3) [18].
$$H(X) = -\sum_{x \in X} p(x) \log_2 p(x) \tag{3}$$
where $X$ represents the range of values of the time signal and $p(x)$ represents the probability of occurrence of the element $x$.
As (3) shows, calculating entropy requires a large number of floating-point and logarithmic operations, which makes it difficult to realize in low-performance environments such as wearable or battery-powered mobile devices. For this reason, this paper proposes a fast entropy estimation method that reduces the computing resources required to calculate entropy, as follows. Expanding $p(x)$ in (3) gives (4).
$$H(X) = -\sum_{x \in X} \frac{n_x}{N} \log_2 \frac{n_x}{N} \tag{4}$$
where $n_x$ is the number of times the element $x$ appears, and $N$ is the total number of measurement points.
where $k$ is an adjustment parameter. Assuming the total number of measurement points $N = 2^m$, (6) becomes (7). Based on (7), one has (8). The most significant bit (MSB) of $n_x$ can be found through bit operations [21]. With MSB denoting the position of the most significant set bit of $n_x$, the relation between the MSB and $n_x$ satisfies
$$2^{\mathrm{MSB}} \le n_x < 2^{\mathrm{MSB}+1} \tag{9}$$
Taking $\log_2$ of (9) results in
$$\mathrm{MSB} \le \log_2 n_x < \mathrm{MSB} + 1 \tag{10}$$
It is obvious that
$$\lfloor \log_2 n_x \rfloor = \mathrm{MSB} \tag{11}$$
Therefore, one gets $\log_2 n_x \approx \mathrm{MSB}$.
Fig. 3 takes S001 of the Bonn dataset as an example, with the moving window set to 500 sampling points. It shows the estimated entropy using (8) and the directly calculated entropy. The entropy normalized through (1) is used as a feature, and box-and-whisker plots can be drawn for each category of the Bonn dataset. Fig. 4 and Fig. 5 show the entropy and the estimated entropy of the EEG for each set, respectively. It can be seen that both can effectively distinguish normal brain waves from those during epileptic seizures.
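The idea of replacing the logarithm by the MSB position can be sketched in Python as follows. This is a minimal illustration, not the paper's exact formula (8): it assumes $N = 2^m$, uses the identity $H = m - 2^{-m}\sum_x n_x \log_2 n_x$ that follows from the Shannon definition with $p(x) = n_x/N$, and substitutes the MSB position for $\log_2 n_x$. The adjustment parameter $k$ is not modelled; `frac_bits` is a hypothetical fixed-point scale introduced only for this example, and all function names are assumptions.

```python
from collections import Counter
from math import log2
from typing import Sequence


def msb_index(n: int) -> int:
    """Position of the most significant set bit of n >= 1, i.e. floor(log2(n))."""
    pos = 0
    while n > 1:          # shift right until only the top bit remains
        n >>= 1
        pos += 1
    return pos


def shannon_entropy(values: Sequence[int]) -> float:
    """Direct calculation: -sum(p * log2(p)) over the observed values."""
    N = len(values)
    return -sum((n / N) * log2(n / N) for n in Counter(values).values())


def estimated_entropy(values: Sequence[int], frac_bits: int = 8) -> int:
    """Integer-only estimate in fixed point (real value * 2**frac_bits).

    Assumes len(values) == 2**m. `frac_bits` is a hypothetical scaling choice,
    not the adjustment parameter k mentioned in the text.
    """
    N = len(values)
    m = msb_index(N)
    if N != 1 << m:
        raise ValueError("number of points must be a power of two")
    # sum of n_x * MSB(n_x) stands in for sum of n_x * log2(n_x)
    weighted = sum(n * msb_index(n) for n in Counter(values).values())
    # H ~= m - weighted / 2**m, computed with shifts instead of division
    return ((m << m) - weighted) << frac_bits >> m
```

Because $\mathrm{MSB} \le \log_2 n_x < \mathrm{MSB} + 1$, this estimate exceeds the direct value by less than one bit, up to fixed-point rounding. In Python the loop in `msb_index` could also be written as `n.bit_length() - 1`; the explicit shifts are kept to mirror what a microcontroller implementation would do.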
In the feature extraction method proposed in this paper, we use not only the entropy but also the variance of the local entropy as a feature. As shown in Fig. 6, the EEG signal is divided into 16 segments. One first uses (8) to estimate the local entropy of each segment, and then calculates the variance of these local entropies. The normalized variance is taken as a feature. The box-and-whisker plots for each category of the Bonn dataset show that the variance of the local entropy can effectively distinguish normal brain waves from brain waves during epileptic seizures. Fig. 7 shows the distribution of the EEG after feature extraction. In the figure, the brainwave signals of epileptic seizures (Set E) and normal brain waves (Set A, Set B, Set C and Set D) are divided into two groups. This separation in the feature space supports the effectiveness of the features selected in this paper.
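The local entropy variance described above can be sketched as follows. The example assumes that each segment length is a power of two, that trailing samples which do not fill a segment are dropped, and that the population variance is used; none of these details is stated in the text. The MSB-based estimator is a simplified stand-in for (8), and the function names are illustrative.

```python
from collections import Counter
from typing import List, Sequence


def estimated_entropy(segment: Sequence[int]) -> float:
    """MSB-based entropy estimate for a segment whose length is 2**m (assumed)."""
    N = len(segment)
    m = N.bit_length() - 1
    if N < 1 or N != 1 << m:
        raise ValueError("segment length must be a power of two")
    # n.bit_length() - 1 is the MSB position, used in place of log2(n)
    weighted = sum(n * (n.bit_length() - 1) for n in Counter(segment).values())
    return m - weighted / N


def local_entropy_variance(signal: Sequence[int], n_segments: int = 16) -> float:
    """Variance of the per-segment entropy estimates (population variance assumed)."""
    seg_len = len(signal) // n_segments          # trailing samples are dropped
    entropies: List[float] = [
        estimated_entropy(signal[s * seg_len:(s + 1) * seg_len])
        for s in range(n_segments)
    ]
    mean = sum(entropies) / n_segments
    return sum((h - mean) ** 2 for h in entropies) / n_segments
```

A signal whose segments all have the same dispersion yields a variance of zero, while a signal that switches between concentrated and scattered segments yields a larger value.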

E. LOGISTIC REGRESSION
This paper uses logistic regression, similar to [22], to discriminate epilepsy features from other brain wave features. If there are $m$ extracted features, one defines $X \in \mathbb{R}^{m+1}$ as in (12),
where $x_i$ represents the $i$-th feature and $T$ represents the transpose. The hypothesis function is defined as the sigmoid function (13), whose output range is [0, 1] and which is parameterized by the feature weights $\theta$. Finally, one calculates the prediction result $\hat{y}$ through (14). The prediction result $\hat{y}$ depends on the feature weights $\theta$.
In this paper, the loss function is defined as (15). The Stochastic Average Gradient (SAG) method [23] is used to minimize the overall loss on the training data and obtain the best feature weights $\theta$.
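The classification step can be illustrated with the following Python sketch. Several details are assumptions, because (12) to (15) are not reproduced in the text: the input vector is taken as $X = [1, x_1, \ldots, x_m]^T$ with the bias term first, the decision threshold is 0.5, and the loss is the usual cross-entropy. For brevity, plain full-batch gradient descent is used here in place of SAG [23]; all function names are illustrative.

```python
from math import exp, log
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + exp(-z))


def hypothesis(theta: Sequence[float], features: Sequence[float]) -> float:
    """Sigmoid of theta^T X with X = [1, x_1, ..., x_m]^T (bias first; assumed)."""
    z = theta[0] + sum(t * x for t, x in zip(theta[1:], features))
    return sigmoid(z)


def predict(theta: Sequence[float], features: Sequence[float],
            threshold: float = 0.5) -> int:
    """1 (seizure) if the hypothesis reaches the assumed threshold, else 0."""
    return 1 if hypothesis(theta, features) >= threshold else 0


def log_loss(theta, X, y) -> float:
    """Mean cross-entropy loss (assumed form of the loss function)."""
    eps = 1e-12                                   # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        h = min(max(hypothesis(theta, xi), eps), 1.0 - eps)
        total -= yi * log(h) + (1 - yi) * log(1.0 - h)
    return total / len(y)


def fit(X, y, lr: float = 0.5, epochs: int = 200) -> List[float]:
    """Full-batch gradient descent, swapped in for SAG to keep the sketch short."""
    theta = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            err = hypothesis(theta, xi) - yi      # derivative of the loss w.r.t. z
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        theta = [t - lr * g / len(y) for t, g in zip(theta, grad)]
    return theta
```

Once the weights are trained offline, only `hypothesis` and `predict` need to run on the embedded device, which amounts to one dot product and one sigmoid evaluation per classification.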

III. EXPERIMENTS
A. CLASSIFICATION TASK
This paper uses 5-fold cross-validation and combines different sets of the Bonn dataset into different scenarios to verify the performance of the classifier. Table 1 lists a total of seven classification tasks. Case 1, Case 2 and Case 5 evaluate the ability of the classifier to distinguish the brain waves of epileptic seizures from those of healthy people; Case 3, Case 4 and Case 6 evaluate its ability to distinguish seizure brain waves from seizure-free brain waves in patients with epilepsy; Case 7 evaluates its ability to distinguish seizure from non-seizure brain waves.

B. EVALUATION METRICS
This paper uses accuracy, sensitivity, and specificity to evaluate the performance of the model in each classification task. Accuracy, given by (16), evaluates the overall classification performance of the classifier.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{16}$$
where TP, TN, FP and FN represent true positives, true negatives, false positives, and false negatives, respectively. Sensitivity, defined in (17), is the rate of correct classification on positives, and specificity, defined in (18), is the rate of correct classification on negatives.
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{17}$$
and
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{18}$$
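The three metrics can be computed from predicted and true labels as in the following Python sketch. It uses the standard definitions of sensitivity and specificity, treats label 1 as seizure (positive) and 0 as non-seizure (negative), and assumes both classes are present in the test data; the function names are illustrative.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, TN, FP and FN, with 1 as the positive (seizure) class."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            counts["TP"] += 1
        elif t == 0 and p == 0:
            counts["TN"] += 1
        elif t == 0 and p == 1:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity; assumes both classes occur in y_true."""
    c = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (c["TP"] + c["TN"]) / (c["TP"] + c["FP"] + c["TN"] + c["FN"]),
        "sensitivity": c["TP"] / (c["TP"] + c["FN"]),
        "specificity": c["TN"] / (c["TN"] + c["FP"]),
    }
```

In a 5-fold cross-validation, these values would be computed per fold and then averaged for each classification task.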

IV. RESULTS
The experimental architecture of this paper is shown in Fig. 8. First, the feature distribution statistics of the training set and the logistic regression training are computed on a personal computer. Then the test set validation is performed on an ultra-low power embedded system (STM32L432KCU6). The experimental results are presented in Table 2. The averages over the seven classification tasks are higher than 97% in accuracy, sensitivity and specificity. The calculation time of each operation of the proposed classification algorithm is presented in Table 3. The overall execution time of the algorithm is only 230.75 ms, which confirms the feasibility of the algorithm in wearable systems and real-time applications. Table 4 and Fig. 9 compare the time required by the proposed estimation method and by the direct calculation of entropy. The proposed estimation method needs only 6.63 ms per trial on average, while the direct calculation of entropy requires 36.00 ms per trial. The calculation time is thus reduced by about 81.58%, since $(36.00 - 6.63)/36.00 \approx 0.8158$. Table 5 compares the classification performance of this paper with previous studies on the Bonn dataset. The deep learning-based methods [6], [8], [24], [25], and [26] can achieve excellent classification performance; however, they require the substantial computing resources and large memory space of an epilepsy auxiliary diagnosis system on a personal computer. They are therefore very difficult to implement in a wearable device for a SUDEP real-time notification system, because a wearable device has only a few KB of RAM and low computing resources. Methods using typical machine learning include [9], [27], [28], and [29]. They mainly focus on classification performance and therefore transform time-domain signals into frequency-domain signals, using for example the Fourier transform, discrete cosine transform or wavelet transform.
In this case, those methods perform indirect feature extraction based on the transformed signal. In contrast, the proposed method performs direct feature extraction in the time domain without any transform, which generally requires fewer computing resources. [30] also extracts time-domain features directly; however, it uses many matrix operations and therefore requires substantial computing resources.
On the other hand, in terms of memory space, [9] needs 18 features for classification and has to store at least 2 frequency band signals obtained through DWT during feature extraction. [27] uses a DCT-based 3rd-order filter bank to divide the EEG signal into 5 frequency bands and extracts 20 features from them for classification; at least 3 filters need to be stored during feature extraction. Moreover, the feature extraction of [28] has to store the time-frequency map generated by the Fourier synchro-squeezed transform (FSST), with a size of 177 × 4096, and uses 8 to 35 features for classification. Although [29] uses only 2 features for classification, it has to store not only the time- and frequency-domain signals after the wavelet transform, but also the value distribution used to calculate entropy during feature extraction.
Compared with the methods mentioned above, the proposed method uses 3 features for classification. In addition to the original EEG signal, it only needs to store the value distribution used to estimate the entropy during feature extraction, so it uses much less memory space and a small number of features. Table 6 lists recent papers that report the calculation time of their algorithms. Among them, [31], [32], and [33] have low computational complexity and can be implemented in edge devices, but the proposed method has better classification performance and lower computation time. Although [34] and [35] have better classification performance, their computation time is higher than that of the proposed method, even though they are implemented on a personal computer (PC).
Based on the above, the proposed method not only achieves classification performance comparable to that of previous methods, but is also much easier to implement for real-time applications with low computing resources.
An ablation experiment is carried out to verify the influence of each feature on the classification performance. Fig. 10 shows the average performance histogram of the seven classification tasks. Removing LBPMAD does not affect the sensitivity and specificity, but the accuracy decreases, which means that the false positives of the algorithm increase. Removing either the local entropy variance or the entropy decreases the classification performance of the algorithm. Table 7 compares the average performance of the proposed algorithm on the seven tasks with different input signal lengths. With only 64 sampling points (about 0.37 seconds), it still has an accuracy of 92.42%, which means that the algorithm can correctly identify most epilepsy signals even if only a very short EEG is available.
This study adopted LBPMAD and entropy-based approaches to measure distinct EEG features. The LBPMAD calculated the average absolute deviation between each sampling point and its local mean in the time domain [9]. The entropy-based approaches, i.e., entropy and local entropy variance, evaluated the dispersion of the signal according to the occurrence probability of each EEG value [18].
The LBPMAD had the benefit of analyzing the relation between signal amplitude and local waveform, even when it was calculated from a small number of samples. In contrast, the entropy-based approaches considered the joint probabilities among EEG values, which required a large amount of data to achieve good statistical results and did not consider the relationship between signal amplitude and local waveform. We combined the LBPMAD and entropy-based features to achieve a more comprehensive description of the signal, and obtained better epilepsy detection performance.

V. CONCLUSION
This paper proposes a real-time seizure detection algorithm that can be implemented in ultra-low power embedded systems. The proposed algorithm performs direct feature extraction in the time domain without using any transform. It uses LBPMAD, local entropy variance and entropy as features, and then classifies those features through logistic regression. To reduce the computational complexity of the algorithm, the proposed method finds the most significant bit to replace the logarithmic operation. In this way, the calculation time can be reduced by about 81.58%. Compared with previous studies, this paper has comparable classification performance with lower computational complexity, which makes it suitable for implementation on a microprocessor in a lower-cost computing environment. The microprocessor only needs to store the EEG data, and the LBPMAD and entropy calculations use a small amount of memory space with low computational complexity.
SZU-CHI HUANG was born in Taichung, Taiwan, in 1999. He received the B.S. degree in electronic engineering from the National Taipei University of Technology, Taipei, Taiwan, in 2021. He is currently pursuing the M.S. degree in electrical engineering with National Central University, Taoyuan, Taiwan. His research interests include signal processing and machine learning with applications in biomedical systems.
He joined the Department of Electrical Engineering, National Central University, Taiwan, in 2005. His research interests include signal and image processing of EEG and MEG signals and designing the EEG-based brain-computer interfaces.