Automated Seizure Detection Based on State-Space Model Identification

Wang, Zhuo; Sperling, Michael R.; Wyeth, Dale; Guez, Allon

doi:10.3390/s24061902


¹ Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA 19104, USA

² Department of Neurology, Thomas Jefferson University, Philadelphia, PA 19107, USA

³ Jefferson Hospital for Neuroscience, Philadelphia, PA 19107, USA

* Author to whom correspondence should be addressed.

Sensors 2024, 24(6), 1902; https://doi.org/10.3390/s24061902

Submission received: 27 February 2024 / Revised: 12 March 2024 / Accepted: 14 March 2024 / Published: 16 March 2024

(This article belongs to the Special Issue Sensing Brain Activity Using EEG and Machine Learning)


Abstract:

In this study, we developed a machine learning model for automated seizure detection using system identification techniques on EEG recordings. System identification builds mathematical models from a time series signal and uses a small number of parameters to represent the entirety of time domain signal epochs. Such parameters were used as features for the classifiers in our study. We analyzed 69 seizure and 55 non-seizure recordings and an additional 10 continuous recordings from Thomas Jefferson University Hospital, alongside a larger dataset from the CHB-MIT database. By dividing EEGs into epochs (1 s, 2 s, 5 s, and 10 s) and employing fifth-order state-space dynamic systems for feature extraction, we tested various classifiers, with the decision tree and 1 s epochs achieving the highest performance: 96.0% accuracy, 92.7% sensitivity, and 97.6% specificity based on the Jefferson dataset. Moreover, as the epoch length increased, the accuracy dropped to 94.9%, with a decrease in sensitivity to 91.5% and specificity to 96.7%. Accuracy for the CHB-MIT dataset was 94.1%, with 87.6% sensitivity and 97.5% specificity. The subject-specific cases showed improved results, with an average of 98.3% accuracy, 97.4% sensitivity, and 98.4% specificity. The average false detection rate per hour was 0.5 ± 0.28 in the 10 continuous EEG recordings. This study suggests that using a system identification technique, specifically, state-space modeling, combined with machine learning classifiers, such as decision trees, is an effective and efficient approach to automated seizure detection.

Keywords:

EEG; system identification; state-space model; automated seizure detection

1. Introduction

Epilepsy is a chronic neurological condition that affects approximately 50 million people of all ages worldwide [1]. Many patients whose seizures fail to respond to therapy undergo prolonged EEG monitoring to aid in diagnosis and plan therapy. This may be conducted in an ambulatory or inpatient setting [2]. EEG is recorded for variable periods [3,4], and these studies generate substantial data. Trained electroencephalographers must spend significant time and effort reviewing EEG to make a diagnosis, and manual review is subject to error [5]. An automatic system that detects and annotates seizures by analyzing EEG would be beneficial in reducing the time professionals must spend reviewing long-term EEG studies.

To automatically detect seizures, various detection algorithms have been proposed. Deburchgraeve et al. [6] offered an approach, tested on the recordings of 21 patients, to both analyze the correlation between high-energetic segments of EEG and detect increases in low-frequency activity with high autocorrelation, yielding 88% sensitivity and 75% positive predictive value. In another study [7] with EEG data from 17 patients, a support vector machine (SVM) was used to distinguish between seizure and non-seizure EEG epochs. The system achieved an average detection rate of 89%, with one false seizure per hour. A study employing a convolutional neural network (CNN) [8] used a training dataset of five patients and showed accuracy, specificity, and sensitivity of 88.7%, 90.0%, and 95.0%, respectively. Hassanpour et al. [9] utilized a singular value decomposition (SVD) technique to apply to a time–frequency (TF) distribution of EEG epochs and showed a 92.5% detection rate. Wang et al. [10] employed a combination of multi-domain and nonlinear features, which increased the classification accuracy to 99.3%. However, in most of these works, only a few EEG recordings were used, limiting performance evaluation.

An automated seizure detection method based on EEG state-space model identification is presented here. This method was previously shown to be effective and efficient in classifying sleep stages [11]. This work presents a preliminary evaluation of the use of machine-learning-based system identification techniques to detect seizures. We applied this method to three datasets: two provided by Thomas Jefferson University Hospital for training and testing in clinical settings, and a third, the publicly available CHB-MIT dataset [12], for cross-validation. First, we applied a system identification technique to build a mathematical model of the dynamic system underlying each EEG epoch. Then, dynamic systems of different orders were simulated and compared with the original EEG signal. Each element of the state matrices was used as a feature to be fed into various classifiers for training and testing. The results of this preliminary study demonstrate the potential of the proposed method to effectively detect seizures and justify further development. The use of machine-learning-based system identification techniques for seizure detection has the potential to significantly improve the accuracy and efficiency of epilepsy diagnosis, making it a valuable tool for healthcare professionals.

2. Material and Methods

2.1. Jefferson Dataset

This retrospective study of deidentified EEG data was approved by the Thomas Jefferson University Institutional Review Board. Two datasets were provided by Thomas Jefferson University Hospital. The first dataset was used for training and testing. A total of 124 EEG recordings from 79 patients were included: 55 EEG recordings with interictal, non-seizure EEG data and 69 recordings containing seizures. EEG was recorded using international 10–20 system electrode placements plus T1 and T2 leads, and EEG was sampled at 1000 Hz. EEG recordings were visually interpreted and manually annotated by board-certified clinical neurophysiologists. The mean recording duration per EEG was 90.36 min (range, 2.8 to 180 min) for 55 non-seizure EEGs and 6 min (range, 0.7 to 20.1 min) for 69 EEGs containing seizures, for a total combined duration of 89.73 h. In Figure 1, an example of both 20 s long seizure and non-seizure EEG data is presented, illustrating the distinct differences between the two conditions. In the seizure EEG data, the characteristic pattern of seizure activity is evident through the appearance of abnormal discharges that manifest as bursts, which progressively increase in frequency, evolving into rapid, continuous spikes and waves. This distinctive pattern sets the seizure activity apart from normal, non-seizure EEG data, which does not exhibit these pronounced abnormal discharges.

We then cut the entire EEG dataset into epochs of various lengths, specifically, 1 s, 2 s, 5 s, and 10 s. The overlap between any two consecutive epochs was always 50% of the epoch length. Table 1 lists the number of epochs generated for each epoch size.

Using the same EEG leads, the second dataset comprised ten continuous EEG recordings collected from 10 patients. Two recordings had a sampling rate of 1000 Hz, while the rest had a sampling rate of 500 Hz. The mean recording length was 24.4 h (range, 19.7 to 34.4 h), making it an extensive dataset for testing purposes. The dataset included 411 seizure periods, ranging from 16 s to 9.5 min; 1-second-long epochs with 0.5 s overlaps were used for analysis. The specificity was represented by the number of false detections per hour to evaluate the performance in clinical settings. A standard method [13] was applied where a 30 s window was considered a positive seizure detection if more than 50% of the epochs within the window were predicted as seizures. The predictions were then compared with the corresponding time segments in the annotations, and if different, it was marked as a false window. In addition, consecutive false windows were considered a single false detection.
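The window-level decision rule described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the use of non-overlapping windows, and the figure of 60 epochs per 30 s window (1 s epochs stepped by 0.5 s) are assumptions made for the example.

```python
from typing import List


def window_detections(epoch_preds: List[int], epochs_per_window: int = 60,
                      threshold: float = 0.5) -> List[int]:
    """Return one 0/1 decision per window of epoch-level predictions.

    Assumption: windows do not overlap, and a 30 s window holds about
    60 epochs (1 s epochs with a 0.5 s step).
    """
    decisions = []
    last_start = len(epoch_preds) - epochs_per_window
    for start in range(0, last_start + 1, epochs_per_window):
        window = epoch_preds[start:start + epochs_per_window]
        # Positive only if MORE than 50% of the epochs are predicted as seizure.
        decisions.append(1 if sum(window) / len(window) > threshold else 0)
    return decisions


def count_false_detections(window_preds: List[int], window_truth: List[int]) -> int:
    """Count runs of consecutive windows that disagree with the annotation.

    Each run of consecutive false windows counts as a single false detection.
    """
    count = 0
    in_run = False
    for pred, truth in zip(window_preds, window_truth):
        mismatch = pred != truth
        if mismatch and not in_run:
            count += 1
        in_run = mismatch
    return count
```

Dividing the returned count by the recording length in hours would give a false detection rate per hour under these assumptions.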

2.2. CHB-MIT Dataset

This dataset was used to validate and compare the results. It was collected at Boston Children’s Hospital [14]. A total of 664 EEG recordings were collected from 23 subjects, 129 of which contained seizures. Most EEG recordings were sampled at 256 Hz using the international 10–20 system. The length and increments of each epoch were 1 s and 0.5 s, respectively. We randomly selected twenty 30 s long, non-seizure EEG segments from each subject for training. In total, we had 28,320 non-seizure EEG epochs and 14,840 seizure EEG epochs.

2.3. Data Preprocessing

As shown in Figure 2, the EEG dataset was initially divided into several segments with identical lengths. Next, a bandpass filter was applied to remove unwanted signals from the EEG data, such as low-frequency noise and high-frequency artifacts. Then, the dynamic systems were estimated for each bandpass-filtered EEG epoch. Finally, the state matrices of the estimated dynamic systems were extracted and used as features to train the machine learning classifiers. The state matrices were considered a compact representation of the dynamic characteristics of the EEG signal, which can capture the unique patterns of seizures. The classifiers were trained using these feature vectors, and the trained classifiers were then used to detect seizures in new EEG recordings.

The EEG data were filtered by a second-order Butterworth bandpass filter at 0.5–29 Hz, as this frequency range removes most unwanted signals from the EEG data, such as low-frequency noise and high-frequency artifacts, yet includes most seizure frequencies [15]. A 60 Hz notch filter was applied to remove the power line interference. The effectiveness of these filters is evident in Figure 3, which contrasts the raw and filtered EEG data. The raw data display considerable contamination from motion artifacts and power line noise, whereas the filtered data present a clearer signal in which the seizure indicators are preserved without interference. Once filtered, the data were kept as a $19 \times n$ ($\mathrm{channel} \times \mathrm{length}$) matrix without time information. Each time we trained the classifier, we selected the data of a single channel from this matrix as a $1 \times n$ vector:

$$Y = [y_{1}, y_{2}, y_{3}, \dots, y_{n}], \quad n = L, \quad y_{n} \in \mathbb{R} \tag{1}$$

where $Y$ is the vector of the filtered EEG data points, $y_{n}$ is sampled at 1000 Hz, and $L$ is the length of the vector. Using a one-second-long epoch as an example, $Y$ was cut into epochs containing 1000 data points (1000 Hz × 1 s). There were 500 data point increments between two consecutive epochs, as indicated in Figure 4.

Eventually, $Y$ was resized as $Y = [Y_{1}, Y_{2}, \dots, Y_{p}]^{T}$, described in (2).

$$\begin{bmatrix} Y_{1} \\ Y_{2} \\ \vdots \\ Y_{i} \\ \vdots \\ Y_{p} \end{bmatrix} = \begin{bmatrix} y_{1} & y_{2} & \dots & y_{1000} \\ y_{501} & y_{502} & \dots & y_{1500} \\ \vdots & \vdots & & \vdots \\ y_{500(i-1)+1} & y_{500(i-1)+2} & \dots & y_{500(i+1)} \\ \vdots & \vdots & & \vdots \\ y_{500(p-1)+1} & y_{500(p-1)+2} & \dots & y_{500(p+1)} \end{bmatrix} \tag{2}$$

$$p = \mathrm{floor}\left(\frac{L}{500}\right) - 1, \quad i \in [1, p] \tag{3}$$

where $Y_{i}$ was considered the dataset for dynamic model estimation.
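The epoching in (2) and (3) can be illustrated with a short NumPy sketch. The function name `make_epochs` and its parameters are hypothetical; the defaults assume the 1000 Hz, 1 s epoch, 0.5 s step case used as the example in the text.

```python
import numpy as np


def make_epochs(y: np.ndarray, epoch_len: int = 1000, step: int = 500) -> np.ndarray:
    """Stack overlapping epochs of a 1-D signal as rows, as in (2).

    The epoch count is (L - epoch_len) // step + 1, which reduces to
    floor(L / 500) - 1 from (3) when epoch_len = 1000 and step = 500.
    """
    n_epochs = (len(y) - epoch_len) // step + 1
    if n_epochs < 1:
        # Signal shorter than one epoch: return an empty (0, epoch_len) array.
        return np.empty((0, epoch_len))
    return np.stack([y[i * step: i * step + epoch_len] for i in range(n_epochs)])
```

For a 3 s single-channel signal at 1000 Hz, this yields five rows, the second of which spans samples $y_{501}$ to $y_{1500}$.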

2.4. Model Estimation

This study used a system identification technique to estimate state matrices from each EEG epoch as a feature for classification. System identification uses measurements of the EEG output signal to build mathematical models of dynamic systems [16]. This means no inputs are specified since the focus is on the dynamic properties of a time series, the EEG signal [17].

Multiple system identification approaches are available, including autoregressive (AR), transfer function (TF), and state-space (SS) models. We chose the state-space method, as it does not require any input to identify systems. The state-space method represents a dynamic system in terms of state and output equations, describing how the states and outputs change over time.

Now, consider the following discrete-time state-space dynamic system to be estimated from $Y_{i}$:

$$\begin{cases} x_{T+1} = A x_{T} + B u_{T} + K e_{T} \\ y_{T} = C x_{T} + D u_{T} + e_{T} \end{cases} \tag{4}$$

where, for an $m^{th}$-order system, $y_{T} \in \mathbb{R}$ is the output, and $x_{T} \in \mathbb{R}^{m \times 1}$ is the vector of states. $A \in \mathbb{R}^{m \times m}$ is the state transition matrix. $B \in \mathbb{R}^{m \times 1}$ is the input matrix. $C \in \mathbb{R}^{1 \times m}$ is the output matrix. $D \in \mathbb{R}^{1 \times 1}$ is the feedthrough matrix. $K \in \mathbb{R}^{m \times 1}$ is the steady-state Kalman gain. $e_{T} \in \mathbb{R}$ is the zero-mean white noise. As for the input, $u_{T}$, typically, in a case like this, it should be a $1 \times 1$ scalar. However, as the output for this model is an EEG signal, also known as a time series, no input is available [17]. Thus, (4) can be written as Equation (5), also represented in Figure 5.

$$\begin{cases} x_{T+1} = A x_{T} + K e_{T} \\ y_{T} = C x_{T} + e_{T} \end{cases} \tag{5}$$
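To make the role of $A$, $C$, and $K$ in (5) concrete, the sketch below runs the model as a one-step-ahead predictor: the innovation $e_{T} = y_{T} - C x_{T}$ is computed from the measured output and fed back through $K$. The function name and the zero initial state are assumptions for illustration, not details given in the text.

```python
import numpy as np


def one_step_predictions(y, A, C, K):
    """One-step-ahead predictions of y under the innovations model (5)."""
    A = np.atleast_2d(np.asarray(A, dtype=float))      # m x m
    C = np.asarray(C, dtype=float).reshape(1, -1)      # 1 x m
    K = np.asarray(K, dtype=float).reshape(-1, 1)      # m x 1
    x = np.zeros((A.shape[0], 1))                      # assumed zero initial state
    y_hat = np.empty(len(y))
    for t, y_t in enumerate(y):
        y_hat[t] = (C @ x).item()   # predicted output C x_T
        e_t = y_t - y_hat[t]        # innovation e_T = y_T - C x_T
        x = A @ x + K * e_t         # state update x_{T+1} = A x_T + K e_T
    return y_hat
```

With a first-order model ($A = 0.5$, $C = 1$, $K = 0.5$) and the output sequence 1, 0.5, 0.25, the predictor starts at 0 and then tracks the signal exactly.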

Without the input, $u_{T}$, the remaining state matrices, $A$, $C$, and $K$, of the state-space dynamic system can be estimated using the n4sid method [18]. We began by constructing a block Hankel matrix, $H(k)$, from each EEG epoch. The Hankel matrix exhibited a unique structure, with each row shifted by one time step relative to the previous row.

$$H(k) = \begin{bmatrix} y_{k+1} & y_{k+2} & \dots & y_{k+q} \\ y_{k+2} & y_{k+3} & \dots & y_{k+q+1} \\ \vdots & \vdots & \ddots & \vdots \\ y_{k+p} & y_{k+p+1} & \dots & y_{k+q+p-1} \end{bmatrix} \tag{6}$$

where the number of rows and columns of the Hankel matrix is $p$ and $q$, respectively (here, $p$ denotes the number of rows rather than the number of epochs in (3)). $p$ is the chosen block size, while $p + q - 1$ equals the size of the system’s output data, which is 1000 in our case. In this study, the block size, $p$, was chosen to be twice the estimated system order, $m$, to ensure that the constructed Hankel matrix adequately captured the underlying system dynamics while avoiding unnecessary complexity. As a result, the size of the Hankel matrix is $2m \times (1001 - 2m)$, and (6) can be rewritten as

$$H(k) = \begin{bmatrix} y_{k+1} & y_{k+2} & \dots & y_{k+1001-2m} \\ y_{k+2} & y_{k+3} & \dots & y_{k+1002-2m} \\ \vdots & \vdots & \ddots & \vdots \\ y_{k+2m} & y_{k+2m+1} & \dots & y_{k+1000} \end{bmatrix} \tag{7}$$

We then applied singular value decomposition (SVD) to $H(0)$ in order to obtain a low-rank approximation of the matrix that captures the essential system dynamics.

$$H(0) = R \Sigma S^{T} \tag{8}$$

where $H(0)$ is the Hankel matrix at $k = 0$; $R$ and $S$ are $2m$ by $2m$ and $(1001 - 2m)$ by $(1001 - 2m)$ orthonormal matrices, respectively; and $\Sigma$ is a $2m$ by $(1001 - 2m)$ matrix with nonnegative numbers in the diagonal. For an $m^{th}$-order system, the ideal matrix $\Sigma$ can be written as

$$\Sigma = \begin{bmatrix} \Sigma_{m} & 0 \\ 0 & \Sigma_{*} \end{bmatrix} \tag{9}$$

The singular values, $\Sigma_{m} \in \mathbb{R}^{m \times m}$, in matrix $\Sigma$ represent the importance of the corresponding singular vectors in capturing the variance of the output data. Smaller singular values or zeros in $\Sigma_{*}$ indicate that the corresponding singular vectors contribute less to the overall structure of the data. Thus, a minimal realization is obtained by eliminating $\Sigma_{*}$, and (9) can be rewritten as

$$\Sigma = \begin{bmatrix} \Sigma_{m} & 0 \\ 0 & 0 \end{bmatrix} \tag{10}$$

Then, we chose only the rows and columns corresponding to the $m^{th}$-order model to form the matrices $R_{m}$ and $S_{m}$ and rewrote (8) as:

$$H_{m}(0) = R_{m} \Sigma_{m} S_{m}^{T} \tag{11}$$

Then, the discrete-time system realization can be represented by

$$A = \Sigma_{m}^{-\frac{1}{2}} R_{m}^{T} H_{m}(1) S_{m} \Sigma_{m}^{-\frac{1}{2}} \tag{12}$$

$$C = E^{T} R_{m} \Sigma_{m}^{\frac{1}{2}} \tag{13}$$

$$K = \Sigma_{m}^{\frac{1}{2}} S_{m}^{T} E \tag{14}$$

where $E = [I_{m}, 0]$ with $I_{m}$ being an $m \times m$ identity matrix.
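The Hankel and SVD steps in (6) to (14) can be sketched with NumPy as follows. This is a textbook-style realization written from the equations above, not MATLAB's n4sid routine and not the authors' code. Two choices are assumptions made for the example: the column count is set to the epoch length minus $2m$ so that the shifted matrix $H(1)$ fits inside the epoch, and the selector $E$ is read as picking the first row (for $C$) and the first column (for $K$), which gives the $1 \times m$ and $m \times 1$ shapes used in the feature vector.

```python
import numpy as np


def hankel(y: np.ndarray, k: int, rows: int, cols: int) -> np.ndarray:
    """H(k) from (6): entry (i, j) holds y_{k+i+j+1} in 1-based notation."""
    return np.array([[y[k + i + j] for j in range(cols)] for i in range(rows)])


def realize(y: np.ndarray, m: int):
    """Estimate (A, C, K) of order m from one epoch, following (8)-(14)."""
    rows = 2 * m                      # block size p = 2m, as in the text
    cols = len(y) - rows              # assumption: leaves one sample for H(1)
    H0 = hankel(y, 0, rows, cols)
    H1 = hankel(y, 1, rows, cols)
    R, s, St = np.linalg.svd(H0, full_matrices=False)
    Rm, Sm = R[:, :m], St[:m, :].T    # keep the m dominant singular vectors
    sqrt_s = np.sqrt(s[:m])
    # (12): A = S^{-1/2} Rm^T H(1) Sm S^{-1/2}, with S = diag(s[:m])
    A = (Rm.T @ H1 @ Sm) / np.outer(sqrt_s, sqrt_s)
    # (13): first row of Rm S^{1/2}  (assumed reading of E)
    C = (Rm * sqrt_s)[:1, :]
    # (14): first column of S^{1/2} Sm^T  (assumed reading of E)
    K = (sqrt_s[:, None] * Sm.T)[:, :1]
    return A, C, K
```

As a sanity check on the sketch, a noise-free damped cosine $0.95^{t} \cos(0.2 t)$ is a second-order signal, and the eigenvalues of the recovered $A$ have magnitude 0.95 and angle ±0.2 rad.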

The constant $m$ denotes the order of the system. This study estimated models of the 3rd–10th orders. To select the most suitable order for our study, on the one hand, we analyzed the performance of model estimation (Fit), which indicates the similarity between the original data and the simulated output of the estimated model. The Fit was evaluated using the normalized root mean squared error (NRMSE):

$$Fit = 100 \times \left(1 - \frac{\sqrt{\frac{1}{1000} \sum_{T=1}^{1000} \left(y_{T}^{sim} - y_{T}\right)^{2}}}{\sigma(y)}\right) \tag{15}$$

where $y_{T}^{sim}$ is the estimated output at time $T$, and $\sigma(y)$ is the standard deviation of the EEG epoch.
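A minimal sketch of the Fit computation is given below. It assumes that Fit is expressed as a percentage, $100 \times (1 - \mathrm{NRMSE})$, which matches the fit values of about 98% reported in the Results, and that $\sigma(y)$ is the population standard deviation; both points are assumptions, as is the function name.

```python
import math
from statistics import pstdev
from typing import Sequence


def fit_percent(y: Sequence[float], y_sim: Sequence[float]) -> float:
    """Fit in percent: 100 * (1 - RMSE(y_sim, y) / std(y))."""
    # Root mean squared error between simulated and measured output.
    rmse = math.sqrt(sum((s - v) ** 2 for s, v in zip(y_sim, y)) / len(y))
    # Assumption: sigma(y) is the population standard deviation of the epoch.
    return 100.0 * (1.0 - rmse / pstdev(y))
```

Under this definition, a perfect simulation scores 100, and a model that only predicts the epoch mean scores 0.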

On the other hand, the final system order was also determined by the performance of feature classification, as we fed features from models of different orders into various classifiers.

2.5. Features and Classification

For each $1 \times 1000$ EEG epoch vector, $Y_{i}$, the state matrices are

$$A = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{m1} & \dots & a_{mm} \end{bmatrix}, \quad C = \begin{bmatrix} c_{1} & \dots & c_{m} \end{bmatrix}, \quad K = \begin{bmatrix} k_{1} & \dots & k_{m} \end{bmatrix}^{T} \tag{16}$$

The $1 \times (m^{2} + 2m)$ feature vector to be fed into the classifier should consist of $A$, $C$, and $K$, as

$$feature_{T} = [a_{11} \dots a_{mm}, c_{1} \dots c_{m}, k_{1} \dots k_{m}] \tag{17}$$
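Assembling the feature vector in (17) amounts to flattening the three matrices and concatenating them. The sketch below assumes row-major flattening of $A$ (so that $a_{11}, a_{12}, \dots$ come first); the function name is hypothetical.

```python
import numpy as np


def feature_vector(A, C, K) -> np.ndarray:
    """Concatenate A (row-major), C, and K into one 1-D feature vector, as in (17)."""
    return np.concatenate([np.ravel(A), np.ravel(C), np.ravel(K)])
```

For a fifth-order model, this gives $5^{2} + 2 \times 5 = 35$ features per epoch.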

For each EEG recording, there were $p$ feature vectors, as described in (2). For example, the Jefferson dataset contained 149,595 feature vectors, of which 49,595 were labeled as seizures. Each feature vector was labeled according to the annotation provided by the expert reviewers. There were only two classes, “1” for seizure and “0” for non-seizure.

Sensitivity, specificity, and accuracy were used as the performance statistics. These terms can be determined by the “Standard of Truth” [19] as true positive (TP), true negative (TN), false negative (FN), and false positive (FP), as given in (18)–(20).

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{18}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{19}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{20}$$

Both TP and TN indicate agreement between the annotation and the classifier, while FP and FN indicate disagreement. In this study, we defined seizure as positive and non-seizure as negative. Therefore, the sensitivity and specificity represent how well a classifier identifies seizures and non-seizures, respectively. The accuracy is the proportion of correct labels throughout the entire dataset.
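The three statistics in (18)–(20) can be computed directly from paired labels. The sketch below uses the label convention from the text (1 for seizure, 0 for non-seizure); the function name is hypothetical, and the code assumes both classes are present so that no denominator is zero.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy from 0/1 labels, as in (18)-(20)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),          # (18)
        "specificity": tn / (tn + fp),          # (19)
        "accuracy": (tp + tn) / len(pairs),     # (20)
    }
```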

All three statistics were verified by the 10-fold cross-validation method. The EEG epochs were first shuffled randomly and divided into ten equal portions. Then, 90% of the EEG epochs were used as a training dataset, while the rest, 10%, were for testing. This cross-validation process was repeated ten times, with each of the ten portions used once as the validation dataset. The final evaluation statistics reported in this study were the average value obtained from the above ten tests.
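The 10-fold procedure described above can be sketched with the standard library alone. The function name, the fixed seed, and the interleaved fold assignment are illustrative assumptions; the text does not say which toolbox or splitting routine was used.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # shuffle epochs once
    folds = [indices[i::k] for i in range(k)]    # k near-equal portions
    for i in range(k):
        test = folds[i]                          # each portion is tested once
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

The reported statistic would then be the mean of the per-fold values, for example the mean of the ten accuracy figures.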

To validate the efficacy of our proposed model and feature set for automated seizure detection, this study selected kernel naïve Bayes, linear discriminant, linear SVM, fine KNN, and bagged trees as the candidate classifiers. By employing a range of basic, yet diverse, classifiers, we aimed to test our model’s robustness and generalizability across different algorithmic approaches. We trained all the classifiers with features converted from 3rd- to 10th-order models. The comparison was made across different classifiers and orders to identify the most suitable pair. Once the system order and classifier were settled, we performed the validation by applying our method to a publicly available dataset: CHB-MIT.

3. Results

3.1. Model Estimation

State-space models of orders 3 to 10 were estimated for each EEG epoch, and the model order, $m$, was chosen based on a trade-off between the feature’s size and the classifier’s performance. Choosing the appropriate model order is vital for an automated seizure detection system, as it can affect the system’s performance. The higher the order, the more complex the model, which can lead to overfitting. On the other hand, if the order is too low, the model may not be able to capture the essential dynamics of the EEG signal. The feature size is the number of parameters used to represent the EEG signal and was provided in (17) as $1 \times (m^{2} + 2m)$. Therefore, the selected order should be as small as possible to minimize the training cost while maintaining a reasonable $Fit$ accuracy and a decent classifying performance.

The average $Fit$ accuracy was relatively steady at 98.3% ± 0.11% across the selection of model orders from 3 to 10. Because the fit varied so little, the decision on the model order was based solely on the classifier’s performance rather than the fit accuracy of the state-space model. By selecting a smaller model order, the feature size will be smaller, reducing the computational cost of the classifier and allowing for faster training and prediction.

3.2. Classification

We chose kernel naïve Bayes, linear discriminant, SVM, KNN, and decision trees as the classifiers to train on our features. These basic classifiers consume less computing time and fewer resources to train and validate than deep learning, which suits our aim of showing that the system identification method can detect seizures effectively and efficiently. By feeding 1 s epoch feature vectors of the 3rd- to 10th-order systems into the classifiers, we could then evaluate the performance of different model orders and determine which order is most appropriate for our application.

The results are illustrated in Figure 6 and Table 2. The relationship between system order and validation accuracy was not consistent among the classifiers. The accuracy of KNN and decision trees initially rose with the order and peaked at the fifth order. The accuracy of kernel naïve Bayes, linear discriminant, and SVM increased as the system order increased from 3 to 9 and then declined at order 10. Moreover, decision trees outperformed the other classifiers at every system order. KNN caught up with the others at the fifth order with a 93.6% accuracy. With the fifth-order system, the decision trees provided the highest accuracy, 96.0%, with a sensitivity of 92.7% and a specificity of 97.6%. These results indicate that the decision tree classifier was better suited for this analysis and that choosing the fifth model order balanced complexity and performance.

We also estimated fifth-order systems for epochs of different lengths, namely 2 s, 5 s, and 10 s, in addition to the 1 s long epochs. The decision tree classifier was then trained on the feature vectors obtained from these epochs. The results for the different epoch lengths are listed and compared in Table 3 to show the effect of the epoch length on the system’s performance. As seen in the table, performance metrics decreased as the epoch length increased. Specifically, accuracy dropped from 96.0% to 94.9%, sensitivity fell from 92.7% to 91.5%, and specificity fell from 97.6% to 96.7% as the epoch length grew from 1 to 10 s. These results indicate that the choice of epoch length impacts the performance of the seizure detection system.

The analysis of the ten continuous EEG recordings showed that, on average, the system produced 0.5 false detections per hour of the EEG recording, with a margin of error of ±0.28 false detections per hour. In addition, by using a 30 s decision window and 1 s long epochs with 0.5 s overlaps, the method could detect all seizures within the first feasible window, with no additional delay other than the minimum required 15 s within each 30 s window to make the decision. Figure 7 shows a continuous EEG recording lasting one minute. During this recording, a seizure began at 46 min and 47 s. Our detection method accurately identified this seizure. More information about the overall results can be found in Table 4.

To validate the proposed method and conduct a further assessment, we also applied this method to the CHB-MIT dataset. This method achieved 94.1% accuracy, with 87.6% sensitivity and 97.5% specificity. Hence, our proposed method works on an independent dataset. Additionally, we observed that these results were not subject-specific but generalized over the entire dataset. Table 5 illustrates a comparison of the performance results of other studies that also worked on the whole CHB-MIT database. The results show that the proposed method outperforms other studies regarding accuracy and specificity. Sensitivity was better than all but one method.

4. Discussion

The present analysis shows that state-space modeling, combined with a decision tree machine learning classifier, is an effective approach to automated seizure detection. We achieved good sensitivity and specificity, as well as accuracy. However, accuracy cannot fully represent the actual performance of a detector. The reason is that most epochs are non-seizure epochs for any EEG recording. For example, an accuracy of about 89% could result from 99% specificity and only 50% sensitivity if 80% of the dataset is negative. Hence, the problem at hand is complex. For clinical utility, both high sensitivity and specificity are required. Sensitivity is most important, as failure to detect seizures makes any detection system less useful. Efficiency is achieved with high specificity, minimizing the time needed for human review. The method described in this paper yielded superior accuracy and specificity compared with most other reports. Our method was better at identifying epochs without seizures and comparably good at detecting seizures. This result is acceptable for preliminary work but may be insufficient for clinical needs and requires further improvement, as each patient can yield a unique EEG wavelet pattern [25] due to differing seizure types, and some patients have multiple seizure types. Therefore, if we model and train on individual EEGs yet characterize them as similar, the detection rate is likely to decrease.
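The arithmetic behind the accuracy example in this paragraph can be written out explicitly. As a sketch, let $\pi$ (a symbol introduced here only for illustration) denote the fraction of seizure epochs, so that $\pi = 0.2$ when 80% of the dataset is negative. Dividing the numerator of (20) by the total number of epochs gives:

```latex
\[
\mathrm{Accuracy}
  = \pi \cdot \mathrm{Sensitivity} + (1 - \pi) \cdot \mathrm{Specificity}
  = 0.2 \times 0.50 + 0.8 \times 0.99
  = 0.892
\]
```

Roughly nine in ten epochs are labeled correctly in this scenario even though half of the seizure epochs are missed, which is why accuracy alone is not a sufficient measure.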

There are other studies that have focused on a “case-by-case” situation. For example, Zabihi et al. [26] performed a subject-specific study with an average of 93.7% sensitivity and a specificity of 99.05% in four subjects. Another study [27] proposed a patient-dependent system with 97.12% specificity and 99.29% sensitivity. Alternatively, an individual classifier can be built for each channel and seizure pattern [28], eventually reaching an average accuracy of 95.12%. Similarly, for a subject-basis analysis, our method generated an average of 98.3% accuracy, 97.4% sensitivity, and 98.4% specificity. However, this level of performance has only been shown post hoc, and the clinical problem of seizure detection in large numbers of patients is not suited to this approach. Such a precisely customized classifier would only prove useful if individualized preliminary data were available for training. Therefore, both comprehensive dataset-based and customized classifiers can be useful, though the former are best suited when an unknown EEG dataset must be analyzed. A classifier pre-trained on a full-scale dataset validates itself in robustness, adaptation, and performance. A customized classifier would likely be better with delayed deployment after onsite training.

Our method demonstrated its clinical potential on a dataset of 10 continuous EEG recordings. By using a 30 s decision window and 1 s long epochs with 0.5 s overlap, the system could accurately detect all seizures within the first feasible window. The low false detection rate supports the system’s effectiveness. On the other hand, the sizeable spread in the false detection rate (±0.28 per hour) indicates variability across the ten patients. Several solutions might address this issue, including improving the signal quality by implementing advanced noise reduction techniques or re-evaluating the decision threshold to account for the impact of noise and artifacts on the data. Additionally, extra data sources, such as clinical history, demographic information, or behavioral data, could be integrated into the model to reduce the false detection rate.

Additional work is required. The proposed method is still in the early stages of development, and there is room for improvement. Our main analysis was restricted to one-second epochs. A one-size-fits-all approach may not be appropriate. However, as seen in Table 3, increasing epoch size is associated with decreased detection sensitivity. One possible reason for this is that using longer epochs reduces the number of segments available for training, which negatively impacts the classifier’s overall performance. Also, the length of the EEG epoch can affect the sensitivity and specificity of seizure detection methods. Shorter epochs can increase the temporal resolution but may miss ongoing seizures with ictal patterns that last longer than the epoch length [29]. On the other hand, longer epochs may provide a better view of prolonged seizures but increase the chance of missing short seizures that occur within these longer epochs. Therefore, it is essential to carefully consider the trade-offs and choose an appropriate epoch length for the specific application and dataset.

One of our proposed method’s limitations is that it does not specify the seizure types. Defining seizure types may allow the classifiers to be trained to recognize specific seizure types and improve the method’s specificity. However, it is important to acknowledge that this hypothesis must be tested and validated through further research and experimentation to determine its actual impact on the model’s specificity. Another area for improvement is the method’s robustness in dealing with artifacts, as artifacts can significantly affect accuracy. Deep learning techniques, such as convolutional neural networks (CNNs), have shown promise in identifying and removing artifacts from EEG signals [30]. These techniques could increase the proposed method’s robustness and performance.

5. Conclusions

The proposed system leverages a state-space-model-based system identification method for automated seizure detection in EEG recordings. By processing EEG time series signals, this method constructs mathematical models to efficiently represent signal epochs with a minimal set of parameters. These parameters, utilized as features for machine learning classifiers, demonstrated the efficacy of combining a fifth-order dynamic system with a decision tree classifier. An evaluation of this approach using the Jefferson and CHB-MIT datasets yielded high accuracies and excellent specificities while also indicating that sensitivity has the potential for enhancement. Adjusting the classifier on an individual basis substantially improved sensitivity and accuracy, underscoring the effectiveness of personalized detection strategies. The proposed method also demonstrated its potential in a clinical setting on a dataset of ten continuous EEG recordings. Future developments could focus on integrating larger and more diverse datasets alongside advanced deep learning classifiers to broaden the system’s capability in identifying various seizure types with increased sensitivity. This work lays the foundation for more automated, accurate, and efficient seizure diagnosis, promising to augment clinical practice and patient outcomes.

Author Contributions

The authors confirm their contributions to the paper as follows: Z.W. and A.G. made a substantial contribution to the study conception and design; M.R.S. and D.W. made a substantial contribution to the acquisition of data; Z.W. and A.G. made a significant contribution to the analysis of data; and Z.W. and M.R.S. drafted and edited the article. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The data-sharing project associated with this study was approved by the Institutional Review Board of Thomas Jefferson University Hospital.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting this study’s findings are available from the corresponding authors upon reasonable request. The data are not publicly available due to data-sharing restrictions imposed by the data provider.

Acknowledgments

We extend our heartfelt thanks to Karishma Kurowski, Zachary Waldman, and other staff members at Thomas Jefferson University Hospital and Drexel University for their invaluable contributions and dedicated efforts in handling the extensive paperwork required for this study. Their commitment and meticulous attention to detail were instrumental in facilitating the successful completion of our research. We deeply appreciate their hard work and support throughout this project.

Conflicts of Interest

Michael Sperling has received compensation for speaking at CME programs from Medscape, Projects for Knowledge, International Medical Press, and Darnitsa. He has consulted for Medtronic, Neurelis, and Johnson & Johnson. He has received research support from Medtronic; Neurelis; SK Life Science; Takeda; Xenon; Cerevel; UCB Pharma; Janssen; Equilibre; Epiwatch; and Byteflies. He has received royalties from Oxford University Press and Cambridge University Press. None of the other authors have any conflicts of interest to disclose.

References

1. World Health Organization. Epilepsy. Available online: https://www.who.int/en/news-room/fact-sheets/detail/epilepsy (accessed on 1 August 2022).
2. Epilepsy Foundation. Ambulatory EEG. Available online: https://www.epilepsy.com/learn/diagnosis/eeg/ambulatory-eeg (accessed on 1 August 2022).
3. Brophy, G.M.; Bell, R.; Claassen, J.; Alldredge, B.; Bleck, T.P.; Glauser, T.; LaRoche, S.M.; Riviello, J.J.; Shutter, L.; Sperling, M.R.; et al. Guidelines for the Evaluation and Management of Status Epilepticus. Neurocrit. Care 2012, 17, 3–23.
4. Stanford Health Care. Neurodiagnostic Labs. Available online: https://stanfordhealthcare.org/medical-clinics/neurodiagnostic-labs/services.html (accessed on 15 March 2024).
5. Smith, S.J. EEG in the diagnosis, classification, and management of patients with epilepsy. J. Neurol. Neurosurg. Psychiatry 2005, 76 (Suppl. S2), ii2–ii7.
6. Deburchgraeve, W.; Cherian, P.J.; De Vos, M.; Swarte, R.M.; Blok, J.H.; Visser, G.H.; Govaert, P.; Van Huffel, S. Automated neonatal seizure detection mimicking a human observer reading EEG. Clin. Neurophysiol. 2008, 119, 2447–2454.
7. Temko, A.; Thomas, E.; Marnane, W.; Lightbody, G.; Boylan, G. EEG-based neonatal seizure detection with support vector machines. Clin. Neurophysiol. 2011, 122, 464–473.
8. Acharya, U.R.; Oh, S.L.; Hagiwara, Y.; Tan, J.H.; Adeli, H. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med. 2018, 100, 270–278.
9. Hassanpour, H.; Mesbah, M.; Boashash, B. Time-frequency feature extraction of newborn EEG seizure using SVD-based techniques. EURASIP J. Adv. Signal Process. 2004, 2004, 898124.
10. Wang, L.; Xue, W.; Li, Y.; Luo, M.; Huang, J.; Cui, W.; Huang, C. Automatic epileptic seizure detection in EEG signals using multi-domain feature extraction and nonlinear analysis. Entropy 2017, 19, 222.
11. Shen, H.; Xu, M.; Guez, A.; Li, A.; Ran, F. An accurate sleep stages classification method based on state space model. IEEE Access 2019, 7, 125268–125279.
12. Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 2000, 101, e215–e220.
13. Gómez, C.; Arbeláez, P.; Navarrete, M.; Alvarado-Rojas, C.; Le Van Quyen, M.; Valderrama, M. Automatic seizure detection based on imaged-EEG signals through fully convolutional networks. Sci. Rep. 2020, 10, 21833.
14. Shoeb, A.H. Application of Machine Learning to Epileptic Seizure Onset Detection Treatment. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2009.
15. Gotman, J. Automatic recognition of epileptic seizures in the EEG. Electroencephalogr. Clin. Neurophysiol. 1982, 54, 530–540.
16. MathWorks. System Identification Overview. Available online: https://www.mathworks.com/help/ident/gs/about-system-identification.html (accessed on 2 August 2022).
17. Caicedo, J.M. Practical guidelines for the natural excitation technique (NExT) and the eigensystem realization algorithm (ERA) for modal identification using ambient vibration. Exp. Tech. 2011, 35, 52–58.
18. Van Overschee, P.; De Moor, B. N4SID: Subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica 1994, 30, 75–93.
19. Zhu, W.; Zeng, N.; Wang, N. Sensitivity, specificity, accuracy, associated confidence interval and ROC analysis with practical SAS implementations. In Proceedings of the NESUG Proceedings: Health Care and Life Sciences, Baltimore, MD, USA, 14 November 2010; Volume 19, p. 67.
20. Khan, Y.U.; Rafiuddin, N.; Farooq, O. Automated seizure detection in scalp EEG using multiple wavelet scales. In Proceedings of the IEEE International Conference on Signal Processing, Computing and Control, Solan, India, 15–17 March 2012; pp. 1–5.
21. Gill, A.F.; Fatima, S.A.; Usman Akram, M.; Khawaja, S.G.; Awan, S.E. Analysis of EEG Signals for Detection of Epileptic Seizure Using Hybrid Feature Set. In Theory and Applications of Applied Electromagnetics; Sulaiman, H., Othman, M., Abd. Aziz, M., Abd Malek, M., Eds.; Lecture Notes in Electrical Engineering; Springer: Cham, Switzerland, 2015; Volume 344.
22. Lima, C.A.M.; Coelho, A.L.V. Kernel machines for epilepsy diagnosis via EEG signal classification: A comparative study. Artif. Intell. Med. 2011, 53, 83–95.
23. Birjandtalab, J.; Pouyan, M.B.; Cogan, D.; Nourani, M.; Harvey, J. Automated seizure detection using limited-channel EEG and nonlinear dimension reduction. Comput. Biol. Med. 2017, 82, 49–58.
24. Fergus, P.; Hussain, A.; Hignett, D.; Al-Jumeily, D.; Abdel-Aziz, K.; Hamdan, H. A machine learning system for automated whole-brain seizure detection. Appl. Comput. Inform. 2016, 12, 70–89.
25. Sperling, M.R. EEG Reading Session. J. Clin. Neurophysiol. 2006, 23, 230–237.
26. Zabihi, M.; Kiranyaz, S.; Ince, T.; Gabbouj, M. Patient-specific epileptic seizure detection in long-term EEG recording in paediatric patients with intractable seizures. In Proceedings of the IET Intelligent Signal Processing Conference 2013 (ISP 2013), London, UK, 2–3 December 2013.
27. Pinto-Orellana, M.A.; Cerqueira, F.R. Patient-specific epilepsy seizure detection using random forest classification over one-dimension transformed EEG data. In Proceedings of the International Conference on Intelligent Systems Design and Applications, Porto, Portugal, 16–18 December 2016; Springer: Cham, Switzerland, 2016; pp. 519–528.
28. Ahmad, M.A.; Khan, N.A.; Majeed, W. Computer assisted analysis system of electroencephalogram for diagnosing epilepsy. In Proceedings of the 2014 22nd International Conference on Pattern Recognition (ICPR), Stockholm, Sweden, 24–28 August 2014; IEEE Computer Society: Washington, DC, USA, 2014; pp. 3386–3391.
29. Jadeja, N.M. Montages. In How to Read an EEG; Cambridge University Press: Cambridge, MA, USA, 2021; pp. 149–159.
30. Nordin, A.D.; Hairston, W.D.; Ferris, D.P. Dual-electrode motion artifact cancellation for mobile electroencephalography. J. Neural Eng. 2018, 15, 056024.

Figure 1. Exemplar non-seizure (left) and seizure (right) EEG data.

Figure 2. Schematic overview of the complete seizure detection algorithm.

Figure 3. Comparison between raw (left) and filtered (right) data within a 20 s window.

Figure 4. Epochs with increments.

Figure 5. Block diagram for the proposed dynamic state-space model.

Figure 6. Classifier accuracy over systems of different orders.

Figure 7. Exemplar of seizure EEG and its prediction score.

Table 1. Number of epochs generated for each epoch length.

Epoch Size	1 s	2 s	5 s	10 s
Seizure Epochs	49,595	24,386	9837	4866
Non-Seizure Epochs	100,000	49,611	19,182	9340
Total Epochs	149,595	73,997	29,019	14,206
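
As a minimal sketch of how overlapping epochs like those counted in Table 1 might be generated, the function below slides a fixed-length window over a signal, advancing by an increment of half the epoch length (the epoch/increment pairs listed in Table 3). The function name `split_epochs`, the 256 Hz sampling rate, and the placeholder signal are assumptions made for illustration, not the implementation used in this study.

```python
def split_epochs(signal, fs, epoch_s, increment_s):
    """Split a 1-D sequence into overlapping epochs.

    fs is the sampling rate in Hz; epoch_s and increment_s are in seconds.
    Trailing samples that do not fill a whole epoch are dropped.
    """
    size = round(epoch_s * fs)       # samples per epoch
    step = round(increment_s * fs)   # samples between consecutive epoch starts
    if size <= 0 or step <= 0:
        raise ValueError("epoch and increment must span at least one sample")
    last_start = len(signal) - size
    return [signal[i:i + size] for i in range(0, last_start + 1, step)]


fs = 256                        # assumed sampling rate (Hz), for illustration only
signal = list(range(10 * fs))   # 10 s of placeholder samples, not real EEG

# Epoch length / increment pairs: each increment is half the epoch length.
for epoch_s, increment_s in [(1, 0.5), (2, 1), (5, 2.5), (10, 5)]:
    epochs = split_epochs(signal, fs, epoch_s, increment_s)
    print(f"{epoch_s} s epochs from a 10 s signal: {len(epochs)}")
```

With a 50% overlap, halving the epoch length roughly doubles the number of epochs drawn from the same recording, which is the pattern visible across the columns of Table 1.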

Table 2. Classifier accuracy over systems of different orders.

Classifiers	3rd	4th	5th	6th	7th	8th	9th	10th
Discriminant	72.7%	73.0%	74.1%	76.1%	77.9%	77.6%	77.5%	77.3%
Bayes	72.9%	73.5%	74.2%	76.1%	76.2%	76.1%	75.5%	72.3%
KNN	89.7%	91.9%	93.6%	92.4%	90.8%	89.2%	87.8%	84.6%
SVM	72.0%	72.3%	73.5%	75.6%	76.6%	77.1%	76.5%	72.3%
Trees	93.8%	94.9%	96.0%	95.1%	94.8%	94.1%	93.6%	93.1%
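
The short script below is one way to read Table 2: it looks up, for each classifier, the model order with the highest accuracy. The accuracy values are copied from the table; the variable and function names are illustrative assumptions.

```python
orders = [3, 4, 5, 6, 7, 8, 9, 10]

# Accuracy (%) per classifier and model order, copied from Table 2.
accuracy = {
    "Discriminant": [72.7, 73.0, 74.1, 76.1, 77.9, 77.6, 77.5, 77.3],
    "Bayes":        [72.9, 73.5, 74.2, 76.1, 76.2, 76.1, 75.5, 72.3],
    "KNN":          [89.7, 91.9, 93.6, 92.4, 90.8, 89.2, 87.8, 84.6],
    "SVM":          [72.0, 72.3, 73.5, 75.6, 76.6, 77.1, 76.5, 72.3],
    "Trees":        [93.8, 94.9, 96.0, 95.1, 94.8, 94.1, 93.6, 93.1],
}


def best_order(scores):
    """Return (model order, accuracy) for the highest accuracy in a row."""
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return orders[best_index], scores[best_index]


best = {name: best_order(scores) for name, scores in accuracy.items()}
for name, (order, acc) in best.items():
    print(f"{name}: best at order {order} ({acc}%)")
```

The two strongest classifiers in the table (Trees and KNN) both peak at the fifth order, whereas the weaker ones peak at the seventh or eighth order without approaching the decision tree's accuracy.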

Table 3. Training results for different-length epochs.

Epoch Size	Increment	Sensitivity	Specificity	Accuracy
1 s	0.5 s	92.7%	97.6%	96.0%
2 s	1 s	92.6%	97.4%	95.8%
5 s	2.5 s	92.3%	97.0%	95.4%
10 s	5 s	91.5%	96.7%	94.9%
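
As a worked check on how the three metrics in Table 3 relate, accuracy is the class-weighted average of sensitivity and specificity. The sketch below recomputes accuracy from the reported sensitivity and specificity and the epoch counts in Table 1. It assumes the metrics were computed over the full epoch counts of Table 1, which is an assumption made here for illustration; the names are likewise illustrative.

```python
# (seizure epochs, non-seizure epochs) per epoch length, from Table 1.
counts = {
    "1 s": (49595, 100000),
    "2 s": (24386, 49611),
    "5 s": (9837, 19182),
    "10 s": (4866, 9340),
}

# (sensitivity %, specificity %, reported accuracy %) from Table 3.
reported = {
    "1 s": (92.7, 97.6, 96.0),
    "2 s": (92.6, 97.4, 95.8),
    "5 s": (92.3, 97.0, 95.4),
    "10 s": (91.5, 96.7, 94.9),
}


def accuracy_from_rates(sensitivity, specificity, n_seizure, n_non_seizure):
    """Accuracy (%) implied by sensitivity, specificity, and class sizes."""
    true_positives = sensitivity / 100 * n_seizure
    true_negatives = specificity / 100 * n_non_seizure
    return 100 * (true_positives + true_negatives) / (n_seizure + n_non_seizure)


implied = {}
for epoch, (sens, spec, acc) in reported.items():
    implied[epoch] = accuracy_from_rates(sens, spec, *counts[epoch])
    print(f"{epoch}: implied accuracy {implied[epoch]:.1f}% (reported {acc}%)")
```

Because non-seizure epochs outnumber seizure epochs by about two to one, accuracy sits closer to specificity than to sensitivity in every row.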

Table 4. Detail of 10 continuous EEG records.

Subject	EEG Duration (h)	Number of Seizures	Seizure Length (s)	Total False Detections	False Detections per Hour
1	24	1	576	5	0.2
2	24	7	43~104	10	0.4
3	21.3	44	30~310	18	0.8
4	24.1	114	30~49	28	1.2
5	24	13	30~120	10	0.4
6	19.7	120	16~120	10	0.5
7	24	1	122	6	0.3
8	24.1	72	19~180	12	0.5
9	24.8	33	37~77	10	0.4
10	34.4	6	58~297	9	0.3
Total	244.4	411		118
Average	24.44	41.1		11.8	0.5
STDEV.	3.6	43.6		6.3	0.28
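
The summary rows of Table 4 can be recomputed from the per-subject rows, as in the sketch below. It assumes the reported standard deviation is a population standard deviation of the per-subject rates, since that choice reproduces the tabulated 0.28; the variable names are illustrative.

```python
from statistics import mean, pstdev

# (EEG duration in hours, total false detections) per subject, from Table 4.
records = [
    (24, 5), (24, 10), (21.3, 18), (24.1, 28), (24, 10),
    (19.7, 10), (24, 6), (24.1, 12), (24.8, 10), (34.4, 9),
]

# False detections per hour for each subject.
rates = [false_detections / hours for hours, false_detections in records]

mean_rate = mean(rates)   # average of the per-subject rates
sd_rate = pstdev(rates)   # population SD (assumed; see the note above)

# Pooled rate: all false detections divided by all recorded hours.
total_hours = sum(hours for hours, _ in records)
total_false = sum(false_detections for _, false_detections in records)
pooled_rate = total_false / total_hours

print(f"mean of per-subject rates: {mean_rate:.2f} per hour")
print(f"SD of per-subject rates:   {sd_rate:.2f} per hour")
print(f"pooled rate:               {pooled_rate:.2f} per hour")
```

The mean of the per-subject rates and the pooled rate both round to 0.5 false detections per hour, so the headline figure does not depend on which averaging convention is used.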

Table 5. Performance comparison of studies on whole CHB-MIT datasets.

Method	Accuracy	Sensitivity	Specificity
Khan [20]	91.8%	83.6%	100%
Gill [21]	86.93%	86.26%	87.58%
Lima [22]	88.45%	85.59%	91.32%
Birjandtalab [23]	-	80.87%	47.45%
Fergus [24]	-	84%	85%
This Work	94.1%	87.6%	97.5%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
