Abstract

Epilepsy is one of the most frequently diagnosed neurological conditions. Electroencephalography (EEG) is the primary tool used in diagnosing and analyzing epilepsy. Epileptic EEG data display the electrical activity of neurons and provide a significant amount of knowledge on pathology and physiology. Because visual inspection of these recordings is time-consuming, several automated classification methods have been developed. In this paper, three wavelets, namely Haar, dB4, and Sym8, are employed to extract features from the A and E sets of the Bonn epilepsy dataset. A Particle Swarm Optimization (PSO) technique is applied to select the best features of epileptic seizures. The extracted features are further classified using seven classifiers: linear regression, nonlinear regression, Gaussian Mixture Modeling (GMM), K-Nearest Neighbor (KNN), Support Vector Machine (SVM-linear), SVM (polynomial), and SVM Radial Basis Function (RBF). Classifier performance is analyzed through benchmark parameters such as sensitivity, specificity, accuracy, F1 score, error rate, and G-mean. The SVM classifier with RBF kernel applied to Sym8 wavelet features with the PSO feature selection method attains the highest accuracy rate of 98% with an error rate of 2%, outperforming all other classifiers.

1. Introduction

Epilepsy is a serious and potentially fatal neurological disorder. Approximately 1% of the world's population suffers from this ailment. It is normally identified by analyzing EEG signals [1]. In clinics, visual observation of EEG signals is relied on as the standard method of detection. This type of detection is time-consuming and error-prone. Above all, an epileptic seizure should be diagnosed timely and accurately before the patient enters an ictal state [2]. Hence, an accurate seizure detection system would be a major boon to humanity. Various seizure detection techniques have been investigated; these methods are broadly classified into three major groups: feature extraction techniques, feature selection, and classifiers [3]. The interpretation and identification of epilepsy using EEG signals have emerged as an interesting field of study in the last few decades. Identification of epileptic seizures, spike detection, interictal and ictal analysis, linear and nonlinear analysis, and optimization algorithms have all been extensively studied [4].

Epilepsy is characterized by abrupt disturbances in the brain's electrical activity, and it afflicts a significant number of individuals all over the globe. Epilepsy can lead to many serious injuries, such as broken bones, accidents, and burns; some of these injuries can even be fatal. The condition also imposes a very high societal cost on middle-class families, causing them a great deal of financial difficulty. Both surgical and pharmaceutical approaches may be used, depending on the severity of the patient's epilepsy, to treat the condition successfully [3]. Antiepileptic medication cannot properly manage seizures in all patients, and surgery may not be an option for certain patients due to the severity of their condition [4].

Therefore, forecasting the onset of an epileptic seizure and then identifying the kind of seizure that has occurred is highly significant. The techniques for feature extraction, feature selection, and classification are explained in detail in this article. A significant number of publications on the identification of epilepsy from EEG data have been presented in the literature.

1.1. Related Works

Discrete wavelet transform (Haar, dB4, Sym8) was employed to extract EEG signal features, and epilepsy risk levels were identified using EM, MEM, and SVD classifiers with a code converter technique by Harikumar et al. [4], achieving an overall accuracy of 97.03%. Murugavel and Ramakrishnan [5] utilized the wavelet transform with approximate entropy to extract EEG signal features and a multiclass SVM with ELM to identify epileptic seizures, reaching 96% classification accuracy. Truong et al. [6] described a hills algorithm to extract EEG features with a sensitivity of 91.95% and a specificity of 94.05%, and their data demonstrated the efficacy of their proposed approach. Manjusha and Harikumar [7] proposed detrended fluctuation analysis with power spectral density to reduce the dimensionality of EEG data; K-means clustering and a KNN classifier were applied to identify epilepsy risk levels, achieving 90.48% sensitivity and 92.85% specificity. Radüntz et al. [8] applied a support vector machine (SVM) and an artificial neural network (ANN) to identify epilepsy risk levels and found that the ANN was more accurate than the SVM (95.85% vs. 94.04%).

Ijaz et al. [9] utilized a hybrid prediction model with density-based spatial clustering of applications with noise (DBSCAN) to detect outliers in diabetes and hypertension data, and a synthetic minority oversampling technique with random forest to identify diabetes and hypertension, reaching 92.56% classification accuracy. Vulli et al. proposed fastai and a one-cycle policy with a tuned DenseNet-169 to normalize breast data; the proposed model was used to detect breast cancer metastasis and achieved 97.4% accuracy [10]. Ghaemi et al. [11] utilized an improved binary gravitation search algorithm in the wavelet domain to extract EEG signal features and an SVM to identify the optimal channels, reaching 80% classification accuracy. Binary particle swarm optimization (BPSO) was used to choose the best channels, and Gonzalez et al. [12] used Fisher discriminant analysis to find auditory event-related potentials, which gave the best overall accuracy. Poli [13] analyzed the applications of particle swarm optimization (PSO). Independent component analysis (ICA) was employed to extract EMG signal features, and muscle activation intervals were identified using the wavelet transform by Azzerboni et al. [14]. Greco et al. [15] used ICA to minimize EMG signal interference and the Morlet wavelet transform to determine muscle activation intervals. To detect the features of an epileptic seizure, various expansion methods have been proposed in the literature, such as discrete wavelet transform (DWT), continuous wavelet transform (CWT), Fourier transform (FT), discrete Fourier transform (DFT), fast Fourier transform (FFT), and short-term Fourier transform (STFT). From the detailed literature survey, it can reasonably be concluded that DWT is the best method to detect seizure features. The DWT has the advantage of evaluating the signal in both the time and frequency domains.
The most important objectives of this research are as follows:
(a) In DWT, the Haar, dB4, and Sym8 techniques are proposed to detect seizure features.
(b) The Particle Swarm Optimization technique is proposed to select the best features.
(c) The features derived from DWT are fed into classifiers, which identify whether a signal is epileptic or not. Seven classifiers are used in this study: LR, NLR, GMM, K-NN, and SVM (linear, polynomial, and RBF).

The organization of the paper is as follows: Section 2 describes the materials and methods and explains the Haar, dB4, and Sym8 wavelet-based feature extraction of EEG signals; Section 3 discusses the PSO-based feature selection; Section 4 describes the classifiers; Section 5 presents the results and discussion; and Section 6 presents the conclusion and future work.

2. Materials and Methods

The suggested method for automated epileptic seizure detection is presented in this section. The schematic diagram of the proposed method is shown in Figure 1. In this schematic diagram, the effectiveness of the EEG signal is maximized in the feature extraction stage by using multiple feature extraction approaches. The remainder of this section provides a full discussion of the feature extraction techniques used. A Particle Swarm Optimization (PSO) technique is used to choose the best features of epileptic seizures after the features have been extracted. After feature extraction and selection, the extracted and selected features are deployed to several classifiers, and performance benchmark results are analyzed and compared. The most effective classifiers have the highest benchmark value. Next, the dataset and specifics of each subsystem are detailed. The implementation environment details of the study are given in Table 1.

As given in Table 1, each of the datasets from A to E has 100 epochs, with 4096 EEG sample values recorded in each epoch. The input EEG signals of [4096 × 100] samples per set are reduced to [256 × 100] approximation sample values after passing through the wavelets at level-4 decomposition. These [256 × 100] samples per set are further scaled down to [256 × 10] after PSO feature selection. All simulations were executed in the MATLAB 2019a environment.

2.1. Data Description

The publicly available Bonn University datasets are chosen for the analysis. The Bonn University EEG database contains sets A, B, C, D, and E, recorded at a sampling frequency of 173.6 Hz [16]. Dataset A represents the normal signal, and E represents the abnormal (epileptic seizure) signal considered in this analysis. The details of the dataset are exhibited in Table 2. Each set has 100 epochs with a recording period of 23.6 seconds. In sets (A) and (B), signals were obtained from healthy subjects without epilepsy, with set (A) recorded when the subjects' eyes were open and set (B) when their eyes were closed. Signals in sets (C), (D), and (E) were obtained from patients with epilepsy. Sets (C) and (D) were recorded from epileptic patients during seizure-free intervals, whereas set (E) was recorded during seizure activity [17]. Each epoch has 4096 samples of EEG signal. In this research, the analysis is performed on the A and E epilepsy sets only.

2.2. Wavelet Feature Extraction

In this work, the first step in analyzing epileptic seizures is the extraction of features from the EEG datasets obtained from the Bonn University database. Discrete wavelet transform (DWT) is used to extract the EEG features. The three wavelet families employed for feature extraction from the EEG signal (Bonn A and E) datasets at level-4 wavelet decomposition are the Haar wavelet, the dB4 wavelet (Daubechies), and the Sym8 wavelet (Symlet 8). After passing through the wavelets at level-4 decomposition, the input EEG signals of [4096 × 100] samples per set are reduced to [256 × 100] approximation coefficients. The essential features of these wavelets are described in the following section.
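As an illustration of the level-4 reduction described above, the sketch below computes Haar approximation coefficients by repeated pairwise averaging, shrinking a 4096-sample epoch to 256 coefficients. This is a minimal sketch of the Haar case only; in practice the dB4 and Sym8 decompositions would use a wavelet library such as PyWavelets (`pywt.wavedec`), and the signal here is simulated rather than taken from the Bonn data.

```python
import numpy as np

def haar_approx(signal, level=4):
    """Level-`level` Haar approximation coefficients, computed by
    pairwise averaging with orthonormal scaling: (a + b) / sqrt(2)."""
    coeffs = np.asarray(signal, dtype=float)
    for _ in range(level):
        # Each pass halves the length: average adjacent samples.
        coeffs = (coeffs[0::2] + coeffs[1::2]) / np.sqrt(2.0)
    return coeffs

# A simulated epoch: 4096 samples, matching the Bonn epoch length
epoch = np.random.randn(4096)
features = haar_approx(epoch, level=4)
print(features.shape)  # (256,)
```

Applied to all 100 epochs of a set, this yields the [256 × 100] feature matrix described in the text.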

2.2.1. Haar Wavelet

The Haar wavelet is essentially a discontinuous function that resembles a step function. It is identical to the Daubechies dB1 wavelet. Haar-based compression involves computing average and difference terms, storing detail coefficients, discarding data, and reconstructing the matrix so that it resembles the original [18]. The Haar wavelet is the only wavelet that is compactly supported, orthogonal, and symmetric. The Haar decomposition has excellent time localization because of the compact support of the Haar wavelets [19]. The Haar wavelet function and scaling function are expressed mathematically as follows:
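In the standard form (reconstructed here, since the typeset equations do not survive in this text), the Haar wavelet function ψ(t) and scaling function φ(t) are:

```latex
\psi(t) =
\begin{cases}
 1,  & 0 \le t < \tfrac{1}{2} \\
-1,  & \tfrac{1}{2} \le t < 1 \\
 0,  & \text{otherwise}
\end{cases}
\qquad
\varphi(t) =
\begin{cases}
1, & 0 \le t < 1 \\
0, & \text{otherwise}
\end{cases}
```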

2.2.2. dB4 Wavelet

Ingrid Daubechies, one of the leading figures in wavelet research, devised compactly supported orthonormal wavelets, which made discrete wavelet analysis feasible. N denotes the order of a Daubechies family wavelet, and dB is the wavelet's "family name." These wavelets are energy-preserving since they are orthogonal and compactly supported [20]. The dB4 wavelet function is utilized in this work. Due to the overlapping windows used by Daubechies (dB) wavelets, all high-frequency changes are reflected in the spectrum of the high-frequency coefficients. Filter coefficients are used to create the Daubechies (dB) family of wavelets and scaling functions [21]. The trigonometric polynomial associated with the filter is the first step in Daubechies' technique for constructing orthogonal compactly supported wavelets. The filter's element sequence is deduced as follows [22]:

The mathematical expression of the dB4 wavelet scaling function is represented as

By constructing this function to provide orthogonality and smoothness, a new family of wavelets may be generated. As dB4 has a very small basis function, it can separate signal discontinuities more effectively.

2.2.3. Sym8 Wavelet

The Symlet wavelet family name is an abbreviation of "symmetrical wavelets." These wavelets are constructed to have the least asymmetry and the greatest number of vanishing moments for a given compact support. In this work, a wavelet function of type Sym8 was used. The Sym8 wavelet is a nearly symmetrical and smooth wavelet function [23]. To identify the presence of nonlinearity in the wavelet features, statistical parameters, such as mean, variance, skewness, kurtosis, Pearson correlation coefficient, and Canonical Correlation Analysis (CCA), computed without the feature selection method, are given in Table 3.

As indicated in Table 3, the statistical parameters of the wavelet features depict the presence of nonlinearity in the A and E sets for all three wavelets. The Pearson Correlation Coefficient (PCC) exhibits virtually no correlation within the intra-epochs of the A set. At the same time, CCA demonstrates more correlation between the two classes of the A and E sets. This indicates that features in the A and E sets are correlated and overlapped, as is evident in the histogram plots shown below.

Figure 2 shows the histogram of Haar wavelet features for the epilepsy E set, and Figure 3 displays the histogram of Haar wavelet features for the normal A set. Figure 2 demonstrates the nonlinear nature of the wavelet features of the E set with fewer outliers, while Figure 3 shows the presence of outliers in the wavelet features of the normal A set. Finally, these extracted features are fed as input to feature selection using the Particle Swarm Optimization (PSO) algorithm.

3. PSO as a Feature Selection Algorithm

PSO is a well-known method developed by Kennedy and Eberhart in 1995 [24]. The search space is traversed by a collection of particles. Each swarm member carries position and velocity parameters, and each particle's position represents a possible solution to the optimization problem.

3.1. Algorithm for PSO

Consider an N-dimensional search space with particles, each of which represents a candidate solution. Particles are propelled through this hyperspace, where their positions are influenced by their own and their neighbors' experiences. In the basic formulation of PSO, each particle is represented as a potential solution to the given problem in the N-dimensional space [25]. In this space, a particle is represented as follows:

Furthermore, each particle remembers its previous optimum location. The particle's best previous location may be expressed as

The particle's velocity is expressed as follows:

The global best is assigned the greatest fitness value; it is the best particle chosen from all the particles in the population. It is mathematically expressed as follows:

The cognitive component represents the velocity adjustment driven by the particle's own best previous position, whereas the social component represents the velocity adjustment driven by the global best position; the update is expressed as follows [26], where ω denotes the inertia weight and η1 and η2 represent the positive acceleration constants. The velocity vector drives the optimization process, which in turn depicts the socially exchanged information. Figure 4 shows the performance of the MSE over the number of iterations for PSO feature selection at different weights. It is observed from Figure 4 that the optimum weight is ω = 0.5, which yields lower MSE values than the other weight values. In this work, the inertia weight is therefore set to ω = 0.5, while η1 and η2 are both set to 1. The output of PSO feature selection reduces the [256 × 100] wavelet feature input to [256 × 10].
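For reference, the canonical PSO relations consistent with the symbols above (reconstructed here, since the typeset equations do not survive in this text) can be written as follows, where x_i, p_i, and v_i are the position, best previous position, and velocity of particle i, g = (g_1, …, g_N) is the global best position, and r_1, r_2 are uniform random numbers in [0, 1]:

```latex
x_i = (x_{i1}, \ldots, x_{iN}), \qquad
p_i = (p_{i1}, \ldots, p_{iN}), \qquad
v_i = (v_{i1}, \ldots, v_{iN})

v_{id}(t+1) = \omega\, v_{id}(t)
  + \eta_1 r_1 \big(p_{id} - x_{id}(t)\big)
  + \eta_2 r_2 \big(g_{d} - x_{id}(t)\big)

x_{id}(t+1) = x_{id}(t) + v_{id}(t+1)
```

The first term is the inertia component, the second the cognitive component, and the third the social component, with ω = 0.5 and η1 = η2 = 1 in this work.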

Let us now analyze the presence of nonlinearity in the PSO features. Here, the statistical parameters of mean, variance, skewness, kurtosis, Pearson correlation coefficient, and Canonical Correlation Analysis (CCA) are best suited. These parameters are computed on the wavelet features with the PSO feature selection method and are given in Table 4. From Table 4, the statistical parameters indicate the presence of nonlinearity in the PSO features of both classes. The PCC demonstrates the uncorrelated condition among the intraclass PSO features, and CCA likewise shows no correlation between the interclass A and E sets.

Figure 5 shows the normal probability plot of the dB4 wavelet coefficients with PSO feature selection for the epilepsy E set, and Figure 6 displays the corresponding plot for the normal A set. It is observed from Figures 5 and 6 that the PSO-selected dB4 wavelet features exhibit the uncorrelated, overlapped, and nonlinear nature of the A and E sets.

The extracted features, both without and with PSO feature selection, are then fed as input to the various classifiers: linear regression (LR), nonlinear regression (NLR), Gaussian mixture model (GMM), K-Nearest Neighbor (K-NN), SVM (linear), SVM (polynomial), and SVM (RBF). These are discussed in the following sections.

4. Mathematical Model-Based Classifiers for Epilepsy Detection

In this section, model-based classifiers are used to classify the features that were extracted and selected with the help of wavelet (Haar, db4, and sym8) techniques and PSO methodology.

4.1. Linear Regression

Linear regression is a supervised learning technique in which one or more independent variables are linearly related to the dependent variable [27]. Simple linear regression employs only one independent variable, whereas multiple linear regression uses several [28]. In conventional linear regression, a residue value is computed with respect to the target value, and the linear regression model equation is then applied to it. The performance of the classifier is evaluated based on the variation from its target value. The mathematical expression for simple linear regression is as follows, where y represents the dependent variable, x represents the independent variable, m indicates the slope, and b represents the intercept. The slope and intercept are in turn expressed mathematically as follows:
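Written out with these symbols, the simple linear regression model and the ordinary least-squares estimates of its slope and intercept (supplied here since the typeset equations do not survive in this text) are:

```latex
y = m\,x + b

m = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
         {n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2},
\qquad
b = \bar{y} - m\,\bar{x}
```

where n is the number of samples and x̄, ȳ are the sample means.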

4.2. Nonlinear Regression

Nonlinear regression (NLR) is a regression analysis method in which empirical data are modeled by a function that depends on one or more independent variables and is a nonlinear combination of model parameters. The data are fitted by a method of successive approximations. The statistical model for nonlinear regression is expressed as follows [29], where x represents the vector of independent variables, y indicates the dependent variable, and f represents the nonlinear expectation function, which is in turn expressed mathematically as follows:
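In symbols, the standard nonlinear regression model and its expectation function (reconstructed here in the usual notation, since the typeset equations do not survive in this text) read:

```latex
y_i = f(x_i, \beta) + \varepsilon_i, \qquad i = 1, \ldots, n

\mathbb{E}[\,y \mid x\,] = f(x, \beta)
```

where β is the vector of model parameters estimated by successive approximations and ε_i is the error term.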

On the basis of the target set, a residue value is computed. The performance is then evaluated by applying the residue value and the EEG signal samples to the nonlinear equation.

4.3. Gaussian Mixture Model (GMM)

The Gaussian mixture model (GMM) defines a parametric probability density function as a weighted sum of Gaussian component densities. GMMs with numerous components permit arbitrary density modeling. The probability density of a random vector is expressed as follows [30], where M represents the number of Gaussian mixture components and w_i indicates the weight of the ith mixture component. The mixture parameters are typically computed by maximizing the log-likelihood function, whose mathematical expression is as follows:
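With these symbols, the standard GMM density and log-likelihood (reconstructed here in the usual notation, since the typeset equations do not survive in this text) are:

```latex
p(\mathbf{x} \mid \lambda) = \sum_{i=1}^{M} w_i \, g(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i),
\qquad \sum_{i=1}^{M} w_i = 1

\log \mathcal{L}(\lambda) = \sum_{t=1}^{T} \log p(\mathbf{x}_t \mid \lambda)
```

where g(·|μ_i, Σ_i) is the multivariate Gaussian density with mean μ_i and covariance Σ_i, λ denotes the full parameter set, and T is the number of training vectors.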

The expectation-maximization (EM) method is a frequently employed strategy for maximum likelihood outcomes.

4.4. K-Nearest Neighbor (K-NN)

The K-Nearest Neighbor (K-NN) method is based on the supervised learning approach and is one of the most basic machine learning algorithms. The K-NN approach may be used for both regression and classification, although it is more commonly employed for classification tasks [31]. The steps of the K-NN algorithm are as follows:
Step 1: choose the number of neighbors K.
Step 2: determine the Euclidean distance to the training samples.
Step 3: using the estimated Euclidean distances, find the K closest neighbors.
Step 4: count how many data points of each category are among these K neighbors.
Step 5: assign the new data point to the class with the highest number of neighbors.
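The steps above can be sketched directly in a few lines. This is an illustrative toy implementation on simulated two-dimensional points, not the study's actual feature vectors; the function name and data are hypothetical.

```python
import numpy as np

def knn_classify(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest
    training points (Euclidean distance), mirroring Steps 1-5."""
    # Step 2: Euclidean distance from the query to every training sample
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Step 3: indices of the k closest neighbors
    nearest = np.argsort(dists)[:k]
    # Steps 4-5: count labels among the neighbors, return the majority class
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy example: two clusters standing in for normal (0) / seizure (1) features
X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_classify(X, y, np.array([0.95, 1.0]), k=3))  # 1
```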

4.5. Support Vector Machine (SVM)

SVM is widely used for pattern classification. The SVM algorithm maps nonlinearly separable samples into a higher-dimensional space via kernel functions and then locates the optimal separating hyperplane by solving a quadratic optimization problem [32]. Common SVM kernel functions are the linear kernel, the polynomial kernel, the radial basis function (RBF), and the sigmoidal neural network kernel. SVM-linear, SVM-RBF, and SVM-polynomial are used in this work. In the kernel expressions, σ represents the bandwidth of the RBF kernel, and a positive parameter standardizes the radius.
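The three kernels used in the study can be written out explicitly as below. This is a sketch of the standard kernel definitions only; the bandwidth sigma, polynomial degree d, and offset c are illustrative choices, and in practice these kernels would be handed to an SVM solver (e.g., scikit-learn's `SVC` with `kernel="linear"`, `"poly"`, or `"rbf"`) rather than evaluated by hand.

```python
import numpy as np

def linear_kernel(x, y):
    """K(x, y) = <x, y>"""
    return np.dot(x, y)

def poly_kernel(x, y, d=3, c=1.0):
    """K(x, y) = (<x, y> + c)^d"""
    return (np.dot(x, y) + c) ** d

def rbf_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2))"""
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2.0 * sigma ** 2))

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
print(linear_kernel(x, y))  # 0.0
print(rbf_kernel(x, x))     # 1.0 (identical points map to maximum similarity)
```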

5. Results and Discussion

This paper considers regular 10-fold training and testing, with 90% and 10% of the input features used for training and testing, respectively [33]. Table 7 highlights the average MSE results for Haar, dB4, and Sym8 wavelet features in various classifiers without PSO feature selection, and Table 8 illustrates the average MSE for the same features with PSO feature selection. Table 6 depicts the confusion matrix for seizure detection. Table 9 displays the average performance of the classifiers for Haar, dB4, and Sym8 wavelet features without PSO feature selection, and Table 10 exhibits the corresponding average performance with PSO feature selection. Based on the confusion matrix, the following performance parameters are calculated and employed to examine classifier performance: sensitivity, specificity, accuracy, F1 score, error rate, G-mean, and MSE.

In Table 6, True Positive is represented as TP, True Negative as TN, False Positive as FP, and False Negative as FN [34]. A TP is a positive sample correctly predicted as positive, and a TN is a negative sample correctly predicted as negative. An FP occurs when a result is incorrectly predicted as positive when it is really negative, and an FN occurs when a result is incorrectly predicted as negative when it is really positive [35].

The Sensitivity is computed as follows:

The specificity is expressed as follows:

The overall accuracy of the classifier is computed as follows:

F1 Score is expressed as follows:

Geometric Mean (G-mean) is computed as follows:

Mean Square Error (MSE) is computed as follows [36], where O_i indicates the observed value at a particular time, T_i represents the corresponding target value, and n represents the number of observations per patient, which in our case is 25600.
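The standard definitions of these measures, reconstructed here in the usual notation since the typeset equations do not survive in this text, are:

```latex
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{F1 Score} = \frac{2\,TP}{2\,TP + FP + FN}

\text{Error Rate} = 1 - \text{Accuracy}, \qquad
\text{G-mean} = \sqrt{\text{Sensitivity} \times \text{Specificity}}

\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(O_i - T_i\right)^2
```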

Table 7 shows the average MSE for Haar, dB4, and Sym8 wavelet features in various classifiers without feature selection. From Table 7, it is observed that for the Haar wavelet, the SVM with RBF kernel classifier attains a low MSE value of 2.14E−05. In the case of the dB4 wavelet, the GMM model achieves a low MSE value of 2.98E−05. For the Sym8 wavelet, the SVM (RBF) classifier again reaches the top with a low MSE value of 8.89E−06. A low MSE value always indicates higher classification accuracy.

Table 8 depicts the average MSE for Haar, dB4, and Sym8 wavelet features in various classifiers with the PSO feature selection method. It shows that for the Haar wavelet, the SVM with RBF kernel classifier attains a low MSE value of 3.48E−06. In the case of the dB4 wavelet, the KNN model achieves a low MSE value of 7.44E−06. For the Sym8 wavelet, the SVM (RBF) classifier again reaches the top with a low MSE value of 1.96E−06. A low MSE value always indicates better classification benchmark parameters.

Table 9 portrays the average performance measures, namely sensitivity, specificity, accuracy, F1 score, error rate, and G-mean, for Haar, dB4, and Sym8 wavelet features in various classifiers without the feature selection method. Table 9 illustrates that for the Haar wavelet, the SVM with RBF kernel classifier attains a high accuracy of 77% with an error rate of 23%. In the case of the dB4 wavelet, the GMM model achieves a high accuracy of 73.5% with an error rate of 26.5%. For the Sym8 wavelet, the SVM (RBF) classifier again reaches the top with a high accuracy of 92.5% and an error rate of 7.5%. The SVM (RBF) classifier's high accuracy demonstrates its ability to distinguish the correct classes among the various features.

Table 10 unveils the average performance measures, namely sensitivity, specificity, accuracy, F1 score, error rate, and G-mean, for Haar, dB4, and Sym8 wavelet features in various classifiers with the PSO feature selection method. Table 10 shows that for the Haar wavelet, the SVM with RBF kernel classifier attains a high accuracy of 87% with an error rate of 13%. In the case of the dB4 wavelet, the K-NN model achieves a high accuracy of 84.5% with an error rate of 15.5%. For the Sym8 wavelet, the SVM (RBF) classifier again reaches the top with a high accuracy of 90% and an error rate of 10%. Overall, the best classification parameters, namely 98% accuracy, a 98% F1 score, and a 2% error rate, are achieved by the SVM (RBF) classifier on Sym8 wavelet features with PSO feature selection. Table 11 outlines previous identification efforts for EEG signals, whose accuracies ranged from 73.5% to 97.3%.

The suggested approaches, using wavelet (Haar, dB4, Sym8) and PSO features with linear regression, nonlinear regression, GMM, K-NN, SVM-linear, SVM-polynomial, and SVM-RBF classifiers, outperformed other existing approaches in epileptic seizure classification. The SVM classifier with RBF kernel on Sym8 wavelet features with the PSO feature selection method attains the highest accuracy rate of 98% with an error rate of 2%, outperforming all other classifiers.

6. Conclusion

Epilepsy, or "seizure disorder," is a chronic condition and the fourth most common neurological disorder, affecting people of all ages. Early diagnosis can help the patient's rehabilitation. This paper proposed four levels of decomposition using the Haar, dB4, and Sym8 wavelet transforms for feature extraction from the Bonn A and E EEG signals. The PSO technique was used to reduce the dimensionality of the decomposed signals. Seven classifiers were then used to classify the signals as seizure or nonseizure. The SVM classifier with RBF kernel on Sym8 wavelet features with the PSO feature selection approach achieves the highest accuracy rate of 98% with a 2% error rate, outperforming all other classifiers. Further research is proposed in the direction of deep neural networks and other mathematical model-based classifiers, such as the Naive Bayes classifier (NBC) and Random Forest.

Data Availability

The data used to carry out this study can be obtained from the corresponding author upon request. The EEG dataset can be obtained from the Bonn University EEG database.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Authors’ Contributions

V.S.H. proposed the methodology. V.S.H. was responsible for resources and provided software; R.V. supervised the study; V.S.H. and R.V. validated the study; V.S.H. reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Acknowledgments

The authors are grateful to the R&D Division of Kongu Engineering College for providing technical support to carry out this research.