Automatic seizure detection based on Teager Energy Cepstrum and pattern recognition neural networks

About 1–3% of the world population suffers from epilepsy. A long-term inpatient/ambulatory electroencephalogram (EEG), lasting from a few hours to several days, which definitely contains hallmarks of epilepsy, is required clinically to diagnose, monitor and localize the epileptogenic zone. The traditional method relies on well-trained neurophysiologists who visually inspect the entire lengthy EEG recording, which is tedious and time-consuming. Therefore, many automated epileptic seizure detection systems have been developed. Such systems significantly reduce the time taken to review long-term EEG recordings off-line and enable the neurologist to diagnose and treat more patients in a given time. Few studies have explored in sufficient depth the use in seizure detection of features drawn from other domains of signal processing, for example, the Teager energy cepstrum (TECEP). Epileptic seizures are abnormal sudden discharges in the brain whose signatures manifest in EEG recordings as frequency changes and increased amplitudes. In this work, these changes are captured through static and dynamic features derived from TECEP. We compared the performance of the baseline TECEP and its two composite vectors with those of the traditional cepstrum (CEP) and its corresponding two composite vectors in EEG epileptic seizure detection. The first composite vector includes velocity cepstral coefficients and the second includes velocity as well as acceleration cepstral coefficients. The comparison is carried out on eight different classification problems, which encompass all the possible discriminations in the medical field related to epilepsy, using a pattern recognition neural network (PRNN). It is found that the overall performance of both Teager energy composite vectors exceeds that of the traditional cepstral composite vectors.
The static and dynamic features derived from TECEP outperform those derived from CEP, suggesting their suitability in epilepsy seizure detection.


BACKGROUND
Epilepsy is a chronic neurological disorder in which patients suffer from recurrent unprovoked epileptic seizures, which are episodic, rapidly evolving temporary events. About 1-3% of the world population suffers from epilepsy. 1 The unforeseen nature of these seizures makes the daily life of patients miserable, with temporary impairments of perception, speech, memory, motor control and/or consciousness, and may sometimes lead to an enhanced risk of injury and/or death. Epilepsy can be controlled, but not cured, with anti-epileptic medication. The epileptic brain can be considered to function in one of two states: an interictal state with occasional transient waveforms, such as isolated spikes, sharp waves, or spike-wave complexes, and an ictal (seizure) state with continuous discharge of polymorphic waveforms of varying amplitude and frequency, spike and sharp wave complexes, rhythmic hypersynchrony, or electrocerebral inactivity, observed over a longer than average duration of these abnormalities during interictal intervals. 2 A long-term inpatient/ambulatory electroencephalogram (EEG), lasting from a few hours to several days, which definitely contains interictal and ictal hallmarks of epilepsy, is required clinically to diagnose, monitor and localize the epileptogenic zone. 3 The EEG during a seizure is significantly different from that of the interictal state and that of a normal subject. The traditional methods rely on well-trained neurophysiologists who visually inspect the entire lengthy EEG recording, which is tedious, time-consuming and costly. Therefore, many automated epileptic seizure detection systems have been developed using different approaches in recent years. 4
Such automated systems significantly reduce the time taken to review long-term EEG recordings off-line and enable the neurologist to diagnose and treat more patients in a given time. This implies that the selected feature set must deliver not only accurate seizure detection but also a very short processing time. However, the wide variety of EEG patterns that characterize seizures, such as spikes and waves, low-amplitude desynchronization, polyspike activity, and rhythmic waves over a wide range of frequencies and amplitudes, tends to increase the complexity of the automated seizure detection problem.
The issue of selecting an optimal set of relevant features plays an important role in developing a good classification system, particularly when using the pattern recognition paradigm. A general rule of thumb is to use those features that capture aspects of the time series that are relevant for discriminating between the classes. To achieve a higher accuracy, it is not sufficient merely to have the best pattern classification system. It is found that the performance of most classifiers deteriorates when some of the selected features are redundant. Thus, it is important that the selected features be screened for redundancy and irrelevancy. Also, the number of extracted features must be small; otherwise it adds computational overhead and lengthens the processing time. Therefore, pattern classification often turns out to be a problem of classification with the smallest number of extracted features. Different methods have been used to extract diverse features, including those that capture the frequency, energy and structural content of the signal, for the task of epileptic seizure detection. 5-10 In a recent study, we found that the overall performance of both composite vectors of the traditional cepstrum (CEP) deteriorates compared to that of the baseline vector in seizure detection and classification of EEG segments. 11 However, few studies have explored the use in seizure detection of features from other domains of signal processing, for example, the Teager energy operator based cepstrum (TEOCEP). TEOCEP has been used for speech recognition and analysis. 12-15
However, to the best of our knowledge, this is the first study in which the Teager energy cepstrum (TECEP), a modified version of TEOCEP, is applied and investigated for unbalanced general EEG data classification. TECEP is used to take advantage of the modulation energy tracking capability of the Teager energy operator (TEO). Also, no other work addresses the eight classification problems discussed below, which encompass all possible discriminations in the medical field related to epilepsy. The EEG signal is nonstationary in nature; it contains high frequency information with short interval segments and low frequency information with long period segments. Computation of the conventional CEP demands that the EEG segment be stationary. Hence, the EEG segment length must be short enough to meet the stationarity requirement, while long enough to capture specific patterns. TECEP computation, however, does not impose such requirements. To demonstrate the efficacy of our approach, we compare the performance of TECEP and its two composite vectors with those of CEP and the corresponding composite vectors in discriminating the classes of the eight classification problems on the general EEG database by Andrzejak. 16 We also compare the performance of our approach with those of other researchers who have used the same database.
There are two variants in the approach adopted in automated detection of seizures. The first is based on a set of heuristic rules and thresholds. The second is based on a classifier, which employs pattern recognition techniques. In the former approach the results depend upon a single operating point and hence, there is little control over the accuracy. On the other hand, the latter permits the classifier to adapt to the desired performance and meet the requirements. Hence, we undertake the latter approach. There is, as such, no well-established method for selecting an optimal network for classification. The rationale behind choosing PRNN is as follows: (1) it is generally agreed that a well-defined and sufficiently constrained recognition problem (small intra-class variations and large inter-class variations) will lead to a compact pattern representation and a simple decision making strategy; (2) PRNN is a more suitable classifier for the type of cepstral data we use, owing to its high recognition rate; (3) PRNN is also suitable from the point of view of its short training time, high speed, low resource consumption, high accuracy, and suitability for real-time implementation. Pattern recognition using artificial neural networks has been used in a variety of applications, such as electrocardiogram beat classification, pattern recognition of EEG signals, optical character recognition, iris recognition, image recognition, and protein secondary structure prediction, as well as new and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, cursive handwriting recognition, biometrics, interactive voice response, stock price pattern recognition, and electric power system short term load forecasting, to name a few. 17-20

METHODS

EEG records
The EEG data used for this work is from the University of Bonn EEG database, which is available in the public domain. 16 This database was chosen because many seizure detection methods have employed it, which makes it relatively easy to compare end results. The database consists of five sets (designated Z, O, N, F and S), each containing 100 single channel EEG segments of 23.6 second duration. These segments were picked from continuous multi-channel EEG recordings after removal of any artifacts, such as muscle activity or eye movements, making sure that they fulfilled stationarity requirements. Sets Z and O contain segments taken from surface EEG recordings acquired from five healthy volunteers, using a standard 10-20 electrode placement scheme. The subjects were awake and relaxed, with their eyes open for set Z and eyes closed for set O. The segments for sets N, F, and S were acquired from five epileptic patients undergoing presurgical diagnosis. The type of epilepsy identified was temporal lobe epilepsy, with the hippocampal formation as the epileptogenic focus. These recordings were taken from intracranial electrodes, as they offer the most precise access to the emergence of seizures. Sets N and F contain activity measured during seizure-free intervals (interictal epileptiform activity), with segments in set N recorded from the hippocampal formation of the opposite hemisphere of the brain and those in set F recorded within the epileptogenic zone. On the other hand, set S contains seizure activity (ictal intervals), with all segments recorded from sites exhibiting ictal activity. The patients had attained complete seizure control after resection of one of the hippocampal formations, which was confirmed to be the epileptogenic zone. All the EEG signals were recorded using the same 128-channel amplifier system with an average common reference. The data was digitized at 173.61 samples per second with 12 bit resolution. The bandpass filter setting was 0.53-40 Hz (12 dB/octave). Each single channel EEG segment has 4096 samples.
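The recording parameters quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch; the constant and function names are hypothetical and introduced here for the example.

```python
# Recording parameters quoted in the text (names are illustrative).
FS_HZ = 173.61           # sampling rate in samples per second
SEGMENT_SAMPLES = 4096   # samples per single-channel EEG segment
SETS = ["Z", "O", "N", "F", "S"]
SEGMENTS_PER_SET = 100


def samples_to_seconds(n_samples, fs_hz=FS_HZ):
    """Return the duration in seconds of n_samples sampled at fs_hz."""
    return n_samples / fs_hz


segment_seconds = samples_to_seconds(SEGMENT_SAMPLES)
nyquist_hz = FS_HZ / 2.0                      # highest representable frequency
total_segments = len(SETS) * SEGMENTS_PER_SET

print(f"segment duration: {segment_seconds:.2f} s")
print(f"Nyquist frequency: {nyquist_hz:.2f} Hz")
print(f"segments in database: {total_segments}")
```

The 40 Hz upper edge of the bandpass filter lies well below the Nyquist frequency implied by the sampling rate, and 4096 samples at 173.61 samples per second agree with the stated 23.6 second segment duration.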

Teager energy operator (TEO) and Teager energy
TEO is a non-linear energy tracking operator widely used in speech applications. 21-23 An important property of TEO is that its time resolution is fine enough to track rapid changes in signal energy (the squared product of amplitude and frequency). This is because the TEO computation is almost instantaneous, using only three or four samples; hence, it is well suited to real-time applications. Thus, besides energy, the operator can also track the amplitude envelope and the instantaneous frequency. Although the energy of any two tones of equal amplitude but different frequencies is the same, the energy required to generate the two tones is different. The specialty of TEO is that it measures the energy of the system that generated the signal, based on mechanical and physical considerations, rather than the energy of the signal itself. 24-25 Thus, the advantage of using TEO is that it models the energy of the non-linear system that generated the signal. Hence, disturbances in the signal generation and conduction path are reflected in the Teager energy function.

If a signal sample is represented as

$$x[n] = A\cos(\Omega n + \Phi),$$

where $A$ is the amplitude, $\Phi$ is the initial phase, and $\Omega$ is the digital frequency in radians/sample, given by $\Omega = 2\pi f/f_s$, where $f$ is the analog frequency in Hz and $f_s$ is the sampling frequency in Hz, then as per the TE algorithm, the instantaneous TE, $E_n$, at a given instant of time $n$ is given by 24-25

$$E_n = x^2[n] - x[n+1]\,x[n-1] = A^2\sin^2(\Omega) \approx A^2\Omega^2 \tag{1}$$

for small $\Omega$. With $\Omega < \pi/4$, or $f/f_s < 1/8$, the relative error in the last approximation is always less than 11%. From equation (1) it is clear that the instantaneous TE can track modulation energy and identify the instantaneous signal amplitude and the corresponding instantaneous frequency. For example, in a normal subject there is a fine balance in the brain between factors that generate electrical activity and factors that restrict it, and there are also systems that limit the spread of the electrical activity. Usually, during a seizure, these limits break down and abnormal hypersynchronous neuronal activity, involving a large number of neurons in the cerebral cortex of the brain, occurs. The cerebral activity during an epileptic seizure is completely different from that of the interictal state or that of a normal subject.
During the interictal state the EEG is normal with occasional transient waveforms that are apparently random with a higher complexity, while during a seizure the EEG tends to become hypersynchronized and cyclical, with decreased complexity. 26-27 Unlike the usual instantaneous signal energy, which is only proportional to the squared instantaneous amplitude, TE is proportional to the squared product of both instantaneous amplitude and instantaneous frequency. This new energy measure is therefore capable of responding rapidly to changes in both amplitude and frequency. Consequently, disturbances in the EEG signal generation and conduction path are reflected in the TEO energy. 28-29

The general form of the Teager nonlinear energy operator in the time domain for a discrete time signal $x[n]$, as given by Plotkin et al. and Agarwal et al., 30-33 is

$$\Psi_{td}\{x[n]\} = x[n-l]\,x[n-m] - x[n-p]\,x[n-q], \tag{2}$$

where $l + m = p + q$ and $\Psi_{td}$ denotes the generalized TEO. They showed that for $l \neq m$ and $p \neq q$, $\Psi_{td}$ is very robust to noise. That is, if the input signal contains additive white noise, then the output of equation (2) will not contain a component related to the input noise. This is attributed to the removal of the square term in equation (2) while satisfying the conditions $l + m = p + q$, $l \neq m$, and $p \neq q$. In this work, we empirically found that the combination $l = 1$, $m = 2$, $p = 0$ and $q = 3$ is a suitable choice for noise reduction in EEG signals.
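The two operators discussed in this section can be sketched in a few lines. The code below is a minimal illustration using only the Python standard library, not the author's implementation. It assumes the classical three-sample form x[n]^2 - x[n+1]x[n-1] and the generalized form x[n-l]x[n-m] - x[n-p]x[n-q] under the constraint l + m = p + q; the function names are hypothetical.

```python
import math


def teager_energy(x):
    """Classical three-sample TEO: E[n] = x[n]^2 - x[n+1] * x[n-1].

    The output is two samples shorter than the input because the first
    and last samples have no neighbour on one side.
    """
    return [x[n] * x[n] - x[n + 1] * x[n - 1] for n in range(1, len(x) - 1)]


def generalized_teo(x, l=1, m=2, p=0, q=3):
    """Generalized TEO: x[n-l] * x[n-m] - x[n-p] * x[n-q].

    The defaults are the lags reported in the text (l=1, m=2, p=0, q=3).
    The noise-robust variant requires l + m == p + q, l != m and p != q.
    """
    if l + m != p + q or l == m or p == q:
        raise ValueError("need l + m == p + q with l != m and p != q")
    start = max(l, m, p, q)  # first index with all lagged samples available
    return [x[n - l] * x[n - m] - x[n - p] * x[n - q]
            for n in range(start, len(x))]


# Pure tone with arbitrary, illustrative parameters.
A, omega, phi = 2.0, 0.3, 0.7
tone = [A * math.cos(omega * n + phi) for n in range(200)]

te = teager_energy(tone)
gte = generalized_teo(tone)

# For a pure tone the classical TEO is constant and equals A^2 * sin^2(omega).
print(max(abs(v - (A * math.sin(omega)) ** 2) for v in te))
```

For a pure tone the classical operator returns the constant A^2 sin^2(Ω), which depends on both amplitude and frequency, in line with the tracking property described above. The generalized operator with the default lags likewise returns a constant, A^2 sin(2Ω) sin(Ω), for the same tone.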

Cepstrum (CEP) derived from log magnitude spectrum
Cepstrum (CEP) analysis is a nonlinear signal processing technique with a variety of applications in areas such as speech and image processing. Among speech recognition approaches, the CEP-based family has been prominent owing to its performance and simplicity. In CEP models, a time-evolving signal is represented by an ordered set of coefficients that describes the spectral envelope of the signal. This envelope is in fact a curve passing close to the peaks in the original spectrum. The CEP, though a compact representation, has been found to capture most of the relevant information in the original time series. It is possible to compare two relatively long time series with only a few cepstral coefficients. This implies that if two cepstral series are close, then the corresponding signals have a similar evolution in time.
The real CEP is defined as the inverse Fourier transform of the log magnitude spectrum, as given by

$$C_r[k] = \mathcal{F}^{-1}\left\{\log\left|\mathcal{F}\{x[n]\}\right|\right\}, \tag{3}$$

where $\mathcal{F}$ denotes the Fourier transform of the signal segment $x[n]$ and $C_r[k]$ represents the $k$th order real cepstral coefficient. If the inverse Fourier transform is replaced by the discrete cosine transform (DCT), the resulting equation becomes

$$C[k] = \mathrm{DCT}\left\{\log\left|\mathcal{F}\{x[n]\}\right|\right\}, \tag{4}$$

where $C[k]$ represents the $k$th order pseudo cepstral coefficient. The advantages are that: (1) the DCT has better energy compaction properties than the DFT and hence decreases memory requirements; (2) it reduces the computational complexity drastically without degrading the information content in the CEP and hence decreases execution time; and (3) the DCT produces highly uncorrelated features. The resulting sequence of coefficients $C[k]$, called the pseudo CEP, is an approximation to the CEP, and in reality simply represents an orthogonal and compact representation of the log magnitude spectrum. The difference between the cepstral coefficients of different time series can serve as a similarity measure among these time series. The cepstral coefficients decay rapidly to zero and hence, only the first few coefficients are needed to capture most of the dynamic information in the time series. This property of cepstral coefficients helps in reducing the dimensionality. Also, the number of coefficients to be retained does not depend upon the length of the time series. Moreover, the higher order coefficients represent the excitation process, which is less useful. The coefficient $C[0]$ is similar to the log energy (or DC component) of the signal and represents the segment energy. It is usually not treated as a cepstral coefficient and in this study, we drop $C[0]$.
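The two definitions above can be sketched as follows. This is a hedged illustration in plain Python (a direct O(N^2) DFT is used so that the example has no dependencies), not the author's code. Several details are assumptions made here because the text does not specify them: the unnormalized DCT-II, the small constant `eps` that guards the logarithm, the use of the full two-sided spectrum, and all function names.

```python
import cmath
import math


def log_magnitude_spectrum(x, eps=1e-12):
    """Log magnitude of the DFT of x; eps (an assumption) avoids log(0)."""
    n_pts = len(x)
    out = []
    for k in range(n_pts):
        s = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts)
                for n in range(n_pts))
        out.append(math.log(abs(s) + eps))
    return out


def real_cepstrum(x):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    log_mag = log_magnitude_spectrum(x)
    n_pts = len(log_mag)
    return [sum(log_mag[m] * cmath.exp(2j * math.pi * k * m / n_pts)
                for m in range(n_pts)).real / n_pts
            for k in range(n_pts)]


def dct_ii(v):
    """Unnormalized DCT-II of the sequence v."""
    n_pts = len(v)
    return [sum(v[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_pts))
                for n in range(n_pts))
            for k in range(n_pts)]


def pseudo_cepstrum(x, n_coeffs=11, drop_c0=True):
    """Pseudo cepstrum: DCT of the log magnitude spectrum.

    Keeps n_coeffs coefficients; C[0] is dropped by default, as the
    text describes for the CEP features.
    """
    coeffs = dct_ii(log_magnitude_spectrum(x))
    first = 1 if drop_c0 else 0
    return coeffs[first:first + n_coeffs]


# Short synthetic segment (illustrative only).
signal = [math.sin(0.2 * n) + 0.5 * math.cos(0.9 * n) for n in range(64)]
rc = real_cepstrum(signal)
pc = pseudo_cepstrum(signal)
print(len(rc), len(pc))
```

Because the log magnitude spectrum of a real signal is symmetric, the real cepstrum computed this way is itself symmetric, and the zeroth DCT coefficient equals the sum of the log spectrum, which is why it behaves like a log energy term.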

Teager energy cepstrum
The Teager energy operator based cepstrum (TEOCEP) was introduced by Jabloun et al. 12 for speech recognition. To compute TEOCEP, the signal is divided into a few subbands using a multi-rate filterbank. The formal definition involves computing the average TE for each of the subbands, followed by log compression and the application of the inverse DCT. However, in this work we adopt a different procedure, which does not use a filterbank. For a given EEG segment we compute the instantaneous TE based on eqn. (2). Treating this as a signal, we apply cepstral analysis using the DCT as given by eqn. (4). We call the resulting cepstrum, C_T[k], the Teager energy cepstrum (TECEP) of the EEG segment under consideration, to differentiate it from TEOCEP. The coefficient C_T[0] is similar to the log energy (or DC component) of the TE signal. Unlike in the case of CEP, in this study we retain C_T[0].
EEG signals tend to be arbitrary in nature and, with some epileptic conditions, the frequency of the signal can change drastically with time depending upon the severity of the condition. In particular, during a seizure the frequency components of the EEG signal become extremely erratic and unpredictable. To reduce edge effects, a Hanning window was applied to such signals before the spectrum was computed.
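The TECEP procedure described in this section (instantaneous TE, Hanning window, log magnitude spectrum, DCT, retention of the first coefficients including C_T[0]) can be sketched end to end as below. This is an illustrative reading of the text, not the author's implementation. The generalized operator is taken as x[n-1]x[n-2] - x[n]x[n-3], the DCT is an unnormalized DCT-II, `eps` guards the logarithm, and the test signal is synthetic; all of these, and the function names, are assumptions made for the example.

```python
import cmath
import math
import random


def nleo(x):
    """Generalized TEO with lags l=1, m=2, p=0, q=3 (assumed form)."""
    return [x[n - 1] * x[n - 2] - x[n] * x[n - 3] for n in range(3, len(x))]


def hann(n_pts):
    """Symmetric Hann (Hanning) window of length n_pts."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_pts - 1))
            for n in range(n_pts)]


def tecep(segment, n_coeffs=11, eps=1e-12):
    """Teager energy cepstrum sketch: the first n_coeffs DCT coefficients
    of the log magnitude spectrum of the windowed Teager energy.
    C_T[0] is retained, as stated in the text."""
    te = nleo(segment)
    windowed = [v * w for v, w in zip(te, hann(len(te)))]
    n_pts = len(windowed)
    log_mag = []
    for k in range(n_pts):
        s = sum(windowed[n] * cmath.exp(-2j * math.pi * k * n / n_pts)
                for n in range(n_pts))
        log_mag.append(math.log(abs(s) + eps))
    return [sum(log_mag[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_pts))
                for n in range(n_pts))
            for k in range(n_coeffs)]


# Synthetic 500-sample window (matching the reported window length W).
rng = random.Random(0)
segment = [math.sin(0.2 * n) + 0.3 * rng.gauss(0.0, 1.0) for n in range(500)]
coeffs = tecep(segment)
print(len(coeffs))
```

One consequence of this construction is that scaling the input amplitude changes only C_T[0]: the Teager energy scales with the square of the gain, the log spectrum shifts by a constant, and the DCT of a constant is zero for every k ≥ 1. This is consistent with the description of C_T[0] as a log energy term.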

Pattern recognition neural network (PRNN)
Pattern recognition addresses the problem of classifying objects, often represented as vectors or strings of symbols, into categories. The four best known approaches for pattern recognition are: 1) template matching, 2) statistical classification, 3) syntactic or structural matching, and 4) neural networks. 34 Recent research that uses neural networks for classification has established that they can be a promising alternative to conventional methods of classification. The main advantage of neural networks is that they use self-adaptive techniques to adjust to the data without any explicit specification. Neural network models attempt to simulate the information processing that occurs in the brain and are widely used in a variety of applications, including automated pattern recognition. Pattern recognition is generally categorized according to the type of learning procedure used to generate the output value. (1) Supervised learning assumes that a set of training data (the training set) has been provided, consisting of a set of instances that have been properly labeled by hand with the correct output. A learning procedure then generates a model that attempts to meet two, sometimes conflicting, objectives: perform as well as possible on the training data, and generalize as well as possible to new data. (2) Unsupervised learning, on the other hand, assumes that the training data has not been hand-labeled, and attempts to find inherent patterns in the data that can then be used to determine the correct output value for new data instances. Neural networks have shown promising results in pattern recognition and also in training processes. Pattern recognition can be implemented by using a feed-forward neural network that has been trained accordingly. The network is trained to associate outputs with input patterns. When the network is used, it identifies the input pattern and tries to output the associated output pattern. The power of neural networks comes into play when a pattern that has no output associated with it is given as an input. In this case, the network gives the output that corresponds to the taught input pattern that is least different from the given pattern.
We use the MATLAB pattern recognition toolbox in this study. 35 Pattern recognition networks are feedforward networks that can be trained to classify inputs according to target classes. The target data for pattern recognition networks should consist of vectors of all zero values except for a 1 in element i, where i is the class they are to represent. Its syntax in MATLAB is: fitnet(hiddenSizes, trainFcn), where hiddenSizes is a row vector of one or more hidden layer sizes, and trainFcn is the training function, the default being the Levenberg-Marquardt backpropagation algorithm.
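The target format described above (all zeros except a 1 in the element of the true class) is ordinary one-hot encoding. The sketch below builds such targets in plain Python; it is an illustration, not part of the MATLAB toolbox, and the function name is hypothetical. Python indices start at 0, whereas MATLAB indexes elements from 1.

```python
def one_hot_targets(labels, n_classes):
    """Return one target vector per label: all zeros except a 1 at the
    (0-based) index of the class the sample belongs to."""
    targets = []
    for label in labels:
        if not 0 <= label < n_classes:
            raise ValueError(f"label {label} outside 0..{n_classes - 1}")
        row = [0] * n_classes
        row[label] = 1
        targets.append(row)
    return targets


# Illustrative three-class example: 0 = normal, 1 = non-seizure, 2 = seizure.
example_targets = one_hot_targets([0, 2, 1], n_classes=3)
print(example_targets)
```

At classification time the predicted class is usually taken as the index of the largest network output, which inverts this encoding.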

RESULTS AND DISCUSSION
In this work, we handle all the different classification problems proposed by Guo et al. 36 and Tzallas et al. 37 to encompass most of the possible discriminations in the medical field related to epilepsy, and compare the performance of our approach with those of other researchers. The first three classification problems were proposed by Guo et al., 36 the next three by Tzallas et al., 37 while the seventh and eighth are proposed by us. These classification problems have been chosen because they are close to clinical applications. As mentioned in the background section, in a previous study we found that the overall performance of both composite vectors of the CEP deteriorated compared to that of the baseline vector in seizure detection and classification of EEG segments. 11 That study was conducted on seven classification problems (CPs). Now, we evaluate the diagnostic capability of the composite vectors of TECEP in the above eight classification problems. Empirically repeating the same procedure as in Kamath, 11 we found that in the case of TECEP, an analysis window length W ≥ 500 samples (2.88 seconds) and a number of cepstral coefficients N ≥ 11 lead to optimum PRNN results. The reduction in minimum window length in TECEP compared to CEP is attributed to applying TEO to the signal before the computation of CEP. Distance-based classifiers demand normalization of the data and hence, feature vectors are normalized before they are applied to the PRNN.
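The text states that feature vectors are normalized before being applied to the PRNN but does not name the scheme. As one common possibility, the sketch below applies per-feature z-score normalization, with the statistics estimated on the training vectors only and then reused for the test vectors. The choice of z-scoring and the function names are assumptions made for illustration, not the author's documented procedure.

```python
import math


def fit_zscore(train_rows):
    """Estimate the per-feature mean and standard deviation from the
    training vectors. A zero standard deviation is replaced by 1.0 so
    that constant features map to 0 instead of dividing by zero."""
    n_rows = len(train_rows)
    n_feats = len(train_rows[0])
    means = [sum(row[j] for row in train_rows) / n_rows
             for j in range(n_feats)]
    stds = []
    for j in range(n_feats):
        var = sum((row[j] - means[j]) ** 2 for row in train_rows) / n_rows
        stds.append(math.sqrt(var) or 1.0)
    return means, stds


def apply_zscore(rows, means, stds):
    """Normalize rows using previously fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


# Tiny illustrative example with two features.
train = [[1.0, 10.0], [3.0, 10.0]]
means, stds = fit_zscore(train)
normalized = apply_zscore(train, means, stds)
print(normalized)
```

Fitting the statistics on the training portion alone keeps information from the held-out segments out of the model during cross-validation.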
We now compare the performance of the composite vectors of the traditional CEP, obtained from Kamath, 11 with that of the TECEP method in general EEG seizure detection. The first composite vector includes the velocity vector together with the static cepstral vector. The second composite vector includes the velocity vector as well as the acceleration vector, together with the static cepstral vector. The comparison is carried out on each of the abovementioned eight classification problems, which have been widely used in the literature related to epilepsy. Typical EEG segments, one from each dataset (in the order Z, O, N, F and S), are shown in Figure 1. Figure 2 shows the first 11 static TECEP coefficients for the same EEG segments shown in Figure 1, in the same order. We adopted a leave-one-record-out cross-validation scheme. Specifically, we performed 10 runs of a 10-fold cross-validation (with 10 runs for each fold split), giving a total of 100 PRNN runs that were averaged to produce the final result. With each new fold split, the EEG data segments were randomized. Descriptive results of the PRNN analysis using TECEP baseline and composite vectors for discriminating the different classification problems are given in Table 1. It is found that the TECEP baseline and composite feature vectors show excellent performance in all the cases. Computation of the conventional baseline cepstrum (CEP) demands that the EEG segment length be long enough to capture specific patterns while short enough to meet the stationarity requirement. 11 TECEP, however, does not impose such restrictions and can be computed from a shorter window size. This is attributed to applying TEO to the signal before the computation of CEP.
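The evaluation scheme described above, with 10 repetitions of a 10-fold split, reshuffling before each repetition, and 100 train/test evaluations in total, can be sketched as an index generator. This is a minimal illustration using the standard library only; the function name, the fixed seed and the absence of class stratification are assumptions, since the text does not give these details.

```python
import random


def repeated_kfold(n_items, n_folds=10, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) pairs for repeated k-fold
    cross-validation. The item order is reshuffled before every repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_items))
        rng.shuffle(order)
        for fold in range(n_folds):
            test_idx = order[fold::n_folds]      # every n_folds-th item
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield train_idx, test_idx


# Example: the first classification problem uses 200 segments.
splits = list(repeated_kfold(200))
print(len(splits))
```

Averaging the accuracy over all generated splits gives a single figure per classification problem, which is how the text describes the final result being produced.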
One of our previous studies 11 found that, in the CEP method, the first composite vector showed a reduction in the overall accuracy in discriminating the EEG segments in the different classification problems. The second composite vector exhibited a greater decline in the overall accuracy. It is interesting to note that the baseline CEP vector alone showed the best performance. The composite CEP vectors, instead of at least maintaining the best performance, showed a degraded performance. This implies that the velocity and acceleration CEP features negatively affect the performance, possibly because the nonlinearities present in the EEG significantly affect the computation of derivatives. Thus, composite CEP vectors are of little use for discriminating the different classes in the various classification problems. Both composite cepstral vectors of the TECEP method, however, show excellent performance in all eight classification problems.
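The composite vectors discussed above append velocity, and optionally acceleration, coefficients to the static cepstral vector. The text does not give the derivative formula, so the sketch below makes two assumptions: cepstral vectors are available for successive analysis windows (frames), and the velocity is a central difference across neighbouring frames with the edge frames replicated. The acceleration is the same operation applied to the velocity. The function names are hypothetical.

```python
def delta(frames):
    """Central difference across successive frames, replicating the first
    and last frames at the edges. frames is a list of equal-length lists."""
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)])
    return out


def composite_vectors(frames, with_acceleration=False):
    """Concatenate static, velocity and (optionally) acceleration
    coefficients frame by frame."""
    velocity = delta(frames)
    acceleration = delta(velocity) if with_acceleration else None
    combined = []
    for t, static in enumerate(frames):
        vec = list(static) + velocity[t]
        if acceleration is not None:
            vec += acceleration[t]
        combined.append(vec)
    return combined


# Three illustrative frames with two static coefficients each.
frames = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
first_composite = composite_vectors(frames)
second_composite = composite_vectors(frames, with_acceleration=True)
print(len(first_composite[0]), len(second_composite[0]))
```

Under these assumptions, 11 static coefficients per frame would give a 22-dimensional first composite vector and a 33-dimensional second composite vector.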
Various researchers have proposed different methods for epileptic seizure detection using the database by Andrzejak et al. 16 Table 2 provides a comparison between our method and other methods that have used the same database. In the table, we list the method, the dataset used and the classification accuracy for the eight classification problems. It is to be noted that all the methods shown in the table, including ours, use modern classifiers that are first trained and then used for classification. In the first classification problem (Serial numbers: 1-8), the results obtained by the methods of Tzallas, 37 Subasi, 38 Wang, 39 Iscan, 40 and Orhan 41 are the best (100%). Our method showed an overall accuracy of 99.9%. In the second problem (Serial numbers: 9-11), our method shows the best results (98.9%). For the third classification problem (Serial numbers: 12-15), the result found by Orhan 42 is the best (100%), while our method exhibited 99.6%. For the fourth, fifth, and sixth classification problems (Serial numbers: 16-18, Serial numbers: 19-21, and Serial numbers: 23-25), our method showed the best average accuracy of 99.5%, 99.0%, and 99.9%, respectively. In the seventh and eighth classification problems (Serial number: 22 and Serial number: 26), the new classification problems appended by us in this paper, the results are excellent (99.3% and 99.8%, respectively). Collectively, these results show a marked improvement of our approach over some previous epilepsy detection methods. The above comparison implies that an automated system developed on the basis of this approach should provide feedback to the experts for quick and accurate EEG classification.
The database used has already been preprocessed by the removal of artifacts through visual inspection. This is a limitation of our method (as it is for many others who have used the same database). Nevertheless, the results of this study provide sufficient evidence to warrant an assessment under actual clinical conditions, which can provide more robust confirmation that this approach captures diagnostically significant information. Hence, the method is well suited for implementation not only in epilepsy detection systems, but also in other applications, such as seizure warning systems, closed loop seizure control systems, or delivering abortive responses/monitoring patients using implantable therapeutic devices. 48

CONCLUSIONS
A comparison of EEG epileptic seizure detection based on baseline and composite vectors comprising velocity and acceleration features, using the TECEP and CEP methods, is presented. In the literature it is found that in applications such as speech analysis and recognition, the velocity and acceleration features do enhance the performance. However, our previous study showed that in the case of EEG discrimination using the CEP method, the velocity and acceleration features negatively affected the performance. The chief finding of this study is that, unlike in the CEP method, in the TECEP method the composite vectors perform on a par with the baseline vector in the discrimination of EEG segments in a variety of classification problems close to clinical applications. An automated system developed on the basis of the TECEP method should provide feedback to clinical neurophysiologists for quick and accurate EEG discrimination. Such discrimination is important in applications such as seizure warning systems, closed loop seizure control systems, or delivering abortive responses/monitoring patients using implantable therapeutic devices.

AUTHOR'S CONTRIBUTION
The author, who is also the corresponding author, is the sole contributor to this work.

1. In the first classification problem, two classes are examined, normal and seizure. The normal class includes set Z, while the seizure class includes set S. In this classification problem, 200 EEG segments are included.
2. In the second classification problem, two classes, namely, non-seizure and seizure, are examined, but not all sets are used. The non-seizure class includes sets Z, N, and F, while the seizure class includes set S. In this classification problem, the dataset includes 400 EEG segments.
3. In the third problem, again, two classes, non-seizure and seizure, are examined. Now the non-seizure class includes sets Z, O, N, and F, while the seizure class includes set S. In this classification problem, 500 EEG segments are included in the dataset.
4. In the fourth classification problem, three classes are examined, normal, non-seizure and seizure, but not all sets are used. The normal class includes set Z, the non-seizure class includes set F and the seizure class includes set S. In this case, 300 EEG segments are used.
5. The fifth classification problem divides five datasets comprising 500 EEG segments into three classes, normal (Z and O), non-seizure (N and F) and seizure (S).
6. The sixth classification problem divides five datasets comprising 500 EEG segments into five individual classes, eyes-open (Z), eyes-closed (O), non-seizure interictal (N), non-seizure interictal (F) and seizure (S).
7. In the seventh classification problem, three datasets comprising 300 EEG segments are divided into two classes, non-seizure (N and F) and seizure (S).
8. Finally, in the eighth classification problem, three classes are examined, normal, non-seizure, and seizure, but not all sets are used. The normal class includes set Z, the non-seizure class includes set N and the seizure class includes set S. In this case, 300 EEG segments are used.
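The eight classification problems listed above can be written down as a small lookup table, which also allows the stated segment counts to be checked. This is an illustrative sketch; the dictionary layout, the class-name strings and the helper names are hypothetical. The counts follow from the 100 segments in each of the sets Z, O, N, F and S.

```python
SEGMENTS_PER_SET = 100

# Classification problem -> {class name: list of Bonn sets in that class}.
CLASSIFICATION_PROBLEMS = {
    1: {"normal": ["Z"], "seizure": ["S"]},
    2: {"non-seizure": ["Z", "N", "F"], "seizure": ["S"]},
    3: {"non-seizure": ["Z", "O", "N", "F"], "seizure": ["S"]},
    4: {"normal": ["Z"], "non-seizure": ["F"], "seizure": ["S"]},
    5: {"normal": ["Z", "O"], "non-seizure": ["N", "F"], "seizure": ["S"]},
    6: {"eyes-open": ["Z"], "eyes-closed": ["O"], "interictal-N": ["N"],
        "interictal-F": ["F"], "seizure": ["S"]},
    7: {"non-seizure": ["N", "F"], "seizure": ["S"]},
    8: {"normal": ["Z"], "non-seizure": ["N"], "seizure": ["S"]},
}


def segment_count(problem):
    """Total number of EEG segments used in a classification problem."""
    classes = CLASSIFICATION_PROBLEMS[problem]
    return SEGMENTS_PER_SET * sum(len(sets) for sets in classes.values())


def class_of(problem, set_name):
    """Class name assigned to a set in a problem, or None if unused."""
    for class_name, sets in CLASSIFICATION_PROBLEMS[problem].items():
        if set_name in sets:
            return class_name
    return None


counts = {cp: segment_count(cp) for cp in CLASSIFICATION_PROBLEMS}
print(counts)
```

The table also makes the class imbalance explicit: in the third problem, for instance, the non-seizure class holds four sets and the seizure class only one.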

Figure 1. Typical EEG segments from each of the five sets (Z, O, N, F, and S), from top to bottom.

Figure 2. The first 11 static TECEP coefficients for the same EEG segments shown in Fig. 1 in the order Z, O, N, F, and S.

Table 1. Percentage average accuracy of PRNN analysis using the TECEP method (W = 500 and N = 11) for baseline and composite vectors in discriminating eight classification problems (CPs).

Table 2. A comparison of the classification accuracy achieved by our method and by the best-performing methods of other researchers for the eight classification problems.