IDENTIFYING PRESEIZURE STATE IN INTRACRANIAL EEG DATA USING DIFFUSION KERNELS

The goal of this study is to identify preseizure changes in intracranial EEG (icEEG). A novel approach based on the recently developed diffusion map framework, which is considered to be one of the leading manifold learning methods, is proposed. Diffusion mapping provides dimensionality reduction of the data as well as pattern recognition that can be used to distinguish different states of the patient, for example, interictal and preseizure. A new algorithm, which is an extension of diffusion maps, is developed to construct coordinates that generate efficient geometric representations of the complex structures in the icEEG data. In addition, this method is adapted to the icEEG data and enables the extraction of the underlying brain activity. The algorithm is tested on icEEG data recorded from several electrode contacts from a patient being evaluated for possible epilepsy surgery at the Yale-New Haven Hospital. Numerical results show that the proposed approach provides a distinction between interictal and preseizure states.


1.
Introduction.Approximately 50 million people worldwide suffer from epilepsy, including 2.7 million people in the United States, making it one of the most common neurological disorders [7].Seizures are successfully controlled in 64% of patients, while the remaining 36% have pharmacoresistant epilepsy [9].Epilepsy surgery is an option for some of the patients with pharmacoresistant epilepsy if their seizures can be localized to a focal area of the brain.
It is important to find a reliable method to predict seizures so that a patient can be warned at least a few minutes prior to the seizure and take effective precautions.Finding an accurate predictor of seizures has become a major focus of research during the last few decades [5].Seizure prediction algorithms can be used to improve devices to control seizures.The Neuropace device, for example, which is placed within the skull, contains a programmable feature detector that can be employed to stimulate on detection of target features [15].However, it is still important to 580 D. DUNCAN, R. TALMON, H. P. ZAVERI AND R. R. COIFMAN improve these feature detectors and find ways to predict seizures with sufficient sensitivity and specificity.
In our ongoing research, we have investigated various methods to analyze icEEG data, such as coherence, mutual information, and approximate entropy [4] to analyze resting state icEEG activity.Seizure prediction involves a wide variety of methods, including linear methods, such as autoregressive models, as well as nonlinear methods, such as estimations of the largest Lyapunov exponent and correlation [13].Other approaches involve the comparison of interictal and preictal epochs [1].
Often the main problem in many seizure prediction studies is that the analysis is performed on a small amount of data from few patients.
In EEG data, we assume that the measurements are controlled by underlying processes that represent brain activity.We would like to recover these underlying brain processes to distinguish different brain states, such as interictal and preseizure states.Diffusion maps [3] have been a useful tool in reducing the dimensionality of the data as well as providing a measure for pattern recognition and feature detection.Since diffusion mapping may detect abnormal behavior in the data, it can be used to determine changes in brain states.However, diffusion maps assume access to the underlying process that it aims to reveal.In EEG data, the relationship between samples of the data and the underlying activity may be stochastic, and the data are assumed to be noisy.Hence diffusion mapping is not the most suitable approach to use with our icEEG data.A recently developed algorithm, which is an extension of diffusion maps, may be more applicable in our case [16,17].The new algorithm assumes a stochastic mapping between the underlying processes and the measurements; the mapping is inverted and a kernel is used to recover the underlying activity [16].Thus, the proposed algorithm is more appropriate than diffusion maps for our data.
In this paper, we propose an algorithm that relied on [16] for extracting the underlying brain activity from the icEEG data.The algorithm is an extension of diffusion maps and uses local principal components analysis (PCA).PCA is another dimensionality reduction method.In PCA, the goal is to compute the most meaningful basis to re-express a large and noisy data set.This new basis can reveal hidden patterns and structure in the data as well as remove the noise.An orthogonal linear transformation converts the data to a new coordinate system.The greatest variance in the data is represented by the first coordinate or the first principal component.In EEG, the data may be generated by a nonlinear combination of sources, the underlying processes, so that more sophisticated methods are required to represent the data.An important difference between the proposed algorithm and PCA is the use of nonlinear locality in the extension as opposed to PCA, which retains the linear global information of the data.For the icEEG data, we perform PCA on local regions of data and then integrate the local information using a kernel and obtain a single model.We use a data-driven adapted distance between samples of icEEG recordings to approximate the Euclidean distance between the underlying processes from the noisy EEG.

Intracranial EEG. 1
1 Intracranial depth, subdural strip, and subdural grid electrodes were placed as required for the patient.Subdural strip and grid electrode contacts were recessed platinum disk contacts with 2.3 mm exposed surface and 1 cm center-to-center separation (Ad-Tech, Racine, WI, U.S.A.).
Intracranial EEG (icEEG) data were collected from a patient with localization related epilepsy who was undergoing presurgical evaluation at the Yale-New Haven Hospital.A total of 182 electrode contacts were used during the monitoring.The seizure onset area was located on the right occipital lobe.In this paper, we focus on the 3 electrode contacts, which overlaid the seizure onset area.Figure 1 shows the location of the 3 electrode contacts which were studied.We studied 6 icEEG epochs, each of which corresponds to a seizure experienced by the patient over the course of the icEEG monitoring.Figure 2 shows an example of approximately 16 minutes of icEEG data from one electrode contact that includes a seizure.The seizure start time is marked with a red line.Figure 3 shows the corresponding spectrogram, where the horizontal axis represents time (each point is one time window).We considered the delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-25 Hz), and gamma frequency bands (25-100 Hz).
For each case, we considered 5 minutes of data immediately preceding the seizure and 5 minutes of data from 35 minutes before the seizure, which we considered to be the interictal state.An example is shown in Figure 4.The data were collected Intracranial electrode contacts were located after placement using a computer program (BioImage Suite) developed at Yale University [11] and a procedure which has been described [6].Briefly, intracranial electrode contacts were located from a postimplantation computed tomography (CT) image.The postimplantation CT was then coregistered first with a postimplantation magnetic resonance (MR) image using a linear coregistration procedure and then with a preimplantation MR image using both a linear and a nonlinear coregistration procedure.This procedure allowed us to visualize intracranial electrode contacts on preimplantation MR images.The icEEG were collected with clinical icEEG acquisition equipment (Natus Medical Inc./Bio-Logic Systems Corp., San Carlos, CA).The signals were sampled at 256 Hz with 16 bit A/D conversion.at 256 samples per second, so for each case, we analyzed 10 minutes or 153,600 samples.The horizontal axis represents time (in samples).From the raw icEEG data for this patient, it is typically unclear to determine when a seizure began until the actual start time of the seizure.
2.2.Data features and metric.Let y(n) be the icEEG data from a single electrode contact where n is the time index; we describe the following procedure for data from merely one electrode contact for simplicity.Vectors and matrices will be denoted in lowercase and uppercase bold, respectively.Applying the Short Time Fourier Transform (STFT) on y(n) yields and m is the time frame index, k is the frequency band index, φ(n) is a Hanning window of length N , and L is the discrete-time shift [12].The Hanning window is defined as φ The data are split into segments overlapped by 50%, or L = N/2, each of which are windowed and then Fourier transformed to produce an estimate of the short-term frequency content of the signal.Let be the squared amplitude of the STFT.We collect a few frequency bands for each time frame into vectors defined as where K is the collection of empirically relevant frequency bands.The STFT is commonly used in time series analysis because it allows us to see the changes of the spectral composition over time.In EEG, it may enable us to identify differences in the data closer to seizures as opposed to interictal states.We call {s y (m)} the feature vectors.We assume that the high dimensional features s y (m) are controlled by the low dimensional processes that represent brain activity.The factors that describe a preseizure state are considered to be a collection of d stochastic processes θ m = (θ 1 m , ..., θ d m ), where m is the time frame index, such that each of these processes satisfies a stochastic differential equation that can be described as where i = 1, ..., d.It is assumed that the drift terms (a 1 , ..., a d ) are unknown and (w 1 m , ..., w d m ) are d independent white Gaussian noises.s y (m) is a random variable whose statistics can be expressed as a nonlinear stochastic function of the underlying factors.Thus, the goal is to recover these underlying factors, which represent the preseizure state, from the features, s y (m).
We propose the following method to reduce the dimension of the data while keeping the information we need to distinguish preseizure from interictal states.
We compare feature vectors for the 5 minutes of data by calculating the STFT of the windows of data.Combining the vectors, s y (m), and plotting the outcome shows that the feature vectors (Figure 5) do not exhibit significant variation between interictal and preseizure state for the 6 separate seizure episodes.The lower frequency bands are most often used for detecting different states in patients using EEG; we consider the frequency bands up to 100 Hz for our analysis.These are used as our feature vectors for each point in time (each window of almost 4 seconds).
Given a feature vector s y (m), we compute the local covariance matrix in a time interval of length J: where µ m is the empirical local mean of the feature vectors in the interval.We define a nonsymmetric distance using the covariance matrices Σ m s: and a symmetric distance The distance in ( 9) is termed the Mahalanobis distance, and it is shown in [14] that it approximates the Euclidean distance between the underlying factors.This distance allows us in Section 2.3 to recover the underlying factors in the icEEG data via eigendecomposition of an appropriate Laplace operator (kernel).
2.3.Kernel and embedding.Consider a collection {s y (m)} of N s feature vectors.Let R be a subset of reference feature vectors of size Ns and Ns < N s .R is obtained by selecting an arbitrary subset of measurements to enable a more efficient computation of the data.
A kernel is used to compare the underlying factors, whose (m, m )th element is where is the kernel scale set according to the Mahalanobis distance defined above.This kernel, W R , depends on the reference set and defines the local geometries of the graph [3].W R is normalized by a diagonal density matrix, which enables us to view the sampling as uniform.This normalized matrix is denoted by We can further normalize according to [8].In order to obtain the representation of the training set, we perform an eigendecomposition to handle the nonuniform sampling of the data and acquire the eigenvalues, λ j , and eigenvectors, ϕ j .The eigenvalues of the diagonal matrix are all non-negative and sorted in decreasing order.The Laplace-Beltrami operator is given by I − WR , which shares the same eigenvectors as WR .
In order to incorporate samples from the entire set, we form an N s × Ns nonsymmetric affinity matrix A whose (m, m )th element is defined as where > 0 is the kernel scale, m ∈ R, and m = 1, ..., N s .We define the diagonal matrix D and normalize A, similar to the previous normalization, to form Next we define a kernel on the entire set: Then we extend the eigenvectors of W R and use the diagonal density to get the eigenvalues of W without computing the eigendecomposition of W. It can be shown by [8] that The eigenvalues and eigenvectors of WR are determined, and it is known that W converges to a diffusion operator [8,17].Proposition 3.1 from [8] states that the eigenvectors ψ j of W corresponding to nonzero eigenvalues λ j > 0 are Using W R based on our reference set is useful, because it is smaller than the size of the kernel based on the whole data set, and using the entire data set would be computationally intensive.In practice, it is difficult to handle the icEEG data, because it is so large, so using a small matrix based on the reference set is efficient.In the algorithm, we use W R to denote the first stage with the reference feature vectors as opposed to W for the second stage.Initially, we use the reference feature vectors to perform our analysis using the kernel based on the Mahalanobis distance (9) and apply an eigendecomposition.Then we use (17) to acquire a representation of the entire set and finally define a mapping that allows us to recover the underlying brain activity in the icEEG data.
Based on the eigendecomposition, we define an -dimensional embedding of each feature vector as the diffusion mapping: and is related to the intrinsic dimensionality of the data for each m = 1, ..., N s .The eigenvectors of W corresponding to the largest eigenvalues provide a parametrization of the features.Thus, we reduce the data from a large dimensional space of noisy observations to a small dimensional space, which captures the features of preseizure activity.In our case, we set = 3, so our embedding is in a 3-dimensional space.Our empirical example approximates heat flow on a manifold; we have high dimensional complex data, and we want to understand the intrinsic coordinates of the manifold.The proposed algorithm is useful for distinguishing different states (interictal and preseizure) in the icEEG data.
3. Results. Figure 5 shows the spectrograms for the two segments of interictal and preseizure icEEG data.The horizontal axis represents time frames, and the vertical axis represents frequency ranging from 0-100 Hz.The spectrogram is computed by using the Short Time Fourier Transform (STFT) in ( 4) with a window length of 1,000 points overlapping by 50%.For EEG data, lower frequency bands reveal the most information about the state of a patient [2].In our analysis, we considered the delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-25 Hz), and gamma (25-100 Hz) frequency bands.
After combining the resulting spectrograms for the two segments (five minutes immediately preceding the seizure and five minutes from 35 minutes before the seizure), we apply our proposed algorithm.Figure 6 depicts a 3D scatter plot of the embedding obtained using the three leading eigenvectors from (18).Each point in the figure represents a time frame defined in (5).The colors represent time for the 10 minutes of data, with the red points corresponding to the data directly before the seizure.We find that this method shows a clear separation of the preseizure points from the interictal state points, which tend to be scattered around the origin.From the three dimensional plot, it is quite evident that there is a clear separation between the points in the preseizure state and the points in the interictal state.The red points on the graph are the points closest in time to the seizure onset, and they are separated from the rest of the cloud of points that are near the origin.For the data analysis of this patient, we see that for each of the times prior to the seizure occurrences, there is no apparent difference between the interictal states and the preseizure states in the raw data or in the spectrograms of the data.On the other hand, there is a clear separation between the two states from the embeddings that arise from an application of our method.This experiment was repeated for all six seizures that the patient experienced while being monitored, and we obtained similar and consistent results.
To verify that the separation did not occur from merely the time separation of the two data segments, two separate time segments of interictal state data were compared, as seen in Figure 7.In the previous case, we use one segment of preseizure data compared to one segment of interictal data from 30 minutes before that preseizure segment.We compare that segment of interictal data to another segment of interictal data from 30 minutes prior to it and also computed the spectrograms using (4), as seen in Figure 8.It was seen that there was no discernible separation of these two interictal state data segments using (18) in the embedding in Figure 9.An analysis of other interictal state and preseizure data for this patient showed the same type of separation as shown in Figure 6, whereas other embeddings using two separate interictal periods showed no separation.
4. Discussion.Based on this initial study, we have found our algorithm to be a potentially useful method for characterizing the preseizure state in intracranial EEG recordings from the seizure onset area.Our algorithm may identify the underlying processes that represent the brain activity.In our analysis, in which we examined  the preseizure icEEG data from a patient who experienced 6 seizures, we found that our method resulted in an embedding that showed a distinct separation between interictal and preseizure states.Using the same method while comparing two separate interictal segments that were also 30 minutes apart resulted in no separation in the embedding.Thus it appears that the method distinguished a preseizure state from an interictal state.While this analysis of the data is preliminary, it is planned to develop this approach to construct a method for predicting seizures.Our goal, relying on these results, is to define a threshold that separates the interictal state from the preseizure state in a wide range of cases and develop an automatic method to predict seizures.

Figure 1 .
Figure 1.The seizure onset area is located in the right occipital lobe.The three electrode contacts overlying the seizure onset area which are used in this study have been circled.

Figure 2 .
Figure 2. The icEEG from one electrode contact for approximately 16 minutes including one seizure where the seizure start time is marked with a red line.The horizontal axis represents time (in samples).

Figure 4 .
Figure 4.The raw icEEG data recorded from the 3 electrode contacts in the seizure onset area for the interictal (left) and preseizure (right) periods; the horizontal axis represents time (in samples).

Figure 5 .
Figure 5.The spectrograms from the interictal (left) and preseizure (right) periods; the vertical axis represents frequency ranging from 0-100 Hz, and the horizontal axis represents time (in analysis windows).

Figure 6 .
Figure 6.The embedding using 3 eigenvectors; the colors represent time, with the red points closest to seizure onset.

Figure 7 .
Figure 7.The raw icEEG data recorded from the 3 electrode contacts in the seizure onset area for two interictal periods 30 minutes apart from each other.The horizontal axis represents time (in samples).

Figure 8 .
Figure 8.The spectrograms from two interictal periods; the vertical axis represents frequency ranging from 0-100 Hz, and the horizontal axis represents time (in analysis windows).

Figure 9 .
Figure 9.The embedding of the two interictal state segments of data using 3 eigenvectors; the colors represent time.