Evaluation of Empirical Mode Decomposition for Event-Related Potential Analysis

Current methods for estimating event-related potentials (ERPs) assume stationarity of the signal. Empirical Mode Decomposition (EMD) is a data-driven decomposition technique that does not assume stationarity. We evaluated an EMD-based method for estimating the ERP. On simulated data, EMD substantially reduced background EEG while retaining the ERP. EMD-denoised single trials also estimated shape, amplitude, and latency of the ERP better than raw single trials. On experimental data, EMDdenoised trials revealed event-related differences between two conditions (condition A and B) more effectively than trials lowpass filtered at 40 Hz. EMD also revealed event-related differences on both condition A and condition B that were clearer and of longer duration than those revealed by low-pass filtering at 40 Hz. Thus, EMD-based denoising is a promising data-driven, nonstationary method for estimating ERPs and should be investigated further.


Introduction
Event-related potentials (ERPs) are changes in the ongoing electroencephalographic (EEG) signal related to processing of a stimulus [1]. The amplitude of ERP is usually much lower than that of the background EEG and is hard to visualise at the level of single trials. Because of this low signal-to-noise ratio, a number of single-trials time-locked to the presentation of the stimulus are averaged to obtain an estimate of the ERP. Single-subject averages are first obtained, and then, the single-subject averages are averaged across subjects to obtain a Grand Average. This procedure has a number of disadvantages. Firstly, when the signal-tonoise ratio (SNR) is particularly low, the number of single trials per participant can be high, leading to long experiment times and participant fatigue [2]. Secondly, the conventional averaging procedure is often unable to detect effects when the number of trials is low. An example of such a situation arises when examining differences between ERPs from trials when the subject responded correctly to when he/she responded incorrectly, and so forth. Thirdly, averaging does not reveal trial-to-trial variations in amplitude and/or latency of ERP. Thus, there is a need for methods which improve the SNR of single trials thereby enabling subtle effects between conditions to be detected.
A number of methods have been proposed to perform such a filtering function. Model-based approaches using autoregressive models (AR) and autoregressive moving average models (ARMA) have been proposed [4,5]. Approaches based on linear transforms such as Independent Component Analysis (ICA) and Principal Component Analysis (PCA) have also been applied [6][7][8] as have neurofuzzy techniques and methods based on the Wiener formalism [9,10]. However, all these methods assume that the frequency content of the ERP is stationary with respect to time, while the ERP is recognised to actually be nonstationary in most cases [11].
A few time-varying denoising methods have also been proposed. de Weerd and Kap [12] introduced a time-varying Wiener filter while [11,13] use the wavelet transform to obtain a multiresolution decomposition from which they each employ different criteria to select the event-related wavelet coefficients. Recently, [14] proposed a variant of this approach by applying wavelet decomposition to phasespace trajectories of raw single trials. While these methods do not assume stationarity, they do implicitly assume 2 EURASIP Journal on Advances in Signal Processing knowledge of meaningful temporal scales in the data. This implicit knowledge is used to set the number of levels of decomposition for the wavelet analysis. Since the EEG is considered to contain energy at five different temporal scales corresponding to the gamma, beta, alpha, delta, and theta frequencies, the number of levels of wavelet decomposition is usually set to five.
However, this partitioning of temporal scales is somewhat arbitrary. Finer distinct temporal scales such as upperalpha frequencies and lower-alpha frequencies have been found [15]. Hence, there is a need for a method that is data driven in that it decomposes the data according to time scales intrinsic to the data itself rather than making assumptions about meaningful temporal scales. In addition, the method should allow for the signal to be nonstationary.
Empirical Mode Decomposition (EMD) is a data driven method that has been used to analyse data from such diverse domains as meteorology and finance [16,17]. It allows for the signal to be nonstationary. In biosignal analysis, it has been used for the analysis of EMG (electromyographic activity) in [18]. The possibility of denoising using partial reconstructions of the EMD-based decomposition was first proposed in [19]. In [20,21] EMD-based denoising was shown to compare favourably against mean and median denoising and give better results than wavelet denoising for some signal classes. The evaluations were done on different classes of artificial signals and ECG (electrocardiographic) data. In the field of neuroscience, it has been used to estimate the ERP on high SNR Local Field Potentials (LFPs) data [22]. It has also been used in conjunction with the Hilbert Transform to detect transient periods of synchronisation in neuronal activity [23]. A multichannel extension of EMD has been applied to EEG responses from steady-state stimulation and shown to be able to estimate onset and offset of the steady-state response and also estimate the SSVEP (steadystate visual evoked potential) due to the stimulation [24,25]. Recently, [26] demonstrated promising results from a procedure applying wavelet denoising to the individual modes from an EMD-based decomposition. The current paper presents results from a study evaluating the feasibility of using EMD for denoising event-related potentials in a low SNR environment. Section 2 contains a description of the EMD technique. Section 3 presents details of how the simulated data were generated and how the experimental data were collected. Further, it outlines the analyses to be conducted on simulated and experimental data. Section 4 presents results of the analyses on simulated and experimental data. Section 5 discusses the results obtained and their implications, and Section 6 summarises the findings.

Empirical Mode Decomposition
Empirical Mode Decomposition (EMD) is a data driven method which extracts oscillatory modes intrinsic to the data by identifying time scales characteristic of individual oscillatory modes. It decomposes a signal into a small number of Intrinsic Mode Functions (IMFs) which represent the oscillatory modes contained in the data. The counterpart of the IMF in Fourier analysis is the simple harmonic component. However, the IMF is much more general than the harmonic component since it can be modulated both in amplitude and frequency while a harmonic component has constant amplitude and frequency. The amplitude and frequency modulations are possible because the decomposition depends on the local characteristic time scale of the data. Because of this flexible nature of the decomposition, the IMFs are often physically meaningful. Also, because the decomposition depends on the local characteristic time scale, EMD is suitable for application to nonstationary signals. The working of EMD is described in detail in [3].
An IMF is identified by two criteria.
(i) The number of extrema and number of zerocrossings must differ at most by one. This criterion eliminates riding waves and ensures a meaningful instantaneous frequency.
(ii) At every discrete time point, the mean of the upper and lower envelopes should be zero. This ensures that the IMF is symmetric about the x-axis and eliminates fluctuations in the instantaneous frequency that might otherwise have been present.
IMFs are identified and extracted through a sifting process. The sifting process consists of a number of steps: (i) identify all extrema of the signal, X(t); (ii) interpolate between all maxima to get upper envelope; (iii) interpolate between all minima to get lower envelope; (iv) obtain the mean of the upper and lower envelopes, m 1 .
The first component, h 1 , should be Ideally, h 1 should give the first IMF, but in practise, there is often overshoot or undershoot, and either or both criteria for IMF are not satisfied. When this happens, h 1 is treated as the data such that This is repeated k times until h 1k is an IMF or until the stopping criterion for the sifting process is reached For the implementation of EMD that we used, the sifting process was stopped based on a double threshold criterion.
The stopping criterion was formulated as a way of allowing for locally large excursions in the mean while ensuring globally small fluctuations. The criterion is described in detail in [27]. Then, h 1k is designated as the first IMF from the data: Overall, c 1 should contain the highest frequency component of the signal. c 1 is then subtracted from the original signal, X(t) to give the residual, r 1 : Since r 1 might contain further IMFs, it is subject to the same sifting process. This process can be iteratively applied to all subsequent residuals such that The extraction of IMFs stops when r n becomes a monotonic function from which no more IMFs can be extracted. Thus, it follows that

Simulated Data.
Tests were conducted on simulated data, to evaluate the denoising performance of EMD. Since the data were artificially generated, peak amplitude, peak latency, and morphology of ERP are known. Hence, denoising performance could be evaluated accurately.

Data Generation.
For generating artificial ERP data, the additive model of ERP generation was assumed, which specifies that the ERP is additive to the background EEG [28]. Background EEG has previously been shown to be well modeled by an autoregressive (AR) process [29]. AR coefficients for a 3rd-order AR process were estimated by the Burg method, by fitting to a 1-second segment of background EEG obtained from the dataset in [30]. An order of 3 was chosen since it marked the steepest fall when forward prediction error was plotted against model order. The AR process was described by where x(t) was the AR-process and r(t) was the Gaussian white noise process driving the AR system. The ERP was a smoothed version of a P300 obtained by averaging 48 target trials from a visual oddball experiment [30]. The ERP was scaled such that its peak amplitude was 5 μV. Figure 1 displays the ERP over 1 second while Figure 2 shows a typical artificial raw single trial with the ERP added to the background EEG.
100 raw single trials were generated in this way with the standard deviation of the noise process adjusted to give an overall SNR of 1. SNR was calculated as the ratio of the variance of the ERP to the average variance of the 100 trials of background EEG.

EMD Denoising.
Each artificial raw single-trial was decomposed using EMD. To obtain the EMD-denoised single trial, it is necessary to identify those IMFs carrying eventrelated energy and calculate their sum. Since the ERP is a low-frequency signal, we included all IMFs having dominant frequency below a threshold in the EMD-based single-trial reconstruction. We call this the threshold frequency ( f t ). Selecting Threshold Frequency ( f t ). To develop a quantitative criterion for selecting a threshold frequency, we assumed a signal model where u(t) is a raw single trial and s(t) is the event-related potential (ERP) additive to the background EEG, n(t). The performance of a denoising method has two aspectsthe amount by which it reduces variance due to noise (background EEG) and the amount of variance due to signal (ERP) it retains. A measure of noise reduction, N f , can be obtained by calculating the reciprocal of average variance of the denoised single trials, x i=1,...,N with higher values meaning better noise reduction. Assuming that the ERP is fully present in every denoised single trial, N f reflects the variance due to the background EEG removed by the denoising method A measure of signal retention, S f , can be calculated as ratio of variance of denoised average, X fD , to variance of raw average, X fR , with higher values meaning better signal retention. Assuming that average of raw trials is the true ERP, S f reflects the variance due to ERP retained in a denoised single trial Note that for a denoising method which does not discriminate between signal (ERP) and noise (background EEG), N f and S f are opposed to each other-high (low) N f would imply a low (high) S f . Thus, a measure of overall denoising performance, D f , is given by combining N f and S f through multiplication D f reflects the extent to which the ERP has been retained while also reducing the background EEG. It is a measure of the ability of the denoising method to discriminate between signal and noise. This is illustrated by contrasting two situations-when the denoising method attenuates a large proportion of signal while retaining a large proportion of the noise, N f and S f will be low and hence D f will also be low. On the contrary, when the denoising method attenuates noise and retains signal, N f and S f will be high and hence D f will also be high.
Based on this criterion, we chose f t as that corresponding to maximum D f for frequencies from 1 to 10 Hz. A range from 1 to 10 Hz was chosen because the ERP is known to be a low-frequency waveform [13]: For the simulated dataset, f t according to (13) was 1 Hz as can be seen from Table 1. Figure 3 shows a close correspondence between EMD-denoised average and raw average, suggesting that most of the variance due to the signal has been retained in spite of the denoising

Analysis.
To evaluate the performance of EMD-denoising on simulated data, with f t = 1 Hz, a number of tests were conducted. The SNR of the dataset with raw trials was compared to the SNR of the denoised dataset. To further investigate denoising performance, we calculated measures reflecting the two aspects of denoising-noise reduction and signal retention.
(a) Noise Reduction Percentage. The average variance due to noise on each raw single trial, NV R , can be calculated by   subtracting the ERP from each raw single trial and calculating the average residual variance. The average variance due to noise on each EMD-denoised single trial, NV D , can be calculated by subtracting EMD-denoised average from each EMD-denoised single trial and calculating the average variance of the resultant residual. The noise reduction percentage was calculated as (b) Signal Retention Percentage. The signal retention percentage was calculated as the ratio between variance of denoised average, X D , to variance of raw average, X R , multiplied by 100 Since the ERP waveform was known for the simulated data, Root Mean Square Error (RMSE), Amplitude deviation, and Latency deviation were used to assess the method's ability to estimate shape, amplitude, and latency of the singletrial ERP, respectively. Since the simulated data had similar properties to experimental ERP data, it was considered that these results would reflect the method's ability to estimate shape, amplitude, and latency of a single trial ERP on experimental data as well.
(c) Root Mean Square Error (RMSE). Since in the simulated data, we have knowledge about the time course of the ERP, we used RMSE to evaluate the extent to which both the EMDdenoised trials and raw single trials accurately represented the ERP. If x is a single trial from either EMD or raw trials and x ref is the reference ERP, where M is the number of samples for each single trial. Distributions corresponding to EMD and raw trials were then compared using either paired-samples t-test or Wilcoxon signed-rank test depending on whether both distributions were normal or not. Normality was tested for using Lilliefors test. Lower RMSE indicates better estimation of ERP time course.
(d) Amplitude Deviation. The extent to which peak amplitude is estimated correctly was measured for each EMDdenoised single trial and raw single trial using the metric of amplitude deviation where x amp is actual peak amplitude while x amp is the observed peak amplitude. Since this metric was calculated on artificial data, x amp was known to be 5 μV. x amp was calculated using peak-picking, choosing the maximum positive voltage value. Distributions corresponding to EMD and raw trials were then compared using either paired-samples t-test or Wilcoxon signed-rank test depending on whether both distributions were normal or not. Normality was tested for using Lilliefors test. A smaller amplitude deviation indicates better amplitude estimation.
(e) Latency Deviation. The extent to which peak latency is estimated correctly was measured for each EMD-denoised single trial and raw single trial using the metric of latency deviation where x lat is the actual peak latency while x lat is the observed peak latency. x lat was known to be 486 ms while x lat was calculated as the latency corresponding to the observed peak amplitude. Distributions corresponding to EMD and raw trials were then compared using either paired-samples t-test or Wilcoxon signed-rank test depending on whether both distributions were normal or not. Normality was tested for using Lilliefors test. A smaller latency deviation indicates better latency estimation.

Experimental Data.
In addition to comparing EMDdenoised trials and raw trials on simulated data, we also evaluated EMD on experimental data.

EMD Denoising
(a) Electrode Selection. The analysis was to be performed on single-channel data, hence it was necessary to choose an electrode which gives good separation between condition A and condition B. To do this, mean effect size between baseline-corrected raw trials from conditions A and B was calculated for each electrode. The mean was calculated across effect sizes for each sample from stimulus onset up to 1 s poststimulus.
Effect size, d, is calculated according to the formula where x 1 is average of the first sample of a given variable and x 2 is average of the second sample of the same variable. The pooled standard deviation s is given by where n 1 is the number of items in the first sample of standard deviation s 1 and n 2 is the number of items in the second sample of standard deviation, s 2 . Based on this criterion, the C2P electrode was identified as providing the best separation between condition A and condition B. Subsequent analysis was thus confined to C2P electrode.
(b) Selecting Threshold Frequency ( f t ). It is also necessary to choose a threshold frequency ( f t ) as parameter for use by EMD. EMD will include all IMFs with dominant frequencies below this threshold frequency, for single-trial reconstruction. The approach to selecting threshold frequency was 6 EURASIP Journal on Advances in Signal Processing similar to that applied on the simulated data as specified in (13). However, since there are two conditions in this case, denoising performance was calculated for each condition at each threshold frequency between 1 and 10 Hz. The selected threshold frequency was the one which had the highest sum of values for denoising performance, D f , which was estimated for condition A and condition B separately. The other difference from the selection of f t in simulated data was that the 25 single-subject averages were used in this case rather than using single trials. This was because singlesubject averages are of higher SNR, thereby potentially giving more accurate estimation of optimal threshold frequency. The selected threshold frequency was 3 Hz for this dataset. As seen in Figure 6, there is a close correspondence between EMD-denoised averages and averages of trials lowpass filtered at 40 Hz.

Analysis
(a) Effect Size. For experimental data, the actual time course of the ERP is unknown, and so, it is not possible to assess the ability of EMD to estimate time course, amplitude, and latency of the ERP. However, one can compare the ability of EMD to separate the two conditions against that of the conventional preprocessing method of lowpass filtering at 40 Hz. Since effect size is a measure of separation of responses from two conditions, it is employed as a performance measure to compare EMD against this baseline condition of lowpass filtering.
From inspection of Figure 6, it seems that event-related differences extend from time of stimulus onset up to 1 s poststimulus. Thus, for each of the 25 subjects, mean effect size of each sample from stimulus onset to 1 s poststimulus was calculated for both EMD-denoised trials and trials lowpass filtered at 40 Hz. Both EMD-denoised trials and lowpass filtered trials had been baseline corrected with activity 200 ms prestimulus. Distributions corresponding to EMD and lowpass filter were then compared using either pairedsamples t-test or Wilcoxon signed-rank test depending on whether both distributions were normal or not. Normality was tested for using Lilliefors test.
(b) Comparison of Correct and Incorrect Trials. On both condition A and condition B, participants correctly responded to questions most of the time while responding incorrectly on rare occasions. Due to the low number of incorrect responses, any differences between correct and incorrect responses will likely be hard to detect. Hence, it presents a scenario suitable for evaluating effectiveness of denoising methods.
For condition A, responses from all 25 participants were included. In all, there were 1010 correct responses (84.7%) and 183 incorrect responses (15.3%). For condition B, only 17 participants were included (subjects 9 to 25) since correct/incorrect information was available for this condition only for these subjects. The number of correct responses was 764 (95.7%), and number of incorrect responses was 34 (4.3%).
The correct and incorrect responses for both conditions were denoised using two methods-EMD and lowpass filtering at 40 Hz. lowpass filtering at 40 Hz is a conventional preprocessing procedure for ERP analysis. Then, correct and incorrect responses from different methods were compared using independent samples t-test, for each condition separately. This was done on trials both from condition A and condition B that had been baseline corrected with activity 200 ms prestimulus. Figure 4 shows an ERP image of the ensemble of raw singletrials while Figure 5 shows an ERP image of an ensemble of EMD-denoised single trials. An ERP image, introduced in [31], is a convenient way of visualising an ensemble of single trials, with time on the x-axis, trial number on the yaxis, and normalised amplitude represented as colour. From the smoother ERP image of Figure 5, it is seen that EMDdenoising reveals the ERP much more clearly.

Results
This suggests that EMD-denoised single trials could reveal trial-trial variations in amplitude or latency of the single-trial ERPs as a function of some behavioural or experimental variable. For example, in [6], Jung et al. used an ERP image to reveal that the peak latency of the P3 ERP component from a visual selective attention experiment was a function of reaction time.

Simulated
Data. SNR of the denoised dataset was 4.3 compared to 1 for that of the raw dataset. Table 2 gives details about performance of the denoising method on two important criteria. On artificial data, EMD is able to reduce variance due to background EEG by as much as 77% and retains almost all variance due to the ERP (99.5%).
At least one of the two distributions (corresponding to EMD and raw trials) was found to be significantly nonnormal according to Lilliefors test for RMSE, amplitude deviation, and latency deviation comparisons. Hence, Wilcoxon signed-rank test was used for all comparisons.
EMD performs significantly better than the baseline condition of raw single trials for RMSE, amplitude deviation, and latency deviation. As shown in Figure 7, EMD (mean = 1.33) has significantly lower RMSE than raw single trials (mean = 2.53), Z = −8.7, P < .001.
EMD also has lower amplitude deviation (mean = 1.19) than raw single trials (mean = 4.57) Z = −8.5, P < .001. Further, it has lower latency deviation (mean = 0.04) than raw single trials (mean = 0.07), Z = −5, P < .001, as shown in Figure 9. Thus, EMD gives a significantly better estimate of the shape, peak amplitude, and peak latency of the ERP compared to raw single trials. In terms of decibels, the performance improvement due to EMD-denoising is −5.6 dB for RMSE and −11.7 dB for amplitude deviation, with negative values due to denoised values being lower than raw values. Performance improvement of latency deviation cannot be expressed in decibels as it is measured in ms.  for its ability to differentiate between two conditions, as measured by effect size. According to Lilliefors test, both distributions (corresponding to EMD and lowpass filtering) are normal distributions. Hence, paired-samples t-test was used for the comparison. As illustrated in Figure 10, EMDdenoised trials (mean = 0.33) were found to have a higher effect size than trials lowpass filtered at 40 Hz (mean = 0.26), t(24) = 5.15, P < .001 suggesting that it is able to better discriminate between conditions. Performance improvement in effect size due to EMD-denoising cannot be expressed in decibels as it is a dimensionless quantity. When comparing correct and incorrect trials on condition A, EMD reveals a significant difference in a time window from 436 ms to 528 ms poststimulus, as shown in Figure 11. On condition B, EMD-denoised trials reveal significant differences between correct and incorrect trials from 348 ms to 616 ms and again from 884 ms to 1000 ms, as shown in Figure 12. Figure 13 shows the comparison for trials lowpass filtered at 40 Hz for condition A. The windows of significance occur sporadically and are of much shorter length, none longer than 20 ms (488 ms to 508 ms). For the equivalent EMD comparison, the longest window of significance is 92 ms (from 436 ms to 528 ms). For trials lowpass filtered at 40 Hz on trials from condition B, clear differences are revealed between correct and incorrect trials as shown in Figure 14. However, the windows of significance also occur sporadically and are of shorter duration than for the equivalent EMD comparison. The window of longest duration for trials lowpass filtered at 40 Hz was 112 ms (from 436 ms to 548 ms) while it was 268 ms (from 348 ms to 616 ms) for the equivalent EMD comparison.

Discussion
EMD is a promising data driven method for denoising ERP data. It has an advantage over other denoising methods because its data driven nature allows it to decompose a time series according to intrinsic time scales. Specifically, the decomposition is done according to the local characteristic temporal scale, and hence, EMD can estimate signals which are nonstationary in nature.

Simulated Data.
On artificial data, EMD showed the ability to reduce noise substantially while retaining the signal. It also showed the ability to estimate the shape, amplitude, and latency of the ERP with low error. Its ability to estimate the amplitude and latency of the ERP with low error on a single-trial basis lends itself to being used to track trial-trial variations in the amplitude and/or latency of the ERP which could then be related to some aspect of cognitive processing. Some examples of these applications are given in [32]. The ability to estimate latency more accurately than from raw trials can also be used to improve performance of latencycorrected averaging proposed by Woody in [33].

Experimental Data.
EMD was also shown in our study to increase the effect size between two conditions, showing that it can be used to increase the discriminability between two different conditions. This ability translates to requiring fewer stimulus presentations to obtain a significant difference between two conditions. This in turn could lead to shorter experimental times and less participation fatigue. The level of denoising offered by EMD also makes it possible to conduct experiments and analyse data from single subjects rather than averaging across large groups of subjects. This opens up the possibility of investigating brain dysfunction in brainimpaired populations on a single-subject basis. The noise reduction offered by EMD should also make it possible to clearly detect effects with only a few trials that were hitherto detected only weakly or not detected at all using conventional processing procedures. One example of this is the differences in event-related activity revealed by EMD on condition A, between correct and incorrect trials, when subjects were asked to assess grammatical correctness of a sentence. Another potential application is a comparison between responses categorised according to reaction time as "fast" and "slow."

Physical Meaningfulness of EMD-Based Decomposition.
The ability of EMD to decompose the data in such a way that small groups of IMFs are event related might or might not be because individual IMFs are physically meaningful. It might be that since EMD tends to decompose a signal into IMFs of progressively lower frequencies and the ERP contains mostly low frequency energy, forming a partial reconstruction from just low-frequency IMFs would naturally perform a denoising function. Hence, the good performance of EMD need not be due to physical meaningfulness of IMFs. There is reason to expect, however, that the EMD decomposition does in fact produce physically meaningful IMFs. EMD assumes that an input time series is a superposition of intrinsic oscillatory modes. This incidentally corresponds to our understanding of EEG/ERP data-that oscillations of different frequency ranges (alpha, beta, gamma, delta, and theta) are functionally distinct and superimposed on each other. In fact, Wang et al. showed in [34] that when EMD was applied to EEG data and the IMFs were clustered, distinct clusters were formed whose frequency spectra corresponded to the naturally occuring frequency bands mentioned above. However, to fully understand whether the EMD decomposition of ERP data is physically meaningful or not, further investigations on simulated should be conducted.

Limitations of EMD-Based
Denoising. The proposed denoising methods does have its limitations. One of these is that it is not parameter-free, but rather requires the setting of a parameter, the threshold frequency ( f t ). The decomposition also does not lend itself to being guided according to a priori knowledge, within a regularization framework, as is the case for the ICA-based decomposition proposed in [35].  Further, the EMD technique by itself has some drawbacks. Its theoretical foundations are not yet well understood, hence making it difficult to interpret the results of decomposition within any formal theoretical framework. It also suffers from border effects which can propagate towards the center especially when trial lengths are short, as mentioned in [24]. The efficacy of the decomposition also depends to a large extent on the suitability of the interpolation scheme which forms the upper and lower envelopes. Cubic spline interpolation is used most often, but other types have also been tried, sometimes with poor results. The stopping criterion for the "sifting process" has also been the subject of much debate, and a number of approaches have been proposed. Also, because of its high dependence on identifying extrema, a certain amount of oversampling is required. Finally, it has been shown that when two oscillatory modes are superimposed with certain combinations of frequencies and relative amplitudes, EMD is not able to separate them [27]. In spite of these limitations though, EMD does decompose the data into physically meaningful components-the primary reason for growing interest in this technique.

Comparing with Previous EMD-Based Studies.
It is useful to compare our study with those by [24,25]. In that case, a multichannel version of EMD has been used but more importantly, they have applied EMD to a special class of evoked potentials, called SSVEPs (steady-state visual evoked potentials). SSVEPs are different from conventional ERPs in a number of ways. For example, SSVEPs are periodic and stationary, and the method for choosing eventrelated IMFs exploits prior knowledge about this periodic, stationary character of SSVEPs. By contrast, when applying  EMD to single-channel ERP analysis, we do not make any assumptions about periodicity or stationarity of the signal. In fact, EMD-based denoising provides a potential benefit compared to other methods because it allows for nonstationarity of the signal. Thus, although our method and the method described in [24,25] both use EMD, they are used in different ways to solve different types of filtering problems. 5.6. Future Work. As a next step, it is crucial to compare the denoising performance of EMD with other methods, especially nonstationary methods like that based on the Wavelet Transform (WT), recently proposed by Quian Quiroga and Garcia [11]. It is also important to consider alternative ways of selecting event-related IMFs, for example, by using their statistical properties as proposed in [19] or by using measures such as Cumulative Mean Square Error (CMSE) as proposed in [20,21]. Finally, investigations into the functional significance of IMFs might be useful, for example, by assessing the ability of EMD to reveal event-related effects on higher-frequency IMFs not normally included in the EMD-denoised single trial.

Conclusion
EMD is a promising data driven, nonstationary method for estimating ERPs. For single-trial analysis, it could be used to study trial-trial variations in amplitude and/or latency of the ERP and relate it to some aspect of cognitive processing. For few-trial analysis, and it can be used to detect effects by comparing averages of only a few EMD-denoised trials. Often, these effects would not be revealed by averaging a few trials that have been processed by conventional means. For the full potential of EMD to be exploited, however, it is important to understand the functional significance of the IMFs, thereby potentially facilitating our understanding of event-related dynamics in the context of ongoing EEG.