Multiple linear regression to estimate time-frequency electrophysiological responses in single trials

Transient sensory, motor or cognitive event elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist in stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability that can reflect physiologically-important information is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at single-trial level (e.g., perceived stimulus intensity and ERP magnitude). The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical oscillations, obtaining single-trial estimate of response latency, frequency, and magnitude. This permits within-subject statistical comparisons, correlation with pre-stimulus features, and integration of simultaneously-recorded EEG and fMRI.


Introduction
The human electroencephalogram (EEG) and magnetoencephalogram (MEG) largely reflect synchronous changes of slow postsynaptic potentials occurring within a large number of similarly oriented cortical pyramidal neurons (Nunez and Srinivasan, 2006). Brisk sensory, motor or cognitive events can elicit transient changes in the ongoing EEG activity. Such changes are commonly detected as event-related potentials (ERPs) that are both time-locked and phase-locked to the stimulus onset Pfurtscheller and Lopes da Silva, 1999). The same events can also induce non-phase-locked modulations of ongoing oscillatory EEG activity, consisting in transient decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of EEG power, usually confined to a specific frequency band. The functional significance of ERD and ERS differs according to the frequency band in which they occur. For example, within the alpha band (frequencies ranging from 8 to 12 Hz) ERD and ERS have been suggested to reflect cortical activation and cortical deactivation, respectively (reviewed in Pfurtscheller and Lopes da Silva, 1999). In contrast, ERS in the gamma band (frequencies N 30 Hz) has been suggested to reflect the formation of transient cortical assemblies and thus to play a role in cortical integration Tallon-Baudry et al., 1997).
Similarly to ERPs, the magnitude of ERD/ERS is often several factors smaller than the magnitude of the background EEG activity (Hu et al., 2010). To enhance the signal-to-noise ratio (SNR) of ERD/ERS, across-trial averaging of time-frequency decompositions (Pfurtscheller and Lopes da Silva, 1999) is the most widely used approach. This approach is based on the expression of signal power within a specific frequency band of interest, as a function of time, or the joint time-frequency distribution (TFD) of the power changes obtained using different approaches, like the windowed Fourier transform or the continuous wavelet transform . Unfortunately, all the dynamic information concerning across-trial variability of ERP/ERD/ERS is lost by this across-trial averaging procedure . This variability may reflect important factors such as changes in stimulus intensity, habituation Ohara et al., 2004;Stancak et al., 2003), as well as fluctuations in vigilance, expectation, task complexity, and emotional or attentional status (Mu et al., 2008;Ploner et al., 2006). The availability of robust signal processing techniques that can quantify latency, frequency, and magnitude of stimulus-induced oscillations in single-trials would allow exploring such physiologically-relevant information in a range of analyses. These include within-subject statistical comparisons, correlation with pre-stimulus features, integration of simultaneously-recorded EEG and functional magnetic resonance imaging data (fMRI) (Debener et al., 2006).
A variety of advanced methods for single-trial analysis of phaselocked ERPs have been proposed Barbati et al., 2006;Hu et al., 2010;Hu et al., 2011b;Jung et al., 2001;Mayhew et al., 2010;Mayhew et al., 2006;Porcaro et al., 2008;Porcaro et al., 2009;Porcaro et al., 2010;Quiroga, 2000;Quiroga and Garcia, 2003;Tang et al., 2005;Tecchio et al., 2007). An important step to improve the effectiveness of extracting single-trial information is enhancing the low SNR of ERPs. This can be obtained by exploiting the spatial information of multielectrode EEG recordings using spatial-temporal filtering based on blind source separation (BSS) methods (e.g., independent component analysis [ICA] and second-order blind identification) to isolate stimulusrelated responses from background EEG (Bingham and Hyvarinen, 2000;Hyvarinen, 1999;Hyvarinen and Oja, 2000;Jung et al., 2001;Makeig et al., 1997;Tang et al., 2005). Time-frequency filtering based, for example, on a continuous or discrete wavelet transform (Hu et al., 2010;Hu et al., 2011b;Jongsma et al., 2006;Mouraux and Plaghki, 2004;Quiroga, 2000;Quiroga and Garcia, 2003) can also be used to isolate effectively stimulus-related, phase-locked responses from background EEG and non-cerebral artefacts. In a previous study aiming at measuring single-trial ERP features, Mayhew et al. (2006) suggested the use of a multiple linear regression method to estimate ERP latencies and amplitudes. This approach was later extended by including a dispersion term to increase the accuracy and the number of estimated features (Hu et al., 2011a). Importantly, most of the available single-trial analysis methods were developed to estimate stimulus-evoked phase-locked responses in the time domain (i.e., ERPs). Therefore, these methods are entirely blind to stimulus-induced non-phase-locked modulations of ongoing EEG oscillations (i.e., ERD and ERS). Extending these approaches to explore the dynamic information of stimulus-induced non-phaselocked activity at a single-trial level constitutes the objective of the present study.
Laser-evoked potentials (LEPs) are considered the best tool to assess the function of nociceptive pathways, and are widely used in both physiological and clinical studies (e.g., Bromm and Treede, 1991;Cruccu et al., 2008;Iannetti et al., 2001). When applied onto the skin, brief laser heat pulses excite selectively Aδ and C fibre free nerve endings in the superficial epidermal layers without coactivating Aβ mechanoreceptors (Bromm and Treede, 1984;Carmon et al., 1976), and EEG responses have been shown to be related to the activation of the spinothalamic tract (Iannetti et al., 2003;Treede et al., 2003). Latency, amplitude and morphology of LEPs exhibit an especially high acrosstrial variability (Hu et al., 2011a;Iannetti et al., 2005a;Purves and Boyd, 1993), most probably due to a unique combination of peripheral (e.g., time-dependent fluctuations in baseline skin temperature, variability in the number of activated nociceptive fibres, variability in conduction velocity resulting in differences in the spatial summation at central synapses) and cognitive factors (e.g., fluctuations in vigilance, attentional focus and task strategy) (Baumgartner et al., 2005;Lee et al., 2009;Legrain et al., 2003). Therefore, laser-evoked EEG responses, adopted in the current study, represent an interesting model to develop novel approaches to estimate time-frequency features at single-trial level, with potential applications for basic and clinical research.
We describe a novel approach to measure different parameters (latency, frequency, and magnitude) of non-phase-locked time-frequency responses in single trials. Briefly, this approach consists of two steps. First, we used a principal component analysis (PCA) decomposition with Varimax rotation to isolate different response features (i.e., ERP, ERD and ERS) from single-trial TFDs in the time-frequency domain. Second, we used a time-frequency multiple linear regression with a dispersion term (TF-MLR d ) to estimate latency, frequency and magnitude of ERP, ERD and ERS in each single trial. To validate our approach, we applied it to both real and simulated EEG datasets.

Subjects, experimental paradigm and EEG recording
EEG data were collected from ten healthy volunteers (4 females) aged from 22 to 36 years (29.7 ± 4.6, mean ± SD). All participants gave written informed consent, and the local ethics committee approved the procedures.
Noxious radiant-heat stimuli were generated by an infrared neodymium yttrium aluminium perovskite (Nd:YAP) laser with a wavelength of 1.34 μm (Electronical Engineering, Italy). These laser pulses activate directly nociceptive terminals in superficial skin layers (Baumgartner et al., 2005;Iannetti et al., 2006). Laser pulses were directed to the dorsum of the right hand and a He-Ne laser pointed to the area to be stimulated. The laser pulse was transmitted via an optic fibre and focused by lenses to a spot diameter of approximately 7 mm (38 mm 2 ) at the target site. The duration of the laser pulses was 4 ms. Three different energies of stimulation were used (E1: 3.5 ± 0.7 J; E2: 4 ± 0.8 J; E3: 4.5 ± 0.7 J). With these parameters laser pulses elicit a clear pinprick sensation, related to the activation of Aδ skin nociceptors  and result in subjective reports of a range of perceived intensities. After each stimulus, the laser beam target was shifted by approximately 10 mm in a random direction, to avoid nociceptor fatigue and sensitization. The laser beam was controlled by a computer that used two servo-motors (HS-422; Hitec RCD, USA; angular speed, 60°/160 ms) to orient it along two perpendicular axes (Lee et al., 2009).
EEG data were collected in three recording blocks with different stimulus energies, which were counterbalanced across subjects. Participants were seated in a comfortable chair and wore protective goggles. They were asked to focus their attention on the stimuli, relax their muscles and keep their eyes open and gaze slightly downward. Acoustic isolation was ensured using earplugs and headphones. Both the laser beam and the controlling motors were completely screened from the view of the participants. The experiment consisted of three blocks of 60 trials, with an inter-stimulus interval ranging between 20 and 25 s. Between 3 and 6 s after each stimulus, participants were asked to rate verbally the intensity of the sensation evoked by the stimulus, using a numerical scale ranging from 0 to 10, where 0 was "no pain" and 10 was "pain as bad as it could be" (Jensen and Karoly, 2001).
The EEG was recorded using 32 Ag-AgCl electrodes placed on the scalp according to the International 10-20 system, using the nose as reference. Electrode impedances were kept b 5 kΩ. To monitor ocular movements and eye blinks, electro-oculographic (EOG) signals were simultaneously recorded from two surface electrodes, one placed over the lower eyelid, the other placed 1 cm lateral to the outer corner of the orbit. Signals were amplified and digitized using a sampling rate of 1024 Hz and a precision of 12 bits, giving a resolution of 0.195 μV digit −1 (System Plus; Micromed, Italy).
Continuous EEG data were down-sampled to 500 Hz and band-pass filtered between 1 and 30 Hz. EEG epochs were extracted using a window analysis time of 1500 ms (500 ms before and 1000 ms after the laser stimulus) and baseline corrected using the pre-stimulus time interval. To test the possible bias in the automated single-trial detection method, the same number of trials of resting EEG (4000 ms to 2500 ms pre-stimulus) were extracted from the dataset of each subject.
Trials contaminated by eye-blinks and movements were corrected using an ICA algorithm (Delorme and Makeig, 2004;Jung et al., 2001;Makeig et al., 1997). EEG epochs were then visually inspected and trials contaminated by artefacts due to gross movements were removed. In all datasets, individual eye movements, showing a large EOG channel contribution and a frontal scalp distribution, were clearly seen in the removed independent components.
After these pre-processing steps, 58 ± 2 epochs remained for the automated analysis for each subject (577 for all subjects). Similarly, 58 ± 2 epochs of resting EEG were kept for testing detection bias.
EEG data analysis: time-frequency feature (TF-feature) separation A time-frequency distribution of each single EEG epoch was calculated using the continuous wavelet transform (CWT) (Mouraux et al., 2003;Mouraux and Iannetti, 2008). The explored frequencies ranged from 1 to 30 Hz in steps of 1 Hz, and the explored latencies between − 500 and 1000 ms in steps of 2 ms. For each estimated frequency, the magnitude of the power spectrum was baseline-corrected by subtracting the average power of the signal in the time-interval between −400 and −100 ms, which avoids the positive bias introduced by the percentage approach (Hu et al., 2014). The result of CWT is an expression of the oscillation magnitude (in μV) as a function of time and frequency, including both phase-locked (ERP) and non-phase-locked (ERD/ERS) brain responses (Mayhew et al., 2010;Mouraux and Iannetti, 2008). To distinguish between phase-locked and non-phaselocked EEG responses, we calculated, in each subject, the phaselocking value (PLV; Lachaux et al., 1999), as follows: where N is the number of trials, and F(t,f) is the complex time-frequency estimate at each point (t,f) of the single-trial EEG time course.

PCA separation
In order to separate physiologically relevant TF-features (i.e., ERP, ERD, and ERS) within the TFDs of single-trial laser-evoked EEG responses measured from Cz-nose, we performed a PCA decomposition with Varimax rotation (Bernat et al., 2005;Dien, 2010;Kayser and Tenke, 2003), as successfully implemented in several recent studies (Bernat et al., 2007;Bernat et al., 2005;Mayhew et al., 2010). This approach allows the separation of physiologically distinct EEG activities that are contained in the time-frequency plane. The procedure of PCA with Varimax rotation consists in the following five steps (summarized in Fig. 1) (Bernat et al., 2005;Mayhew et al., 2010).
(1) Data concentration. The TFD of each single trial was re-arranged as a vector, and all vectors from all single trials across all subjects were stacked sequentially to form a single matrix. In this study, we re-arranged the time-frequency matrices from K T × N T × N F (three dimensions: trial numbers of all subjects × time points × frequency points) to K T × N TF (two dimensions: trial numbers × time-frequency points).
(2) PCA decomposition of the covariance matrix. The matrix generated in step 1 was decomposed into a set of principal components (PCs) by PCA. (3) Varimax rotation. These PCs were further rotated using the Varimax algorithm, which maximizes the sum of the variances of the squared loadings so that the matrix can be optimally described by a linear combination of few basis functions (Kaiser, 1958;Kayser and Tenke, 2003;Richman, 1986). were averaged across all single-trial TFDs of all subjects, after subtracting the baseline. As compared to the baseline, significant differences in PLVs are outlined in white (bootstrapping test with FDR correction), which indicate that only the ERP response was phase-locked to stimulus onset, while the other TFD responses were not. Middle: The eigenvalue plot shows the variance explained by the first 30 extracted PCs. The first three PCs explained the largest amount of variance in the data (23.1%, 9.2% and 5.9%, corresponding to ERP, ERD and ERS responses), while the explained variance for any of the remaining PCs was less than 5%. Right: The first three PCs correspond to the ERP, ERD and ERS in the time-frequency plane, respectively. The thresholded TFDs were obtained by applying a two-SD cut-off. The amount of background EEG noise was remarkably reduced, while the regions corresponding to ERP/ERS/ERD responses were clearly preserved. The ERP region was located at 50-550 ms post-stimulus (in time) and 3-10 Hz (in frequency); the ERD region at 50-1000 ms and 9-12 Hz; the ERS region at 217-447 ms and 10-19 Hz in frequency.
(4) Rearrangement of PC vectors to TFDs. The three PCs that explained the maximum variance in the matrix were selected, and re-arranged into three-dimensional matrices (i.e., with the same number of dimensions of the original single-trial TFDs). Note that the number of PCs was chosen empirically based on our previous studies Mayhew et al., 2010;Mouraux et al., 2003), where three main PCs represented the stimulus-elicited ERP, ERD and ERS. The obtained PCs were then re-arranged from K P × N TF (two dimensions: PC numbers × time-frequency points) to K P × (N T × N F ) (three dimensions: PC numbers × time points × frequency points). (5) TFD thresholding. To isolate significant signal changes from background noise, the time-frequency maps of each PC were thresholded using a cut-off at two standard deviations from the mean of all time-frequency points (Mayhew et al., 2010). The effect of TFD thresholding using different cut-off levels on the performance of single-trial estimation is shown in the Supplementary Materials. Therefore, only the time-frequency points with amplitudes above (ERP and ERS) or below (ERD) two standard deviations from the mean were retained. The value of all other timefrequency points was set to zero (Mayhew et al., 2010).
The scalp topography of each thresholded TFD was computed by spline interpolation of the mean power of the 20% time-frequency points displaying the highest power increase (ERP and ERS) or decrease (ERD), across all channels (Fig. 2). This "top 20%" summary value reflects, in each subject/trial, the highest response magnitudes in the thresholded TFD for each TF-feature, and reduces the noise that would be presented by including in the mean all points of the spectrogram, some of which would display little or no response. This approach avoids only selecting outlier values, and it has been successfully applied in previous studies Iannetti et al., 2005b;Mayhew et al., 2010;Mitsis et al., 2008;Mouraux and Iannetti, 2008).

EEG data analysis: single-trial analysis
We describe two methods to estimate automatically the single-trial latency, frequency and magnitude of each time-frequency feature (ERP, ERD, and ERS): multiple linear regression in the time-frequency domain without a dispersion term (TF-MLR) and with a dispersion term (TF-MLR d ). These methods are summarized in Figs. 3 and 4 respectively. Both methods have been developed into user-friendly software running under the MATLAB environment, which can be freely downloaded from www.iannettilab.net.

Time-frequency multiple linear regression (TF-MLR)
The MLR method to estimate the single-trial latency and amplitude of ERPs in the time domain was first described in Mayhew et al (2006), and successfully applied to the single-trial detection of the N1 wave of laserevoked potentials (LEPs) (Hu et al., 2010), and of auditory-evoked potentials (AEPs) collected during simultaneous EEG-fMRI recording (Mayhew et al., 2010).
When extended in the time-frequency domain, such time-frequency MLR (TF-MLR) approach takes into account not only the latency jitter of the examined TF-feature, but also its variability in frequency. Thus, the variability in single trials can be modelled as follows.
Where F(t, f) is the TFD of a single-trial LEP waveform which represents as a joint function of time t and frequency f, and F 1 (t,f), F 2 (t,f), and F 3 (t, f) are the across-trial averages of ERP, ERD, and ERS, respectively. F(t, f) can be modelled as the weighted sum of the ERP, ERD, and ERS, plus background noise ε. As unknown parameters in the single-trial analysis, k 1 , k 2 and k 3 are the weighted constants; a 1 , a 2 and a 3 are the values representing the variability in latency; and b 1 , b 2 and b 3 are the values representing the variability in frequency of ERP, ERD and ERS respectively.
Using the Taylor expansion, the MLR model can be simplified as: are the temporal derivatives of ERP, ERD, and ERS; and are the frequency derivatives of ERP, ERD, and ERS respectively. Thus, a single-trial TFD can be approximated as the sum of a set of weighted basis (average, its temporal derivative and its frequency derivative) (Fig. 3).
Based on the fitted single-trial TFD (Fig. 5, top panel), we calculated, for each TF-feature, the correlation coefficient (CC 1 , CC 2 and CC 3 for ERP, ERD and ERS, respectively) between the fitted single-trial TFD and the thresholded TFD obtained from PCA decomposition with Varimax rotation. Single-trial ERP and ERS magnitudes were finally obtained by calculating the mean of the 20% of points (relative to all the non-zero points in the thresholded TFD for each TF-feature [ Fig. 1, right panel], the same hereinafter) displaying the highest increase (if CC 1 N 0 or CC 3 N 0, i.e., a positive fit) or the highest decrease (if CC 1 b 0 or CC 3 b 0, i.e., a negative fit), respectively. In contrast, single-trial ERD magnitude was obtained by calculating the mean of the 20% of points displaying the highest decrease (if CC 2 N 0, i.e., a positive fit), or the highest increase (if CC 2 b 0, i.e., a negative fit). Finally, single-trial latencies and frequencies corresponding to the measured ERP/ERD/ERS were obtained by calculating the mean latency and frequency of the selected "top 20%" of points in the time-frequency plane.

TF-MLR with dispersion term (TF-MLR d )
Similarly to what observed in ERP waveforms Spencer, 2005), the time-frequency response averaged across trials is more dispersed in both time and frequency domains compared to each of single-trial TFDs, because of the latency and frequency variations from trial to trial.
To obtain an accurate estimate of single trial time-frequency responses, not only their variability in latency and frequency, but also their variability in morphology (both in time and frequency domain) should be taken into account. This has a physiological rationale. For example, in some clinical conditions (e.g., optic neuritis during multiple sclerosis), visual-evoked potentials are "desynchronized", i.e., their amplitudes are reduced because of increased latency jitter, as well as increased width of single-trial responses (Orssaud, 2003;Pelosi et al., 1997). Latency jitter, as well as trial-by-trial variability in response morphology (i.e., wave width or frequency variability) could thus be important parameters for clinical studies. Therefore, in addition to the basis set of TF-MLR, two more regressors, representing the scaling of the single-trial response in time or frequency domain are considered, which leads to the following TF-MLR d model (Fig. 4): where s 1 , s 2 and s 3 are the coefficients that determine the compression ratios of the time width of ERP, ERD and ERS of each single-trial TFD compared to those of the average TFD, respectively, while c 1 , c 2 and c 3 are the coefficients that determine the compression ratios of the frequency width of ERP, ERD and ERS of each single-trial TFD compared to those of the average TFD, respectively.
To estimate the unknown parameters k, s, c, a, and b of each single-trial TFD, we generate a basis set to fit the ERP, ERD and ERS by employing PCA, i.e., a non-parametric data-driven approach (Jolliffe, 2002). We generated five PCs representing, for each feature of the TFD (1) the average of the response, (2) the variability in latency, (3) the variability in frequency, (4) the variability in morphology in the time domain, and (5) the variability in morphology in the frequency domain. The procedure for generating the TF-MLR d regressors consists in the following five steps (Fig. 4).
(1) Generating the variability matrices (Fig. 4, step 1). For each TFfeature, four groups of plausible TFDs were generated by shifting ([1] from −100 ms to 100 ms in steps of 2 ms in the time domain, [2] from − 2 Hz to 2 Hz in steps of 0.1 Hz in the frequency domain, centred at the peak of the TF-feature) and compressing ([3] from 1 to 2 in steps of 0.01 in the time domain, [4] from 1 to 2 in steps of 0.025 in the frequency domain, centred at the peak of the TF-feature) the thresholded TFD in an enumerative fashion using resampling technique (i.e., increasing the frequency resolution before shifting and compressing the plausible TFDs, as well as decreasing the frequency resolution afterwards). Each of these plausible bidimensional TFDs was re-arranged into a monodimensional vector, and all vectors were stacked to form four bidimensional data matrices that we called variability matrices (latency-shift matrix, frequency-shift matrix, latencycompression matrix, frequency-compression matrix), for each TF-feature. Note that the variability of TF-feature magnitude is captured by the coefficients weighting the basis sets, and thus it is not included in these variability matrices.
(2) PCA separation (Fig. 4, step 2). Each of the four variability matrices was separately fed into a PCA, to obtain the PCs representing the linear subspace for each variability matrix. As the first few PCs are responsible for most of the data variance (Hossein-Zadeh et al., 2003;Jolliffe, 2002), it is expected that the first PCs resulting from the variability matrices would represent the average TFD for each TF-feature and the second PCs embody its variability in latency, frequency, morphology in the time domain, and morphology in the frequency domain.
(3) Basis set definition (Fig. 4, step 3 (Fig. 5, bottom panel). The obtained basis sets were regressed against the TFD of each single trial. The coefficient (i.e., β value) of each of the five regressors was estimated by the least square approach, as described in Mayhew et al (2006) and Hu et al (2010). By multiplying these estimated coefficients by the corresponding regressors, the fitted TFDs of each single trial were reconstructed. (5) Single-trial latency, frequency, and magnitude estimation.
Based on the fitted single-trial TFD (Fig. 5, bottom panel), we calculated the correlation coefficients (CC 1 , CC 2 , and CC 3 for ERP, ERD, and ERS respectively) between the fitted single-trial TFD and the threshold TFD obtained from PCA decomposition with Varimax rotation for each TF-feature. The calculation of single-trial parameters is the same as the approach described for TF-MLR.

Single-trial performance assessment
The performance of the TF-MLR and TF-MLR d in estimating singletrial parameters was assessed both on simulated and real EEG datasets (performance on simulated data is reported in the Supplementary Materials).
(1) Goodness of fit (TF-MLR vs. TF-MLR d ). Compared with the TF-MLR approach, the TF-MLR d takes the variability of morphology (both latency and frequency) into consideration, and thus generates two additional basis sets for each TF-feature. To determine whether the TF-MLR d approach gives a significantly better fit to the data than the TF-MLR approach regardless of the number of basis sets in each of the two approaches, we performed an F test, which takes into account the number of the compared model parameters (Motulsky and Christopoulos, 2004). Let's consider two models, 1 and 2, where model 1 is "nested" within model 2. That is, model 1 has p 1 parameters, and model 2 has p 2 parameters, where p 2 N p 1 , and, for any choice of parameters in model 1, the same regression curve can be fitted by some choice of the parameters in model 2. It is obvious that in this example the model with more parameters will always fit the data at least as well as the model with fewer parameters. Therefore, any method to compare a simple model with a more complicated model has to balance the decrease in sum-of-squares with the increase in the number of parameters. This can be achieved using the F test (Eq. (5)), which determines whether model 2 gives a significantly better fit to the data than model 1 regardless of the number of parameters (Motulsky and Christopoulos, 2004), as follows: where RSS 1 and RSS 2 are the residual sum-of-squares of model 1 and model 2 respectively, and n is the number of data points. F will have an F distribution with (p 2 − p 1 , n − p 2 ) degrees of freedom. The null hypothesis is that model 2 does not provide a significantly better fit than model 1, and this hypothesis should be rejected if the F value is greater than the critical value of the F distribution for a desired false-rejection probability (p b 0.05).
(2) Detection bias. To examine whether the two explored approaches (TF-MLR and TF-MLR d ) introduce a biase into the analysis by, for example, fitting noise, each of them was applied to resting EEG epochs obtained from the same subjects. Such possible detection bias was quantified by comparing the obtained magnitudes of each TF-feature against zero, using a one sample t test.

(3) Comparison of single-trial magnitudes (TF-MLR vs. TF-MLR d ).
Single-trial magnitudes of each TF-feature obtained using the TF-MLR approach were averaged across trials and compared to the corresponding values obtained using the TF-MLR d approach. Their differences were then assessed using a paired sample t test. (4) Correlation between different single-trial estimates, as well as correlation between single-trial estimates and corresponding single-trial subjective pain intensity. Single-trial parameters (latency, frequency, and magnitude) obtained using the TF-MLR and TF-MLR d approaches were further compared with each other, as well as with the single-trial ratings of the subjective pain intensity. For both approaches, all possible correlations (between each estimated single-trial parameter and the corresponding subjective pain intensity, and between each pair of the estimated single-trial parameters) were performed for each subject (Iannetti et al., 2005a). The obtained correlation coefficients were transformed to Z values using Fisher R-to-Z transformation. The resulting Z values were finally compared against zero using a one sample t test. Fig. 1 shows the first three PCs obtained from PCA decomposition with Varimax rotation. These three PCs explained the largest amount of variance (23.1%, 9.2% and 5.9%, respectively), while the explained variance for any of the remaining PCs was less than 5%.

EEG data analysis: TF-feature separation
After thresholding using the two-SD cut-off, the amount of background EEG noise on the time-frequency plane was remarkably reduced while the regions corresponding to ERP/ERD/ERS were clearly preserved (Fig. 1). The ERP region (feature 1) was located at 50-550 ms post-stimulus (in time) and 3-10 Hz (in frequency); the ERD region (feature 2) was located at 50-1000 ms and 9-12 Hz; and the ERS region (feature 3) was located at 217-447 ms and 10-19 Hz. The group-level PLVs indicated that only the ERP response was phase-locked to stimulus onset, while the other TFD responses (ERD and ERS) were not (Fig. 1, left panel). Similar time-frequency distributions of the EEG responses elicited by transient laser pulses Mouraux et al., 2003;Mouraux and Iannetti, 2008;Ohara et al., 2004;Ploner et al., 2006) and by transient auditory stimuli (Makinen et al., 2004;Mayhew et al., 2010) have been reported previously. Fig. 2 shows the scalp topographies of the ERP, ERD, and ERS responses elicited by laser stimulation of the right hand dorsum. While the scalp topography of ERP response was centrally distributed and maximal at the vertex, the scalp topography of ERS was slightly more frontal. The scalp topography of the ERD response had a maximum contralateral to the stimulated side, and for this reason single-trial ERD parameters were measured from electrode C3. The similarity between the scalp topography of ERP in the time-frequency domain and the scalp topography of N2-P2 complex in the time domain ), together with the observation of their similar modulation by different experimental factors Schulz et al., 2011;Valentini et al., 2011;Zhang et al., 2012), suggests that the neural activity reflected in the N2-P2 complex in the time domain corresponds to the ERP region in the time-frequency domain.
Single-trial performance assessment Fig. 3 shows the three regressors for the ERP response (i.e., the thresholded feature 1) generated using the MLR approach. These regressors represent the average amplitude (Gaussian smoothed) of the response, its temporal derivative and its frequency derivative. The regressors for the ERS and ERD responses were generated in the same way. Fig. 4 shows the procedure used to generate the regressors for the ERP response (thresholded feature 1), in the TF-MLR d approach. The first two PCs, obtained from the latency-shift, frequency-shift, latencycompression, and frequency-compression variability matrices, explained 96%, 95%, 99%, and 96% of the total variance, respectively. The same procedure was used to generate regressors for the ERS and ERD responses.
When comparing the regressors obtained using the TF-MLR and TF-MLR d approaches (Figs. 3 and 4), we observed that the temporal derivative in TF-MLR is very similar to the second PC of the latency-shift variability matrix in TF-MLR d , and that the frequency derivative in TF-MLR is very similar to the second PC of the frequency-shift variability matrix in TF-MLR d . In addition, the second PC obtained from the latencycompression variability matrix and the second PC obtained from the frequency-compression variability matrix were included in the TF-MLR d approach. Thus, the TF-MLR d analysis can better explain the variability of single-trial TFD than the TF-MLR analysis, especially for the variability of response morphology (Hu et al., 2011a), both in the time domain and in the frequency domain.
(1) Goodness of fit assessment (TF-MLR vs. TF-MLR d ). The top and bottom panels of Fig. 5 show the fitted responses of a same singletrial using TF-MLR and TF-MLR d , respectively. In the fitted response the information-of-interest, reflected in the ERP, ERS and ERD, was correctly preserved, while the information-of-nointerest, represented by the stimulus-unrelated background EEG was removed. This procedure increased the SNR of the brain responses (both phase-locked and non-phase-locked) in the time-frequency plane. Importantly, the F test performed between the goodness of fit obtained with the TF-MLR and the TF-MLR d approaches indicated that the better performance of TF-MLR d was not simply due to the larger number of regressors (F = 126, p b 0.001; Fig. 5, right panel), but to the actual fitting of physiologically-relevant sources of variability (e.g., variability of morphology in the time and frequency domains).
(2) Detection bias. To test whether the methods (TF-MLR and TF-MLR d ) used to estimate single-trial magnitude of ERP, ERS and ERD introduced any detection bias, they were applied to an equal number of resting EEG epochs obtained from all subjects. When estimated using TF-MLR approach, the mean (±SEM) of single-trial estimate of response magnitude in resting EEG epochs were 0.15 ± 0.12 μV, − 0.09 ± 0.21 μV, and − 0.06 ± 0.14 μV for ERP, ERD, and ERS. These magnitude values were not significantly different from zero (ERP: p = 0.24; ERD: p =  (Fig. 1). Right Panel: The three regressors obtained by the TF-MLR approach represent the average (Gaussian smoothing; the spread parameters of the Gaussian kernel are: σ x = 30 ms, σ y = 3 Hz), the temporal derivative and the frequency derivative of the ERP response, respectively. The temporal and frequency derivatives will be used to capture the variability in latency and frequency of single-trial TFDs.  Fig. 1). Second column: TFDs representing plausible variability in latency, frequency, morphology in the time domain, and morphology in the frequency domain were generated by shifting and compressing the thresholded response in an enumerative fashion. Third column: For each source of variability, the plausible responses were re-arranged into vectors, which were subsequently stacked into a data matrix (variability matrix). Fourth column: The eigenvalue plots show the explained variance for each of the first 20 generated PCs, for each variability matrix. Note that the first two PCs explain the largest part of the total variance of each source of variability. Fifth column: Five regressors capturing the average magnitude of the considered TF-feature, and its variability in latency (the second PC in latency-shift variability matrix), in frequency (the second PC in frequency-shift variability matrix), in morphology in the time domain (the second PC in latency-compression variability matrix), and in morphology in the frequency domain (the second PC in frequency-compression variability matrix).
sample t test). These results clearly show that both TF-MLR and TF-MLR d approaches provide an unbiased estimate of singletrial magnitude of ERP, ERS and ERD. A comparison of the single-trial magnitude values obtained from LEP trials vs. the resting EEG trials using the two approaches is shown in Fig. 6. (3) Comparison of single-trial magnitudes (TF-MLR vs. TF-MLR d ). Fig. 6 shows the comparison of the average of single-trial ERP, ERD and ERS magnitudes estimated using TF-MLR and TF-MLR d . The single-trial ERP and ERS magnitudes estimated using TF-MLR were significantly smaller than those estimated by TF-MLR d (ERP: 4.70 ± 0.79 μV vs. 4.93 ± 0.80 μV; p b 0.007; ERS: 1.39 ± 0.35 μV vs. 1.52 ± 0.35 μV; p b 0.008, paired sample t test). In contrast, there was no significant difference between the single-trial ERD magnitudes estimated using TF-MLR and TF-MLR d (ERD: −0.91 ± 0.29 μV vs. −0.94 ± 0.32 μV; p N 0.05, paired sample t test). (4) Correlation between different single-trial estimates, as well as correlation between single-trial estimates and corresponding single-trial subjective pain intensity. Fig. 7 shows all possible correlations between single-trial parameters estimated using TF-MLR and TF-MLR d . Overall, correlations were markedly similar in the data estimated using TF-MLR and TF-MLR d . We observed significant negative correlations between single-trial ERP latencies and the corresponding ERP frequencies (mean R = − 0.31 ± 0.07,   6. Comparison of single-trial magnitudes estimated using TF-MLR and TF-MLR d of LEP trials and resting EEG trials. Single-trial magnitudes of ERP (left), ERD (middle), and ERS (right) estimated using TF-MLR (blue) and TF-MLR d (red) from LEP trials (palisaded) and resting EEG trials (filled). For both TF-MLR and TF-MLR d , the magnitudes of ERP, ERD and ERS estimated from resting EEG trials are not significantly different from zero (p N 0.05 for all comparisons; one sample t test). Whereas the single-trial magnitudes of ERD estimated from LEP trials using TF-MLR and TF-MLR d are not significantly different, the single-trial magnitudes of ERP and ERS estimated from LEP trials using TF-MLR were significantly smaller than those estimated using TF-MLR d (p b 0.01 for both comparisons, paired sample t test). p b 0.0001, obtained from TF-MLR d , hereinafter), and between the single-trial ERD latencies and the corresponding ERD frequencies (mean R = −0.36 ± 0.09, p b 0.0001). In addition, we observed significant positive correlations between the single-trial ERP and ERS magnitudes and the corresponding subjective pain intensity (ERP: mean R = 0.51 ± 0.15, p b 0.0001; ERS: mean R = 0.15 ± 0.19, p = 0.03).

Discussion
The present study shows that different features (i.e., ERP, ERD, and ERS) of the time-frequency EEG response elicited by transient sensory stimuli can be (1) isolated and characterized using PCA with Varimax rotation, and (2) reliably estimated at single-trial level using multiple linear regression approaches (TF-MLR and TF-MLR d ). Such approaches allow estimating the trial-to-trial variability of the latency, frequency and magnitude of several time-frequency features. When testing such approaches on a real EEG dataset we observed meaningful correlations between different stimulus parameters and the subjective sensations elicited by a somatosensory stimulus.

PCA-separation of different time-frequency features
We showed that PCA with Varimax rotation can extract successfully different time-frequency features of the EEG response elicited by a transient stimulus (Fig. 1). Such approach has been previously applied to separate and quantify different ERP peaks in the time domain (Dien, 2010;Dien et al., 2007;Kayser and Tenke, 2003), as well as different features of the time-frequency EEG responses (Bernat et al., 2007;Bernat et al., 2005;Mayhew et al., 2010). It should be noted that PCA converts a set of observations of possibly correlated variables into a set of principal components, which are orthogonal (i.e., linearly uncorrelated; Jolliffe, 2002). This orthogonality is still present after the Varimax rotation, which is normally applied to the PCA solution to simplify the structure of the components by maximizing the sum of the variances of the squared loadings (Kaiser, 1958). Dien (1998) argued that the orthogonality requirement of PCA with Varimax rotation (i.e., the requirement that the components should be linearly uncorrelated) may constitute a problem when extracting ERP components, given their possible intrinsic correlation. To address this issue, Dien (1998) suggested the use of oblique/non-orthogonal rotations (e.g., Promax rotation), since these non-orthogonal rotations relax the requirement of orthogonality of the Varimax rotation. However, it should be also noted that the requirement of non-orthogonality has been suggested to represent a disadvantage per se (Kayser and Tenke, 2003), since non-orthogonal components are dependent on each other, resulting in the increase of the probability of Type I errors in subsequent statistical analyses (Kayser and Tenke, 2003).
It should be also noted that to minimize overinterpretations and misindentifications, researchers have been suggested to examine carefully the time/time-frequency distribution, scalp topography, and sensitivity to experimental manipulations of the PCA-isolated components, based on a priori knowledge of well-characterized event-related features (Fabiani et al., 1987). Following this suggestion, we have closely compared the components isolated by PCA with Varimax rotation with previously-documented time-frequency features in LEP studies. Several previous studies Mouraux et al., 2003;Ohara et al., 2004;Ploner et al., 2006;Stancak et al., 2003) consistently Fig. 7. Correlation between different single-trial estimates, as well as correlation between single-trial estimates and the corresponding single-trial subjective pain intensity. Top panel: All possible correlations between different single-trial estimates, measured using TF-MLR (left) and TF-MLR d (right), as well as between these single-trial estimates and the corresponding subjective pain intensity. Red and blue dots represent positive and negative correlations. When a correlation was significant, the corresponding box was marked in yellow. Note that the correlations were markedly similar between TF-MLR and TF-MLR d . Bottom panel: Representative correlations between different single-trial estimates, measured using TF-MLR d . The single-trial ERP latencies showed a significant negative correlation with the corresponding ERP frequencies (mean R = −0.31 ± 0.07, p b 0.0001; left). The single-trial ERD latencies showed a significant negative correlation with the corresponding ERD frequencies (mean R = −0.36 ± 0.09, p b 0.0001; middle). The single-trial ERP magnitudes showed a significant positive correlation with the corresponding pain perception intensity (mean R = 0.51 ± 0.15, p b 0.0001; right).
reported that laser-evoked EEG responses show three typical timefrequency features: a phase-locked ERP, and a non-phase-locked ERD and ERS (Fig. 1). Importantly, these three time-frequency features are at least partially independent, as demonstrated by their differential sensitivity to experimental manipulations Mouraux et al., 2003). The phase-locked response (ERP: 50-550 ms, 3-10 Hz), mainly located in theta band, corresponds to the time-frequency representation of the N2-P2 complex of LEPs in the time domain (Mouraux et al., 2003). Indeed, the scalp topography of the ERP response is similar to that of the N2-P2 complex, i.e., maximal at the vertex and symmetrically distributed over both hemispheres (Fig. 2). Therefore, the timefrequency ERP response in the theta band is likely to reflect neural activities from the anterior cingulate cortex (ACC) and bilateral operculo-insular areas, which are regarded as the main generators of the N2-P2 complex of the LEPs (Frot et al., 2008;Garcia-Larrea et al., 2003). The non-phase-locked ERD in the alpha band (50-1000 ms, 9-12 Hz) was maximal over the parietal region contralateral to the stimulated hand (Fig. 2). This feature might reflect activation/disinhibition of the primary sensorimotor cortex Neuper and Klimesch, 2006;Pfurtscheller and Neuper, 1994). In addition, the non-phaselocked ERS (217-447 ms, 10-19 Hz), mainly located in the beta band, was symmetrically distributed, with a maximum over the frontal regions (Fig. 2). This feature may partly reflect neural activities in the prefrontal cortex (PFC), which is commonly activated in response to nociceptive stimuli in most fMRI studies, but rarely reported in time domain EEG/ MEG studies (Apkarian et al., 2005). These observations suggest the presence of a neurovascular coupling, resulting in a Blood Oxygenation Level Dependent (BOLD) signal measured by fMRI, not only triggered by the phase-locked responses measured by EEG/MEG in the time domain (Debener et al., 2005;Liu et al., 2012), but also by the non-phaselocked oscillatory responses .
It should be noted that different features of the EEG response (e.g., the ERP and ERS) overlap in both time and frequency, and have different time and frequency limits in different subjects. Previously, single-trial magnitudes of such responses have been simply calculated from the mean of the "top 20%" points within a given time-frequency region-of-interests (TF-ROI), empirically defined based on the group average timefrequency distributions (e.g., Iannetti and Mouraux, 2009;Mayhew et al., 2010). Therefore, this TF-ROI approach has two major limitations: (1) it cannot isolate different but overlapping time-frequency features, and (2) it ignores the between-subject variability of the TF-ROI limits.
PCA with Varimax rotation allows overcoming these two crucial limitations. Indeed, it can separate physiologically distinct EEG activities, even if they overlap in the time domain (Dien, 2010;Dien et al., 2007;Kayser and Tenke, 2003), in the time-frequency domain, or both ( Fig. 1). Furthermore, when jointly applied with TF-MLR or TF-MLR d , it allows effectively modeling not only the within-subject, but also the between-subject variability of TF-ROI limits, using a set of regressors (e.g., the temporal derivative and the frequency derivative; Figs. 3-4).
We selected three PCs, which explained the maximum variance in single-trial TFDs and isolated the stimulus-evoked ERP, ERD and ERS. The number of selected PCs was determined empirically based on previous studies Mayhew et al., 2010;Mouraux et al., 2003). When extending the same approach to other applications (e.g., stimulus-evoked EEG/MEG responses of other sensory modalities), it will be necessary to ensure that (1) the time-frequency features isolated by PCA are well studied, and (2) these features are clearly presented in the average TFDs. Also, the separation of time-frequency features using PCA with Varimax rotation could be, in principle, applied to single-trial TFDs measured at all recorded channels (thus yielding to four dimensions: trial number × time point × frequency point × channel). Although this approach will require significant computational resources, 1 novel or more precise time-frequency features may be identified. Importantly, this technique may not be able to isolate some high-frequency features of the response (e.g., gamma band oscillations) due to their lower signal-to-noise ratio, as compared with the low frequency features (e.g., ERP, ERD, and ERS in the present study). To achieve this aim, the possibility of processing band-pass filtered TFDs (e.g., from 50 Hz to 100 Hz, when exploring gamma band oscillations) should be considered.

Robust and unbiased single-trial estimate of time-frequency features
In the time domain, both MLR and MLR d permit to estimate latency and amplitude of single-trial ERPs, in an automatic and unbiased fashion (Hu et al., 2011a;Hu et al., 2010;Mayhew et al., 2006). Here we described an important extension of these approaches, to allow estimating single-trial EEG responses in the time-frequency domain (TF-MLR and TF-MLR d ; Figs. 3 & 4).
When comparing the goodness of fit obtained with the TF-MLR and TF-MLR d approaches, the latter provides a significantly better fit to the single-trial time-frequency responses (F = 126, p b 0.0001) (Fig. 5, right  panel). This observation indicates that the TF-MLR d model fits the single-trial TFDs better than the TF-MLR model regardless of the number of basis sets employed in each of the two approaches. In addition, the magnitudes of single-trial ERP and ERS responses estimated by TF-MLR d were significantly greater than those estimated by TF-MLR (ERP magnitude: 6.4 ± 7.3% increase, p b 0.007; ERS magnitude: 5.8 ± 19.4% increase, p b 0.008; two tailed t test) (Fig. 6). Importantly, both TF-MLR and TF-MLR d approaches provide an unbiased estimate of single-trial magnitude of ERP, ERS and ERD (p N 0.05 for all comparisons, one sample t test) (Fig. 6). Indeed, when the stimulus does not elicit a physiological response (a condition we modelled using the resting EEG dataset), the average of the single-trial estimates of magnitude tends towards zero, under the unique assumption that a sufficiently high number of trials are estimated. In contrast, when estimated using the TF-ROI approach (estimate single-trial magnitudes from the mean of "top 20%" points within the pre-defined TF-ROIs, Mayhew et al., 2010), the mean (± SEM) of single-trial estimate of response magnitude in resting EEG epochs was always significantly different from zero (ERP: 2.74 ± 0.26 μV; ERD: − 2.81 ± 0.34 μV; ERS: 2.73 ± 0.30 μV; p b 0.001 for all comparisons, one sample t test).
Altogether, these findings indicate that the TF-MLR approach, which uses fewer regressors than the TF-MLR d approach, is more specific when detecting stimulus-related responses (Friman et al., 2003), but is unable to capture the variability of the response morphology, both in the time domain and in the frequency domain. Therefore, the TF-MLR provides a simple and robust approach to estimate single-trial parameters, and is particularly appropriate for EEG responses with relatively low SNR (for example the early N1 wave of LEPs; Hu et al., 2010). In contrast, the higher number of regressors in the TF-MLR d approach makes it more sensitive in detecting response variability (Hu et al., 2011a), with the possible drawback of fitting some noise (Friman et al., 2003). The TF-MLR d approach is thus better suited to estimate accurately singletrial responses with relatively high SNR (e.g., intracranial recordings, interictal spikes in epilepsy patients). Therefore, TF-MLR and TF-MLR d perform differently when applied to estimate single-trial timefrequency features with different SNR. This notion is also supported by the simulation study, in which we quantified the performance of the two approaches at controlled SNR levels (see Supplementary Materials). Lastly, given that it can be applied to any kind of trial-to-trial variability, the TF-MLR d approach lends itself to a wide range of applications in system neuroscience, although the definition of the parameters to generate the variability matrices in TF-MLR d (Fig. 4, step 1) could be improved using some prior knowledge of the possible range of the latency and frequency variability. In addition, the SNR can be significantly improved using spatial filtering (e.g., ICA and common spatial pattern analysis; Huang et al., 2013). Finally, multi-way analysis (e.g., tensor decomposition) of EEG data spanning the spatial-temporal-spectral domain could be considered to further enhance the SNR (Cichocki, 2013). After such filtering, TF-MLR and TF-MLR d could be also used on independent components (ICs), or applied on spatially-filtered signals, thus obtaining singletrial estimates of brain responses with enhanced SNR.
Understanding the relationship between the magnitude of stimulusevoked cortical responses and perceptual experiences is the objective of perceptual neuroscience (Mountcastle, 1998). Here we related the brain responses elicited by nociceptive specific laser stimuli (LEPs) with the corresponding painful percepts. Laser stimuli activate selectively free nerve endings of skin nociceptors  and LEPs are widely used to assess the function of nociceptive pathways in health and disease (Treede et al., 2003). In the time-frequency domain, we observed a strong correlation between the phase-locked ERP magnitude and subjective pain intensity (mean R = 0.51 ± 0.15, p b 0.0001; Fig. 7). This is in line with several previous studies (Carmon et al., 1978;Carmon et al., 1980;Iannetti et al., 2008;Kakigi et al., 1989), showing that the amplitude of the N2-P2 complex is often highly correlated with the subjective pain intensity. We also observed a positive correlation between non-phase-locked ERS magnitude and subjective pain intensity (mean R = 0.15 ± 0.19, p = 0.03).
In addition to exploring the trial-by-trial correlation between magnitude of the time-frequency EEG responses and perception, the described approaches provide a reliable estimation of other parameters defining an EEG response in the time-frequency domain. Besides response magnitude, these approaches allow estimating trial-by-trial variability in latency and frequency, as well as response dispersion in both time and frequency domains (Fig. 7). The availability of reliable estimates of all these parameters allows a more complete use of the information contained in stimulus-elicited EEG response, thus having the potential of providing novel physiological information in both basic and clinical studies.