Scanning for oscillations

Objective. Oscillations are an important aspect of brain activity, but they often have a low signal-to-noise ratio (SNR) due to source-to-electrode mixing with competing brain activity and noise. Filtering can improve the SNR of narrowband signals, but it introduces ringing effects that may masquerade as genuine oscillations, leading to uncertainty as to the true oscillatory nature of the phenomena. Likewise, time–frequency analysis kernels have a temporal extent that blurs the time course of narrowband activity, introducing uncertainty as to timing and causal relations between events and/or frequency bands. Approach. Here, we propose a methodology that reveals narrowband activity within multichannel data such as electroencephalography, magnetoencephalography, electrocorticography or local field potential. The method exploits the between-channel correlation structure of the data to suppress competing sources by joint diagonalization of the covariance matrices of narrowband filtered and unfiltered data. Main results. Applied to synthetic and real data, the method effectively extracts narrowband components at unfavorable SNR. Significance. Oscillatory components of brain activity, including weak sources that are hard or impossible to observe using standard methods, can be detected and their time course plotted accurately. The method avoids the temporal artifacts of standard filtering and time–frequency analysis methods with which it remains complementary.


Introduction
Oscillations play important roles in cortical processing (Buzsáki 2006, da Silva 2013, Wang 2010. They are manifest visually as a regular series of oscillatory deflections in electroencephalography (EEG) or local field potential (LFP), or equivalently as a local concentration of power in their Fourier spectra. Oscillatory activity is usually superimposed on less structured activity, and it may be hard to see it in recorded data due to source-to-sensor mixing and competing sources. Indeed, oscillatory activity known to exist based on depth electrode recordings may be invisible from surface recordings in EEG (da Silva 2013). Scientists interested in measuring this activity face a never-ending battle against noise.
Spectral filtering (bandpass, etc) is an effective means to attenuate the background and improve signal-to-noise ratio (SNR). However, filters are prone to ringing triggered by noise or transients, and such ringing may masquerade as oscillations. Even for activity that is genuinely oscillatory, convolution of the signal with the impulse response of the filter changes its time course, both phase and temporal envelope. This may obscure temporal or causal relationships between a stimulus and its response, or between activity in different parts of the brain. Time-frequency analysis is prone to similar issues due to the temporal extent of its analysis kernels, which introduce smearing of response patterns and uncertainty as to their temporal relation to underlying brain events. An alternative means to improve the SNR of oscillatory activity is therefore welcome.
The correlation structure of multichannel data such as EEG, LFP, magnetoencephalography (MEG) or electrocorticography (ECoG) can be exploited to factor out noise sources. Many techniques have been proposed under the general name of component analysis. Here we use a technique known as joint decorrelation (JD) (de Cheveigné and Parra 2014) to determine optimal linear combinations of observations. JD subsumes techniques such as denoising source separation, common spatial patterns and earlier techniques (Koles et al 1990, Fukunaga 1990, Särelä and Valpola 2005, de Cheveigné and Simon 2008, de Cheveigné 2010. Here we use it to scan the data for evidence of narrowband activity. The method is closely related to ideas proposed by Särelä 2004 and to the spatial spectral decomposition method of Nikulin et al (2011). Our contribution is to describe the method in an accessible way, within the JD framework, and to illustrate it on a number of common tasks.
The aim of this paper is to draw attention to a useful tool for the study of oscillatory activity, that can either replace filtering and time-frequency analysis, or else complement them. We describe the tool and illustrate its capabilities using synthetic and real EEG and MEG data.

Optimizing the SNR of narrowband activity
We use JD to maximize the SNR of narrowband activity. The J sensor or electrode signals, arranged as columns of a matrix x X , where t is time, are combined linearly to produce K component signals y tk : where w jk are optimal weights produced by the JD algorithm. In matrix notation, Y XW, = where W is the analysis matrix of dimensions J K, which converts J sensors into K components, K J.
 The JD algorithm finds weights that optimize some criterion, implemented as a 'bias filter' applied to the data: X X .
The nature of this filter determines what aspect of the signal is optimized by the linear combination of channels. In our case the filter  is simply a narrowband filter centered on f, and thus JD finds components that maximize power near f. JD finds this solution by jointly diagonalizing the covariance matrices of the raw and filtered data.
The covariance matrix of the raw data is calculated as C X X, 0  = and that of the filtered data as C X X, 1¯ = and the desired transformation matrix is obtained by joint eigendecomposition of these matrices: C W WD C W WD , , 0 0 1 1 = = where diagonal matrices D 0 and D 1 represent the eigenvalues, and the columns of W the eigenvectors (Fukunaga 1990). The matrix W defines the desired transform.
The first component y t1 (defined by the first column of W) gives the largest possible ratio of filtered power to raw power: it is the linear combination with the best possible SNR for oscillatory activity with frequency f. The second component y t2 is the best linear combination within the signal subspace orthogonal to the first, and so-on. Depending on the purpose of the analysis, we may keep just one component (the optimal linear combination of channels), or several that together define an 'oscillatory subspace'. Note that the y tk are not filtered by the filter ,  which serves merely to find the weights W. Once the weights are found, the components are obtained according to the formula Y XW = that does not involve ,  and thus the component signals are not distorted by spectral filtering.
The method thus far optimizes the power at f relative to the entire spectrum. It is possible to optimize it relative to the immediate spectral vicinity instead, by preprocessing the data with a bandpass filter wider than the bias filter and centered on f. This maximizes the local spectral contrast in the vicinity of the desired frequency (Nikulin et al 2011).

Scanning for oscillatory activity
Given a frequency f, we can find a linear combination of channels that maximizes the SNR of oscillatory activity at that frequency. If f is unknown, we can discover it by applying the procedure with a series of filter center frequencies f i covering the range of interest. For each f i the optimal component y t1 is examined for the presence of narrowband activity using standard tools such as the spectrum or spectrogram, or visual examination of the waveform for the presence of oscillations. The JD algorithm being computationally efficient, it is quite feasible to apply it repeatedly in this fashion.

Simulated data
The method is first illustrated with synthetic data. A target source was created by filtering white Gaussian noise through a narrowband filter centered on 16 Hz (figure 1(a), blue), and mixed into 20 observation channels via a 1 × 20 mixing matrix with random coefficients. In addition to the target, 19 independent white Gaussian noise sources (figure 1(a), red) were added to the same channels via a 19 × 20 mixing matrix. The relative amplitudes of target and background were adjusted to attain a predefined SNR. Spectra of mixtures for various SNR are shown in figure 1(b). For SNR = 10, the presence of the target is evident in the power spectrum of the mixed data, but for less favorable SNR it is quite difficult to discern. Applying JD with a bias filter tuned to 16 Hz reveals the target even for an unfavorable SNR, in this case SNR = 0.001 (figure 1(c)). This example demonstrates how a weak oscillatory source can be extracted from a much stronger noise background.
In a second example, eight narrowband target sources with different center frequencies (figure 2(a), blue) were mixed with 12 white Gaussian background sources at an overall SNR = 0.001. At this SNR the targets are completely concealed within the noise (figure 2(b)). The data were then scanned systematically at 1 Hz intervals using as bias filter a second-order resonator filter with quality factor Q = 8 (Smith 2007). The presence of the sources is evident in the raster plot of color-coded power spectra (figure 2(c)). Specifically: for each source there exists a transform, defined as a set of weights, that preserves that source and locks out the noise and the competing sources. The sources found are genuine: in the absence of a source (SNR = 0) the method finds nothing (figure 2(d)). The faint ridge along the diagonal reflects slight overfitting (see below). This example shows how multiple weak oscillatory sources can be discovered and isolated within the data.

Simulated 8 Hz target embedded in real EEG
In this example an 8 Hz oscillatory pulse (figure 3(a)) was added to 72-channel EEG at overall SNR = 0.1 ( figure 3(b)). At this SNR the target is hard to discern within the EEG. The target SNR can be enhanced by applying a narrowband filter (resonator filter with Q = 8), but the response is temporally stretched and delayed (figure 3(c)). Applying instead a zerophase filter (using the Matlab filtfilt function) eliminates the delay but causes the response to appear to start earlier than it should (figure 3(d)), potentially leading to erroneous inferences concerning the causal relation between events. In addition, spurious oscillatory patterns occur at other times triggered by the background noise. Time-frequency analysis is prone to similar issues (not shown). If instead we apply JD (using the same narrowband filter as bias) the SNR is improved without introducing any temporal distortion or spurious oscillations (figure 3(e)). In this case the noise rejection is not perfect, presumably because JD was calculated on a data segment too short (3s) to find an optimal filter. This example shows how JD can be useful as an alternative to the standard practice of applying narrowband filters or time frequency analysis.

Oscillatory activity in spontaneous EEG
Spontaneous EEG activity was recorded from a human subject in an exploratory study on brain state. The subject was placed within an electromagnetically shielded booth and instructed to remain with eyes closed. Data were recorded with a BioSemi EEG system with 32 electrodes with standard layout. Data were sampled at a 2048 kHz sampling rate, downsampled in software to 256 Hz, and high-pass filtered with a 2nd order Butterworth filter with cutoff 1 Hz. A sample of 850 seconds of data was scanned for narrowband activity at 1 Hz intervals. The bias filter at each frequency was a secondorder resonator filter with quality factor Q = 8 (Smith 2007).  component for one bias frequency. The shape of the power spectrum (horizontal lines in the raster plot) changes according to the bias frequency. For certain bias frequencies (e.g. 6, 11 or 28 Hz) the spectrum has a clear concentration of power at a particular frequency, that is typically (but not always) close to the bias frequency.
Figures 4(b)-(d) shows waveforms, spectrograms, and topographies for three components discovered in this way. The first (6 Hz) has a focal frontal topography typical of theta band activity (Cavanagh and Frank 2014). The second (11 Hz) has a topography typical of occipital alpha activity originating from visual cortices, and the third (28 Hz) has a topography suggestive of ocular microsaccades (Yuval-Greenberg et al 2008). It is worth remarking that the signals shown in figures 4(b)-(d) are not affected by filtering (see methods). Thus there is no ambiguity as to whether the oscillatory pattern observed is genuine or the result of filter ringing (Yeung et al 2004). The time course of the oscillation can be followed accurately without the temporal smearing entailed by filtering. This example shows how multichannel data such as EEG may be scanned to reveal the presence of narrowband activity.

Exploring the alpha subspace in MEG
For each bias frequency, the JD algorithm gives a series of components ordered by decreasing score (ratio of filtered power to raw power), of which only the first is plotted in figure 4(a). For some bias frequencies (e.g. in the alpha range), several of these components may show a high proportion of narrowband power. Together these components define a subspace of the data containing narrowband activity at that frequency. This indicates that the data reflect several brain sources with similar spectral characteristics but distinct time courses and spatial signatures.
As an example, MEG data were recorded from a human subject during an exploratory study in which the subject was  performing a task that required responding to auditory stimuli.
Here we ignore the sensory responses and focus instead on the background activity. Data were acquired from 274 radial gradiometers at a 600 Hz sampling rate and high-pass filtered at 1 Hz with a second-order Butterworth filter. Environmental noise was suppressed by (a) projecting out signals from reference sensors that pick up environmental noise using the TSPCA algorithm (de Cheveigné and Simon 2007), and (b) projecting out remaining 50 Hz-related power using the JD algorithm with a bias filter emphasizing 50 Hz and harmonics (de Cheveigné and Parra 2014). The cleaned data were submitted to PCA and 100 components were selected to reduce dimensionality. The data were then scanned for oscillatory components as described in the previous section. This analysis (not shown) revealed narrowband activity circa 9 Hz.
JD was then applied with a narrowband bias filter centered on 9 Hz (Q = 4), yielding a series of components ordered in terms of decreasing alpha-to-total power ratio. The power spectra of the first 15 components are shown as a raster plot in figure 5(a). Based on visual inspection, it would seem that roughly 13 components are dominated by alpha activity, although there is no clearcut transition between these and the remaining 'non-alpha' components. Collectively the selected components span an alpha-dominated subspace of the 274dimension MEG signal space. There appears to be some diversity in their spectral properties (position and sharpness of the spectral peak, salience of a harmonic). Indeed, tuning the bias filter to neighboring frequencies emphasizes different aspects of this activity, and bias filters tuned to 8, 9 and 11 Hz yield different topographies ( figure 5(b)). The different topographies imply different source geometries, and the spatial progression between these topographies is consistent with reports of lower frequency for frontal alpha sources (da Silva 2013).
Spectral filtering is not involved in the JD transform (the bias filter that served to determine the weights is not retained in the transform, see methods), and the method is thus useful to determine the time course of oscillatory activity without the confounding effects of ringing or temporal smearing (figures 3(c), (d), but see Caveats below). Figure 5(c) shows the root mean (rms) temporal envelopes of the first 15 components for the 9 Hz bias. To calculate the envelopes, the component waveforms were normalized (equal power for each component), filtered by a mild bandpass filter centered on 9 Hz (Q = 1), squared, and smoothed with a 1 s boxcar window. Their time course is deeply modulated, reflecting bursts of alpha power separated by periods of quiescence. The envelopes are correlated between components, suggesting that multiple sources tend to activate simultaneously. Betweencomponent correlation is however not perfect, implying some temporal disparity between the activations of the various sources.
'Alpha-dominated' intervals in the data are characterized by a large ratio of power at 9 Hz relative to total power. Figure 5(d) right shows a histogram of this ratio for the first JD component (black). The distribution is bimodal, supporting a relatively clear distinction between 'alpha' and 'nonalpha' temporal intervals. For comparison, the same analysis applied to a raw data channel (the one most correlated with this component) yields a unimodal distribution dominated by lower values of the ratio (red dashed line). This example shows how JD-based oscillatory analysis can help to characterize the multidimensional cortical activity that underlies the complex phenomenon known as 'alpha'.

Stimulus-induced narrowband activity in MEG
This example uses MEG data borrowed from a published study that measured responses of human subjects to visual stimulation (Duncan et al 2009). During each 5 s trial, the subject fixated a cross during 2.5 s, followed by a grating within the lower right or left quadrant during 2.5 s. Stimuli were repeated for a total of 160 trials, of which a subset of 30 are used in the examples in this paper. Data were recorded with a 274-channel gradiometer MEG system (CTF) at a 600 Hz sampling rate. Further details can be found in the original study of Duncan et al (2009). The same data were also used for illustration in a later study on induced responses (de Cheveigné 2012).
Scanning for oscillatory activity, as explained above, revealed several narrow band components in the 14-17 Hz region. Here we focus on the time course of their power within each trial. The power spectra of the first 30 components of a JD analysis with a narrowband bias filter centered on 15 Hz are shown in figure 6(a) as a raster plot. The first ten or so are dominated by power near 15 Hz, whereas the following ones have more power near 10 Hz. Interestingly, the time course of the trial-averaged power of these various components ( figure 6(b)) shows a lull (event-related desynchronization, ERD) after the stimulus transition at 2.5 s (figure 6(b)). Figure 6(d) shows a raster plot of instantaneous power for the first component. The activity occurs as a series of bursts, with a reduced density after the stimulus transition. There is considerable trial-to-trial variability but a test based on the binomial distribution for no effect shows that the power reduction is highly significant, a conclusion that holds for several of these components ( figure 6(c)). This result is remarkable given that the components were selected purely on the basis of their spectral, not temporal, properties.
The existence of such narrowband ERD effects is well known (Pfurtscheller and Lopes da Silva 1999). What narrowband JD analysis contributes is the ability to overcome the poor SNR of such activity relative to background, and to follow its time course accurately without the temporal smearing entailed by filtering or time-frequency analysis.

Stimulus-induced narrowband gamma in MEG
This last example involves extracting a narrow gamma band event-related synchronization response from the same MEG data as used in the previous example. The data were high-pass filtered at 30 Hz to de-emphasize the prominent low-frequency power, and scanned for narrowband activity between 30 and 60 Hz. This scan revealed a component near 53 Hz. Its amplitude increased after the stimulus transition at 2.5 s, with a slight downward sweep in frequency, and its topography was focal in the occipital region ( figure 7). The same component had previously been found using other methods: beamforming (Duncan et al 2009), and quadratic component analysis (de Cheveigné 2012). The convergence of these widely different methods is reassuring.
These examples illustrate the range of situations where the method can be useful to observe oscillatory activity embedded within multichannel data at low SNR.

Discussion
Brain oscillations play a prominent role in the neurosciences of the brain, and tools to extract and enhance them are of crucial importance. Classic tools such as spectral filtering or time frequency analysis are hobbled by issues related to ringing and temporal smearing. The tool that we review in this paper uses instead the between-channel correlation structure of the data to design a spatial filter that isolates the desired activity. In favorable conditions it can extract extremely weak oscillatory activity, without the temporal distortion entailed by spectral filtering.

Scanning for spectrally shaped activity
JD is effective to extract narrowband cortical activity at a frequency f. Applying it repeatedly with a range of values f i allows the data to be scanned to discover unknown oscillatory activity. Advantages over alternative methods, discussed below, are (a) the analysis is sensitive to weak sources embedded in competing background activity, (b) the time course of oscillatory activity can be followed accurately without temporal smearing, (c) the method is easy to apply, requiring no selection of channels or knowledge of cortical or sensor geometry. It is a good exploratory tool to look for oscillatory patterns hidden within the data.
The method can in principle also be used to search for components with spectral characteristics other than narrowband, for example high-pass, lowpass, bandpass (e.g. beta or theta), or harmonic (multiple narrow bands at multiples of a common fundamental frequency).

Alternative methods
A straightforward approach to enhance the SNR of narrowband activity is to apply a bandpass filter (e.g. Basar et al 1999, Ben-Simon et al 2008, Mazaheri and Jensen 2009. A downside is that non-oscillatory events may excite the filter poles to produce ringing patterns that masquerade as oscillations, raising questions concerning the genuine oscillatory nature of the underlying cortical activity (Yeung et al 2004). Even if the oscillations are genuine, convolution with the impulse response of the filter alters their time course, smoothing the envelope and introducing latency shifts or non-causal effects that complicate the interpretation (figures 3(c), (d)). These are all the more marked as the filter is selective: there is a tradeoff between the benefit of filtering and the salience of ringing effects produced by the filter. In contrast JD improves SNR without distorting the waveform, and it can also operate at much lower SNRs (figure 1).
Time-frequency analysis is subject to the same issues as filtering. Each pixel of the representation is obtained by convolving the signal by an analysis kernel. The temporal span of this kernel is inversely related to spectral resolution (constant for short-term Fourier transform, frequency-dependent for wavelet methods), and the temporal alignment of the kernel relative to the analysis point is arbitrary. Typical choices are to align its center of gravity (to minimize latency at the expense of causality) or its origin (to preserve causality at the expense of latency). Unfortunately such details are often not fully reported, and this may lead to uncertainty concerning the interpretation of temporal relations and causality (Zoefel and Heil 2012).
A different approach, in the case of multichannel data, is to enhance SNR by linear spatial filtering as in ICA, CCA, beamforming, PCA, and so-on. JD belongs to this family. The methods differ in how they find the weights to apply to the different channels (equation (1)). For ICA the weights are chosen to maximize a measure of statistical independence such as non-Gaussianity, hopefully yielding components that map to neural sources. These are then scanned for oscillatory activity. Unfortunately, narrowband signals are only weakly non-Gaussian and may be missed if SNR is low. For JD, in contrast, the weights are chosen to maximize narrowband activity directly, so oscillatory sources are less likely to be missed. Components are also ordered, obviating the need for a post-hoc selection phase. Nikulin et al (2011) describe a method similar to ours that enhances known narrowband components, by maximizing spectral contrast between a target frequency band and adjacent bands. They also briefly discuss a sliding analysis to discover unknown narrowband components, using as criterion the mean ratio between ten successive eigenvalues. Haufe et al (2014) apply a similar process for dimensionality reduction as an alternative to PCA.

Caveats
The method is sensitive, but it may nonetheless miss oscillatory activity that is spatially collinear with high-amplitude activity in another frequency band. The analysis selects components that are active at the target frequency relative to other frequency bands, and so weak narrowband activity might be missed if it is spatially collinear with strong activity in those other spectral regions. This can be alleviated by prefiltering to attenuate power in remote unwanted bands. A useful strategy is to preprocess the data with a bandpass filter centered on the same frequency as the bias filter, but wider, so as to enhance spectral contrast (see methods).
For a given bias frequency, JD may produce several narrowband components that together span a subspace of narrowband activity (e.g. figures 4(a) and 5(a)). There is usually not a one-to-one correspondence between each component and a neural source. Indeed, brain sources with similar frequencies are likely to be correlated, whereas component signals are necessarily uncorrelated. The amplitude of each component signal depends on the phase-dependent vector summation over sources. It is thus difficult to infer the time course of neural sources from the data.
The method is prone to overfitting, particularly if (a) the number of channels is large, (b) the amount of data is small, and/or (c) the available degrees of freedom have been reduced by prior filtering. Overfitting is manifest as an apparent peak in the spectrum at the same frequency as the bias filter, in the absence of any genuine oscillatory activity (e.g. the faint diagonal in figure 2(d)). Overfitting may be tested for by cross-validation techniques (de Cheveigné and Parra 2014), and the tendency to overfit may be reduced by initially applying PCA and discarding low-amplitude PCs to reduce the dimensionality of the data.
Artifactual 'narrowband' features may arise due to prefiltering of the data (lowpass, high-pass, band reject, etc). For example the first row of the raster plot of figure 4(a) seems to indicate narrowband activity at 1 Hz. Instead it simply reflects the interaction between the overall low-pass spectrum of the data and the 1 Hz high-pass preprocessing filter. As explained in de Cheveigné and Parra (2014), activity that propagates across the sensor or electrode array may also incorrectly emerge as oscillatory activity. One must be attentive to such artifacts.

Oscillatory activity in the brain
The method is relatively sensitive, and can reveal the presence of multiple oscillatory sources within brain data. It is sobering to observe that these components represent only a fraction of the total activity. Most of the power in the data appears to be non-oscillatory, as far as we can tell using the method. It is of course possible that additional sources exist that are collinear with stronger non-oscillatory sources, or too weak to emerge even after processing. It is also conceivable that the apparently non-oscillatory activity is actually the superposition of multiple oscillatory sources, too numerous to be resolved within electrophysiological data. Nonetheless, we cannot exclude the alternative hypothesis that much of brain activity is inherently non-oscillatory.

Conclusions
Oscillatory patterns constitute an important aspect of brain activity, often hard to discern within data recorded from EEG or MEG because of the poor SNR relative to other activity. Spectral filtering, often employed to improve the SNR, is unsatisfactory because filter ringing triggered by non-oscillatory events may masquerade as oscillations, and the temporal extent of time-frequency analysis kernels may likewise blur temporal and causal relations. In this paper we introduced a methodology that relies instead on spatial filtering to reveal narrowband activity within multichannel data such as EEG, MEG, ECoG or LFP. The method exploits the betweenchannel correlation structure of the data to suppress competing sources and improve the SNR. Applied to real data, it can reveal weak sources that are hard to observe using standard methods, and allow their time course to be plotted accurately.