Phasic modulation of visual representations during sustained attention

Abstract Sustained attention has long been thought to benefit perception in a continuous fashion, but recent evidence suggests that it affects perception in a discrete, rhythmic way. Periodic fluctuations in behavioral performance over time, and modulations of behavioral performance by the phase of spontaneous oscillatory brain activity point to an attentional sampling rate in the theta or alpha frequency range. We investigated whether such discrete sampling by attention is reflected in periodic fluctuations in the decodability of visual stimulus orientation from magnetoencephalographic (MEG) brain signals. In this exploratory study, human subjects attended one of the two grating stimuli, while MEG was being recorded. We assessed the strength of the visual representation of the attended stimulus using a support vector machine (SVM) to decode the orientation of the grating (clockwise vs. counterclockwise) from the MEG signal. We tested whether decoder performance depended on the theta/alpha phase of local brain activity. While the phase of ongoing activity in the visual cortex did not modulate decoding performance, theta/alpha phase of activity in the frontal eye fields and parietal cortex, contralateral to the attended stimulus did modulate decoding performance. These findings suggest that phasic modulations of visual stimulus representations in the brain are caused by frequency‐specific top‐down activity in the frontoparietal attention network, though the behavioral relevance of these effects could not be established.

Early indications of discrete sampling come from psychophysical experiments, showing that there are limitations on the number of temporal events one can perceive per unit of time. It turns out that when visual flashes are presented to a subject at a constant rate, the subject can only account for a maximum of 10-12 items/second, even when the presentation rate is higher (White & Harter, 1969). In another experiment, reaction time distributions were found to be multi-modal, with ~100 ms in between the peaks, indicating a periodicity of about 10 Hz (Venables, 1960). Given that attention has a facilitating effect on perception (Hawkins et al., 1990), the rhythmicity in reaction times suggests that attention might be discrete and rhythmic.
More recently, neuroscientific experiments have linked the rhythmic modulation of behavior to brain activity, in particular to the rhythmic synchronization of cortical excitability in relation to visual attention (Fiebelkorn et al., 2013Gaillard et al., 2020;Landau & Fries, 2012;Landau et al., 2015;Song et al., 2014;Spyropoulos et al., 2018). For example, Busch et al. (2009) found that the detection of a near-threshold stimulus was related to the phase of spontaneous rhythmic activity in the theta/alpha band (7-10 Hz) just before stimulus onset. In a similar experiment, Busch and VanRullen (2010) later found that this effect was only present when the subject actively attended the near-threshold stimulus. Thus, according to these studies, the phase of the theta/alpha rhythm is inherently related to fluctuations in visual attention and, with that, behavior. Helfrich et al. (2018) pinpointed these top-down modulatory effects to the frontoparietal attention network, specifically the frontal eye fields (FEF), the intraparietal sulcus (IPS), and inferior and superior parietal regions.
The behavioral evidence that suggests that visual perception might inherently be discrete is thus supported by neural evidence that ascribes these effects to the phase of spontaneous activity. However, it remains unclear why visual attention is so strongly related to theta/alpha activity. According to theoretical accounts of visual processing and attention (Haegens et al., 2011;Jensen et al., 2014;Klimesch, 2012), these (particularly alpha) rhythms prioritize input by inhibiting neural processing. In short, neuronal firing is inhibited at oscillatory peaks, but once inhibition ramps down (i.e., in the downgoing flank and trough of the oscillation), visual representations will activate according to their excitability. Alternatively, theta rhythms could organize brain activity into alternating states of sensory sampling at behaviorally relevant locations and moving the focus of attention to another location through saccades .
Moreover, there are some studies that question the role of alpha power and/or phase in perceptual sensitivity altogether (Antonov et al., 2020;Benwell et al., 2017;Ruzzoli et al., 2019;Van Diepen et al., 2019), by highlighting limitations and potential confounds in the literature on this matter (Van Diepen et al., 2019), questioning the size of these effects (Ruzzoli et al., 2019), or providing evidence for a relation between alpha power and perceptual awareness but not visual sensitivity (Benwell et al., 2017). In case the theta/ alpha phase does, indeed, modulate perceptual sensitivity, we would expect the input gain of incoming stimuli to be modulated accordingly. We thus investigated whether the strength of a stimulus' representation in brain activity shows periodic fluctuations.
Developments in the analysis of neuroimaging data have resulted in the ability to decode mental states from non-invasive measurements (Haynes & Rees, 2006). This has been particularly successful in the visual domain in both fMRI and M/EEG (van de Nieuwenhuijzen et al., 2013;Zafar et al., 2015). The high temporal resolution of M/ EEG enables us to ask questions about the temporal evolution of visual presentations using multivariate pattern analysis (MVPA; Cichy et al., 2015;Pantazis et al., 2018;Ramkumar et al., 2013). In order to investigate whether the strength of visual representations in brain activity shows periodic fluctuations, in this exploratory study we decoded visual stimulus information from MEG activity as a function of the instantaneous phase of the ongoing activity. Human subjects participated in a spatial attention task, while MEG was being recorded. The visual stimuli consisted of oriented gratings, with two possible orientations. If the representations of these stimuli are activated only in distinct time windows as a result of discrete attentional sampling, we expect the decoding performance to fluctuate accordingly, with a dependency on the theta/alpha phase. Because the early visual cortex is most sensitive to oriented gratings (Hubel & Wiesel, 1962), this modulating signal was expected to be present there. Alternatively, the modulatory signal could come from the frontoparietal attention network, given its putative role in attention (Buschman & Kastner, 2015) and evidence that points to this network in the phasic modulation of behavior (Busch et al., 2009;Helfrich et al., 2018).

| Stimuli
The experimental task was programmed in MATLAB (2020, MathWorks, RRID: SCR_001622) using Psychophysics Toolbox (Brainard & Vision, 1997), RRID: SCR_002881). All stimuli were presented against a black background (Figure 1). A fixation cross (1.8 visual degrees (°)) was presented for a baseline period (800 ms), after which one of the horizontal arms widened to act as a spatial cue (800 ms). Then two sinusoidal gratings (3.2°) appeared on the lower part of the screen, one in either hemifield. The gratings were drifting upwards (1.33 cycles per degree, drift rate 1 Hz) and could be oriented clockwise (CW, 45°) or counter-clockwise (CCW, 135°). After a variable stimulus duration (0.5-1.5 s) one of the gratings rotated clockwise or counterclockwise with a variable rotation angle. The gratings then remained drifting for 100 ms, after which a question mark was presented to indicate the response window (max. 1,200 ms).

| Experimental equipment
Stimuli were presented by back-projection onto a semi-translucent screen (width 48 cm) by an Eiki LC-XL100L projector with a refresh rate of 60 Hz and a resolution of 1,024 × 768 pixels. Subjects were seated at a distance of ~76 cm from the projection screen in a magnetically shielded room. MEG was recorded throughout the experiment with a 275-channel axial gradiometer CTF MEG system at a sampling rate of 1,200 Hz. In addition, the subject's gaze direction and pupil size were continuously recorded using an SR Research EyeLink 1000 eye-tracking device (RRID: SCR_009602) at 1,000 Hz and resampled to 1,200 Hz. This resulted in three extra channels that were saved together with the MEG data: the horizontal and vertical gaze position and pupil diameter.
Head position was monitored in real-time during the experiment by head-positioning coils at the nasion and left and right ear canals of the subject (Stolk et al., 2013). Subjects were adjusted to the starting position of the first session at the start of the next session. Additionally, subjects readjusted to the original position when the head position deviated more than 5 mm from it. Behavioral responses during the MEG session were recorded using a fiber optic response pad (FORP). In addition to the MEG recording, anatomical T1 scans of the brain were acquired with one of the 1.5/3 T Siemens MRI systems present at the facility (Siemens). In order for the co-registration of the MEG and MRI datasets, the scalp surface was mapped using a Polhemus 3D electromagnetic tracking device (Polhemus).

| Procedure
The experiment contained three 1-hr MEG sessions per subject. Each session consisted of a maximum of 10 blocks of 80 trials or until the end of the session, with a self-determined break in between the blocks. Subjects were instructed to keep their head as still as possible and, preferably, to blink at the end of every trial (see Figure 1 for trial timeline). Subjects had to fixate at the fixation cross at all times and pay covert attention to the cued grating, instructed by the fat arm of the fixation cue. Both gratings could either have a clockwise or counterclockwise orientation and were presented for either 0.5 (10%), 1.0 (80%), or 1.5 (10%) seconds. Only the 1s trials were of interest, the other timings functioned to exclude the possibility that the subject would adapt their attention only after a fixed period following the stimulus onset. Then one of the gratings (cued grating in 80% of 1-s trials) rotated either F I G U R E 1 Trial timeline. Subjects had to look at the center of a fixation cross throughout the trial. After a baseline period, a spatial cue appeared, indicating the target hemifield for covert visuospatial attention. Then two oriented, drifting gratings were presented, after which the rotation direction of one of them had to be indicated with a button press. Semi-transparent arrows in the gratings indicate the direction of movement of the drifting grating | van ES Et al.
clockwise or counterclockwise. The amount of rotation was adjusted online throughout the experiment such that the performance level would be at 80% for all validly cued, 1-s trials, and was initiated at 1.5°. Subjects had to indicate the rotation direction with a button press of the index finger (CW: right index finger; CCW: left index finger). If the subject was too slow (i.e., slower than 1,200 ms) the question mark in the response window turned red for 100 ms, and a new trial was initiated. All trials were counterbalanced regarding grating orientation, cue validity, and stimulus presentation time, and trial order was randomly permuted per session.

| MEG preprocessing
MEG data were preprocessed for each session separately. First, excessively noisy channels and trials were removed from the data by visual inspection, including trials containing SQUID jumps and muscle artifacts. Similarly, eye-tracking data were inspected to remove trials containing eye blinks or saccades in the period [−1, 1] seconds relative to stimulus onset. The data were then demeaned and high pass filtered at 0.1 Hz using a finite impulse response windowed sinc (FIRWS; Widmann, 2006) filter. Additionally, line interference at 50 Hz was removed using a discrete Fourier transform (DFT) filter, as well as its harmonics at 100 and 150 Hz. Signals related to cardiac activity or eye movements were manually identified and removed using independent component analysis (ICA) with FastICA (Gävert et al., 1996) and visual inspection. This was performed prior to the removal of trials containing residual eye movements and blinks. Finally, the data were downsampled to 200 Hz. Only those trials that contained a valid cue and received a correct response were considered for further analyses. All trials 0.5 or 1.5 s duration (stimulus onset to stimulus change) were also disregarded (i.e., only trials of 1.0 s duration were considered).

| MRI preprocessing
MRI data were co-registered to the CTF coordinate system using the head-positioning coils and the digitized scalp surface. Volume conduction models of the head were created using a segmentation of the anatomical image, with SPM8. Freesurfer and Connectome Workbench were used to construct source models, with dipole positions positioned on a cortically constrained surface with approximate 6 mm spacing, containing 15,684 dipole locations. Subsequently, forward models were computed from individuals' structural MR images using a single-shell volume conduction model (Nolte, 2003).

| Time-frequency analysis
Time-frequency resolved power was estimated for the initial exploration of the data. This was performed separately for low (2-30 Hz) and high (30-80 Hz) frequencies. First, the data were transformed into synthetic planar gradients (Bastiaansen & Knösche, 2000) and padded to 4 s. For low frequencies, a Hanning tapered sliding time window of 500 ms was slid over these data in steps of 100 ms, from −1.0 to 1.0 s relative to stimulus onset, resulting in a frequency resolution of 2 Hz. For high frequencies, DPSS multi-tapers were used with a sliding time window equal to five cycles for each frequency and 50 ms steps, 4 Hz frequency resolution, and frequency-specific smoothing (20% of the frequency value). After spectral decomposition synthetic planar gradients were averaged to form a single spectrum per sensor.
High-frequency time-frequency spectra are plotted relative to the average power in the baseline. Low-frequency time-frequency spectra are plotted as the Attentional Modulation Index (AMI), the relative increase in attend left versus attend right trials: The lateralization of the AMI was quantified as the Lateralization Index (LI): the difference between selected left and right occipital channels (i.e., channels with labels ML/ RO 11,22,23,31,32,33) In order to bin data according to the phase, a phase-providing signal is required. Since gamma power is closely related to active stimulus processing (van Es & Schoffelen, 2019;Fries, 2015;Schroeder & Lakatos, 2009), the location in the brain with the maximum gamma power increase can be used as a central location for stimulus processing. In (2) LI = AMI channels left − AMI channels right . | van ES Et al. order to estimate this location for each subject individually, first the gamma center frequency was estimated as follows. A 600 ms window, padded to 1 s, was used to estimate the center frequency; right before stimulus onset for the baseline estimate, and from 400 to 1,000 ms after stimulus onset for the stimulus-induced estimate (excluding the ERF in the first 400 ms). Power was estimated using a fast Fourier transform (FFT) and a Hanning taper between 30 and 100 Hz, with a frequency resolution of 1 Hz. The relative power increase from baseline was manually inspected in order to find the local maximum in posterior channels, from which the gamma band was extracted (see Figure S1 for an example of a representative subject). The amount of smoothing was then defined as the range of the gamma band divided by two, and the center frequency as the median of the gamma band.
Another FFT (on the 1 s data) on the subject's individual gamma center frequency was used to inform a Dynamic Imaging of Coherent Sources (DICS) beamformer (Gross et al., 2001). The amount of smoothing was tailored to the individual's bandwidth using DPSS tapers. Spatial filters were created for each of the dipole locations in a 3-dimensional grid, using the cross-spectral density (CSD) estimated from the baseline and stimulus data concatenated, and a regularization of 100% of the mean sensor level power. This was performed for attend-left and attend-right trials together, and was performed separately for each session, after which single-trial power in source space was concatenated over sessions. Next, the two dipole locations (i.e., single points on the cortical surface) were selected, one for each hemisphere, that showed the largest increase in gamma power relative to baseline (i.e., the dipole location with the highest T-value resulting from a t test). Spatial filters for the time domain were estimated with a Linearly Constrained Minimum Variance (LCMV; Van Veen et al., 1997) beamformer for these two locations.
First, the time domain data were baseline corrected based on the 100 ms before stimulus onset. The data from 100 ms before to 1,000 ms after stimulus onset were used to estimate the covariance, and spatial filters were created using a regularization of 100% of sensor-level power.

| Brain-wide cortically constrained anatomical parcels
Source modeling can be used to increase the signal to noise ratio of a particular signal of interest and improve spatial consistency over subjects. Without any dimensionality reduction, this can lead to inefficiently large search space. Therefore, brain-wide time domain data were first modeled on cortically constrained meshes of dipole positions using an LCMV beamformer and then grouped in 374 parcels based on an anatomical atlas (Conte 69 atlas, Van Essen et al., 2012), as described in van Es and Schoffelen (2019). This was performed separately for each session and concatenated afterward.

| Phase estimation
Phase time courses were estimated from either the virtual channel based on the location of maximum gamma power increase (see Section 2.3.4) or the anatomically defined cortical parcels. The epoched data were padded to 4 s and for each frequency-of-interest, a Hanning tapered sliding time window was slid over the data with steps of 5 ms. The time window was equal to two cycles of the frequency-of-interest. The phase (φ) was estimated by taking the four-quadrant arctangent (tan −1 ) of the imaginary part and the real part of the Fourier coefficients (X):

| Decoding of stimulus orientation
All decoding analyses were performed separately for the attend-left, and attend-right conditions. The goal of the decoding analysis was to decode the orientation of the attended grating (clockwise or counterclockwise) from the MEG or eye tracker data. First, the data were selected from 400 to 1,000 ms after stimulus onset. The first 400 ms after stimulus onset were discarded in order not to be biased by superior decoding during the stimulus onset response. All time points of interest (i.e., belonging to a particular phase bin, see Section 2.3.8) were selected and concatenated, such that each time point functioned as an observation. Specifically, each time point of every trial was used as an observation and was assigned to a phase bin based on the estimated phase time course. The data in each phase bin thus acted as an aggregate of phase-specific signals within the 400-1,000 ms post-stimulus-onset window.
In order to increase SNR, assuming that the orientation-specific information in the signals can be generalized over observations, groups of five randomly chosen observations (i.e., belonging to the same phase bin) without replacement were averaged together to create pseudo-observations (Guggenmos et al., 2018). The resulting data were pre-whitened and reduced in dimensionality, by applying singular value decomposition (SVD) to the demeaned data, and multiplying the channel-level data with those vectors that explain at least 99% of the variance in the data. The trials with different orientations (clockwise/ counterclockwise) were then split up and equalized in terms of the number of observations. After this preprocessing, a support vector machine (SVM) with a linear kernel and 5-fold cross-validation was used to distinguish between the two stimulus orientations. Note that each observation (i.e., trial-time point) was only used for a single pseudo-observation and in only onefold. In order to reduce the variance in the accuracy estimate, we repeated the procedure 100 times, each time averaging a different subset of observations to obtain the pseudo-observations. In order to obtain a data-driven estimate of change accuracy for each of the subjects, we similarly estimated the chance accuracy for 100 repetitions of randomly labeled data.
In case of assessing whether we could decode orientation from the MEG data, all data were assigned to the same (phase) bin. In case of assessing phasic modulations in decoding performance, the approach above was repeated for each of 18 phase bins independently. Within a phase bin, the SVM was repeated 20 times, where every time different subsets of observations were averaged. This was performed to reduce the variance in decoding accuracy caused by the creation of specific pseudo-observations (analogous to the 100 repetitions performed for the raw decoding accuracy, described in the previous paragraph). The overall decoding performance was the average accuracy over folds and trial-averaging repetitions. A null distribution was estimated by circularly shifting the phase time courses of every trial at random from anything between −2π and 2π before phase binning the data. This way, the phase binning of the raw data was randomized without breaking the temporal (autocorrelation) structure within a trial. This was repeated 100 times to create a null distribution. Note that in the parcel-wise analysis, the procedure described above was repeated for every parcel of interest. Importantly, for every parcel decoding relied on the same data (i.e., pre-whitened sensor space data), but the assignment of each observation to a phase bin relied on the parcel's phase time course.
A similar approach was used in a control analysis that tested whether above chance MEG decoding was still possible when taking differences in gaze into account. The data were preprocessed as described above and the channels from the eye tracker were subsequently used in 10-fold cross-validation. From the training data in each fold, the one trial in each class that was most informative in training the SVM was removed, after which another decoding took place. These steps were repeated until orientation decoding based on eye tracker data was at chance level (evaluated with a t test over 10-folds vs. scrambled class labels). When the performance was at chance level, the same trials (i.e., those that were most discriminative in the eye tracker data) were removed from the MEG data. The MEG data were then used in subsequent decoding, and the accuracy was tested against the randomized version with a t test over 10-folds.
2.3.8 | Phasic modulation of decoding performance The dependency of decoding performance on the phase of a particular frequency was taken as a measure for the strength of periodic fluctuations of visual representations. For this, either the virtual channels based on maximum gamma power increase were used to provide the phase time course, or the individual anatomical parcels were used. Based on these phase time courses, each data point was assigned to any of 18 equidistant phase bins. The decoding approach described above was applied independently for each bin, resulting in 18 accuracy scores. To this, a cosine function of one cycle was fitted, with amplitude and phase as the two free parameters, estimating the strength of the phasic dependence and the optimal decoding phase. This procedure was repeated for every frequency of interest, each time using the phase time course of one particular frequency to bin the data.

| Phasic modulation of behavioral performance
We assessed whether behavioral performance, as quantified by the reaction time, depended on the phase of a particular frequency at the moment just before the stimulus change (to which the subjects had to respond) occurred. First, we estimated the phase based on a window with the size of two cycles and centered on the time point one cycle before stimulus change (i.e., target onset), such that it was not affected by the stimulus change. We then binned phase angles into 18 phase bins and computed the average reaction time across all trials within a 90° window centered on every phase bin (similar to Helfrich et al., 2018). Finally, a cosine function was fitted to the set of average reaction time as a function of phase bin, of which the amplitude indicated the strength of the phasic modulation. This was performed for every frequency of interest and using every anatomical parcel's time course as a phase-providing signal. In order to create a null distribution of cosine fit amplitude scores, the same approach was repeated 100 times after shuffling the reaction time over trials.

| Decoding of stimulus orientation
The average decoding accuracy over 100 estimated for the observed data (i.e., in which in every estimate different subsets of trials were combined into pseudo-observations to increase SNR) was compared with the average decoding accuracy of 100 accuracy estimates including the randomization of class labels. These were subjected to a nonparametric | van ES Et al. permutation test at the group level, with 1,024 permutations, and an alpha of 0.05, corrected for the number of contrasts (attend left trials, attend right trials, decode attended stimulus, decode unattended stimulus).

| Phasic modulation
The amplitude of the cosine fit for observed data and for the 100 first-level random permutations (i.e., in which phasebinning was randomized) was subjected to second-level non-parametric permutation tests with clustering across frequencies at the group level (based on 1,000 permutations; Maris & Oostenveld, 2007). In short, a null distribution of amplitudes was created for every frequency by averaging one sample from the first-level random permutation distribution over subjects. From this distribution, a threshold is established based on the critical alpha level (0.05). The test statistic (i.e., cosine-fit amplitude) of neighboring frequencies that exceed the threshold are clustered and their statistic is summed. This is performed both for the observed data and for all random permutations. A p-value is derived by comparing the largest summed statistic from the observed and random data, and counting the occasions in which the observed test statistic was larger than the random statistic. In case of testing phasic modulation on brain-parcels, statistical evaluation was restricted to a total of 36 parcels, spanning the left and right FEF and parietal cortex (Figure 2). Note that in this case the p-values were corrected for multiple comparisons for both frequency and parcels, but clusters were only formed over frequencies, not over space.

| Post hoc Bayesian statistics
The explorative nature of the current study and the small sample size (N = 10) on which the results are based, make interpretation of (non-) significant findings difficult. Non-significant results, in particular, are hard to interpret, because they do not distinguish between absence of evidence and evidence of absence. Therefore, we computed post hoc Bayes factors (BF) for statistically non-significant results. The Bayes factor is a ratio of the evidence in favor of H1 over the evidence in favor of H0. Bayes factors larger than 1 provide evidence in favor of H1, with values up to 3, between 3 and 10, and larger than 10 providing anecdotal, moderate, and strong evidence. Similarly, the inverse scale applies to evidence in favor of H0 (Lee & Wagenmakers, 2013). Bayes factors were based on the variables or cluster-averages that resulted from the (cluster-based) permutation tests, using flat priors. Note that this results in inflated BFs, because they are based on a selection of the data in which the difference is largest.

| RESULTS
Ten human participants were cued to covertly attend to one of the two gratings presented in each hemifield, while fixating on a cross in the middle of the projection screen (Figure 1). The subjects had to indicate the direction of change in the grating's orientation with a button press. The number of completed 1-s trials was 1956 on average (SD = 18.8).
Invalidly cued trials and trials containing artifacts were excluded, as well as one session from one subject, where behavioral performance was at chance level (i.e., 50%). This left on average 1,420 trials (SD = 207) for further analysis. The average behavioral performance in these remaining trials was 87% (SD = 3.6%, CI = 85%-89%). The behavioral performance of one subject was higher than average (95%) because the performance threshold was set at ~90% during the experiment, instead of the usual 80%. Excluding incorrect trials, 1,235 (SD = 186) trials were considered for further analysis.

| Stimulus-induced power changes
In order to assess whether the experimental manipulation led to expected changes in brain activity, we conducted a time-frequency analysis of power in channel space. In the low-frequency range (1-30 Hz), we expected differences in posterior alpha power (8-13 Hz) induced by the spatial attention cue. Since alpha power is a proxy for the amount of inhibition in a brain region, it is expected to increase in the hemisphere ipsilateral to the attended hemifield, and to decrease contralateral to the attended hemifield. Figure 3 shows this as the attentional modulation index (AMI, i.e., the relative power difference between attend-left and attend-right trials), with B showing the topography of the alpha band, pre-stimulus onset AMI, and A and C showing the AMI over selected left/right posterior channels on the left and right panels, respectively. Indeed, during the presentation of the spatial cue (i.e., before the onset of the grating stimulus), there is a hemispherical lateralization, with a relative increase in ipsilateral alpha power with respect to the cued hemifield and a decrease in contralateral alpha power. The Lateralization Index (see Methods 2.3.3) was calculated on the [−600 -200] ms and [8 13] Hz window, and was on average 10.4 (SD = 5.5, CI = 7.0-14). This lateralization was higher than chance, supported by a post hoc group analysis using a nonparametric permutation test (p < .001). Together with the superior behavioral performance in attended trials, this indicates that subjects successfully deployed covert spatial attention to the cued hemifield.
In the high-frequency range (30-90 Hz), we observed an increase in band-limited gamma power, induced by the grating stimuli ( Figure 3) the induced gamma power increase originated in posterior brain areas, and the maximum induced gamma power was in the occipital cortex in all subjects (Figure 4).

| Stimulus orientation can be decoded from the MEG signal
The accuracy of decoding stimulus information from the MEG signal is known to be the largest in the 0 to 250-400 ms interval; i.e., when the event-related fields are strongest (ERF; Cichy et al., 2015;Pantazis et al., 2018;Ramkumar et al., 2013). This high decodability is most likely due to the stimulus onset locked transients, and because these transients are also highly phase-locked, they could confound our analysis of a phasic modulation of decoding accuracy. For this reason, all subsequent decoding analyses excluded data in the time window of the main ERF components, and we only used data from interval 400-1,000 ms after the onset of the stimuli. In order to test if these data were usable to test our hypothesis (which would rely on subsets of these data), we first decoded the stimulus orientation from the full 600 ms interval. As in all decoding analyses, this was performed separately for attend-left and attend-right trials. We could reliably decode the orientation of the attended stimulus from the MEG data ( Figure 5) with an accuracy of 70% (SD = 1.8%, CI = 68.8%-71.1%) for attend-left trials, which is higher than expected by chance (non-parametric permutation test for observed vs. shuffled condition labels, p < .001). The mean accuracy for attendright trials was also 70% (SD = 1.8%, CI = 68.9%-71.2%, p < .001). The orientation of unattended trials could reliably be decoded too (unattend-left: accuracy = 70 ± 2.7%, CI = 67.9%-70.5%, p < .001; unattend-right: accuracy = 69 ± 2.1%, CI = 68.6%-72.0%, p < .001).
We ascribe these effects to the difference in the cortical processing of gratings with different orientations. However, an alternative explanation could be subtle, yet systematic differences in eye position and/or eye movements between conditions. This is because eye movements affect the MEG signal, and above chance, decoding could thus be a trivial result of eye movements (Thielen et al., 2019). Indeed, we were able to decode stimulus orientation with above-chance accuracy from just the eye tracker data ( Figure 5), but the accuracy was much lower than for the MEG data. The average accuracy was 53% for attend-left (SD = 1.8%, CI = 52.3%-54.5%, p < .001) and 52% for attend-right trials (SD = 1.3%, CI = 51.4%-52.9%, p < .001). This indicates that there is a bias in gaze depending on the stimulus orientation. We conducted a control analysis in order to exclude the possibility that the above-chance decoding of MEG data was exclusively a result of either eye movement induced artifacts, or because the visual stimulus ended up in different parts of the retinotopic visual cortex as a consequence of differences in gaze. We trained an SVM to distinguish orientation based on only the eye tracker data and evaluated this at the single-subject level. Next, two pseudo-observations (i.e., average of five trial-time points, see Section 2.3.7), one for each class, with the largest distance from the decision boundary, were removed. This was performed separately for each fold. After this, the remaining data were repartitioned into training/test data. This procedure was repeated until the accuracy over 10-folds did not surpass the chance level five times in a row. On average, the percentage of pseudo-observations that had to be removed from the data in order for the accuracy to drop to chance level was 3.3% (SD = 2.7%). The same trials were then discarded from the MEG data, which were subsequently used in another decoding analysis. The average decoding accuracy from the MEG data remained at 70% (SD = 2%) for both attend-left and attend-right trials, and did not statistically differ from the F I G U R E 4 Source of induced gamma activity. Stimulus-induced gamma power increase originated from occipital areas, mostly in V1-V2, and close to the midline. Maximum power increase for all subjects is denoted by white circles | van ES Et al. accuracy level before the removal of pseudo-observations. Table 1 in the Supporting Information lists the decoding performance at the single-subject level. These results show that even when discarding the trials that are most discriminating based on eye movements, it is still possible to decode stimulus orientation from MEG data with high accuracy. This indicates that the successful decoding of stimulus orientation in our data does not rely on differences in gaze or eye movements.

| Visual representations are not periodically modulated by oscillatory activity in the visual cortex
Now that we established that it is possible to decode stimulus orientation information from the MEG signal, we can look into fluctuations in decoding accuracy. Specifically, we hypothesized that decoding accuracy depends on the theta/alpha phase of ongoing activity in relevant brain areas. Specifically, we expected that the phase of brain activity in the regions where the stimulus is actively processed would influence decoding performance. Since gamma synchronization is a proxy for stimulus processing, we used the location of the maximum increase in induced gamma power to provide the frequency-specific phase signal that might modulate decoding performance. All data (i.e., from either attend-left or attend-right trials) were binned according to 18 phase bins, and the orientation of the attended stimulus was decoded from the MEG data separately for each phase bin. This was performed independently for frequencies in the theta, alpha, and beta band (4-30 Hz), in steps of 1 Hz. Resulting from this analysis were 18 accuracy scores, one for each phase bin. A cosine function was fit to these scores to estimate the strength of phasic modulation (Figure 6a). This was tested against a permutation distribution in which the phases within a trial were shifted randomly, while keeping the autocorrelation structure intact (see Section 2.3.7). In contrast to our hypothesis, the phasic modulation in the observed data was not significantly larger than that of the permutation distribution. The largest effect in attend-left trials was present at 4 Hz, with an average modulation strength of 0.98% (SD = 0.43%, CI = 0.71%-1.3%, p = .33, uncorrected, BF = 0.36). There was also a peak at 11 Hz (M ± SD = 0.89 ± 0.52%, CI = 0.56-1.2; Figure 6b). The largest effect in attend-right trials was 0.93% (SD = 0.38%, CI = 0.69%-1.2%, p = .85, uncorrected, BF = 0.47) at 9 Hz (Figure 6c). Although the spectral shape of the modulation depth showed a peak in the alpha band in both conditions, the observed peaks were not statistically significant, and post hoc computed Bayes factors revealed anecdotal evidence in favor of H0, even when BFs are inflated because of the selection of the data in the clusterpermutation test (see Section 2.4.3).

| Activity in the frontoparietal network periodically modulates visual representations
While the phasic modulation in the previous results showed a peak in the alpha band, it was not significantly larger than that of the permutation distribution. The phasic modulation was estimated based on the phase in visual brain areas. It is still possible that visual representations are phasically modulated, but that the modulatory signal has a different origin. Some previous reports suggest that the modulatory effect of theta/alpha phase on behavior originates from areas in the frontoparietal attention network, especially the frontal eye fields (FEF) and parietal cortex (Busch et al., 2009;Fiebelkorn et al., 2018;Gaillard et al., 2020;Helfrich et al., 2018). To investigate this, we performed an additional set of analyses, now using as phase-defining signal source reconstructed activity from anatomically defined cortical parcels from FEF and parietal cortex, resulting in 36 brain parcels (Figure 2) in 17 frequency bins (4-20 Hz). We explored whether the phase of the signals from these parcels and frequencies modulated decoding performance. Thus, the analysis strategy used before was now repeated, independently for each parcel and frequency, i.e., the phase time course was estimated on a single parcel and used for subsequent binning in the decoding procedure. We found that phasic modulation was higher than expected by chance in both attend-left and attend-right trials. In attend-left trials, the right FEF most contributed to this difference at F I G U R E 5 Stimulus information of the attended stimulus can reliably be decoded from magnetoencephalographic (MEG) data (red) and, to a lesser extent, eye-tracking data (blue). Decoding was done separately for attend-left and attend-right trials. Violin plots display the probability density over subjects. The horizontal line denotes the group mean and the dots denote the individual subject accuracies. The semitransparent gray line shows the empirical chance level | van ES Et al. 3200 17-20 Hz (M ± SD = 1.0 ± 0.42%, CI = 0.74% -1.3%, cluster-corrected p = .002; Figure 7c). We made sure this was not driven particularly by the subject with the aboveaverage behavioral performance (i.e., due to a technical error, see Section 3.1), by repeating the group analysis while leaving out the subject in question. The results remained significant.
Upon the exploration of the data, the mean modulation strength was found to be highest in the right FEF at theta frequency (4 Hz, Figure 7a; M ± SD = 1.3 ± 0.67, CI = 0.92%-1.7%), and in the right parietal cortex in the alpha band (8-11 Hz; Figure 7b; at 9 Hz, M ± SD = 1.0 ± 0.48%, CI = 0.70%-1.3%). In attend-right trials a cluster in the left FEF at 10-18 Hz mostly contributed to the significant difference (M ± SD = 0.98 ± 0.45%, CI = 0.36%-1.6%, p < .001; Figure 7e). The mean modulation strength was also relatively high in the left parietal cortex in the theta band (4 Hz, Figure 7d; M ± SD = 1.4 ± 0.70, CI = 0.97%-1.8%) and alpha band (8-12 Hz; Figure 7f; at 10 Hz, M ± SD = 1.0 ± 0.53, CI = 0.7%-1.4%). The optimal phases for decoding differed between subjects, but were to a large part consistent over parcels and frequencies (see Figures S2 and S3). Note that phase polarities are ambiguous: bimodal distributions with a 180-degree phase difference between adjacent parcels can occur due to the polarity ambiguity in the principal component analysis (PCA) in the construction of parcel time courses (see Methods 2.3.5). The subjects with the largest modulation effects (Figure 7) seem to have a more consistent optimal phase within neighboring parcels and frequencies (for example, compare the modulation strength of subjects F I G U R E 6 Phasic modulation of decoding performance. (a) Schematic of the phasic modulation depth, i.e. the amplitude of a cosine fitted to the decoding performance over phase bins. (b and c) Modulation depth as a function of frequency for attend-left (b) and attend-right (c) trials. Observed data in red; random data in grayscale (mean and standard deviation over subjects). Individual effect sizes at the spectral peak are shown on the right of the spectrogram, with the group average in black | van ES Et al. S1 ("good" subject) and S4 ("bad" subject) in Figure 7d and their phase distributions in Figure S3a), which makes it less likely that phasic modulation in these regions is purely noise.

| No support for behavioral relevance of phasically modulated visual representations
Having established that the phase of ongoing theta/alpha activity in FEF and parietal cortex modulates the strength of cortical stimulus representation, we next investigated whether the phase of these signals modulates behavioral performance as well. This has been reported in other studies (Helfrich et al., 2018). Specifically, we tested whether the phase in these areas at the moment before stimulus change (i.e., to which subjects had to respond) affected reaction times. A cluster-based permutation test within the same ROIs as before revealed no statistically significant phasic modulation of reaction times: for attend-left trials, p = .19 (corrected; M ± SD = 10.6 ± 5.1 ms, CI = 7.4-14 ms; BF = 3.5); for attend-right trials, p = .48 (corrected; M ± SD = 9.0 ± 4.7 ms, CI = 6.0-12 ms; BF = 2.1). Again, note that the post hoc F I G U R E 7 Phasic modulation of decoding performance in the frontoparietal network. The phasic modulation in attend-left trials (a-c), and in attend-right trials (d-f) is shown for all parcels, but only the parcels in the ROIs were tested statistically (see Figure 2) as described in Section 2.4.2. In a the anterior (A) to posterior (P) axis and left (L) and right (R) hemispheres are denoted. Spectrograms show the group-average modulation strength in the selected parcel in the green circle (red) and the average expected by chance (black). Shading reflects SD across subjects. Individual effect sizes at the spectral peak are shown on the right of the spectrogram, with the group-average in black | van ES Et al.
BFs are inflated because they are based on a selection of the data in which the difference is strongest. Upon the exploration of the data, the following observations were made (see Figure 8). In attend-left trials, the mean phasic modulation showed a peak in the alpha band in a right parietal parcel (at 9 Hz), with a mean modulation strength over subjects of 9.8 ms (SD = 4.7 ms, CI = 6.9-13 ms; Figure 8a, left), and in the left FEF (at 13 Hz), with a mean modulation strength of 11.3 ms (SD = 5.7 ms, CI = 7.7-15 ms; Figure 8a, right). In attend-right trials, the modulation strength did not show clear peaks, but was highest in the right parietal cortex in the alpha band (13 Hz), with a mean modulation strength over subjects of 9.2 ms (SD = 4.9 ms, CI = 6.2-12 ms; Figure 8b). An additional post hoc analysis revealed no statistically significant phasic modulation on behavioral accuracy (for attend left trials: M ± SD = 2.3 ± 1.1%, CI = 1.6%-3.0%, p = .75; BF = 3.5, and attend right trials: M ± SD = 1.9 ± 0.9%, CI = 1.3%-2.5%, p = .46; BF = 1.3).
Our results provide first, moderate evidence that stimulus information can be decoded with higher accuracy at particular phases of the theta/alpha rhythm in the frontoparietal network contralateral to the attended hemifield. This is consistent with the idea that visual perception is not continuous but to some extent discrete. The relevance of this phase for behavior (i.e., the response speed) was not supported by the current data.

| DISCUSSION
In this study, we investigated the presence of rhythmicity in visual representations during sustained spatial attention. Psychophysical and neural evidence suggest that (visual) perception is not continuous but discrete, as a consequence of discrete rhythmic attentional sampling. We reasoned that if this is the case, neural representations are also expected to fluctuate, i.e., the strength of a neural representation should depend on the phase of ongoing activity. This was tested by decoding stimulus orientation from human brain signals, as a function of the phase of the ongoing activity. We found that oscillatory activity in the visual cortex does not phasically modulate decoding accuracy. However, the frontal eye fields (FEF) and parietal cortex contralateral to the attended stimulus did phasically modulate decoding accuracy in the theta and alpha bands. The phasic modulation of visual representations is in line with discrete perceptual sampling during attention, though it is unclear at which step of visual processing occurs. The behavioral relevance of the phase of ongoing activity in this network was not confirmed, but trends in the phasic modulation of reaction times were in line with such modulatory effect. The current study was of limited sample size (N = 10), which could have affected the results. Therefore, both significant and nonsignificant p-values should be interpreted with caution, and these results should be confirmed in a study with larger sample size.
The phasic modulation of decoding performance was mostly present in the theta and alpha bands. These effects were band-limited and in no occasion did the effect extend to the entire investigated frequency range. This suggests that the above chance modulation effects were not a consequence of a bias in the permutation test or of other confounding effects. Effects in these frequency bands are also in line with the perceptual sampling literature in which found correlations between behavioral performance and the phase of ongoing activity were found. Busch et al. (2009) showed that the phase of the frontal theta/alpha rhythm (6-12 Hz) predicts whether a threshold stimulus is perceived or not. Similarly, Fiebelkorn et al. (2018) and Helfrich et al. (2018) showed that perceptual outcome depends on the theta phase in the frontoparietal network (including FEF and parietal areas) and that this is accompanied by increases in cortical excitability during "good" theta phases. From the current study, it is not clear whether phasic modulation in the theta and alpha band reflect the same or different neural processes, but evidence from the phasic modulation of behavioral performance suggests the presence of either rhythm might be task-dependent and rely on the number of objects that have to be monitored (Fiebelkorn et al., 2013;Holcombe & Chen, 2013;Landau & Fries, 2012). This would mean that the theta and alpha band effects are not necessarily related to different processes per se, but rather to temporal limits of attentional selection. Alternatively, Fiebelkorn and Kastner and colleagues Fiebelkorn et al., 2018) proposed that the two frequency bands do represent different processes. In short, frontal theta organizes two rhythmically alternating attentional states, one state with enhanced visual processing, accompanied by an FEF dominated beta increase and a LIP dominated gamma increase, and one state with attenuated sensory processing, possibly in the anticipation of an attentional shift. This second state would be accompanied by an increase in alpha activity in LIP and a decrease in gamma activity. The FEF might thus specifically be involved in (the attenuation of) exploration of space, while the parietal cortex modulates processing at the attended location. Gaillard et al. (2020) found support for this in a study with a cued target-detection task in macaques. They successfully decoded the location of the attentional spotlight from the prefrontal cortex (PFC) and observed that decoding accuracy fluctuated with 7-12 Hz rhythmicity.
While based on the current study alone we cannot disentangle the task-dependent involvement of theta and/or alpha rhythms for attentional processes, we did observe slight differences in the spatial locations of the theta and alpha effects. In attend-left trials, phasic modulation was strongest in the FEF in the theta band, and in the parietal cortex in the alpha band, in line with the proposed model by Fiebelkorn and colleagues. In attend-right trials, phasic modulation was strongest in the parietal cortex in the theta band, and parietal cortex and FEF in the alpha band, which fit their predictions.
Future research should thus disentangle the different contributions of the frontal and parietal cortex in rhythmic sampling. For example, one could conceive of an experiment in which the setup of the Gaillard et al. is combined with the approach in the current study. If it is possible to decode stimulus information and the position of the attentional spotlight on the same trial, we would expect both to be modulated by theta/alpha phase, but they would rely on different origins (e.g., parietal vs. prefrontal cortex) or have different optimal phases.
We did not observe a phase-dependent modulatory effect when using visual cortical activity as the phase delivering signal, which is at odds with the previous literature. Landau et al. (2015) found that that the sampling of two behaviorally relevant stationary grating stimuli occurred at 4 Hz each and that this modulation was present at cortical locations with maximal visually induced gamma-band activity. The current study used a very similar experimental paradigm (albeit with drifting instead of stationary gratings, and a different attentional manipulation), and it is likely that the distributed activity in early visual areas played a major part in successfully decoding stimulus orientation. Primary visual cortex is most sensitive to local contrast differences like oriented lines (Hubel & Wiesel, 1962), and empirical and modeling work from Cichy et al. (2015) confirmed it is likely that orientation information is decoded from this region with MEG (Stokes et al., 2015). The discrepancy in the locus of the phasic modulatory signal requires further investigation. It should especially be studied whether perceptual sampling in visual areas is always under control of the frontoparietal attention network, or whether it can be controlled locally. Some evidence for the former comes from Popov et al. (2017), who found that processing in the visual cortex (as measured by gamma power) was a function of the attentional modulation of posterior alpha power, while the latter was Granger caused by the FEF. Activity in the FEF itself was not modulated by attention. Consequently, in the current study as well, visual processing might be modulated by the FEF without clear changes in spectral power in that region ( Figure 2). There are a number of potential reasons why the relationship between theta/alpha phase and decoding accuracy was not observed from a source in the visual cortex. It could be that the induced visual cortical activity reflects multiple generators of oscillatory activity with heterogeneous phase relationships (Maris et al., 2016), thus prohibiting the estimated phase values to actually reflect the physiologically relevant phase. This possible explanation is supported by the presence of multiple generators of posterior alpha with different sensitivities to different visual | van ES Et al.
locations (Popov et al., 2019). Another option is that visual representations might not be modulated in the early visual cortex, but in downstream visual areas. This is in line with evidence from fMRI studies: Sprague and Serences (2013) showed that attention increases the amplitude of stimulus representations especially in higher visual areas. Moreover, orientation-selective representations are not only present in visual areas, but also in parietal and frontal areas including the FEF, and these areas are all modulated by attention (Ester et al., 2016).
We did not find any significant modulatory effects of neural oscillatory phase on behavior, in contrast to earlier findings. Studies relating behavioral performance to the phase of ongoing activity typically use binary performance measures, e.g., hits versus misses (Busch et al., 2009;Busch & VanRullen, 2010;Fiebelkorn et al., 2018;Helfrich et al., 2018). Perhaps modulatory effects in these performance measures can be detected with higher sensitivity than a continuous variable like reaction time. Yet, previous studies have reported positive outcomes relating the alpha phase to reaction times (Callaway & Yeager, 1960;Drewes & VanRullen, 2011;Lansing, 1957). These studies did use easier tasks, as indicated by a lower mean reaction time, and the latter study showed that in a more demanding task and more cognitive responses the effect of phase was less pronounced. Thus, it is possible we did not find an effect because the effect size is too small in our task. Still, our study is not the first to find a negative outcome in relating the theta/alpha phase to behavioral performance (Benwell et al., 2017;Ruzzoli et al., 2019). If the phasic modulation of visual representations is not behaviorally relevant, it questions the validity of this potential attentional mechanism. Future studies should therefore explore the limits of phasic effects on reaction times and other behavioral measures.
Multivariate pattern analysis (MVPA) has been used before to investigate how neural representations might be related to the alpha rhythm (Foster, Bsales, et al., 2017;Foster et al., 2016;Foster, Sutterer, et al., 2017;van Moorselaar et al., 2018;Samaha et al., 2016). In those studies, neural representations of the locus of spatial attention were decoded from the topographical distribution of spectral power. The location could be decoded specifically from alpha band power, both during attention selection and working memory (WM) maintenance, even when the spatial position was irrelevant to the WM task. This shows that there is a link between the spatial distribution of alpha-band activity and the representation of space, and one might ask whether and how this is related to the modulation of neural representations by the alpha phase in the current study. We also used a covert spatial attention task, and the attentional modulation index (AMI) showed lateralization in posterior alpha power as a function of the direction of spatial attention (Figure 3), which confirms a link between alpha activity and the locus of spatial attention. However, the stimulus feature (i.e., orientation from the attended stimulus) we decoded from the MEG data was independent of its spatial location (i.e., only the orientation of the attended stimuli were compared when the attended stimulus was in the same hemifield). In fact, all attend-left and all attend-right trials were analyzed separately in order to ensure that the decoding of stimulus orientation was not confounded by spatial attention. Furthermore, the topographical distribution of alpha activity did not play any part in the decoding analysis, since the phase binning was based on a single and the same alpha source for both orientations. Therefore, it is unlikely that the representation of space in a signature pattern of alpha-band activity contributed to, or confounded, the phasic modulation of neural representations.
Recently, a number of studies have demonstrated confounding effects of eye movements on decoding performance from the M/EEG data (Mostert et al., 2018;Quax et al., 2019;Thielen et al., 2019). Because eye movements affect the MEG signals, a systematic difference in gaze position between conditions could result in above-chance decoding performance independent of differences in the true underlying neural representation. We found that it is possible to decode stimulus orientation reliably from the eye tracker data only, which reinforces this concern. In order to make sure that the MEG decoding was not a trivial result of eye movements or gaze position, we investigated whether orientation could still be decoded from the MEG data after the removal of the trials that could distinguish orientation based on gaze. Removal of these trials resulted in chance level accuracy based on the eye-tracker data, but did not affect the accuracy based on the MEG data, which remained highly above chance level. Another concern could be the contamination of the decoding performance by evoked activity. Temporal MEG decoding performance is highest at the peak of evoked activity Cichy et al., 2015;Pantazis et al., 2018;Ramkumar et al., 2013). At the same time, these peaks trivially have a consistent phase over trials, which could lead to a trivial phasic modulation in decoding performance. We ensured that evoked activity did not affect estimates of phasic modulation effects in decoding by only using the data for the decoding analysis after the offset of the evoked activity, which was not phase-locked (data not shown).
Overall, this exploratory study shows that visual representations are modulated by the phase of ongoing activity. In particular, it indicates that the representation of visual input is not constant over time, but depends on the phase of the theta/alpha rhythm in the frontoparietal attention network. Further research should investigate the potentially differential roles of the frontal and parietal parts of this network, the different roles of frequencies in the theta and alpha band, and | van ES Et al.
the behavioral relevance of such modulatory effects on neuronal representations.

DATA ACCESSIBILITY STATEMENT
All analysis code can be accessed at GitHub (https://github. com/Donde rs-Insti tute/phase code). The corresponding data will become available at https://doi.org/10.34973/ g69b-ce21 after the publication of the manuscript.