Dynamics of retinotopic spatial attention revealed by multifocal MEG

Visual focal attention is both fast and spatially localized, making it challenging to investigate using human neuroimaging paradigms. Here, we used a new multivariate multifocal mapping method with magnetoencephalography (MEG) to study how focal attention in visual space changes stimulus-evoked responses across the visual field. The observer's task was to detect a color change in the target location, or at the central fixation. Simultaneously, 24 regions in visual space were stimulated in parallel using an orthogonal, multifocal mapping stimulus sequence. First, we used univariate analysis to estimate stimulus-evoked responses in each channel. Then we applied multivariate pattern analysis to look for attentional effects on the responses. We found that attention to a target location causes two spatially and temporally separate effects. Initially, attentional modulation is brief, observed at around 60–130 ms post stimulus, and modulates responses not only at the target location but also in adjacent regions. A later modulation was observed from around 200 ms, which was specific to the location of the attentional target. The results support the idea that focal attention employs several processing stages and suggest that early attentional modulation is less spatially specific than late.


Introduction
Covert spatial attention, that is, attention to a certain region in visual space, is one of the fundamental visual functions. Attentional control allows the visual system to deploy limited higher-level processing resources to the task at hand. Attention to a spatial locus improves various behavioral and psychophysical measures such as accuracy and reaction times (Posner, 1980), perceived contrast (Carrasco, Ling, & Read, 2004) and signal-to-noise ratio of sensory readout (Dosher & Lu, 2000). Powerful demonstrations such as inattentional blindness (Simons & Chabris, 1999) show that even large changes in a visual scene can go unnoticed if they take place outside the focus of attention. Covert spatial attention that operates without eye movements is often schematized as a 'spotlight' in visual space that determines what parts of the input stimuli have access to processing in higher-level cortical areas (Desimone & Duncan, 1995). Moreover, attentional processes have characteristic time courses: refocusing endogenous, voluntary spatial attention typically takes about 300 ms, whereas stimulus-driven, involuntary exogenous attention is faster and peaks around 100 ms (for a review, see Carrasco, 2011). Further, successfully attending to a target leads to an attentional blink, a period of decreased sensitivity around 200-500 ms after target detection (Raymond, Shapiro, & Arnell, 1992). In summary, spatial attention has multiple roles in cognition and may in fact consist of several subprocesses, i.e., amplification of responses to attended stimuli and controlling access to higher-level processing, which all have unique time courses.
Covert spatial attention is both fast, operating on a sub-second timescale, and spatially focused, which makes it challenging to investigate using human neuroimaging paradigms. Attention effects have been found in early visual areas like primary visual cortex (V1), but it is unclear whether these effects are feedforward or feedback from higher-level visual areas. Functional magnetic resonance imaging (fMRI) has shown that covert attention causes retinotopically specific increases in BOLD responses in several visual areas, including V1 (see e.g. Brefczynski & DeYoe, 1999), and increases the amplitude of stimulus representations, as estimated from multi-voxel analyses, especially at higher visual cortices (Sprague & Serences, 2013). However, based on fMRI results alone, it is difficult to disentangle the input modulation from possible delayed feedback effects from higher areas. Whereas fMRI is limited by poor temporal resolution, magneto- and electroencephalography (M/EEG) can track cortical activity at a millisecond timescale.
M/EEG attention studies have shown that covert attention modulates early event-related responses (see e.g. Di Russo, Martínez, Sereno, Pitzalis, & Hillyard, 2002; Hillyard & Anllo-Vento, 1998) and alpha frequency activity (see e.g. Foxe & Snyder, 2011), and increases the gain of spatially tuned population responses (Foster, Thyer, Wennberg, & Awh, 2021). However, several key questions on the spatio-temporal properties of covert attention remain open. Some studies (see e.g. Luck, 1995) suggest that the spatial distribution of attention depends on time, and have used the size of the attentional focus to identify attentional subprocesses, but these studies have not attempted to systematically measure attentional modulation across the visual field. Moreover, there is no consensus on the earliest processing stage at which attention modulates sensory input. Whereas animal studies with macaques have shown that attention effects are already present in the primary visual cortex and even in the lateral geniculate nucleus (McAlonan, Cavanaugh, & Wurtz, 2008), it is not known how well this finding generalizes to humans: clear attention effects have been found in extrastriate cortex, but whether the first cortical feedforward sweep is affected remains debated (Alilović, Timmermans, Reteig, Van Gaal, & Slagter, 2019; Baumgartner, Graulty, Hillyard, & Pitts, 2018).
Compared to fMRI studies of visual cortical representations, the spatial resolution of M/EEG is commonly thought to be severely limited. Spatial resolution of MEG is often defined as the ability to localize active cortical sources through inverse modeling. Visual cortex with its complex geometry and several concurrently active sources poses a challenge here. Moreover, because of cortical folding, MEG is thought to have poor sensitivity for neural signals originating from early visual cortices. This problem is evident with spatially extended stimuli, where synchronous sources at opposite banks of the calcarine sulcus may result in cancellation of the evoked neuromagnetic fields (Ahlfors et al., 2010; Kupers, Benson, & Winawer, 2021). With spatially focal stimuli, however, MEG may be able to spatially resolve signals better than EEG (for a review, see Baillet, 2017). In this study, we use a novel combination of a multifocal retinotopic mapping MEG technique and multivariate pattern analysis at the MEG sensor level to overcome some of these limitations. The multifocal technique (Henriksson, Karvonen, Salminen-Vaparanta, Railo, & Vanni, 2012; James, 2003) works by stimulating the visual field using multiple, time-varying stimulus sequences so that temporal sequences at every location are statistically independent. We used a behavioral attention task concurrently with the multifocal MEG mapping stream so that the participants' task was to indicate a slight and rapid change in color in one of the stimulus regions: either at fixation, at a region located in the left lower visual field, or at a region located in the right lower visual field. The experiment was done in blocks, where the location of the attentional task was varied. We applied multivariate pattern analysis to look for attentional effects on the stimulus-evoked responses. Our approach enables estimation of the attentional modulation in every stimulated part of the visual field with high temporal resolution, and thus investigation of its temporal dynamics across the visual field.

Subjects
MEG data were collected from twenty subjects (15 females; mean age 22, age range 19-29). Four additional subjects were recruited but they dropped out from the study either before or during the data collection, due to dental braces causing strong artifacts in the measured MEG signals. Ethical approval for the research was obtained from the Aalto University Ethics Committee. Subjects gave written informed consent before participating in the study.

Multifocal stimuli and experimental design
The multifocal stimulus consisted of 3 annuli, each having 8 sectors (see Fig. 1 and Supplementary Video S1). The inner radius was 0.5 and outer radii were 2.3, 4.7 and 8.4 degrees. We used pattern-onset stimulation, where the checkerboard stimulus pattern appears abruptly and then fades out as a linear function of time. The 24 distinct regions (Fig. 1A) were stimulated asynchronously and in parallel, so that the stimulus regions that were 'on' during each trial were determined using orthogonal stimulation sequences (James, 2003; Vanni, Henriksson, & James, 2005). We used a quadratic residue binary sequence, as originally proposed by James (2003) for multifocal EEG and later adapted for multifocal fMRI by Vanni et al. (2005). Trial onsets were randomized so that the trial-onset asynchrony varied between 217 and 317 ms. Each run contained 469 trials and lasted approximately 2.5 minutes. Before the onset of the multifocal stimulus sequence, a full-field checkerboard stimulus was flashed 20 times to habituate the response and to prepare the subject for the task.
Data were collected during three different tasks: (1) color change detection task at the fixation point, (2) color change detection task at a target region in the right visual field, and (3) color change detection task at a target region in the left visual field (Fig. 1A). Stimuli were identical in all tasks while task demands were varied. During a color change, an entire 4 × 4 checkerboard pattern corresponding to one stimulus region in the multifocal stimulus, either in the left or right hemifield, was overlaid with green color, lasting for 800 ms. Subjects were instructed to respond with a finger lift as soon as they noticed a color change in the attended location. In the fixation task, the fixation cross changed its color from red to green. The left and right target regions were overlaid with a light green color plate to minimize spatial uncertainty about the attention target location. Target changes were also present in the ignored attentional location, but subjects were instructed to ignore the color changes when not specifically instructed to pay attention to them. In other words, the visual stimulus was exactly the same in the attend left and attend right task conditions. The subjects also responded by lifting the same finger whatever the task. The task type (attention, fixation) and the location of the attention task were communicated before the start of each experiment run.
Seven runs were acquired for each of the three tasks. Order of task runs was interleaved. The fixation task was always the first, whereas the order of the left and right task runs was pseudo-randomized between the subjects. In addition, 1-minute rest MEG data were collected at the beginning of the measurement, in the middle, and at the end of the measurement. Eighteen of the 20 subjects completed all task runs (seven per task). With the remaining two subjects, we discarded one task run of each task type due to problems with data collection during at least one run.
The stimuli were presented on a semitransparent screen with a 3-DLP projector (Panasonic PT-D7700E) using PsychoPy version 1.82.01 (Peirce, 2009). Eye gaze was followed using an SR Research EyeLink 1000 system (SR Research Ltd., Ontario, Canada; sampling rate 500 Hz). The subjects had practised the tasks at the Aalto Behavioral Laboratory before the MEG data collection.

Eye movement controls
Eye tracking data were first drift corrected by subtracting the baseline, which was measured from the average gaze position between 1000 ms and 2000 ms during the full-field stimulus that was shown at the beginning of each run. Mean gaze positions in each condition are shown in Fig. 1B. The average gaze position across the subjects was < 0.2° from the centre of the screen in all conditions. Eye tracking data were partially or completely missing from 3 subjects because of technical errors.
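The baseline subtraction described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `drift_correct`, the `(time_ms, x_deg, y_deg)` sample layout, and the assumption that times are measured from the onset of the full-field stimulus are all hypothetical.

```python
from statistics import mean


def drift_correct(samples, baseline_start_ms=1000, baseline_end_ms=2000):
    """Subtract the mean baseline gaze position from every sample.

    samples: list of (time_ms, x_deg, y_deg) tuples (hypothetical layout),
    with time measured from the onset of the full-field stimulus.
    """
    # Collect the gaze positions that fall inside the baseline window.
    baseline = [(x, y) for t, x, y in samples
                if baseline_start_ms <= t <= baseline_end_ms]
    if not baseline:
        raise ValueError("no samples inside the baseline window")
    base_x = mean(x for x, _ in baseline)
    base_y = mean(y for _, y in baseline)
    # Express every sample relative to the baseline position.
    return [(t, x - base_x, y - base_y) for t, x, y in samples]
```

After this correction, a constant tracker offset maps to zero, so the remaining deviation can be compared directly against the centre of the screen.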

MEG data acquisition and preprocessing
MEG data were recorded in a magnetically shielded room with a whole-scalp 306-channel MEG device (Elekta Neuromag, MEGIN Oy, Helsinki, Finland) at the MEG Core (Aalto NeuroImaging, Aalto University School of Science). The device comprises 102 triple-sensor elements, with one magnetometer and two orthogonal planar gradiometers at each location. The signals were sampled at 1000 Hz using a recording passband of 0.03-330 Hz. The position of the subject's head with respect to the MEG sensors was tracked throughout the experiment using five head position indicator coils. Horizontal and vertical electro-oculograms were recorded with the same recording passband and sampling rate as applied for the MEG.

Fig. 1. Multifocal MEG stimulus and goodness-of-fit analysis of the GLM model. A) The visual field was divided into 24 regions that were stimulated in parallel using a multifocal stimulus sequence. An actual insert of the stimulus is shown with approximately half of the stimulus regions 'on' at this time instant (see also Supplementary Video S1). Data were collected while the subjects directed their attention either at the fixation cross ("fix"), a target location in the left visual field ("left"; shown here with red outline) or a target location in the right visual field ("right"; shown here with blue outline). B) The stimulus extended to 8.4 degrees radius. Mean gaze locations with standard deviations are shown for the different task conditions. C) The evoked responses were extracted from the continuous multifocal MEG data using a general linear model with finite impulse response (FIR) basis functions. The goodness of the response estimation was evaluated by calculating cross-validated (cv) R² values between the estimated model and an independent half of the data that was not used during the parameter estimation. Here are shown the cv-R² values averaged across each gradiometer pair and across the 20 subjects. The best model fits are obtained for occipital gradiometers. D) Based on the cv-R² values with different lengths of the FIR basis set (from [-50 -49] ms to [-50 1000] ms), the optimal length was estimated to be from -50 ms to 450 ms (dotted line) from stimulus onset. Here the average cv-R² values are shown for the gradiometer with the best performance, defined separately for each individual (location shown in the sensor layout; marker size reflects N of subjects). Line colors reflect data collected during the different task conditions. E) The cv-R² values averaged across all gradiometers and subjects are shown as a function of the length of the FIR basis set.
The MEG data were preprocessed using spatiotemporal signal-space separation (Taulu & Simola, 2006) implemented in the MaxFilter software (Elekta Oy, Helsinki, Finland) to suppress magnetic interference from external sources and to compensate for head movement. Independent component analysis-based eye blink artifact correction was applied to the data and the data were low-pass filtered at 45 Hz using the Fieldtrip software (Oostenveld, Fries, Maris, & Schoffelen, 2011). For multivariate analysis, the data were spatially whitened based on the noise covariance matrix estimated from the 1-minute rest MEG data collected before the task runs. MEG sensors have overlapping sensitivity profiles for cortical sources and thus the measured signals contain redundant data. In whitening, the data are projected to an orthonormalized subspace without losing relevant information. In practice, whitening brings the magnetometer and gradiometer data on the same scale and reduces redundancy and dimensions in the MEG signals.

Univariate analysis: estimation of evoked responses
The evoked response waveforms were estimated from the continuous multifocal MEG data using a general linear model (GLM) approach originally proposed for multifocal EEG data (James, 2003). We fit a finite impulse response (FIR) model

$$ y = Xb + e $$

where y contains the continuous MEG measurement data (number of time points × number of MEG channels), X is the design matrix (number of time points × number of independent variables), b contains the regression coefficients (number of independent variables × number of MEG channels), and e contains the residuals. The data y is modelled as a linear combination of predictors that are stored in the design matrix X. For the estimation of the univariate evoked responses, the data y contained all MEG data. For the multivariate analyses, the data were partitioned based on the experimental runs to allow for cross-validation. The design matrix X contained the stimulus onsets for the 24 stimulus regions for each task (one column for each stimulus region of each task condition) and the set of time-delayed versions of these impulses (separate columns), forming the FIR basis. The length of the FIR (i.e., range of delays) defines the time window of the estimated response waveform. For the main analysis, the FIR basis covered the time window from -50 to 450 ms from stimulus onset. Separate constant regressors for each MEG run were included to estimate the mean level of the signal. The regression coefficients in b were solved by minimizing the sum of the squared residuals e. The estimated regression coefficients b could be reorganized into a multidimensional array with dimensions 72 × 501 × 306 (24 stimulus regions × 3 tasks, 501 samples in the time window, 306 MEG channels) to ease the visualization of the estimated evoked response time courses (second dimension) or topographies (third dimension). For spatially whitened data, the dimensions were 72 × 501 × 49. The data analysis was implemented in Matlab (MathWorks, Natick, MA).

Fig. 2. Multivariate pattern analysis of MEG data. A) MEG signals were recorded using 306 sensors. The data were spatially whitened as a preprocessing step for the multivariate pattern analysis. Pattern vectors corresponding to specific stimuli were extracted at each time point. B) For each pair of 24 stimulus regions, we compared the corresponding pattern vectors to determine how well the different parts of the visual field were discriminated by MEG signals. The discriminability was evaluated using the cross-validated squared Mahalanobis distance estimate (d). Distance estimates between each stimulus region pair were collected into a 24 × 24 representational dissimilarity matrix (RDM). Only the lower part of the symmetric matrix is shown. C) To determine the effect of attention, we compared the pattern vectors corresponding to the same region in the visual field during two different task conditions.
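To make the structure of the FIR design matrix in this section concrete, the following minimal sketch builds one column per (condition, delay) pair from lists of stimulus onsets. It is an illustration in Python rather than the Matlab code used in the study; the function name `fir_design_matrix` and its arguments are hypothetical, onsets are assumed to be given in samples at 1000 Hz (so one sample equals 1 ms), and the per-run constant regressors are omitted for brevity.

```python
def fir_design_matrix(onsets_by_condition, n_samples, first_lag=-50, last_lag=450):
    """Build an FIR design matrix as a list of rows.

    onsets_by_condition: one list of onset sample indices per condition
    (hypothetical layout). Each condition contributes one column per lag,
    and the column for a given lag holds the onset impulses shifted by it.
    """
    lags = range(first_lag, last_lag + 1)  # -50..450 inclusive gives 501 lags
    n_lags = len(lags)
    n_cols = len(onsets_by_condition) * n_lags
    X = [[0.0] * n_cols for _ in range(n_samples)]
    for cond, onsets in enumerate(onsets_by_condition):
        for onset in onsets:
            for k, lag in enumerate(lags):
                t = onset + lag
                if 0 <= t < n_samples:  # drop impulses shifted outside the data
                    X[t][cond * n_lags + k] = 1.0
    return X
```

With such a matrix, an ordinary least-squares solver applied to y = Xb + e returns, for each condition, 501 coefficients per channel, which together trace out the estimated response waveform even when responses to successive stimuli overlap in time.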
The goodness-of-fit of the model was evaluated by

$$ R^2 = 1 - \frac{\sum (y - Xb)^2}{\sum (y - \bar{y})^2} $$

where b are the estimated regression coefficients and ȳ is the overall mean of y. The cross-validated R² was evaluated by dividing the data into two sets, fitting the coefficients on one half of the data, and evaluating the goodness-of-fit using those parameters on the other half of the data. A similar approach was used to evaluate the effect of the length of the FIR basis on the goodness-of-fit of the GLM model. Too short an FIR basis may fail to estimate the response waveform correctly, whereas too long an FIR basis may result in overfitting (Fig. 1C-E). Based on the cross-validated R² values, the time window of [-50 450] ms for the FIR basis set was selected for further analysis.
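The cross-validated goodness-of-fit can be sketched for a single channel as below. This is a simplified illustration, not the original analysis code: the helper names `predict` and `r_squared` are hypothetical, and the mean in the denominator is assumed to be taken over the held-out data.

```python
def predict(X, b):
    """Model prediction Xb for one channel (X: list of rows, b: coefficients)."""
    return [sum(x * coef for x, coef in zip(row, b)) for row in X]


def r_squared(y_observed, y_predicted):
    """Coefficient of determination, 1 - SS_residual / SS_total."""
    y_bar = sum(y_observed) / len(y_observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_observed, y_predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in y_observed)
    return 1.0 - ss_res / ss_tot


# Cross-validation: coefficients fitted on one half of the data (b_train)
# are scored against the other half (X_test, y_test).
X_test = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_test = [1.0, 2.0, 3.0]
b_train = [1.0, 2.0]  # made-up values standing in for a fit on the training half
cv_r2 = r_squared(y_test, predict(X_test, b_train))
```

Unlike the in-sample value, the cross-validated R² can become negative when the fitted model predicts the held-out data worse than its mean does, which is what makes it useful for detecting an overly long FIR basis.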

Multivariate analysis: cross-validated Mahalanobis distance
The discriminability of the estimated evoked responses was evaluated using the cross-validated squared Mahalanobis distance estimate

$$ d = (b_i - b_j)_A \, \Sigma^{-1} \, (b_i - b_j)_B^{T} $$

where (b_i − b_j) is the difference between the activity patterns (topographies) of conditions i and j, A and B refer to two independent divisions of the data, and Σ refers to the noise covariance matrix (Kriegeskorte & Diedrichsen, 2019; Nili et al., 2014; Walther et al., 2016). Activity patterns b_i and b_j refer here to the responses across the whitened sensor space at a time instant (Fig. 2). Leave-one-run-out cross-validation folds were used to create unbiased estimates (Arbuckle, Yokoi, Pruszynski, & Diedrichsen, 2019; Walther et al., 2016). Data from one of the seven experimental runs formed data partition B and the other six runs formed data partition A. Each run was assigned to data partition B once, and the results were averaged across the seven folds. The distances calculated at 1-ms time steps were temporally smoothed using a 20-ms centred moving average filter. The distance estimates were calculated both between the response patterns to different visual field regions (Fig. 2B) and between the patterns to the same visual field region across two different tasks (Fig. 2C). With these analyses we can address the questions of whether all multifocal regions evoke significantly different responses and whether the responses are significantly affected by the tasks.
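The leave-one-run-out distance estimate can be illustrated with the short sketch below. It rests on two simplifying assumptions that are not taken from the study: the patterns are already whitened, so that the noise covariance is the identity and the distance reduces to an inner product, and partition A is formed by averaging per-run patterns rather than by refitting the GLM on six runs. The function name `crossnobis_distance` is hypothetical.

```python
def crossnobis_distance(patterns_i, patterns_j):
    """Cross-validated squared distance between two conditions.

    patterns_i[r] and patterns_j[r] are the whitened pattern vectors
    (one value per channel) of conditions i and j estimated from run r.
    """
    n_runs = len(patterns_i)
    n_channels = len(patterns_i[0])
    fold_estimates = []
    for test_run in range(n_runs):
        train_runs = [r for r in range(n_runs) if r != test_run]
        # Partition A: pattern difference averaged over the training runs.
        diff_a = [
            sum(patterns_i[r][c] - patterns_j[r][c] for r in train_runs) / len(train_runs)
            for c in range(n_channels)
        ]
        # Partition B: pattern difference in the held-out run.
        diff_b = [patterns_i[test_run][c] - patterns_j[test_run][c]
                  for c in range(n_channels)]
        # With whitened data the covariance term drops out (identity matrix).
        fold_estimates.append(sum(a * b for a, b in zip(diff_a, diff_b)))
    return sum(fold_estimates) / n_runs
```

Because the two differences come from independent data, noise does not inflate the product: a pattern difference that flips sign between runs yields a negative estimate, whereas a difference that replicates across runs yields a positive one. This is why systematically positive values indicate a reliable difference.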

Behavioral results
A response between 200 ms and 1200 ms post target onset was defined as a valid hit, and a response outside this time window a false alarm. Average hit rates were quite high across the conditions (fixation: 0.98; target left: 0.87 [min: 0.71, max: 0.97]; target right: 0.88 [min: 0.66, max: 0.98]) as well as d' (fixation: 4.72; target left: 3.40; target right: 3.69). Average reaction time for a hit was 567 ms in the fixa-

Effect of spatially selective attention on retinotopic MEG response amplitudes
Using the multifocal stimulus design, the visual field was divided into 24 regions (Fig. 1A) that were stimulated in parallel using orthogonal stimulation sequences during the MEG data collection. The responses to the distinct visual field regions were estimated from the continuous MEG data using a general linear model approach. Fig. 3 shows examples of the estimated stimulus-evoked responses. The response topographies differed between individual subjects, likely reflecting the underlying individually unique anatomy of the cortical visual areas. Yet, within an individual, the topographies of the estimated responses for the different visual field regions appeared consistent with an underlying retinotopic organization (Fig. 3A).
During the experiment, the subjects directed their attention either to the fixation point, to a target location in the right visual field, or to a target location in the left visual field. The individual evoked responses, shown in Fig. 3B, suggest that attention affects the evoked multifocal MEG responses. However, the inter-individual variability of the evoked response shape makes any direct comparison challenging.
For a more comprehensive view, Fig. 4 shows grand-average waveforms for each visual field region during the different task conditions. The gradiometer pairs in the same location were first combined, after which the results were averaged across the 20 subjects. In the attended visual field regions, the response amplitudes were enhanced with attention. A modest attentional modulation was already present in the early MEG response peaking at around 70 ms, but the effect was not significant in the right target region. In the later response, starting from about 200 ms after stimulus onset (Fig. 4B, D), more prominent modulation was seen in both target locations. A slight modulation was also apparent in the contralateral visual quadrant, likely reflecting the subjects' inability to fully ignore the opposite target location. For the other visual field regions, however, the change in the spatial locus of visual attention had essentially no effect on the grand-average response amplitudes.

Spatially selective attention affects response discriminability
Averaging the responses across sensors and studying the grand-average responses (Fig. 4) reduces noise but also information in the MEG data. MEG responses to visual stimuli show substantial interindividual variability due to anatomical differences, and attentional modulation of responses may differentially affect the responses across multiple sensor locations. Hence, we next applied multivariate analysis to the MEG data, where we look for systematic differences between response patterns across multiple sensors during different stimulus or task conditions. We computed the cross-validated estimate of the Mahalanobis distance between each pair of responses (see Fig. 2); a systematically positive distance estimate indicates a reliable difference between the response patterns. This analysis addresses the question of whether there is information about the task conditions in the evoked MEG responses. First, however, we evaluate how well the different parts of the visual field were discriminated by MEG signals.
Fig. 5 shows the results as representational dissimilarity matrices (RDMs), where each entry reflects the distance estimate between the responses for two stimulus regions at a specific time point. The lower triangle of the 24 × 24 RDMs shows the average distance estimates (Fig. 5A); that is, how distinct the estimated response topographies are between two stimulus regions. The upper triangle shows the statistical significance of these distance estimates; that is, whether the topographies between two stimulus regions are significantly different. Results at different time points are shown separately in different rows. Results obtained during the different task conditions are shown separately in different columns.
At 45 ms from stimulus onset (Fig. 5B; see also Supplementary Videos 2-4), a subset of the visual field regions elicited significantly distinct responses from the other visual field regions. At 65 ms, the estimated responses were significantly distinct from each other for all 24 stimulus regions. This result was consistent across tasks (columns in Fig. 5B). Moreover, attending to a region in the visual field made the response to this attended location more distinct from the other visual field regions. This effect is visible from about 200 ms after stimulus onset in the RDMs (Fig. 5; see the arrows pointing at the attended stimulus regions), consistent with the univariate results shown in Fig. 4.
Next, to address the question of whether the attentional task affected the discriminability of the response patterns, the distance estimates corresponding to pair-wise distances between each region within a visual field quadrant were averaged. The results are shown in Fig. 6. Shifting attention from the fixation point to the target region in the left lower visual field had a significant effect on the response discriminability in the attended left visual field quadrant (Fig. 6B). Similarly, shifting attention from the fixation point to the target region in the right visual hemifield had a significant effect on the response discriminability in the attended right visual field quadrant (Fig. 6C). A similar, yet smaller, effect was observed in the opposite lower visual field quadrant, likely reflecting subjects' inability to fully ignore the opposite task region. In the upper visual field, the average distances, i.e., the average response discriminability, remained unaffected.
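The quadrant-wise averaging of pair-wise distances can be sketched as follows. The function name `mean_within_group_distance` and the way regions are indexed are hypothetical; which of the 24 regions belong to a given quadrant depends on the stimulus layout and is assumed here to be supplied by the caller (with 3 annuli of 8 sectors, a quadrant would contain 6 regions and thus 15 pairs, if sector borders coincide with the meridians).

```python
from itertools import combinations


def mean_within_group_distance(rdm, region_indices):
    """Average the RDM entries over all pairs of regions in one group.

    rdm: square matrix (list of rows) of distance estimates at one time point;
    only the lower triangle (row index > column index) is read.
    region_indices: indices of the regions forming the group, e.g. a quadrant.
    """
    pairs = list(combinations(sorted(region_indices), 2))
    if not pairs:
        raise ValueError("need at least two regions to form a pair")
    # For each pair (i, j) with i < j, the lower-triangle entry is rdm[j][i].
    return sum(rdm[j][i] for i, j in pairs) / len(pairs)
```

Applying the function to the RDM of every time point and task condition gives one discriminability time course per quadrant and task, which can then be compared between tasks.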
Attention modulated response discriminability in the attended visual field quadrant during both the later response (from about 200 ms post stimulus) and the early response (around 60-130 ms; Fig. 6B-C).

Dynamic spotlight of visual attention
Univariate grand-average waveforms only showed robust attentional modulation during the late response (Fig. 4), whereas multivariate pattern analysis on response discriminability was also able to reveal the influence of attention on the early response (Fig. 6). Finally, we look directly at the timing of attentional modulation on each visual field region using multivariate pattern analysis (Fig. 2C). Fig. 7 shows the average distance between each covert attention task and the fixation task. The discriminability at the attentional target region (Fig. 7B and D; see also Fig. 8B) again had two peaks, the first at around 60-130 ms and the later emerging around 200 ms. Fig. 7A shows the average distance estimate for each visual field region between attending the fixation and attending the target region in the left visual field. Fig. 7C shows the comparison between fixation and attending the target region in the right visual field. Discriminability across multiple regions in the visual field was affected by the spatial locus of attention during the early response, as indicated by the systematically positive distance estimates. During the later response, however, the effect was focused on the location of the attended region in the visual field. This is shown in greater detail in Figs. 8 and 9, which directly compare the left and right attention tasks. A widespread effect of attention can be seen in the visual field during the early response. The effect was not restricted to the attended region. Attention apparently modulates several regions along the same meridian (the same radial angle in the visual field), and the effect is strong near the fovea, where several regions in the lower visual field are modulated.
Since our multivariate pattern analysis is based on comparing responses in two tasks, and assuming that attention heavily modulated foveal regions in the fixation task, high discriminability near the fovea could also result from a decrease in attentional modulation when the left/right attention tasks were compared to the foveal fixation task. However, the effect survives when we compare the two attention tasks with each other (Figs. 8, 9), suggesting that foveal attentional modulation cannot solely be a byproduct of the fixation task. During the later response, attentional modulation was spatially specific to the attentional target region.
Taken together, the results suggest a dynamic spotlight of attention, where covert attention initially modulates sensory signals widely across the visual field, and later narrows to the target location in visual space (Figs. 7-9).

Discussion
This study used a novel combination of a multifocal stimulus technique and multivariate data analysis to enable high-resolution mapping of MEG responses across the visual field. A concurrent covert spatial attention task to two spatial locations then enabled us to investigate the spatio-temporal dynamics of covert attention.

Distribution of attention in the visual field
Spatial attention was found to have two spatio-temporally distinct effects in MEG responses, manifesting to some extent in univariate responses and more clearly in multivariate discriminability. The early modulation started from around 60 ms, was spatially broad in the visual space and had inwards asymmetry towards fixation: attention modulated not only the target location, but also nearby regions that are towards the fovea and/or share the same meridian. The effect was brief, lasting only up to around 130 ms. Similar spatial and temporal characteristics for early attentional modulation have also been found in previous multifocal EEG studies (Seiple, Clemens, Greenstein, Holopigian, & Zhang, 2002; Slotnick, Hopfinger, Klein, & Sutter, 2002), though with very different analysis methods and with only 3-5 individual subjects. fMRI studies have shown that enhancement of BOLD signal also spreads to representations of nearby regions in visual space (Brefczynski-Lewis, Datta, Lewis, & DeYoe, 2009) and has inward asymmetry (Puckett & DeYoe, 2015). Previous multifocal M/EEG studies have not investigated attentional modulation beyond around 200 ms. Here, we also found a later, more sustained modulation with onset at around 200 ms, where the modulation was specific to the target location with little modulation elsewhere in the visual field.

Fig. 8. Dynamic spotlight of attention: from broad to narrowly focused effect. Comparisons between left and right attention tasks. A) Effect of attending left vs. right on each visual field region. A positive distance estimate (d) indicates distinct response profiles between the task conditions (markers indicate FDR of 0.05; p values obtained from two-tailed signed-rank tests across the 20 subjects; FDR adjusted across time points and pairwise tests). The layout of the waveforms follows the layout of the stimulus regions in the visual field. The task had a significant effect on the responses of many visual field regions, but B) sustained, spatially specific effects of attention were observed in the target regions in the left and right hemifields. See Supplementary Figure S3 for a longer time-window.

Time course of attentional modulation in MEG
Previous M/EEG studies in humans have provided different accounts of how early spatial attention starts to modulate responses. Some studies (for a review, see Slotnick, 2012) suggest a later time course, starting after 100 ms post stimulus onset. In this account, the earliest modulation is associated with the P1 ERP, which is assumed to reflect extrastriate processing (Alilović et al., 2019; Baumgartner et al., 2018; Heinze et al., 1994; Martinez et al., 1999; Woldorff et al., 1997). However, other studies (Alilović et al., 2019; Kelly, Gomez-Ramirez, & Foxe, 2008; Mohr, Carr, Georgel, & Kelly, 2020; Slotnick, 2018; Zani & Proverbio, 2020) have provided evidence for an earlier onset of attention, suggesting that it could operate as early as 50 ms, modulating the early C1 ERP, which is assumed to originate in early visual areas, most probably V1. The early onset of the first attentional effect in our results is in line with the early modulation hypothesis. However, timing information alone is insufficient for separating the feedforward sweep from recurrent processing, as even the earliest responses can be affected by feedback connections (Hupé et al., 2001). Moreover, V1 and V2 have been shown to have very similar onset latencies in human EEG data (Ales, Carney, & Klein, 2010; Inverso, Goh, Henriksson, Vanni, & James, 2016).
While early attentional effects in the univariate responses were largely nonexistent in our data, the early effect could be reliably identified using multivariate analysis that combines information across measurement channels. This suggests that multivariate analysis that combines MEG responses across the sensors can achieve better sensitivity for small signal changes that are spread out across the sensors and may show up differently in individual subjects. Attentional changes in early M/EEG responses can be challenging to capture using traditional univariate ERP paradigms. Cortical folding causes attenuation and major individual differences in early visual cortex responses, and the C1 ERP paradigm may require individually tailored stimulus set-ups (Kelly et al., 2008; Slotnick, 2018). Still, C1 attention results can be difficult to replicate (Baumgartner et al., 2018). We also note that, since we show that early attentional modulation is spatially broad, the kind of large multifocal stimulus we used may be better suited to measuring it than the small gratings that are conventionally used.
In addition to the early, broad attentional modulation, a later effect was found from around 200 ms that was spatially specific to the target location. Several previous EEG studies have also provided evidence for two attention responses, where the first is spatially less specific and the later more specific to the target location. Studies investigating spatial attentional modulation in cuing and visual search tasks have found that modulation of the early EEG response (at about 60-100 ms and concurrent with the P1 ERP) is approximately as strong for the valid target location cue as for the neutral cue that did not indicate target location. On the other hand, attentional modulation of later EEG responses, at around 150 ms and associated with the N1 ERP, was found to be specific to the cued location (Luck, 1995; Luck & Hillyard, 1995; Luck et al., 1994). Similarly, Shioiri et al. (2016) used a fast RSVP task to compare attentional modulation of steady-state visual evoked potentials (SSVEPs), which presumably reflect early processing, and of the P3 ERP, which presumably reflects later processing. SSVEPs showed very broad spatial tuning, whereas attentional modulation in the P3 ERP was observed only at the target location, supporting a change in spatial cover. Visual search studies have also reported that a later ERP component, N2pc, with onset in the 250-350 ms time range, is specific to the target location, unlike the early P1 (Eimer, 1996; Luck, 1995).

Neural mechanisms of focal attention
Our results thus support the idea that the spatial cover of attention changes dynamically. Early modulation is short, spatially broad, and concurrent with the early cortical MEG response. As noted previously, EEG studies have not provided a consensus on whether attention modulates the initial cortical response to a stimulus. Many fMRI studies have shown that spatial attention modulates BOLD responses in V1 (Brefczynski & DeYoe, 1999; Gandhi, Heeger, & Boynton, 1999; Somers, Dale, Seiffert, & Tootell, 1999; Tootell et al., 1998). A popular hypothesis to explain this discrepancy has been to assume that the initial afferent V1 responses that drive the early M/EEG responses are not modulated by attention, but rather that attention acts at V1 through delayed feedback signals originating from higher visual cortices (Baumgartner et al., 2018; Di Russo, Martínez, & Hillyard, 2003; see also Muckli, 2010; Simola, Stenbacka, & Vanni, 2009). The time course of attentional modulation in our study, on the other hand, supports the idea that attention can modulate even the earliest cortical responses. In the early multifocal responses, attentional modulation is spread widely across the visual space. This suggests that the effect of attention on the initial responses, at least for this non-foveal target, has rather coarse spatial tuning.

Stages and models of attention
The later part of the attentional modulation, from around 200 ms, is more sustained and specific to the location of the target. Thus, attentional modulation becomes more focused with time. Several theories postulate that top-down attention may modulate the feedforward-sweep representation in early retinotopic cortices (see e.g. Lamme, Supèr, & Spekreijse, 1998). It is possible that the early spread of modulation is caused by inhibition of stimulus signals from regions close to the target, as predicted by models of biased competition (Desimone, 1998; Desimone & Duncan, 1995; Franconeri, Alvarez, & Cavanagh, 2013). One possibility is that once a competitive selection process has completed, any signal enhancement would be present only at the target region, which has access to higher levels of the processing hierarchy. Even though the exact mechanisms remain speculative, we note that the time course of the "late" response is consistent with the "attentional blink" that is typically observed around 200-500 ms after target detection (Raymond et al., 1992) and often interpreted as being caused by a higher-level access control process (Dux & Marois, 2009; Franconeri et al., 2013).

Spatial resolution of multifocal MEG
A secondary aim in this study was to investigate the spatial resolution of MEG-based multifocal retinotopic mapping. We were able to discriminate responses to each stimulated visual field region. The relatively small stimulus regions in the multifocal stimulus activate small cortical patches with less signal cancellation compared to spatially extended stimuli, where synchronous sources at opposite banks of the calcarine sulcus may result in cancellation of the evoked neuromagnetic fields (Ahlfors et al., 2010; Kupers et al., 2021). Moreover, the general linear modeling analysis takes into account the contribution of overlapping responses. Spatial resolution of M/EEG is typically approached by transferring the signals from sensor to source space via inverse modeling. Inverse modeling is highly ill-posed, with several source combinations able to produce the same field pattern. Retinotopy-constrained source estimation, based either on retinotopic fMRI data (Ales et al., 2010; Hagler et al., 2009) or solely on cortical folding (Inverso et al., 2016), has been shown to help in separating nearby sources in the visual cortex. However, source estimation of retinotopic M/EEG responses remains challenging, and previous studies have typically involved data from only a few participants. Here we took a different approach and performed the multivariate pattern analysis on the response patterns in the sensor space. Cross-validated Mahalanobis-distance-based multivariate analysis achieved reliable separation of signals originating from distinct visual field regions, even though the more foveal regions were only around one degree of visual angle apart. In agreement with recent studies (Kupers et al., 2020; Nasiotis, Clavagnier, Baillet, & Pack, 2017), we found that it is possible to achieve spatial resolving ability comparable with fMRI by combining MEG with multivariate data analysis. Responses for upper visual field stimulation were generally weaker, but still reliably discriminable, unlike what was reported by Nasiotis et al. (2017). That said, our results likely reflect the contribution of several cortical areas, where multiple concurrently active cortical sources improve the discriminability of the responses between visual field regions. A future challenge is to extend the multivariate pattern analysis of the multifocal MEG data to source estimates to reveal the spatio-temporal spread of attentional modulation in the source space.
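To make the distance measure concrete, the following is a minimal sketch of a cross-validated Mahalanobis distance; it is not the analysis code used in this study. The function name `crossnobis_distance`, the two-fold split, the fixed shrinkage weight of 0.1, the division by the number of channels and the simulated five-channel data are all assumptions made for illustration. Because the pattern difference is estimated on independent data partitions, noise does not bias the estimate upwards: the expected value is zero when two conditions evoke the same response pattern and positive when they differ.

```python
import numpy as np


def crossnobis_distance(a, b, n_folds=2, shrinkage=0.1):
    """Cross-validated Mahalanobis distance between two conditions.

    a, b: arrays of shape (n_trials, n_channels), one sensor pattern per trial
    at a single time point. The fold count and shrinkage are illustrative.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    n_channels = a.shape[1]
    folds_a = np.array_split(np.arange(len(a)), n_folds)
    folds_b = np.array_split(np.arange(len(b)), n_folds)
    estimates = []
    for k in range(n_folds):
        # Hold out fold k for testing, use the remaining trials for training.
        train_a = np.delete(a, folds_a[k], axis=0)
        train_b = np.delete(b, folds_b[k], axis=0)
        test_a, test_b = a[folds_a[k]], b[folds_b[k]]
        # Noise covariance from training residuals, shrunk towards a scaled
        # identity so that it stays invertible (assumed regularization).
        resid = np.vstack([train_a - train_a.mean(0), train_b - train_b.mean(0)])
        cov = np.cov(resid, rowvar=False)
        target = np.eye(n_channels) * np.trace(cov) / n_channels
        cov = (1 - shrinkage) * cov + shrinkage * target
        diff_train = train_a.mean(0) - train_b.mean(0)
        diff_test = test_a.mean(0) - test_b.mean(0)
        # Training and test differences carry independent noise, so their
        # product is not biased upwards the way a squared difference would be.
        estimates.append(diff_train @ np.linalg.solve(cov, diff_test) / n_channels)
    return float(np.mean(estimates))


# Simulated example: 200 trials, 5 channels (hypothetical numbers).
rng = np.random.default_rng(0)
region_a = rng.normal(size=(200, 5)) + 1.0  # shifted response pattern
region_b = rng.normal(size=(200, 5))
region_c = rng.normal(size=(200, 5))        # same distribution as region_b

d_diff = crossnobis_distance(region_a, region_b)
d_same = crossnobis_distance(region_b, region_c)
print(f"distinct patterns: {d_diff:.3f}, identical patterns: {d_same:.3f}")
```

In this toy example the estimate for the two distinct patterns is clearly positive, whereas the estimate for the two identically distributed conditions fluctuates around zero and can be slightly negative.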

Limitations of the current study
The current study used a design where the attention target was always at the same region within an experiment block and where the same stimulus parameters were used for every subject. The rationale for this was to use a simple experimental design, as this is a novel design for MEG, and it was uncertain how strong the attention effects would be. However, it is possible that maintaining attention at a single region for the duration of a trial block (around 150 seconds) is rather demanding for the participant, potentially leading to fluctuations or decline of attention. Moreover, using the same stimulus parameters for every subject means that for some subjects the attention task was easier than for others, and thus potentially required less focused attention. Indeed, in the unilateral analysis we also observed slight modulation for the contralateral target, suggesting that targets in the ignored region could sometimes capture attention. However, the focused attention task was arguably still quite effective, as we did observe spatially localized attentional modulation through comparisons of attended and non-attended stimuli. Moreover, major fluctuation of attention would most likely cause spreading of attentional modulation, which was not observed in the late, localized response. Nevertheless, a design where the attentional target region changes from trial to trial and where the task difficulty is matched to each participant would be an important extension in the future.
An experimental design where the task changes more often, and not concurrently with experimental runs, would also likely help participants to keep fixation stable throughout the experiment. Though our study participants were highly motivated and had practiced the task before data collection, eye-gaze data revealed small but systematic differences in eye-gaze position between the task conditions in some participants. The main results remained unaffected when participants with high task decoding accuracy from the mean gaze location were removed (see Supplementary Fig. S4).
Future studies could also consider longer interstimulus intervals. In the current study, 3-5 multifocal stimulus frames were presented per second. This frequency is lower than that typically used in SSVEP studies, where area V1 has been identified as the major contributor to the evoked steady-state response (see e.g. Di Russo et al., 2007), but a longer average interstimulus interval could further increase the response amplitudes (Uusitalo, Williamson, & Seppä, 1996).
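With 3-5 stimulus frames per second, responses evoked by successive frames overlap in time, which is why the general linear modeling mentioned above is needed to separate them. As a rough illustration only, the sketch below assumes a finite-impulse-response (FIR) design, which is one common way to set up such a model and not necessarily the one used in this study. The helper `fir_design`, the region names, the response shapes, the noise level and the sample counts are all hypothetical.

```python
import numpy as np


def fir_design(onsets, n_samples, n_lags):
    """FIR design matrix: one column per post-stimulus lag.

    Column `lag` contains a 1 at every sample that lies `lag` samples after
    a stimulus onset (hypothetical helper for illustration).
    """
    X = np.zeros((n_samples, n_lags))
    for t in onsets:
        for lag in range(n_lags):
            if t + lag < n_samples:
                X[t + lag, lag] = 1.0
    return X


rng = np.random.default_rng(1)
n_samples, n_lags = 2000, 30

# Assumed "true" evoked responses of two stimulus regions.
true = {
    "region_1": np.hanning(n_lags),
    "region_2": -0.5 * np.hanning(n_lags),
}
# Random onsets, dense enough that consecutive responses overlap in time.
onsets = {
    name: np.sort(rng.choice(n_samples - n_lags, size=60, replace=False))
    for name in true
}

designs = {name: fir_design(onsets[name], n_samples, n_lags) for name in true}
X = np.hstack([designs[name] for name in true])

# Simulated single-channel signal: summed overlapping responses plus noise.
y = sum(designs[name] @ true[name] for name in true)
y = y + 0.1 * rng.normal(size=n_samples)

# Least-squares fit recovers each region's response despite the overlap.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
est_1, est_2 = beta[:n_lags], beta[n_lags:]
print("max error, region 1:", np.abs(est_1 - true["region_1"]).max())
print("max error, region 2:", np.abs(est_2 - true["region_2"]).max())
```

Simple averaging around each onset would mix in the responses to neighbouring frames, whereas the joint least-squares fit attributes each sample to all stimuli that could have contributed to it.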

General conclusions
The combination of multifocal MEG and multivariate data analysis provides a promising technique to map the neural responses to stimuli across the visual field. The technique provides excellent temporal resolution and high spatial resolving ability, being able to discriminate stimulated regions that lie just 1 degree apart. We used the method to investigate how covert attention tasks modulate visual stimulus responses, mapping the spatiotemporal dynamics of neural responses that correspond to an "attentional spotlight". We found that attention causes statistically significant but rather subtle modulation of the early MEG response, starting at around 60 ms after the onset of the stimulus. This effect is spatially broad, with response modulation extending from the target location inwards towards the fovea in a roughly annular shape. In addition, we found a later, more sustained attentional effect starting at around 200 ms and constrained in space to the location of the target.

Fig. 3. Interindividual variability in the retinotopic multifocal MEG responses. A) Examples of the estimated response topographies are shown for two subjects at 75 ms after stimulus onset. The layout of the topographies follows the layout of the 24 stimulus regions in the visual field. The locations corresponding to the target regions in the left and right hemifields are highlighted. B) Examples of estimated evoked response to the target region in the right hemifield (upper row) and left hemifield (bottom row) are shown for four subjects at the occipital sensor with the overall strongest response to the stimuli. Different task conditions are shown in different colors (black = fixation task, blue = attend right, red = attend left).

Fig. 4. Results of mass-univariate analysis: spatially specific enhancement of grand-average response amplitudes by attention. A) Grand-average waveforms from occipital gradiometers are shown for each of the 24 regions in the visual field during fixation task (black) and attend left (red) conditions. Occipital gradiometer pairs were combined as a vector sum before averaging across subjects. The layout of the waveforms follows the layout of the stimulus regions in the visual field. The responses for the target location in the left hemifield are depicted with a red outline. B) Grand-average waveforms for the target region in the left hemifield are shown separately. Shaded regions show standard errors of the means. The colored dots show the time points when the response is significantly affected by the covert attention task compared to the fixation task (FDR of 0.05; p values obtained from two-tailed signed-rank tests across the 20 subjects; FDR adjusted across time points and conditions). C) Grand-average waveforms from occipital gradiometers are shown for each of the 24 regions in the visual field during fixation task (black) and attend right (blue) conditions. D) Grand-average waveforms for the target region in the right hemifield are shown separately. See Supplementary Figure S1 for a longer time-window.

Fig. 5. Results of multivariate pattern analysis: distinct response profiles for each visual field region. A) All pair-wise comparisons between the 24 visual field regions were collected into representational dissimilarity matrices (RDMs). In each 24 × 24 RDM, the lower triangle shows the cross-validated distance estimates (d) and the upper triangle represents false-discovery rate (FDR). The red arrow indicates the target region in the left hemifield (entry 10 in the matrix) and the blue arrow indicates the target region in the right hemifield (entry 12 in the matrix). B) The results are shown separately for different time-windows and tasks (fix = fixation task; right = attend the target region in the right visual field; left = attend the target region in the left visual field). The distance estimates were averaged across the 20 subjects (p values obtained using two-tailed signed-rank tests; FDR adjusted across time points and pairwise tests; n.s. = not significant). Note that the color scales change between the different time-windows. See Supplementary Videos 2-4 for time-resolved RDMs and corresponding multidimensional scaling visualizations.

Fig. 6. Attention improves response discriminability in the attended visual field quadrant. A) The average discriminability of the stimulus regions from the MEG responses was evaluated by averaging the cross-validated distance estimates between the stimulus regions (d; Fig. 5). The results are shown separately for each visual field quadrant, where the distances were averaged between all regions in that visual field quadrant. The results obtained during different task conditions are shown in different colors (black = task at fixation; red = attend target region in the left visual field; blue = attend target region in the right visual field). A positive distance estimate indicates distinct response profiles (markers indicate FDR of 0.05; p values obtained from two-tailed signed-rank tests across the 20 subjects; FDR adjusted across time points and pairwise tests). In all quadrants and during any task condition, the response profiles for different regions started to differ at about 40 ms after stimulus onset. B) The mean differences between the average distance estimates are shown between the fixation task and attend left conditions. The shaded region shows standard error of the mean. Attention had a statistically significant effect on response discriminability in the attended visual field quadrant 55-130 ms and 190-450 ms post stimulus (markers indicate FDR of 0.05; p values obtained from two-tailed signed-rank tests across the 20 subjects; FDR adjusted across time points and pairwise tests). C) The mean differences between the average distance estimates are shown between the fixation task and attend right conditions. Attention had a significant effect on the response discriminability in the attended visual field quadrant 55-135 ms and 195-450 ms post stimulus.

Fig. 7. Dynamic spotlight of attention: from broad to narrowly focused effect. Comparisons between left/right attention task and fixation task. A) Effect of attending left vs. fixation on each visual field region. A positive distance estimate (d) indicates distinct response profiles between the task conditions (markers indicate FDR of 0.05; p values obtained from two-tailed signed-rank tests across the 20 subjects; FDR adjusted across time points and pairwise tests). The layout of the waveforms follows the layout of the stimulus regions in the visual field. The result at the attended visual field region is depicted with a red outline and shown separately in (B). C) Effect of attending right vs. fixation on each visual field region. The result at the attended visual field region is depicted with a blue outline and shown separately in (D). See Supplementary Figure S2 for a longer time-window.

Fig. 9. Dynamic spotlight of visual attention visualized in the visual field: comparison between left and right attention task. As an alternative visualization of the results shown in Fig. 8A, the effect of the left vs. right attention task on the responses is visualized in the visual field. The upper row shows the cross-validated distance estimate at each of the 24 stimulus regions. The lower row shows the statistical significance of the distance estimate. Covert attention initially has a widespread effect on the visual responses but later narrows to the target locations in the visual space.
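
The figure captions report markers at an FDR of 0.05, adjusted across time points and pairwise tests. As a minimal sketch of how such an adjustment can work, the example below assumes the Benjamini-Hochberg step-up procedure; the captions do not name the specific FDR procedure, so this choice, the function name `benjamini_hochberg` and the example p values are assumptions for illustration only. In practice the input would be the pooled p values from the signed-rank tests.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p value: True where the test is significant
    at false discovery rate q (Benjamini-Hochberg step-up procedure)."""
    m = len(p_values)
    # Indices of the p values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject the k_max smallest p values, even if some of them individually
    # exceed their own threshold (this is the "step-up" property).
    significant = [False] * m
    for idx in order[:k_max]:
        significant[idx] = True
    return significant


# Hypothetical p values, e.g. one per time point of a single comparison.
p_example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
markers = benjamini_hochberg(p_example, q=0.05)
print(markers)
```

Pooling the p values from all time points and pairwise tests into one list before calling the function corresponds to adjusting across both dimensions at once.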