Cortical beta oscillations reflect the contextual gating of visual action feedback

Highlights
• We decouple seen and felt hand postures during action via virtual reality.
• Vision of the hand is either task-relevant or a distractor.
• Task-relevance of vision is reflected by in- or decreases of occipital beta power.
• DCM suggests underlying changes in cortical (visual) excitability.
• Occipital beta may indicate the contextual gating of visual action feedback.


Introduction
The integration of sensory inputs with forward or predictive models of motor control is crucial for bodily actions. In this (Bayesian) sensorimotor integration process, the brain adjusts how model predictions should respond to novel evidence by 'gating' sensory data, depending on the current context ( Desmurget and Grafton, 2000 ;Friston et al., 2010 ;Körding and Wolpert, 2004 ;Sober and Sabes, 2005 ;Talsma et al., 2010 ).
Such contextual, 'top-down' influences on sensory gating are evinced, for instance, by the recalibration to novel (experimentally manipulated) visuo-motor mappings. Visuo-motor recalibration has been associated with modulations of activity in visual and proprioceptive brain areas, which has been interpreted as a temporary augmentation of visual action feedback ( Balslev, 2004 ;Bernier et al., 2009 ;Wasaka and Kakigi, 2012 ). Recently, in line with behavioral studies showing that cognitive-attentional factors can affect visuo-motor recalibration ( Ingram et al., 2000 ;Kelso et al., 1975 ;Redding et al., 1985 ), we used fMRI to show that this activity modulation was contextual; i.e., that it depended on the relative task-relevance of seen or felt hand posture ( Limanowski and Friston, 2020a ). However, as fMRI data provide only limited information, we could not fully characterize the fast neuronal mechanisms mediating these contextual effects.
Here, we approached this question by examining cortical oscillations with MEG. Oscillations have been linked to neuronal communication in
Fig. 1. Task design and behavioral performance. A: Participants controlled a virtual hand (VH) model via a data glove worn on their real hand (RH), which was occluded from view. Their task was to track a 0.5 Hz sinusoidal size-change of a virtual target (the fixation dot) with repetitive 'grasping' movements; i.e., close and open the hand when the dot decreased and increased in size. Participants were instructed to synchronize either the VH or the RH movements to the target oscillation in blocks of 32 s duration. In half of the conditions, the virtual hand moved congruently (C); in the other half, a 500 ms delay was added to the VH movements to introduce visuo-proprioceptive incongruence (IC). In these conditions, synchronizing either VH or RH movements with the target precluded synchronization with the other; consequently, participants had to select one modality to track. We expected that under visuo-proprioceptive incongruence, visual action feedback should be differentially 'gated' depending on the instructed task set (VH or RH). B: Participants' mean ratings (given on 7-point visual analogue scales, shown with standard errors of the mean) of perceived task difficulty and attentional focus suggested that visuo-proprioceptive incongruence rendered each task more difficult, and that participants complied with task instructions by directing their attention to the respective instructed hand. C: Average movement trajectories of the real (red) and virtual (blue) hand in each condition, relative to the target's oscillation (gray). The individual participants' averages are shown as thin lines (the averages are based on individually calibrated glove data, where the fully open hand position corresponded to maximal dot size, and the fully closed hand position to minimal dot size). Crucially, whereas tracking was comparable when the hands moved congruently (VHC, RHC), participants exhibited phase shifts of the rhythmic movements to align the virtual hand significantly more strongly with the target in the VHIC condition, and the real hand in the RHIC condition. D: Bar plot showing the corresponding average deviation (lag in ms) of the real hand (red) and the virtual hand (blue) from the target in each condition, with associated standard errors of the mean. See Methods and Results for details. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
examine the interaction between sensory (visuo-proprioceptive congruence) and cognitive-attentional (instructed task-relevant modality) factors. We hypothesized that, specifically under visuo-proprioceptive conflict, visual feedback should be differentially 'gated', depending on the prevailing cognitive-attentional set ( Corbetta and Shulman, 2002 ;Posner et al., 1978 ). Specifically, visual feedback that conflicted with the felt hand posture should be augmented when vision was task-relevant but attenuated when vision was a distraction from the task. We expected corresponding diametrical changes in induced low-frequency oscillatory power within the cortical visuo-motor hierarchy, and used DCM to disambiguate between alternative hypotheses about how these were mediated in terms of synaptic efficacy and gain control.

Participants
18 healthy, right-handed volunteers (9 female, mean age = 29 years, range = 21-39, all with normal or corrected-to-normal vision) participated in the experiment. Similar sample sizes were used in recent fMRI experiments with analogous virtual reality based grasping tasks ( Limanowski et al., 2017 ;Limanowski and Friston, 2020a ). The experiment was approved by the local research ethics committee (University College London) and conducted in accordance with this approval.

Experimental design and procedure
During the experiment, participants sat underneath the MEG scanner wearing a non-magnetic data glove on their right hand, which was placed in a comfortable position on their lap and occluded from view by a black barber's gown. The data glove (Fifth Dimension Technologies, Pretoria, South Africa; 1 sensor per finger, 8 bit flexure resolution per sensor, 60 Hz sampling rate, communication with the PC via USB with approx. 10 ms delay) measured individual finger flexions via sewn-in optical fiber cables; i.e., light was passed through the fiber cables to one sensor per finger, and the amount of light received varied with finger flexion. Prior to scanning, the glove was carefully calibrated to fit each participant's movement range (if necessary, this was repeated between runs). The raw glove data were fed to a photorealistic virtual right hand model ( Limanowski and Friston, 2020a , 2020b ), which was thus moveable by the participant in real-time with one degree of freedom (flexion-extension) per finger. In this way, vision (seen hand position via the virtual hand) could be decoupled from proprioception (felt hand position). A fixation dot (size: about 0.5° of visual angle) was presented in front of the virtual hand and was visible at all times (i.e., never occluded by the virtual hand movements). The virtual hand, the fixation dot, and the task instructions were presented via a projector on a screen in front of the participant (1024 × 768 pixels resolution, screen distance to eye 64 cm, image size 40 × 29.5 cm, 32 ms projector latency). The virtual reality task environment was instantiated in the open-source 3D computer graphics software Blender ( http://www.blender.org ) using its Python programming interface. An eye tracker (EyeLink, SR Research) was used to monitor the participants' eye position online, to ensure they maintained central fixation and did not close their eyes.
The participants' task was to perform repetitive right-hand grasping movements paced by the pulsation frequency of a central fixation dot; i.e., effectively a phase matching or non-spatial pursuit task ( Fig. 1 A). During the movement blocks, the fixation dot continually decreased and increased in size sinusoidally (12% change in diameter) with 0.5 Hz frequency. The participants had to track fluctuations in the size of the dot with repetitive right-hand 'grasping' movements; i.e., to close and open the hand when the dot got smaller and bigger, respectively. Participants were trained to match their fully open hand position to the maximal dot size, and their fully closed hand position to the minimal dot size; in other words, they did not track the actual size of the dot but the phase of its periodic changes in size. Choosing the fixation dot as the target required participants to look at the center of the screen (i.e., also at the virtual hand) under both instructions (see below), and constituted a 'non-spatial' target compared to e.g. targets moving along a trajectory ( Limanowski et al., 2017 ).
In half of the movement blocks, a visuo-proprioceptive incongruence was introduced between the participant's movements and the movements of the virtual hand; i.e., the virtual hand's movements were delayed with respect to the actual movement by adding a 500 ms lag. In other words, the seen and felt hand movements were always incongruent (phase-shifted) in these conditions. The delay was adopted following a recent behavioral study using a similar task ( Limanowski and Friston, 2020b ), which showed that participants reliably recognized the virtual and real hand movements as incongruent when this lag was applied, and which found significant differences in behavior between conditions. Here, we likewise ensured that all participants were aware of the incongruence before scanning.
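The relation between the imposed delay and the resulting phase offset can be made explicit with a little arithmetic. The following minimal Python sketch converts a time lag into a phase angle at the target frequency, and back. The function and constant names are illustrative and not taken from the original study; only the 0.5 Hz target frequency and the 500 ms lag come from the text.

```python
TARGET_FREQ_HZ = 0.5   # target oscillation frequency reported in the text
VISUAL_DELAY_S = 0.5   # lag added to the virtual hand in incongruent blocks


def lag_to_phase_deg(lag_s, freq_hz=TARGET_FREQ_HZ):
    """Convert a time lag (s) into a phase offset (degrees) at freq_hz."""
    return 360.0 * freq_hz * lag_s


def phase_deg_to_lag_ms(phase_deg, freq_hz=TARGET_FREQ_HZ):
    """Convert a phase offset (degrees) into a time lag (ms) at freq_hz."""
    return phase_deg / 360.0 / freq_hz * 1000.0


# Phase offset between seen and felt hand movements in incongruent blocks.
delay_phase_deg = lag_to_phase_deg(VISUAL_DELAY_S)
print(delay_phase_deg)
```

Under these assumptions, the 500 ms lag amounts to a quarter of the 2 s movement cycle, so fully compensating it in the virtual hand task would require the real hand to lead the target by a quarter cycle.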
Crucially, participants had to perform the phase matching task with one of two goals in mind: In half of the movement blocks, they had to match the target's oscillatory phase with the virtual hand movements; in the other half, with their unseen real hand movements. This resulted in a 2 × 2 factorial design with the factors Task (virtual hand vs real hand task) and Congruence (congruent vs incongruent VH/RH movement).
Each of the four conditions 'virtual hand task under congruence' (VHC), 'virtual hand task under incongruence' (VHIC), 'real hand task under congruence' (RHC), and 'real hand task under incongruence' (RHIC) was completed in blocks of 32 s (16 close-and-open movements each) 3 times per run, in randomized order, interspersed with 6 s fixation-only periods. Participants completed five runs in total, thus completing 240 movements of 2 s each per condition. The task instructions ('VIRTUAL' / 'REAL') were presented for 2 s, starting 2.5 s before each respective movement trial. Additionally, participants were informed whether in the upcoming trial the virtual hand's movements would be synchronous ('synch.') or delayed ('delay'). The instructions and the fixation dot in each task were colored (pink or turquoise; the color mapping was counterbalanced across participants), to help participants remember the current task instruction during each movement trial. Participants were trained extensively prior to scanning.
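The block structure described above can be checked with a short sketch. The Python example below builds a hypothetical block schedule and counts the movements per condition. It assumes an unconstrained shuffle within each run; the text states only that the order was randomized, so any further constraints used in the study are not captured here. All names are illustrative.

```python
import random

CONDITIONS = ["VHC", "VHIC", "RHC", "RHIC"]
BLOCKS_PER_CONDITION_PER_RUN = 3
N_RUNS = 5
BLOCK_DURATION_S = 32
CYCLE_DURATION_S = 2   # one close-and-open movement at 0.5 Hz


def make_run_schedule(rng):
    """Return one run's block order: each condition 3 times, shuffled."""
    blocks = CONDITIONS * BLOCKS_PER_CONDITION_PER_RUN
    rng.shuffle(blocks)
    return blocks


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
schedule = [make_run_schedule(rng) for _ in range(N_RUNS)]

# 32 s blocks at 2 s per movement cycle give 16 movements per block.
movements_per_block = BLOCK_DURATION_S // CYCLE_DURATION_S
movements_per_condition = {
    cond: sum(run.count(cond) for run in schedule) * movements_per_block
    for cond in CONDITIONS
}
print(movements_per_condition)
```

Each condition appears in 15 blocks of 16 movements, which reproduces the 240 movements per condition stated in the text.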
With these instructions, we aimed to induce a specific cognitive-attentional set in our participants; and with it, a different weighting of visual (vs proprioceptive) movement cues. Specifically, we hypothesized that, under visuo-proprioceptive conflict, visual action feedback should be prioritized in the VH vs RH task; i.e., depending on the currently active 'top-down' cognitive-attentional set ( Corbetta and Shulman, 2002 ;Posner et al., 1978 ). Note that whereas in the congruent conditions, both hand movements were identical, and therefore both hands' grasping movements could simultaneously be matched to the target's oscillatory phase (i.e., the fixation dot's size change), only one of the hands' (virtual or real) movements could be phase-matched to the target in the incongruent condition. This necessarily engendered a phase mismatch of the other hand's movements: In the VHIC condition, participants had to adjust their movements to counteract the visual lag; i.e., they had to phase-match the virtual hand's movements (i.e., vision) to the target by shifting their real hand's movements (i.e., proprioception) out of phase with the target. Conversely, in the RHIC condition, participants had to match their real hand's movements (i.e., proprioception) to the target's oscillation, and therefore had to ignore the fact that the virtual hand (i.e., vision) was out of phase. We hypothesized that this incongruence would increase task difficulty and require a sustained focus of attention on the instructed tracking modality (vision or proprioception) vs the non-instructed ('distractor') modality. In other words, visual feedback should be prioritized in the VHIC task (where it had to be used to recalibrate motor control to a new visuo-proprioceptive mapping) but attenuated in the RHIC task (where it was effectively distracting). In brief, we expected an interaction effect between sensory (congruence) and cognitive-attentional (task) factors.
After the experiment, participants were asked to indicate, for each of the four conditions separately, their answers to the following two questions: "How difficult did you find the task to perform in the following conditions?" (Q1, answered on a 7-point visual analogue scale from "very easy" to "very difficult") and "On which hand did you focus your attention while performing the task?" (Q2, answered on a 7-point visual analogue scale from "I focused on my real hand" to "I focused on the virtual hand").

Behavioral data analysis
To analyze the behavioral change in terms of deviation from the target (i.e., phase shift from the oscillatory size change), we calculated the phase shift as the average angular difference between the raw averaged movements of the virtual or real hand (averaged over the four fingers) and the target's oscillatory pulsation phase in each condition. The angles of the fixation dot's oscillation and the real/virtual hand's movement cycles were calculated using Matlab's continuous wavelet transform (using the analytic Morse wavelet and L1 normalization). The first target oscillation cycle of each block was excluded from analysis, because participants frequently only started moving with the second one. Cycles during which the hand movement was omitted (i.e., the fingers remained either flexed or extended across the entire cycle) were also excluded. On average, this left 222 trials of 2 s duration each per condition (0.4% omitted movements).
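As an illustration of the phase-shift measure, the following Python sketch estimates the mean angular difference between a target oscillation and a hand trajectory. Note that it swaps in an FFT-based analytic signal (Hilbert transform) for the continuous wavelet transform with Morse wavelets reported above, so it is an approximation of the idea rather than the procedure used in the study. The simulated hand signal and its 250 ms lag are hypothetical.

```python
import numpy as np


def analytic_phase(x):
    """Instantaneous phase (rad) of a real signal via the FFT-based analytic signal."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist), double positive frequencies, zero negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.angle(np.fft.ifft(spectrum * weights))


def mean_phase_lag_deg(target, hand):
    """Circular mean of (target phase - hand phase) in degrees; positive = hand lags."""
    diff = analytic_phase(target) - analytic_phase(hand)
    return np.degrees(np.angle(np.mean(np.exp(1j * diff))))


fs = 60.0                                   # glove sampling rate (Hz), as in the text
f_target = 0.5                              # target frequency (Hz)
t = np.arange(int(32 * fs)) / fs            # one 32 s block
target = np.sin(2 * np.pi * f_target * t)
hand = np.sin(2 * np.pi * f_target * (t - 0.25))   # hypothetical 250 ms lag

lag_deg = mean_phase_lag_deg(target, hand)
lag_ms = lag_deg / 360.0 / f_target * 1000.0
print(lag_deg, lag_ms)
```

Averaging on the unit circle (rather than averaging raw angles) avoids artifacts from phase wrapping at ± 180°.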
The resulting hand phase shifts for each participant and condition were entered into a 2 × 2 repeated measures ANOVA with the factors task (virtual hand, real hand) and congruence (congruent, incongruent) to test for statistically significant group-level differences. Note that the virtual hand-target alignment directly quantified real hand-target alignment, since a larger shift of the real hand corresponds to better alignment of the virtual hand with the target. Post-hoc t-tests (two-tailed, with Bonferroni-corrected alpha levels to account for multiple comparisons) were used to compare experimental conditions. As a control analysis, we compared average movement amplitudes (i.e., the difference between maximum extension and maximum flexion per movement cycle) between conditions following the same procedure.
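For a 2 × 2 within-subject design, the interaction test can be written compactly, because each effect has one degree of freedom: the interaction F statistic equals the squared one-sample t statistic of the per-participant difference of differences. The Python sketch below illustrates this equivalence on simulated values; the data are hypothetical and are not the study's measurements, and the function name is illustrative.

```python
import numpy as np


def interaction_F(vhc, vhic, rhc, rhic):
    """F(1, n-1) for the Task x Congruence interaction in a 2 x 2 within-subject design.

    With one degree of freedom per factor, the interaction F equals the squared
    one-sample t statistic of the per-participant difference of differences.
    """
    contrast = (np.asarray(vhic) - np.asarray(vhc)) - (np.asarray(rhic) - np.asarray(rhc))
    n = contrast.size
    t = contrast.mean() / (contrast.std(ddof=1) / np.sqrt(n))
    return t ** 2, (1, n - 1)


# Hypothetical phase shifts (degrees) for 18 participants; NOT the study's data.
rng = np.random.default_rng(1)
n_subjects = 18
vhc = rng.normal(10, 5, n_subjects)
rhc = rng.normal(10, 5, n_subjects)
vhic = rng.normal(40, 10, n_subjects)
rhic = rng.normal(20, 10, n_subjects)

F, dof = interaction_F(vhc, vhic, rhc, rhic)
print(F, dof)
```

With 18 participants, the degrees of freedom are (1, 17), matching the form of the F statistics reported for this design.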
The questionnaire ratings were evaluated for statistically significant differences using a nonparametric Friedman's test and Wilcoxon's signed-rank test (with Bonferroni-corrected alpha levels to account for multiple comparisons) due to non-normal distribution of the residuals. Furthermore, we tested whether participants were inconsistent in their fixation; i.e., we tested the recorded eye traces (after removing blinks) for between-condition differences in average Euclidean distance of measured fixation from the fixation dot, using a repeated-measures 2 × 2 ANOVA analogous to the above.
MEG signals were acquired using a 275-channel whole-head setup with third-order gradiometers (CTF Omega, CTF MEG International Services LP, Coquitlam, Canada) at a sampling rate of 600 Hz. All analyses were performed using MATLAB (MathWorks, Natick, MA, United States) and SPM12.6 (Wellcome Trust Centre for Neuroimaging, University College London, https://www.fil.ion.ucl.ac.uk/spm/ ; Litvak et al., 2011 ). MEG data were high-pass filtered (1 Hz), downsampled to 300 Hz, and epoched into trials of 2 s each (each corresponding to a full target oscillation/grasping cycle). Epochs with z-scored amplitudes exceeding ± 6 SD of all trials in any of the channels (8.3% on average) were automatically rejected ( Auksztulewicz et al., 2017a ).
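The amplitude-based rejection step can be sketched as follows. This Python example is a minimal illustration, assuming that z-scores are computed per channel across all epochs and samples; the text does not spell out the exact normalization, so this is one plausible reading. The data are synthetic and the function name is hypothetical.

```python
import numpy as np


def reject_epochs(epochs, threshold_sd=6.0):
    """Flag epochs whose amplitude exceeds +/- threshold_sd in any channel.

    epochs: array of shape (n_epochs, n_channels, n_samples).
    Z-scores are computed per channel across all epochs and samples
    (an assumption; the text does not specify the exact normalization).
    Returns a boolean array, True where an epoch should be rejected.
    """
    mean = epochs.mean(axis=(0, 2), keepdims=True)
    sd = epochs.std(axis=(0, 2), keepdims=True)
    z = (epochs - mean) / sd
    return (np.abs(z) > threshold_sd).any(axis=(1, 2))


# Synthetic demonstration data (hypothetical): 50 epochs, 4 channels, 2 s at 300 Hz.
rng = np.random.default_rng(0)
epochs = rng.standard_normal((50, 4, 600))
epochs[7, 2, 100] += 50.0          # inject one large artifact into epoch 7

rejected = reject_epochs(epochs)
clean_epochs = epochs[~rejected]
print(rejected.sum(), clean_epochs.shape)
```

An epoch is discarded as a whole if a single channel crosses the threshold, which mirrors the 'in any of the channels' criterion.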
In the first MEG data analysis (in sensor space), we looked for spectral power differences between experimental conditions under 'steady-state' assumptions; i.e., treating the spectral profile as a 'snapshot' of responses induced during condition-specific changes in quasistationary power spectra ( Donner and Siegel, 2011 ;Friston et al., 2019 ;Moran et al., 2008 ). We computed induced power spectra in the 0-98 Hz range using a multi-taper spectral decomposition ( Thomson, 1982 ) with a spectral resolution of ± 2 Hz. The spectra were averaged across trials using robust averaging, log-transformed, and then converted to volumetric scalp × frequency images (with two spatial dimensions and one frequency dimension). The resulting images were smoothed with a Gaussian kernel with full width at half maximum of 8 mm × 8 mm × 4 Hz and entered into a group-level general linear model (GLM) using a flexible factorial design. The statistical parametric maps obtained from the respective group-level contrasts were used to test for significant effects, using a threshold of p < 0.05, family-wise error (FWE) corrected for multiple comparisons at the peak level.
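To make the multi-taper parameters concrete, the Python sketch below computes a multi-taper power spectrum for a single 2 s epoch using SciPy's `scipy.signal.windows.dpss`. It is not the SPM implementation used in the study. The number of tapers follows the common 2NW - 1 rule of thumb, which is an assumption, and the test signal is hypothetical.

```python
import numpy as np
from scipy.signal.windows import dpss

fs = 300.0            # sampling rate after downsampling (Hz)
epoch_s = 2.0         # epoch length (s)
half_bw_hz = 2.0      # spectral resolution of +/- 2 Hz

n_samples = int(fs * epoch_s)
nw = epoch_s * half_bw_hz            # time-half-bandwidth product (= 4)
n_tapers = int(2 * nw - 1)           # rule of thumb, an assumption (= 7)


def multitaper_psd(x, fs, nw, n_tapers):
    """Average periodogram over DPSS (Slepian) tapers for a 1-D signal."""
    tapers = dpss(x.size, nw, Kmax=n_tapers)        # shape (n_tapers, n_samples)
    spectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    return freqs, spectra.mean(axis=0) / fs


# Hypothetical test signal: a 20 Hz ('beta') oscillation in white noise.
rng = np.random.default_rng(0)
t = np.arange(n_samples) / fs
x = np.sin(2 * np.pi * 20.0 * t) + 0.5 * rng.standard_normal(n_samples)

freqs, psd = multitaper_psd(x, fs, nw, n_tapers)
keep = freqs <= 98.0                  # frequency range analysed in the text
log_power = np.log(psd[keep])         # log-transform, as in the text
peak_hz = freqs[keep][np.argmax(log_power)]
print(n_tapers, peak_hz)
```

Because the tapers smooth the spectrum over roughly ± 2 Hz, a narrow-band oscillation appears as a flat-topped peak of about that width rather than as a single sharp line.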
Following identification of regionally specific effects, source localization of induced power -in the 12-30 Hz (i.e., 'beta', cf. ( Donner and Siegel, 2011 )) range -was performed using a variational Bayesian approach with multiple sparse priors ( Litvak and Friston, 2008 ). The 6 Hz effect for congruent > incongruent was localized separately in the 4-6 Hz range. The (source space) localization results of each participant were summarized as 3D images per condition (unsmoothed), and entered into a group-level GLM using a flexible factorial design. Since the significance of the effects on induced responses had already been established with the sensor space analysis, the source space results were displayed at a threshold of p < 0.005, uncorrected ( p < 0.075 for the frontal 6 Hz-activation). The ensuing statistical parametric maps were rendered on SPM's brain template.

DCM of cross-spectral densities
The MEG data analysis and the analysis of our participants' behavior suggested that, specifically under visuo-proprioceptive conflict, visual action feedback was processed (i.e., 'gated') differentially depending on cognitive-attentional factors (i.e., task set). This differential processing was associated with changes in oscillatory power in the 'beta' range over visual brain areas; i.e., there was a significant interaction effect. These results were, in principle, in line with the proposed role of low-frequency oscillations in gating sensory information flow 'top-down' ( de Vries et al., 2020 ;Donner and Siegel, 2011 ;Engel and Fries, 2010 ;Friston et al., 2015 ;Klimesch et al., 2007 ;C. Palmer et al., 2016 , 2019 ). To explain this effect in terms of underlying neuronal interactions, we modelled the MEG data with DCM. DCM allows one to compare multiple alternative hypotheses (models) about how some observed data feature (in our case: spectral power across the scalp) was most likely generated by underlying interactions between and/or within neuronal populations across a network of brain sources. To model the (induced) power differences observed in the spectral analyses, in terms of source-localized neuronal interactions, we used DCM for cross-spectral densities ( Moran et al., 2007 , 2008 , 2009 ). This type of DCM models the synaptic mechanisms that generate spectral-domain data features and has been validated with respect to a range of previous MEG and EEG data ( Auksztulewicz et al., 2017a ;Bastos et al., 2015 ;Hamburg et al., 2019 ;Rosch et al., 2019 ;Shaw et al., 2017 ).
We focused our DCM analysis on the crucial interaction effect identified in the spectral analysis: Beta power over occipital and temporal sensors decreased in VHIC and increased in RHIC relative to both congruent conditions. This result was in line with the behavioral results, which also showed differences between the incongruent, but not congruent, conditions. In other words, the congruent mapping conditions could be seen as a 'baseline' for our task, whereas the visuo-proprioceptive conflict in the incongruent conditions led to a task-dependent gating of visual information -potentially manifesting as spectral power differences -depending on the current task set. In the DCM analysis, we therefore modelled the effects of each incongruent (i.e., visuo-proprioceptive conflict) condition relative to the congruent conditions. In other words, we modelled two condition-specific effects corresponding to changes in connectivity during VHIC or RHIC, respectively, relative to the VHC and RHC conditions. To ensure optimal model fits in the frequency bands in which the spectral effects were significant (cf. Fig. 2 ), we modelled the 12-30 Hz range.
Sources of interest were chosen based on the source localization of significant power differences between conditions in the above spectral analysis (see above and Fig. 2 ). Our DCM architecture ( Fig. 3 A) therefore contained the bilateral V1, V5, STS, and the right PFC; which were identified as likely sources for the main effects and, most importantly, for the interaction effect (the bilateral STS and right PFC were identified as further sources of the interaction effect, when lowering the statistical threshold of the projections to p < 0.1). In short, our DCM encompassed the key regions of a well-established visuo-motor hierarchy ( Cisek and Kalaska, 2010 ;Decety et al., 1994 ;Goodale and Milner, 1992 ;Grafton, 2010 ;Iacoboni and Dapretto, 2006 ;Kilner et al., 2007 ;Makin et al., 2012 ). The reconstructed cortical locations of these effects were strikingly similar to the location of blood oxygen level dependent (BOLD) signal changes detected in our previous fMRI experiments using similar designs ( Limanowski et al., 2017 ;Limanowski and Friston, 2020a ).
Each cortical source was modelled as a local patch ( Daunizeau et al., 2009 ;Pinotsis et al., 2012 ) whose responses were generated by a neural mass model ( Moran et al., 2007 ) comprising three interconnected cell populations with excitatory spiny stellate cells (assigned to granular layer IV), excitatory pyramidal cells, and inhibitory interneurons (occupying both supra- and infra-granular layers). This kind of model distinguishes between 'extrinsic' ('forward' and 'backward') between-area connections, and 'intrinsic' connections. The latter connections model effects of regional self-inhibition, which is inversely proportional to the input-output balance or 'excitability' of a given source, and are therefore usually associated with cortical gain control ( Bauer et al., 2014 ;Pinotsis et al., 2014 ;Ranlund et al., 2016 ;Shaw et al., 2017 ). In other words, when regional self-inhibition is relaxed, the same neuronal population will respond more strongly to the same input. Thus, reduced self-inhibition corresponds to increased neuronal gain, and vice versa. The resulting network allowed us to order the sources hierarchically (V1-V5-STS-PFC, with additional lateral connections for bilateral regions, Fig. 3 A), and to test competing hypotheses about the type of connectivity modulation underlying the observed spectral effects.
Fig. 2. MEG spectral differences and corresponding source reconstructions. The 'glass brain' (maximum intensity) projections show the sensor level scalp-frequency maps of induced power differences between conditions depending on sensory (Congruence) and cognitive-attentional (Task) factors, and their interaction (in each plot, the darkest voxel shows the strongest effect along the respective projection; the maps are thresholded at p < 0.001, effects significant at p FWE < 0.05 are outlined in green; the top plots have one frequency dimension, 0-98 Hz, and one spatial dimension, P-A = posterior-anterior, L-R = left-right; the bottom plot has two spatial dimensions). The bar plots show the mean-centered estimates of oscillatory power in each condition from the respective peak voxel (strongest effect in scalp-frequency space), in arbitrary units and with associated standard errors. VH = virtual hand task, RH = real hand task, C = congruent hand mapping, IC = incongruent hand mapping. The renders show the corresponding source localization results using variational Laplace with multiple sparse priors (all suprathreshold voxels are colored in red, the intensity indicates the respective T statistic value, weighted by a function of its distance behind the brain surface). See Results for details.
Fig. 3. DCM architecture. A: A hierarchical cortical network was constructed based on the source localization of significant spectral differences between conditions; including the bilateral V1, V5, STS, and the right PFC (shown schematically, cf. Fig. 2 ). B: Individual model fits showing the first principal eigenmode of the prediction in sensor space (thin lines) and the corresponding mode of the empirical scalp data (thick lines) for the congruent 'baseline', and the VHIC and RHIC conditions.
Fig. 4. Bayesian model comparison. A: Using the established network architecture, 7 different candidate models were compared to test whether the condition-specific effects of VHIC and RHIC would best be modelled by changes in forward (F), backward (B), and/or intrinsic (i) connections. For each model, only the connections modulated by the experimental effect are shown. B: Bayesian model comparison identified Model 4 (condition-specific modulation of local self-inhibition) as having the highest free energy, and thus as the most likely explanation for the observed spectral data, with a posterior probability of 92%. C: Bayesian model averages of parameter estimates with 95% confidence intervals, indicating changes in local self-inhibition during VHIC and RHIC. Asterisks denote significantly ( p < 0.05) reduced inhibition in VHIC relative to RHIC at the respective cortical node. See Results for details.
For each participant, the resulting model (of coupled neural fields) was inverted to fit to the complex MEG cross-spectral densities in the 12-30 Hz range (as summarized by 8 principal eigenmodes) across the scalp. We excluded one participant (P3) whose data could not adequately be fit by DCM (i.e., the model inversion 'flatlined', resulting in a markedly low model evidence as indicated by a free energy that was more than a standard deviation lower than the group's average; therefore the estimated connectivity could not sensibly be used for the subsequent model comparison). However, this exclusion did in fact not affect Bayesian model comparison (the same model still 'won' with 87% probability when P3 was included). The individual participants' model fits, which were the basis for the Bayesian model comparison of condition-specific changes in effective connectivity, are shown in Fig. 3 B. The aim of our DCM was to disambiguate between alternative explanations for how the observed effects of task instruction during incongruence on beta oscillatory power could have been mediated in terms of neuronal interactions between cortical regions or by changes in local cortical gain. Therefore, we asked whether the condition-specific effects were best explained by changes in extrinsic (forward and/or backward between-region) and/or intrinsic (within-region) connectivity ( Fig. 4 A).
Model comparison was implemented by Bayesian model reduction ( Friston and Penny, 2011 ), which allows one to compare 'reduced' models with variations in a subset of the 'full' model's parameters. In our case, the full model allowed for modulations of all intrinsic and extrinsic connections, whereas the reduced models allowed for modulation of only a subset of the connection types (forward, backward, and/or intrinsic), resulting in a model space of 7 models ( Fig. 4 A). The model with the greatest evidence (approximated via variational free-energy) was considered the 'winning' model. The posterior estimates of all reduced models were averaged using Bayesian model averaging ( Penny et al., 2010 ). To confirm that the reduction in inhibition during VHIC > RHIC was significant, post-hoc t-tests were used to compare the condition-specific parameter estimates (i.e., their Bayesian model averages), under the most likely model.
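The logic of comparing models by their free energy can be illustrated with a small sketch. The Python example below enumerates every non-empty combination of forward, backward, and intrinsic modulation, which is one reading of the model space that is consistent with the count of 7, and converts free energies into posterior model probabilities under a flat prior over models. The free energy values are hypothetical, the model ordering is arbitrary and does not correspond to the numbering in Fig. 4, and this is not the SPM routine for Bayesian model reduction.

```python
import itertools
import math

CONNECTION_TYPES = ("F", "B", "i")   # forward, backward, intrinsic

# Every non-empty combination of modulated connection types: 2**3 - 1 = 7 models.
model_space = [
    combo
    for k in range(1, len(CONNECTION_TYPES) + 1)
    for combo in itertools.combinations(CONNECTION_TYPES, k)
]


def posterior_model_probabilities(free_energies):
    """Softmax of log evidences (free energies), assuming a flat prior over models."""
    f_max = max(free_energies)
    weights = [math.exp(f - f_max) for f in free_energies]   # subtract max for stability
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical free energies (log units), one per model; NOT the study's values.
free_energies = [-1203.0, -1204.5, -1198.0, -1201.0, -1202.0, -1200.5, -1205.0]
posteriors = posterior_model_probabilities(free_energies)
winner = model_space[posteriors.index(max(posteriors))]
print(winner, round(max(posteriors), 3))
```

Because free energy approximates log evidence, a difference of about 3 between two models already corresponds to odds of roughly 20 to 1 in favor of the better model.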
The participants' average movements, and the corresponding deviations from the target's phase, are shown in Fig. 1 C-D. Participants were able to track the 0.5 Hz (2 s per cycle) target oscillation with their grasping movements in all conditions, staying within ± 24° (~134 ms) of the target's oscillatory phase on average. The power spectra were comparable across conditions, showing a clear peak at the target frequency of 0.5 Hz (see Supplementary material). Importantly, participants aligned the virtual hand better with the target under the VH task and, correspondingly, they synchronized the real hand better with the target under the RH task (ANOVA on hand-target phase synchronization, main effect of 'task', F (1,17) = 18.11, p = 0.0005). A significant interaction effect between task and congruence ( F (1,17) = 18.46, p = 0.0005) and a post-hoc t -test showed that this effect was due to a significantly better virtual hand-target synchronization in the VHIC than in the RHIC condition ( t (17) = 4.32, p = 0.0004; there was no significant difference between VHC and RHC conditions, t (17) = 0.16, n.s. ). Unsurprisingly, tracking performance was better overall when comparing congruent to incongruent conditions (ANOVA, main effect of 'congruence', F (1,17) = 136.66, p = 1.5e-9). Participants also evinced a partial shift of their real hand's movements during RHIC relative to RHC ( t (17) = 4.43, p = 0.0004), but this shift was significantly smaller than during the VHIC condition ( t (17) = 4.73, p = 0.0002). Movement amplitudes did not differ significantly between conditions (means and standard deviations: VHC = 0.87 ± 0.05, VHIC = 0.86 ± 0.07, RHC = 0.87 ± 0.05, RHIC = 0.86 ± 0.07; ANOVA, all F s < 1, n.s. ). Participants maintained comparable fixation in all conditions (means and standard deviations of Euclidean distance from fixation dot, in degrees visual angle: VHC = 1.66 ± 1.44, VHIC = 1.68 ± 1.43, RHC = 1.16 ± 0.87, RHIC = 1.26 ± 0.81; ANOVA, all F s < 1, n.s. ).
Together, the above results suggest that, as expected, participants adopted a specific attentional set to prioritize the instructed target modality, and that this was associated with significantly better target tracking with the instructed modality (vision or proprioception) under intersensory conflict.

MEG results
In the sensor space analysis, we looked for induced spectral power differences related to the effects of our experimental manipulations; i.e., the differential processing of incongruent vs congruent visual action feedback depending on the currently active cognitive-attentional set (VH or RH task).
This analysis revealed significant spectral correlates of visuo-proprioceptive congruence ( Fig. 2 ): Movements under visuo-proprioceptive incongruence were associated with relatively suppressed power in the 18-22 Hz range (peak at 20 Hz) over right temporal sensors ( T = 5.35, p FWE < 0.05). These effects were source-localized to bilateral temporal regions, focused on the superior temporal sulcus (STS). Conversely, we observed a power increase at 6 Hz ( T = 5.79, p FWE < 0.05) over frontal sensors during incongruent as compared with congruent movements; source-localized to regions in the right prefrontal cortex (PFC). Furthermore, we found a significant effect of cognitive-attentional task set: During the VH task, compared with the RH task, power in the 12-20 Hz range (peak at 16 Hz) was significantly suppressed over occipital sensors ( T = 5.53, p FWE < 0.05, Fig. 2 ). These effects were source-localized to distinct peaks in the bilateral primary (V1) and extrastriate (V3, V5) cortices. The reconstructed cortical sources of beta suppression during VH (relative to RH) were almost identical to the locations of fMRI activations identified in similar tasks and contrasts ( Limanowski et al., 2017 ;Limanowski and Friston, 2020a ).
Crucially, there was a significant interaction effect at 20 Hz over occipital sensors (T = 5.00, pFWE < 0.05). This effect was localized to the left V1 and the right V5. In other words, the power suppression observed during the VH relative to the RH task was significantly stronger in the incongruent than in the congruent conditions. In fact, beta power suppression at temporal and occipital sensors was markedly significant in the VHIC - RHIC contrast (T = 6.35 and 6.96, respectively, both pFWE < 0.05), but nonsignificant in the VHC - RHC contrast, even at p < 0.001, uncorrected. Thus, the spectral effects were largely due to a difference between the incongruent conditions, in which there was a visuo-proprioceptive conflict, with small or no differences between the congruent conditions.
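The logic of the 2 x 2 interaction contrast can be sketched as follows. This is a minimal illustration, not the statistical model used in the study: the condition values are arbitrary placeholders (not measured data), and the variable and function names are hypothetical. It only shows that the interaction equals the difference between the two simple task contrasts, (VHIC - RHIC) - (VHC - RHC).

```python
# Placeholder beta power values per condition (arbitrary units).
# These numbers are illustrative assumptions, NOT results from the study.
power = {"VHC": 1.00, "VHIC": 0.80, "RHC": 1.00, "RHIC": 1.15}

# Interaction weights: (VHIC - RHIC) - (VHC - RHC)
interaction_weights = {"VHC": -1, "VHIC": 1, "RHC": 1, "RHIC": -1}


def contrast(values: dict, weights: dict) -> float:
    """Weighted sum of condition values for a given contrast."""
    return sum(weights[cond] * values[cond] for cond in weights)


# Simple effects of task (VH minus RH) within each congruence level
task_effect_incongruent = power["VHIC"] - power["RHIC"]
task_effect_congruent = power["VHC"] - power["RHC"]

# A negative value means stronger suppression for VH vs RH under incongruence
interaction = contrast(power, interaction_weights)

if __name__ == "__main__":
    print(task_effect_incongruent, task_effect_congruent, interaction)
```

With these placeholder values, the task effect is confined to the incongruent conditions, which mirrors the qualitative pattern described above.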

DCM results
To disambiguate between alternative hypotheses about how the interaction effect in the beta range was mediated in terms of changes in neuronal message passing among key cortical sources, we modelled the measured MEG data with DCM for cross-spectral densities. Specifically, we aimed at clarifying the nature of the task-dependent gating of visual information during VHIC and RHIC, respectively, relative to a congruent-movement 'baseline'.
Based on the localization of the most likely sources of the observed spectral power differences, we constructed a hierarchical network comprising the bilateral V1, V5, STS, and the right PFC (Fig. 3A). Using this network, the model inversion provided overall good fits of the empirical whole-scalp data in the beta range (except for participant P3, who was excluded from further analysis; see Methods). The individual model fits are shown in Fig. 3B. Based on the established model architecture, we compared alternative hypotheses about how the identified modulations of induced spectral responses during VHIC and RHIC were caused in terms of neuronal interactions. We considered a model space of 7 models (Fig. 4A), each modeling the condition-specific effects as changes in (intrinsic) synaptic efficacy within regions and/or (forward and/or backward) synaptic connectivity between regions. Model comparison, implemented using Bayesian model reduction, showed that the most likely model (Model 4, posterior probability = 92%) described the condition-specific effects in terms of a modulation of local intrinsic connections (Fig. 4B). In the DCM framework, these connections determine the degree of self-inhibition, and therefore the input-output balance or excitability; in other words, changes in their parameter estimates indicate changes in gain control (Moran et al., 2007). The DCM results therefore suggest that, relative to a congruent-movement 'baseline', movements under visuo-proprioceptive conflict were most likely associated with changes in cortical gain.
Crucially, there was a striking asymmetry in the winning model's connectivity estimates for VHIC vs RHIC (Fig. 4C): Whereas most of the intrinsic connections mediated a disinhibition during VHIC relative to the congruent movements, they showed the opposite effect (an increased self-inhibition) during RHIC. This effect was significant in the bilateral visual areas (V1 and V5) and in the right STS. In other words, relative to the baseline condition, cortical gain in visual areas was increased during VHIC and decreased during RHIC. In sum, the DCM results indicated a contextual effect of cognitive-attentional task set on cortical gain control within the visuomotor hierarchy.

Discussion
Using a virtual reality-based phase-matching task under visuo-proprioceptive incongruence, we induced a cognitive-attentional prioritization of visual vs proprioceptive feedback, as evident from significant differences in target-tracking performance and self-reported attentional allocation. We found that sensory (visuo-proprioceptive congruence) and cognitive (instructed task set) factors and, importantly, their interaction were associated with significant changes in cortical oscillatory power, most prominently in the 'beta' frequency range.
By isolating the interaction effect between sensory and cognitive-attentional factors, our study design allowed us to advance on previous work on visuo-motor recalibration: Relative to the congruent movement conditions, occipital beta power was suppressed in VHIC but enhanced in RHIC. Our DCM analysis identified diametrical changes in the self-inhibition of visual areas as the most likely causes of these spectral differences, i.e., relaxed self-inhibition during VHIC and increased self-inhibition during RHIC relative to movements without visuo-proprioceptive conflict. These effects were strongest in visual (V1, V5) and multisensory (right STS) areas, which are all known to process visual bodily information (Farrer et al., 2008; Lebar et al., 2017; Leube et al., 2003; Limanowski et al., 2018; Limanowski and Friston, 2020a). In DCM, the self-inhibition of a given node is inversely proportional to the excitability of the underlying neuronal population to its (sensory) inputs; in other words, it reflects cortical gain.
We therefore propose that these results directly reflect the contextual gating of visual bodily action information (during identical, conflicting visuo-proprioceptive mapping) for integration with the current action plan, depending on the prevalent cognitive-attentional set. In other words, we propose that attenuated beta was associated with an increased sensitivity to visual feedback (via increased gain) when visual feedback had to be incorporated into the goal-directed action plan (VHIC), and conversely, enhanced beta was associated with an attenuation of visual feedback (via reduced neuronal gain of visual brain areas) when the visual movement was an incongruent 'distractor' (RHIC). Such attentional gating was not required in conditions without visuo-proprioceptive conflict (VHC and RHC).
Thus, our results support, in a sensorimotor setting, the hypothesized link between beta oscillations and top-down contextual control in service of conveying behavioral context to lower sensory regions (Auksztulewicz et al., 2017b; Bressler and Richter, 2015; Buschman and Miller, 2007; Clark et al., 2015; Donner and Siegel, 2011; Friston et al., 2015; Spitzer and Haegens, 2017). Previous work has shown that task-irrelevant sensory brain regions can be disengaged by increasing low-frequency oscillatory activity, whereas low-frequency suppression can make stimulus processing more efficient (de Vries et al., 2020; Frey et al., 2015; Jensen and Mazaheri, 2010; Klimesch et al., 2007; Schubert et al., 2009). Although such effects are frequently observed at 'alpha' frequencies, beta oscillations have also been linked to active suppression of sensory input that is deemed distractive (de Vries et al., 2018; Engel and Fries, 2010; Kelly et al., 2006). In visual paradigms, attention to target stimuli while ignoring distractors suppressed occipital beta band power (Fries, 2001). In multisensory tasks, several studies have reported a negative association between parieto-occipital alpha/beta power and attention to visual (as opposed to auditory or tactile) input (Bauer et al., 2006, 2012; Foxe et al., 1998; Foxe and Simpson, 2005; Fu et al., 2001; Haegens et al., 2012; Wittekindt et al., 2014). Our findings now show that beta oscillations can be directly linked to the 'top-down' gating of (conflicting) visual action feedback depending on current behavioral context by adjusting cortical gain.
Further support for our interpretation of beta power as being inversely related to sensory gating comes from our finding that the reconstructed cortical sources of beta suppression were almost identical to the locations of BOLD signal increases in very similar 'vision-prioritizing' tasks (Limanowski et al., 2017; Limanowski and Friston, 2020a). This inverse relationship between source-localized beta and the BOLD signal has been reported before (Moosmann et al., 2003; Scheeringa et al., 2011; Yuan et al., 2010; Zaretskaya and Bartels, 2015) and is consistent with proposals that a loss of low- relative to high-frequency power may be associated with brain 'activation' detected with fMRI (Chawla et al., 1999; Kilner et al., 2005; Laufs et al., 2003, 2008). In principle, our results therefore support computational models following the 'predictive coding' framework, which link slow vs fast frequency oscillations to asymmetrical message passing of predictions and errors, respectively (Arnal and Giraud, 2012; Bastos et al., 2012; Friston, 2008; Lee et al., 2013; Wang, 2010). Our findings also speak to the relationship between sensory precision or gain control and the predictability of sensory inputs (Auksztulewicz et al., 2017b; Auksztulewicz and Friston, 2015; Kok et al., 2012; Press et al., 2020a, b; Corlett, 2020; Limanowski et al., 2019; Richter and de Lange, 2019; Yon et al., 2018, 2020). Although our paradigm focused on the redeployment of precision or gain control depending on cognitive-attentional task set, it implicitly entailed varying 'predictability' of visual action feedback. That is, congruent visual action feedback may be considered more 'predictable' due to life-long experience-dependent learning (that, in our case, may have persisted despite extensive training in both conditions). Previously, movements under visuo-proprioceptive conflict have been associated with suppressed occipital beta power (Lebar et al., 2017; cf. corresponding BOLD signal increases in the STS shown by Leube et al., 2003; Limanowski et al., 2017; Limanowski et al., 2018). Furthermore, beta oscillations in the STS have been linked to the predictability of observed actions (Pavlidou et al., 2014; van Pelt et al., 2016). The beta power suppression in the STS induced by visuo-proprioceptive incongruence (i.e., our main effect) could therefore in principle be attributed to stimulus predictability. However, the corresponding interaction with task relevance suggests that incongruent (i.e., conflicting with proprioception) visual input was processed differentially depending on task set. In other words, our results demonstrate that when the visual consequences of action are 'unpredicted', sensory gain can be differentially attenuated or augmented in the service of goal-directed action, through top-down attentional mechanisms with beta correlates in visuomotor areas.
Furthermore, our results speak to the particular role of beta synchronization in mediating the precision of message passing during motor control (Palmer et al., 2016b, 2019). However, note that the desynchronization phenomena in our paradigm originated in the visual, as opposed to the (somato)motor system; we did not observe modulations of beta power over somatomotor cortices. This was unexpected, as the attenuation of somatosensory processing during the prioritization of incongruent vision has been shown previously (Bernier et al., 2009; Limanowski and Friston, 2020b). Since these studies analyzed single-trial evoked potentials or fMRI data, the absence of somatomotor oscillatory effects could be due to our focus on induced power during repetitive movements, but this speculation remains to be verified by future work. Complementary task designs could be used to evaluate whether sensory (e.g., visual) and motor beta oscillations serve different functional roles, as has been speculated (Kilavik et al., 2013; Palmer et al., 2019; Press et al., 2011; Tan et al., 2016). Such modified task designs could also be helpful to determine to what extent the concept of sensory attenuation proposed by the active inference framework (i.e., a generalized and multi-modal suppression of sensory input from the effector to enable movement; Brown et al., 2013) applies to the visual domain (cf. Vasser et al., 2019). These results could inform discussions about a potential (temporal) distinction between general effects of sensory attenuation and those based on stimulus predictability in Bayesian ('predictive processing') frameworks of action and perception (Press et al., 2020a, b; Corlett, 2020; Yon et al., 2018, 2020).
Furthermore, complementary task designs should also be used to clarify whether the (prefrontal) low-theta power increase under visuo-proprioceptive incongruence could potentially be related to the prioritization of different target information, as in visual working memory tasks (Daitch et al., 2013; Johnson et al., 2017; Liesefeld et al., 2014; Riddle et al., 2020; Sauseng et al., 2010).
A potential limitation of our study, which is inherent in our experimental design, is the fact that participants could correct their movements in real time in the RHIC condition, whereas there was a lag in the VHIC condition. Varying time delays of visual feedback may introduce 'chaotic' oscillations (Glass et al., 1988; Beuter et al., 1993). However, the power spectra of the executed movements were comparable and did not suggest such biases, probably because our participants were extensively trained (i.e., they were familiar with the amount of delay and how it affected visual feedback during the rhythmic movements). Furthermore, one would expect cortical signatures of such biases in motor cortex, where we did not observe any significant effects. Finally, our spectral data (and their fit by DCM) suggest that cortical oscillatory power was diametrically up- or down-regulated during RHIC or VHIC relative to the congruent mapping conditions. We therefore assume that our results reflect sensory gain control related to distinct cognitive-attentional task sets, but this issue remains to be clarified by future work. Another potential limitation of our study is that we could only infer attentional allocation from participants' self-reports; future work could try to include more explicit measures of attention. The congruent and incongruent conditions differed, as expected, in reported task difficulty, but note that we avoided this potential bias by focusing our spectral and DCM analyses on the interaction effect. Finally, it should be noted that participants exhibited a partial shift of their grasping phase in the RHIC condition, which could indicate a difficulty in fully ignoring matching biological motion (Borroni et al., 2005; Kilner et al., 2003, 2007). Although this should be pursued, it does not pose a problem here because behavior still differed significantly between tasks.
In conclusion, our findings suggest a critical role for beta oscillations in sensorimotor integration; i.e., indicating the 'gating' of visual (vs proprioceptive) action feedback depending on the current behavioral demands.

Declaration of Competing Interest
The authors declare no conflict of interest.