Ultra-high field fMRI reveals origins of feedforward and feedback activity within laminae of human ocular dominance columns

Ultra-high field MRI can functionally image the cerebral cortex of human subjects at the submillimeter scale of cortical columns and laminae. Here, we investigate both in concert, by imaging ocular dominance columns (ODCs) in primary visual cortex (V1) across different cortical depths. We ensured that putative ODC patterns in V1 (a) are stable across runs, sessions, and scanners located in different continents, (b) have a width (∼1.3 mm) expected from post-mortem and animal work and (c) are absent at the retinotopic location of the blind spot. We then dissociated the effects of bottom-up thalamo-cortical input and attentional feedback processes on activity in V1 across cortical depth. Importantly, the separation of bottom-up information flows into ODCs allowed us to validly compare attentional conditions while keeping the stimulus identical throughout the experiment. We find that, when correcting for draining vein effects and using both model-based and model-free approaches, the effect of monocular stimulation is largest at deep and middle cortical depths. Conversely, spatial attention influences BOLD activity exclusively near the pial surface. Our findings show that simultaneous interrogation of columnar and laminar dimensions of the cortical fold can dissociate thalamocortical inputs from top-down processing, and allow the investigation of their interactions without any stimulus manipulation.


Introduction
Ultra-high field functional MRI (UHF-fMRI, at field strengths of 7 Tesla or more) is transforming the field of human neuroimaging ( Marques and Norris, 2018; Trattnig et al., 2018; van der Zwaag et al., 2016 ). By increasing signal-to-noise ratio (SNR) and spatial resolution, UHF-fMRI allows researchers to non-invasively study cortical functional organization at the mesoscopic scale of cortical columns and layers in healthy humans ( De Martino et al., 2018; Dumoulin et al., 2018 ). Perhaps the most well-known form of cortical organization at the mesoscopic scale is columnar: ocular dominance columns (ODCs) are patches of cortex that are predominantly sensitive to input from only one of the two eyes, forming columns perpendicular to the surface of primary visual cortex, V1 ( Adams and Horton, 2009; Adams et al., 2007; Dougherty et al., 2019; Hubel and Wiesel, 1969; Tootell et al., 1988 ). In addition, models of cortical microcircuits contend that the organization orthogonal to that of cortical columns, along cortical depth, separates input-driven and feedback activity in primary sensory regions ( Bastos et al., 2012; De Martino et al., 2018; Kuehn and Sereno, 2018; Rockland and Pandya, 1979; Self et al., 2019; Stephan et al., 2019 ). Specifically, thalamocortical "feedforward" inputs are believed to mostly reside at the "middle", granular layer ( Bastos et al., 2012; Stephan et al., 2019 ), whereas feedback from higher-order regions is believed to target the "deep", infragranular and "superficial", supragranular layers ( Fig. 1 ).

Fig. 1. Schematic illustration of laminae and columns in V1. Visual information enters the visual system via the cornea of the left and right eye (bottom) and then, through the optic nerve, reaches primary visual cortex (V1) via the lateral geniculate nucleus (LGN; middle) in the thalamus. The signal then predominantly projects into the granular, as well as the infragranular, layer of V1. The ocularity of the signal is preserved along so-called "ocular dominance columns" of approximately 1 mm wide. Feedback processes from higher-order cortical regions project largely into supragranular and infragranular layers.
In this study, we mapped ODCs to infer whether monocular stimulation of the left or right eye increases thalamocortical input for a given patch of cortex, without the need to resort to stimulus manipulations. We do so by using an isotropic voxel size and 3D (rather than 2D) computational modelling of the cortical sheet, which allow us to exploit functional-anatomical mappings both along and across the cortical sheet (see also earlier work by De Martino et al., 2015; Kemper et al., 2018; Schneider et al., 2019; Zimmermann et al., 2011 ). Concretely, subjects were presented with an identical stimulus frame and fixation mark to both eyes. However, within the stimulus frame, a rotating and flickering checkerboard was presented to only one eye at a time. This allowed us to compare strong input to the left versus the right eye for the retinotopic area corresponding to the rotating checkerboard. In addition to this input manipulation, we independently manipulated the relevance of monocular or binocular information for our participants: participants were asked to report either on color changes of the binocularly presented fixation mark or on changes in the rotation direction of the monocularly presented checkerboard stimulus. Crucially, visual stimulation was identical across these two attentional conditions; the only difference was the task instruction given to the subject before the task began.
We used 7T gradient-echo functional MRI (GRE-fMRI) with isotropic voxels at a very high resolution ( Fig. 2 A; 0.7 mm isotropic, i.e. 0.343 mm³; Petridou et al., 2013 ) to sample activation patterns within the cortical ribbon at different cortical depths ( Dale et al., 1999; Fischl et al., 1999; Polimeni et al., 2018 ). A well-known disadvantage of GRE-fMRI compared to alternative, more complex techniques exploiting T2- (GRASE; De Martino et al., 2013; Yacoub et al., 2007 ), CBV- (VASO; Huber et al., 2015 ) or SSFP-contrast (bSSFP; Lee et al., 2008; Liu et al., 2020; Miller, 2012 ) is that GRE-fMRI is more sensitive to draining-vein effects. Specifically, because blood drains outward from the cerebral cortex, GRE BOLD effects tend to accumulate towards the pial surface, potentially clouding laminar-specific activity ( Heinzle et al., 2016; Koopmans et al., 2010; Markuerkiaga et al., 2016; Siero et al., 2011 ). However, we still chose to use GRE-fMRI because of its important advantages: the signal-to-noise ratio of GRE at submillimeter scale is superior to that of alternative techniques (2-8 times higher z-values for the same amount of data; Haenelt et al., 2020 ) and GRE-fMRI is much less constrained by SAR limits than GRASE and VASO techniques. The reduced SAR constraints make it feasible to scan at higher temporal resolution, which would be preferable in future, more cognitively oriented paradigms like binocular rivalry. Moreover, recent modelling efforts have made it possible to infer laminar-specific activity in GRE-fMRI with some reasonable assumptions about how deoxygenated blood gets drained to the pial surface ( Havlicek and Uludağ, 2020; Heinzle et al., 2016; Markuerkiaga et al., 2016; Marquardt et al., 2018; Merola and Weiskopf, 2018; Puckett et al., 2016 ). These methods appear relatively robust to the exact parametric assumptions used ( Marquardt et al., 2018, 2020 ).
Indeed, GRE-fMRI is still the main workhorse in the large majority of the recent applied laminar UHF-fMRI literature ( Kok et al., 2016; Kasper et al., 2019; Lawrence et al., 2018; Marquardt et al., 2018, 2020; Schneider et al., 2019; Zaretskaya et al., 2020 ).
Our approach allowed us to quantify ODC patterns across cortical depth, and investigate how these patterns change as a function of thalamocortical input and stimulus relevance (attentional state). By using isotropic voxels, we could also reduce the number of voxels that overlap with draining vasculature near the pial surface, thereby increasing the effective resolution on the cortical surface ( Kay et al., 2019 ). Finally, combining ODC measurements with population receptive field estimates at all locations allowed us to relate the ocular dominance column patterns to their place in the retinotopic visual field ( Adams and Horton, 2009 ;Dumoulin and Wandell, 2008 ).
In the following, we first establish the consistency and robustness of ODC patterns and replicate some key characteristics of ODCs. Then, we focus on the effects of (a) monocular stimulation of the preferred versus the non-preferred eye of an ODC and (b) attending a stimulus that is presented monocularly versus a stimulus that is presented binocularly. The central question of this study was how these effects manifest differently across cortical depth. We hypothesized that the effect of monocular stimulation should be most prevalent at middle cortical depth, where the granular layers reside ( Bastos et al., 2012; De Martino et al., 2018; Kuehn and Sereno, 2018; Rockland and Pandya, 1979; Self et al., 2019; Stephan et al., 2019; but see Constantinople and Bruno, 2013 ), whereas the effect of attention should be most prevalent at outer cortical depth, where the projections originating from higher-order visual areas are located ( Bastos et al., 2012; De Martino et al., 2018; Kuehn and Sereno, 2018; Lawrence et al., 2019; Rockland and Pandya, 1979; Self et al., 2019; Stephan et al., 2019 ).
Finally, we were also interested to see whether there was an interaction between the effect of monocular stimulation of the preferred eye and attentional state. In other words: is the effect of attention modulated by stimulation of the preferred versus non-preferred eye? We speculated that humans might be able to suppress the thalamocortical input into V1 representing information that is not relevant to the task at hand ( Ling et al., 2015 ). If this were the case, we would expect the effect of attention to be larger in ODCs where the preferred eye is stimulated than in ODCs where the non-preferred eye is stimulated.
In the following, we report robust V1 ODC patterns at all cortical depths. Laminar deconvolution and decoding analyses confirm that ocular-dominance information is indeed most robust at middle cortical depth. The behavioral relevance of the monocular stimulus induced a higher BOLD response, but only at outer cortical depth, consistent with the standard model that cortico-cortical feedback projections largely reside in the outer layers. Lastly, there was no significant interaction between these two effects, suggesting that, in our task, thalamocortical input is not modulated by attention in an ocular-specific manner.

Fig. 2. PRF and ocularity maps in V1. A) Functional MRI data were collected at 7 Tesla with an isotropic resolution of 0.7 mm and occipital surface coils with a restricted field-of-view but optimized signal from the calcarine sulcus. The overlaid contour profile shows white (green/white lines) and gray matter segmentations (red lines) created by Freesurfer at 0.64 mm resolution, based on the mean T1-weighted UNI image of an MP2RAGE and MP2RAGE-ME sequence. The purple/orange image is the mean EPI image of a run registered to the anatomical data using boundary-based registration and illustrates the quality of the registration, as well as the field-of-view. B) A standard Population Receptive Field (PRF) mapping paradigm was presented to the subjects for 3 runs of approximately 5 minutes. In this paradigm, a flickering bar consisting of many Gabor patches was moved across the central aperture in 4 cardinal and 4 oblique directions. The projection screen was approximately 12-15 centimeters from the subject's eyes, which led to an ocular field-of-view of approximately 22-28 degrees of visual angle. C) The PRF mapping paradigm yielded individual retinotopic field maps in and around the calcarine sulcus. These maps were used for a very fine-grained delineation of the V1/V2 border. Here, we show the retinotopic maps of three representative subjects. D) An ocular dominance mapping paradigm with both monocular and binocular stimulus elements was presented to each subject for 8-10 runs. Before half of the runs, subjects were instructed to report changes in the color of the binocularly presented fixation dot; before the other half of the runs, subjects were instructed to report the rotation direction of the checkerboard. E) The ocular dominance mapping paradigm yielded a contrast map (left > right eye stimulation). Note how the z-values alternate between large positive and negative values across the cortical surface. The polar angle maps, as well as the ocularity maps, can be inspected interactively at http://aeneas.labs.vu.nl/odc/

Univariate contrasts on the surface
We first used population receptive field mapping (PRF; Dumoulin and Wandell, 2008 ) to delineate V1 on the cortical surface of individual subjects. Fig. 2 C illustrates the retinotopic maps and V1/V2 segmentations of three representative subjects. Fig. 2 E shows the z-statistics of the contrast 'left eye stimulation > right eye stimulation' in the same three subjects. Qualitative inspection suggests small but significant clusters of ocular specificity that alternate between left- and right-eye sensitivity at a fine spatial scale.
Notably, visual inspection of Fig. 2 A suggests that, at least in some subjects, high z-values are mostly located in the dorsal bank of V1. Quantitative analysis confirms that, on average, 65% of the significant vertices at the p < 0.05 threshold are located in the (retinotopically defined) dorsal bank of V1 (median 59%, range 52% to 94%; different from 50% with t(6) = 2.9, p = 0.028). This overrepresentation of the dorsal bank is most likely driven by the fact that the ventral bank of V1 on average lies slightly more rostrally, placing it further away from the surface receiver coils and making it less well covered by our field-of-view.
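The dorsal-bank quantification above is a one-sample t-test of per-subject dorsal fractions against the 50% chance level. As an illustrative sketch (the fractions below are hypothetical placeholders, not the study's data):

```python
import numpy as np
from scipy import stats

# Hypothetical per-subject fractions of significant vertices located in
# the dorsal bank of V1 (placeholders, NOT the study's values)
fractions = np.array([0.52, 0.55, 0.59, 0.62, 0.70, 0.85, 0.94])

# One-sample t-test of the fractions against the 50% chance level
t, p = stats.ttest_1samp(fractions, 0.5)
```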

Reproducibility of ocular dominance maps
In line with earlier work on fMRI of ODCs ( Cheng et al., 2001; Yacoub et al., 2007 ), we first tested whether ocularity maps were reproducible over time. For six subjects, we split the data into the first and second part of the session and estimated activation patterns for these two halves separately. We correlated the activation patterns of these two halves on the surface, within the V1 mask that was defined by the retinotopic mapping paradigm. For this analysis, the activity patterns were averaged across cortical depth. Of the six subjects, five showed robust correlations in their activation pattern. These within-session activity pattern correlations ranged from r = 0.41 to 0.80 (median 0.59; see Fig. 3 A). The sixth subject (subject 5) showed no robust within-session correlation ( r = -0.02 for the left and r = 0.02 for the right hemisphere) and was removed from all subsequent analyses. Another subject's (subject 1) irregular folding of the left hemisphere placed a large part of the calcarine sulcus outside the scan protocol (see supplementary materials S3 for details); this hemisphere was also left out of all subsequent analyses. For the five subjects that showed highly significant correlations, the binarized ocular preference of the surface vertices ('left' or 'right') was consistent across the two session halves for 60-79% of the vertices (median 70%).
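The split-half reliability analysis amounts to correlating two vertex-wise z-maps and computing the sign agreement of the binarized eye preference. A minimal sketch on synthetic data (not the study's analysis code):

```python
import numpy as np

def split_half_reliability(z_first, z_second):
    """Correlate vertex-wise ocularity (z) maps from two session halves
    and compute the fraction of vertices whose binarized eye preference
    ('left' vs 'right', i.e. the sign of z) is consistent."""
    r = np.corrcoef(z_first, z_second)[0, 1]
    consistent = float(np.mean(np.sign(z_first) == np.sign(z_second)))
    return r, consistent

# Synthetic example: a reproducible pattern plus independent noise per half
rng = np.random.default_rng(0)
pattern = rng.standard_normal(5000)
half1 = pattern + rng.standard_normal(5000)
half2 = pattern + rng.standard_normal(5000)
r, consistent = split_half_reliability(half1, half2)
```

With equal signal and noise variance, the expected correlation is around 0.5 and sign agreement around two-thirds, illustrating why a map can be highly reliable even when the binarized preference of individual vertices is not perfectly stable.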
For a seventh subject, we acquired data in two separate sessions in Amsterdam, as well as a third session in Beijing, China, using a different 7T MRI system. Across the two sessions in Amsterdam, activation maps in V1 correlated with r = 0.36 in both hemispheres. The correlation between Amsterdam and Beijing was considerably smaller but significant, see supplementary materials S4 for details.

ODC properties of ocular-preference maps
Having verified the consistency of these ocular dominance patterns, we turned to more detailed analyses of the spatial properties of these ocular dominance patterns to positively identify these patterns as ODCs.
First, we know that ODCs are largely constrained to V1, stopping rather abruptly at the V1/V2 boundary ( Adams and Horton, 2009; Adams et al., 2007 ). Indeed, the patterns we find are restricted to the area of V1. The percentage of significant z-values at p < 0.05 (two-sided, absolute z-values higher than 1.96) was higher than chance level in V1 but not in V2, for 5 out of 6 subject-sessions in the left hemisphere (average of 10.7% vs 5.2%; t(6) = 2.00, p = 0.10) and for 6 out of 7 subject-sessions in right V1 (average of 10.4% vs 5.3%; t(7) = 1.55, p = 0.17; see Fig. 4 A).

Second, ocular dominance columns are known to occur at a specific range of spatial frequencies. Post-mortem work estimates that ODCs are about 730-995 micrometers wide ( Adams and Horton, 2009; Adams et al., 2007 ). Yacoub et al. (2007) estimated a column width of 1070 micrometers using GRE and spin-echo fMRI techniques and anisotropic voxels. To compare our fMRI results to earlier work, we convolved the left/right-difference z-maps within V1 on a flattened surface with a set of Gabor patches at different spatial frequencies, ranging from 1 to 10 mm/cycle, each at 16 different orientations (see Fig. 4 B). This yielded, for every location on the reconstructed surface, a distribution of power across spatial frequencies.

Fig. 3. Robustness across runs and sessions. A) Correlations between ocularity maps of the first and second half of the mapping sessions. All subjects except subject 5 show consistent ocularity patterns across the two halves of their session. B) One subject was scanned during two separate sessions in Amsterdam. Across activation maps from these two sessions, we found slightly lower, but still highly significant, correlations between the ocularity maps.

Fig. 4. Known properties of ODCs that replicate in UHF-fMRI ODC mapping. A) In V1, but not V2, the distribution of z-values is robustly different from chance. This is consistent with earlier work that has shown that V1 is considerably more sensitive to ocularity than other areas (although there are monocular "stripes" in V2). B) Estimation of spatial frequencies on the surface: we used a filter bank of Gabor patches and convolved it with ocularity maps on a flattened cortical surface, to estimate a map of dominant spatial frequencies across V1. C) The dominant spatial frequency is constant across cortical depth. However, other spatial frequencies have more power near the pial surface. D) The total amount of power in the frequency band corresponding to wavelengths of 2-4 mm peaks at medium cortical depth. Different marker colors correspond to different subjects; the shaded area corresponds to the 67% confidence interval (S.E.M.). E) The individual retinotopic maps of V1 allow for the analysis of the ocularity maps in the visual field. Consistent with earlier work, when we combine data from both eyes, we find an overrepresentation of the ipsilateral eye at the approximate visual location of the blind spot (ovals).
All subjects showed a spectral power distribution with a peak around a wavelength of approximately 2.7 mm/cycle ( Fig. 4 C). This is consistent with a column width of approximately 2.7 / 2 = 1.35 mm, which is wider than the approximately 1 mm width found in post-mortem work. We note that the BOLD transfer function, blurring due to slow k-space sampling, and post-hoc resampling to the cortical surface will all shift the power spectrum towards longer wavelengths ( Chaimow et al., 2011, 2018a, 2018b; Shmuel et al., 2007 ). The dominant wavelength of 2.7 mm/cycle was constant across cortical depth (F(5, 60) = 0.74, p = 0.60).
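The filter-bank estimate of the dominant ODC wavelength can be sketched as follows. This is a simplified illustration on a synthetic striped map, not the study's analysis code; the kernel size, pixel pitch, envelope width, and wavelength grid are arbitrary choices:

```python
import numpy as np

def gabor_kernel(wavelength_mm, theta, px_mm=0.5, size=21):
    """L2-normalized complex Gabor filter; wavelength in mm, with the
    envelope sigma set to half the wavelength (an arbitrary choice)."""
    sigma = wavelength_mm / 2
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1] * px_mm
    xr = x * np.cos(theta) + y * np.sin(theta)
    k = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.exp(2j * np.pi * xr / wavelength_mm)
    return k / np.linalg.norm(k)

def dominant_wavelength(zmap, wavelengths, n_orient=16, px_mm=0.5):
    """Wavelength whose Gabor bank captures the most power in `zmap`,
    pooled over orientations and locations (FFT-based convolution)."""
    power = []
    for wl in wavelengths:
        p = 0.0
        for theta in np.linspace(0, np.pi, n_orient, endpoint=False):
            k = gabor_kernel(wl, theta, px_mm=px_mm)
            s = (zmap.shape[0] + k.shape[0] - 1, zmap.shape[1] + k.shape[1] - 1)
            conv = np.fft.ifft2(np.fft.fft2(zmap, s) * np.fft.fft2(k, s))
            p += float(np.sum(np.abs(conv) ** 2))
        power.append(p)
    return wavelengths[int(np.argmax(power))]

# Synthetic "ocularity map": stripes with a 2.8 mm wavelength, 0.5 mm/pixel
px_mm = 0.5
xx = np.arange(80) * px_mm
zmap = np.tile(np.sin(2 * np.pi * xx / 2.8), (80, 1))
wl = dominant_wavelength(zmap, np.array([1.4, 2.0, 2.8, 4.0, 5.6]), px_mm=px_mm)
column_width_mm = wl / 2   # one column is half of a left+right cycle
```

On real data, `zmap` would be the left-minus-right z-map sampled on a flattened patch of V1, and the blurring factors discussed above would shift the recovered wavelength upward.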
Third, as ocular dominance preference is inherited from the bottom-up afferents from the lateral geniculate nucleus (LGN) in the thalamus, the ODC pattern should be strongest in the middle layers of V1, which receive this afferent input. Indeed, the power at the dominant wavelength shows a non-monotonic curve, peaking not at the pial surface (where BOLD amplitude is greatest) but near the middle layers. When we performed Bayesian hierarchical fits of both a linear and a quadratic model (see methods), the quadratic model explained the relationship between cortical depth and total spectral power better than the linear model (Watanabe-Akaike information criterion (wAIC) of −398.44 for the quadratic model vs −324.53 for the linear model). The peak was estimated to lie at approximately 40% cortical depth (95% highest posterior density, HPD: [5%, 75%]), where the pial surface corresponds to 0% and the gray/white matter border to 100%.
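The linear-versus-quadratic comparison can be illustrated with a simplified stand-in: ordinary least-squares polynomial fits compared by AIC instead of Bayesian hierarchical fits compared by wAIC, with the quadratic peak read off analytically as −b/(2a). All data below are synthetic:

```python
import numpy as np

def fit_poly_aic(depth, power, degree):
    """Least-squares polynomial fit plus AIC under Gaussian residuals
    (a point-estimate stand-in for hierarchical Bayesian fits + wAIC)."""
    coefs = np.polyfit(depth, power, degree)
    resid = power - np.polyval(coefs, depth)
    n, k = len(power), degree + 2            # coefficients + noise variance
    sigma2 = float(np.mean(resid ** 2))
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return coefs, 2 * k - 2 * loglik         # lower AIC = better model

def quadratic_peak(coefs):
    """The peak of y = a*x**2 + b*x + c lies at x = -b / (2a)."""
    a, b, _ = coefs
    return -b / (2 * a)

# Synthetic laminar profile peaking at 40% depth (0 = pial, 1 = WM border)
rng = np.random.default_rng(1)
depth = np.tile(np.linspace(0, 1, 6), 10)    # 6 depths x 10 "subjects"
power = 1 - 2 * (depth - 0.4) ** 2 + rng.normal(0, 0.02, depth.size)
coefs_q, aic_q = fit_poly_aic(depth, power, 2)
coefs_l, aic_l = fit_poly_aic(depth, power, 1)
```

For this profile the quadratic model wins the AIC comparison and recovers a peak near 40% depth; the hierarchical Bayesian version additionally propagates between-subject uncertainty into the HPD interval of the peak.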
Fourth, we focused on the blind spot: a small part of the visual field of each eye that is occluded by the optic disc (the start of the optic nerve, where the retina contains no photoreceptors). These blind spots, one for each eye, should be detectable in our ODC patterns as a patch of monocular preference for the ipsilateral eye at about 15 degrees eccentricity, just below the horizontal meridian of the visual field ( Adams and Horton, 2009; Adams et al., 2007 ). To visualize this, we mapped the ODC patterns into visual space using the pRF estimates of our retinotopic mapping experiment ( Fig. 2 B). This indeed revealed areas of dominance of the ipsilateral eye at a location consistent with the blind spot (although somewhat lower in the visual field; peak at x = ±15.1 and y = -7.9 degrees of visual angle; Fig. 4 E).
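Projecting ocularity into the visual field amounts to averaging each vertex's ocularity value at its pRF position. A minimal sketch with synthetic data (the bin size and the simulated blind-spot location are illustrative choices):

```python
import numpy as np

def ocularity_field_map(prf_x, prf_y, ocularity, extent=20, bins=40):
    """Average ocularity (e.g., a left-vs-right z-value) within square
    visual-field bins, using each vertex's pRF center in degrees."""
    edges = np.linspace(-extent, extent, bins + 1)
    zsum, _, _ = np.histogram2d(prf_x, prf_y, bins=[edges, edges], weights=ocularity)
    count, _, _ = np.histogram2d(prf_x, prf_y, bins=[edges, edges])
    with np.errstate(divide="ignore", invalid="ignore"):
        return zsum / count, edges   # NaN where no vertex falls in a bin

# Synthetic right-hemifield data with an ipsilateral-eye dominance patch
# near (15, -8) degrees, roughly where a blind spot would appear
rng = np.random.default_rng(2)
x = rng.uniform(0, 20, 20000)
y = rng.uniform(-20, 20, 20000)
z = rng.normal(0, 0.5, x.size)
z[(x - 15) ** 2 + (y + 8) ** 2 < 4] += 2.0   # add the dominance patch
field, edges = ocularity_field_map(x, y, z)
```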
We contend that this collection of findings consistently identifies our ocular dominance patterns as human ODCs. We investigated a further set of possible predictions of ODC properties derived from post-mortem studies and animal studies, but technical constraints such as limited spatial resolution and the distortions induced by sampling volumetric data to the surface limit the testability of these predictions (see supplemental materials S2).

The effect of attention on the BOLD response across cortical depth
After having established that the ocularity maps we found are robust and consistently follow predicted ODC properties, we turned to the question of how ODCs in V1 are modulated by spatial attention across cortical depth. Participants were asked to report on either a monocularly presented part of the stimulus array (the rotating, flickering checkerboard) or the binocularly presented part of the stimulus array (the fixation dot), in separate experimental runs featuring identical visual stimulation.
We selected visually responsive voxels in V1 (absolute z-value of the left + right stimulation > baseline contrast larger than 2.3; approximately 3000-9000 voxels per subject) and tested differences in BOLD response amplitude as a function of stimulation and attentional condition. Then, to prevent double dipping, we assigned voxels to be left-eye or right-eye preferring using leave-two-runs-out cross-validation (one run for each attentional condition). Given these ocular preferences, we could then designate voxels as either "stimulated" (e.g., left-eye stimulation of a left-preferring column) or "unstimulated" (e.g., right-eye stimulation of a left-preferring column). Finally, voxels were assigned to 5 exclusive regions of interest based on cortical depth, as defined by the equivolumetric algorithm of CBS-tools. Fig. 5 A shows that, as should be expected when using GRE laminar fMRI ( Markuerkiaga et al., 2016; Polimeni et al., 2018; Siero et al., 2011 ), the BOLD response is larger at more superficial cortical depths, in all conditions (main effect of cortical depth on BOLD response: F(4, 20) = 22.9, p < 0.001 for left V1; F(4, 24) = 17.1, p < 0.001 for right V1).
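The cross-validated assignment of ocular preference can be sketched as follows: training-run responses determine each voxel's preferred eye, and held-out data are then summarized as "stimulated" versus "unstimulated". All names and data below are illustrative, not the study's code:

```python
import numpy as np

def assign_eye_preference(train_left, train_right):
    """Assign each voxel its preferred eye using training runs only,
    so that held-out test runs remain untouched (no double dipping)."""
    return np.where(train_left.mean(axis=0) > train_right.mean(axis=0), "L", "R")

def stimulated_vs_unstimulated(pref, test_left, test_right):
    """Mean test-run response when the preferred (stimulated) versus
    non-preferred (unstimulated) eye receives the checkerboard."""
    stim = np.where(pref == "L", test_left, test_right)
    unstim = np.where(pref == "L", test_right, test_left)
    return float(stim.mean()), float(unstim.mean())

# Synthetic data: 100 voxels, half preferring each eye, 8 training runs
rng = np.random.default_rng(3)
true_pref = np.repeat(["L", "R"], 50)
gain = np.where(true_pref == "L", 1.0, 0.2)      # response to the left eye
train_left = gain + rng.normal(0, 0.1, (8, 100))
train_right = (1.2 - gain) + rng.normal(0, 0.1, (8, 100))
pref = assign_eye_preference(train_left, train_right)
stim, unstim = stimulated_vs_unstimulated(pref, gain, 1.2 - gain)
```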
In contrast, the main effect of attention was only marginally significant for left V1 (F(1, 5) = 14.2, p = 0.013) and non-significant for right V1 (F(1, 6) = 1.5, p = 0.26). However, we did find a highly significant interaction between attention and cortical depth in both left (F(4, 20) = 7.29, p = 0.0009) and right V1 (F(4, 24) = 15.4, p < 0.001). This indicates that although the overall effect of attention on the BOLD response in V1 is relatively weak, it is much stronger for voxels near the CSF (see Fig. 5 B).
Interestingly, there was no significant interaction between stimulation condition and attention (F(1, 5) = 0.08, p = 0.78 for left V1; F(1, 6) = 0.23, p = 0.65 for right V1), nor a three-way interaction between stimulation, attention and cortical depth (F(4, 20) = 0.08; p = 0.99 for left V1; F(4, 24) = 0.26, p = 0.90 for right V1). In other words: the effect of attention on BOLD activity in ODCs was independent of whether the preferred eye of an ODC was stimulated or not. This is consistent with the notion that participants are not shifting their attention towards a specific eye.
To control for the effect of draining veins, whereby the BOLD signal at more superficial depths is contaminated by the signal of deeper layers, we used the linear laminar deconvolution approach developed by Markuerkiaga and colleagues ( Markuerkiaga et al., 2016; Marquardt et al., 2018 ). This deconvolution analysis also confirmed that the relevance of the monocularly versus binocularly presented stimulus has an effect exclusively at superficial cortical depth (at p < 0.05), whereas the effect of stimulation of the preferred versus non-preferred eye exists across all cortical depths, with the largest effect in the middle layers ( Fig. 5 C and D). This confirms a dissociation of neural activity related to thalamocortical input versus cortico-cortical feedback connections across cortical depth.
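The logic of linear laminar deconvolution can be illustrated with a toy mixing matrix. Note that the actual depth-dependent weights in Markuerkiaga et al. (2016) come from a vascular model; the constant `leak` below is only a placeholder:

```python
import numpy as np

def drainage_matrix(n_depths, leak=0.6):
    """Toy mixing matrix: the observed signal at each depth is the local
    response plus a fraction (`leak`) of the signal from every deeper
    depth, mimicking blood draining outward toward the pial surface.
    Depth index 0 = white-matter border, n_depths - 1 = pial surface."""
    A = np.eye(n_depths)
    for i in range(n_depths):
        A[i, :i] = leak
    return A

def laminar_deconvolve(observed, A):
    """Invert the linear mixing model observed = A @ neural."""
    return np.linalg.solve(A, observed)

# A neural profile peaking at middle depth...
neural = np.array([0.4, 0.7, 1.0, 0.7, 0.4])
A = drainage_matrix(5)
observed = A @ neural        # ...appears to ramp up toward the pial surface
recovered = laminar_deconvolve(observed, A)
```

This captures why raw GRE profiles increase monotonically toward the pial surface (as in Fig. 5 A) while the deconvolved profiles are much flatter and can peak at middle depth (as in Fig. 5 C and D).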

Decoding eye stimulation from moment-to-moment
Univariate results such as the ones reported above may overlook patterns in the data that multivariate decoding analyses do capture (multivariate pattern analysis MVPA; Haxby et al., 2001 ;Haynes and Rees, 2006 ;Kamitani and Tong, 2005 ;Mur et al., 2009 ;Norman et al., 2006 ;O'Toole et al., 2007 ;Tong and Pratte, 2012 ). Recent work has shown that MVPA-techniques offer a powerful alternative to univariate analysis for UHF-fMRI data and that they can dissociate laminar-specific effects in granular and supragranular layers ( Vizioli et al., 2020 ). We therefore conducted an additional layer-specific decoding analysis to investigate the impact of attention on ODC patterns.
We constructed a Bayesian encoding model for monocular population coding, based on the model proposed by van Bergen and colleagues ( van Bergen et al., 2015; van Bergen and Jehee, 2019 ). It assumes two independent neural populations coding for input of the left and right eye; voxels' activity is then modeled as a weighted sum of these two populations, and their (co)variance as a multivariate normal distribution, consisting of a weighted sum of independent voxel noise, shared voxel noise, and binocular population noise. The model was fit and evaluated per sampled time point, sampled at different cortical depths and cross-validated across runs (see methods section for details). Since the data were collected using a 3D EPI with a relatively long TR (4 s) and stimulus blocks were at least 3-6 volumes long, we chose to classify individual volumes, rather than deconvolving the signal time-course in any way.

Fig. 5. A) The raw percent signal change increases towards the pial surface in all four conditions: with attention on the checkerboard, for both stimulated and unstimulated voxels, as well as with attention on the fixation dot, for both stimulated and unstimulated voxels. Error bars are standard errors of the mean (SEM), bootstrapped across subjects. B) The size of the effects of the attentional and stimulation manipulations also increases towards the pial surface. However, stimulating the preferred eye led to increased BOLD activation across all cortical depths (strong main effect), whereas the effect of attention to the monocular stimulus only reaches (marginal) significance at the most superficial cortical depth. Still, the interaction between cortical depth and the attentional effect is highly significant at p < 0.001. C) The BOLD response was deconvolved across cortical depth using the model proposed by Markuerkiaga et al. (2018). This analysis showed a much flatter laminar profile of inferred activity. D) A plot of the laminar deconvolved responses. This analysis confirms the hypothesis that thalamo-cortical input processes and cortico-cortical feedback processes lead to different activity profiles across cortical depths. *** p < 0.001, * p < 0.05, † p < 0.1.

Fig. 6. An inverted encoding model of neural ocular dominance populations can reliably decode the ocular stimulation condition on a moment-to-moment basis. A) Traces of the log10 Bayes factor odds between left and right eye stimulation for all runs of a single subject. Note how the original stimulus design can be read out from decoded, previously unseen data from V1 using the activation patterns of single fMRI volumes. B) Mean Bayes factor trace across subjects, split across different cortical depths. Note how the model is more certain about its predictions for BOLD data that was sampled closer to the pial surface. C) Accuracy of the decoder for different cortical depths and attentional conditions. Error bars indicate bootstrapped standard errors of the mean (SEM). The decoder performs best at middle cortical depths. There are no significant differences between the two task conditions at any cortical depth. There is also no significant main effect or interaction effect related to the task conditions. n.s. = not significant, † = p < 0.1.
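A toy version of the volume-by-volume Bayesian readout: compute the likelihood of a single volume under condition-specific mean patterns with a shared noise covariance, and report the log10 Bayes factor. In the study, the covariance is itself estimated from voxel-wise, shared, and population-level noise components; here it is simply assumed, and all data are synthetic:

```python
import numpy as np
from scipy.stats import multivariate_normal

def log10_bayes_factor(volume, mu_left, mu_right, cov):
    """log10 odds of 'left eye stimulated' vs 'right eye stimulated' for
    one fMRI volume, under a multivariate-normal encoding model with
    condition-specific mean patterns and a shared noise covariance."""
    ll_left = multivariate_normal.logpdf(volume, mu_left, cov)
    ll_right = multivariate_normal.logpdf(volume, mu_right, cov)
    return (ll_left - ll_right) / np.log(10)

# Synthetic example: 50 voxels, random eye-preference patterns, and a
# covariance mixing independent voxel noise with shared noise
rng = np.random.default_rng(4)
n_vox = 50
mu_left = rng.normal(0, 1, n_vox)
mu_right = rng.normal(0, 1, n_vox)
cov = 0.5 * np.eye(n_vox) + 0.5 * np.ones((n_vox, n_vox))
volume = mu_left + rng.multivariate_normal(np.zeros(n_vox), cov)
bf = log10_bayes_factor(volume, mu_left, mu_right, cov)   # positive -> "left"
```

Cross-validation enters by estimating `mu_left`, `mu_right`, and `cov` on training runs only, then evaluating `bf` on each held-out volume.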

Stimulus eye-of-origin decoding was reliably above chance in all subjects (see Fig. 6 A for traces of the log 10 Bayes Factor for all the runs of an individual subject), with volume-to-volume accuracies ranging from 66% to 84% for left V1 and from 68% to 91% for right V1.
Bayesian hierarchical model comparison showed, for both left and right V1, that the relationship between cortical depth and decoding accuracy is better explained by a quadratic model than by a linear model (wAIC of −321.4 vs −257.54 for left V1 and −291.12 vs −217.78 for right V1). Furthermore, the modelling suggests that the highest decoding accuracy is achieved using BOLD activity at approximately 27% cortical depth for left V1 (albeit with large uncertainty; 95% HPD: [-193%, 315%]) and approximately 45% for right V1 (95% HPD: [24%, 107%]). This cortical depth roughly corresponds to the layers at which our ODC patterns are strongest ( Fig. 4 D). It is at these layers that thalamocortical monocular fibers enter V1 and where ocular neural activity patterns as measured by post-mortem techniques are most prominent ( Adams and Horton, 2009; Adams et al., 2007; Dougherty et al., 2019 ).
There was no effect of attention condition on the accuracy of the decoder (F(1, 5) = 0.063, p = 0.81 for left V1; F(1, 6) = 0.86, p = 0.38 for right V1), nor an interaction between attention condition and cortical depth (F(5, 25) = 1.97, p = 0.12 for left V1; F(5, 30) = 1.33, p = 0.28 for right V1). An apparent trend of increased decoder accuracy at the penultimate cortical depth for binocular attention was not statistically significant ( Fig. 6 C; t(6) = 2.10, p = 0.07 when both hemispheres are combined; also not significant for individual hemispheres). In sum, the pattern of results from the decoding analysis was consistent with the univariate results presented above: thalamocortical input is most strongly represented at deep and middle cortical depths, and spatial attention does not selectively gate information from a specific eye.

Discussion
We applied 7 Tesla functional MRI to map ocular dominance columns (ODCs) in healthy human subjects, while participants focused their attention on either a monocularly presented stimulus or a binocularly presented fixation mark. Crucially, whereas earlier studies that mapped ODCs employed highly anisotropic "pencil voxels", with slices 5.5-7.3 times thicker than the in-plane resolution ( Cheng et al., 2001; Dechent and Frahm, 2000; Goodyear and Menon, 2001; Menon et al., 1997; Yacoub et al., 2007 ), we mapped ODCs using isotropic voxels for the first time. This allowed us to study monocular BOLD activity in V1 along three dimensions: not only the 2D columnar location on the cortical surface, but also cortical depth ( Kuehn and Sereno, 2018 ). The relationship between cortical depth and neural activity as measured by BOLD fMRI has the potential to be highly informative about brain function, since feedforward and feedback processes are believed to reside at different cortical depths ( De Martino et al., 2018; Lawrence et al., 2019; Stephan et al., 2019 ). The combination of columnar and laminar sensitivity allowed us to measure the effect of thalamocortical input without manipulating stimulus properties, which are possibly confounded with other higher-order perceptual processes. The laminar patterns in our data align remarkably well with post-mortem tracing studies and electrophysiological recordings in rodents and non-human primates.
First, the effect of stimulating the preferred versus non-preferred eye, after correcting for draining veins, was most robust at deep and middle cortical depths, consistent with thalamocortical input into infragranular layer 6 and granular layer 4. Concordantly, the depth at which eye-of-origin decoding was most accurate was consistent with the granular layer, and model comparisons showed that decoding performance across depth is best described by an inverted-U pattern. Considering the effect of draining veins, whereby BOLD signals leak outwards towards the pial surface ( Heinzle et al., 2016 ; Polimeni et al., 2018 ), we infer that the decoder was mostly relying on BOLD activity originating from deep and middle cortical depths. Interestingly, it is well-known that thalamocortical projections into V1 are most numerous in the granular layer 4C, and this layer has so far received most attention as the thalamocortical input layer of primary sensory cortex in laminar fMRI studies ( Koopmans et al., 2010 ; Lawrence et al., 2019 ; Stephan et al., 2019 ). Less well-known, however, is that recent work in rodents has shown that neural activity in infragranular layer 6 also represents direct input from the thalamus, with transmission latencies that are indistinguishable from those of layer 4C ( Constantinople and Bruno, 2013 ). Furthermore, additional work in rodents has shown that thalamocortical projections in layer 6 tightly follow the columnar structure of layer 4 in barrel cortex ( Crandall et al., 2017 ). Intriguingly, Tootell and colleagues ( Tootell et al., 1988 ) used radioactively labelled 2-deoxyglucose (DG) uptake in post-mortem studies to visualize ODCs in macaques, and showed that the largest DG uptake, after layer 4C, was found in layer 6.
Similarly, in a recent study on binocular modulation of monocular neurons in V1 of macaques, Dougherty and colleagues ( Dougherty et al., 2019 ) found the largest proportion of monocular neurons in layer 4 and "layer 5/6" (their supplemental Fig. S1, panel D). Together, these findings suggest an important refinement of the perhaps overly simplistic model common in the laminar UHF-fMRI literature, in which thalamocortical feedforward projections are thought to terminate exclusively in the granular layer.
The effects of spatial attention on the laminar activation profile also fit earlier computational, anatomical, and electrophysiological work. In our data, shifts of attention to the monocularly presented stimulus features exclusively modulated BOLD responses near the pial surface. This is consistent with neurocomputational models of canonical microcircuitry where higher-order areas send feedback signals to supragranular layers of upstream cortical regions ( Bastos et al., 2012 ; De Martino et al., 2018 ; Dumoulin et al., 2018 ; Stephan et al., 2019 ). It is also consistent with tracing studies that suggest that the majority of neurons in V2 that project to V1 project to supragranular layer 1 ( Anderson and Martin, 2009 ; Rockland and Virga, 1989 ). Furthermore, other cortical depth-resolved fMRI studies have shown that the BOLD signal at superficial cortical depth is modulated by feedback processes in primary visual (e.g., Muckli et al., 2015 ; Lawrence et al., 2019 ) and auditory cortex. Finally, recent electrophysiological studies on attention and visual working memory in non-human primates by Van Kerkoerle and colleagues ( Van Kerkoerle et al., 2017 ) and Denfield and colleagues ( Denfield et al., 2018 ) showed that attention correlates with increased top-down inputs to infragranular layers and especially supragranular layers. An interesting question is why electrophysiological studies found effects of attention in infragranular layers, whereas our study (as well as others, e.g., Muckli et al., 2015 ; Lawrence et al., 2019 ) only finds attentional effects at a cortical depth consistent with supragranular layers. Possibly, this difference in results is due to a smaller change of neural firing in infragranular layers as compared to supragranular layers: in fact, Denfield and colleagues, in their electrophysiological work, found a significant effect in infragranular layers in only one out of two attentional conditions ( Denfield et al., 2018 ).
We speculate that activity in the infragranular layers mostly reflects intracolumnar processing, in which neural activation is relayed from supragranular to infragranular layers ( Bastos et al., 2012 ; Stephan et al., 2019 ), and that laminar fMRI is less sensitive to these short-range projections than to the longer-range corticocortical feedback and thalamocortical feedforward projections. However, we cannot exclude that the lack of an attentional effect at deeper layers is (partly) due to an inherently smaller GRE BOLD response there.
We did not find an interaction between the monocular stimulation of the preferred eye and task relevance of the monocular stimulus. This implies that although attentional processes influence activity in the supragranular layers of V1, they do not impinge on the ocular dominance activity pattern as measured by BOLD fMRI. That is, a shift of spatial attention, when directed to a monocularly presented stimulus feature, does not gate the lateral geniculate-cortical inputs to V1 based on their ocularity (eye-of-origin). Earlier work in the context of binocular rivalry has, however, shown that attentional processes can modulate monocular input streams ( Zhang et al., 2011 , 2012 ). One possible explanation for this apparent discrepancy in results pertains to the experimental paradigm used: possibly, attentional gating of the two eyes only happens when these send highly conflicting information, such as in standard binocular rivalry ( Leopold and Logothetis, 1996 ; Levelt, 1965 ; Tong et al., 2006 ) and continuous flash suppression paradigms ( Tsuchiya and Koch, 2005 ). In the paradigm used here, one eye was presented with a flickering checkerboard, whereas the other eye, at the same retinotopic location, was presented with a black background, as if this part of the stimulus was occluded in one eye. Although this difference between the two input streams does indeed represent a conflict, this conflict between monocular signals is much less salient than during full-fledged rivalry paradigms, where stimuli are purposely made highly incongruent, such as equal-contrast, orthogonal gratings of different colors ( Zhang et al., 2011 , 2012 ). A second potential explanation for the lack of interaction between the input and feedback manipulations in our data is that attentional gating of rivaling stimuli might not be resolved at the level of V1, but already earlier in the visual processing stream, at the stage of the lateral geniculate nucleus (LGN).
Indeed, higher-order attentional effects and perceptual states can influence neural processing already at the level of the LGN ( O'Connor et al., 2002 ; Wunderlich et al., 2005 ). Future work should elucidate the precise interplay between attention and ocular processing by measuring ODCs in V1 across cortical depth using UHF-fMRI during full-fledged binocular rivalry, while simultaneously imaging ocular dominance layers in the LGN ( Zhang et al., 2010 ).
A recent 7T fMRI study by Lawrence et al. (2019) also investigated the laminar BOLD activation patterns of thalamocortical bottom-up and corticocortical feedback processes in V1. Lawrence and colleagues did not use columnar input structures, but instead used different stimulus contrast levels to experimentally manipulate thalamocortical input. The feedback manipulation was also different, as a feature-based rather than a spatial attention manipulation was used: subjects had to report on one of two orthogonally oriented features of the same stimulus at the same retinotopic location. To correct for the increase of BOLD activity towards the pial surface due to draining veins, Lawrence et al. normalized (z-scored) the estimates of raw BOLD activity for their main results. This approach is comparable to our decoding approach, where the performance of the decoder is largely determined by the ratio between task-related activity and the variance of the BOLD signal. Indeed, Lawrence and colleagues found the largest effect of their input manipulation at middle cortical depth, as we did here. Their input manipulation also showed an inverted-U shape across depth, consistent with our decoding results. However, Lawrence and colleagues interpreted this inverted-U pattern as evidence for thalamocortical input exclusively into the middle, granular layer of V1. We propose that, since there is a significant effect of stimulus contrast near the white/gray matter border as well (both for standardized and raw data; their Fig. 3, supplement 5), it cannot be excluded that this inverted-U shape is the result of thalamocortical inputs to both infragranular and granular layers. This speculation highlights the importance of modelling the draining vein problem and/or taking it into account when interpreting raw BOLD effects across cortical depth ( Heinzle et al., 2016 ; Marquardt et al., 2018 ; Polimeni et al., 2018 ). For the feature-based attentional manipulation, Lawrence et al.
found the numerically largest (normalized) effect of attention at more superficial cortical depth, as we did here. They found a highly significant main effect of attention across all cortical depths, and the interaction between cortical depth and effect size was non-significant (whereas we found a significant interaction effect at p < 0.001), suggesting that in their experiment, attention modulated all cortical layers equally. One explanation for this apparent discrepancy is that the resolution of both our anatomical (0.26 vs. 0.51 mm³) and functional data (0.34 vs. 0.51 mm³) was about 1.5-2 times higher than that of the data presented in Lawrence et al., which could have reduced BOLD signal blurring across laminae due to the increased fidelity of cortical surface reconstructions and reduced partial volume effects. Another explanation pertains to the properties of the task: in the experimental paradigm used by Lawrence et al., the orientation that needed to be attended by the subjects was explicitly relevant to the task. This is different from our task, where the relevance of the ocularity of the input signals is only implicit and of a different nature than the orientation of a stimulus. The relevance of orientation in the task of Lawrence and colleagues could have recruited additional processing of incoming signals, such as cross-orientation suppression. This processing could have happened via intra-columnar connections from the supra- to the infragranular layers. Future work should investigate different forms of attention, like spatial and feature-based attention, using UHF-fMRI in the same scanning session, to determine whether these subtle differences in the laminar location of attentional effects are indeed due to fundamentally different mechanisms.
A related limitation of our study is that the retinotopic area that needed to be attended during the binocular attention condition (fixation color) was smaller than during the monocular attention condition (checkerboard rotation direction), and that the task requirements were slightly different. This means that the attentional difference between the two conditions is both spatial and ocular. It is unclear how this confound could have driven any spurious laminar effects, but in future work it would be interesting to compare attentional conditions that are identical in task load and in the size of the area that needs to be attended (e.g., in a binocular version of the paradigm used in Liu et al., 2020 ).
Another important limitation of our study is that we used GRE-fMRI rather than alternative fMRI techniques like GRASE, VASO ( Huber et al., 2014 ), or balanced steady-state free precession (bSSFP; Lee et al., 2008 ; Miller, 2012 ) that have potentially higher laminar specificity. Therefore, in our study we had to resort to a laminar deconvolution model ( Markuerkiaga et al., 2016 ) to account for draining veins, whereas techniques like GRASE, VASO, and bSSFP avoid the draining-vein problem during measurement, since they are less sensitive to signal arising from these veins. However, this increased specificity comes at a cost of increased SAR, reduced spatial and temporal resolution, as well as a reduction in sensitivity. Very recent, preliminary work has compared GRE-fMRI, GRASE, and VASO in their ability to visualize ocular dominance columns ( Haenelt et al., 2020 ). Interestingly, Haenelt and coworkers showed that the estimated spatial frequency across the three techniques was highly similar (~1.3 mm, in line with this study). And although GRASE and VASO seemed to be more specific in their laminar response profiles, this came at a cost of 2-8 times lower z-values for the same amount of data. Liu et al. showed a similar tradeoff when they compared the influence of attention on contrast-response functions using both GRE-fMRI and bSSFP. Balanced SSFP results were more in line with electrophysiological work, but this increased specificity came at the cost of highly anisotropic voxels of 0.3 × 0.3 × 3.0 mm ( Liu et al., 2020 ). Clearly, future work such as that of Haenelt et al. and Liu et al. will be of great importance to the field of laminar fMRI, since GRE-fMRI is still very much its main workhorse (e.g., Kok et al., 2016 ; Kasper et al., 2019 ; Lawrence et al., 2018 ; Marquardt et al., 2018 , 2020 ; Schneider et al., 2019 ; Zaretskaya et al., 2020 ).
We think that in future comparisons, it will be important to consider the higher sensitivity and ease-of-use of GRE-fMRI, and to compare SE-, VASO-, and bSSFP-based approaches to GRE-fMRI using a model-based deconvolution approach, over and above just the raw, unmodeled GRE-fMRI signal.
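To illustrate the logic of such a model-based deconvolution, the following minimal sketch inverts a hypothetical lower-triangular "leakage" matrix in which deeper signals drain towards the surface. The matrix values and the three-depth discretization are purely illustrative and are not the actual model of Markuerkiaga et al. (2016):

```python
import numpy as np

# Hypothetical 3-depth leakage model (deep, middle, superficial): each
# depth's observed GRE-BOLD signal is its own signal plus a fraction of
# the signal from the depths below it. Values are illustrative only.
leakage = np.array([[1.0, 0.0, 0.0],
                    [0.3, 1.0, 0.0],
                    [0.3, 0.3, 1.0]])

true_laminar = np.array([1.0, 1.0, 0.0])   # input confined to deep/middle
observed = leakage @ true_laminar          # profile biased towards the surface

# "Deconvolution": invert the leakage model to recover laminar activity
recovered = np.linalg.solve(leakage, observed)
print(observed, recovered)
```

Note how the observed profile shows spurious superficial signal (0.6) even though the underlying superficial activity is zero; inverting the model removes it.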
Due to the use of GRE-fMRI, some caution should also be taken in the interpretation of the decoding (MVPA) results. Temporal SNR degrades towards the pial surface, and one could argue that the decrease in decoding accuracy near the pial surface that we found could be due to increased (physiological) noise, rather than a neural effect. However, this increase in noise is also counteracted by an increased size of the BOLD response, so it is hard to quantify the impact of the noise. In recent work, Vizioli et al. (2020) advocate the use of MVPA techniques to analyze laminar GRE-fMRI data and empirically show that such techniques can dissociate granular from supragranular effects, much like the results presented here. Future work should further explore and model the relationship between MVPA decoding accuracies, physiological noise, and laminar effects.
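To make the logic of depth-resolved eye-of-origin decoding concrete, here is a self-contained toy sketch on synthetic data (not our actual pipeline): trials carry a hypothetical ODC voxel pattern whose sign depends on the stimulated eye, plus noise, and a simple leave-one-out correlation classifier recovers the eye of origin.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic example: 40 "trials" of left- vs right-eye stimulation at one
# cortical depth. Each trial is a hypothetical ODC voxel pattern (sign
# flipped between eyes) plus Gaussian noise.
n_trials, n_voxels = 40, 100
odc_pattern = rng.standard_normal(n_voxels)
labels = np.repeat([0, 1], n_trials // 2)        # 0 = left eye, 1 = right eye
signs = np.where(labels == 0, 1.0, -1.0)
data = signs[:, None] * odc_pattern + rng.standard_normal((n_trials, n_voxels))

# Leave-one-trial-out, correlation-based decoding of eye-of-origin
correct = 0
for i in range(n_trials):
    train = np.ones(n_trials, dtype=bool)
    train[i] = False
    mean_left = data[train & (labels == 0)].mean(axis=0)
    mean_right = data[train & (labels == 1)].mean(axis=0)
    r_left = np.corrcoef(data[i], mean_left)[0, 1]
    r_right = np.corrcoef(data[i], mean_right)[0, 1]
    prediction = 0 if r_left > r_right else 1
    correct += int(prediction == labels[i])

accuracy = correct / n_trials
print(accuracy)
```

In the real analysis, repeating such a decoder at each cortical depth (with noise levels that vary across depth) produces the accuracy-by-depth profiles discussed above.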
In conducting this study, we took the utmost care to tackle the wide range of methodological challenges that ultra-high field fMRI has to surmount. We stress here that, in our experience, extensive manual checking and correction of initial masking, unwarping, registration, and segmentation are essential to make valid inferences about the relationship between cortical depth and BOLD activation patterns. We therefore included a very detailed methods section and shared all our analysis scripts 1 and raw data 2 in open-source repositories. We believe that such sharing of data and code is of crucial importance to understand what is and what is not possible with UHF cortical depth-resolved fMRI, and to determine the level of methodological rigor needed to further this young and exciting field ( Poldrack et al., 2019 ).

Participants
Seven participants were recruited from the Vrije Universiteit Amsterdam. They were aged between 21 and 39 years old. All participants gave informed consent, and procedures were approved by the ethical review board of the Vrije Universiteit Amsterdam.

MRI acquisition
All data from Amsterdam were acquired using a Philips Achieva 7 T scanner (Best, Netherlands). For the anatomical data, we used a volume transmit coil for excitation and a 32-channel head coil for signal reception (Nova Medical, MA, USA). For the functional acquisitions, data were acquired using two custom-built high-density 16-channel surface coils (total 32 channels) for signal reception ( Petridou et al., 2013 ) and the standard NOVA coil for transmission. The gradient coil has a maximum amplitude of 40 mT/m and a 200 T/m/s maximum slew rate.

Stimulus presentation
Fully separated dichoptic stimulus presentation was achieved by use of prism glasses ( Schurger, 2009 ), with a septum and a backprojection screen mounted inside the transmit coil. The quasicircular back-projection screen was 29 cm in diameter to fit precisely into the transmit coil. Depending on the individual participant's positioning, the resulting distance to the screen was between 12 and 15 cm. This caused variations in the maximal stimulus eccentricity (see below) across subjects, for which we corrected in our pRF eccentricity and size estimates. For some participants, we used positive-diopter lenses to aid their ability to maintain focus at this short distance. To minimize reflections inside the transmit coil it was covered in black cloth, and the septum was painted using matte black paint.
A ProPixx projector (VPixx Technologies, Saint-Bruno, Canada) running at 1920 × 1080 at 120 Hz was used for stimulus presentation, implementing a linearized lookup table. As the scanner environment is hostile to exact luminance measurements of this custom setup, the maximal luminance of the display is unknown. Based on the manufacturer's data, however, we estimate that it was over 500 cd/m². Due to the shape of our projection screen and the distance between screen and projector, the width and height of the effective screen area were approximately 1400 by 700 pixels.
For the receptive field mapping paradigm, a higher temporal resolution was warranted to reliably estimate PRF parameters from the design matrix. Therefore, for these runs, the number of shots was reduced, lowering the spatial resolution to 1 mm (FOV = 160 × 160 × 34 mm; matrix size = 160 × 160; 34 slices; TR = 0.058 s; TE = 0.028 s; FA = 20°; echo planar factor 21; acceleration factor using SENSE encoding: 3.5 (right-left) × 1.3 (anterior-posterior)). This protocol had a volume acquisition time of 2.7 s.
For both the 0.7 mm and 1.0 mm functional MRI protocols, we also collected an identical protocol with reversed phase-encoding blips (phase-encoding in the opposite direction). These images were used to correct for distortions due to B0 field inhomogeneities using the "TOPUP" algorithm ( Andersson et al., 2003 ). For the ocularity mapping paradigm, we collected 5 volumes with opposite phase-encoding; for the PRF mapping paradigm, we collected 8 volumes with opposite phase-encoding.

Framing stimulus array
Both the PRF mapping paradigm and the ocularity mapping paradigm were presented within a larger stimulus array (the framing stimulus array ) that was designed to help the subjects comfortably fuse the input images of the left and right eye, and thereby to prevent them from making unwanted eye movements. Before an experimental run started, the subject could move and rotate the left and right stimulus arrays with the response boxes, so that binocular fusion of the two stimulus arrays was as comfortable as possible. Specifically, the subject could translate, scale, and rotate the left and right stimulus array independently. The subject was instructed to make the framing stimulus array as large as possible, without any of its parts falling outside the visible area.
See Fig. 2 B and D for a graphical representation of the framing stimulus array. It was built around a circular main stimulus aperture, with a diameter of 23-28 degrees-of-visual-angle. Its exact size depended on the shape of the subject's head and the way the subject set up the stimulus at the beginning of the experimental run. On the outside of the stimulus aperture, an outer rim was presented with a width of 10% of the inner aperture (2.3-2.8 degrees-of-visual-angle). This rim also contained four checkerboard squares (3 × 3 cells) with a size of 30% of the stimulus aperture, marking the cardinal orientations of the stimulus array and aiding binocular fusion.
The inner 5% radius of the stimulus array (approximately 1.15-1.4 degrees-of-visual-angle) was occupied by a fixation dot. The inner 33% of this dot was colored red/green, the middle 33% was occupied by a black circle, and the outer part of the fixation dot was a white ring. The colored dot in the middle of the fixation circle dynamically changed color according to an exponential distribution with a rate (λ) of 0.33, corresponding to a mean duration of 3 s. However, the distribution was left-censored, such that only durations of two seconds or longer could occur, which increased the effective mean duration of the fixation dot colors to 5 s.
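This timing scheme can be simulated in a few lines. Assuming the 0.33 parameter is the rate of the exponential (so the mean is 1/0.33 ≈ 3 s) and that the left-censoring means durations below 2 s never occur, the memorylessness of the exponential predicts exactly the reported effective mean of 5 s:

```python
import numpy as np

rng = np.random.default_rng(1)

rate = 0.33                              # assumed rate (1/s): mean = 1/rate ≈ 3 s
samples = rng.exponential(1 / rate, 200_000)
durations = samples[samples >= 2.0]      # keep only durations of 2 s or longer

# Memorylessness: truncating an exponential at 2 s shifts its mean from
# 1/rate (≈3 s) to 2 + 1/rate (≈5 s), matching the stated effective mean.
print(round(durations.mean(), 1))
```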

PRF mapping paradigm
We used a standard "passing bar" PRF mapping paradigm that was presented binocularly using the binocular stimulation setup described above (see Fig. 2 B for an illustration). The passing bar had a height of 12.5% of the main aperture size (approximately 2.875-3.5 degrees-of-visual-angle) and consisted of 2000 overlaid purple/cyan (50%) and green/blue (50%) Gabor patches with a spatial frequency sampled from a uniform distribution between 0 and 30 cycles/degree. The Gabor patches were uniformly spaced across the passing bar and were dynamically and randomly repositioned at intervals drawn from a uniform distribution between 0 and 0.05 s. The size of the Gabors was 0.75 degrees-of-visual-angle. The Gabors shifted phase at a frequency of either 3 Hz (50%) or 12 Hz (50%). Finally, all the stimuli inside the passing bar flickered together according to a square wave at a frequency of 12 Hz. The complete stimulus array was visually extremely salient and was designed to maximally modulate V1 activity and to optimize the efficiency of the PRF mapping paradigm. The power of the paradigm is reflected in the quality of the resulting retinotopic maps at a volumetric resolution of 1 cubic millimeter, without any smoothing and collected in just over 15 min.
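For illustration, a single Gabor patch of the kind tiling the passing bar can be generated as follows. The pixel density, spatial frequency, and envelope width below are our own illustrative choices, not the exact stimulus code:

```python
import numpy as np

def gabor(size_deg=0.75, pix_per_deg=40, sf_cpd=5.0, phase=0.0, sigma_frac=0.25):
    """One Gabor patch: a sinusoidal grating under a Gaussian envelope."""
    n = int(size_deg * pix_per_deg)
    x = np.linspace(-size_deg / 2, size_deg / 2, n)
    xx, yy = np.meshgrid(x, x)
    grating = np.cos(2 * np.pi * sf_cpd * xx + phase)                      # luminance grating
    envelope = np.exp(-(xx ** 2 + yy ** 2) / (2 * (sigma_frac * size_deg) ** 2))
    return grating * envelope

patch = gabor()
print(patch.shape)
```

Animating `phase` over time at 3 or 12 Hz would produce the phase-shifting behavior described above.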
Most subjects performed 3 runs of the PRF mapping paradigm (one subject performed 5 runs). Subjects were instructed to report changes of color (red/green) of the central fixation dot and ignore the passing bar stimuli.
The paradigm started with only the fixation dot for 24 s, followed by a bar pass from top to bottom (30 s) and a bar pass from top left to bottom right (30 s), then a 12 s rest period with only the fixation dot and aperture stimulus array. This was followed by a bar pass from bottom to top (30 s) and from bottom right to top left (30 s), another 12 s rest period, then a right-to-left (30 s) and a bottom left-to-top right bar pass (30 s), another 12 s rest period, and finally a left-to-right (30 s) and a bottom left-to-top right bar pass (30 s). In total, 8 different and angularly equidistant bar passes were thus shown during an experimental run. The paradigm ended with a 24 s period with only the fixation dot and stimulus array, yielding a total duration of 324 s (5 min and 24 s).
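The stated run duration can be verified directly from the sequence of events described above:

```python
# Timing of one PRF mapping run, as described in the text
blank = 24        # initial and final fixation-only periods (s)
bar_pass = 30     # duration of each of the 8 bar passes (s)
rest = 12         # rest periods between pairs of bar passes (s)

# 4 pairs of bar passes separated by 3 rest periods, flanked by blanks
total = blank + 4 * (2 * bar_pass) + 3 * rest + blank
print(total)  # 324 s = 5 min 24 s
```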

Ocular dominance mapping paradigm
For the ocular dominance mapping paradigm, we used the same general stimulus array as the PRF mapping experiment, but the moving bar was replaced with a circular rotating and flickering checkerboard (see Fig. 2 D for illustration). The flickering checkerboard fitted exactly in the inner aperture (23-28 degrees-of-visual-angle), but was masked by a raised cosine, reducing the contrast for the outer 20% radius of the checkerboard.
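A radial raised-cosine contrast mask of this kind can be sketched as follows; the grid size is an arbitrary choice for illustration:

```python
import numpy as np

def raised_cosine_mask(n=512, ramp_start=0.8):
    """Radial contrast mask: 1 inside 80% of the radius, cosine ramp to 0 at the edge."""
    x = np.linspace(-1, 1, n)
    xx, yy = np.meshgrid(x, x)
    r = np.sqrt(xx ** 2 + yy ** 2)
    mask = np.ones_like(r)
    outer = (r >= ramp_start) & (r <= 1.0)
    ramp = (r - ramp_start) / (1.0 - ramp_start)   # 0..1 across the outer 20%
    mask[outer] = 0.5 * (1.0 + np.cos(np.pi * ramp[outer]))
    mask[r > 1.0] = 0.0
    return mask

mask = raised_cosine_mask()
print(mask.shape)
```

Multiplying the checkerboard image by this mask yields full contrast in the inner 80% of the radius and a smooth fade to the background at the aperture edge.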
The cells within the checkerboard had a width and height of 2 degrees-of-visual-angle. The contrast of the checkerboard was inverted (black cells became white and vice versa) according to a square wave with a frequency of 8 Hz. The checkerboard also rotated at a frequency of 0.66 rotations per second. The direction of rotation (clockwise versus counter-clockwise) changed once every 5 s on average, according to the exact same left-censored exponential distribution as the color of the fixation dot.
Importantly, unlike the framing stimulus array, the checkerboard was only presented to one eye at a time (monocular stimulation). It was first presented to the left eye for 12 s, then to the right eye for 12 s. The paradigm then continued with subsequent monocular stimulation blocks of 16, 20, 24, 20, 16 and again 12 s. This particular block design was chosen based on earlier work in cats showing that, when using GRE-fMRI, a design with rapid changes in the stimulation of different columnar patterns is more efficient in detecting columnar patterns than a design where different columnar patterns are stimulated in isolated blocks with no stimulation in between. The rationale for this approach is that, in such a rapid-switch design, non-columnar activation patterns, such as those arising from draining veins, are more similar across the stimulation conditions and thereby easier to remove in differential contrast analyses ( Moon et al., 2007 ). Before and after the monocular stimulation, an empty stimulus array was presented for 12 s. Thus, a single experimental run had a duration of 264 s (4 min and 24 s). Two subjects performed 10 runs of the ocularity mapping paradigm; the other 5 subjects performed 8 runs.
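Reading the block durations as applying to each eye in alternation reproduces the stated run length:

```python
# Timing of one ocular dominance run: each block duration is presented
# once to the left and once to the right eye, flanked by 12 s blanks.
block_durations = [12, 16, 20, 24, 20, 16, 12]   # per eye, in seconds
blank = 12

total = blank + 2 * sum(block_durations) + blank
print(total)  # 264 s = 4 min 24 s
```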
On odd-numbered experimental runs, subjects were verbally instructed to report changes in the color of the fixation dot with a response button box, thereby focusing their attention on the binocularly presented part of the stimulus array. On even-numbered experimental runs, subjects were verbally instructed to report changes in the direction of rotation of the checkerboard with their response button box, thereby focusing their attention on the part of the stimulus array that was monocularly presented.

Structural preprocessing
To study the 2D (cortical location) and 3D (laminar) properties of the functional organization of V1, precise delineation of the inner and outer borders of the cortical surface is of the utmost importance ( Polimeni et al., 2018 ; Waehnert et al., 2014 ). We therefore invested considerable effort in a hybrid approach of automatic and manual segmentation of gray matter, white matter, and CSF, as well as the subsequent reconstruction of the cortical surface. Because the correspondence between locations on the inner and outer surface of the cortical sheet was central to some of our research questions, we opted for the surface inflation approach of Freesurfer ( Polimeni et al., 2018 ) in most of our analyses, rather than a level set-based approach such as implemented in CBS tools ( Huntenburg et al., 2018 ). However, for some analyses we did use the latter approach to estimate cortical depth more precisely on a voxelwise basis. Furthermore, we used a combination of multiple software packages to preprocess and segment the anatomical data. The two end goals of the structural processing workflow were (a) a cortical reconstruction by Freesurfer, represented as a white matter and a pial mesh consisting of points and triangles, and (b) a white matter and pial surface reconstruction by CBS tools, represented by two level sets ( Waehnert et al., 2014 ). To get the best possible segmentations with minimal manual intervention, we found it very helpful to combine the results of multiple segmentation algorithms in a "wisdom-of-the-crowd" approach. See below for more details.
We used the in-house developed Python package pymp2rage ( de Hollander, 2018 ) to estimate the T1-UNI image ( Marques et al., 2010 ) and T1 maps from the MP2RAGE and MP2RAGE-ME data. We used the B1+ maps to correct for any residual B1+ bias fields ( Marques and Gruetter, 2013 ). Furthermore, we estimated T2*/R2* maps from the four echoes of the MP2RAGE-ME data using ordinary least squares in log signal space.
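The log-linear least-squares estimate of T2*/R2* can be sketched for a single voxel as follows; the echo times and signal values here are illustrative, not those of the MP2RAGE-ME protocol:

```python
import numpy as np

TEs = np.array([0.003, 0.0065, 0.010, 0.0135])   # hypothetical echo times (s)
true_r2s, true_s0 = 40.0, 1000.0                 # illustrative ground truth
signal = true_s0 * np.exp(-true_r2s * TEs)       # mono-exponential decay

# OLS in log signal space: log S(TE) = log S0 - R2* * TE,
# so the fitted slope is -R2* and the intercept is log S0.
slope, intercept = np.polyfit(TEs, np.log(signal), 1)
r2s_hat = -slope
t2s_hat = 1.0 / r2s_hat
print(round(r2s_hat, 3), round(t2s_hat, 4))
```

In practice this fit is vectorized over all voxels, and noise makes the log-domain fit only an approximation to a proper nonlinear fit, but it is fast and standard for multi-echo data.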
The next part of the structural preprocessing pipeline is implemented in a Docker image that can be found at https://github.com/VU-Cog-Sci/mp2rage_preprocessing . The T1-UNI image of the MP2RAGE-ME protocol was registered with 6 degrees-of-freedom (rigid body transform) to the T1-UNI of the MP2RAGE protocol using FLIRT and then resampled using windowed Lanczos sinc interpolation as implemented in ANTs ( Avants et al., 2014 ).
We corrected the mean second inversion (INV2) image for bias fields using N4BiasFieldCorrection ( Tustison et al., 2010 ), distributed with ANTs 2.2.0, and used it as input to FSL's brain extraction tool (BET; Jenkinson et al., 2005 , 2012 ). The skull-stripped average INV2 was then used as input to AFNI's automask function with a clfrac parameter setting of 0.5, to estimate the background noise level and remove voxels from the mask that contain only noise. Then, we took the averaged image of the two T1-UNI images, the two INV1 and INV2 images, as well as the T1 maps, and used them as input to the MP2RAGE skull strip and MP2RAGE dura estimation modules of CBS tools, as wrapped by Nighres ( Huntenburg et al., 2018 ), to make masks of the dura mater as well as the skull. The probabilistic map of the dura as estimated by Nighres was thresholded at 0.8 to obtain a discrete mask of the dura. The dura mask was then dilated by 2 voxels, excluding voxels that were manually or automatically labeled as brain by the BET algorithm.
Lastly, we made a crude manual mask of the sagittal sinus, drawn with large voxels using FSLEyes ( McCarthy, 2019 ). All voxels within this mask were then thresholded based on their estimated R2* (1/T2*) value, since voxels inside the sagittal sinus have very high R2* values due to susceptibility effects. The precise threshold was determined manually but was on average approximately 1/20. This semi-automatic approach yielded a very precise mask of the sagittal sinus.
The dura and skull masks generated by Nighres, the noise voxel mask estimated by AFNI, and the mask of the sagittal sinus were set to zero in the averaged T1-UNI image. The masked T1-UNI image was then used as input to the structural preprocessing module of fmriprep ( Esteban et al., 2019 ).
In that pipeline, the T1-weighted (T1w) image was corrected for intensity non-uniformity (INU) with N4BiasFieldCorrection ( Tustison et al., 2010 ), distributed with ANTs 2.2.0, and used as the T1w reference throughout the workflow. The T1w reference was then skull-stripped using antsBrainExtraction.sh ( Avants et al., 2008 ), using OASIS30ANTs as the target template. Brain surfaces were reconstructed using FreeSurfer's recon-all , and the previously estimated brain mask was refined with a custom variation of the Mindboggle method to reconcile ANTs-derived and FreeSurfer-derived segmentations of the cortical gray matter ( Klein et al., 2017 ). Spatial normalization to the ICBM 152 Nonlinear Asymmetrical template version 2009c ( Fonov et al., 2009 ) was performed through nonlinear registration with antsRegistration (ANTs 2.2.0), using brain-extracted versions of both the T1w volume and the template. Brain tissue segmentation of cerebrospinal fluid (CSF), white matter (WM), and gray matter (GM) was performed on the brain-extracted T1w image using FAST ( Zhang et al., 2001 ). The fmriprep preprocessing workflow yielded a cortical surface representation in Freesurfer format in "hires" mode, with an average edge length of 0.55 mm in the white matter surface and 0.66 mm at the pial surface, which was used in further analyses.
The masked T1-UNI, T1 map, and the skull and dura masks of Nighres were also used as input to the MGDM segmentation algorithm ( Bogovic et al., 2013 ), as wrapped by Nighres ( Huntenburg et al., 2018 ). Then, to create a level set-based cortical surface representation, the voxelwise GM/WM/CSF segmentations of FAST ( Zhang et al., 2001 ), Freesurfer, and MGDM ( Bogovic et al., 2013 ), as well as manual "correction masks" (see below), were averaged and used as input to the CRUISE cortical surface reconstruction algorithm ( Han et al., 2004 ), as implemented in Nighres ( Huntenburg et al., 2018 ). The manual masks were weighted 5 times more than the automatic segmentation algorithms. This "wisdom of the neuroimaging software crowd" approach led, in our experience, to much more precise cortical surface reconstructions than using only the MGDM (or any other single) segmentation as input to the CRUISE algorithm.
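A minimal sketch of this weighted "wisdom-of-the-crowd" fusion, using toy one-voxel arrays (the real inputs are whole probabilistic segmentation volumes, and the function name is ours):

```python
import numpy as np

def fuse_segmentations(auto_maps, manual_map=None, manual_weight=5.0):
    """Weighted average of probabilistic segmentations (illustrative sketch)."""
    maps = [np.asarray(m, dtype=float) for m in auto_maps]
    weights = [1.0] * len(maps)
    if manual_map is not None:
        maps.append(np.asarray(manual_map, dtype=float))
        weights.append(manual_weight)      # manual masks count 5x
    weights = np.asarray(weights)
    return np.tensordot(weights, np.stack(maps), axes=1) / weights.sum()

# Toy one-voxel example: three automatic maps disagree; the manual mask,
# weighted 5x, dominates the fused probability.
fast, freesurfer, mgdm = np.array([0.0]), np.array([0.0]), np.array([1.0])
manual = np.array([1.0])
fused = fuse_segmentations([fast, freesurfer, mgdm], manual)
print(fused)  # [0.75]
```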
After these two pipelines were run, both the pial and white matter surface meshes of Freesurfer and the levelset representations of CBS Tools were inspected in FSLEyes ( McCarthy, 2019 ). Any missegmentations were annotated with manual "correction masks", after which the entire preprocessing pipeline was run again, using these manual masks as additional input. We usually needed 3-4 of these iterations to obtain satisfactory cortical surface reconstructions.
Once the Freesurfer cortical surface reconstruction was of sufficient quality, we created 6 additional surfaces between the pial and white matter surfaces using the "equivolumetric layering" formulas described by Waehnert et al. (2014) as implemented by Konrad Wagstyl in his "Surface Tools" package 3 .
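The core of equivolumetric layering is a closed-form expression for the depth fraction that encloses a given fraction of the local cortical volume. A minimal numpy sketch of that formula (following the form used in the Surface Tools implementation; the function name and the handling of locally equal surface areas are our own illustrative choices):

```python
import numpy as np

def equivolume_fraction(alpha, aw, ap):
    """Euclidean depth fraction between the white and pial surfaces that
    encloses a fraction `alpha` of the local cortical volume, following
    the equivolume principle of Waehnert et al. (2014).

    alpha: desired volume fraction in [0, 1].
    aw, ap: per-vertex surface areas on the white and pial surfaces.
    """
    aw = np.asarray(aw, dtype=float)
    ap = np.asarray(ap, dtype=float)
    # where the two areas are (nearly) equal, the equivolume depth
    # reduces to the plain Euclidean fraction alpha
    equal = np.isclose(ap, aw)
    safe_ap = np.where(equal, aw + 1e-9, ap)
    beta = 1 - (1.0 / (safe_ap - aw)) * (
        -aw + np.sqrt((1 - alpha) * safe_ap ** 2 + alpha * aw ** 2))
    return np.where(equal, alpha, beta)
```

Note that the fraction is pulled toward the surface with the larger local area, so layers stay equally voluminous rather than equally thick in curved cortex.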

Functional preprocessing
Functional data were minimally preprocessed, and extra care was taken not to smooth the data: the data were resampled only once. The preprocessing workflow was implemented in nipype ( Gorgolewski et al., 2011 ) and can be found on GitHub 4 . It was identical for the PRF and ocular mapping paradigms.
The following steps were performed. First, each run was independently motion-corrected using linear rigid-body registration and a normalised correlation cost function towards the mean of the run, with the MCFLIRT algorithm as implemented in FSL ( Jenkinson, 2002 ). The same motion correction was applied to the reverse phase-encoded data. Then, average images for both time series were constructed, using only the middle volumes of the task data, so that the unwarping algorithm would use the same number of volumes for the main task and reverse phase-encoded data. These two average images were bias field corrected using N4BiasFieldCorrection ( Tustison et al., 2010 ), after which they were registered towards each other (unwarped) following the "TOPUP" unwarping approach described by Andersson et al. ( Andersson et al., 2003 ), as implemented in the 3dQwarp program of AFNI ( Cox, 1996 ), using median filtering across 1 mm and a minimum patch size of 5 mm. Then, the unwarp field was applied to the averaged task data (this time using all volumes in the run), using windowed Lanczos sinc interpolation.
The averaged task data were then registered to the T1-weighted (T1UNI) anatomical data using the gray/white matter segmentation of MGDM as input to the boundary-based registration (BBR) algorithm as implemented by FSL's FLIRT ( Greve and Fischl, 2009 ;Jenkinson, 2002 ), with 9 degrees of freedom. We chose 9 degrees of freedom, with scaling across the axes in addition to rigid-body transformations, to allow for some flexibility in correcting any global residual B0 distortions. Then, the MeasureImageSimilarity function of ANTs with a mutual information cost function was used to determine which task run was most similar (registered best) to the T1-weighted data. This run was then used as a 'reference BOLD image', to which all the other runs were registered once more. This last step turned out to be helpful, since for most subjects 1 or 2 runs were completely misregistered (due to the very small field-of-view, which hinders many automatic registration algorithms). Without this step, the alignment between subsequent task runs was often insufficient.
Finally, for every volume of every run, (1) the motion-correction affine matrix, (2) the distortion unwarp field, (3) the registration to the T1-weighted anatomical image and (4) the refined registration to the best-registered reference BOLD image were all concatenated, so that the resampling of every individual volume in the time series to anatomical space could be done in one single windowed Lanczos sinc interpolation step, as implemented in ANTs 2.2.0. This yielded an unsmoothed volume time series at a resolution of 0.7 mm isotropic, very well aligned with the T1-weighted anatomical data.
We then used the white-matter/gray-matter segmentation of MGDM to create aCompCor time series that can be used to regress out spatially correlated structured noise. aCompCor components are time series obtained by projecting the raw data onto the largest components of a PCA decomposition of the signal within an eroded mask of the CSF and white matter, which is believed to represent structured (physiological) noise ( Behzadi et al., 2007 ). We also extracted the translations and rotations of the MCFLIRT motion correction ( Friston et al., 1996 ), as well as Framewise Displacement ( Power et al., 2012 ) and DVARS ( Power et al., 2014 ;Smyser et al., 2010 ). The functional data were also high-pass filtered by subtracting the output of a Savitzky-Golay filter with a polynomial order of 3 and a window length of 120 s ( Savitzky and Golay, 1964 ). The resulting time series were then divided by the average image intensity of the corresponding voxels and multiplied by 100, to obtain "percent signal change" time series.
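The temporal filtering and normalization step can be sketched as follows (a minimal illustration assuming a TR of 4 s; the window length and polynomial order follow the values given above, and the helper name is our own):

```python
import numpy as np
from scipy.signal import savgol_filter

def highpass_psc(ts, tr=4.0, window_s=120.0, polyorder=3):
    """High-pass filter by subtracting a Savitzky-Golay low-frequency fit,
    then convert the residual to percent signal change.

    ts: (..., T) voxel time series; tr: repetition time in seconds.
    """
    window = int(round(window_s / tr))
    window += (window + 1) % 2          # savgol_filter requires an odd window
    lowpass = savgol_filter(ts, window, polyorder, axis=-1)
    resid = np.asarray(ts) - lowpass
    baseline = np.asarray(ts).mean(axis=-1, keepdims=True)
    return resid / baseline * 100.0     # percent signal change
```

Because the Savitzky-Golay fit tracks slow trends exactly up to the polynomial order, a pure linear scanner drift is removed completely by this filter.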
For the purpose of fitting the PRF model, we then sampled the BOLD time series to surface vertices, taking the mean across all cortical depths, using the mri_vol2surf function of Freesurfer with trilinear interpolation.

PRF fitting
For fitting the PRF model, we sampled vertex-wise time series across the entire depth of the cortex. We did so because the retinotopic locations of neurons at the same 2D cortical location are very similar across cortical depth ( Fracasso et al., 2016 ), and it allowed us to pool data across voxels, increasing the signal-to-noise ratio and the quality of the estimates of the retinotopic location of different 2D locations in V1. To further increase the signal-to-noise ratio of the data, we also linearly projected the DVARS time series, the Framewise Displacement, the translation and rotation parameters from the motion correction, as well as 6 anatomical CompCor components out of the raw signal ( Power et al., 2014 ). Then, we estimated a standard Gaussian PRF model with 3 parameters: angle and eccentricity (equivalent to the x/y-coordinates of the Gaussian) and a size (standard deviation of the Gaussian). We used the popeye Python package ( DeSimone et al., 2016 ) to construct hypothetical time series for a large range of parameters (25 eccentricities x 32 angles x 25 sizes). We then correlated these hypothetical time series with the actual time series for every vertex, sampled across its entire depth, and took for every vertex the parameter settings with the highest correlation. This "grid search" approach already yielded a relatively coarse retinotopic map of V1, which was used as a starting point for a gradient descent optimization algorithm as implemented in popeye ( DeSimone et al., 2016 ). The resulting parameters were then plotted back onto the surface using Pycortex ( Gao et al., 2015 ). We used an (arbitrary) R2 threshold of 15% within-set explained variance to mask out very noisy vertices. Then, we manually delineated the left and right V1/V2 border on the polar angle maps of every individual subject.
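The grid-search stage can be sketched as follows (a simplified illustration with a binary stimulus aperture; the HRF convolution and the subsequent gradient-descent refinement of the actual popeye-based pipeline are omitted, and all function names are our own):

```python
import numpy as np
from itertools import product

def gaussian_prf(x0, y0, sigma, xg, yg):
    # 2D isotropic Gaussian receptive field on a visual-field grid
    return np.exp(-((xg - x0) ** 2 + (yg - y0) ** 2) / (2 * sigma ** 2))

def grid_search_prf(data, stim, xg, yg, xs, ys, sizes):
    """Correlate the measured time series with predicted time series
    for every parameter combination; return the best-correlating one.

    data: (T,) vertex time series; stim: (T, n_pixels) aperture per volume.
    """
    best, best_r = None, -np.inf
    for x0, y0, s in product(xs, ys, sizes):
        pred = stim @ gaussian_prf(x0, y0, s, xg, yg).ravel()
        r = np.corrcoef(pred, data)[0, 1]
        if r > best_r:
            best, best_r = (x0, y0, s), r
    return best, best_r
```

The same predicted time series can then seed a continuous optimizer, since the correlation surface around the grid optimum is typically smooth.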
The borders of V1 and V2 were defined by polar angle phase reversals at the dorsal (lower vertical meridian of the visual field) and ventral (upper vertical meridian of the visual field) banks of the calcarine sulcus ( Wandell and Winawer, 2011 ).

Estimation of ocularity maps
To estimate the ocular dominance of cortical locations, we estimated a simple general linear model (GLM) using nistats ( Nistats 2019 ) on the preprocessed, voxelwise time series. The GLM consisted of one regressor representing stimulation of the left eye and a second regressor representing stimulation of the right eye, each convolved with a canonical HRF. We also added 6 nuisance regressors: the 6 components that explained most variance in a principal component analysis (PCA) decomposition of all 14 collected nuisance regressors (volume-wise DVARS, framewise displacement, the six translation and rotation parameters provided by the motion-correction algorithm, and the six anatomical CompCor regressors). We opted for this PCA approach because the nuisance regressors were highly correlated, and the total number of volumes (degrees of freedom) of the ODC data was low due to our 4-second TR.
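The nuisance dimensionality reduction amounts to keeping the leading principal components of the standardized confound matrix; a minimal numpy sketch (the helper name is our own, and z-scoring the columns is an assumption to put regressors on comparable scales):

```python
import numpy as np

def pca_nuisance(confounds, n_components=6):
    """Reduce a (T, n_regressors) confound matrix to its first
    principal component time series.

    Columns are z-scored first so that regressors measured on very
    different scales (e.g. rotations vs. DVARS) contribute comparably.
    """
    z = confounds - confounds.mean(axis=0)
    z = z / z.std(axis=0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    # scale the left singular vectors by the singular values so the
    # components carry the variance of the original regressors
    return u[:, :n_components] * s[:n_components]
```

The resulting components are mutually orthogonal, so they can be entered into the GLM without inflating collinearity.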
We then obtained parameter estimates of the average BOLD activation for left and right eye stimulation, as well as the contrast for left > right eye stimulation, for every run separately and for all runs combined. These parameter estimates were normalized by their variance, resulting in corresponding z-maps.
We then used the cortical depth estimates for every voxel provided by the CRUISE cortical surface level sets ( Han et al., 2004 ;Huntenburg et al., 2018 ) to assign voxels to 5 bins of equivolumetric cortical depth. We used those bins to estimate the effects of cortical depth on the amplitude of the BOLD response in different task conditions, as well as its interaction with stimulation condition and/or attentional condition. We also applied the linear laminar deconvolution approach developed by Marquardt and colleagues ( Markuerkiaga et al., 2016 ;Marquardt et al., 2018 ). Here, the activation of the 5 different bins was modeled as a linear combination of 5 underlying neural populations. For example, the BOLD response at the deepest layer is assumed to result from neural activation only in the deepest layer, whereas the BOLD response at the 2nd-to-deepest layer is assumed to be a linear sum of the BOLD responses to the neural activity in the deepest layer and the 2nd-to-deepest layer, etc.
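This draining model corresponds to a lower-triangular mixing matrix that can be inverted directly. A minimal sketch (the uniform drain weight of 0.5 is a hypothetical value chosen for illustration; the actual analysis used the layer-specific weights of the cited model):

```python
import numpy as np

def laminar_deconvolve(bold_profile, drain_weight=0.5):
    """Remove ascending-vein contributions from a laminar BOLD profile.

    bold_profile: (n_depths,) profile ordered deep -> superficial.
    drain_weight: hypothetical fraction of each deeper bin's response
    that drains into every more superficial bin.
    """
    n = len(bold_profile)
    # lower-triangular mixing matrix: bin i sums its own response plus
    # drained responses from all deeper bins j < i
    M = np.eye(n) + drain_weight * np.tril(np.ones((n, n)), k=-1)
    return np.linalg.solve(M, bold_profile)
```

Because draining only flows toward the pial surface, the deepest bin is left untouched and the correction grows with cortical depth.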
The "left > right" contrast z-map was also sampled to 6 equivolumetric cortical surfaces as defined by Freesurfer. This was done for the analysis of the spatial frequency of the ocularity profiles on the cortical surface, as well as to gauge the consistency of the ocular dominance columns within subjects across runs. To avoid inducing any spatial smoothing, nearest-neighbor interpolation was used for this final interpolation step.
After estimating the z-maps, they were visually inspected on the surface, with the V1/V2 masks overlaid. All z-maps were visually consistent with ocular dominance column patterns: except for one hemisphere in one subject, they showed relatively large changes in z-value (2-4) over relatively short geodesic distances (~1.5 mm). The left hemisphere of subject 1, however, showed a very large right-sensitive blob of activity in parafoveal V1. Since this pattern did not qualitatively match ocular dominance patterns, this hemisphere was excluded from all further analyses. This is why almost all analyses included one more right hemisphere than left hemispheres. After visual inspection of the z-maps, we tested their robustness by correlating the z-map of the first half of the session with that of the second half. For one subject (subject 5), we found no consistent pattern. Both hemispheres of this subject were excluded from all further analyses.

Spatial frequency estimation on the surface
To estimate the spatial frequency of the ocularity maps, we applied 2D Gabor filters ( Forsyth and Ponce, 2015 ) to the ocularity maps after resampling them to a flattened cortical surface. We used Freesurfer's tksurfer to cut out V1 from the rest of the cerebral cortex and used Freesurfer's flattening algorithm ( mris_flatten ) to convert the 3D vertex coordinates to 2D plane coordinates, minimizing any distortions in their distance matrix. We then resampled the ocularity z-maps from their flattened 2D coordinates to a gridded plane with a resolution of 0.35 mm using linear interpolation. We then convolved this plane with Gabor patches with wavelengths ranging from 1 mm/cycle up to 10 mm/cycle in 50 logarithmically spaced steps and 16 equally spaced orientations. For every vertex, this yielded an estimate of its power at a specific frequency and orientation. We assigned a "main frequency" to every vertex by finding the frequency with the maximum amount of power, summed over all orientations. We then plotted the distribution of main frequencies across vertices and estimated its mode (most common frequency). We repeated this procedure across multiple cortical depths, so we could investigate any shift in main frequency and power across cortical depth.
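A simplified numpy/scipy sketch of the Gabor filter bank step (kernel parameters such as the envelope width are illustrative choices, not necessarily those of the original analysis, and the power is summed over the whole map rather than evaluated per vertex):

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(wavelength, theta, sigma_factor=0.5):
    """Complex Gabor patch; carrier wavelength in pixels; unit L2 norm."""
    sigma = sigma_factor * wavelength
    half = int(3 * sigma)
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = (np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
         * np.exp(2j * np.pi * xr / wavelength))
    return g / np.linalg.norm(g)

def main_wavelength(img, wavelengths, n_orientations=16):
    """Wavelength whose Gabor responses, summed over orientations
    and positions, carry the most power."""
    total_power = []
    for wl in wavelengths:
        p = 0.0
        for th in np.linspace(0, np.pi, n_orientations, endpoint=False):
            resp = fftconvolve(img, gabor_kernel(wl, th), mode='same')
            p += float(np.sum(np.abs(resp) ** 2))
        total_power.append(p)
    return wavelengths[int(np.argmax(total_power))]
```

Normalizing each kernel to unit L2 norm makes the power estimates comparable across wavelengths, so a striped pattern is assigned the wavelength of its stripes rather than the largest kernel.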

Inverted encoding model
To estimate the amount of information that the entire pattern of activity in V1 contained, we used an adaptation of the inverted encoding model described by van Bergen et al. ( van Bergen et al., 2015 ). Specifically, we assumed that the BOLD activity of each vertex in V1 could be explained by a weighted sum of a "left eye" and a "right eye" neuronal population. This weight matrix W (as many rows as vertices and 2 columns, one column representing the left eye, the other the right eye) was estimated by solving the linear system y_i = X w_i, where y_i is the time course of vertex i, w_i is the corresponding row of W, and X is a design matrix with two columns indicating which eye was stimulated at a given frame (either 0 or 1, shifted by 4 s in time to correct for hemodynamic delay). To regularize the weight matrix, we used ridge regression with a prefixed alpha-parameter of 1.0 ( Hoerl and Kennard, 1970 ;Nunez-Elizalde et al., 2019 ), corresponding to a Gaussian prior with standard deviation of 1 on the weights. This regularizes the estimated weights towards more plausible values near 0.
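The ridge estimate of W has a closed form; a minimal sketch with the two-column stimulation design described above (the helper name is our own):

```python
import numpy as np

def ridge_weights(X, Y, alpha=1.0):
    """Ridge-regression weights for the eye-encoding model.

    X: (T, 2) design matrix of left/right-eye indicator regressors.
    Y: (T, n_vertices) BOLD time series.
    Returns W: (n_vertices, 2), one row of eye weights per vertex.
    """
    k = X.shape[1]
    # closed-form ridge solution: (X'X + alpha*I)^-1 X'Y
    W_t = np.linalg.solve(X.T @ X + alpha * np.eye(k), X.T @ Y)
    return W_t.T
```

As alpha approaches 0 the solution approaches ordinary least squares; alpha = 1.0 shrinks the weights toward 0 as described above.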
Using the fitted weight matrix of the vertex-wise GLM, we fitted a regularized multivariate normal distribution to the unexplained residuals y_i − X w_i. The covariance matrix Σ was defined as a weighted sum of a diagonal covariance matrix diag(τ²) and a perfect-correlation matrix ττᵀ ( Ledoit and Wolf, 2003 ), together with the estimated variance of the neural populations, σ² ( van Bergen et al., 2015 ), and finally the empirical covariance matrix Σ_emp ( van Bergen and Jehee, 2019 ): Σ = λ (ρ ττᵀ + (1 − ρ) diag(τ²) + σ² W Wᵀ) + (1 − λ) Σ_emp. The weight matrix W was estimated with ridge regression (see above); λ represents the amount of regularization and was set to 0.9 based on cross-validation (performance gains compared to λ of 0.1, 0.5 or 1.0 depended on the number of vertices used, but were on the order of 5 percentage points of classification accuracy); Σ_emp is directly estimated from the data (cross-product of the data matrix); and τ, ρ, and σ² are optimized using gradient descent. We implemented the model in Tensorflow ( Abadi et al., 2016 ), which calculated the gradients of the likelihood with respect to the parameters automatically ( Bartholomew-Biggs et al., 2000 ;Corliss et al., 2002 ), thereby drastically improving the speed of the optimization process. To further speed up the optimization process, we selected the subset of 400 vertices in the V1 mask that showed the highest explained variance (R 2 ) in the GLM to be used for the residual model.
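One plausible reading of this covariance model, sketched in numpy (the exact mixing used in the original Tensorflow implementation may differ; the function name and parameter values are illustrative):

```python
import numpy as np

def noise_covariance(tau, rho, sigma2, W, sigma_emp, lam=0.9):
    """Regularized residual covariance for the inverted encoding model.

    tau: (n,) vertex noise standard deviations; rho: shared-noise
    correlation; sigma2: variance of the neuronal populations;
    W: (n, 2) eye weights; sigma_emp: (n, n) empirical covariance;
    lam: mixing weight between model-based and empirical covariance.
    """
    model = (rho * np.outer(tau, tau)          # perfectly correlated part
             + (1 - rho) * np.diag(tau ** 2)   # independent vertex noise
             + sigma2 * W @ W.T)               # noise shared via eye tuning
    return lam * model + (1 - lam) * sigma_emp
```

Each term is positive semi-definite for valid parameters, so the mixed matrix remains a valid covariance.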
After fitting the residual model, we estimated the maximum a posteriori (MAP) stimulus for unseen data D by inverting the likelihood model using an informed stick prior that only allowed either activation of the left eye with 1 or activation of the right eye with 1. We then used the (log) Bayes Factor of the probability of left versus right eye stimulation as a measure of the decoded eye and its uncertainty: log 10 (BF) = log 10 p(D | left eye) − log 10 p(D | right eye). The log 10 (BF) represented the logarithm of the odds that the left or right eye was stimulated. When log 10 (BF) was above 0, the model predicted left eye stimulation, whereas if log 10 (BF) was below 0, the model predicted right eye stimulation.
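Under the stick prior, the decoding step reduces to comparing two multivariate-normal likelihoods, one per eye; a sketch (the helper name is our own):

```python
import numpy as np
from scipy.stats import multivariate_normal

def log10_bayes_factor(pattern, W, Sigma):
    """log10 odds of left- vs right-eye stimulation for one pattern.

    pattern: (n_vertices,) observed activity; W: (n_vertices, 2) eye
    weights; Sigma: (n, n) residual covariance.
    Positive values favor left-eye stimulation, negative right-eye.
    """
    ll_left = multivariate_normal.logpdf(pattern, mean=W[:, 0], cov=Sigma)
    ll_right = multivariate_normal.logpdf(pattern, mean=W[:, 1], cov=Sigma)
    # convert natural-log likelihood ratio to base-10 log odds
    return (ll_left - ll_right) / np.log(10)
```

The magnitude of the log Bayes factor then directly quantifies the model's decoding confidence, not just its binary choice.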
We used a leave-two-out cross-validation scheme: for every fold, two runs were left out for validation purposes: one run with attention on the fixation cross and one run with attention on the checkerboard.
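The folds can be constructed by pairing each fixation-attention run with each checkerboard-attention run (a sketch; run labels and the helper name are hypothetical):

```python
from itertools import product

def leave_two_out_folds(fixation_runs, checkerboard_runs):
    """Every fold holds out one run per attention condition."""
    folds = []
    for fix, chk in product(fixation_runs, checkerboard_runs):
        train = [r for r in fixation_runs + checkerboard_runs
                 if r not in (fix, chk)]
        folds.append((train, [fix, chk]))
    return folds
```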

Linear versus quadratic model estimation
For two variables, namely total spectral power and decoding accuracy, we wanted to determine whether they increased or decreased linearly with cortical depth, or whether they showed a clear peak at a specific cortical depth. To do this, we performed Bayesian model comparison. Specifically, we fitted two hierarchical linear models using pymc3 ( Salvatier et al., 2016 ) and its implementation of the NUTS sampler ( Hoffman and Gelman, 2014 ).
The dependent variable (decoding accuracy or spectral power) was first modeled as a linear function of cortical depth, y_s = β0,s + β1,s d, and then also as a quadratic function of cortical depth, y_s = β0,s + β1,s d + β2,s d², where y_s are all observations for subject s, d is the corresponding vector of cortical depths, and β_s is the parameter vector estimated for subject s.
The subject-wise parameters β_s were modeled as coming from a Gaussian group distribution with mean μ and standard deviation σ. We used the Watanabe-Akaike information criterion (WAIC) for Bayesian model comparison between the linear and quadratic models ( Watanabe, 2013 ). Furthermore, we estimated the posterior of the peak of the quadratic function using the formula −β1/(2β2) and calculated its 95% credible interval using the highest posterior density approach ( Kruschke, 2014 ).
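Given posterior samples of the quadratic coefficients, the peak and its highest-posterior-density interval can be computed as follows (a sketch; the coefficient sample arrays are assumed to come from the pymc3 trace, and the helper name is our own):

```python
import numpy as np

def quadratic_peak_hpd(b1_samples, b2_samples, cred=0.95):
    """Posterior samples of the peak -b1 / (2 b2) of b0 + b1*d + b2*d^2,
    summarized by their highest-posterior-density interval."""
    peaks = np.sort(-b1_samples / (2 * b2_samples))
    n = len(peaks)
    m = int(np.ceil(cred * n))
    # the HPD interval is the narrowest interval containing a
    # fraction `cred` of the sorted samples
    widths = peaks[m - 1:] - peaks[:n - m + 1]
    i = int(np.argmin(widths))
    return peaks[i], peaks[i + m - 1]
```

For unimodal, roughly symmetric posteriors the HPD interval coincides with the central credible interval, but it remains well-defined for skewed peak posteriors as well.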