Topographic signatures of global object perception in human visual cortex

Our visual system readily groups dynamic fragmented input into global objects. However, how the brain represents global object perception remains unclear. To address this question, we recorded brain responses using functional magnetic resonance imaging whilst observers viewed a dynamic bistable stimulus that could be perceived either globally (i.e., as a grouped and coherently moving shape) or locally (i.e., as ungrouped and incoherently moving elements). We further estimated population receptive fields and used these to back-project the brain activity measured during stimulus perception into visual space via a searchlight procedure. Global perception resulted in universal suppression of responses in lower visual cortex accompanied by widespread enhancement in higher object-sensitive cortex. However, follow-up experiments indicated that higher object-sensitive cortex is suppressed if global perception lacks shape grouping, and that grouping-related suppression can be diffusely confined to stimulated sites and accompanied by background enhancement once stimulus size is reduced. These results speak to a non-generic involvement of higher object-sensitive cortex in perceptual grouping and point to an enhancement-suppression mechanism mediating the perception of figure and ground.


Supplementary methods and results
1.1. Diamond experiment
1.1.1. Data analysis
Perceptual durations. Participants' key presses were used to calculate the durations of the diamond and no-diamond percept. If the same key was pressed multiple times in succession, the resulting subdurations were summed up. The period from the onset of the diamond display until participants' first key press was discarded. For each participant and the data pooled across participants, we then fit the durations for the diamond and no-diamond percept with a two-parameter (α: shape, β: rate) gamma probability density function using the maximum likelihood method. The resulting fits were superimposed onto a density histogram of the perceptual durations (bin width: 2 s).
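The gamma fit described above can be illustrated with a minimal sketch. It is not the analysis code used for the study: the function name `fit_gamma_mle`, the search bounds, and the ternary-search optimiser are assumptions chosen to keep the example dependency-free. For a fixed shape α, the likelihood is maximised by the rate β = α / mean, so only a one-dimensional search over α remains.

```python
import math


def fit_gamma_mle(durations, lo=1e-3, hi=1e3, iters=200):
    """Maximum likelihood estimates (alpha: shape, beta: rate) of a gamma density.

    Hypothetical helper for illustration. Durations must be positive and
    must not all be identical.
    """
    n = len(durations)
    mean = sum(durations) / n
    mean_log = sum(math.log(d) for d in durations) / n

    def profile_loglik(alpha):
        # For fixed alpha, the log-likelihood is maximised by beta = alpha / mean.
        beta = alpha / mean
        return n * (alpha * math.log(beta) - math.lgamma(alpha)
                    + (alpha - 1.0) * mean_log - beta * mean)

    # Ternary search: the profile log-likelihood is concave in alpha.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if profile_loglik(m1) < profile_loglik(m2):
            lo = m1
        else:
            hi = m2
    alpha = 0.5 * (lo + hi)
    return alpha, alpha / mean
```

A call such as `alpha, beta = fit_gamma_mle(diamond_durations)` would return the two parameters for one percept, where `diamond_durations` is a hypothetical list of durations in seconds. The fitted mean α / β equals the sample mean by construction.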
Eye position. To examine eye position stability, we analyzed the vertical and horizontal position samples corresponding to the durations of all perceptual intervals (i.e., diamond, no-diamond, and fixation). As for the analysis of perceptual durations (see Supplementary material 1.1.1 Data analysis), the interval from the onset of the diamond display until the first key press was discarded. Moreover, horizontal and vertical position samples with unavailable position information (e.g., when participants blinked or the signal dropped) were removed. This procedure resulted in substantial data loss for P4 (> 20 % of all samples in any perceptual interval in > 20 % of all runs rounded up to the next integer). We therefore excluded P4 from all subsequent analyses.
Slow drifts in the eye tracking signal were counteracted by detrending the time series of horizontal and vertical positions in each run using a robust time-windowed filter. To this end, we passed a sliding window (length: 10 s) through each time series by translating its midpoint from one time point to the next (range: 5-430 s; step size: 5 s) and calculating the median over all position samples within the current window. Note that all sliding windows were right-bounded except for the last one, which was right-open to account for slight measurement imprecision. The resulting window-wise medians were then subtracted from the corresponding eye position samples, and very extreme values (> |8.5 dva| in both the horizontal and vertical direction; cut-off determined based on screen height) were eliminated afterwards.
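The detrending step can be sketched as follows. This is a simplified illustration under stated assumptions, not the original implementation: the function names are hypothetical, the windows are treated as closed on both sides, each sample is paired with the window whose midpoint is nearest in time, and a sample pair is dropped if either coordinate exceeds the cut-off.

```python
from statistics import median


def detrend_sliding_median(times, positions, midpoints, length=10.0):
    """Subtract window-wise medians from one position time series (in dva)."""
    half = length / 2.0
    # Median of all samples inside each window (closed bounds: an assumption).
    medians = []
    for mid in midpoints:
        window = [p for t, p in zip(times, positions)
                  if mid - half <= t <= mid + half]
        medians.append(median(window))  # assumes every window holds samples
    # Pair each sample with the window of the nearest midpoint (assumption).
    detrended = []
    for t, p in zip(times, positions):
        nearest = min(range(len(midpoints)),
                      key=lambda i: abs(midpoints[i] - t))
        detrended.append(p - medians[nearest])
    return detrended


def drop_extreme(horizontal, vertical, cutoff=8.5):
    """Remove sample pairs in which either coordinate exceeds the cut-off."""
    kept = [(h, v) for h, v in zip(horizontal, vertical)
            if abs(h) <= cutoff and abs(v) <= cutoff]
    return [h for h, _ in kept], [v for _, v in kept]
```

For the diamond experiment, the midpoints would correspond to `list(range(5, 431, 5))`, with `times` given in seconds from run onset.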
To explore percept-specific differences in the bivariate distribution of position samples, we generated bagplots (Rousseeuw et al., 1999) across all runs for each perceptual interval and participant. Bagplots represent a (model-free) 2D generalization of univariate boxplots and typically include a center point corresponding to the depth median, a bag demarcating a region containing 50% of the data, a fence (usually not visualized) generated via inflating the bag by a factor of 3, and a loop circumscribing data points between the bag and the fence. Data points beyond the fence are considered outliers.
To further assess fixation variability, we determined the run-wise robust scale parameter S_n (Rousseeuw & Croux, 1993) for the horizontal and vertical position samples of each perceptual interval. Moreover, to check for systematic biases towards the (perceived) movement direction of the diamond stimulus more directly, we subtracted the S_n of the vertical samples from the S_n of the horizontal samples for each interval type (fixation served as a benchmark here). If participants' gaze was indeed biased towards the horizontal direction during the diamond percept and the vertical direction during the no-diamond percept, we should observe a positive difference for the diamond intervals and a negative difference for the no-diamond intervals. Such dispersion differences should be likewise evident in the bagplots.
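As an illustration of the dispersion measure, the following sketch computes S_n in its naive quadratic-time form, i.e., a constant times the (low) median over samples of the (high) median of absolute pairwise differences, using the consistency factor 1.1926. The function names are hypothetical, finite-sample correction factors are omitted, and the code is not the implementation used for the reported analyses.

```python
def _high_median(values):
    # Order statistic of rank floor(n / 2) + 1.
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


def _low_median(values):
    # Order statistic of rank floor((n + 1) / 2).
    ordered = sorted(values)
    return ordered[(len(ordered) + 1) // 2 - 1]


def sn_scale(samples, c=1.1926):
    """Naive O(n^2) S_n scale estimate without finite-sample correction."""
    inner = [_high_median([abs(xi - xj) for xj in samples]) for xi in samples]
    return c * _low_median(inner)


def dispersion_difference(horizontal, vertical):
    """S_n of the horizontal samples minus S_n of the vertical samples."""
    return sn_scale(horizontal) - sn_scale(vertical)
```

Under this convention, a positive `dispersion_difference` indicates greater horizontal than vertical dispersion, and a negative value indicates the reverse.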

Results
Perceptual durations. The probability density histograms of the durations per perceptual state for each participant and the pooled data with superimposed gamma fit can be found in Supplementary Figure S2. Despite inter-individual variability in the shape and rate parameters, both the pooled and individual diamond and no-diamond durations seem to be well fit by a gamma distribution, suggesting they follow similar temporal dynamics. However, all participants except P2 tended to show a higher probability density of longer durations for the no-diamond relative to the diamond percept. Likewise, these participants showed a higher median duration for the no-diamond percept and spent a higher proportion of time in this perceptual state, which was also reflected in the pooled results. Consequently, the perception of most participants was slightly biased towards the no-diamond state.
Eye position. Supplementary Figure S3 shows bagplots of the eye position samples across runs (A.) as well as simple (B.) and differential (C.) run-wise dispersion estimates for each participant and perceptual interval. For most participants, the size of the bag was fairly small and similar across perceptual intervals, suggesting a small degree of position variability unrelated to the perceptual interval at hand. Moreover, for most participants, the shape of the bag and loop was largely symmetric (circular or elliptical) and comparable across perceptual intervals, speaking against a systematic skew. For P3, however, the size of the bag tended to be generally larger compared to other participants and stretched out horizontally in the no-diamond relative to the diamond interval, highlighting greater position variability and a modest horizontal bias (see all Supplementary Figure S3, A.). Importantly, however, this bias ran counter to the perceived vertical movement during the no-diamond percept. The run-wise dispersion estimates (Supplementary Figure S3, B. and C.) confirmed all these observations whilst additionally revealing that the horizontal bias for P3 was predominantly present in the last run.
1.2. Dots experiment
1.2.1. Data analysis
Eye position. The eye position analysis was conducted as in the diamond experiment. However, due to different design parameters, the range of the sliding window midpoints was 5-355 s here. Moreover, as a result of signal drop due to technical issues, the data for P3 and P4 were removed from all analyses.

Results
Eye position. Supplementary Figure S7 shows the bagplots of the eye position samples across runs per participant and perceptual interval (A.) together with the corresponding run-wise simple (B.) and differential (C.) dispersion estimates. For all participants and perceptual intervals, the shape of the bag and loop was reasonably symmetric and the size of the bag fairly small. Importantly, these characteristics were comparable across perceptual intervals (see Supplementary Figure S7).
Eye position. The eye position analysis was conducted as in the diamond and dots experiments except that the range of the sliding window midpoints amounted to 5-370 s here (since we also collected eye data during the final blank, see 5.1.4 Procedure).

Results
Eye position. Supplementary Figure S11 depicts bagplots for the eye position samples of all runs (A.) along with run-wise simple (B.) and differential (C.) dispersion estimates per participant and perceptual state. For all participants, we observed relatively small-sized bags as well as symmetrically-shaped loops and bags that were highly similar across perceptual intervals (see all Supplementary Figure S11, A.). The run-wise dispersion estimates confirmed this pattern (Supplementary Figure S11, B. and C.). These outcomes suggest an overall low position variability that seems unrelated to the motion direction in the horizontal or vertical condition.

Supplementary figures and tables
Vertices with a pRF surpassing an eccentricity of 15 dva were discarded (no other post-smoothing thresholding was applied). Note that these pRF maps were subjected to the experiment-specific smoothing procedure in the diamond experiment (see 3.1.7 Data analysis). The color disks represent the color schemes used to label different visual field portions.
Differential brain activity resulting from contrasting periods of intact to periods of phase-scrambled images. Differential betas surpassing a value of ± 2 were set to that value. Cold colors reflect negative and warm colors positive differential beta values as indicated by the color bar.
Figure S4. Diamond experiment | Searchlight back-projections of differential brain activity as a function of contrast of interest and visual area. A.-E. P1-P5. T-statistics surpassing a value of ± 25 (first and second row) or ± 15 (third row) were set to that value. The saturation of colors reflects the number of vertices with a pRF inside a given searchlight plus the inverse distance of these pRFs from the searchlight center. White lines represent the extreme positions of the diamond stimulus. White solid lines denote the visible ungrouped diamond segments. White dashed lines additionally illustrate the inferred but invisible diamond shape when the segments were grouped together. D = Global, diamond percept. ND = Local, no-diamond percept. Fix = Fixation baseline. VLOC = Ventral-and-lateral occipital complex. P1-P5 = Participant 1-5. pRF = Population receptive field.
Figure S5. Diamond experiment | Searchlight back-projections of differential brain activity as a function of contrast of interest and visual area. Please note that the visual field coverage for these visual areas was suboptimal in some participants. T-statistics surpassing a value of ± 25 (first and second row) or ± 15 (third row) were set to that value. The saturation of colors reflects the number of vertices with a pRF inside a given searchlight plus the inverse distance of these pRFs from the searchlight center. White lines represent the extreme positions of the diamond stimulus. White solid lines denote the visible ungrouped diamond segments. White dashed lines additionally illustrate the inferred but invisible diamond shape when the segments were grouped together. D = Global, diamond percept. ND = Local, no-diamond percept. Fix = Fixation baseline. LO = Lateral-occipital area. LOV3B = Areas LO-1, LO-2, and V3B. VO = Ventral-occipital area. Pooled = Data pooled across all 5 participants. pRF = Population receptive field.
Figure S12. Dots quadrant experiment | Searchlight back-projections of differential brain activity as a function of contrast of interest and visual area. A.-E. P1, P6, P9-P11. T-statistics surpassing a value of ± 25 (first and second row) or ± 15 (third row) were set to that value. The saturation of colors reflects the number of vertices with a pRF inside a given searchlight plus the inverse distance of these pRFs from the searchlight center. White lines represent the spatial extent of the circular apertures carrying the RDK. H = Global, horizontal condition. V = Local, vertical condition. Fix = Fixation baseline. VLOC = Ventral-and-lateral occipital complex. P1, P6, and P9-P11 = Participant 1, 6, and 9-11. RDK = Random dot kinematogram. pRF = Population receptive field.
Figure S13. Dots quadrant experiment | Searchlight back-projections of differential brain activity as a function of contrast of interest and visual area. Please note that the visual field coverage for these visual areas was suboptimal in some participants. T-statistics surpassing a value of ± 25 (first and second row) or ± 15 (third row) were set to that value. The saturation of colors reflects the number of vertices with a pRF inside a given searchlight plus the inverse distance of these pRFs from the searchlight center.