Attentional influences in primary visual cortex: an investigation of key task factors

Whether or not spatial attention can boost the initial volley of visual processing in V1 remains controversial. In particular, two recent studies failed to replicate an earlier study that found a spatial attention modulation of the earliest, V1-generated component of the human VEP (“C1”). Here, we sought to reconcile these findings through a careful consideration of the computational demands imposed by the target detection tasks. We conducted three new experiments. The first sought to elucidate the role of target/non-target feature similarity and the second, the role of the level of feedback provided. The third experiment was a close replication of the task conditions of the original experiment. Taking all three experiments together, attention boosted C1 amplitude. However, this effect was present in only the second and third experiments, with the first showing a modulation in the reverse direction. This reversal coincided with differing behavioural results, perhaps reflecting different strategies employed by participants to carry out the task. Thus, although these findings affirm our general hypothesis that the determining factor for attentional modulation of the very earliest sensory representations relates to the precise computational demands of the perceptual task, further work is needed to pinpoint the computational principles that the attention system follows.


Introduction
The question of whether the initial sweep of visual processing in area V1 (as indexed by the C1 component of the visual evoked potential) can be modulated by spatial attention has long been controversial and has garnered renewed interest in recent years (Slotnick, 2017). In particular, one convincing demonstration of a C1 modulation by spatial attention (Kelly, Gomez-Ramirez and Foxe, 2008) has recently been subject to two replication attempts, neither of which reproduced the modulation (Baumgartner, Graulty, Hillyard and Pitts, 2018; Alilovic, Timmermans, Reteig, van Gaal and Slagter, 2019). However, subtle yet perhaps crucial differences were present between the original experiment and the repeated experiments. In all three, participants covertly attended one of two locations to detect whether an impending Gabor stimulus would contain a superimposed darkened ring target. In the original experiment, however, the background was dark, so the Gabor stimulus consisted of both a contrast and a luminance component, whereas the repeated experiments used pure-contrast Gabor stimuli. Spectral analysis of these stimuli both with and without the darkened ring target (see Fig. 1A) demonstrates that low spatial frequencies were present only when targets occurred in the repeated experiments, but were present regardless of whether a target occurred in the original experiment (Kelly and Mohr, 2017). This is important because the brain may have capitalized on this unique aspect of target stimuli in the replication attempts and thereby modulated only low spatial frequency coding neurons in V1. Since these neurons do not code for the high spatial frequency Gabor stimulus, such a modulation may have been missed. This strategy was not so readily available in the original experiment, however, as both targets and non-targets contained low spatial frequency components.
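The spectral argument can be illustrated with a minimal one-dimensional sketch. It is not the analysis reported in Kelly and Mohr (2017): the profile length, envelope width, carrier frequency, contrast and low-frequency cutoff are all assumed values chosen for illustration, and the darkened ring target is not modelled. The sketch only shows why a Gabor presented on a dark background carries low spatial frequency energy that a pure-contrast Gabor lacks.

```python
import cmath
import math

# All parameter values below are illustrative assumptions, not the study's stimuli.
N = 256          # samples across a hypothetical 1-D luminance profile
SIGMA = 20.0     # Gaussian envelope width, in samples
CARRIER = 16     # carrier frequency, in cycles per profile
CONTRAST = 0.5   # carrier contrast
LOW_CUTOFF = 3   # frequency bins 0..3 are counted as "low spatial frequency"


def envelope(i):
    """Gaussian envelope centred on the middle of the profile."""
    return math.exp(-((i - N / 2) ** 2) / (2 * SIGMA ** 2))


def carrier(i):
    """Sinusoidal carrier grating."""
    return math.cos(2 * math.pi * CARRIER * (i - N / 2) / N)


# Pure-contrast Gabor, expressed as deviation from a grey background:
# luminance goes equally above and below the background.
pure_contrast = [CONTRAST * envelope(i) * carrier(i) for i in range(N)]

# Gabor on a dark background (background = 0): mean luminance itself
# rises inside the envelope, adding a luminance component.
on_dark = [envelope(i) * (1 + CONTRAST * carrier(i)) for i in range(N)]


def power_spectrum(signal):
    """One-sided power spectrum computed with a direct DFT."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(signal))) ** 2
        for k in range(n // 2 + 1)
    ]


def low_frequency_fraction(signal):
    """Share of one-sided spectral power at or below LOW_CUTOFF."""
    power = power_spectrum(signal)
    return sum(power[: LOW_CUTOFF + 1]) / sum(power)
```

Under these assumptions, `low_frequency_fraction(on_dark)` is large while `low_frequency_fraction(pure_contrast)` is close to zero, which mirrors the point that low spatial frequencies could distinguish targets from non-targets only when the non-target Gabor was pure contrast.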
A second discrepancy between the original and repeated experiments pertained to the feedback that participants received. All experiments adjusted the difficulty level based on participant performance, but the original experiment provided feedback to participants about their difficulty level while the repeated experiments provided feedback on hit and false alarm rates only. The latter does not fully capture the participant's performance and may be less effective at motivating participants to perform at a high level, since a given hit and false alarm rate can be maintained at any difficulty level. To address these two discrepancies, we ran two experiments, one of which manipulated target/non-target feature similarity and the other of which manipulated performance feedback. We also ran a third experiment in which we repeated the original experiment.

Method
In experiment 1, we addressed the possible moderating factor of feature similarity between targets and non-targets in yielding C1 attention modulations by employing two different types of target (see Fig. 1C). As in the previous experiments, participants monitored one of two locations to detect whether an upcoming Gabor stimulus contained a target. This target either had similar features to the non-target (a superimposed orthogonal Gabor stimulus of identical spatial frequency) or dissimilar features to the non-target (a uniform luminance disc of low spatial frequency). Stimuli appeared for 100 ms and participants clicked a mouse button as soon as they detected a target at the attended location; the unattended location was always task-irrelevant. The target sought changed halfway through the experiment and the target used first was counterbalanced between participants. In experiment 2, we further investigated the potentially moderating role of online performance feedback in the same task, using only the high-frequency target. In both experiments, the design included online navigation through 11 different difficulty levels based on performance. In the high-feedback condition, feedback was given when participants made correct detections, false alarms or changed difficulty level (which would increase upon two correct detections or decrease after two false alarms or a single miss). Participants also received a detailed breakdown of performance and level progression following each block. In the low-feedback condition, all this feedback was absent and replaced instead with a simple report of hit rates and false alarm rates following each block. Finally, in a third experiment we repeated the original experiment, employing the ring target and using a dark background such that non-target Gabors had both a luminance and a contrast component, as in the original experiment finding the C1 modulation (Kelly et al., 2008).
This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0
Given the large impact of seemingly small discrepancies between the original experiment and the failed replications, care was taken here to replicate as many aspects of the original experiment as possible, even down to the precise monitor distance. However, two changes were made. Firstly, the original experiment used a long probe session in which participants engaged in a similar task to the main experiment in order to choose stimulus locations that would yield an optimal C1 for each participant. Here, we replaced this probe session with a 10-minute, task-free, multifocal pattern-pulse procedure so that the experiment could be completed in a single sitting. Secondly, performance feedback was more extensive here as the original experiment did not include feedback during the block.
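The level-adjustment rule described for experiments 1 and 2 (11 levels; up after two correct detections, down after two false alarms or a single miss) can be sketched as follows. This is a minimal illustration rather than the experiment code: the class and method names are hypothetical, the starting level is assumed to be 1, and the text does not say whether the two detections or false alarms must be consecutive, so the sketch assumes simple running counts that reset whenever a level change is triggered.

```python
from dataclasses import dataclass

N_LEVELS = 11  # number of difficulty levels stated in the text


@dataclass
class DifficultyTracker:
    """Hypothetical tracker for the performance-based difficulty level."""
    level: int = 1          # assumed starting level
    hits: int = 0           # correct detections since the last level change
    false_alarms: int = 0   # false alarms since the last level change

    def update(self, outcome: str) -> int:
        """Register one outcome ('hit', 'false_alarm' or 'miss'); return the level."""
        if outcome == "hit":
            self.hits += 1
            if self.hits >= 2:           # two correct detections: level goes up
                self._move(+1)
        elif outcome == "false_alarm":
            self.false_alarms += 1
            if self.false_alarms >= 2:   # two false alarms: level goes down
                self._move(-1)
        elif outcome == "miss":          # a single miss: level goes down
            self._move(-1)
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
        return self.level

    def _move(self, step: int) -> None:
        # Clamp to the valid range and reset both counters (an assumption).
        self.level = min(N_LEVELS, max(1, self.level + step))
        self.hits = 0
        self.false_alarms = 0
```

For example, starting from level 1, two calls to `update("hit")` move the tracker to level 2, and a subsequent `update("miss")` returns it to level 1.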

Participants
Seventeen participants were recruited for experiment 1, a further 17 for experiment 2, and 16 for experiment 3. All procedures were approved by the human research ethics committee of UCD and followed the guidelines outlined in the Declaration of Helsinki.

Behavioural Results
Responses were slower for the upper visual field in experiment 1 (p<.01) and experiment 3 (p<.05) but not experiment 2. They were also less accurate in the upper field in experiment 1 (p<.05) and experiment 2 (p<.05) but not experiment 3. In experiment 1, responses were slower in the Gabor-target condition (p<.01). In experiment 2, the high-feedback condition yielded faster responses (p<.01), a higher difficulty level achieved (p<.05) and improved hit rates (p<.05). Since the high-feedback condition of experiment 2 was identical to the Gabor-target condition of experiment 1, these two conditions were compared directly between experiments (see Fig. 3). While there was a trend for RTs to be faster in experiment 2, this was not significant (p=.06). However, peak difficulty level achieved (p<.05), hit rate (p<.01) and d' (p<.05) were all higher in experiment 1 than in experiment 2.
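For readers less familiar with the sensitivity measure, d' is conventionally defined in signal detection theory as the difference between the z-transformed hit rate and the z-transformed false alarm rate. The sketch below shows that standard definition; the function name is hypothetical, and the log-linear correction for rates of exactly 0 or 1 is an assumption made here, since the text does not state how such cases were handled.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Perceptual sensitivity d' = z(hit rate) - z(false alarm rate).

    A log-linear correction (add 0.5 to each count, 1 to each total) keeps
    both rates strictly between 0 and 1; this correction is an assumption,
    not something reported in the text.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

With equal hit and false alarm rates the function returns zero, and it grows as hits become more frequent relative to false alarms.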

VEP Results
C1 amplitudes were measured between 70 and 90 ms from electrodes that were chosen individually for each participant based on grand average topographies collapsed across attention conditions. Two participants from experiment 1 and one from experiment 3 were excluded as they did not display a clear C1. Two participants in experiment 2 did not complete the high-feedback condition and were also excluded. Only trials without a target were included in this analysis. While there was a significant attention by target-type by location interaction in experiment 1 (p<.05) that was driven by the Gabor target in the upper visual field (p<.05; see Fig. 4), the modulation was such that attended stimuli yielded smaller C1s rather than larger. In experiment 2, there was a main effect of attention in the expected direction (p<.05) but this was not moderated by the feedback condition. Finally, in experiment 3, there was again a main effect of attention on the C1 (p<.05). Collapsing across all three experiments (see Fig. 5), the main effect of attention prevailed (p<.01). To ensure that attention was indeed deployed during these experiments, modulation of the P1 was measured, as the task design did not yield behavioural responses to unattended stimuli and the P1 has been routinely found to modulate with spatial attention (see Fig. 5). Indeed, robust P1 modulations were found in all three experiments (p<.001 in all cases).
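The amplitude measure can be sketched as follows. The text states only the 70 to 90 ms window, so taking the mean voltage within it (rather than, say, a peak) is an assumption, as are the function names. The `polarity` argument reflects the well-known inversion of the C1 with visual field (negative for upper-field stimuli, positive for lower-field stimuli), so that a positive effect always means a larger C1 for attended stimuli.

```python
def mean_amplitude(times_ms, voltages_uv, start_ms=70.0, end_ms=90.0):
    """Mean voltage (in microvolts) over samples inside [start_ms, end_ms]."""
    window = [v for t, v in zip(times_ms, voltages_uv) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the measurement window")
    return sum(window) / len(window)


def attention_effect(attended_uv, unattended_uv, polarity):
    """Signed attention effect on C1 amplitude.

    polarity: -1 for upper-field stimuli (negative-going C1),
              +1 for lower-field stimuli (positive-going C1).
    A positive return value means the attended C1 was larger in magnitude.
    """
    return polarity * (attended_uv - unattended_uv)
```

Under this convention, the reversed modulation reported for the Gabor target in the upper visual field in experiment 1 would correspond to a negative value of `attention_effect`.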

Discussion
Although a modulation of the C1 by attention was observed collapsing across all three experiments, it was not universally present in each one. In particular, experiment 1, which investigated the primary hypothesis that the elusiveness of the C1 attention effect may be due to dissimilarities between targets and non-targets in terms of stimulus features, showed a modulation in the reverse direction for the high-frequency target (but only in the upper visual field). Nevertheless, as expected, the low-frequency target did not yield a modulation. Perplexingly, the high-feedback condition of experiment 2, which was identical to the high-frequency target condition of experiment 1, yielded a modulation in the expected direction. This discrepancy in C1 results coincided with a discrepancy in the pattern of behavioural results, with better performance, and a trend towards slower response times, exhibited in experiment 1 compared with experiment 2. This raises the intriguing possibility that these groups deployed slightly different strategies to carry out the task, which may have incurred different overlapping signals in the C1 time range. Indeed, it has been highlighted that the C1 likely consists of contributions from V2/V3, which oppose V1 anatomically, and thus also in terms of scalp topography (Ales, Yates and Norcia, 2013). Interestingly, dipole modelling conducted by Ales et al. (2013) suggests that V1-V2 overlap may be greater for the upper than the lower visual field, mirroring the present reversal of the C1 attention effect in the upper visual field only. It may in fact be the case that in order to observe a modulation of the C1, not only does afferent V1 activity need to be modulated but it also needs to be modulated above and beyond any modulation applied to geometrically opposing V2/V3.
One might therefore speculate that the divergent C1 results observed in the present experiments may reflect a relative preferential reliance on V2/V3 among participants in the target-type experiment compared with those in the feedback experiment.
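The cancellation argument can be made concrete with a toy linear model. This is only an illustrative sketch, not a model fitted or proposed in the present work: the symbols $s_1$ and $s_2$ (the magnitudes of the V1 and V2/V3 contributions at the scalp electrode) and $g_1$ and $g_2$ (multiplicative attentional gains on each) are assumptions introduced here, and the two contributions are assumed to sum with opposite sign.

```latex
\begin{align}
C1_{\text{unattended}} &= s_1 - s_2, \\
C1_{\text{attended}}   &= g_1 s_1 - g_2 s_2, \\
\Delta C1 &= C1_{\text{attended}} - C1_{\text{unattended}}
           = (g_1 - 1)\, s_1 - (g_2 - 1)\, s_2 .
\end{align}
```

Under these assumptions, an enhancement is observed on the scalp only when $(g_1 - 1)\, s_1 > (g_2 - 1)\, s_2$. The two modulations cancel when the terms are equal, and a relatively stronger modulation of the V2/V3 contribution would appear as a reversal of the C1 attention effect.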

Conclusion
Spatial attention can modulate the C1 component of the VEP but observation of this modulation likely depends on the specific demands of the task at hand. The characteristic elusiveness of C1 modulations may also be due in part to the flexibility of the brain to deploy different strategies that may sometimes involve modulations of both V1 and V2/V3, which tend to cancel each other out on the scalp.

[Figure captions, partially recovered: Response times and difficulty level are broken down by quintile bins. Hit rates and perceptual sensitivity (d') are broken down by difficulty level. Adjacent are difference waveforms between conditions (target type/feedback type) or between experiments (above) and between visual field locations (below).]