Introduction

Whenever people look for an object, they prioritize those aspects of the visual environment that match their attentional set. This set is thought to represent a target-defining feature—for instance, the object’s color. It biases the visual system such that attention is guided toward the target object (e.g., Olivers & Eimer, 2010; Wolfe & Horowitz, 2004). Is there a cost associated with switching from one attentional set to another?

There is a large body of literature showing that switching from one task to another is associated with costs (Jersild, 1927; Monsell, 2003). However, task switching involves not only a switch in attentional sets but also a switch in response categories (Rushworth, Passingham, & Nobre, 2005). The aim of the present study was to investigate the costs of a purely attentional switch, rather than a complete task switch.

Several previous studies have used a cuing procedure to address this issue. For example, Vickery, King, and Jiang (2005) presented observers with a visual search task that was preceded by a cue indicating the target on each trial. Search became slower with decreasing stimulus onset asynchrony (SOA), suggesting that observers needed time to implement the new attentional set. However, this time interval may at least partly reflect the time needed to process the cue, and not an attentional switch cost. Vickery et al. always changed the target from trial to trial and, thus, did not have a baseline allowing the comparison between switch and no-switch trials. Wolfe (2004) employed a similar procedure in which observers looked for a cued target. Again, search improved with more time between the cue and the search display. This study did include a baseline condition. In this condition, target identity was blocked, such that both the target and the cue were always the same from trial to trial. However, note that this then eliminates the need to process the cue in the baseline condition, which again raises the possibility that the reaction time costs observed in the other conditions reflect the cue-processing time, instead of switch costs. Another study, by Rushworth et al. (2005), directly compared cues instructing observers to switch attentional sets with cues instructing them to maintain the current set in a mixed design. This way, both types of cue had to be processed. They found target response times to be slowed by only about 10 ms after a switch cue, suggesting that attentional switch costs may only be marginal. However, the time between the cue and the target display was fixed at 2,000 ms, allowing for ample time to prepare. The small effect might therefore represent an underestimation.

Performance costs are apparent when people are asked to look for two targets at the same time, relative to looking for only one target (Houtkamp & Roelfsema, 2009; Menneer, Cave, & Donnelly, 2009; Moore & Osman, 1993). Even though this suggests that it is difficult to maintain two attentional sets at the same time, this does not directly demonstrate that there is a cost of switching between them. In fact, Adamo and colleagues (Adamo, Pun, & Ferber, 2010; Adamo, Pun, Pratt, & Ferber, 2008; Adamo, Wozny, Pratt, & Ferber, 2010) recently proposed that two attentional sets can even be used simultaneously, as long as each feature set is tied to a specific location (for example, green to the left and red to the right side of the display’s center). Adamo et al. (2008) first presented a placeholder box on each side of a fixation cross. After a brief delay, a color cue appeared around one of the placeholders, followed by the presentation of a target in one of the placeholder boxes. Participants were asked to respond only to targets of one color (e.g., blue) on one side (e.g., left) and to targets of the other color (e.g., green) on the other side (e.g., right) and to refrain from responding in all other cases. They found a cuing benefit, relative to target-only presentations, when a matching color cue was presented at the matching location (e.g., green cue and green target on the right), but not when a nonmatching color cue was presented. This suggests that two attentional control sets can be maintained simultaneously as long as both sets are tied to separate locations in space. If observers are indeed able to maintain two spatially separated attentional sets in parallel, one might expect the costs of switching between these sets to be minimal.

The present study directly investigated the cost associated with switching attentional sets. We conducted two experiments in which observers were presented with displays consisting of four circles of different colors on the left side of fixation and another four colored circles on the right side. There was a target circle on each side. The task of the observers was to make a saccade first to the left target and then to the right target as quickly as possible. To prevent the preprogramming of saccades, the two halves of the display were presented sequentially, with the second display appearing as soon as observers looked at the first target. Crucially, the target was either of the same color on both sides (e.g., both red; the no-switch condition) or of different colors (e.g., red on the left, green on the right; the switch condition). In both conditions, the target colors were prespecified for an entire block, and importantly, in the switch condition, each target color was consistently tied to a location. By always presenting the two target displays in the same order and consistently tying them to the same location, we sought to enable observers to simultaneously maintain both sets (Adamo et al., 2008) and, thus, prevent switch costs, if possible. The rationale is that if, even under these optimal circumstances, performance suffers, we can conclude that switching attentional sets comes at a cost. Experiment 1 indeed showed such costs when a switch in feature set was required. Experiment 2 further explored this cost by having each target accompanied by a distractor matching the opposite set. While attention has not yet fully switched, observers should be more distracted by an object matching the current set. This allowed us to measure the time course of switching.

Experiment 1

There were two main conditions. In the no-switch condition, observers had to saccade toward targets of the same color presented on the left (T1) and on the right (T2) sides of the display (e.g., with both targets being red). In the switch condition, observers had to saccade toward targets of one color on the left side (T1; e.g., green) but of another color (T2; e.g., red) on the right side of the display. Target colors remained constant within one block of trials. The second target display appeared only after observers had selected the first target. We asked our participants to make a fast and direct saccade to the targets as soon as the display appeared on the screen.

If switching attentional sets is associated with costs despite the fact that the different targets are consistently tied to distinct display sides, saccades from one to the other color target should be less accurate or slower than when the targets have the same color.

In order to control for direct priming of the second target by the presentation of the first in the no-switch condition, we also included trials on which only the second target would appear (T2-only trials). For example, in a standard no-switch trial, a green target might appear on the left, followed by another green target on the right. It has been shown that such target repetitions lead to faster performance for the second target due to priming (Maljkovic & Nakayama, 1994; Olivers & Humphreys, 2003). That is, priming increases the speed with which a second target sharing features with a first target is selected. Accordingly, any advantage in performance in no-switch relative to switch trials may not be caused by costs in the switching trials but merely by priming benefits in the no-switching trials. On T2-only trials, only the right green target would appear. If there is still an advantage relative to the switch condition, this must be due to the fact that there is no need to switch sets, rather than to perceptual priming.

Method

Participants and apparatus

Sixteen university students participated for either course credit or money. The stimuli were generated using a standard PC running E-Prime (version 1.2; Psychology Software Tools, Pittsburgh, PA) on a 19-in. color monitor at 100 Hz. Viewing distance was about 65 cm. Eye movements were recorded with an Eyelink 1000 eyetracker (SR Research) using the standard built-in saccade detection algorithm. Data were analyzed with MATLAB (Mathworks, Natick, MA).

Design, stimuli, and procedure

Figure 1 shows a schematic depiction of one trial of the switch condition with the first saccade target (T1) on the left side and the second saccade target (T2) on the right side of the display. In the no-switch condition, both T1 and T2 had the same color. In the switch condition, T1 and T2 were differently colored. Each trial started with the presentation of a fixation cross in the center of the screen (until a stable fixation of at least 1 s). Drift correction was performed during the presentation of the fixation cross. Then the outlines of four circles were presented on each side of the screen, equidistant from fixation (about 7°). After a further 500 ms of stable fixation, the circle outlines on one side were filled with color. There were always three distractor colors (blue, yellow, gray; CIE x,y coordinates 0.211, 0.177; 0.417, 0.484; 0.303, 0.333) and one target color (red or green; CIE x,y coordinates 0.572, 0.343; 0.304, 0.509). We did not attempt to equate all colors for luminance, but we chose the target colors such that they were not the brightest or the dimmest. In addition, the two target colors were counterbalanced within participants. On half of all trials, the first display half was presented, consisting of four colored circles, one corresponding to T1. After the participants’ gaze landed on this target, the second half was presented at the opposite side of fixation, and the first half disappeared. The second display half always consisted of T2 and three distractors. If T1 was not reached after 800 ms, an error message was shown on the screen for 1 s, and the trial was started again. On the other half of the trials, the side that contained T2 appeared first, requiring participants to make a saccade to T2 only (see the lower panel in Fig. 1). This manipulation was included to exclude the possibility that the color of T1 directly primed the color of T2 in the no-switch condition and to force participants to keep two sets active in the switch condition. If they did not look at T2 within 800 ms, an error message was shown on the screen, and a new trial was started.

Fig. 1

The time course of one trial with T1 on the left side and T2 on the right side of the display. T1 and T2 could have either the same color (no-switch condition) or different colors (switch condition, depicted here). Each trial started with the presentation of a fixation cross for 500 ms. On 50% of all trials, participants made a saccade to T1, followed by a saccade to T2. On the other 50% of all trials, participants made a saccade only to T2. Whitish gray corresponds to the yellow distractor, blackish gray to the blue distractor, light gray to the gray distractor, and intermediate grays to the red and green targets

Each participant completed four blocks of 100 trials with the color combinations red → red, red → green, green → red, and green → green for T1 and T2, respectively. The order of blocks was counterbalanced across participants. Whenever participants were asked to make two fast successive eye movements (first to T1, then to T2), T2 appeared together with three distractors as soon as participants looked at T1. At the same time, T1 and the first three distractors disappeared. For half of the participants, the side containing T1 was always presented on the left side of the display, requiring a leftward saccade to T1, followed by a rightward saccade to T2, or only one rightward saccade to T2. For the other half of the participants, this assignment was reversed. T1 and T2 were always randomly placed on one of the four possible positions within a display half. Participants completed 20 practice trials.
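
The block structure described in this section can be illustrated with a short Python sketch. It is not the E-Prime script used in the experiment. The function name build_block, the dictionary keys, and the random intermixing of the two trial types within a block are assumptions made for illustration; the text specifies only that half of the trials were T2-only trials and that target positions were chosen randomly.

```python
import random


def build_block(t1_color, t2_color, n_trials=100, n_positions=4, seed=None):
    """Return a shuffled list of trial descriptions for one block.

    Hypothetical helper: trial types are randomly intermixed here,
    which is an assumption not stated in the text.
    """
    rng = random.Random(seed)
    # Half of the trials show T1 followed by T2; the other half show T2 only.
    n_double = n_trials // 2
    kinds = ["T1_then_T2"] * n_double + ["T2_only"] * (n_trials - n_double)
    rng.shuffle(kinds)

    condition = "no-switch" if t1_color == t2_color else "switch"
    trials = []
    for kind in kinds:
        has_t1 = kind == "T1_then_T2"
        trials.append({
            "kind": kind,
            "condition": condition,
            "t1_color": t1_color if has_t1 else None,
            "t2_color": t2_color,
            # Targets occupy one of the four positions within a display half.
            "t1_pos": rng.randrange(n_positions) if has_t1 else None,
            "t2_pos": rng.randrange(n_positions),
        })
    return trials
```

Calling build_block("red", "green") would give one switch block, and build_block("red", "red") one no-switch block; the four color combinations listed above correspond to four such calls.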

Data analysis

We discarded saccades with latencies below 80 ms and above 600 ms and trials with blinks or other artifacts (in total, 35% of all saccades were excluded). Note that since observers had to perform two eye movement tasks (one on the left and one on the right), the chance of an error was increased. Saccades were defined as having ended on a target or a distractor circle when the endpoint of the saccade fell into a “wedge”-shaped region around the circle with an inner radius of about 6.5° and an outer radius of about 10.5° from the center of the screen. We focused our analysis on three saccade types: (1) the first saccade after T1 onset that ended either on the target or on one of the distractors, (2) the first saccade after T2 onset that ended either on the target or on one of the distractors on trials when only T2 was presented, and (3) the first saccade after T2 onset, which ended either on T2 or on one of the distractors on trials in which T1 had been presented (and was looked at) first. We calculated the proportion of saccades to the target relative to all saccades that ended on any object (target or distractor) and the time to the target—that is, the time between display onset and the end of the saccade for each of these saccade types.
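
The exclusion and classification rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the MATLAB code used for the analysis. The latency limits and the inner and outer radii come from the text; the angular half-width of the wedge (HALF_WIDTH_DEG), the representation of item locations as polar angles, and all function and key names are hypothetical.

```python
import math

MIN_LATENCY_MS = 80      # from the text
MAX_LATENCY_MS = 600     # from the text
INNER_RADIUS_DEG = 6.5   # from the text
OUTER_RADIUS_DEG = 10.5  # from the text
HALF_WIDTH_DEG = 15.0    # ASSUMPTION: the wedge's angular extent is not reported


def landed_on(endpoint, item_angle_deg):
    """True if an endpoint (x, y in degrees from screen center) falls in the
    wedge around an item located at the given polar angle."""
    x, y = endpoint
    radius = math.hypot(x, y)
    angle = math.degrees(math.atan2(y, x))
    # Smallest signed angular difference, mapped to [-180, 180).
    diff = (angle - item_angle_deg + 180.0) % 360.0 - 180.0
    in_ring = INNER_RADIUS_DEG <= radius <= OUTER_RADIUS_DEG
    return in_ring and abs(diff) <= HALF_WIDTH_DEG


def proportion_on_target(saccades):
    """Proportion of saccades to the target among all saccades that ended
    on any object (target or distractor), after latency exclusion."""
    on_target = 0
    on_object = 0
    for s in saccades:
        if not MIN_LATENCY_MS <= s["latency_ms"] <= MAX_LATENCY_MS:
            continue  # discard anticipatory and very late saccades
        if landed_on(s["endpoint"], s["target_angle"]):
            on_target += 1
            on_object += 1
        elif any(landed_on(s["endpoint"], a) for a in s["distractor_angles"]):
            on_object += 1
    return on_target / on_object if on_object else float("nan")
```

Saccades that end on no object are left out of the denominator, which matches the definition of the proportion given above.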

Results and discussion

Figure 2 shows the proportion of saccades that ended on a target in each of the conditions and for each saccade type. An ANOVA with the factors of condition (no-switch, switch) and saccade type (T1, T2 only, T2 after T1) revealed an overall drop in accuracy of 9.3% whenever participants had to switch between attentional sets [main effect of condition: F(1, 15) = 39.280, p < .001]. Furthermore, there was an accuracy difference between saccade types [main effect of saccade type: F(1, 15) = 30.245, p < .001]. Saccades to T2 after T1 were less accurate (mean saccades to target = 61.2%) than saccades to T2 only (mean saccades to target = 82.4%) and to T1 (mean saccades to target = 82%). The interaction of condition and saccade type was not significant. Nevertheless, we tested whether the drop in accuracy for the switch condition, relative to the no-switch condition, was significant for each saccade type. This was indeed the case. Accuracy was, on average, 6% lower in the switch condition than in the no-switch condition for saccades to T1, t(15) = 2.885, p < .05, 11% lower for saccades to T2 only, t(15) = 4.957, p < .001, and also 11% lower for saccades to T2 after T1, t(15) = 3.309, p < .01.
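
The per-saccade-type comparisons reported above have 15 degrees of freedom with 16 participants, which is what a within-participant (paired) t test would give. A minimal Python sketch of such a test follows, assuming one accuracy value per participant and condition. The function name and argument layout are hypothetical, and the sketch does not reproduce the ANOVA.

```python
import math
from statistics import mean, stdev


def paired_t(no_switch, switch):
    """Paired t test on per-participant scores.

    Returns (mean difference, t statistic, degrees of freedom).
    Both lists must hold one value per participant, in the same order.
    """
    if len(no_switch) != len(switch):
        raise ValueError("both conditions need one value per participant")
    diffs = [a - b for a, b in zip(no_switch, switch)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # stdev uses n - 1
    return mean(diffs), mean(diffs) / standard_error, n - 1
```

With 16 participants the function returns 15 degrees of freedom, and a positive mean difference indicates higher accuracy in the no-switch condition.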

Fig. 2

Proportion of saccades to the target in Experiment 1

An ANOVA with the same factors on the time to target revealed that participants were, on average, 9 ms slower whenever they had to switch attentional sets [main effect of condition: F(1, 15) = 8.576, p = .01]. This difference was most pronounced for saccades to T2 only (18 ms), whereas it was less pronounced for saccades to T2 after T1 was presented (4 ms) and for saccades to T1 [4 ms; interaction of condition with saccade type: F(2, 30) = 7.741, p < .01]. Note that no-switch and switch trials were always presented in separate blocks, allowing the proportion of saccades that ended on a target to be separately calculated for each saccade type (T1, T2 only, and T2 after T1).

These results go against the hypothesis that switching can be performed without any costs. Even though both sets were consistently tied to separate spatial locations, performance was worse in the switch than in the no-switch condition. The fact that switch costs were also found for the first of two targets means that switch costs for only the second target were probably even slightly underestimated.

Experiment 2

In Experiment 1 we used displays consisting of multiple heterogeneous objects for two reasons. First, we wanted to encourage participants to adopt feature-specific attentional sets, and second, we wanted to see whether they would adopt a separate set for each side of the display, because previous work had suggested that two sets may be maintained in parallel under these circumstances (Adamo et al., 2008). However, note that it was not strictly necessary to tie the two different target colors to specific sides of the display to do the task. Participants could decide to just look for any of the target colors anywhere in the display (i.e., adopt a pair of display-wide sets rather than location-specific sets). After all, the T1 and T2 displays each contained only one object that matched one of the sets (i.e., the targets), thus automatically leading to selection. It is possible that having to switch between multiple display-wide sets is costly, while switching sets between two different spatial areas is efficient.

Experiment 2 was designed to prevent display-wide attentional settings and to further coerce observers to prepare for the attentional switch. We did this by including the other target color as a distractor in each of the display halves. Thus, if the task was to look for red on the left and green on the right, a green distractor was present on the left, and a red distractor on the right. This way, participants should have every incentive to look for only one particular color at only one particular side of the screen. If switching location-specific sets is associated with switch costs, we should again find costs for the condition requiring two sets, as compared with the condition that required only one set. Moreover, we then expected that the distractors associated with the other attentional set would interfere with search by attracting attention.

A second advantage of this procedure is that we can assess the time course of the attentional switch by looking at the accuracy of the eye movements to the second display as a function of time since display onset. Early in time, observers may still be employing the old attentional set and, thus, may look at the matching distractor more often than at the new target. With time, however, observers will have switched attentional sets and, thus, will prefer the target. The crossover of these preferences can then be taken as the switch time.

Method

The present experiment was identical to Experiment 1, except that now both possible target colors (red and green) always appeared together with two distractors (blue and yellow) on each side of the search display. Thus, the gray distractor of the previous experiment was replaced by the color associated with the other attentional set. In the no-switch condition, the gray distractor was replaced with the other, now irrelevant, possible target color. Sixteen university students received either course credit or money for participating in the experiment. In total, 24% of all saccades needed to be discarded due to blinks and other artifacts.

Results and discussion

Figure 3 shows the proportion of saccades that went to the target and to the distractor associated with the other attentional set for each saccade type and condition. As in Experiment 1, the no-switch and switch trials were presented in separate blocks, allowing the proportions of saccades that went to the target and to the distractor to be separately calculated for each saccade type (T1, T2 only, T2 after T1). We found an overall drop in accuracy of 26.3% when participants had to switch attentional sets [main effect of condition: F(1, 15) = 191.740, p < .001]. Overall, accuracy was lowest for saccades to T2 after T1 (38.1%), intermediate for saccades to T2 only (53.1%), and highest for saccades to T1 [66.7%; main effect of saccade type: F(1, 15) = 61.619, p < .001]. The differences in accuracy between conditions varied with saccade type [interaction of condition with saccade type: F(2, 30) = 6.457, p < .01]. In the switch condition, accuracy was, on average, 22% lower than in the no-switch condition for saccades to T1, t(15) = 6.666, p < .001, 35% lower for saccades to T2 only, t(15) = 12.656, p < .001, and 21% to T2 after a saccade to T1, t(15) = 6.439, p < .001. Overall, the switch costs observed in Experiment 2 were higher than those observed in Experiment 1, possibly due to the simultaneous presentation of the distractor.

Fig. 3

Proportion of saccades to the target (T) and the distractor associated with the other set (D) in Experiment 2. Dotted lines indicate the average proportion of saccades that ended on a regular (blue or yellow) distractor

In addition, we found an overall cost in the time to target of, on average, 21 ms whenever participants were supposed to switch attentional sets [main effect of condition: F(1, 15) = 8.991, p < .01]. Furthermore, the time to T2 after presentation of T1 was, overall, fastest (243 ms), and the time to T2 only was intermediate (255 ms), whereas the time to T1 was slowest [262 ms; main effect of saccade type: F(1, 15) = 3.409, p < .05]. The difference in time to target between the switch condition and the no-switch condition was most pronounced for saccades to T2 only (33 ms), intermediate for saccades to T1 (17 ms), and smallest for saccades to T2 after the presentation of T1 [15 ms; interaction of condition with saccade type: F(2, 30) = 3.409, p < .05].

For each type of saccade, we then compared the proportion of saccades that erroneously ended on the distractor associated with the other attentional set to the proportion that would have been expected to end on this distractor if there had been no interference from the other set. To do so, we computed the average proportion of saccades that ended on a regular distractor (blue and yellow). If the distractor that was associated with the other attentional set was treated as a regular distractor, a similar proportion of saccades should have ended on this distractor. These proportions are indicated as dotted lines in Fig. 3. However, we found that whenever participants were supposed to switch attentional sets, significantly more saccades went to the distractor associated with the other set than to a regular distractor. These differences were 16% for saccades toward T1, t(15) = 7.995, p < .001, 34% for saccades toward T2 only, t(15) = 10.313, p < .001, and 14% for saccades toward T2 after T1 was presented, t(15) = 4.627, p < .001. In the no-switch condition, the proportion of saccades that ended on the now irrelevant distractor was about equal to the expected proportion (see Fig. 3), all ts(15) < 1.964, all ps > .06. If anything, there was a trend toward a suppression of the irrelevant distractor.
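
The baseline comparison described above amounts to a simple difference of proportions, sketched here in Python. The function name and its count-based arguments are hypothetical; the only elements taken from the text are the use of the two regular distractor colors (blue and yellow) as the baseline and the restriction to saccades that ended on an object.

```python
def capture_excess(n_other_set, n_regular_total, n_object_saccades,
                   n_regular_colors=2):
    """Excess proportion of saccades to the other-set distractor.

    n_other_set:       saccades ending on the distractor in the other target color
    n_regular_total:   saccades ending on any regular (blue or yellow) distractor
    n_object_saccades: all saccades ending on any object (target or distractor)
    """
    if n_object_saccades <= 0:
        raise ValueError("need at least one object-directed saccade")
    p_other = n_other_set / n_object_saccades
    # Expected proportion if the other-set distractor behaved like a regular
    # one: the average over the regular distractors.
    p_expected = (n_regular_total / n_regular_colors) / n_object_saccades
    return p_other - p_expected
```

A value near zero would mean that the other-set distractor was treated like any regular distractor, whereas a positive value would indicate interference from the other attentional set.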

To estimate the switch time, we analyzed saccadic accuracy for T2 as a function of time (since T2 onset; cf. van Zoest, Donk, & Theeuwes, 2004). For this purpose, we binned the first two saccades into six bins and determined the average time to the target and the proportion of saccades that ended on T2, as well as the proportion of saccades that ended on the distractor associated with the other set. Figure 4 depicts the proportion of saccades that ended on the target (solid lines) and on the distractor associated with the other attentional set (dashed lines) as a function of time to target. Whenever participants did not need to switch, the majority of even the fastest saccades already ended on the target. The more time passed, the higher the proportion of saccades that were correctly directed to the target. However, when participants had to switch attentional sets, the earliest saccades ended on the distractor associated with the first set. This indicates that participants were still using the first (i.e., old) attentional set (Al-Aidroos & Pratt, 2010; Folk, Remington, & Johnston, 1992). The proportion of correct saccades to the target (i.e., second set) then gradually increased, while the proportion of saccades to the distractor associated with the other (i.e., first) attentional set gradually decreased. As is shown in Fig. 4, the competition between the old-set distractor and new-set target shifted balance in favor of the latter around 250–300 ms post onset, suggesting that this is the time it takes to switch.
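
A time-course analysis of this kind can be sketched in Python as follows. The sketch assumes equal-count bins ordered by time since T2 onset and estimates the crossover by linear interpolation between adjacent bins; neither detail is specified in the text, and the function names and the labels "target", "old_set", and "other" are hypothetical.

```python
def bin_time_course(saccades, n_bins=6):
    """Summarize accuracy per time bin.

    saccades: list of (time_ms, landed) pairs, where landed is "target",
    "old_set" (distractor in the other set's color), or "other".
    Returns one (mean_time, p_target, p_old_set) row per non-empty bin.
    ASSUMPTION: bins hold equal numbers of saccades, sorted by time.
    """
    ordered = sorted(saccades, key=lambda s: s[0])
    n = len(ordered)
    rows = []
    for b in range(n_bins):
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_time = sum(t for t, _ in chunk) / len(chunk)
        p_target = sum(where == "target" for _, where in chunk) / len(chunk)
        p_old = sum(where == "old_set" for _, where in chunk) / len(chunk)
        rows.append((mean_time, p_target, p_old))
    return rows


def crossover_time(rows):
    """Time at which p_target first overtakes p_old_set, linearly
    interpolated between adjacent bins; None if no crossover occurs."""
    for (t0, pt0, po0), (t1, pt1, po1) in zip(rows, rows[1:]):
        d0, d1 = pt0 - po0, pt1 - po1
        if d0 < 0 <= d1:
            return t0 + (t1 - t0) * (-d0) / (d1 - d0)
    return None
```

Applied to the switch condition, crossover_time would return the point at which the target starts to win the competition against the old-set distractor, which is the quantity read off Fig. 4.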

Fig. 4

Proportion of saccades from T1 to T2 that ended on the target (solid lines) and on the distractor associated with the other attentional set (dashed lines) as a function of time to target

General discussion

In two experiments, we assessed whether people can efficiently switch attentional sets. We asked observers to sequentially saccade toward two targets of the same color (no-switch condition) or of different colors (switch condition), corresponding either to one display-wide attentional set or to two attentional sets tied to different parts of the display. Each target was always presented together with three differently colored distractors, in order to ensure that selection had to be based on the feature associated with the attentional set (i.e., color). We found that saccades in the switch condition were slower and, especially, less accurate than those in the no-switch condition. Furthermore, we found that whenever a target was presented together with a distractor having the color associated with the other attentional set, a large proportion of saccades ended not on the target but on this distractor. This interference clearly speaks against the possibility that attentional switches can be performed without any costs. This was further supported by an analysis of the time to target for saccades toward the second target, showing that participants shifted from the first (i.e., old) to the second (i.e., new) set about 250–300 ms after T2 onset.

Our study shows that a switch between attentional sets is associated with costs, even when these sets are tied to separate spatial locations. Adamo et al. (2008) suggested that attentional sets could be maintained simultaneously as long as these sets refer to separate spatial locations. Our results do not exclude the possibility that people can simultaneously maintain two attentional sets to at least some extent (see also Ansorge & Heumann, 2004; Ansorge & Horstmann, 2007; Ansorge, Horstmann, & Carbone, 2005). However, our results suggest that if people have this ability, this does not preclude the occurrence of switch costs. Indeed, recently, Moore and Weissman (2010, 2011) have proposed that it is possible to passively keep two attentional sets in memory but that, for a set to affect selection, it needs to be put in the active focus of attention (for a similar view, see also Adamo, Wozny, et al., 2010; Parrott, Levinthal, & Franconeri, 2010). It is quite possible that the switch costs we find reflect this process.

Our study provides a more direct assessment of attentional switch costs than previous studies that used cues to indicate a new target on each trial. Those studies also came to the conclusion that it takes time to change the target set. However, such estimates invariably include the time it takes to interpret the cue, and not only the time to perform the switch. Moreover, as Wolfe (2004) showed, these estimates depend on the type of cue used, with picture cues causing optimal performance at a 200-ms cue–target SOA, while word cues need more than 800 ms. The only study that has controlled for cue processing time between switch and no-switch trials is Rushworth et al. (2005). However, in that study, only a single SOA of 2,000 ms was used, precluding the possibility of estimating the switch time.

Our findings are completely in line with the literature on task switching (Jersild, 1927). In task-switching experiments, participants are asked to apply two different stimulus–response rules to the same perceptual stimulus (see, e.g., Allport, Styles, & Hsieh, 1994). Usually, participants are less accurate and slower in responding to the stimulus when they switch stimulus–response rules from trial to trial than when they apply the same rule on two consecutive trials. This is even the case when the switch is completely predictable (e.g., on each trial or on every second trial) and each stimulus–response rule is tied to a specific location in the display, as shown by Rogers and Monsell (1995). These authors estimated the switch cost for such predictable switching to be between 200 and 300 ms, which is consistent with the estimate of 250–300 ms we obtained. It is also consistent with estimates of what has been termed the attentional dwell time in paradigms where observers need to switch from identifying a digit to identifying a letter (Duncan, Ward, & Shapiro, 1994; Ward, Duncan, & Shapiro, 1996). Our results also agree nicely with findings by Moore and Weissman (2010, 2011). Their participants looked for targets in either of two possible colors in a central RSVP stream flanked by two peripheral streams. It was found that 100–300 ms after a target-colored distractor had been presented in a peripheral stream, it was more difficult to identify a differently colored target, because the first attentional set was still active (“in the focus of attention”). Taken together, these and our results show that people need some time to switch between two tasks or between two attentional sets and that this takes about a quarter of a second.