Attention Interacts With Emotion to Drive Perceptual Impairment of Images in an RSVP Task

Divita Singh 1 a, Meera Mary Sunny 2
1 School of Arts and Sciences, Ahmedabad University, Ahmedabad, IN; Centre for Cognitive and Brain Sciences, Indian Institute of Technology Gandhinagar, Gandhinagar, IN
2 Centre for Cognitive and Brain Sciences, Indian Institute of Technology Gandhinagar, Gandhinagar, IN


Introduction
Previous studies examining the temporal limits of attention have shown that, in a Rapid Serial Visual Presentation (RSVP) task, detection accuracy for the second of two targets (T2) is reduced when it is presented 200-500 ms after the first target (T1) (Raymond et al., 1992; Shapiro et al., 1994). This reduction in T2 detection accuracy, which is contingent on the correct identification of T1, is known as the Attentional Blink (AB) (Raymond et al., 1992). Importantly, Raymond and colleagues showed that T2 detection was not affected by the presence of T1 when the task was to report only the presence/absence of T2 while ignoring T1. Based on this finding, they argued that the post-target processing deficit is not due to visual suppression or other early perceptual processes but is rather an attentional effect. Even though the specific mechanisms underlying the AB are debated, there is a consensus that the AB is an attentional effect (Olivers & Meeter, 2008; Raymond et al., 1992; Schneider, 2013; Spalek et al., 2006; but see also Bowman & Wyble, 2007; Chun & Potter, 1995; Jolicoeur & Dell'Acqua, 1998).
More recently, Most et al. (2005) asked participants to identify a neutral target image while ignoring an emotional distractor image. Contrary to the typical AB finding, they showed that even when participants were required to identify only a single target image, the emotional distractor impaired identification of the subsequent target. Specifically, they found reduced accuracy in detecting the orientation of a landscape/architecture image when it appeared within 200-500 ms after an emotional distractor image, as compared to a neutral or scrambled distractor image. This impairment arising from the processing of the emotional distractor is referred to as Emotion Induced Blindness (EIB). Most et al. (2005) also examined the role of attentional set in the perception of a neutral target image that followed an emotional, neutral, or scrambled image. Attentional set was manipulated by using a fixed or a variable category of target images (their Expt. 2). Participants performed better in the fixed attentional set condition than in the variable attentional set condition. Importantly, this performance benefit occurred only for the group who scored low in harm avoidance, not for the high harm avoidance group. The deficit was discussed in terms of attentional rubbernecking driven by the automatic capture of attention by emotional images.
However, in a subsequent study, Wang et al. (2012) argued that EIB is a result of spatio-temporal competition at an early stage of visual processing. That is, when an emotional and a neutral image are presented spatially and temporally close to each other, they compete for early perceptual representation, and the stronger signal (in this case the emotional image) wins the competition, impairing perception of the less salient one (the neutral target image). This explanation carries weight because EIB was observed only when the emotional distractor and the neutral target were presented close to each other not only temporally but also spatially. In contrast, many studies have shown that AB is observed even when the target and distractor appear in different spatial locations (Lunau & Olivers, 2010). Hence, it was argued that AB is a deficit arising from a capacity limitation at the central bottleneck, whereas EIB is the result of perceptual interference in early visual representations. However, later studies showed that AB and EIB share similar electrophysiological components, suggesting that the mechanism underlying EIB might be similar to that of the AB (Kennedy et al., 2014).
One important difference between AB and EIB lies in the use of emotional stimuli and their top-down relevance in the task context. For example, Mathewson et al. (2008) examined the effect of emotional and taboo words on the identification of a neutral target word in an RSVP stream and showed reduced accuracy for a neutral T2 word when it followed an emotional T1 word. Interestingly, they showed a similar impairment even when the emotional word in the RSVP stream was not a target and participants were asked to ignore it. Mathewson et al. (2008) explained the target impairment following an emotional pseudo-target as resulting from attentional capture by the emotional stimulus, which produces a generalized impairment in the detection of a target that follows it. There is much evidence that emotional stimuli capture attention and that the processing of emotional stimuli requires attention (Pessoa et al., 2002). Additionally, it has been shown that allocating top-down attention to emotional stimuli leads to an additive effect wherein the effects of attention are potentiated (Phelps et al., 2006).
The present study aims to examine how emotion interacts with attentional control in driving the impairment in EIB. We hypothesize that allocating top-down attention to a relevant non-emotional target image could have an effect similar to that of an emotional distractor. If this is true, it could be argued that the emotional distractor presented in the RSVP stream captures attention automatically, leading to a subsequent impairment in the identification of a second target that follows (Expts. 1 and 2). We also test how the impairment in the processing of a neutral target image is affected by the presence or absence of an emotional distractor image (Expts. 3 and 4).
Experiments 1-4
The first two experiments were comparable to an AB task, and participants were required to report two targets. In Expt. 1, the first target was an emotional image and participants reported its valence, whereas in Expt. 2, the T1 image was oriented left or right within an RSVP stream of upright images, and participants were asked to report its orientation. The third and fourth experiments were similar to the first and second experiments, respectively, except that participants were not required to report any attribute of the emotional image (Expt. 3) or the neutral oriented image (Expt. 4) and reported only the second target. We expect that in both Expts. 1 and 2, allocating attention in a top-down manner will lead to an impairment in the subsequent orientation discrimination task, irrespective of the emotional nature of T1. However, when T1 is to be ignored, only the emotional T1 will capture attention, and hence an impairment in the subsequent orientation discrimination task will occur only in Expt. 3 and not in Expt. 4. This yields a fully factorial design in which emotion and attentional control are manipulated.
… Martino et al., 2009; Kawahara et al., 2006; Martens et al., 2010; Nieuwenstein et al., 2005).
All participants gave prior consent for participation and reported normal or corrected-to-normal vision. All participants were given a monetary reward for their participation in the study. The protocol was approved by the Institutional Ethics Committee.
Apparatus and stimulus: Participants were seated in a dimly lit room in front of an IBM PC-compatible computer with an 18.5-inch, 60 Hz LCD monitor. All stimuli were presented using Matlab with the Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997). Participants used six keys (Y, N, V, B, Alt, and Ctrl) on a standard keyboard for their responses. To make responding easier, these keys were labelled with their corresponding values. All stimuli used in the experiment were coloured photographs with a resolution of 400 × 300 pixels. The images subtended a visual angle of 11° × 8° at a distance of approximately 65 cm and were presented on a black background.
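The relation between visual angle, physical size, and viewing distance used in the stimulus description can be illustrated with a short Python sketch. It relies only on the standard formula θ = 2·atan(size / (2·distance)); the function names are hypothetical, and the physical image size it prints is derived from the reported angles and the approximate 65 cm distance, not a value stated in the paper.

```python
import math

def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an object of size_cm at distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

def size_for_angle_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical size (cm) needed to subtend angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

VIEWING_DISTANCE_CM = 65.0  # approximate viewing distance reported in the text

# Physical extent implied by the reported 11 x 8 degree images (derived, not reported)
width_cm = size_for_angle_cm(11.0, VIEWING_DISTANCE_CM)
height_cm = size_for_angle_cm(8.0, VIEWING_DISTANCE_CM)
print(f"Implied image size: {width_cm:.1f} cm x {height_cm:.1f} cm")
```

Converting the result back with `visual_angle_deg(width_cm, VIEWING_DISTANCE_CM)` returns the original 11°, which is a quick way to check a display calibration against the reported stimulus size.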
Each RSVP stream contained 22 images, including target(s) and/or a distractor as well as filler images. In Expt. 1, the T1 images were emotional and were selected from a pool of 80 images with similar arousal levels: 40 positive images (mean valence = 7.4, mean arousal = 6.4) and 40 negative images (mean valence = 1.8, mean arousal = 6.4). All emotional images were taken from the International Affective Picture System (IAPS) database (Lang, Bradley, & Cuthbert, 2001). In Expt. 2, the T1 images were landscape or architectural images tilted 90° to the left or right, counterbalanced across participants. In both experiments, T2 was drawn from a separate set of 80 landscape and 80 architecture images, tilted 90° to the left or right. These were taken from publicly available sources. The emotional distractor used in Expt. 3 was the same as T1 in Expt. 1, while the tilted landscape/architecture distractor used in Expt. 4 was the same as T1 in Expt. 2. The filler images were the same in all four experiments.
Procedure and design: Each trial started with the presentation of a fixation cross in the center of the screen for 1000 ms. This was followed by an RSVP stream of 22 pictures. Each picture was presented for 100 ms, with no temporal gap between pictures. In the first two experiments, the first target appeared at the 4th, 6th, 8th, 10th, or 12th position of the RSVP stream, and the second target could appear at the 1st (100 ms), 2nd (200 ms), or 8th (800 ms) position after the first target. In the 3rd and 4th experiments, the same algorithm was used to determine the placement of the irrelevant emotional or non-emotional distractor as well as the target. The target was never the last item in the RSVP stream.
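The placement rule described above can be sketched in Python as follows. This is only an illustration of the stated constraints (22 items, 100 ms per item, T1 at positions 4 to 12, lags of 1, 2, or 8); the names are hypothetical, and the paper does not state how the position and lag combinations were sampled across trials, so the sketch simply enumerates them.

```python
from itertools import product

STREAM_LENGTH = 22                 # images per RSVP stream
ITEM_DURATION_MS = 100             # each image shown for 100 ms, no gap
T1_POSITIONS = (4, 6, 8, 10, 12)   # serial position of T1 (or of the distractor)
LAGS = (1, 2, 8)                   # position of T2 relative to T1

def build_conditions():
    """Enumerate every T1-position x lag combination allowed by the design."""
    conditions = []
    for t1_pos, lag in product(T1_POSITIONS, LAGS):
        conditions.append({
            "t1_pos": t1_pos,
            "lag": lag,
            "t2_pos": t1_pos + lag,
            "soa_ms": lag * ITEM_DURATION_MS,  # onset asynchrony between T1 and T2
        })
    return conditions

conditions = build_conditions()

# The latest possible T2 position is 12 + 8 = 20, so T2 is never the last (22nd) item.
latest_t2 = max(c["t2_pos"] for c in conditions)
print(len(conditions), "combinations; latest T2 position:", latest_t2)
```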
In Expts. 1 and 2, participants were asked to report two target images, whereas in Expts. 3 and 4, participants were asked to ignore the distractor image and to report only the target image. In Expt. 1, T1 was defined as an emotional image, and participants were asked to report its valence. In Expt. 2, T1 was defined as an architectural image for half of the participants and as a landscape image for the other half, and participants were asked to report its orientation. In Expts. 3 and 4, the same T1 images from Expts. 1 and 2, respectively, were present in the RSVP stream but were not relevant for the task. Participants were required to report the orientation of the tilted image as T2 in Expts. 1 and 2 and as the only target in Expts. 3 and 4. Note that in Expts. 2 and 4, participants who were shown an architectural T1 were shown a landscape T2 and vice versa, counterbalanced. This was done so that if participants saw only one image, they were not confused about whether it was T1 or T2.
Collabra: Psychology
Once the RSVP stream ended, participants were asked to report either the valence (Expt. 1) or the orientation (Expt. 2) of T1. They used the "B" (positive/right) and "V" (negative/left) keys on a standard keyboard to indicate their responses. After that, they were asked whether or not they had seen the tilted landscape/architecture image (T2). If they reported having seen T2 by pressing "Y", they were asked to report its orientation using the "Alt" and "Ctrl" keys (Alt for left and Ctrl for right). If they reported that the target was absent by pressing "N", the next trial started automatically after 1000 ms. Expts. 3 and 4 followed the same response procedure, except that participants were not asked about the valence or orientation of the image they had to ignore. All responses were recorded to calculate percentage accuracies. Each participant completed 10 practice trials followed by 150 experimental trials, divided into two blocks of 75 trials. An enforced break of 2 minutes was given after the first block, after which participants pressed any key to resume the experiment. T1 was absent on 6% of the trials, which acted as catch trials. Visual feedback ("incorrect response; press any key to continue") was given for every incorrect response, and the experiment resumed when the participant pressed a key.

Results
In both Expt. 1 and 2, we calculated percentage accuracy for T1 and T2 separately at three lags (one, two, and eight). A correct T2 report was counted towards calculating the blink only if T1 was accurately reported in that trial. Typically, AB studies use contingent T2 accuracy (the denominator being the number of correct T1 trials) as a measure of blink rate, whereas EIB studies use only percentage accuracy for T2 (the denominator being the number of T2 trials). Since there is no T1 in Expts. 3 and 4, it is not possible to calculate contingent accuracy in an EIB-like task. Hence, to keep the measures comparable across all four experiments, we used T2 accuracy rather than contingent T2 accuracy as the measure of the blink.

Experiment 1
In order to estimate the attentional blink, we calculated both T1 and T2 accuracy at each lag for each participant.
Figure 1. Illustration of the general method for Expt. 1 and 2 with T2 appearing at Lag-2. The participant's task was to report the valence (Positive/Negative) of T1 in …
… Figure 2). An ANOVA on T1 accuracy with lag as a factor showed a significant main effect of lag, F(2, 30) = 8.04, p = .002, ηp² = .72. Mean accuracy was 96.5% at Lag-8, 96.6% at Lag-2, and 92.6% at Lag-1.
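The difference between the two accuracy measures described at the start of the Results (T2 accuracy over all T2 trials versus T2 accuracy contingent on a correct T1 report) can be illustrated with a minimal Python sketch. The function name and the trial data are made up for illustration and are not the study's data or analysis code.

```python
from collections import defaultdict

def t2_accuracy_by_lag(trials, contingent=False):
    """Percentage of correct T2 reports at each lag.

    trials: iterable of (lag, t1_correct, t2_correct) tuples.
    contingent=False: denominator is every T2 trial at that lag (EIB-style measure).
    contingent=True: denominator is only the trials with a correct T1 report (AB-style measure).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for lag, t1_correct, t2_correct in trials:
        if contingent and not t1_correct:
            continue  # drop trials where T1 was missed
        totals[lag] += 1
        hits[lag] += int(t2_correct)
    return {lag: 100.0 * hits[lag] / totals[lag] for lag in sorted(totals)}

# Hypothetical trials for one participant: (lag, T1 correct?, T2 correct?)
trials = [
    (1, True, False), (1, True, True), (1, False, False), (1, True, False),
    (8, True, True), (8, False, True), (8, True, True), (8, True, False),
]

unconditional = t2_accuracy_by_lag(trials)
contingent = t2_accuracy_by_lag(trials, contingent=True)
print(unconditional)
print(contingent)
```

Only the first measure is defined when there is no T1 to report, which is why it can be applied to all four experiments.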

Experiment 2
In order to check whether T2 accuracy is modulated by the stimulus type, we calculated T2 accuracy separately for architectural and landscape T2 images. An ANOVA with category (architecture and landscape) as a factor did not show a significant main effect (p = .08), nor did category interact with Lag (p = .09).
In order to estimate the robustness of the observed effects with the sample size of 14, we conducted a sensitivity power analysis using G*Power 3.1.9.6 (Faul et al., 2007, 2009) with the F test family. The sensitivity power analysis was computed separately for the main effects and the interaction effects with a fixed power (1 - β) of .70 [the power of .70 was chosen based on the observed power in a post-hoc power analysis computed on the observed main effects (.80) and interaction effect (.70); we observed similar power when we computed the power analysis on previous studies with a similar design (Raymond, 2003; Raymond et al., 1992)]. The sensitivity power analysis computed with a sample size of 14, an alpha of .05, and a power (1 - β) of .70 showed an effect size f(V) (as in Cohen, 1988) of 0.79 for the main effects and 1.09 for the interaction effects. Effect sizes of .79 and 1.09 are considered large for the current design (Cohen, 1988), which confirms that the sample size of 14 provides adequate power to the study and the observed main …
… effect of Lag on T1 performance, F(2, 26) = 24.9, p < .001, ηp² = .65. Further pairwise comparisons showed a significant difference between Lag-1 (mean accuracy = 72.79%) and Lag-2 (mean accuracy = 87.38%) (p < .001) as well as between Lag-1 and Lag-8 (mean accuracy = 86.32%) (p = .001). There was no significant difference between accuracies at Lag-2 and Lag-8 (p = .55).

Experiment 3
Percentage accuracies for the detection of the neutral target were calculated for each participant separately at each of the three lags (see Panel C of Figure 2). These means were submitted to a repeated-measures ANOVA, which showed a significant main effect of Lag, F(2, 26) = 82.7, p < .001, ηp² = .86. Pairwise comparisons showed that accuracies at all three lags were significantly different from each other (p = .004), with the greatest accuracy at Lag-8 (mean accuracy = 80.28%), followed by Lag-2 (mean accuracy = 44.5%), and the lowest at Lag-1 (mean accuracy = 29.7%).
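As a worked check of how the reported effect size relates to the F statistic, partial eta squared can be recovered from F and its degrees of freedom using the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error). The sketch below applies it to the Lag effect reported for Expt. 3; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Main effect of Lag in Expt. 3: F(2, 26) = 82.7
eta_p2 = partial_eta_squared(82.7, 2, 26)
print(round(eta_p2, 2))  # 0.86, matching the reported value
```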

Experiment 4
In order to check whether target accuracy is modulated by the stimulus type, we calculated accuracy separately for architectural and landscape targets. An ANOVA with category (architecture and landscape) as a factor did not show a significant main effect, F(1, 6) = 0.03, p = .86, ηp² = .005, nor did category interact with Lag, F(2, 12) = 0.059, p = .94, ηp² = .01. Percentage accuracies calculated for each participant at each lag separately (see Panel D of Figure 2) were submitted to an ANOVA, which showed that target accuracy was not significantly different across the three lags, F(2, 26) = 2.5, p = .097, ηp² = .2, with comparable accuracy at Lag-1 (mean accuracy = 68.4%), Lag-2 (mean accuracy = 59.2%), and Lag-8 (mean accuracy = 62.6%).

Emotion x Attention Analysis
In order to determine the role of emotion and attentional control in the accuracy of target detection, a 2 × 2 × 3 mixed factorial ANOVA was performed on T2 accuracies, with Attention (top-down [Expts. 1 and 2] vs. bottom-up [Expts. 3 and 4]) and Emotion (emotional [Expts. 1 and 3] vs. non-emotional [Expts. 2 and 4]) as between-subjects factors and Lag (one, two, and eight) as a within-subject factor.
Importantly, there was a significant three-way interaction between Lag, Emotion, and Attention, F(2, 104) = 12.37, p < .001, ηp² = .2. Further analysis showed that this interaction was driven by the absence of a significant lag effect when the control of attention was bottom-up and T1 was non-emotional (p > .05). That is, in Expt. 4, where the oriented image was irrelevant to the task, it did not capture attention and thus did not lead to the lag-dependent impairment generally observed during the standard blink. Furthermore, when the control of attention was top-down, as in Expt. 1 and Expt. 2, there was significant impairment in T2 identification across lags, irrespective of the emotional nature of T1 (p < .001). That is, we found a similar level of lag-dependent impairment in Expt. 1, where T1 was emotional, and in Expt. 2, where T1 was a neutral stimulus. However, when attentional control was bottom-up, as in Expt. 3 and Expt. 4, T2 (or target) identification was impaired only when the pseudo-target was emotional. That is, a significant lag effect occurred in Expt. 3 (p < .001) but not in Expt. 4, where the control of attention was bottom-up but the distractor was non-emotional (p > .05).
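The way the four experiments map onto the between-subjects factors, and how the error degrees of freedom of such a mixed design follow from the group sizes, can be sketched in Python. The sketch assumes 14 participants in each experiment; that number appears in the power analysis, but applying it to every group is an assumption made here for illustration. All names are hypothetical.

```python
# Assignment of each experiment to the two between-subjects factors
DESIGN = {
    1: {"attention": "top-down", "emotion": "emotional"},
    2: {"attention": "top-down", "emotion": "non-emotional"},
    3: {"attention": "bottom-up", "emotion": "emotional"},
    4: {"attention": "bottom-up", "emotion": "non-emotional"},
}
LAGS = (1, 2, 8)  # within-subject factor

def mixed_anova_error_df(n_per_group: int, n_groups: int, n_lags: int):
    """Error df for between-subjects effects and for effects involving the within factor."""
    between_error = n_per_group * n_groups - n_groups
    within_error = between_error * (n_lags - 1)
    return between_error, within_error

# ASSUMPTION: 14 participants per experiment (not confirmed for every group in the text)
between_df, within_df = mixed_anova_error_df(14, len(DESIGN), len(LAGS))
print(between_df, within_df)  # 52 104
```

Under this assumption the values coincide with the error degrees of freedom in the reported F(1, 52) and F(2, 104) tests.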

Discussion
The present study was conducted to understand the relative contribution of emotional specificity and attentional control in driving the impairment in Emotion Induced Blindness (EIB) and to understand how attentional control and emotion might interact to modulate target identification in an RSVP task. We hypothesized that, regardless of the type of attentional control involved, attentional allocation to any stimulus can impair the processing of a subsequent target. That is, an emotional stimulus might result in target impairment because it captures attention in a bottom-up manner. However, the same impairment may be observed following a non-emotional stimulus when it is attended to in a goal-driven manner. In other words, both an emotional and a non-emotional stimulus may impair the subsequent identification of the target image if (i) it is relevant to the task or (ii) it is salient. The results show that the impairment in Expt. 1 was comparable to the impairment observed in both Expt. 2 and Expt. 3. Expt. 4 shows that the same non-salient feature from Expt. 2, when made task-irrelevant, does not impair target detection, suggesting that a non-salient stimulus that does not capture attention in a bottom-up manner does not lead to impairment when it is also task-irrelevant. The results of the present experiments support the hypothesis. There was significant impairment in the detection of a neutral target following top-down attentional allocation, irrespective of whether T1 was emotional or non-emotional. On the other hand, when the same T1 stimuli were to be ignored, T2 accuracies were impacted only when an emotional image, but not a neutral image, was presented in the RSVP stream.
… effects, as well as interaction effects in the study, are overpowered.
We also ran the same ANOVA with contingent accuracies for Experiments 1 and 2. The results showed the same pattern, with a significant main effect of Attention, F(1, 52) = 5.9, p = .01, ηp² = .1, but not of Emotion, F(1, 52) = .04, p = .834, ηp² = .001. The three-way interaction was also sig-
Moreover, many studies have shown that attention and emotion can independently have similar facilitatory effects on early perceptual processing (binocular rivalry, contrast sensitivity, etc.) and that independent neural systems are involved in these processes (LeDoux, 2003; Morris et al., 1998; Phelps et al., 2006; Sàenz et al., 2003). Recent studies have also shown that the joint facilitatory effects of attention and emotion on early perceptual processing are potentiated compared to those of either attention or emotion alone (Brosch et al., 2011; Keil et al., 2005; Phelps et al., 2006). That is, once attention is allocated to an emotional stimulus, the attentional effect is potentiated compared to the individual effect of emotion or of attention. The result of Expt. 1 confirms this additive effect by exhibiting greater accuracy for the emotional T1 compared to the neutral T1 of Expt. 2. This implies that the emotional characteristics of T1 potentiated the attentional effect, as reflected in the larger T1 accuracy in Expt. 1 than in Expt. 2, where T1 was neutral and only attention was applied (in a top-down manner). Most of these studies show potentiating effects on early perceptual processes. The absence of such a potentiating effect of attention and emotion in the results of Expt. 1, as compared to Expts. 2 and 3, could be evidence against the role of early perceptual effects in driving EIB.
Additionally, we found a similar kind of impairment for the second target in both Expt. 1 and 2, irrespective of the emotional valence of the first target. This suggests that once attention is allocated to T1 (in either a top-down or a bottom-up manner), this capture leads to the impairment of succeeding items. More importantly, the impairment observed in Expt. 3 was also comparable to that of both Expt. 1 and 2, suggesting that the effect is not sensitive to the type of control involved in attentional allocation. Taken together, the results suggest that EIB is less likely to be driven by factors at the early perceptual stage and more likely to be an attentional effect (similar to the AB), wherein emotion captures attention and the capture leads to impairment in subsequent target detection. Overall, the present study provides a novel understanding of how attentional control settings may play a significant role in the occurrence of the blink (especially in EIB) and questions the general narrative that EIB is different from the AB. Furthermore, these findings are theoretically important because they bridge the gap between two purportedly different phenomena and show how these are essentially driven by the same attentional processes.
In fact, the findings from our Expts. 2 and 3 are in line with the findings of other researchers who have shown that emotional as well as non-emotional salient stimuli can lead to AB. For example, in the study by Arnell et al. (2007), irrelevant emotionally arousing words impaired identification of the subsequent goal-relevant neutral target word. This finding suggests that even though the emotional words were not relevant to the task, attention was allocated involuntarily to them, leaving attention unavailable for the encoding of the subsequent neutral target item and thus resulting in AB. Additionally, similar impairment has been observed when the target was presented after an irrelevant colour singleton letter in an RSVP stream (Folk et al., 2008; Spalek et al., 2006). For example, Spalek et al. (2006) showed reduced T2 accuracy when T2 was presented after an irrelevant colour singleton stimulus in the RSVP stream. They argue that colour captures attention exogenously, leading to the Attentional Blink. Spalek et al. (2006) explained the impairment in terms of a hybrid input-filtering model wherein a hard-wired mechanism directly passes the salient signal to higher-level processing, leading to impairment in the perception of the subsequent less salient stimulus.
Similarly, Folk et al. (2008) showed impairment in the identification of a target letter when it followed a pseudo-target that shared its colour, but no impairment occurred when the target and pseudo-target did not have the same colour. That is, impairment in target identification was contingent on the attentional control settings. It is worth noting that in our Expt. 4, the contingent capture explanation would have predicted a blink, as the pseudo-target shared the horizontal orientation with the target. However, despite an overall reduction in accuracy, we did not find lag-dependent impairment. One explanation for the absence of contingent capture in Expt. 4 may be that the filler images also share features with the pseudo-target in the RSVP stream. However, it is also possible that the target was more distinctive than the distractor, so that the distractor failed to capture attention and no blink occurred. In fact, Kennedy & Most (2015) showed that increasing target distinctiveness can attenuate the distraction effect of an emotionally salient distractor.
Overall, the present study demonstrates the role of attentional control and emotion in the manifestation of EIB, suggesting that attentional allocation to the irrelevant distractor plays an essential role in driving the perceptual impairment of a subsequent neutral target in EIB. This implies that if a neutral stimulus is also made to capture attention, it will cause target impairment comparable to that observed for emotional distractors. Thus, future research should measure the role of the distractor's saliency in target perception in EIB. Future research should also control for stimulus and task-setting requirements to examine the absence of Lag-1 sparing in EIB, as it is considered one of the factors that distinguish EIB from the AB.
Data Accessibility Statement: The data used in the paper are available at https://doi.org/10.6084/m9.figshare.