Contextually-Based Social Attention Diverges across Covert and Overt Measures

Humans spontaneously attend to social cues like faces and eyes. However, recent data show that this behavior is significantly weakened when visual content, such as luminance and configuration of internal features, as well as visual context, such as background and facial expression, are controlled. Here, we investigated attentional biasing elicited in response to information presented within appropriate background contexts. Using a dot-probe task, participants were presented with a face–house cue pair, with a person sitting in a room and a house positioned within a picture hanging on a wall. A response target occurred at the previous location of the eyes, mouth, top of the house, or bottom of the house. Experiment 1 measured covert attention by assessing manual responses while participants maintained central fixation. Experiment 2 measured overt attention by assessing eye movements using an eye tracker. The data from both experiments indicated no evidence of spontaneous attentional biasing towards faces or facial features in manual responses; however, an infrequent, though reliable, overt bias towards the eyes of faces emerged. Together, these findings suggest that contextually-based social information does not determine spontaneous social attentional biasing in manual measures, although it may act to facilitate oculomotor behavior.


Introduction
Faces convey a great deal of information. From an evolutionary perspective, researchers have theorized that the hierarchical system of social groups in both human and non-human primates primarily relied on visual information in faces to convey social signals to others [1,2]. As such, systems that processed these signals quickly and efficiently enhanced the ability to accurately predict others' behavior and dispositions [3,4]. This prioritization of social information is evident developmentally, with a preference for faces and eyes early in life [5][6][7][8][9], as well as neurologically, with a distributed network of specialized brain structures within the temporal and occipital lobes (e.g., fusiform face area, superior temporal sulcus, occipital face area) that are specifically tuned for processing faces, gaze, and other socio-biological signals [10][11][12][13][14][15][16][17][18]. These findings suggest that information conveyed by faces and facial features like eyes represents a key component of the complex social communication system [19][20][21][22].
As such, it is intuitive to expect that faces and facial features would preferentially capture and spontaneously shift attention, a process often called social attentional biasing [12,23]. Consistent with this idea, research has demonstrated quick and spontaneous attentional biasing towards faces and eyes in both covert (attentional shifts independent of eye movements) and overt (attentional shifts accompanied by eye movements) measures. Covertly, attentional biasing is typically indexed by prioritized [60][61][62][63], with for example, increased congruency effects in identifying facial emotions when faces are consistent versus inconsistent with background scene contexts [64]. However, it remains relatively unexplored how background context influences social attentional biasing.
To address this question, here we used the same task and parameters as Pereira and colleagues [45], but embedded the face and house cues within natural contextual backgrounds as illustrated in Figure 1. We measured attentional biasing using a dot-probe task and assessed the speed of manual target discrimination when targets were presented at the previous location of the face versus the house cue. Since it is still unclear whether attentional biasing towards faces is driven by faces as a whole or by any specific facial feature, targets were positioned at either the previous location of the eyes or mouth of the face or the top or bottom of the house to allow for a more detailed examination of attentional biasing at each location. Experiment 1 measured covert attention while participants maintained central fixation, whereas Experiment 2 measured natural eye movements using an eye tracker. If contextually-based social information resulted in robust social attentional biasing, we expected to find a reliable social attentional bias in both covert and overt measures, with faster responses in manual measures for targets occurring at the previous location of the face, and in particular the eyes, and a greater proportion of saccades directed towards the face and eye cues.

Materials and Methods
Participants. Thirty volunteers, with normal or corrected-to-normal vision, participated (25 females, M age = 21 years, SD age = 3 years). They were remunerated with course credits. This sample size falls within the range reflected by an a priori power analysis (G*Power [65]) that was based on the estimated magnitude of face selection effects from past research [24,29,66,67]. The analysis indicated that data from 6-38 participants were needed to detect medium-to-large effects ranging from 0.41-1.36 (as estimated from Cohen's ƒ) with corresponding power values from 0.95-0.97. Informed consent was obtained from all participants before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and all protocol and procedures were approved by the University Research Ethics board (protocol number 81-0909).
Stimuli and Apparatus. All stimuli were presented on a 16" cathode ray tube (CRT) monitor at an approximate viewing distance of 60 cm. Stimulus presentation sequence was controlled by MATLAB's psychophysics toolbox [68].
The fixation screen consisted of a fixation cross (1° × 1° of visual angle), positioned at the center of the screen and set against a uniform 60% gray background. The cue stimuli, illustrated in Figure 1, consisted of grey-scale photographs of a female face and a house. The face and house parts of each cue measured 4.2° × 6°, and were positioned 6.3° away from the central fixation cross. A house image was selected as the comparison stimulus due to both faces and houses being canonical stimuli (i.e., those that maintain a consistent internal configuration), with faces containing two eyes and a mouth, and houses typically containing windows and a door. This choice of stimuli maintains consistency with past attentional work [11,69,70,71,72].
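For readers who want to relate the visual angles above to physical sizes on the screen, the following is a minimal sketch of the standard conversion, assuming the stated 60 cm viewing distance and a flat screen viewed head-on. The function names are illustrative and are not taken from the original study materials.

```python
import math

VIEWING_DISTANCE_CM = 60.0  # approximate viewing distance reported in the text


def extent_cm(angle_deg: float, distance_cm: float = VIEWING_DISTANCE_CM) -> float:
    """Physical size of a stimulus subtending angle_deg, centred on the line of sight."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def eccentricity_cm(angle_deg: float, distance_cm: float = VIEWING_DISTANCE_CM) -> float:
    """On-screen offset from fixation for a point angle_deg away from fixation."""
    return distance_cm * math.tan(math.radians(angle_deg))


cue_width_cm = extent_cm(4.2)         # roughly 4.4 cm
cue_height_cm = extent_cm(6.0)        # roughly 6.3 cm
cue_offset_cm = eccentricity_cm(6.3)  # roughly 6.6 cm from the fixation cross
```

At small angles the two formulas give nearly the same result, so the choice between them matters little for stimuli of this size.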
Along with size and distance from the fixation cross, the face and house cues were matched for average luminance (computed using the MATLAB SHINE toolbox [73]). Average gray scale luminance (ranging from 0-1) was equated across cues overall (face = 0.60, house = 0.56) as well as between the upper and lower halves of each cue (eyes = 0.60, mouth = 0.60, top house = 0.58, bottom house = 0.55). Michelson contrasts across each of these regions were also equivalent, though some variance existed across the lower half of each cue (eyes = 0.64, mouth = 0.56, top house = 0.65, bottom house = 0.72). Although we did not use a linearized monitor, all luminance and contrast measures reflecting image pixel values were verified to accurately reflect screen measures using a DataColor Spyder3Pro colorimeter.
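The two image statistics reported above can be illustrated with a short sketch. It uses the textbook definition of Michelson contrast, (Lmax - Lmin) / (Lmax + Lmin), applied to pixel values scaled to 0-1. This is an assumption about how the measures were computed; the SHINE toolbox itself is not called here, and the pixel values in the example are made up.

```python
from typing import Sequence


def mean_luminance(pixels: Sequence[float]) -> float:
    """Average gray-scale value of a region, with pixels scaled to the 0-1 range."""
    return sum(pixels) / len(pixels)


def michelson_contrast(pixels: Sequence[float]) -> float:
    """Michelson contrast: (Lmax - Lmin) / (Lmax + Lmin)."""
    lo, hi = min(pixels), max(pixels)
    if hi + lo == 0:
        return 0.0  # an all-black region has no defined contrast; report 0
    return (hi - lo) / (hi + lo)


# Hypothetical region (not data from the study)
region = [0.2, 0.5, 0.8]
region_mean = mean_luminance(region)
region_contrast = michelson_contrast(region)
```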
The face and house cues were also matched for perceived attractiveness (measured via independent raters). Thirty-five additional naïve participants were asked to independently rate images of faces and images of comparison house and object stimuli using a Likert scale ranging from 1-Very Unattractive to 10-Very Attractive. The cue images used here received equivalent attractiveness ratings, t(34) = 1.40, p = 0.17, dz = 0.24.
Background context was added to the face and house cues using photo editing software (Adobe Photoshop CS5), such that the face belonged to a person who was depicted sitting in a room, while the house was depicted as a picture that was hanging on a wall. The target screen consisted of a yellow circle or square (0.3° × 0.3° each), positioned 7.2° away from the fixation cross and set against a uniform 60% gray background.
Design. The target discrimination task was a repeated measures design with five factors: Cue orientation (upright, inverted), Face position (left visual field, right visual field), Target location (eyes, mouth, top house, bottom house), Target identity (circle, square), and Cue-target interval (denoting the time between the onset of the cue and the onset of the target; 250, 360, 560, and 1000 ms).
Cue orientation varied between upright and inverted cue images to control for baseline visual differences across the cue stimuli [74][75][76]. Face position varied between the left and right visual fields, with the house image always occurring in the opposite visual field. This manipulation was included as previous work has found that social processing of faces is facilitated when they are presented in the left visual field [11,13,14,45,77,78]. Target location was varied to occur at either the previous location of the eyes, mouth, top of the house, or bottom of the house. This critical manipulation was included to capture performance differences between targets occurring at the location of the face and its specific facial features relative to the comparison stimuli. Target identity was varied between a yellow circle and a yellow square in order to collect both response time (RT) and response accuracy. Cue-target interval varied between 250, 360, 560, and 1000 ms in order to assess the time course of attentional biasing and to maintain consistency with past work [24,45].
All factor combinations were equiprobable and presented equally often throughout the task sequence. The cues were spatially uninformative about the target location and its identity, as each target was equally likely to occur at any of the possible target locations. Conditions were intermixed and presented in a randomized order. Thus, participants had no incentive to attend to any particular cue.
Procedure. As before [24,45], we used the dot-probe task [79]. Figure 2 depicts the typical sequence of events. After the fixation display of 600 ms, a cue display was shown for 250 ms. After 0, 110, 310, or 750 ms (constituting 250, 360, 560, and 1000 ms cue-target intervals, respectively), a single target was presented at the previous location of the eyes, mouth, top house, or bottom house, and remained visible until participants responded or 1500 ms had elapsed. Participants were instructed to withhold their eye movements and to identify the target by pressing the 'b' or 'h' keys on the keyboard quickly and accurately (target identity-key response was counterbalanced). They were informed about the progression of the task sequence, that the target was equally likely to be a circle or a square, that the target could appear in any of the possible locations, and that there was no spatial relationship between the cue content, cue orientation, cue placement, target location, or target shape. Participants completed 960 trials divided equally across five testing blocks, with ten practice trials run at the start. Responses were measured from target onset.
Figure 2. Example trial sequence. The cue screen was presented for 250 ms. After 0, 110, 310, or 750 ms, a target (circle or square) demanding a discrimination response appeared in one of four possible locations. The target remained on screen for 1500 ms or until a key press was made.

Results
Response anticipations (RTs < 100 ms; 0.3% of all trials), timeouts (RTs > 1000 ms; 2.9%), and incorrect key presses (key press other than 'b' or 'h'; 1.9%) accounted for 5.1% of data and were removed from all analyses. Overall, accuracy was at ceiling at 94% and was not analyzed further.
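The exclusion criteria above can be expressed as a simple trial filter. The sketch below is illustrative only: the `Trial` record, the function names, and the order in which the criteria are checked are assumptions, while the thresholds (100 ms, 1000 ms) and the valid keys ('b', 'h') come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

VALID_KEYS = {"b", "h"}   # response keys named in the Procedure
ANTICIPATION_MS = 100.0   # RTs below this count as anticipations
TIMEOUT_MS = 1000.0       # RTs above this count as timeouts


@dataclass
class Trial:
    rt_ms: Optional[float]  # None when no response was recorded
    key: Optional[str]      # key pressed, if any
    correct: bool = False   # whether the target was identified correctly


def classify(trial: Trial) -> str:
    """Label a trial as 'timeout', 'anticipation', 'invalid_key', or 'valid'."""
    if trial.rt_ms is None or trial.key is None or trial.rt_ms > TIMEOUT_MS:
        return "timeout"
    if trial.rt_ms < ANTICIPATION_MS:
        return "anticipation"
    if trial.key not in VALID_KEYS:
        return "invalid_key"
    return "valid"


def mean_correct_rt(trials: List[Trial]) -> Optional[float]:
    """Mean RT over retained trials with a correct response, or None if there are none."""
    rts = [t.rt_ms for t in trials if classify(t) == "valid" and t.correct]
    return sum(rts) / len(rts) if rts else None
```

In an analysis pipeline, `mean_correct_rt` would be applied separately to each participant and condition before the group-level tests.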
Manual RT. In order to probe the extent of attentional biasing towards both overall faces and specific facial features (i.e., eyes and mouth), we conducted three sets of analyses. Using null hypothesis significance testing (NHST), we examined mean correct RTs for (1) target responses for the overall face (averaged across target locations of eyes and mouth) compared to the overall house (averaged across target locations of top and bottom house), and (2) target responses for each target location of the eyes, mouth, top house, and bottom house. NHST were performed using repeated measures Analyses of Variance (ANOVA) with Greenhouse-Geisser corrections applied for any violations of sphericity. Paired two-tailed t-tests were used for post-hoc comparisons where applicable, with multiple comparisons corrected using the Holm-Bonferroni procedure to control for Type I error [80]. All comparisons are shown with corresponding adjusted p-values (α FW = 0.05 [81]). If background context facilitated social attentional biasing, we expected to find faster responses for targets occurring at the previous location of the face (both overall and/or at the eyes) relative to targets occurring at the previous location of the house.
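As an illustration of how Holm-Bonferroni adjusted p-values are obtained, the sketch below implements the standard step-down procedure: the smallest of m p-values is multiplied by m, the next by m - 1, and so on, with monotonicity enforced and values capped at 1. The function name and the example p-values are hypothetical and are not taken from the reported analyses.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Holm-Bonferroni adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by m - k + 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values may never decrease as the raw p-values increase
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three post-hoc comparisons
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
significant = [p < 0.05 for p in adj]
```

A comparison is then declared significant when its adjusted p-value falls below the familywise α of 0.05.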
Furthermore, any null effects were examined using Bayesian analyses to assess (3) the relative strength of evidence for preferential attentional biasing towards faces versus houses by quantifying the evidence for the alternative hypothesis over the null hypothesis [82]. Bayesian analyses were performed using an online Bayes factor calculator (http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/inference/bayes_factor.swf) based on previously reported social attentional biasing effects when using similar paradigms. A Bayes factor that is less than 0.33 provides substantial evidence for the null hypothesis, whereas a Bayes factor greater than 3.00 indicates evidence for the alternative hypothesis (values between 0.33 and 3.00 suggest the need for more evidence).
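The logic of this kind of Bayes factor can be sketched in a few lines, under stated assumptions. The sketch assumes a normal likelihood for the observed mean difference (summarised by its mean and standard error) and a normal prior on the true effect under the alternative; in that case the marginal likelihood has a closed form. It is not the online calculator's code, and the observed difference and standard error in the example are hypothetical; only the prior mean (17.67 ms) and SD (7.55 ms) come from the text.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes_factor_normal_prior(obs_diff: float, se: float,
                              prior_mean: float, prior_sd: float) -> float:
    """Bayes factor for H1 (effect ~ Normal(prior_mean, prior_sd)) over H0 (effect = 0).

    With a normal likelihood and a normal prior, the data under H1 are
    distributed as Normal(prior_mean, sqrt(prior_sd**2 + se**2)).
    """
    marginal_sd = math.sqrt(prior_sd ** 2 + se ** 2)
    likelihood_h1 = normal_pdf(obs_diff, prior_mean, marginal_sd)
    likelihood_h0 = normal_pdf(obs_diff, 0.0, se)
    return likelihood_h1 / likelihood_h0


def interpret(bf: float) -> str:
    """Apply the 0.33 and 3.00 thresholds described in the text."""
    if bf < 0.33:
        return "evidence for the null"
    if bf > 3.00:
        return "evidence for the alternative"
    return "insensitive; more evidence needed"


# Hypothetical data: a 0 ms face-house difference with a 3 ms standard error
bf = bayes_factor_normal_prior(obs_diff=0.0, se=3.0, prior_mean=17.67, prior_sd=7.55)
verdict = interpret(bf)
```

With an observed difference near zero and a reasonably small standard error, the Bayes factor falls well below 0.33, which is the pattern the thresholds above treat as support for the null.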
Overall face vs. house comparisons. Mean correct interparticipant RTs were analyzed using an omnibus repeated measures ANOVA, run as a function of Cue orientation (upright, inverted), Face position (left visual field, right visual field), Target location (face, house), and Cue-target interval (250, 360, 560, and 1000 ms).
Bayesian analyses. To further examine the plausibility of no attentional differences between the cues, we performed Bayesian analyses using a two-tailed Gaussian distribution centered around a mean of 17.67 ms and SD of 7.55 ms, which reflected the previously-reported manual RT advantage for faces vs. objects ([24]; Experiments 1a,b). A Bayes factor of 0.08 was found for upright face vs. house contrasts, thus supporting the findings from the NHST and providing evidence in favor of the null hypothesis of no difference in reaction times between the face and house cues.

Discussion
If contextually-based social information resulted in spontaneous covert social attentional biasing, we expected to find faster responses for targets occurring at the previous location of the face overall and/or the eyes specifically. Our data did not support this hypothesis, indicating no attentional effects for targets occurring at the location of the face or the eyes. If anything, there was a short-lived effect at 560 ms cue-target interval only, suggesting slower RTs for overall faces relative to houses, as well as specifically for eyes relative to top house, when faces were presented in the left visual field; however, since this finding was not specific to upright faces, it may have reflected differences in the stimulus properties of the contextualized cues [74][75][76]. Similar contextualized differences may have been responsible for slower RTs for the mouth relative to top house targets, both overall and when cues were presented in an upright orientation, particularly since this effect was not specific to when faces were presented in the left visual field. Additionally, Bayes analyses supported the null hypothesis of no differences between face and house cues. Experiment 1 then suggests that when the face and house stimuli are presented within appropriate background context, there are no reliable effects to indicate preferential covert attentional biasing towards the face or the eyes. These results are consistent with our recent work [45], and further suggest that covert social attention is not determined by contextual factors alone. In Experiment 2, we examined whether these results held when we measured overt attention.

Experiment 2
In the Pereira and colleagues [45] study, when participants were allowed to make eye movements during the dot-probe task, they broke central fixation on 11% of all trials. On the trials in which fixation was broken, saccades were directed towards the eyes of the face 17% of the time. This reliable, albeit modest, bias to look at the eyes reflects a potential dissociation between covert and overt orienting towards social stimuli. In the present experiment, we examined whether similar oculomotor biasing also occurred when cues were presented within contextual backgrounds. To do so, we did not provide participants with any instructions to maintain central fixation, but measured their spontaneous eye movements while they performed the same dot-probe task as in Experiment 1.

Materials and Methods
Participants, Apparatus, Stimuli, Design, and Procedure. Thirty new volunteers (23 females, M age = 21 years, SD age = 3 years) participated. None took part in the previous experiment and all reported normal or corrected-to-normal vision. All stimuli, design, and procedures were identical to Experiment 1, except that: (i) participants' eye movements were tracked using a remote EyeLink 1000 eye tracker (SR Research; Mississauga, ON) recording with a sampling rate of 500 Hz and a spatial resolution of 0.05°. Although viewing was binocular, only the right eye was tracked; (ii) prior to the start of the experiment, a nine-point calibration was performed, and spatial error was rechecked before every trial using a single-point calibration dot. Average spatial error was no greater than 0.5°, with maximum error not exceeding 1°; and (iii) participants were not given any instructions regarding maintaining central fixation in order to preserve their natural eye movements during the task.

Results
Anticipations (0.1%), timeouts (2.2%), and incorrect key presses (0.1%) were removed from manual data analyses. Overall response accuracy was 96%. Manual RTs were analyzed as before using the same three sets of analyses.
Overall face vs. house comparisons. Mean correct RTs were analyzed using an omnibus ANOVA, run as a function of Cue orientation (upright, inverted), Face position (left visual field, right visual field), Target location (face, house), and Cue-target interval (250, 360, 560, and 1000 ms).
Bayesian analyses. Once again, Bayes factor was used to examine the plausibility of these findings using the same parameters as before (i.e., two-tailed Gaussian distribution, M = 17.67, SD = 7.55; [24], Experiments 1a,b). A Bayes factor of 0.07 was found for upright face vs. house contrasts, which once again provided support for the null over the alternative hypothesis, indicating no difference in reaction times between the face and house cues.
Oculomotor data. To assess if participants spontaneously looked at the face cue more frequently, we next examined trials in which saccades were launched from central fixation towards one of the predefined regions of interest (ROI), i.e., eyes, mouth, top house, or bottom house location, during the 250 ms cue period only, as we were specifically interested in examining attentional biasing in response to the cue stimuli. As illustrated in Figure 5, each ROI comprised its respective cue region and spanned a 30° radial window. Saccades were defined as eye movements with an amplitude of at least 0.5°, an acceleration threshold of 9500°/s², and a velocity threshold of 30°/s. For each participant, we calculated the proportion of saccades for each ROI by examining the direction of the very first saccade that was launched from central fixation towards one of the ROIs upon cue onset. The number of saccades that were launched towards each ROI was tallied across the entire experiment for each participant and then divided by the total number of first saccades that occurred during the cue period.
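The per-participant tally described above can be sketched as follows. The input format is an assumption made for illustration: one entry per trial on which a first saccade occurred during the cue period, holding the ROI label that saccade was directed towards, or None when it landed in no ROI. The function and label names are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Sequence

ROIS = ("eyes", "mouth", "top_house", "bottom_house")


def roi_proportions(first_saccades: Sequence[Optional[str]]) -> Dict[str, float]:
    """Proportion of cue-period first saccades directed towards each ROI.

    The denominator is the total number of first saccades made during the
    cue period, including those that were not directed towards any ROI.
    """
    total = len(first_saccades)
    counts = Counter(label for label in first_saccades if label in ROIS)
    return {roi: (counts[roi] / total if total else 0.0) for roi in ROIS}


# Hypothetical participant: four trials with a cue-period saccade
example = ["eyes", "eyes", "mouth", None]
proportions = roi_proportions(example)
```

Because saccades that land outside every ROI stay in the denominator, the four proportions for a participant need not sum to 1.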
On average, participants saccaded away from the fixation cross on 11% of all trials, and on 91% of those trials saccades were launched towards an ROI. As with manual RT, we conducted NHST to analyze the proportion of saccades launched towards (1) the overall face versus the house and (2) each specific target location (eyes, mouth, top house, bottom house), and we conducted Bayesian analyses to examine any null effects to assess (3) the relative strength of evidence for the alternative over the null hypothesis.
Specific facial features vs. house comparisons. The proportion of saccades was examined using a repeated measures ANOVA run as a function of Cue orientation (upright, inverted), Face position (left visual field, right visual field), and ROI (eyes, mouth, top house, bottom house). Mean proportions of saccades away from the fixation cross are illustrated in Figure 6 as a function of ROIs for Upright (6a) and Inverted (6b) cues.
Thus, when participants' natural eye movements were measured, spontaneous saccades were launched more frequently towards the face overall as well as the eyes specifically, particularly when the face was presented in an upright orientation and when it was positioned in the left visual field.

Discussion
In Experiment 2, we examined whether participants' overt attention was spontaneously directed toward faces or their specific features. Without any specific instructions about eye movements, we once again found no manual advantages for targets occurring at the location of the face, and Bayesian analyses provided evidence for the null hypothesis of no RT differences between targets occurring at the previous location of the face and house cues. However, when we examined spontaneous eye movements, we found that participants broke fixation and looked at the cue stimuli on 11% of all trials, which is numerically consistent with the percentage of saccades found in the Pereira and colleagues [45] study. Here, saccades were launched towards the eye region on 48% (versus 17% in the previous study) of trials that broke fixation. This finding was also qualified by an increase in the proportion of saccades towards faces overall, and eyes specifically, when faces were upright and when they were presented in the left visual field. Therefore, even though oculomotor biasing occurred on a small subset of all trials, it appears that faces presented within consistent contextual backgrounds exert differential effects across manual and overt responses.

General Discussion
The present study examined whether social information presented in context influenced spontaneous social attention biasing. Using the dot-probe paradigm, we presented participants with face and house cues embedded within appropriate contextual backgrounds and measured their speed of target discrimination when targets were presented at the previous location of the face (eyes, mouth) versus the house (top, bottom). While controlling for stimulus information across size, distance from the fixation cross, overall luminance, and attractiveness between the face and house stimuli (as in Pereira and colleagues' [45] study), we measured covert attention by instructing participants to maintain central fixation in Experiment 1 and spontaneous eye movements by using an eye tracker in Experiment 2.
No evidence of attentional biasing towards faces or facial features was found in manual responses in either experiment. This replicates and extends our previous work demonstrating that covert social attentional biasing is fragile in nature and affected by stimulus content factors [45] even when the stimuli are embedded in appropriate background contexts. Thus, visual context alone appears to be insufficient in engaging social attention biasing in covert measures. However, when we measured participants' eye movements, we found that their overt attention was biased towards the eyes of faces when they were presented in an upright orientation and in the left visual field. Although this biasing towards the eye region occurred in only 48% of trials in which participants broke fixation during the cue display (i.e., only 5.3% of all trials), the magnitude of this effect was numerically larger than in Pereira and colleagues' [45] study, where they observed biasing towards the eye region on only 17% of trials in which participants broke fixation (i.e., 1.9% of all trials). This suggests that it may be quicker and less effortful to extract social information from faces when they are presented in the appropriate context. However, since these observations are based on between-study comparisons, future investigations are needed in which background context is directly manipulated using a within-participants design to arrive at a more precise estimation of the effects of context on the magnitude of social attention biasing. Taken together, the results of the present study show that contextually-embedded social information does not result in spontaneous social attentional biasing in covert measures but does appear to modulate the magnitude of attentional biasing in overt measures.
These findings raise three main discussion points. One, they suggest that past work that has reported robust effects of social attention biasing in manual and oculomotor measures when using more uncontrolled stimuli [24,25,29,30,32,36,37,67] likely did not reflect the contribution of visual context alone. Instead, it is more plausible that these effects were due to some combination of visual context, stimulus content, and task factors. Content factors such as luminance, internal configuration of features, and emotional valence have each been documented to engage attention irrespective of any biases elicited by the social nature of faces [47,48,50,85,86]. Additional factors, like geometrical shape, that are specific to faces but not tied to any inherent social importance that faces contain may also play a role in attentional biasing towards these social stimuli [87]. Furthermore, task settings, like the predictability of the cues and the setting of the attentional paradigms have also been found to modulate the magnitude of social attentional effects [83,88]. For example, Burra, Framorando, and Pegna [89] investigated the electrophysiological correlates of eye gaze processing and found that perceiving eye gaze was highly dependent on whether the faces were relevant to the task. Similarly, Hessels and colleagues [90] engaged participants in face-to-face communication and found that gaze allocation was affected by task instructions (i.e., speaking versus listening) and the social context of the communication (i.e., direct conversation versus pre-recorded video). Dovetailing with these data, the present results point to the underlying influence of both stimulus and task settings in spontaneous attentional biasing towards faces and eyes, and highlight the need for future investigations geared towards manipulating and isolating the contribution of visual context, stimulus content, and task factors.
Two, while overt measures demonstrated infrequent effects, they were nevertheless statistically reliable. This is consistent with recent work by Hayward and colleagues [43] who compared social biasing occurring within a typical cuing task with social biasing occurring during a live social interaction. One difference that emerged in the comparison of these methods was the relative scarcity of gaze following observed during real-world interaction. Subsequently, Blair, Capozzi, and Ristic [91] found similarly infrequent though reliable effects when examining overall social orienting during gaze cuing tasks. Together, these data demonstrate that gaze following and social orienting may in actuality occur relatively infrequently, which further suggests that these behaviors may be contextually and situationally mediated, such that appropriate attentional responses only need to occur occasionally in order to affect behavior reliably. Our eye movement measures support these findings showing that orienting may be reflective of an infrequent bias towards key parts of social cues.
Finally, while social attention biasing was observed in overt measures, no effects emerged in covert measures. This result adds to the growing body of evidence demonstrating dissociations between covert and overt measures of social attention, in that the two modes of orienting appear to serve different purposes in real-world social environments: covert attention is hypothesized to serve as a mechanism that surreptitiously gathers information from the environment, while overt attention is hypothesized to serve as an active signaling mechanism in order to communicate with others [44,92,93,94,95]. These dissociations have only just begun to be probed on an experimental level [42,96,97,98,99], with the present study along with Pereira and colleagues' [45] study providing direct evidence in support of this distinction. Future studies in which covert and overt attention are systematically manipulated and measured are needed to understand the nature of this dissociation.
In sum, the present investigation shows that spontaneous social attention biasing may diverge across covert and overt measures. This underscores the fragility of spontaneous attentional biasing towards social information and points to the need for systematic investigations of the specific contributions of stimulus content and visual context factors in covert and overt social attention.
Author Contributions: All authors were involved in developing the initial study concept and design. E.J.P. implemented the study and performed data collection. All authors were involved in analyses, interpretations, manuscript preparation, and final approval of the manuscript.