Dissociable effects of averted “gaze” on the priming of bodily representations and motor actions

Gaze direction is an important stimulus that signals key details about social (dis)engagement and objects in our physical environment. Here, we explore how gaze direction influences the perceiver's processing of bodily information. Specifically, we examined how averted versus direct gaze modifies the operation of effector-centered representations (i.e., specific fingers) versus movement-centered representations (i.e., finger actions). Study 1 used a stimulus-response compatibility paradigm that tested the priming of a relevant effector or relevant movement, after observing videos of direct or averted gaze. We found a selective priming of relevant effectors, but only after averted gaze videos. Study 2 found similar priming effects with symbolic direction cues (averted arrows). Study 3 found that averted gaze cues do not influence generic spatial compatibility effects, and thus, are specific to body representations. In sum, this research suggests that both human and symbolic averted cues selectively prime relevant body-part representations, highlighting the dynamic interplay between our bodies, minds, and environments.


Introduction
Eye gaze is an important social stimulus. It signals social engagement or disengagement, and it provides valuable information about objects in the physical environment. Accordingly, eye gaze has been extensively researched, with some considering it the "core of social cognition" (Itier & Batty, 2009;Kleinke, 1986;Senju & Johnson, 2009). Indeed, eye gaze influences interpersonal evaluations (Ellsworth & Carlsmith, 1973;Macrae et al., 2002;Mason et al., 2004;Mason et al., 2005), face perception (Stein et al., 2011), and even inferences of social identity and intelligence (Wheeler et al., 1979). Processing of eye gaze may also be privileged, given that even young infants detect gaze direction and use it for learning (Farroni et al., 2002;Grossmann et al., 2007).
Most studies in this domain have focused on direct gaze, that is, when someone looks directly at the perceiver. However, from the perspective of social (and general) cognition, the effects of averted gaze (i.e., when someone looks out at the environment, rather than at the perceiver) are just as interesting and ecologically important. Yet, they have so far received limited consideration. The current paper explores the consequences of gaze direction on the perceiver's bodily representation, using classic automatic imitation and spatial compatibility paradigms. Specifically, we test whether averted gaze (as opposed to direct gaze) facilitates the use of anatomical, effector-centered, body-part representations. Our investigation is grounded in previous work exploring the effects of gaze direction on attention, emotion, embodiment, and imitation. Next, we briefly elaborate on these issues to provide context for the current research.
Direct and averted gaze differentially influence attention (Conty et al., 2010;George & Conty, 2008;Langton et al., 2000;Myllyneva & Hietanen, 2015;Nummenmaa & Calder, 2009). Direct gaze indicates upcoming interpersonal interaction, so it orients the perceiver towards the interactant, their face, and sometimes also the self. In contrast, averted gaze signals to the perceiver that there are potentially relevant stimuli "out there" in the surrounding environment. This could result in a direct attentional shift in the perceiver, due to reflexive gaze following behavior or social joint-attention phenomena (Pfeiffer et al., 2013). However, averted gaze could also signal the need to prioritize processing of bodily representations relevant for dealing with environmental content, as we explain shortly.
Importantly, note that eye gaze is not unique in its effects on spatial orienting, given that symbolic and body direction cues have similar effects (Gervais et al., 2010;Hietanen et al., 2006;Kingstone et al., 2003). For example, arrows can trigger similar patterns of attentional orienting as eye gaze, even when they are incidental or counter-productive to the task-at-hand (Santiesteban et al., 2014;Tipples, 2002).
Direct and averted gaze also differentially influence processing of emotional cues related to approach and avoidance, presumably because these cues are associated with different spatial locations of pertinent events. Direct gaze facilitates processing of approach-related emotional cues, such as faces of joy and anger (involving face-to-face interaction). In contrast, averted gaze is critical for rapidly and effectively communicating potential dangers in one's surroundings, especially when coupled with expressions of fear or anxiety (Adams & Kleck, 2005; Benton, 2010; Hadjikhani et al., 2008; Hietanen et al., 2006; Lachat et al., 2012). Recent reviews highlight that emotional states are tied to the preparation of specific motor acts and general action tendencies (Blakemore & Vuilleumier, 2017). So, averted gaze might play a role in how individuals monitor and prepare their own bodies to anticipate stimuli in the environment.

How does gaze direction influence representations of body parts vs. motor actions?
One way to examine how gaze direction influences bodily representations is through the phenomenon of automatic imitation (AI), or the reflexive tendency to mimic others' actions (Heyes, 2011). The general interest in AI stems from the role mimicry plays in social learning, synchronization, bonding, and representing (Niedenthal et al., 2005; Rizzolatti et al., 2001). Indeed, on a general level, mimicry works as a rudimentary mechanism for responding to others (Arnold & Winkielman, 2019; Carr & Winkielman, 2014; Palagi, Celeghin, Tamietto, Winkielman, & Norscia, 2020). However, it is worth pointing out that even seemingly related mimicry phenomena, such as spontaneous gestural mimicry and automatic finger imitation, are not necessarily correlated (Genschow et al., 2017), and their relations to social abilities such as empathy or autism are weak and complex (Cracco et al., 2018).
Recent research shows that even simple motor imitation (e.g., finger-lifting, hand-opening and closing, etc.) can be modulated by social context, including prosocial attitudes (Leighton et al., 2010), incidental similarity (Guéguen & Martin, 2009), and affiliative drive (Lakin & Chartrand, 2003). It is worth noting that for AI (rudimentary movement mimicry), there are limits to those social modulation effects, at least with high-order social cues, such as power and status (Farmer, Carr, Svartdal, Winkielman, & Hamilton, 2016). Still, AI has been reported to increase with direct eye contact, when such eye contact is combined with enhanced signals of prosociality, such as a smile (Wang et al., 2010; Wang & Hamilton, 2014).
Note, however, that AI tasks used in these experiments typically do not separate different aspects of body processing. This is important because the facilitation or inhibition of participants' own actions by observation of, say, a lifting finger or an opening hand could be due to processing of the actual action or due to the mere activation of a congruent effector (i.e., of a bodily but not an action representation). This ambiguity can be addressed by additional controls to separately gauge effects of observing an action vs. observing an effector.
Several lines of research highlight the importance of distinguishing between processing of body actions and body parts. For instance, mapping the spatial position of a specific body part is separable from the mapping of actions (Tsakiris et al., 2007). In another example, some body processing disorders involve the inability to differentiate parts of one's own body or someone else's (e.g., finger agnosia; Poeck & Orgass, 1969), which is separate from the inability to move that part of the body (e.g., finger apraxia; Benton, 1959). This is partly because there are overlapping yet distinct neural mechanisms for body schema (i.e., "body naming" networks; Tsakiris et al., 2007; Tsakiris et al., 2010) and the observation and initiation of motor movements ("action control" networks; Cross & Iacoboni, 2014; Cross et al., 2013; Hogeveen et al., 2015; Rizzolatti & Sinigaglia, 2010). More broadly, there is also work distinguishing the sense of having a body (e.g., body ownership) vs. moving the body (e.g., agentic movement) (Tsakiris et al., 2006; Tsakiris et al., 2010). Finally, and also more generally, the need to distinguish object processing and movement processing comes from work on the ventral stream (involved in object recognition and form representation) and dorsal stream (involved in movement planning and execution) (Cohen et al., 2009; Culham et al., 2003; Passingham & Toni, 2001; Shmuelof & Zohary, 2005).
Given the importance of these distinctions, it is surprising that only recent studies on AI of observed actions have started to include tests to dissociate the effects of congruent body part representations vs. the action representation (e.g., Cook & Bird, 2011). This newer work typically distinguishes body and action representation by comparing AI movement trials to effector priming (EP) trials, where a certain body part is highlighted but not by movement. For example, during EP trials, participants may see a specific finger only change color (instead of moving, as with the AI trials). Recent studies highlight the importance of this distinction and suggest that the two types of compatibility, effector compatibility (EP effect) vs. action compatibility (AI effect), are differentially modulated by social factors (Cook & Bird, 2011).
Consequently, a key open question is how gaze direction impacts the recruitment of body-part vs. action representations. To understand the possible effects, recall that effectors (e.g., fingers) are coded "locally" using somatotopic representations (Tsakiris et al., 2006). This is true not only for coding of one's own finger but also during observation of someone else's finger (Valchev et al., 2017). Conversely, actions themselves are coded more globally (Tsakiris et al., 2006, 2010). Rather than solely relying on somatotopic "mirroring", action processing recruits the interpretative system, sensitive to the goals and intentions of the observed action (Brass et al., 2007).
Recall that averted gaze can signal the need for an individual to monitor and adjust their own bodies to anticipate stimuli "out there" in the environment (Conty et al., 2010; George & Conty, 2008; Gervais et al., 2010; Myllyneva & Hietanen, 2015). This suggests that averted gaze cues might facilitate the effects of observing someone else's body parts. In turn, this leads to our key (somewhat counter-intuitive) prediction that averted gaze should enhance the effect of "local" effector priming, as tested on EP trials. In contrast, direct gaze should facilitate more global, "integrated" action processing, measured on AI trials. This prediction is implied by previous research, but has not yet been tested with pure eye gaze cues without additional smiling signals (cf. Wang et al., 2010; Wang et al., 2011; Wang & Hamilton, 2014). Because of this difference, the direct gaze portion of the study is also interesting as it constitutes a robust and statistically well-powered check on the assumption that direct gaze (even without smiling cues) may promote AI.
The general idea that averted gaze is associated with more local processing also aligns with work showing that faces with averted gaze receive more local, part-based vs. global, configural processing (Young et al., 2014). It is also consistent with work on imitation and psychological distance, where more distance (which is associated with higher construal level) tends to promote imitation of goals, whereas less distance (lower construal level) is associated with imitation of specific details (Genschow, Hansen, Wänke, & Trope, 2019; Hansen & Genschow, 2020). One reason this might occur is because direct vs. averted gaze primes abstract vs. concrete processes, or alternatively this effect might be due to a more specific connection between direct gaze and mentalizing processes (Baron-Cohen & Cross, 1992; Khalid et al., 2016). All these considerations suggest that averted gaze should enhance processing of a specific effector and direct gaze should enhance processing of actions.
Finally, we addressed two additional issues that are vital for the theoretical interpretation of any gaze effects. First, we wanted to know whether symbolic cues (like arrows) have similar effects as eye gaze cues. Examining this idea is essential because previous research has found that symbolic cues can also influence automatic orienting (Santiesteban et al., 2014;Tipples, 2002). We predicted parallel effects of biological (eyes) and symbolic (arrow) cues, given recent research showing that moving non-biological stimuli can be represented similarly and lead to similar after-effects as biological stimuli, if believed to have human origin (Gowen et al., 2016). Second, we also wanted to know whether averted gaze cueing effects are specific to tasks involving the observation of body parts, or if they are more generic and emerge in any task that involves spatial compatibility (e.g., the classic Simon task). Given the literature suggesting that EP effects are not reducible to spatial compatibility effects (Cook & Bird, 2011), we expected influences of gaze to be limited to the processing of bodily representations.

Current studies
We investigated these predictions in three studies. Study 1 used a stimulus-response compatibility (SRC) paradigm that tested the priming of bodily representation (effector priming; EP) or action imitation (AI), after observing videos of direct or averted gaze. To give a limited preview of the key result, we found a selective priming effect on bodily representation (EP trials), but only after averted gaze videos. Study 2 examined whether similar effects can be obtained with symbolic cues. We tested this by replacing the human eye gaze videos from Study 1 with moving arrows in Study 2, where an arrow with similar low-level features either directed or averted its "gaze" to participants. To preview the key results, we observed the same pattern of results as Study 1: a selective enhancement for the priming of bodily representation (EP trials), only after "averted" trials. Finally, Study 3 tested whether these averted gaze effects were specific to bodily representations, or if averted gaze would also impact general and abstract spatial compatibility. We did this by replacing the SRC task from Studies 1 and 2 with a Simon task, which paired motor responses with abstract shapes that appeared at different locations on the screen. Here, we only observed a general slowing on averted gaze trials, rather than the more targeted effects from Studies 1 and 2. Taken together, the current studies offer robust evidence that averted gaze cues (both human and symbolic) specifically enhance the priming of bodily representations.

Sample size justification and power analysis
We conducted a power analysis on all studies to ensure that we had adequate sample size to detect all effects of interest. We used G*Power 3.1 software with settings for within-factors repeated-measures ANOVAs (Faul, Erdfelder, Lang, & Buchner, 2007). For Studies 1-2, we referenced effect sizes from Cook and Bird (2012). Given our sample sizes, we found that we achieved 99.6% power in Study 1 and 100% power in Study 2 (based on the effect size of ηp² = 0.07 for the key interaction). Note that if we instead use the prosocial vs. non-social contrast for control participants in Cook and Bird (2012), the effect size is ηp² = 0.14, and thus the sample size would still provide 100% power for Studies 1-2.
For Study 3, we referenced results on the Simon effect reported in Liepelt, Wenke, Fischer, & Prinz (2011), and given our sample size, we also achieved 100% power in Study 3 (based on the effect size of ηp² = 0.63 for the standard compatibility effect in the Simon task). With all power analyses, we assumed alpha = 0.05, nonsphericity correction = 1, and correlation among repeated measures = 0.5 (standard defaults in G*Power).
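As an illustration of how the quoted effect sizes relate to the Cohen's f metric that power software commonly works with, the conventional conversion f = sqrt(ηp² / (1 − ηp²)) can be sketched as follows. This is a minimal Python sketch, not the procedure used here (the power values above come from G*Power); the function and variable names are hypothetical, and the sketch does not compute power itself.

```python
import math


def cohens_f_from_partial_eta_sq(eta_p2: float) -> float:
    """Convert partial eta squared to Cohen's f via f = sqrt(eta_p2 / (1 - eta_p2))."""
    if not 0.0 <= eta_p2 < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1.0 - eta_p2))


# Partial eta squared values quoted in the text (labels are illustrative only).
reference_effects = {
    "key interaction": 0.07,
    "prosocial vs. non-social": 0.14,
    "Simon compatibility": 0.63,
}

cohens_f = {
    label: cohens_f_from_partial_eta_sq(value)
    for label, value in reference_effects.items()
}
```

Larger values of ηp² map onto disproportionately larger f, which is consistent with the very high power reported for the Simon compatibility effect.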

Study 1
Study 1 investigated our main prediction that averted eye gaze would selectively augment the priming of effectors but not actions. As discussed previously, this prediction is suggested by findings that averted gaze leads to appropriate adjustments of body representations to potential environmental events. This prediction is also consistent with the literature suggesting that effectors (fingers) are coded from a local somatotopic perspective (whereas actions are coded more globally, using more interpretative mechanisms). Specifically, we expected that averted gaze should only impact effector priming (EP) trials, but not action priming, as measured by automatic imitation (AI) trials.

Participants and equipment
Seventy-two undergraduates from the University of California, San Diego (UCSD) participated for course credit (mean age = 20.69 years, SD = 1.77 years). All participants were right-handed English speakers and signed consent forms approved by the UCSD Human Research Protections Program (HRPP).
Stimuli were presented on 17-inch Dell flat-screen monitors with Intel® Core™ 2 Duo CPUs containing 4 GB of RAM and a 32-bit operating system (running Windows 7 Professional, © Microsoft Co.). The study tasks were presented using E-Prime 2.0 (Psychology Software Tools, Pennsylvania, USA).

Design and procedure
Study 1 used a stimulus-response compatibility (SRC) paradigm, based on previous research (for additional details, see Cook & Bird, 2012). During the task, participants proceeded through eight blocks of 32 trials each (256 total trials), after completing one block of 32 practice trials. The real trials were evenly split among the factors of Gaze Type (2: direct, averted), Trial Congruence (2: congruent, incongruent), and Trial Type (2: effector priming, automatic imitation). The specific differences between trial types will be explained shortly. Each trial began with a 2000 ms ITI, followed by a fixation with a jittered duration between 500 ms and 1000 ms, a gaze video (averted or direct gaze), and an SRC stimulus.
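To make the trial arithmetic concrete: an even split of 256 trials across the 2 × 2 × 2 design gives 32 trials per cell. The following Python sketch builds such a list. It is illustrative only; the task itself was run in E-Prime, and the source does not specify how trials were ordered or balanced within blocks, so the single session-wide shuffle and all names here are assumptions.

```python
import itertools
import random
from collections import Counter

GAZE_TYPES = ("direct", "averted")
CONGRUENCE = ("congruent", "incongruent")
TRIAL_TYPES = ("EP", "AI")
N_BLOCKS, TRIALS_PER_BLOCK = 8, 32


def build_trial_list(seed: int = 0) -> list[tuple[str, str, str]]:
    """Return a shuffled list of (gaze, congruence, trial_type) tuples, balanced per cell."""
    cells = list(itertools.product(GAZE_TYPES, CONGRUENCE, TRIAL_TYPES))  # 8 design cells
    total_trials = N_BLOCKS * TRIALS_PER_BLOCK  # 256
    per_cell = total_trials // len(cells)  # 32
    trials = [cell for cell in cells for _ in range(per_cell)]
    # Assumption: one shuffle across the whole session (ordering is not given in the text).
    random.Random(seed).shuffle(trials)
    return trials


trials = build_trial_list()
cell_counts = Counter(trials)
```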
The gaze video centrally displayed either direct or averted eye contact from a female model who held a neutral facial expression throughout (see Fig. 1a). All gaze videos lasted for 2500 ms (presented at 30 frames per second [fps]; 75 frames total). In direct gaze videos, the video started with the model looking towards the left side of the screen, after which she brought her gaze to the center of the screen around the 1500 ms mark, where it remained for the rest of the video. In averted gaze videos, the video again started with the model looking towards the left side of the screen, but around the same 1500 ms mark, the model brought her head all the way across with her gaze ending towards the right side of the screen, where it remained for the rest of the video. All gaze videos (in both Studies 1 and 3) depicted the same female model (see Fig. 1a).
Next, the SRC trial was triggered (see Fig. 2). All videos depicted a human hand that was presented vertically on the screen (6° vertical visual angle × 9° horizontal visual angle), to control for spatial compatibility effects. Participants rested their right hand in a horizontal orientation (relative to the presentation screen) with their index finger on the "V" key and middle finger on the "B" key. After the offset of the gaze video, a baseline stimulus (a resting hand) was presented for a jittered movement delay between 100 ms and 300 ms. Following this delay, the hand in the video showed either an index or middle finger lift on automatic imitation (AI) trials, or either an index or middle finger color change on effector priming (EP) trials. The video also showed either a "1" or "2", positioned between the index and middle fingers. The finger lift was shown over three successive frames, to give the appearance of movement. Trials were evenly split between AI (i.e., finger in the video displayed actual movement) and EP (i.e., finger in the video changed color but did not move). To respond, participants were required to lift either their own index finger (i.e., lift off the "V" key, in response to a "1") or middle finger (i.e., lift off the "B" key, in response to a "2"). Participants' RTs were calculated from the onset time of frame 2 to the key-lift time after the offset of frame 4, for each video. Thus, RT congruency effects were calculated by subtracting congruent trial RTs from incongruent trial RTs (where RT variability would occur after the offset of frame 4, since frames 2 and 3 were always presented for 34 ms each).
Note that the SRC task trials (both AI and EP) could be either congruent (e.g., participants were required to lift their own index finger [cued with a "1"] while observing an index finger action or color change) or incongruent (e.g., participants were required to lift their own index finger [cued with a "1"] while observing a middle finger action or color change) (see Fig. 2).
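The congruence rule just described can be expressed compactly: a trial is congruent when the finger required by the number cue matches the finger highlighted (by movement or color change) in the video. A minimal Python sketch, with hypothetical names that are not part of the original task code:

```python
# Cue-to-finger mapping as described in the procedure: "1" -> index, "2" -> middle.
CUE_TO_FINGER = {"1": "index", "2": "middle"}


def classify_trial(cue: str, observed_finger: str) -> str:
    """Label a trial by comparing the cued finger with the finger shown in the video."""
    required_finger = CUE_TO_FINGER[cue]
    return "congruent" if required_finger == observed_finger else "incongruent"
```

The same rule applies to both AI and EP trials; only the way the observed finger is highlighted differs.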

Analysis strategy
RTs were analyzed using multilevel models (MLMs) via restricted maximum likelihood estimation. We used MLMs because they more effectively handle unbalanced data with missing observations, rely on fewer assumptions regarding covariance structures, and increase parsimony and flexibility between models (Bagiella, Sloan, & Heitjan, 2000). Note that while we report MLM results here (due to the advantages over GLM ANOVA methods), all reported effects still replicate when using those other traditional approaches. MLMs were built with the lmerTest package (Kuznetsova, Brockhoff, & Christensen, 2017) in R, using a maximal random-effect structure (Barr et al., 2013). To obtain p-value estimates for fixed effects, we used Type III Satterthwaite approximations, which can sometimes result in decimal degrees of freedom, based on the number of observations (West et al., 2014). There is not yet a consensus on how to report effect sizes for omnibus fixed effects in multilevel models (see Lorah, 2018), but we provide Cohen's dz effect size estimates for all significant contrasts (Lakens, 2013).
Across all studies, trial RTs less than 100 ms or greater than 1000 ms were excluded as outliers (Leighton et al., 2010;Press et al., 2008). If a participant performed below 70% accuracy during the main task, they were not included in the main analyses. For the remaining participants, error trials were excluded.
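These exclusion rules can be sketched as a small filter. This Python sketch is a hedged illustration rather than the analysis code (the analyses were run in R); the `Trial` container and function names are hypothetical, and computing accuracy over all main-task trials before RT trimming is an assumption, since the text does not state the order of operations.

```python
from dataclasses import dataclass

RT_MIN_MS, RT_MAX_MS = 100.0, 1000.0  # RTs outside this window are treated as outliers
MIN_ACCURACY = 0.70  # participants below this accuracy are excluded


@dataclass
class Trial:
    rt_ms: float
    correct: bool


def participant_included(trials: list[Trial]) -> bool:
    """Assumption: accuracy is the proportion correct over all main-task trials."""
    accuracy = sum(t.correct for t in trials) / len(trials)
    return accuracy >= MIN_ACCURACY


def clean_trials(trials: list[Trial]) -> list[Trial]:
    """Drop error trials and trials with RTs below 100 ms or above 1000 ms."""
    return [t for t in trials if t.correct and RT_MIN_MS <= t.rt_ms <= RT_MAX_MS]
```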

RTs
RTs were collapsed into congruency scores by subtracting congruent RTs from incongruent RTs for all participants, across all factor levels. Congruency scores were then analyzed according to a 2 (Gaze Type: direct, averted) × 2 (Trial Type: effector priming [EP], automatic imitation [AI]) fixed-effects structure. Error trials and outliers constituted 6.45% of all trials. All participants performed better than 70% accuracy on the SRC task, so none were excluded (leaving a final n = 72).
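For concreteness, a congruency score for one participant in one design cell can be sketched as below. The text specifies the subtraction (incongruent minus congruent); using the arithmetic mean as the per-cell summary is an assumption, and the function name is hypothetical.

```python
from statistics import mean


def congruency_score(congruent_rts: list[float], incongruent_rts: list[float]) -> float:
    """Mean incongruent RT minus mean congruent RT (ms); larger values mean stronger priming."""
    return mean(incongruent_rts) - mean(congruent_rts)
```

With made-up RTs of 400 and 420 ms on congruent trials and 450 and 470 ms on incongruent trials, the score would be 460 − 410 = 50 ms.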
Aside from the interaction, we observed a main effect of Gaze Type, F(1, 86.09) = 5.99, p = .02, where averted gaze videos led to greater congruency scores than direct gaze videos. We also detected a main effect of Trial Type, F(1, 102.91) = 30.87, p < .001, where EP trials resulted in greater congruency scores than AI trials.

Study 2
Study 1 found that averted gaze selectively increases effector priming (EP) congruency scores, an index of the priming of specific body part representations. Interestingly, no such pattern emerged for the automatic imitation (AI) trials, an index of action mirroring. This suggests that averted gaze may specifically prime the representation of the effector (body part), but not actions performed with that effector.
In Study 2, we wanted to examine whether these effects are selective to orienting by eye gaze. As mentioned, previous research has also shown similar attentional effects with symbolic cues like arrows (Kingstone et al., 2003;Tipples, 2002). Thus, it is not yet clear whether averted eye gaze is necessary for the priming of effector representation, or if any symbolic aversion cue would do. We tested this by replacing the human eye gaze videos from Study 1 with moving arrow videos in Study 2, where an arrow with similar low-level features either directed or averted its "gaze" towards participants.

Participants and equipment
One hundred and eleven UCSD undergraduates participated for course credit (mean age = 20.41 years, SD = 1.65 years). All participants were right-handed English speakers and signed consent forms approved by the UCSD HRPP. All equipment (software and hardware) was the same as Study 1.

Design and procedure
Study 2 was similar to Study 1, except that we replaced the human eye gaze videos with new "arrow gaze" videos, before each of the SRC task trials (see Fig. 1b). The arrow "gaze" videos were created to match the lower-level features of the human model from the videos in Study 1 (e.g., color, movement speed, etc.) but instead depicted an arrow that moved to different end positions. On direct arrow trials, the arrow started pointing towards the left half of the screen, then moved to make "eye contact" with the center of the screen by pointing at the participant (displayed using the same 30 frames per second [fps] frame rate; 75 frames total). On averted arrow trials, the arrow moved along the same trajectory but only briefly pointed at the center of the screen, instead ending by pointing towards the right half of the screen (see Fig. 1b). All other task and design parameters were the same as Study 1.

Analysis strategy
Our analysis strategy was the same as Study 1.

RTs
Congruency scores were analyzed using a 2 (Arrow Type: direct, averted) × 2 (Trial Type: effector priming [EP], automatic imitation [AI]) fixed-effects structure. One participant performed below 70% accuracy during the main task and was thus excluded from all analyses (resulting in a final n = 110). Error trials and outliers constituted 6.16% of the remaining trials.

[Figure caption, fragment] ... Cook & Bird, 2012). Automatic imitation (AI) trials were designed to measure action-based motor priming, while effector priming (EP) trials were designed to assess priming of specific body representations.
Note that we observed marginal main effects for both Arrow Type, F(1, 164.53) = 3.63, p = .06, and Trial Type, F(1, 115.06) = 3.62, p = .06. These main effects demonstrated that averted arrows led to higher congruency scores overall, and congruency scores were generally increased during EP trials.

Comparative RT analysis between Studies 1 and 2
We also wanted to assess the relative strength of the RT effects after human gaze (Study 1) and arrow "gaze" (Study 2). To do this, we computed the difference in RT congruency scores by gaze type (averted gaze congruency score minus direct gaze congruency score) for each trial type (EP vs. AI) and each study (Study 1 vs. Study 2), across all participants. Using these new difference scores by gaze type, we were able to directly compare the size of congruency score effects across studies. We implemented this with another MLM, which included Trial Type (2: EP, AI) × Study Number (2: Study 1, Study 2) as fixed-effects. Random intercepts were fit across individual participants. Fig. 5 displays the results. We did not observe evidence of a Trial Type × Study Number interaction, F(1, 360.00) = 2.74, ns. We only detected a main effect of Trial Type, F(1, 360.00) = 31.29, p < .001, which showed that there were greater differences in EP congruency scores by gaze (i.e., averted gaze led to greater congruency scores than direct gaze), compared to AI. There was also no main effect of Study Number, F(1, 360.00) = 1.04, ns. We further confirmed this analysis by conducting a Bayesian repeated-measures ANOVA with the BayesFactor package in R, using default scales for prior probabilities of both fixed and random effects at r = ½ (Morey et al., 2015). We used default priors because they are computationally efficient and broadly applicable to many different effects (Rouder et al., 2012), and we did not have any reason to alter these prior distributions for this analysis. The use of JZS Bayes Factors (BFs) offers another alternative to p-values, since they can weigh the relative evidence between null and alternative hypotheses through model comparison (Rouder et al., 2012). 
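The gaze difference scores that feed this cross-study comparison (averted-gaze congruency score minus direct-gaze congruency score, separately for EP and AI trials) can be sketched as follows. This is an illustrative Python sketch with hypothetical names; it covers only the difference-score step for one participant, not the MLM or Bayesian models, which were fit in R.

```python
def gaze_difference_scores(congruency: dict[tuple[str, str], float]) -> dict[str, float]:
    """Map (gaze_type, trial_type) congruency scores to averted-minus-direct differences."""
    return {
        trial_type: congruency[("averted", trial_type)] - congruency[("direct", trial_type)]
        for trial_type in ("EP", "AI")
    }


# Made-up congruency scores (ms) for a single participant, for illustration only.
example_scores = {
    ("averted", "EP"): 30.0,
    ("direct", "EP"): 10.0,
    ("averted", "AI"): 22.0,
    ("direct", "AI"): 20.0,
}
example_differences = gaze_difference_scores(example_scores)
```

A positive difference indicates a larger congruency effect after averted than after direct cues; in this made-up example the difference is larger for EP than for AI trials.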
When comparing the null to the alternative hypothesis (abbreviated as BF01, where greater numbers indicate greater relative evidence for the null hypothesis), it is generally accepted that BF01 values between 1 and 3 represent anecdotal evidence; BF01 values between 3 and 10 denote strong evidence; BF01 values between 10 and 30 signal substantial evidence; and BF01 values greater than 30 signify overwhelming evidence (Jeffreys, 1961). BFs can also be stated against the null hypothesis (instead called BF10, which indexes relative evidence for the existence of an effect, thus in support of the alternative hypothesis), simply by taking the inverse of BF01.
The Bayesian repeated-measures ANOVA demonstrated relatively stronger evidence for the null hypothesis (against the existence of an effect) for the Trial Type × Study Number interaction, BF01 = 1.49 ± 3.48% vs. BF10 = 0.67 ± 3.48%, but the magnitude of this evidence was still relatively weak. Similar to the MLM analysis, we observed strong evidence in favor of the null for the main effect of Study Number (indicating no effect), BF01 = 5.29 ± 0.72%, and overwhelming evidence against the null for the main effect of Trial Type (indicating an effect), BF10 = 402,320.20 ± 6.61%, which again showed that there were greater differences in EP congruency scores by gaze (i.e., averted gaze led to greater congruency scores than direct gaze), compared to AI. In short, this analysis between the RT congruency effects observed during Study 1 (with human gaze videos) and Study 2 (with arrow "gaze" videos) did not show any evidence that the respective findings were statistically distinguishable. Generally, both studies demonstrated greater RT congruency scores after averted gaze, specifically for EP trials.
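The relation between the two Bayes factor directions is a simple reciprocal, which is how the reported pair for the interaction (1.49 and 0.67) fits together. A minimal Python sketch with hypothetical names:

```python
def invert_bayes_factor(bf: float) -> float:
    """Convert BF01 to BF10 (or BF10 to BF01) by taking the reciprocal."""
    if bf <= 0:
        raise ValueError("a Bayes factor must be positive")
    return 1.0 / bf


# Values reported in the text for the cross-study comparison.
bf01_interaction = 1.49
bf10_interaction = invert_bayes_factor(bf01_interaction)  # rounds to 0.67

bf01_study_number = 5.29
bf10_study_number = invert_bayes_factor(bf01_study_number)
```

Because the two directions carry the same information, only one needs to be reported per effect; the direction is usually chosen so that the value exceeds 1.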

Study 3
To quickly review the main findings thus far, Study 1 demonstrated that after averted eye gaze, congruency scores were selectively enhanced for effector priming (EP) trials, which index the priming of bodily representation (while similar effects were not observed for movement [AI] trials). Study 2 replicated these findings using moving arrows that averted their "gaze" from participants. Finally, with an MLM that compared the strength of the RT congruency effects across Studies 1 and 2, we found that the human and symbolic aversion cues led to results that were statistically indistinguishable. In short, the findings from the first two studies demonstrate that averted cues specifically augment the priming of bodily representations, but not actions.
In Study 3, we wanted to assess whether the effects are specific to the priming of bodily representations. As mentioned previously, our results suggest that averted cues facilitate spatial remapping of bodily information (Becchio et al., 2011;Haggard et al., 2006), but it is not clear whether averted gaze just has a broader impact on more general spatial compatibility processes (i.e., mapping abstract objects to different parts of space, while using the body). Another way to think about this is that spatial stimulus-response compatibility (SSRC) effects can occur via different routes. Such coding can refer to the internal spatial positions of effectors (as with body parts in EP trials, called internal SSRC) or it can relate to external positions in visual space (general spatial locations of visual stimuli, called external SSRC) (Matsumoto et al., 2004).
To test this, instead of the SRC task in Studies 1 and 2, we used the classic Simon paradigm, which pairs specific motor responses with abstract stimuli that vary in visuospatial location (Simon & Wolf, 1963). If we do observe similar effects of averted cues on Simon RTs, this would suggest that averted cues impact spatial compatibility processes more broadly, not bodily representations specifically (i.e., both internal and external SSRC). If we do not observe similar effects, this would suggest that averted cues selectively augment the priming of bodily representation (i.e., only internal SSRC).

Participants and equipment
Fifty-nine UCSD undergraduates participated for course credit (M age = 21.08 years, SD age = 1.77 years). All participants were right-handed English speakers and signed consent forms approved by the UCSD HRPP. All equipment (software and hardware) was the same as Studies 1 and 2.

Design and procedure
For Study 3, instead of the SRC task used in Studies 1 and 2, we substituted a variant of the classic Simon paradigm (Simon & Wolf, 1963). Participants were told that they would be presented with repeated trials in which different shapes would appear on the screen. If a red square appeared on the screen, they were instructed to respond using the "A" key on the keyboard (i.e., left-handed response, using their index finger). If a green square appeared on the screen, they were instructed to respond using the "L" key on the keyboard (i.e., right-handed response, using their middle finger). Trial congruence was varied via the spatial position of each colored square stimulus, such that the square was presented on either the left or right half of the screen (−40% or +40% from the screen's midpoint, respectively, equidistant along the horizontal axis). For instance, a red square presented on the left half of the screen would represent a congruent trial (since it requires a left-handed key response), while a red square presented on the right half of the screen would represent an incongruent trial (since it still requires a left-handed response), and vice versa for the presentation of a green square (see Fig. 6). Moreover, "baseline" trials were also included, where the square was presented at the midpoint of the screen (and thus was neither spatially congruent nor incongruent). To ensure that they understood the task instructions, participants completed 20 practice trials for each color (distributed across congruent, incongruent, and baseline trials in the same proportion as in the main study), during which both accuracy and RT feedback were given.
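The congruence coding described above can be illustrated with a minimal sketch. The function and variable names here are hypothetical and are not taken from the original experiment software; the sketch only assumes the color-to-key mapping and the three horizontal positions stated in the text.

```python
# Illustrative sketch only: names are hypothetical, not the original task code.
# Red squares require the left ("A") key; green squares require the right ("L") key.
RESPONSE_SIDE = {"red": "left", "green": "right"}


def classify_trial(color: str, x_offset: float) -> str:
    """Classify a Simon trial as 'baseline', 'congruent', or 'incongruent'.

    x_offset is the horizontal position of the square relative to the
    screen midpoint (-0.4 = left half, 0.0 = midpoint, +0.4 = right half).
    """
    if x_offset == 0:
        # Midpoint presentation: neither spatially congruent nor incongruent.
        return "baseline"
    stimulus_side = "left" if x_offset < 0 else "right"
    if stimulus_side == RESPONSE_SIDE[color]:
        return "congruent"
    return "incongruent"
```

For example, `classify_trial("red", -0.4)` yields a congruent trial, whereas `classify_trial("red", 0.4)` yields an incongruent one, mirroring the red-square example in the text.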
Fig. 5. Results of comparative analysis on difference in congruency scores by gaze type, between Studies 1 and 2. Using multilevel modeling (MLM), we did not observe any evidence that the RT results from Study 1 (using human gaze videos) and Study 2 (using arrow "gaze" videos) were statistically different, since the Trial Type × Study Number interaction was not significant. We only observed a main effect of Trial Type, where greater differences in RT congruency scores by gaze were found for effector priming (EP) than for automatic imitation (AI) trials. Error bars = ±1 SEM.
Participants then progressed through eight blocks of 32 trials each (256 total trials, similar to Studies 1 and 2), evenly split among the factors of Gaze Type (2: direct, averted) and Square Color (2: red, green). 50% of all trials were baseline trials (i.e., square presented at the midpoint of the screen), while the other 50% were Simon trials (i.e., square presented at either the −40% or +40% x-axis point, congruently or incongruently along the horizontal midline of the screen). Each trial began with a 2000 ms ITI, followed by a fixation cross with a jittered duration between 500 ms and 1000 ms. After the fixation cross, a gaze video centrally displayed either direct or averted eye contact from a female model holding a neutral facial expression throughout (same videos as Study 1; see Fig. 1a). Next, after the offset of the gaze video, a red or green square stimulus was presented on a black background (with a varied spatial position along the horizontal midline of the screen, according to whether the trial was baseline, congruent, or incongruent). RTs were recorded for participants to respond using the "A" (left) and "L" (right) keys on the keyboard, according to the square color (no feedback was given during the real trials).
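To make the block and trial structure of the Simon task concrete, the following sketch generates one possible trial list. It is not the original experiment code. It assumes, beyond what the text states, that every block is balanced in the same way (8 trials per Gaze Type × Square Color cell, of which 4 are baseline, 2 left, and 2 right) and that fixation durations are drawn uniformly in whole milliseconds; all names are hypothetical.

```python
import random
from itertools import product

GAZE_TYPES = ("direct", "averted")
COLORS = ("red", "green")
# Assumed within-cell split: 4 baseline, 2 left (-40%), 2 right (+40%).
CELL_OFFSETS = (0.0,) * 4 + (-0.4,) * 2 + (0.4,) * 2


def build_block(rng: random.Random) -> list:
    """Build one shuffled block of 32 trials (4 cells x 8 trials)."""
    trials = []
    for gaze, color in product(GAZE_TYPES, COLORS):
        for x_offset in CELL_OFFSETS:
            trials.append({
                "gaze": gaze,
                "color": color,
                "x_offset": x_offset,
                "iti_ms": 2000,                          # fixed inter-trial interval
                "fixation_ms": rng.randint(500, 1000),   # jittered fixation (assumed uniform)
            })
    rng.shuffle(trials)
    return trials


def build_session(seed: int = 0, n_blocks: int = 8) -> list:
    """Build a full session as a list of blocks (8 x 32 = 256 trials)."""
    rng = random.Random(seed)
    return [build_block(rng) for _ in range(n_blocks)]
```

Under these assumptions, half of the 256 trials are baseline trials and the remaining half are split evenly between left and right presentations, as in the design described above.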

Analysis strategy
Our analysis strategy was the same as Studies 1 and 2.

RT results
Due to the inclusion of the baseline Simon trials (where the square was presented in the center of the screen, thus neither congruent nor incongruent), raw RTs were analyzed by each trial type (instead of congruency effects).¹ RTs were analyzed at the trial level for each participant, according to a 2 (Gaze Type: direct, averted) × 3 (Trial Type: baseline, congruent, incongruent) fixed-effects structure. Error trials and outliers constituted 7.56% of all trials. All participants performed better than 70% accuracy on the Simon task, so none were excluded (leaving a final n = 59).
Interestingly, we did not observe similar effects to Studies 1 and 2. More specifically, we did not detect any evidence for a Gaze Type × Trial Type interaction, F(2, 13,800.70) = 1.04, ns (see Fig. 6). We only observed main effects of Gaze Type, F(1, 66.30) = 22.88, p < .001, and Trial Type, F(2, 85.10) = 117.23, p < .001. This demonstrated that averted gaze led to slower overall RTs on the Simon task, and across the different trial types, congruent trial RTs were faster than baseline RTs, which were faster than incongruent trial RTs (see Fig. 6). Therefore, there was no difference in Simon RT congruency effects between direct and averted gaze videos, only a general slowing after averted gaze.
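The pattern of two main effects without an interaction can be illustrated with a small descriptive sketch. The RT values below are invented purely for illustration and are not the study's data, the function names are hypothetical, and the sketch computes simple cell means rather than the multilevel model reported above.

```python
from collections import defaultdict
from statistics import mean


def cell_means(trials):
    """Mean RT (ms) for each (gaze, trial_type) cell."""
    cells = defaultdict(list)
    for t in trials:
        cells[(t["gaze"], t["trial_type"])].append(t["rt_ms"])
    return {cell: mean(rts) for cell, rts in cells.items()}


def simon_effect(means, gaze):
    """Congruency (Simon) effect: incongruent minus congruent mean RT."""
    return means[(gaze, "incongruent")] - means[(gaze, "congruent")]


# Invented values: averted gaze adds a constant 20 ms to every trial type,
# so the Simon effect itself is unchanged across gaze conditions.
made_up_rts = {"congruent": (400, 420), "baseline": (420, 440), "incongruent": (440, 460)}
example_trials = [
    {"gaze": gaze, "trial_type": trial_type, "rt_ms": rt + shift}
    for gaze, shift in (("direct", 0), ("averted", 20))
    for trial_type, rts in made_up_rts.items()
    for rt in rts
]
example_means = cell_means(example_trials)
```

In this toy example, `simon_effect(example_means, "direct")` and `simon_effect(example_means, "averted")` are identical even though every averted-gaze cell is slower, which is the qualitative signature of a general slowing without a change in congruency effects.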

General discussion
In the current work, we obtained robust evidence that averted gaze facilitates the priming of body part representations, but not the action itself (Study 1). These results replicate even if the averted "gaze" is purely symbolic (i.e., moving arrows; Study 2). Note that these effects are specific to bodily representations (and not just abstract spatial compatibility), since we found no gaze differences in congruency effects for a Simon RT task. In other words, averted cues only impact internal spatial stimulus-response compatibility (SSRC, as with body parts in effector priming trials for Studies 1 and 2), rather than external SSRC (or the general mapping of stimuli in visual space; Matsumoto et al., 2004).
Our results are theoretically informative in the context of past work on various social and cognitive effects of gaze. They suggest a novel possibility that averted gaze cues encourage more efficient processing of internal, body-part representations, which are coded in a local, somatotopic fashion (Tsakiris et al., 2006, 2010). There are different possible theoretical reasons for our observations.
Another explanation is inspired by research that addressed the effects of gaze in the context of face processing. As mentioned, previous work shows that the holistic configural encoding of faces is disrupted when the face displays averted gaze, presumably focusing people on local features (Young et al., 2014). Again, the idea here is that averted gaze cues may generically facilitate part-based processing.
A somewhat related explanation comes from research on imitation and construal level. This work shows that low construal level promotes focus on local features, which in imitation are the specific means of doing something, whereas high construal level promotes focus on global features, which in imitation are intentions and goals (Genschow et al., 2019; Hansen & Genschow, 2020). If gaze direction influences construal level in our paradigm, just like it does during face processing, this could explain why participants prioritize effectors (local parts) after averted gaze, and movements (abstract goals) after direct gaze. Another explanation for how gaze direction could have similar effects comes from proposals that averted gaze selectively inhibits mentalizing processes while direct gaze selectively facilitates them (Baron-Cohen & Cross, 1992; Khalid et al., 2016). Again, a direct test of these assumptions would be welcome, especially given the complexities in the literature regarding the role of goals in AI phenomena (Cracco et al., 2018). Furthermore, if these considerations are right and gaze influences imitation via more general processes (construal level, mentalizing level), then it is worth exploring whether other social and non-social cues modify AI via similar general processes and whether similar processes operate in other imitation and mimicry phenomena.
The work on face processing also addresses the role of emotional expressions. To be clear, our model had a neutral facial expression in both the direct and averted conditions. We also did not measure any emotional responses in our participants. However, there are some interesting findings showing that averted gaze on a face facilitates processing of its avoidance-related expressions (e.g., fear and anxiety; Benton, 2010; Lachat et al., 2012) whereas direct gaze on a face facilitates processing of its approach-related expressions (e.g., joy and anger; Adams & Kleck, 2005). One speculative link here is via processing level, because avoidance is associated with local processing (i.e., part focus) and approach is associated with global processing (Schwarz & Clore, 2007). Another possible link is via motor preparation effects associated with different emotions, especially signals of threat that are associated with specific body responses (Blakemore & Vuilleumier, 2017). Of course, these speculations need verification in future studies that manipulate and measure emotions. After all, specific motor preparations and actions depend on, for example, the type of threat and its imminence (Fanselow & Lester, 1988), and are flexibly and dynamically constructed based on the organism's current needs (Winkielman, Coulson, & Niedenthal, 2018).
The current studies also make an important point that similar effects of averted gaze occur with both human cues (eye gaze, Study 1) and symbolic cues (arrow motion, Study 2). Previous research has demonstrated that arrows can trigger similar patterns of attentional orienting as eye gaze, even when they are incidental or counter-productive to the task-at-hand (Santiesteban et al., 2014; Tipples, 2002). So, in some way, our results suggest that the eyes do not lead to any "special" effects (Hietanen et al., 2006; Kingstone et al., 2003). Instead, our findings suggest that bodily representations are more or less equally primed by both human and symbolic aversion cues. Keep in mind, however, that our arrow stimuli in Study 2 not only matched the lower-level features of our human eye gaze videos (e.g., color, lighting, etc.), but they also incorporated the illusion of movement (i.e., the arrow moved to "gaze" towards or away from the participants). This is important because both the type and speed of movement can have specific attentional effects (Büchel et al., 1998; Cavanagh, 1992). In turn, the apparent humanlike movement of the arrows might have led participants to represent these moving non-biological stimuli as having human origin (Gowen et al., 2016), leading to similar behavioral effects as the eye gaze videos. This is especially important when considering previous work revealing that direct gaze promotes the ascription of humanlike qualities, like mind perception and mentalization (Khalid et al., 2016). It would be valuable for future research to further examine the importance of biological motion and agency, especially for similar symbolic stimuli.

Finally, it is worthwhile to again highlight key details of our task and stimuli, compared to previous papers on eye gaze. First, we used both effector priming (EP) and automatic imitation (AI) trials to disentangle the priming of body parts vs. actions, respectively. This is not only theoretically meaningful but also methodologically important, given that in many other studies, these distinct effects were entangled. For instance, many previous studies have either only used AI movement trials, or they have used actions where it is not even relevant to separate EP from AI (e.g., hand-opening and closing). Second, note that in Studies 1 and 2, we observed that direct gaze numerically increased AI congruency scores, but in each study, this effect did not reach significance.² So, why did previous experiments (e.g., Wang et al., 2010, 2011; Wang & Hamilton, 2014) find that direct gaze facilitates AI of hand movements? This could be due to multiple reasons. For one, some of these past experiments have used different types of hand actions (e.g., a more complex action of whole hand-opening and closing), which are likely processed and coded differently than the more precise finger locations and movements in our studies. More importantly, previous studies have used gaze videos where the female model displays a subtle smile, but only after direct gaze (e.g., see Wang and Hamilton (2014), Fig. 1). Since our studies kept the model's expression neutral for all videos (see Fig. 1a), and we observed a smaller effect for direct gaze on AI, this suggests that previously reported direct gaze enhancements on AI might be partially driven by this additional social signal of positive affect and social engagement, as emphasized by Wang and Hamilton (2012). This should be investigated in future studies because AI is sensitive to actions based on their social communicative intent (i.e., observing someone holding out an object while looking at you vs. looking away; Ciaramidaro et al., 2014), and other forms of imitation and mimicry are sensitive to accompanying reward signals (Sims et al., 2012). It might also be especially useful to evaluate continuous neural effects over time via EEG, in comparing imitative compatibility (AI), internal SSRC (body-based remapping), and external SSRC (visuospatial remapping), given that these processes likely show different courses of development (Catmur & Heyes, 2011).
More generally, neuroscientific work suggests that different processes are involved in imitation and spatial compatibility and are differently influenced by social cues, such as eye gaze and group membership (Marsh, Bird, & Catmur, 2016). There is also work suggesting that eye gaze processing recruits distinct neural regions (Hooker et al., 2003). Some researchers have reported different neural signatures of direct and averted gaze processing (Conty et al., 2007). All this suggests that work on the neural underpinnings of our effects may clarify mechanisms underlying relationships between social cues, body processing, and different spatial compatibility effects.
It is also worth acknowledging some limitations of this work. The model we used for gaze manipulation was female. Our participants were also overwhelmingly female. There is evidence showing gender differences in gaze cueing, suggesting that both eye and symbolic cues work more strongly on female participants (Bayliss, Di Pellegrino, & Tipper, 2005). Future studies may look at gender as a factor. Future studies may also manipulate the specific meaning and direction of averted gaze. For example, our studies leave it unclear whether averted gaze draws attention to the general environment, or more specifically to the participant's own hand or even the participant's own finger. If so, it could be that in our studies averted gaze increases the congruency effect because more attention is directed to the effector, but such an effect would not be obtained if averted gaze directed attention elsewhere. In fact, the AI literature shows that automatic imitation is reduced (but, importantly, not eliminated) when participants' attention is directed away from the imitative stimulus (Cracco et al., 2018).
In sum, we report the first evidence that both human and symbolic averted cues selectively prime relevant bodily representations, but not necessarily the action itself. In this way, our paper further underscores the implicit and dynamic interplay between our bodies, minds, and environments.