Gaze cueing in older and younger adults is elicited by a social robot seen from the back

effect (i.e., faster on congruent gazing trials than on neutral trials) at the 340-ms SOA compared to the 1000-ms SOA, and no differences between incongruent trials and neutral trials at the 340-ms SOA. Our results show that a robot with non-visible eyes can elicit gaze cueing effects. Age-related differences in the other effects are discussed in relation to differences in processing time.


Introduction
Humans use a wide variety of non-verbal behaviors for everyday communication, such as posture, gestures, facial expressions, and eye and head movements. Eye and head orientation, which can reflect gaze, have been central to social cognition research due to their crucial function in perception and communication (Cañigueral & Hamilton, 2019; Risko et al., 2016). Through gaze, one can modulate arousal levels (Argyle & Cook, 1976; de Hamilton, 2016), manage the turns in a conversation (Degutyte & Astell, 2021), or direct the attention of an interaction partner toward a specific location or object (Perez-Osorio et al., 2021; Senju et al., 2007). Following the referential gaze of others ensures joint attention and, thus, fosters a mutual understanding between individuals (Baron-Cohen, 1995; Emery et al., 1997; Perez-Osorio et al., 2021).
Gaze cueing tasks with human face cues have been used to study joint attention in older adults compared to young adults. Evidence from studies with the gaze cueing paradigm suggests a decline in eye gaze following as we age (McKay et al., 2022), but no age differences with non-social directional cues (Plude et al., 1994; Slessor et al., 2016). The decline has been linked to a general (non-clinical) decline in social cognition with age (Phillips et al., 2011; Slessor et al., 2007, 2008; Sullivan & Ruffman, 2004). It has, for instance, been explained in terms of a reduced ability to extract meaningful information from the eyes (Grainger & Henry, 2020; Slessor et al., 2016). Investigating how older adults perceive non-verbal signals from social robots, including their gaze, is essential to learn how these communication cues are processed and understood, in addition to investigating other factors that influence the acceptance of these robots, such as trust or individual needs (Whelan et al., 2018). To our knowledge, however, the decline in gaze following and social cognition has been overlooked in HRI research, even though social robots are envisioned as tools to promote longer independent living from which an aging population will benefit (Deutsch et al., 2019; Robinson & Nejat, 2022).
We present a study in which young and older adults performed a gaze cueing task with the head of a social robot as the gaze cue. The study aimed to confirm and extend previous research on gaze cueing effects elicited by robotic gaze by (1) assessing cueing effects due to the robot head orientation when it is seen from the back, and so, with the eyes of the robot not visible, and by (2) exploring differences between young and older adults in this. We also explored the time course of the gaze cueing effects, considering a general cognitive slowing in aging.

Related work
During the last decade, the call for different and more realistic social stimuli in attention research (Risko & Kingstone, 2017) has aligned with an interest in studying human-robot interactions through more controlled and replicable experimental paradigms (Baxter et al., 2016). In line with both, gaze cueing paradigms with robots as the gazing cue have been widely adopted, either to compare robotic with human gaze cues (Martini et al., 2015) or to explore the mechanisms of social cognition (Chevalier et al., 2019; Kompatsiari et al., 2018, 2021; Wiese et al., 2012). These findings suggest a similar impact of human gaze and gaze from humanoid robots on (non-infant) attention and perception.
Despite evidence indicating a decline in gaze following as we age (McKay et al., 2022), research on gaze following in older adults using robotic cues is limited. Morillo-Mendez et al. (2022) found some indication of reduced gaze-following capabilities in middle-aged and older adults, compared to younger adults, when following instructions given by an on-screen Pepper robot (Pandey & Gelin, 2018). In a visual search-like paradigm, the robot either pointed with the head/eyes toward the target stimulus or did not provide such a gaze hint. However, this experiment did not include a condition where the robot gazed at a location different from the target location, and the specific impact of the robot's gaze on reflexive attentional orienting remained unclear.
The decline in gaze following as we age has been explained in two different ways (McKay et al., 2022). The visual attention account builds on the idea that older adults focus less on the eye region than younger adults (Grainger & Henry, 2020) or show a reduced ability to extract meaningful information from the eyes (Slessor et al., 2016). The question remains, however, whether older adults also show smaller gaze cueing effects with other types of gaze cues than eye gaze. Alternatively, cognitive decline models point to an age-related decline in volitional or strategic processing, but not in automatic processing (Craik & Jacoby, 1996). While the visual attention account anticipates a greater age-related decline in spatial cueing effects with non-predictive gaze cues, which imply an automatic orienting of attention, cognitive decline models anticipate an age-related decline with predictive cues. The recent meta-analysis by McKay et al. (2022) revealed evidence for both accounts. However, it cannot yet be ruled out that these age-related differences depend on slower processing times in older as compared to younger adults. Further controlled research is therefore needed to explore age-related differences in the time course of the gaze cueing effect, for instance, by varying the time between the central cue and the target stimulus (i.e., the stimulus onset asynchrony; SOA).

Aim of the study
We used a gaze cueing task with the rotating head of an on-screen NAO robot (Gouaillier et al., 2009) as a central, non-predictive cue to explore differences in automatic gaze cueing effects between older and younger adults. This approach addresses the need to explore social cognition with non-human artificial stimuli that are real, dynamic, and ecologically valid, as well as the need for a fundamental understanding of the basic perception of social robots.
This study aims to confirm and extend the finding presented in Morillo-Mendez et al. (2023) of a gaze cueing effect in young adults elicited by the head orientation of an on-screen robot, even when the robot was facing away from the participant and its eyes were not visible. This is relevant, as complex social interactions are not always necessarily face-to-face (Colombatto et al., 2020).
Crucially, we considered possible age-related differences in the time course of gaze cueing effects by systematically varying the cue-target SOA. We expected to obtain a similar gaze cueing effect in young and older adults, assuming that head orientation is sufficient and eye gaze is not necessary to induce gaze cueing effects. However, due to slowed processing with age, we expected age-related differences in the time course of cueing effects, with the effect peaking at longer SOAs in older adults compared to younger adults (McKay et al., 2022).
Finally, we added a neutral, no-gaze condition in which the robot head did not turn and kept looking away from the participant. This no-gaze condition was added to disentangle the differential impact of the robotic gaze direction on the two age groups. More specifically, it allowed a direct comparison between gazing in the direction of the target or at the location opposite to the target, on the one hand, and no directional gaze, on the other (Slessor et al., 2016).

Task design
Participants performed a computerized gaze cueing task in which the central cue was the virtual head of a NAO robot facing backward relative to the participants. The eyes of the robot were thus not visible to them. Upon its appearance on the screen, the robot head turned, equally likely to the left or right, on 80% of the trials (i.e., gazing trials). On the remaining 20% of the trials, the head did not turn and remained looking straight away from the participant throughout the trial (i.e., neutral trials). On each trial, a target letter ('V' or 'T') appeared equally likely at the left or right of the robot head, either 340 ms or 1000 ms after its onset. Crucially, on half of the gazing trials, the target location was congruent with the direction in which the robot gazed (left-left or right-right; congruent trials); on the remaining trials, the target location and gaze direction were incongruent (left-right or right-left; incongruent trials). Participants' task was to discriminate each target letter by pressing one of two response keys as fast and as accurately as possible. Participants were explicitly told that gaze direction did not predict the location of the target letter. The typical trial configuration, including the time parameters between events, is depicted in Fig. 1.

This study employed a mixed design with age group as a between-subjects variable (2: young vs. older participants), and the congruence of robotic gaze and target location (3: neutral vs. congruent vs. incongruent) and the cue-target time (2 SOAs: 340 ms vs. 1000 ms) as within-subject variables. The primary dependent variable was the reaction time (RT), the time between the onset of the target letter and the participant's key press. The accuracy of the responses was also measured.
The gaze cueing task contained 12 blocks of 42 trials each. The first two trials of each block were random warm-up trials and were excluded from the analysis. The remaining 40 experimental trials per block contained 16 gazing trials to the left (8 congruent; 8 incongruent), 16 gazing trials to the right (8 congruent; 8 incongruent), and 8 neutral trials. For neutral and gazing trials, there was an equal number of trials for each combination of target letter identity (2), SOA (2), and target location (2), resulting in 24 unique types of experimental trials per block. Within blocks, the experimental trials were presented in a different random order for each participant, with the restriction that the same type of trial could not appear more than two times consecutively.
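The block composition and the constrained shuffling described above can be sketched as follows. This is a minimal illustration, not the authors' experiment code; the function name and data representation are ours, and the rejection rule interprets "same type of trial" at the level of the congruence condition, which is one possible reading of the restriction.

```python
import itertools
import random

def build_block(rng):
    """Sketch of one block of 40 experimental trials, following the counts in
    the text: 16 left-gaze and 16 right-gaze trials (half congruent, half
    incongruent) plus 8 neutral trials, crossed with letter and SOA."""
    trials = []
    # 16 unique gazing trial types (congruence x SOA x letter x target side),
    # two repetitions each -> 32 gazing trials.
    for congruence, soa, letter, target in itertools.product(
            ("congruent", "incongruent"), (340, 1000), ("V", "T"),
            ("left", "right")):
        cue = target if congruence == "congruent" else \
            ("right" if target == "left" else "left")
        for _ in range(2):
            trials.append({"cue": cue, "congruence": congruence, "soa": soa,
                           "letter": letter, "target": target})
    # 8 unique neutral trial types (SOA x letter x target side), one each.
    for soa, letter, target in itertools.product((340, 1000), ("V", "T"),
                                                 ("left", "right")):
        trials.append({"cue": None, "congruence": "neutral", "soa": soa,
                       "letter": letter, "target": target})
    # Reshuffle until no congruence condition occurs more than twice in a row
    # (our reading of the "no more than two consecutive" restriction).
    while True:
        rng.shuffle(trials)
        if all(not (trials[i]["congruence"] == trials[i + 1]["congruence"]
                    == trials[i + 2]["congruence"])
               for i in range(len(trials) - 2)):
            return trials

block = build_block(random.Random(1))
print(len(block))  # 40
```

Rejection sampling is sufficient here because valid orderings remain common enough that only a handful of reshuffles are typically needed per block.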

Stimuli and apparatus
Stimuli and task instructions were presented on a 23-inch AOC monitor (1920 × 1080 pixels) with a refresh rate of 60 Hz. Stimulus presentation and response registration were done with Labvanced (Finger et al., 2017). Participants used two response keys organized vertically on a Cedrus RB-540 response pad.¹ In each age group, half of the participants pressed the top key with their left index finger to respond to 'V' and the bottom key with their right index finger to respond to 'T.' This hand-to-key mapping was reversed for the other half.
The central cue on the gazing trials was a video clip of NAO turning its head, taken from the Choregraphe software (Pot et al., 2009). The left turn of the robot head consisted of a head yaw movement of 70° toward its left. The right-turn video was created by mirroring the left-turn video clip. The duration of the head movement was 340 ms.

¹ https://store.cedrus.com/products/rb-540-response-pad
Participants sat at ≈60 cm from the computer screen. The central cue was black and white (corresponding to an actual NAO 6 model) and was depicted on a gray background. Its dimensions were 4.35° high and 5.5° wide (4.5 × 5.8 cm). The fixation cross and letter targets were black and 0.85° wide and high (1 × 1 cm). The targets appeared on the horizontal axis of the screen at 6.4° (6.7 cm) from the center of the screen.
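The degree values above follow from basic trigonometry given the ≈60 cm viewing distance. A quick check, as a sketch (the helper functions and their names are ours, not part of the study materials):

```python
import math

def visual_angle_deg(size_cm, distance_cm=60.0):
    """Full visual angle (degrees) subtended by an object of extent size_cm
    viewed from distance_cm: 2 * atan(size / (2 * distance))."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

def eccentricity_deg(offset_cm, distance_cm=60.0):
    """Angular distance (degrees) of a point offset_cm from screen center:
    atan(offset / distance)."""
    return math.degrees(math.atan(offset_cm / distance_cm))

print(round(visual_angle_deg(5.8), 2))   # cue width, ~5.5 deg
print(round(eccentricity_deg(6.7), 2))   # target eccentricity, ~6.4 deg
```

The computed values closely match the cue width and target eccentricity reported above for a 60 cm viewing distance.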

Procedure
Participants performed the task in a quiet room with dimmed light at the psychology laboratory at Örebro University. At the beginning of the test session, participants were first informed about the aim of the study, the use of their data, and their right to quit their participation at any stage of the experiment. Participants gave written informed consent and filled out a brief demographic questionnaire in which they reported their gender (man, woman, or other), age in years, dominant hand (left, right, or both), completed education, how comfortable they were with using computers ('I feel comfortable using computers'; 1-totally disagree, 5-totally agree), and if they were familiar with the NAO robot (yes, no, or not sure). Second, they saw an extract of a promotional video of NAO following an object with its gaze. Third, participants completed 10 practice trials to familiarize themselves with the gaze cueing task, immediately followed by the task, with self-paced breaks between the blocks. Finally, participants were debriefed, received a gift voucher, and were thanked for participating. The test session lasted around 40 min.

Participants
Forty-four participants took part in the experiment. The inclusion criteria (based on self-report) were to be fluent in English, to have normal or corrected-to-normal vision, and to be in the age range from 18 to 40 years (young adult group; YA) or 60 years or older (older adult group; OA) at the time of testing. In addition, participants self-reported to be cognitively healthy. Participation was compensated with a gift voucher. The minimum sample size was calculated through an a priori power analysis using G*Power based on the effect size of previous research (partial η² = 0.13) and a power of 1 − β = 0.9. The analysis yielded 18 participants as sufficient for our study.
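G*Power takes Cohen's f as the effect-size input for F tests, so the reported partial η² of 0.13 is first converted via f = sqrt(η²/(1 − η²)). The sketch below reproduces only this conversion step, not G*Power's full noncentral-F sample-size computation; the function name is ours.

```python
import math

def cohens_f(partial_eta_squared):
    """Convert partial eta squared to Cohen's f, the effect-size metric
    G*Power uses for F tests: f = sqrt(eta_p^2 / (1 - eta_p^2))."""
    return math.sqrt(partial_eta_squared / (1.0 - partial_eta_squared))

print(round(cohens_f(0.13), 3))  # 0.387, a large effect by Cohen's conventions
```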
Data from three participants (one from the older group) were excluded from all analyses due to an overall extreme number of incorrect or missing target responses (see Results section for further details), resulting in a final sample of 21 older (13 women, 8 men) and 20 younger (14 women, 6 men) adults. See Tables 1 and 2 for more information per age group.

Results
The data of three participants were excluded from the analyses because they were outliers for accuracy compared to the rest of the sample (i.e., error rate higher than 9%, above Q3 + 1.5 × IQR). For the remaining sample, the proportion of errors (including early, incorrect, and missing responses) was 3.2% of the total. The young adults made overall twice as many errors (median = 3.9%) as the older adults (median = 2%), U = 118, p = .016. Because the number of errors was limited in both groups, accuracy was not analyzed further. Incorrect trials were removed from the RT analysis. Correct trials with too slow or too fast RTs (above Q3 + 3 × IQR or below Q1 − 3 × IQR per participant; 0.84% of the total) were also excluded from the RT analyses. The number of excluded trials was significantly higher in the older adults than in the younger adults, U = 285.5, p = .044, although both proportions were marginal (0.6% and 0.2%, respectively).
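The per-participant RT trimming can be sketched with Tukey-style interquartile fences. This is a minimal illustration under our reading of the exclusion rule (fences at Q1 − 3 × IQR and Q3 + 3 × IQR); the helper name and example data are ours, not the authors' code.

```python
import statistics

def trim_rts(rts, k=3.0):
    """Remove RTs outside [Q1 - k*IQR, Q3 + k*IQR] for one participant.

    Tukey-style fences; k=3.0 flags only extreme outliers, matching the
    per-participant trial-exclusion rule described in the text."""
    q1, _median, q3 = statistics.quantiles(rts, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [rt for rt in rts if lo <= rt <= hi]

# Hypothetical RTs in ms; 2600 (too slow) and 120 (too fast) fall outside.
rts = [412, 398, 455, 430, 2600, 441, 405, 389, 472, 120, 418]
print(trim_rts(rts))
```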
We used the mean RT of the trials within each combination of the variable levels for the analysis. The RT data were analyzed in RStudio (Posit team, 2022) with the rstatix package (Kassambara, 2022) for standard mixed ANOVAs and the WRS2 package (Mair & Wilcox, 2020) for robust ANOVAs. RTs were log-transformed before being subjected to the mixed ANOVA to reduce skewness. We employed the Holm-Bonferroni correction for multiple comparisons (α = .05) to control for type I errors. The mixed ANOVAs on untransformed values and the robust ANOVAs yielded the same pattern of results as the main analysis; for clarity, only the mixed ANOVA results are reported here. The mean RTs are reported in Table 3 and Fig. 2a.

Internal structure of the effect of congruence
The main effects of congruence were further explored by comparing the RTs on congruent and incongruent gazing trials (i.e., the gaze cueing effect) and by comparing the RTs on congruent and incongruent gazing trials with those on neutral trials (i.e., the facilitation and incongruence effects, respectively). Because RT difference scores increase with overall RT, which is higher in older adults (Slessor et al., 2016; Tellinghuisen et al., 1996), we calculated these effects as proportional difference scores, following Slessor et al. (2016):

gaze cueing effect = 100 × (RT_incongruent − RT_congruent) / RT_congruent (1)

facilitation effect = 100 × (RT_neutral − RT_congruent) / RT_neutral; incongruence effect = 100 × (RT_incongruent − RT_neutral) / RT_neutral (2)

All scores are reported as percentages (%). The three effect scores were subjected to separate ANOVAs with age group (2) and SOA (2) as factors, followed by one-sample t-tests comparing the effects per age group and/or SOA against 0% (i.e., no difference between conditions).
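As a sketch of these proportional scores, assuming each RT difference is expressed relative to its comparison condition (so that a gaze cueing score of, e.g., 2.85 reads as "2.85% slower on incongruent than on congruent trials"; the denominators and function name are our assumption, not taken verbatim from the paper):

```python
def effect_scores(rt_congruent, rt_neutral, rt_incongruent):
    """Proportional effect scores (%) from mean RTs per condition.

    Assumed scaling: each difference is divided by its comparison
    condition (congruent for the cueing effect, neutral otherwise)."""
    gaze_cueing = 100.0 * (rt_incongruent - rt_congruent) / rt_congruent
    facilitation = 100.0 * (rt_neutral - rt_congruent) / rt_neutral
    incongruence = 100.0 * (rt_incongruent - rt_neutral) / rt_neutral
    return gaze_cueing, facilitation, incongruence

# Hypothetical mean RTs (ms), for illustration only:
g, f, i = effect_scores(rt_congruent=480, rt_neutral=504, rt_incongruent=492)
print(round(g, 2), round(f, 2), round(i, 2))
```

Positive cueing and facilitation scores and a negative incongruence score would correspond to the pattern of faster responses on gazing trials reported below.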
For the gaze cueing effect score, the 2 × 2 ANOVA revealed no main or interaction effects (see the top part of Fig. 2b). The overall gaze cueing effect significantly differed from 0, t(81) = 5.82, p < .001, d = 0.64, indicating that our participants were on average 2.85% slower on incongruent trials than on congruent trials. The gaze cueing effect was significant at each SOA in each age group.

Discussion
This study explored the gaze cueing effect elicited by a robot seen from the back in young and older adults. The results confirmed previous research showing that gaze cueing effects (i.e., faster responses to gazed-at targets than non-gazed-at targets) still occur when the robotic eyes are not visible (Morillo-Mendez et al., 2023). Our findings suggest that head orientation seen from the back -and thus, without visual eye cues -is a social cue that can induce similar gaze-cueing effects in older adults as in young adults. Moreover, the gaze cueing effects in the different age groups did not depend on short or longer SOA. This finding might be taken to suggest that gaze cueing in older adults was not influenced by slowed processing with increased age (McKay et al., 2022).
One might argue that the robot's gaze is perceived as a non-social stimulus, as attentional orienting toward non-social stimuli is relatively spared in older adults (Plude et al., 1994; Slessor et al., 2016). Moreover, the head of the robot seen from the back might have further reduced its perception as a social cue. However, for the gaze cueing effect to occur, participants had to perceive the cue as a head seen from the back, with its front (face and eyes) pointing toward a location at the left or right. This perception was supported by the initial familiarization phase, during which our participants saw NAO following an object with its gaze. Future research may explore age-related differences in the time course of gaze cueing effects elicited by a social robot seen from the front.
A closer look at the cueing effects revealed that both age groups responded faster on congruent gazing trials than on neutral (no directional gaze) trials. However, older adults did so especially at the 1000-ms SOA. This result might suggest that older adults need longer processing times to benefit from the facilitation. However, it might also be related to the difference in dynamism between the neutral (static) and congruent (dynamic) cues. Moreover, there was no clear age-related difference at the 340-ms SOA, and future studies are warranted to further explore the roles of cue type and cue predictability in age-related differences in the time course of facilitation effects.
Unexpectedly, and in contrast to previous findings with static face cues seen from the front (Plude et al., 1994;Slessor et al., 2016), participants were overall faster (and not slower) on incongruent trials than neutral trials. However, this inverse incongruence effect did not reach significance for older adults at the short SOA. A logical, but still speculative, explanation for faster responses on incongruent gazing trials as compared to neutral trials, instead of the expected opposite pattern, might also lie in anticipation effects induced by the movement of the dynamic gazing cues. Because target letters always followed the completion of the head movement, participants could expect the onset of the target based on the motion information and get ready to respond on gazing trials. This extra information was unavailable when the robot head remained static during the neutral condition, resulting in faster responses on gazing trials than on neutral trials. Future studies should include dynamic cues for the neutral condition as well.
The reduced facilitation and absent incongruence effects observed for the older adults at the short SOA, as compared to the longer SOA, are in line with the idea that age-related differences in gaze following could be related to a general cognitive slowing in aging (McKay et al., 2022; Verhaeghen & Cerella, 2002). At the 1000-ms SOA, the results of older and younger adults were comparable for every effect, highlighting the critical role of processing times between age groups in our results. At the 340-ms SOA, the older adults did not benefit from the dynamic cue in the incongruent condition. A possible explanation for this lies in the proportion of gazing trials used in the design. Although the task was non-predictive concerning the gazed-at location, most trials were gazing (dynamic) trials, so the head movement was predictive of target onset. The speculated anticipation effects would be part of strategic processing, which is reduced in older adults (Craik & Jacoby, 1996).
This study contributes to the body of research in social cognition by testing current theories that account for the decline of gaze following in older adults. Moreover, using the depiction of a robot provided an opportunity to start exploring how these theories might translate to new scenarios in which social interactions with non-human agents will occur. Social robots for older adults are becoming a reality, matched by their growing prominence in the HRI field. However, instead of considering specific scenarios of assistance with clinical populations or the ethical aspects of designing robots to assist older users (Zafrani & Nimrod, 2019), we decided to focus on more fundamental aspects of human cognition. Exploring these aspects is a prior (or at least parallel) step to other research exploring the acceptance and implementation of these robots in the public domain. This line of work could ultimately inform the design of assistive robots, for instance, by providing recommendations about which non-verbal signals are valuable and necessary cues to implement to fully capitalize on the effectiveness of social robots in assisting an aging population (Morillo-Mendez et al., 2022). In this study, however, the gaze conveyed by the robot's head orientation, with no visible eyes, drove attention automatically and similarly in the young and older adult groups, suggesting that the age-related decline may also relate to the availability of what we perceive as eyes in the context of other facial cues, i.e., not necessarily natural or biologically looking ones, as shown in Morillo-Mendez et al. (2022).
Some limitations must be considered when interpreting the current novel findings. Although we introduced elements from the real world, including a video clip of a real robot as a dynamic cue, human-robot interactions typically occur in shared physical scenarios with embodied robots. One obvious avenue for future studies is to build upon the findings from controlled behavioral experiments with on-screen robotic gaze cues and transfer insights to studies of complex social human-robot interaction in a shared space. The current results might also be specific to the humanoid NAO robot, which has no moving eyes. Caution must be exercised when generalizing these results to other types of robots or comparing them to other work using human cues. Nonetheless, using social robots with recognizable facial elements offers a more realistic stimulus, especially when accompanied by dynamic cues, than the static stimuli often used in gaze cueing studies.
Moreover, gaze following might vary as a function of task demand (Chen et al., 2021) in older and younger adults (Fernandes et al., 2022). In addition, other methods are required to further inform the age-related differences in robotic gaze following, such as the use of questionnaires (Morillo-Mendez et al., 2022) or eye-tracking to explore overt attention (Kuhn et al., 2015). Future studies are needed to replicate the current findings in more varied samples to consider variations in general cognitive functioning in a systematic way. Moreover, future research might focus on how clinical cognitive decline, such as dementia, affects social cognition in the context of HRI. We advocate for an interplay between human-human and human-robot-oriented research employing varied parameters, methods, and tasks to keep building the fundamental knowledge of social cognition.
In sum, we found that gaze cueing effects in a non-predictive cueing task can be induced through the head orientation of a robot even when its eyes are not visible, in younger as well as older adults. This finding aligns with the visual attention account, which links the age-related decline in gaze following to reduced attention toward the eye region, and extends it to robotic agents. The gaze cueing effect was similar at both SOAs. However, further exploration of the effects showed that older adults needed longer processing times to benefit from the dynamic cues in the task. Although this is consistent with generally slower processing in older adults, it cannot be ruled out that our results reflect a decline in strategic processing due to the higher predictability of these dynamic cues. Future studies are warranted to explore the impact of dynamic gaze cues and age-specific differences, as well as their time course, using different types of agent stimuli, both human and artificial.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
The data set gathered in this study is available in the following OSF repository: https://osf.io/z3wqt/.