Nonverbal Communication in Virtual Reality: Nodding as a Social Signal in Virtual Interactions

Nonverbal communication is an important part of human communication, including head nodding, eye gaze, proximity, and body orientation. Recent research has identified specific patterns of head nodding linked to conversation, namely mimicry of head movements at a 600 ms delay and fast nodding when listening. In this paper, we implemented these head nodding behaviour rules in virtual humans, and we tested the impact of these behaviours and whether they lead to increases in trust and liking towards the virtual humans. We use Virtual Reality technology to simulate a face-to-face conversation, as VR provides a high level of immersiveness and social presence, very similar to face-to-face interaction. We then conducted a study with human-subject participants, in which the participants took part in conversations with two virtual humans, rated the virtual characters' social characteristics, and completed an evaluation of their implicit trust in the virtual humans. Results showed more liking for and more trust in the virtual human whose nodding behaviour was driven by realistic behaviour rules. This supports the psychological models of nodding and advances our ability to build realistic virtual humans.


Introduction
Face-to-face interaction is a central part of human life, used to convey ideas, share information, understand intentions and emotions, build trust, and help in learning and decision making. An important goal for researchers in computer science is to engineer virtual systems, including both virtual humans and immersive virtual reality contexts, which can replicate a real face-to-face conversation. It is also an important goal for researchers in psychology to understand how humans behave during interactions, and to test theories about which aspects of these interactions are significant. Whether in a physical or virtual setup, human communication involves both verbal exchanges and nonverbal behaviours. Nonverbal communication is an effective and expressive tool for sending and receiving social signals, one that humans had been using for thousands of years before they developed the capability to communicate with words. Therefore, both analysis and synthesis of nonverbal communication constitute an essential part of research in Human-Computer Interaction (HCI) (Garau et al., 2003). Although physical face-to-face communication remains the most powerful form of interaction, modern communication is frequently mediated by technology and held virtually. Virtual Reality (VR) is a virtual medium of communication that can facilitate the creation of immersive real-time interaction and enhance social presence in 3D environments (Marini et al., 2012). Moreover, VR allows intrinsic face-to-face communication in a virtual setting and feels verisimilar. In the present study we employ VR in our experiments, as it has an unrivalled potential to impact the future of numerous sectors, including interactive training, teleconferencing, education, consulting, social rehabilitation, tourism, health-care and simulations (Gunkel et al., 2018b; Jack et al., 2001; Lee and Wong, 2008).
Nonverbal communication includes head nodding, eye contact, leaning forward, and body orientation. In particular, head nodding plays an important role in regulating an interaction, signalling who is to take a turn or whether or not someone is interested in and attending to a particular item. This type of signalling is commonly called back-channelling (Cogdill et al., 2001), and is often produced by the listener or follower in a conversation to send subtle messages to the speaker. Hence, including this type of signal in an implementation of a virtual human can be vital in making the human interaction partner feel comfortable and listened to. Previous work in building embodied conversational agents and virtual humans has implemented a variety of nonverbal signals, including nodding (reviewed in Section 2). However, many studies are based on an ad-hoc choice of social signals, while others have not always been replicated. Our starting point is a recent psychological study which carefully characterised the patterns of head movements shown in natural conversation (Hale et al., 2020) and identified two distinct patterns of head nodding.

* Nadine Aburumman, Brunel University London, nadine.aburumman@brunel.ac.uk (N. Aburumman), https://nadineab.github.io/webpage/ (N. Aburumman), ORCID(s): 0000-0003-4578-8738 (N. Aburumman)

Figure 1: Our VR experiment, where the participant interacts with a virtual human in a form of structured conversation. Top row: The virtual human is the speaker, sharing and reading information on the tablet. The virtual human performs three nonverbal signals: eye-blinking, facial expressions, and changing of gaze direction. Bottom row: The participant is the speaker and the virtual human executes subtle fast nods as nonverbal feedback.
In this paper, we implemented various experiments involving virtual interaction between a human-controlled avatar (the human participant) and a virtual human whose behaviour is controlled by a computer program. In these experiments, we focused on designing four different nonverbal signalling modes that are highly important in face-to-face human interaction: eye-blinking, nodding, facial expressions, and changing of gaze (see Figure 1), and we specifically manipulate the nodding behaviour between two different virtual humans. In our experiments we varied the head nodding behaviour of the virtual human in order to measure the effect of nodding on the experience of the human participant. That is, we enabled and disabled the head nodding movements of the virtual human among various runs of the experiments. While the nodding mode is enabled, we modelled two types of head movement based on the findings of Hale et al. (2020): head nod mimicry, in which the virtual human mimics the participant's head nod with a constant 600 ms lag, and fast nods that are executed by the virtual human as a listener in the conversation, by making small rapid head movements. We use VR technology in our virtual social interaction experiments, where we investigate the role of head nodding as visual feedback during conversation. Additionally, we explore how head nodding movements impact the engagement in interaction between the human participant and the virtual human, and we aim to address how they can support virtual communication and promote proximity. The work we present in this paper is thus summarised as follows: • We implemented two types of nodding behaviour on a virtual human as social interaction signals in VR, and tested whether participants prefer the theory-driven virtual human to a matched virtual human without the nodding behaviour.
• We implemented a rich version of a novel "virtual maze task" (Hale et al., 2018) in VR, in which participants navigate a maze and have the opportunity to ask for help from different virtual humans. The frequency with which participants approach a virtual human and follow that virtual human's advice can provide an implicit measure of trust and liking.

Related Work
Social interaction in Virtual Reality (VR) is a challenging and interdisciplinary area of research that incorporates computer science, human-computer interaction, and psychology. In this section, we review earlier work related to social signals and nonverbal communication, mainly head nodding. We discuss those studies in the literature that are most relevant to our work.
Social signals form the set of nonverbal information that people exchange during social interaction, and they are the basis of highly effective communication. In face-to-face interactions, nonverbal signals make particularly important contributions to the organisation of turn taking and to back-channelling (Gravano and Hirschberg, 2011). Back-channelling is an umbrella term for all the signals which a listener sends to a speaker; these can inform the speaker whether the listener understands, and can provide encouragement to the speaker (Lala et al., 2017). Some of the earliest work identifying back-channel signals and exploring their meaning comes from the 1960s (Kendon, 1970; Argyle et al., 1968).
More recently, researchers in psychology have focused on mimicry behaviour as an important component of social interaction. Mimicry has been described as a "social glue" (Leander et al., 2012) which causes people to like each other and feel affiliation for each other. In face-to-face interactions, the amount of mimicry (broadly defined) can predict feelings of liking (Salazar Kämpf et al., 2018). However, it is challenging to test the causal role of mimicry during a natural face-to-face interaction. Building virtual agents who do or do not mimic is an approach by which psychologists can test their theories.
In order to understand back-channelling, mimicry, and other nonverbal types of communication, and in order to explore social signals in conversational dynamics for socially-aware computing, the notions of social signal processing and applied signal processing techniques have been introduced by Pentland and colleagues (Pentland, 2005, 2007; Pentland et al., 2008). Building on these ideas, there has been a growing body of research aimed at analysing social signals with the goal of building tools, applications, and interfaces for humans, based on models of human behaviour (Gunes et al., 2008; Vinciarelli et al., 2009; Andry et al., 2011; Burgoon et al., 2017). For example, in a study by Otsuka et al. (2006), a visual head-tracking technique was employed to analyse the interaction between people in audiovisually recorded meetings. Another work, by Byun et al. (2011), analysed nonverbal cues, including head nods and eye gazing of participants in video conferencing, and presented a user study showing that their predictive framework has the capability to positively influence the conversations in the video conference. Exhaustive surveys of past efforts exploring social cues in human behaviour using vision-based, audio-based, and audio-visual analysis can be found in Vinciarelli et al. (2008) and Zeng et al. (2009).
Over the last couple of decades, nonverbal communication in social interaction in virtual worlds has been studied extensively, using embodied conversational agents (ECAs) (Cassell et al., 1998; Walker et al., 1994; Hömke et al., 2018). One common objective of these studies is investigating human reactions to different degrees of realism in nonverbal cues given by virtual agents displayed on screen (Kipp, 2005). The paper by Schroder et al. (2012) presented virtual characters with different personalities that induce emotions by producing emotional feedback and nonverbal cues. Furthermore, Lee and Marsella (2006) developed a tool that automates the selection and timing of nonverbal behaviours for virtual agents. In particular, Osugi and Kawahara (2018) studied the effect of head nodding and shaking motions on perceptions of likeability, where the authors employed 3D animated characters, and found that the nodding head motion enhanced perceived likeability and approachability, whereas the shaking motion did not influence the ratings. A more recent study on the social impact of people interacting with virtual characters indicates that humans treat virtual characters as if they were real, and exhibit many of the subtle social signals that arise in human-to-human interaction (Miller et al., 2019). These studies suggest that nonverbal behaviours of virtual agents impact the quality of the social interaction, which highlights the need for psychologically-grounded models for building realistic nonverbal communication for virtual humans.
On the other hand, Virtual Reality (VR) is able to provide immersive virtual environments (IVEs) and social presence for users. This technology's availability and affordability have significantly improved over recent years (Mütterlein, 2018; Oh et al., 2019), which makes VR a highly effective medium for communication and social interaction, and has encouraged multiple industrial companies to launch various social VR platforms (e.g., Facebook Spaces and VRChat) (Gunkel et al., 2018a; Tarr et al., 2018). One of the earliest works in social signalling using VR was presented by Bailenson et al. (2003), where the authors ran a set of controlled experiments to determine the impact of the gaze behaviour of participants interacting with a virtual agent. The results of their experiments suggest that within VR, proxemic space behaviours and gaze operate almost identically to how they operate in physical face-to-face settings. A number of studies have investigated the relationship between mimicry in head movement and rapport (Hamilton et al., 2016; Hale and Hamilton, 2016), where a head tracking device was employed to track the participant's head movement, rather than a head-mounted device (i.e., a VR headset). More recently, Smith and Neff (2018) presented a study on verbal and nonverbal communicative behaviour in embodied virtual reality vs. face-to-face interaction, where the results show similar communicative behaviour across these two modes of communication. Furthermore, Maloney et al. (2020) presented observational and interview studies that highlight the nuance of nonverbal communication in social VR compared to offline (face-to-face) and traditional virtual world settings. Sun et al. (2019) studied whether the natural occurrence of synchronous nonverbal behaviour in VR correlates with social presence and closeness. Further, Tanenbaum et al. (2020) presented an inventory of nonverbal communication in commercial social VR platforms, where they found that facial expression control and unconscious body posture remain challenges, and that these are the two critical social signals that are currently poorly supported within today's commercial social VR platforms. As noted in many VR experiments and simulations, VR can provide a certain level of control over different sensorimotor, cognitive and social factors, along with the ability to provide co-presence and high interactivity in a virtual environment.
Building on the studies mentioned above, we present a well-controlled study in VR, where we implement realistic nonverbal behaviour, including two types of head movements, i.e., mimicry and fast nodding, on virtual humans. Moreover, we investigate and discuss the impact of head nodding during virtual social interactions, and its influence on the participant's impression of the virtual humans. This allows us to test psychological theories about the impact of nodding and demonstrate how an accurate understanding of natural behaviour can contribute to building virtual humans in immersive environments. Furthermore, based on our previous work and other studies detailed in this section, we can formulate two distinct hypotheses. First, participants will report more liking for and interactivity with the virtual human who shows natural nodding behaviour, as measured on the questionnaires. Second, participants will more often approach the virtual human who showed natural nodding behaviour when they later encounter the virtual humans in a maze-solving task.

Experiment Design
We conducted our experiments in the social interaction lab at University College London, and ethics approval was secured for the experiment. Participants were recruited to the study and gave written informed consent to take part. We collected data from 21 participants (15 female, 6 male), with an average age of 24. We used the HTC VIVE Pro as the head-mounted display in our VR experiment, with a display resolution of 1440×1600 pixels per eye. The head and hand movements of the participants are tracked using the HTC VIVE Pro headset and controller. The virtual scenes were created in Unity 18.3, using the SteamVR Application Programming Interface (API). The appearance of the virtual human can influence the virtual experience and the participant's perception of it. A main point of concern here is the virtual human's visual realism. The study by Mori et al. (2012) presented the uncanny valley effect, which describes a negative response from participants who are interacting with a near photo-realistic virtual human. Also, several studies show that increasing the visual realism of a virtual human's appearance tends to expose the virtual human to harsher judgment by the participant, which is an important reason why, generally, realism is considered to be a bad predictor of affinity towards the virtual human (McDonnell et al., 2012; Zell et al., 2015). Therefore, for the appearance and render style of our virtual humans, we used a style similar to the CG style described in McDonnell et al. (2012). Thus, we chose a less realistic, cartoon-style rendering, as such stylised virtual humans are often preferred over realistic ones (Zell et al., 2015). Figure 2 shows the appearance of the virtual humans used in our experiment. We used a within-subjects design for this study, where all participants interact with two virtual humans: one animated with naturalistic nodding, and the other without. This is because within-subjects designs provide more statistical power and are not impacted by individual differences. All the non-essential features of the stimuli, including the order of interacting with each virtual human, the visual appearance and the voice actors, were counterbalanced across participants so that we can examine just the impact of interactivity in our statistical analysis.
In our experiment, participants completed three distinct tasks as follows.
In the first task, the participants were told that they would have a conversation with two different virtual humans in VR, and would be discussing a set of facts about some of the states in the United States of America (for example, the fact that the state sport of Maryland is jousting).

Figure 3: Timeline of one trial. The participant acclimatises to the virtual environment (in blue), then meets the virtual human, where the virtual human starts reading some facts for 45-55 s (speaking, in pink). After that, both discuss these facts for 35-45 s (in brown). Subsequently, the participant starts reading for 45-55 s while the virtual human is listening (in green), after which, again, they discuss for 35-45 s (in brown).

In the first stage of the experiment, the participant sits in a virtual office room and has a few minutes to acclimatise to the virtual environment. The participant then watches a short demonstration video of the task as performed by two humans on a virtual TV. This teaches the participants how the trials are structured, with monologue and dialogue segments, and provides a model for how participants should behave, providing some chatter as well as a simple reading of the facts available.
Then, the participant meets the first virtual human (Anna), who introduces herself and prompts the participant to introduce themselves. After that, Anna delivers a monologue of 45-55 s, where she reads a set of facts about a US state from a tablet and shares these facts with the participant. This is followed by a 35-45 s dialogue between Anna and the participant, where the facts are discussed. After that, the roles are swapped: the participant reads facts about a different state for 45-55 s, and the participant and virtual human have a 35-45 s dialogue afterwards. See Figure 3, which illustrates the timeline for one trial of the task. After a series of four such trials (i.e., consisting of discussing eight different states), the participant meets the second virtual human (Beth). The same procedure is repeated: the participant discusses eight new states with Beth, and in each trial two states are discussed. In total, the participant completes eight trials, four with Anna and four with Beth. Table 1 contains information about the states that are discussed by the virtual humans and the participants. The aim of this experiment is to study the effect of head nodding in social interaction. Hence, we designed these two virtual humans (Anna and Beth) to provide identical eye-blinking, facial expressions, and changing-of-gaze behaviours.
The only difference between the behaviour of the two virtual humans is that one virtual human engages in naturalistic nodding behaviour that is contingent on their partner's actions, while the other shows only pre-recorded head motion.
The nodding behaviour of the virtual humans is counterbalanced: For half of the participants, Anna is the one who nods, while Beth does not.For the remaining half of the participants Beth nods, while Anna does not.
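The counterbalancing described above can be expressed as a simple assignment scheme. The sketch below (in Python, for illustration only; the function name and the rotation order of the cells are our own assumptions, not taken from the study) crosses which character nods with which character is met first, and assigns participants to the four resulting cells in rotation:

```python
from itertools import product

def counterbalance(n_participants):
    """Assign participants to counterbalancing cells in rotation.

    Crosses two factors: which virtual human nods (Anna/Beth) and
    which virtual human is met first (Anna/Beth). The rotation order
    is illustrative; the paper does not specify the exact assignment.
    """
    cells = list(product(["Anna", "Beth"],      # who shows naturalistic nodding
                         ["Anna", "Beth"]))     # who is met first
    repeated = cells * (n_participants // len(cells) + 1)
    return [{"nodder": nodder, "first": first}
            for _, (nodder, first) in zip(range(n_participants), repeated)]
```

With a multiple of four participants, each cell receives the same number of participants, so nodding identity and presentation order are fully balanced.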
The second task used a virtual maze to implicitly measure the proximity, trust, and attraction that the participant has formed towards the virtual humans. In this second task, the participant plays a game where they have to escape a maze. They navigate through the maze by pointing at, and teleporting to, marked locations in the field of view, using the VIVE controllers. The virtual humans Anna and Beth are positioned at eight decision points in the maze where the participant has to choose a direction. The participant can choose to approach either Anna or Beth in order to get advice about how to complete the maze (see Figure 4). Both virtual humans are programmed to give the same advice ("Go left" / "Go right" / "Go straight ahead"), and the participant only receives advice from the virtual human that has been approached. Moreover, at each point at which the participant encounters the two virtual humans, the distance between the VR headset (participant) and each of the two virtual humans' positions in the virtual space is the same.
We interpret the number of times the participant approaches each virtual human as a measure of their trust in that virtual human.

Figure 4: The second experiment, where the participant has to escape a maze. During this experiment, the participant is able to navigate by pointing to a location in their field of view and teleporting to the pointed location. Anna and Beth are positioned at some of the junctions of the maze, where the participant repeatedly needs to decide which of the virtual humans to take advice on direction from.
Finally, in the third task, the participant removes the VR headset and completes a set of questionnaires, which provides an explicit measure of their attitudes towards each virtual human. The questionnaires all use a seven-point Likert scale, where a score of 1 on the scale represents strong disagreement, and a score of 7 represents strong agreement. The participant rates the experience (i.e., through scoring claims such as "I felt mentally immersed in the experience." and "I felt that I experienced a sensation of reality."). Moreover, the participant rates the two virtual humans, measuring their affinity with them (through, e.g., the claim "I find the virtual human appealing.") and interactivity (e.g., "I felt that the virtual human held my attention during discussion." and "I believe the virtual human showed attention for what I was saying.").
In Virtual Reality, cyber-sickness occurs due to the dissociation between the visual input and the vestibular input. In the design of our experiment, we tried to minimise visual-vestibular conflicts by implementing a teleportation locomotion method in our maze task. This method eliminates acceleration and granular movement, allowing the participants to instantly teleport to locations by pointing at them. This style of movement has been shown to reduce cyber-sickness (Coomer et al., 2018). We also adopt dynamic blurring, which reduces cyber-sickness by reducing the amount of visual information and optical flow available to the participant. The participant wears an HTC VIVE Pro in which a circular area, centred on the participant's gaze point as tracked by an eye-tracking device embedded in the headset, remains in focus while the rest of the image is blurred. This strategy was found to be effective for reducing cyber-sickness (Budhiraja et al., 2017).

Experiment Implementation
The main purpose of this study is to investigate the role of head nodding as a nonverbal signal in virtual interaction, using VR, in order to facilitate immersive interaction in social communication, remote collaboration, and learning. To this end, we designed a VR social experiment where a human participant interacts with two virtual humans (both female).
Figure 5: An overview of the nonverbal modules implemented in our VR experiment. Two modules that control the virtual human's nonverbal behaviour are active while the virtual human is listening: the eye movement module (controlling eye contact and blinking) and the head movement module (controlling mimicry head nodding and fast head nodding). For the virtual human's speaking behaviour, we implemented change-of-gaze-direction behaviour within the eye movement module, and we use a Wizard of Oz module for the choice of facial expressions and for use during the dialogue interaction.
For the listening behaviour, and while the participant is reading and sharing information, we implemented an eye movement module that controls the eye contact and blinking behaviour of the virtual human (see Section 4.1). We also implemented a head movement module, where the virtual human imitates a participant's head nod if it occurs, and performs fast head nods, i.e., where the head pitch changes at a relatively high frequency. Head nodding movements of the virtual human are programmed to occupy approximately 15%-20% of the total speaking time of the participant (see Section 4.3). Furthermore, with regard to the speaking behaviour of the virtual human, we make the virtual human occasionally change her gaze direction: the virtual human alternates between looking at the participant and looking at the tablet from which she is reading. Moreover, we use a lip sync module with the pre-recorded voices of actresses, and we employ a Wizard of Oz module for the choice of facial expressions and for use during the dialogue interaction (see Section 4.2). See Figure 5 for an overview illustration of the nonverbal modules implemented within our VR experiment.

Eye Movement Module
The eye movement module controls the virtual human's eye-blinking, eye contact, and gaze direction. For the eye-blinking movement, we employ the eye blink model presented by Trutoiu et al. (2011), which is based on physiological human data. Thus, we use a 9-frame blink duration model (at 30 frames per second), where the eye blink profile is illustrated in Figure 6. The spontaneous blink rate of the virtual human is between 6.0 and 30.0 eye blinks per minute: the time between two consecutive blinks is constrained to lie between 2 and 10 seconds, where after each blink a new duration is drawn uniformly at random from this interval. Moreover, the eye blink rate while the virtual human is in the reading posture is lower, at 2.4-14.2 eye blinks per minute (Doughty, 2001).
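The blink-timing rule above can be sketched as follows (an illustrative Python sketch; the function name is our own, and the 4-25 s reading interval is an assumption chosen only to approximate the lower reading rate of 2.4-14.2 blinks per minute, since the paper states interval bounds only for the listening case):

```python
import random

def blink_schedule(duration_s, reading=False, fps=30, seed=None):
    """Generate blink onset times for the virtual human.

    After each blink, the next inter-blink interval is drawn uniformly
    at random. Listening uses the 2-10 s interval from the text (6-30
    blinks/min); reading uses an assumed 4-25 s interval approximating
    the reported 2.4-14.2 blinks/min.
    """
    rng = random.Random(seed)
    lo, hi = (4.0, 25.0) if reading else (2.0, 10.0)
    blink_s = 9 / fps               # 9-frame blink at 30 fps = 0.3 s
    t, onsets = rng.uniform(lo, hi), []
    while t < duration_s:
        onsets.append(t)
        t += blink_s + rng.uniform(lo, hi)   # finish blink, then wait
    return onsets
```

In practice, each onset would trigger playback of the 9-frame blink profile of Trutoiu et al. (2011) on the character's eyelid blendshapes.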
While listening to the participant, the virtual human always gazes at the participant's headset to indicate eye contact and engagement. On the other hand, while the virtual human is speaking, reading facts from her tablet, and sharing information with the participant, she employs both eye and head movements to replicate human behaviour. The virtual human alternates her gaze direction between the tablet and the participant, which models a gaze mechanism similar to the one presented by Andrist et al. (2012). The virtual human keeps her head aligned with the participant when gazing at the participant. When gazing at the tablet, the virtual human aligns her head with the tablet (see Figure 6). When a change of gaze direction is performed, it is subject to an eye saccade of approximately 15-20°, where the head motion begins approximately 100 ms before the eye saccade, as presented by Laurutis and Robinson (1986).
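The speaking-time gaze alternation can be sketched as an event plan (an illustrative Python sketch; the 1.5-4 s dwell times and the function name are our own assumptions, while the 100 ms head lead follows the text):

```python
import random

def gaze_shift_plan(duration_s, head_lead_s=0.100, seed=None):
    """Plan alternating gaze shifts between tablet and participant.

    Returns a list of events; for each gaze shift, the head begins
    rotating head_lead_s (~100 ms) before the eye saccade, following
    the timing described in the text. Dwell times on each target are
    drawn from an assumed 1.5-4 s range.
    """
    rng = random.Random(seed)
    t, target, events = 0.0, "tablet", []
    while t < duration_s:
        t += rng.uniform(1.5, 4.0)              # assumed dwell time
        target = "participant" if target == "tablet" else "tablet"
        events.append({"head_onset": t - head_lead_s,
                       "eye_onset": t,
                       "target": target})
    return events
```

Each event would drive the head-alignment animation at `head_onset` and the 15-20° eye saccade at `eye_onset`.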

Conversation and Facial behaviour Module
In our experiment, the voices of both virtual humans (Anna and Beth) were recorded by native English-speaking actresses. We prepared a set of phrases related to each conversation topic (based on a previous study with real participants), including some stock phrases ("yes, tell me more", etc.), so that a Wizard of Oz operator could provide a realistic conversation with the participant. For each prepared speech phrase, we also created a matching facial behaviour in advance. We employ a lip sync algorithm that works off-line (i.e., by preprocessing a recorded voice to compute lip movements), which enables using the voice signal to drive the blendshape deformation of mouth movement and facial expression automatically (Lewis, 1991). In addition, some prepared speech phrases were linked to pre-specified facial expressions, such as a smile with "yes" and a small frown with "really?".
During the dialogue interaction between the virtual human and the participant, we used a human wizard, who interacted through a specific interface within Unity to select an appropriate item from the set of prepared phrases. The human wizard who operated the dialogue was blind to conditions. Because participants do not all behave in the same way when in dialogue with the virtual human, we designed the Wizard of Oz module for multimodal utterances to anticipate various dialogue scenarios and facial expressions. This enabled us to create a realistic dialogue for each participant.

Head Nod Module
Head nodding plays an important role in information sharing during human interaction. Therefore, in our experiment, we model two types of head movements based on the findings of Hale et al. (2020): mimicry head nodding and fast nodding. The nodding behaviour of the virtual human gives feedback to the participant and is part of the virtual human's listening behaviour.
The study presented by Hale et al. (2020) employed motion capture technology to investigate head nodding in human-human dyadic social interaction (31 dyads), where cross-wavelet analysis was used to examine the data. They found that mimicry occurs between people with a constant lag of around 600 ms. This suggests that, in a conversation, the conversation follower tends to mimic the conversation leader's head nod with roughly a 600 ms delay.
To model mimicry head nodding of the virtual human, we first detect a participant's potential head nod based on movement data of the VR headset. The VR headset utilises a sensor mounted on the participant's head, which allows for efficient and accurate head gesture detection. We used the method presented by Kobayashi (2018) to detect head nods based on VR headset movements in real time. After the participant performs a head nod, and with a lag of 600 ms after the detected end of the head nod movement, the virtual human performs a head nod with approximately the same pitch rotation angle (see Figure 7). The velocity of the head motion is chosen uniformly at random from the interval [1.25r, 6.91r], where r is the radius given by the distance from the centre of the virtual human's neck joint to the chin. This nodding motion includes slow-in motion (i.e., the motion starts slowly and accelerates) and slow-out motion (i.e., the motion decelerates and then stops).
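The mimicry pipeline can be sketched as follows (an illustrative Python sketch: the threshold-based nod detector is a simplified stand-in for the method of Kobayashi (2018), and the cosine slow-in/slow-out profile is one plausible realisation of the easing described above; function names and thresholds are our own assumptions):

```python
import math
import random

def detect_nod_end(pitch_samples, dt, min_amp_deg=5.0):
    """Simplified nod detector (illustrative stand-in, not Kobayashi 2018).

    A nod is considered to end when the headset pitch returns near
    baseline after having exceeded min_amp_deg. Returns (end time in
    seconds, peak amplitude in degrees), or None if no nod is found.
    """
    peak, armed = 0.0, False
    for i, p in enumerate(pitch_samples):
        peak = max(peak, abs(p))
        if abs(p) > min_amp_deg:
            armed = True
        elif armed and abs(p) < 1.0:     # back near baseline: nod ended
            return i * dt, peak
    return None

def mimic_nod(nod_end_s, amplitude_deg, neck_radius_m, lag_s=0.6, seed=None):
    """Schedule the virtual human's mimicry nod.

    Starts 600 ms after the participant's nod ends, matches the pitch
    amplitude, and uses a chin speed drawn uniformly from [1.25r, 6.91r]
    as in the text. Returns (start time, duration, pitch(t) function,
    with t measured from nod_end_s).
    """
    rng = random.Random(seed)
    speed = rng.uniform(1.25, 6.91) * neck_radius_m
    arc = 2 * math.radians(amplitude_deg) * neck_radius_m  # down + up
    duration = arc / speed

    def pitch_at(t):
        u = (t - lag_s) / duration
        if not 0.0 <= u <= 1.0:
            return 0.0
        # Raised-cosine easing: zero velocity at both ends (slow-in/slow-out).
        return amplitude_deg * (0.5 - 0.5 * math.cos(2 * math.pi * u))

    return nod_end_s + lag_s, duration, pitch_at
```

In the real system, `pitch_at` would be sampled each frame to drive the neck joint's pitch rotation.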
During the virtual human's listening behaviour and while the participant is speaking, we implemented fast nodding behaviour as visual feedback of agreement towards the participant. Such a fast nod movement consists of 2-3 rapid nods, with a faster head motion than in the mimicry head nodding behaviour; in this case, the velocity lies in [16.33r, 40.84r]. We detect the participant's speech based on the audio coming from the microphone input of the headset, where we filter out noise using a low-pass filter with a mean cutoff of 10.9 kHz. The virtual human performs these fast head nods for approximately 15%-20% of the total speaking time of the participant.
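A fast-nod scheduler satisfying the 15-20% coverage target can be sketched as follows (an illustrative Python sketch; the 0.9 s burst length, the 1-3 s gaps between bursts, and the function name are our own assumptions, while the coverage range and the 2-3 nods per burst come from the text):

```python
import random

def fast_nod_times(speech_intervals, coverage=(0.15, 0.20),
                   burst_s=0.9, seed=None):
    """Place fast-nod bursts inside detected speech intervals.

    Each burst lasts burst_s seconds and contains 2-3 rapid nods;
    bursts are added until nodding covers roughly 15-20% of the total
    speaking time. Returns a list of (start, end, n_nods) tuples.
    """
    rng = random.Random(seed)
    total = sum(end - start for start, end in speech_intervals)
    budget = rng.uniform(*coverage) * total      # seconds of nodding
    bursts = []
    for start, end in speech_intervals:
        t = start + rng.uniform(0.0, burst_s)    # assumed jittered onset
        while budget > 0 and t + burst_s <= end:
            bursts.append((t, t + burst_s, rng.choice([2, 3])))
            budget -= burst_s
            t += burst_s + rng.uniform(1.0, 3.0)  # assumed gap
        if budget <= 0:
            break
    return bursts
```

Each scheduled burst would then drive the head-pitch animation at the faster [16.33r, 40.84r] velocity range.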
For the second virtual human, which did not have this nodding module active, the head movement behaviour was driven by a "still" animation captured from very subtle, slow pre-recorded head movements of one pilot participant.
This means that the virtual human does not show slow mimicry nods or fast nods, and does not move in a fashion that is contingent on the participant's own movements and speech.

Results and Discussion
Post-experiment questionnaire: To investigate the effect of nodding as a social signal in virtual interaction using VR, we analysed the data obtained from 21 participants, none of whom had used VR before or was familiar with any virtual human. We used Cronbach's alpha coefficient, denoted by α, as it is the most commonly used reliability measure in sociological research on a seven-point Likert scale (where a score of 1 stands for strong disagreement and a score of 7 for strong agreement). We found that Cronbach's alpha for the 46 questions in the questionnaire is α = 0.9306, which indicates high reliability.
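For reference, Cronbach's α is computed from the per-item variances and the variance of participants' total scores. A minimal self-contained implementation (the data shapes are illustrative):

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha for questionnaire reliability.

    item_scores[i][j] = participant i's rating on question j.
    alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores),
    where k is the number of items.
    """
    k = len(item_scores[0])          # number of questionnaire items
    n = len(item_scores)             # number of participants

    def var(xs):                     # sample variance (ddof = 1)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[j] for row in item_scores]) for j in range(k)]
    totals = [sum(row) for row in item_scores]
    return k / (k - 1) * (1 - sum(item_vars) / var(totals))
```

When all items move together across participants, α approaches 1; values above roughly 0.9, as found here, are conventionally read as high internal consistency.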
For the immersiveness and co-presence section of the questionnaire, participants were asked the following questions: Q1) I felt mentally immersed in the VR experience. Q2) I felt that I experienced a sensation of reality. Q3) I felt that the experience was natural. Q4) I felt that I was in the presence of another person in the virtual room. The median rating of Question 1 was 6 out of 7 (M = 5.48, SD = 1.03). For Question 2, the median rating was 5 (M = 4.77, SD = 1.27). For Question 3, the median was 5 (M = 4.71, SD = 1.29). Lastly, the median rating for Question 4 was 4 (M = 4.43, SD = 1.33). See Figure 8 for an illustration of the results for this section of the questionnaire.
To examine the effect of head nodding during the virtual social interaction, we carried out t-tests on the ratings given for each of the eighteen questions (covering personality, affinity, and interactivity) of the post-experiment questionnaire. These questions are a modified version of those used in Kidd and Breazeal (2004) and Ho and MacDorman (2010). We found a significant difference in the mean ratings (p = 0.036, t = 2.167, df = 40): the mean was 4.92 for the nodding (social) virtual human and 4.38 for the non-nodding virtual human. The mean rating scores for the two virtual humans are shown in Figure 9.
We also conducted a post-hoc analysis to identify which questions best discriminated between the two virtual humans, in order to inform the design of future questionnaires. In the interactivity section of the post-experiment questionnaire, the nodding virtual human was rated significantly higher than the non-nodding virtual human on the following questions. Q1) I felt that the virtual human was maintaining eye contact with me, where the difference in the means (henceforth the mean difference) was 0.905 (p = 0.0213, t(20) = 2.09). Q2) I felt that the virtual human's head movement was natural, where the mean difference was 1.238 (p = 0.0003, t(20) = 3.68). Q3) I believe the virtual human showed attention to what I was saying, where the mean difference was 1.24 (p = 0.0014, t(20) = 3.16). Our results therefore indicate that the virtual human's nodding behaviour while the participant was speaking was perceived as engaging and interactive.

Measuring implicit trust using the maze task:
The maze task provides an implicit measure of how much participants trust each virtual human. We counted the number of times participants approached each virtual human (out of 8 decision points). We conducted a paired-samples t-test to determine whether there was a significant difference between the number of approaches towards the nodding (social) virtual human and the non-nodding (non-social) virtual human. Overall, participants approached the nodding character (M = 4.62) more than the non-nodding character (M = 3.38), and this result was significant (p = 0.0453, t(20) = 1.77). Moreover, participants always followed the advice of the virtual human they approached.
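The paired design used here compares each participant's two approach counts against each other. A minimal stdlib implementation of the paired-samples t statistic (the example counts below are hypothetical, not the study data):

```python
import math

def paired_t(xs, ys):
    """Paired-samples t statistic and degrees of freedom.

    xs[i] and ys[i] are the two matched measurements for participant i,
    e.g. approach counts towards each virtual human out of 8 decisions.
    """
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean / math.sqrt(var / n)
    return t, n - 1                                      # (t statistic, df)
```

In practice a library routine such as `scipy.stats.ttest_rel` would also return the p-value; with 21 participants the test here has df = 20, matching the t(20) values reported above.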
As the results in Figure 10 illustrate, some participants approached the non-nodding virtual human more than the nodding virtual human, but overall there were more approaches to the nodding virtual human (purple). This indicates that the nodding virtual human was generally perceived as more approachable and regarded as more trustworthy.

Figure 10: Summary of the counts with which the participants approached the nodding and non-nodding virtual humans during the maze experiment. The x-axis denotes the count by which a virtual human was approached, the y-axis distinguishes between the nodding (purple) and non-nodding (brown) virtual human, and the numbers in the interior show the number of participants who approached the specified virtual human the specified number of times. For example, seven participants approached the nodding virtual human 5 times (and the non-nodding virtual human 3 times) during the maze experiment.

Limitations
A limitation of the experiment is that only 21 participants (15 female, 6 male) took part.
Although a small sample size decreases the likelihood of detecting true effects and can inflate the size of effects that do reach statistical significance (Button et al., 2013), we nonetheless found consistent significant effects in this study. In future studies, we will test a larger number of participants (over 70) to investigate whether nonverbal and subtle signals contribute to learning and to the quality of peer collaboration within educational settings using VR.
Another point worth mentioning is that the pre-recorded movements used for the "quiet sitting" animation of the virtual character were captured from a single pilot participant. While we took care to select movements that are very small, slow, and natural, and that do not appear odd, a larger sample of such movements would be needed to confirm this more confidently.

Conclusion and Future work
In this study, we investigated the role of head nodding behaviour during virtual social interaction using Virtual Reality (VR). We aimed to demonstrate the capacity of virtual humans to embody interactive head nodding behaviour based closely on natural movements, and to test the psychological theory that mimicry of nodding leads to liking and affiliation. To this end, we implemented two sophisticated virtual humans who engage in a simple Wizard-of-Oz-assisted conversation, supplemented by nonverbal signals programmed in VR, including eye-blinking, facial expressions, and natural nodding. We find that participants rate the nodding virtual human more positively on our explicit questionnaire measures, and also approach the nodding virtual human more often than the non-nodding virtual human in our maze task, which measures implicit trust. We consider here what these results mean for the design of engaging virtual human characters in VR; all the measures used in our experiment are reported in Section 5.
Previous studies of nodding mimicry using virtual humans have found mixed results, with both positive (Bailenson and Yee, 2005) and negative (Hale and Hamilton, 2016) findings. In our experiments, we used more advanced virtual humans presented in VR, whose nodding behaviour was carefully specified based on the findings of Hale et al. (2020) on natural human movements. The two types of head nodding that we implemented are (a) head nod mimicry, in which the virtual human mimics the participant's head nod with a constant 600 ms lag, and (b) fast nods, which occur when the participant is speaking. We find a positive impact of naturalistic nodding, with both the implicit and explicit measures showing that participants liked and trusted the nodding virtual human: the nodding virtual human was rated significantly higher than the non-nodding virtual human during the virtual social interaction. Participants rated particularly highly the question "I believe the virtual human showed attention to what I was saying" for the nodding virtual human, which suggests that the virtual human's nodding while the participant is speaking was perceived as a sign of engagement and interactivity. Moreover, we found in the maze game experiment that participants approached the nodding virtual human for advice more often than the non-nodding virtual human, which appears to reflect higher levels of approachability, liking, or trust towards the nodding virtual human.
This supports the claim that mimicry is a social glue (Lakin et al., 2003), and that by copying the actions of another person it is possible to build trust and liking. Future studies could test how this generalises across different types of conversation and across different social groups, for example using virtual humans of different races and genders. Our results also demonstrate the potential of implementing simple behaviour rules derived from real behaviour in virtual humans and virtual reality.
One potential application of this kind of research is in generating artificial agents who can teach participants new information and act as tutors who build trust. The 'talking about US states' task used here is well suited to learning studies, because it includes a set of discrete facts which are not known to our (UK) participants before the study begins but which can be tested afterwards. In future experiments, we plan to study the effects of nonverbal signals on participants' performance in VR educational settings. We aim to explore whether nonverbal and subtle signals contribute to the learning process and to the quality of peer collaboration and communication in VR. We will, moreover, design more elaborate sets of experiments to investigate to what extent, and in which settings, nonverbal communication adds value to social interaction in comparison to purely verbal communication, and how it can be used to convey ideas and meaning better than words alone, with a view to using VR in educational settings and in academic video conferencing.

Acknowledgement
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: Antonia F de C Hamilton acknowledges financial support from the Leverhulme Trust under the grant code RPG-2016-251.

Figure 2 :
Figure 2: The virtual humans that we used in our experiments. Anna on the left, and Beth on the right.

Figure 3 :
Figure 3: A timeline for one trial of the first task. The participant acclimatises to the virtual environment (in blue), then meets the virtual human, and the virtual human starts reading some facts for 45-55 s (speaking, in pink). After that, both discuss these facts for 35-45 s (in brown). Subsequently, the participant reads for 45-55 s while the virtual human listens (in green), after which they again discuss for 35-45 s (in brown).

Figure 6 :
Figure 6: (a) The eye blink profile used for modelling blinks in our experiment, where a 9-frame blink duration (at 30 fps) is adopted. (b) The virtual human in our study sharing information and reading facts, alternating her eye gaze between the participant and the tablet.

Figure 7 :
Figure 7: The virtual human (the follower) mimics the participant's head nod (the leader) with a constant lag of 600 ms and with approximately the same pitch rotation angle (nod depth).

Figure 8 :
Figure 8: The results of four questions related to the immersiveness and co-presence of the post-experiment questionnaire.
Figure 9: The difference in the mean rating scores between the nodding and non-nodding virtual humans for eighteen questions on personality, affinity, and interactivity of the post-experiment questionnaire.

Table 1
A series of eight trials, consisting of discussions of 16 different states.