Power of looking together: an analysis of social facilitation by Agent's mutual gaze

The authors investigated the effect of an agent tracking a human's gaze. Psychological studies have shown that the presence of others facilitates task performance by implying partnership. This effect also occurs when the other person is replaced with a robot. Furthermore, previous research showed that being touched by a robot motivates a person. However, direct contact such as a robot's touch risks disturbing human work. Because gaze tracking can also facilitate human effort, the authors used it for social facilitation and implemented a social facilitation robot agent based on gaze tracking. The agent projects its face on a surface to show quickly changing expressions, which is difficult to achieve with ordinary robots. The agent can also express gaze tracking by means of an eye tracker. The authors prepared three conditions with a monotonous task to compare the effects of gaze tracking: the agent gazes at the instructed point, moves its gaze randomly, or traces the gaze point of the participant. The results showed that the participants' motivation increased when the agent tracked the point gazed at by the human, even though there were no differences between conditions in the achievement scores of the task or in the continuation time.


Introduction
Social facilitation can improve performance during task execution owing to the presence of others who are working. This effect is a promising application in the human-agent interaction (HAI) field, and several studies suggested that it is also effective with real-world and screen agents [1,2]. The effect of social facilitation becomes stronger if there is an interaction between the user and an agent. One example is social touch: users show more enthusiasm for work when they are touched by an agent [3]. This finding suggests a strong power of 'interaction' in HAI. However, touch also requires intimate interaction, and some users prefer not to be touched.
In this research, we propose to utilise the effect of a noninvasive gaze by applying an agent's mutual gaze interaction as a method of social facilitation called operation support. The gaze is known to play an important role in communication, such as in intention transmission and conversation control. For example, the impression that an individual gives to the other party can be influenced by factors such as the duration of the gaze and the direction of the line of sight during face-to-face interaction [4,5]. It has become clear that applying this gaze effect to a robot agent can improve the impression of the agent and make communication between the robot agent and the user more efficient.
Several studies showed that robot agents improve the impression on the agents themselves by exhibiting a joint gaze to users [6,7]. A joint gaze involves seeing the object seen by the other person following the line of sight of the opponent. It is defined as seeing the same object and sharing its existence between them. These studies show that a joint gaze improves the impression evaluation of the robot agent itself. On the other hand, further research is needed regarding how the agent's line of sight improves the behaviour of the user and possible engineering applications of this method.
Unlike other animals, humans are known to use the gazes of others for communication, and they are highly susceptible to gazes [8]. In the field of social psychology, the influence of the presence of others at work is called the 'audience effect' [9]. As for the effect of the human gaze, it has been shown that human beings working in others' line of sight received better responses than those working against the line of sight [3]. The human ability to follow other people's gaze instantaneously and unconsciously and to understand its meaning naturally helps to convey thoughts to other people and increases the efficiency of collaborative work.

Background and related works
It is known that human motivation, work quantity, quality, and speed are influenced by the presence of other people. Even if an agent replaces the other person, the same effect on work is confirmed. Shiomi et al. [3] investigated improvements to human motivation through the active contact of robot agents. They showed that human task performance in simple work improves when agents actively touch humans. However, in the method proposed by Shiomi et al., contact is necessary to receive social facilitation. The problem is that the user is required to always have one hand in contact with the robot during work. Since it is then impossible to perform tasks using both hands, the limitation imposed by contact is significant. It was also pointed out that some users feel averse to being touched [3]. In this respect, the joint gaze used for social facilitation in this paper can be controlled using a non-invasive and involuntary gaze. Thus, the burden on the experiment participants is considered to be small.

Facial expression presentation agent
Real robot agents have anthropomorphic features such as eyes and hands and can use these features to instruct the user, guide the user's line of sight, and interact while encouraging empathy [10]. Since virtual agents on a screen can freely change their projected image and interact with the user, they have the advantage of a large representable range, but their presence for the user is small. On the other hand, robotic agents in the real world have a stronger presence for users than virtual agents on a screen [11]. In particular, agents in the real world can share the same physical space as the user.
An agent device that projects a face onto a non-planar surface can counter the Mona Lisa effect, in which the viewer cannot identify line-of-sight information from a face drawn on a plane [12]. Such a device is effective in suppressing the impression that the agent always seems to be looking in the viewer's direction. An earlier example of a projected face, called Talking Head Projection, displayed images on a face-shaped display [13]. In addition, Oyekoya et al. [14] developed a spherical display called SphereAvatar. Using this device, they investigated the errors in the line-of-sight direction transmitted to the user by the spherical display.
They found that a robot using a spherical display can transmit accurate gaze directions to the user. Otsuki et al. [15] proposed ThirdEye, a display that imitated the eyeballs of a person, and conducted experiments to present the direction of the gaze of a remote interlocutor to the local environment. ThirdEye suppressed the Mona Lisa effect when the gaze was viewed diagonally, and could present the gaze direction more accurately than the general method of showing only the face of the remote interlocutor on the display. In this research, we assumed that by using a similar spherical projection method, the gaze direction would be easier to convey to the user and the effect of line-of-sight tracking would be easier to determine.

Design and implementation
Social facilitation by gaze, as proposed in this research, has the following requirements for the agent: (i) To be a physically embodied agent in order to maintain its presence. (ii) To operate with a system that tracks the movement of the human's line of sight without delay. (iii) To have a gaze designed to be accurately conveyed to the experiment participants in order to achieve a joint gaze.
The design for meeting the above requirements will be described below.

Requirements for social facilitation by the gaze
In this paper, we used a device that projects the facial expression of the agent on a spherical surface, as shown on the left-hand side of Fig. 1. The part corresponding to the face of the agent is a spherical display, which makes it possible to show three-dimensional expressions. The system projects a face drawn with OpenGL from a projector mounted on the agent onto the upper semi-transparent sphere (Hitachi LDG 15D-G) through a fisheye lens. We used SmartBeam from SK Telecom as the projector. The facial expression can be updated every 1/60 s, which is fast enough to express changes in the line of sight. The agent has the same presence as a robot that expresses facial expressions using physical parts, and is smoother and more expressive than such a robot (Fig. 1, right-hand side). Fig. 2 shows the system configuration. Since it is necessary to change the line of sight at the same speed as a human being, a device that accurately reflects the line-of-sight data acquired by the sensor is necessary. In this research, a gazing point measuring device (VOXER by NAC Image Technology Co., Ltd.) was used to acquire the user's line of sight. In this device, infrared light is irradiated on the eyeball of the user, and the camera image of the eyeball is processed to measure the gaze point of the user. Gaze points can be acquired every 1/60 s and reflected in the agent's behaviour. In the host PC, the direction of the agent's face and the movement of its eyeballs are controlled using the gazing point data of the user acquired by the gaze measuring device.

Evaluation
We conducted the following experiment to determine which gaze strategy would generate social facilitation, measured by the number of task achievements, the task duration, and the improvement in motivation. We adopted Shiomi et al.'s task to evaluate social facilitation [3]. For the experiment participants, the task involved dragging a white rectangle from one of the four corners of the screen with a mouse so that it overlaps a black rectangle. A white rectangle and a black rectangle are shown in two places in Figs. 3 and 4. When the rectangles overlap, a new white rectangle and a new black rectangle appear at random positions in the four corners of the screen.
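The overlap-and-respawn step of the task described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Rect` type, the function names, the choice of "any shared area" as the overlap criterion, and the rule that the two rectangles respawn in different corners are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; (x, y) is the top-left corner in pixels."""
    x: float
    y: float
    w: float
    h: float


def overlaps(a: Rect, b: Rect) -> bool:
    """Return True when the two rectangles share some area (assumed criterion)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def corner_positions(screen_w: float, screen_h: float, size: float):
    """Top-left coordinates that place a size x size rectangle in each screen corner."""
    return [(0.0, 0.0), (screen_w - size, 0.0),
            (0.0, screen_h - size), (screen_w - size, screen_h - size)]


def respawn(screen_w: float, screen_h: float, size: float, rng: random.Random):
    """Place the white and black rectangles in two different random corners (assumption)."""
    white_xy, black_xy = rng.sample(corner_positions(screen_w, screen_h, size), 2)
    return Rect(*white_xy, size, size), Rect(*black_xy, size, size)


def step(white: Rect, black: Rect, score: int,
         screen_w: float, screen_h: float, rng: random.Random):
    """Count one achievement and respawn when the white rectangle reaches the black one."""
    if overlaps(white, black):
        new_white, new_black = respawn(screen_w, screen_h, white.w, rng)
        return new_white, new_black, score + 1
    return white, black, score
```

The returned score corresponds to the number of task achievements used as an objective measure; the screen size and rectangle size are left as parameters because the source does not give them.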
The participant repeats this series of tasks and is instructed by the agent to continue working as long as possible. Tasks can be stopped by the participants, who are also instructed by the agent in advance. Tasks are designed to automatically terminate when a participant indicates an intention to terminate or when the operation time exceeds 10 min.
However, we did not tell participants in advance that the task would end after 10 min at maximum. The participants were asked to sit in front of the monitor and the agent. As interaction with the participants, the agent gazed at the participant's face for 0.5 s at the time of task success and then gazed at the next black target rectangle. The behaviour of the agent after 0.5 s of continuous target gaze was implemented under the following three conditions.
Stable gaze condition: In the constant gaze condition, the agent looks at the target and continues to watch it until the next task success. Riether et al. [2] implemented this condition for comparison, in particular with agents that do not contribute to improving task results as well as humans do.
Random gaze condition: Gaze control performed at random irrespective of the user's sight-line changes is called a random condition. In this condition, the line of sight of the agent is changed according to the line-of-sight data of another user who performed the same task, and so a gaze movement unrelated to the user actually faced is performed. This is implemented to compare the effects of simple agent operation and the effects of eye tracking.
Gaze-tracking condition (joint gaze condition): When the user looks at the face of the agent, the agent also gazes at the user's face. When the user moves his/her line of sight from the agent, the agent also looks at the same location as the user. This is a condition that imitates the joint gaze.
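The three conditions above can be summarised as a single gaze-selection rule. The sketch below is only an illustration under stated assumptions: the function and parameter names are hypothetical, gaze targets are treated as 2D points in one shared coordinate frame, and the replayed gaze of another user is passed in as a ready-made sample.

```python
from enum import Enum
from typing import Tuple

Point = Tuple[float, float]

FACE_GAZE_S = 0.5    # agent looks at the participant's face after a success
TARGET_GAZE_S = 0.5  # agent then looks at the next black target


class Condition(Enum):
    STABLE = "S"   # constant gaze at the target
    RANDOM = "R"   # replay of another user's recorded gaze
    FOLLOW = "F"   # gaze tracking (joint gaze)


def agent_gaze_target(condition: Condition,
                      t_since_success: float,
                      target: Point,
                      user_face: Point,
                      user_gaze: Point,
                      user_looks_at_agent: bool,
                      replayed_gaze: Point) -> Point:
    """Return the point the agent should look at for the current frame."""
    if t_since_success < FACE_GAZE_S:
        return user_face
    if t_since_success < FACE_GAZE_S + TARGET_GAZE_S:
        return target
    if condition is Condition.STABLE:
        return target          # keep watching the target until the next success
    if condition is Condition.RANDOM:
        return replayed_gaze   # unrelated to the current user's gaze
    # FOLLOW: mutual gaze when the user looks at the agent, joint gaze otherwise
    return user_face if user_looks_at_agent else user_gaze
```

Calling this function once per gaze sample would reproduce the cycle of participant gaze, target gaze, and condition-specific action.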
The procedure of the entire experiment is shown in Fig. 5. The agent repeats the cycle of this experiment: participant gazing, target gaze, and action by the condition.
At the beginning of the experiment, we set up the gaze point measuring device and calibrated it to the gaze of each participant. We explained to the participants that their gaze-point information would be used for the agent's gaze control and that the gaze-point measurements would be used for the analysis of the impression evaluation.
Next, to familiarise participants with the task, we asked them to practise it for 1 min without an agent. After completing the exercise, the participants sat face-to-face with the agent and received a request from the agent for task execution (Table 1 includes the agent's utterance script).
All three conditions were carried out for each experiment participant, and the order of the conditions was counterbalanced to avoid order bias. Every time one experiment condition was completed, we issued a questionnaire to the participant in order to obtain their impressions of the agent and the work.
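The source does not state which counterbalancing scheme was used. As one possible illustration, the sketch below assigns orders from a cyclic Latin square, which is an assumption; the labels S, R, and F stand for the stable, random, and gaze-follow conditions, and the function names are hypothetical.

```python
from collections import Counter

CONDITIONS = ("S", "R", "F")  # stable, random, follow


def latin_square_orders(conditions):
    """Cyclic rotations of the condition list: each condition occupies each position once."""
    n = len(conditions)
    return [tuple(conditions[(start + i) % n] for i in range(n))
            for start in range(n)]


def assign_orders(n_participants, conditions=CONDITIONS):
    """Give participant p the order with index p modulo the number of orders."""
    orders = latin_square_orders(conditions)
    return [orders[p % len(orders)] for p in range(n_participants)]


def position_counts(assignments):
    """For each serial position, count how often each condition appears there."""
    n_positions = len(assignments[0])
    return [Counter(order[pos] for order in assignments)
            for pos in range(n_positions)]
```

With 15 participants and three conditions, this scheme places every condition in every serial position the same number of times, which is the property that counterbalancing aims for.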

Impression evaluation items for agents and tasks
To review the impressions of agents and tasks, four items were each evaluated on a seven-point scale. The evaluation items are shown below:
• Favourable to agents: high (7) ⇔ low (1): In a previous paper that uses touch [16], there was no significant difference in agent favourability. We also evaluate this item for the gaze case.
• Pressure on work: high (7) ⇔ low (1): In the preliminary experiments, many participants commented that they felt hurried under the constant gaze condition. This item verifies the effect of this pressure.

Participants
We conducted experiments with 15 male university students in their early 20s, all of whom specialised in engineering research. The experimental time per person was 20-30 min, varying between individuals according to the duration of the task.

Hypotheses
On the basis of the previous research on social facilitation, we set the following three hypotheses for agents' gaze tracking (joint gaze):
Hypothesis 1: When the agent performs gaze tracking, the impression of the agent improves.
Hypothesis 2: When the agent performs gaze tracking, the achievement number and duration of the task of the experiment participant are increased compared with when the agent does not follow the line of sight.
Hypothesis 3: When the agent performs gaze tracking, the experiment participants' subjective evaluation of the task is higher than when no gaze tracking is performed.

Results
We analysed the experimental results to clarify whether there was a significant difference in the task scores between the three conditions and whether the subjective evaluations of the experiment participants differed between conditions. For brevity, we denote the constant gaze condition as S (stable gaze), the random condition as R (random gaze), and the gaze-tracking condition as F (gaze follow). Tables 2 and 3 list the average and standard deviation (SD) of the number of task achievements and the task duration for each condition. For these numerical values, the mean scores under the three conditions were compared by an analysis of variance at a significance level of 5%, but there was no significant difference.
Table 1 Instructions on the experiment
[first condition utterance]
"Thank you for your participation."
"I want you to continue the previous task as long as possible."
"I want you to do your best, but please raise your hand and let me know when you want to finish."
"Please start your task."
[second condition utterance]
"Do your best as you did before."
"Please start your task."
[third condition utterance]
"This is the last task. Good luck."
"Please start your task."

The results of the subjective evaluation are shown in Fig. 6. The average likability felt by the participants was 4.2 (SD = 1.6) for S, 3.9 (SD = 1.4) for R, and 5.6 (SD = 1.0) for F. Multiple comparisons were carried out using Bonferroni's method at a 5% level with respect to likability, and it was confirmed that F was significantly higher than R and S. The average pressure felt by participants was 3.0 (SD = 1.7) for S, 2.9 (SD = 1.8) for R, and 1.7 (SD = 1.3) for F. When multiple comparisons were made using Bonferroni's method at the 5% level for the pressure, S tended to be higher than F. The participants' average motivation toward tasks was 4.5 (SD = 1.2) for S, 3.6 (SD = 1.4) for R, and 5.4 (SD = 1.2) for F. Multiple comparisons were conducted using Bonferroni's method at the 5% level for motivation toward the task, and it was also confirmed that F was significantly higher than R and S. The participants' average duration of concentration was 4.5 (SD = 1.5) for S, 3.3 (SD = 1.4) for R, and 5.1 (SD = 1.3) for F. Multiple comparisons were carried out using Bonferroni's method at the 5% level for sustained concentration. It was confirmed that F was significantly higher than R, and there was a tendency for S to be higher than R.
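For readers unfamiliar with Bonferroni's method, the correction applied to the three pairwise comparisons (S vs. R, S vs. F, R vs. F) can be sketched as below. The function names are hypothetical, and the sketch takes the unadjusted p-values of the pairwise tests as input because the source reports neither them nor the underlying test.

```python
from itertools import combinations


def pairwise_comparisons(labels):
    """All unordered pairs of conditions to be compared."""
    return list(combinations(labels, 2))


def bonferroni_adjust(p_values, alpha=0.05):
    """Map each comparison to (adjusted p-value, significant at alpha?).

    p_values: dict from a comparison (pair of labels) to its unadjusted p-value.
    The adjusted p-value is the raw value multiplied by the number of
    comparisons, capped at 1.
    """
    m = len(p_values)
    results = {}
    for pair, p in p_values.items():
        p_adj = min(1.0, p * m)
        results[pair] = (p_adj, p_adj < alpha)
    return results


pairs = pairwise_comparisons(("S", "R", "F"))
per_test_alpha = 0.05 / len(pairs)  # equivalent threshold for each raw p-value
```

With three conditions there are three comparisons, so testing at an overall 5% level is equivalent to requiring each unadjusted p-value to fall below 0.05/3.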

Discussion
The experimental results indicated that favourability toward the agent under the gaze-tracking condition was significantly higher than under the other two conditions. Therefore, hypothesis 1 was supported. There were no differences between conditions in either the number of task achievements or the duration. Therefore, hypothesis 2 was not supported. The subjective evaluation of the task showed that the motivation for work increased under the gaze-tracking condition. However, there was no significant difference in the concentration on work between the gaze-tracking condition and the constant gaze condition. Hypothesis 3 was therefore considered to be partially supported.

Influence of practice
From the results in Section 5.2, although there was no significant difference in the objective evaluations, the subjective evaluations show that eye tracking contributed to the motivation for the task. We assumed that the lack of a significant difference discussed in Section 5.1 might be owing to the participants' familiarity with the task. Thus, we compared the number of accomplished tasks and the task duration by the order in which the conditions were carried out. Table 4 lists the average and SD of the task achievement number for each position in the order in which the experiment was conducted. Multiple comparisons were carried out using Bonferroni's method at the 5% level with respect to these values. It was confirmed that the first was significantly lower than both the second and the third.
From this, it can be seen that the experiment participants increase the amount of work in the second and third conditions. Therefore, it is suggested that the experiment participants were accustomed to the tasks, and the amount of work increased. Since the influence of variation for participants who were accustomed to their tasks is significant, it is possible that the effect of each condition on the achievement number could not be found.

Pressure from agents and concentration on tasks
Regarding the duration of concentration, there was no significant difference between the gaze-fixed condition and gaze-following condition, and an agent that only shows the target in sight can be considered effective.
In addition, though there was no significant difference between the random condition and the line-of-sight tracking condition with respect to the pressure felt during the work, there was a trend toward a difference between the fixed line-of-sight condition and the line-of-sight tracking condition. This implies that social facilitation using a joint gaze without accompanying gaze tracking is beneficial for sustaining concentration. On the other hand, there is a possibility that the mental burden of avoiding an objective appearance to workers will increase.

Attention point of participants and influence on work
From Section 5.3, we can see that there may be a difference between conditions in the ratio of gazing at the agent. Thus, we compared, for each condition, the average proportion of time during which the experiment participants gazed at the agent. A gaze at the agent was defined as a gazing point with an x coordinate in the range 2800-3800 px and a y coordinate in the range 1900-2900 px. This is the range in which the agent is visible to the participants. The results are listed in Table 5. For these three conditions, multiple comparisons were carried out using Bonferroni's method at the 5% level, and there was a tendency that F was higher than R. From these results, it seems that the experiment participants concentrated too much on the movement of the agent under the random condition, and their ability to concentrate on the work decreased. Therefore, it is suggested that, for the worker to maintain concentration, the agent should carry out a clear, goal-directed action, as in the constant gaze condition or the gaze-tracking condition, rather than performing unintended movements.
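The gaze ratio defined above can be computed as in the following sketch. The pixel ranges come from the text; the function names, the inclusive treatment of the range boundaries, and the assumption that gaze samples are evenly spaced in time (so that the share of samples equals the share of time) are choices made for this illustration.

```python
AGENT_X_RANGE = (2800, 3800)  # px, range in which the agent is visible
AGENT_Y_RANGE = (1900, 2900)  # px


def is_on_agent(x, y):
    """True when a gaze point falls inside the agent region (bounds assumed inclusive)."""
    return (AGENT_X_RANGE[0] <= x <= AGENT_X_RANGE[1]
            and AGENT_Y_RANGE[0] <= y <= AGENT_Y_RANGE[1])


def agent_gaze_ratio(samples):
    """Fraction of gaze samples that land on the agent.

    samples: iterable of (x, y) gaze points recorded at a constant rate.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("no gaze samples")
    hits = sum(1 for x, y in samples if is_on_agent(x, y))
    return hits / len(samples)
```

Computing this ratio per participant and per condition would give the values that are then averaged and compared between conditions.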

Conclusion
We examined the effect of an agent's attention to workers during work, known as the social facilitation effect. For this investigation, we implemented three conditions in the agent, namely the constant gaze condition, the random condition, and the gaze-tracking condition (joint gaze condition), and compared the differences in the experiment participants' motivation for a simple task.
As a result, we found that the eye tracking condition is particularly effective for enhancing the motivation toward the task in the subjective evaluation. However, there were no significant differences in the number of actual tasks completed and the working time.
There was no significant difference between the constant gaze condition and the gaze-tracking condition with respect to the duration of the concentration, and even an agent that only indicates the target was considered effective for improving concentration. However, it was also suggested that when an agent only indicates the target, there is a possibility that the mental burden of avoiding an objective appearance to workers may increase.
The results of the measurements of the gazing points of the experiment participants suggested that if the rate of gazing time at the agent was large, it would result in a decrease in the level of concentration in the task.
Although there was no significant difference between the conditions in the number of task achievements or the duration, there was a difference in the subjective evaluation, as discussed above. Identifying tasks in which a joint gaze works effectively would be useful to extend these observations. Therefore, in the future, it will be necessary to verify the effectiveness of joint attention during different tasks.