What Should Your Assistive Robot Look Like? A Scoping Review on Embodiment for Assistive Robots

Assistive robots have the potential to support older people and people with disabilities in various tasks so that they can live more independently. One research challenge is the appearance of assistive robots, which should be accepted by prospective users and encourage interaction. This scoping review aims to identify studies that report user preferences in order to derive indicators for the embodiment of a robot with assistance functions. A systematic literature search was conducted in the three electronic databases IEEE Xplore, ACM Digital Library and PubMed Central (PMC). Included papers date back no further than 2015 and report empirical studies on the preferred appearance of service robots. The search yielded 1,760 papers, of which 29 were included: 20 reported quantitative studies, three qualitative studies, and six mixed-methods designs. From these papers, seven categories of robot appearances and design components could be extracted. Most papers focused on humanoid or humanlike robots and on components such as facial features or gender aspects. Others relied on designs that reflect the robot's function or on emotions simulated through light applications. Only eight studies focused on older adults, and none on people with disabilities. The appearance of a humanoid robot is often described as favorable, but the definition of 'humanoid' varies widely across the analyzed studies, and an explicit allocation of features is not possible. Robot designers can extract various aspects from the papers for their practical work; however, more research is necessary for generalization.


Assistive Robots
People of advanced age or with disabilities often need assistance to handle activities of daily living and to stay safe in their homes. Many of them also lack social interaction, as they can no longer pursue previous occupations or lack opportunities for contact, resulting in longer periods of being alone [1].
The field of assistive robotics addresses these issues by developing various kinds of robotic systems. Heerink et al. [2] distinguish between non-social and social robots. Non-social assistive robots, such as exoskeletons or intelligent wheelchairs, do not interact socially with the user. Assistive social robots are companion and service robots that provide social support as well as physical and cognitive assistance. Klein et al. [3] distinguish robots for rehabilitation, for supporting caregivers and other staff, and for support at home. Shishehgar et al. [4] subdivide robots specifically for older adults into companion, telepresence, manipulator service, rehabilitation, health monitoring, reminder, domestic, entertainment, and fall detection/prevention robots.
An assistive robot called ROSWITHA (RObot System WITH Autonomy) is being developed at the Frankfurt University of Applied Sciences in Germany (Fig. 1). It is intended to navigate autonomously in rooms and flats and to fetch things for the user. A current research project at the FUTURE AGING research centre is concerned with the physical appearance of the robot, the so-called embodiment. To this end, principles for an accepted robot design and possible embodiments are to be identified. The focus of this scoping review is therefore on the appearance of mobile social robots that can provide assistance at home.

Robot Acceptance
An assistive robot accompanying people in their private environment has to be highly accepted in order to be used regularly and fulfil its purpose [5]. A common model for technology acceptance that can be applied to robotics is the Technology Acceptance Model (TAM) by Davis [6], with two key components: (1) perceived usefulness, "the degree to which a person believes that using a particular system would enhance his or her job performance" ([6], p. 320), and (2) perceived ease of use, "the degree to which a person believes that using a particular system would be free of effort" ([6], p. 320).
A key factor in meeting these conditions is adequate functionality that meets the needs of the user. Beyond its functions, however, a robot should, in order to be accepted, look attractive and interesting, and its appearance should convey that it is useful and easy to use in daily life. Various embodiments are designed to meet users' expectations [7], such as humanlike, zoomorphic, fantasy, mechanical, or functional embodiments. TAM has been extended several times, and factors such as the user's gender, age, experience with the technology, and voluntariness of use have also been identified as relevant [8].
Another factor that might be considered when designing a social robot is an effect often described in connection with robot appearance. The Uncanny Valley Effect (UVE) was first described by Mori in 1970 and depicts the phenomenon of human aversion to robots whose appearance reaches a certain degree of human likeness [9]. As human resemblance increases, positive feelings and impressions also increase until a certain degree of resemblance is reached. From this point on, people perceive the appearance as uncanny and their feelings towards the robot become negative. If the degree of human likeness continues to increase even further, however, impressions become positive again [9]. This effect could also be important when designing the appearance of robots.

Objective and Research Question
The objective of this scoping review is to identify key issues of embodiment for assistive and social robots that are accepted by people in their homes. To this end, current research findings are studied and mapped in order to identify user surveys on the appearance of robots. Of particular interest are studies with older people who rely on assistance in daily living and with people with disabilities or loss of function. The results are to be incorporated into the appearance of the university's own robot. The scoping review was chosen because it is a method for obtaining an overview of the relevant literature in a specific area. It is suited to covering a broader range of topics and to including different types of studies. Compared with a systematic review, it also focuses less on very specific research questions and less on the quality of the included studies [10,11].
In this context, the following research questions are key: How should the embodiment of a robot be designed to assist people in manifold ways in their home environment? What evidence can be gained from surveys on user preferences for robot embodiment?

Methods
The research strategy was conducted following the PRISMA Statement [12], a guideline developed to systematize reviews and review reports.

Data Sources
The database search included the electronic databases IEEE Xplore, ACM Digital Library and PubMed Central (PMC). The systematic literature search was conducted in July 2020.
Based on the search results, a forward and backward search was carried out to complement the results. For the backward search, the references of the full texts were screened. The forward search was conducted via ResearchGate and Google Scholar. The reference lists of relevant overview reviews were screened as well. The additional search took place in November 2020.

Inclusion and Exclusion Criteria
Scientific papers that described empirical studies and were published between January 2015 and 22 July 2020 (last date of search) were included. A time frame of five years was chosen to capture recent results in this highly innovative and rapidly changing field of robotics and to keep the number of articles manageable. Papers had to be written in English or German, and an abstract had to be available.
Included papers had to focus on the embodiment of assistive or social robotic platforms targeting the support of people in their households. This involved the overall shape, textures, coloured lights and facial features.
Papers were excluded that did not focus on design as exterior appearance but on software design or algorithms, as well as papers that did not report empirical studies. Furthermore, studies with children or animals were excluded. From the field of robotics, exoskeletons, intelligent prostheses, medical robots for surgery and rehabilitation, drones and other vehicles, as well as pet-like emotional robots/companions that cannot assist in daily life tasks were not included, because they did not fit the application possibilities and functions of ROSWITHA.

Search Strategy
The selected search terms were based on an informal literature search conducted in advance. It was decided to search in broad categories rather than being too restrictive, so as not to lose articles because of overly narrow categories. This resulted in the search strategy: Robot AND (Social OR Assistive) AND (Embodiment OR Design OR Appearance).
For every electronic database, the search terms had to be adapted. In PubMed we searched in abstract and title; in the ACM Digital Library, where the same search strategy was not available, in the abstract only; and in IEEE Xplore in "All Metadata".
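As an illustration, the base query and its per-database scoping described above can be sketched in a few lines of Python (the scope labels are descriptive only; they are not the exact field tags submitted to each database):

```python
# Illustrative sketch of the Boolean search strategy and the per-database
# scoping reported in the text. The scope strings are descriptive labels,
# not actual database field syntax.
BASE_QUERY = "Robot AND (Social OR Assistive) AND (Embodiment OR Design OR Appearance)"

# Search scope per database, as reported above.
SEARCH_SCOPE = {
    "PubMed Central (PMC)": "title and abstract",
    "ACM Digital Library": "abstract only",
    "IEEE Xplore": "all metadata",
}

def describe_searches(query: str, scopes: dict) -> list:
    """Return one human-readable line per database search."""
    return [f"{db}: '{query}' in {scope}" for db, scope in scopes.items()]

for line in describe_searches(BASE_QUERY, SEARCH_SCOPE):
    print(line)
```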

Identifying Relevant Papers
In a first step, all search results with a title and an abstract were saved in CITAVI 6.0. Secondly, all authors, years of publication, titles and abstracts were exported to the spreadsheet program Excel 2010 to check for duplicate entries and to screen titles and abstracts against the previously defined inclusion and exclusion criteria. The results were split into two parts, and each part was then screened by two people independently (four-eyes principle). In case of uncertainty about inclusion, the paper was discussed with a third person from the research team and the majority decided. In a further step, the full texts of the remaining studies were procured and again divided into two parts, with each part screened by two people and a third team member consulted as needed to obtain a clear vote on a study.
For the additional search, we screened the reference lists of reviews that had appeared in the search results but were excluded because they did not report empirical research. Additionally, a backward search was implemented by checking the reference lists of the procured full texts. In a forward search based on the final results, we identified further papers on ResearchGate and Google Scholar that had cited our results and extended their conclusions. The additional literature was also evaluated by at least two people independently, using the same inclusion criteria.
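The two-reviewer screening rule with a third vote in case of disagreement, as described above, can be sketched as follows (a hypothetical helper for illustration; the review team did not publish code):

```python
from typing import Optional

def screening_decision(reviewer_a: bool, reviewer_b: bool,
                       tiebreaker: Optional[bool] = None) -> bool:
    """Two reviewers screen a paper independently (True = include).

    If they agree, their shared vote stands. On disagreement, a third
    team member is consulted and the majority of the three votes decides.
    """
    if reviewer_a == reviewer_b:
        return reviewer_a
    if tiebreaker is None:
        raise ValueError("Disagreement: a third reviewer's vote is required")
    # Majority of three votes decides
    return sum([reviewer_a, reviewer_b, tiebreaker]) >= 2

# Example: the reviewers disagree, so the third vote decides
assert screening_decision(True, False, tiebreaker=True) is True
```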

Data Extraction
All remaining studies were loaded into a CITAVI project, and an inductive content analysis [13] was carried out to define categories of robot appearance. The inductive procedure represents an analytical form of summarizing interpretation: relevant aspects are collected from the full texts and summarized into categories by two persons from the research team.

Search Results
A total of 1,760 articles were retrieved. The search in the three databases IEEE Xplore, ACM Digital Library and PubMed Central (PMC) yielded 1,752 references; the hand search (reviews, forward and backward search) yielded another eight articles. After removing duplicates and papers without abstracts, 1,643 articles were screened by title and abstract. 247 articles remained for full-text screening. Finally, 29 papers were included in the scoping review (see PRISMA flow diagram, Fig. 2).
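As a quick consistency check, the counts above can be reproduced by simple arithmetic (a sketch; the intermediate exclusion counts are derived by subtraction and are not stated explicitly in the text):

```python
# Minimal sketch reproducing the PRISMA flow counts reported above (Fig. 2).
database_hits = 1752          # IEEE Xplore + ACM Digital Library + PMC
hand_search_hits = 8          # reviews, forward and backward search
total_retrieved = database_hits + hand_search_hits

screened = 1643               # after removing duplicates / papers without abstracts
full_text = 247               # remaining for full-text screening
included = 29                 # final set in the scoping review

# Exclusion counts inferred by subtraction
removed_before_screening = total_retrieved - screened
excluded_by_title_abstract = screened - full_text
excluded_at_full_text = full_text - included

print(total_retrieved, removed_before_screening,
      excluded_by_title_abstract, excluded_at_full_text)
# prints: 1760 117 1396 218
```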
The quantitative studies used questionnaires; 13 of the 20 took place as online surveys [7, 15-17, 23, 26-28, 31, 32, 35-37], among them six conducted via Amazon Mechanical Turk [15, 26-28, 31, 32]. The studies with a qualitative design comprised a focus group [14], interviews and a focus group [22], as well as semi-structured interviews [41]. The mixed-methods designs mostly combined interviews and questionnaires [33,34,39], but also included designing an individual face with software in conjunction with a questionnaire [21], a description of perceived emotions with a rating [29], and descriptions of textiles after having touched them, with ratings on different scales [24]. The respective characteristics and methods of the reviewed studies are described in Table 1.
Among the quantitative studies, multiple scales and questionnaires were used to measure people's impressions of the robots. The most widely used was the Godspeed Questionnaire [43] or adapted versions of it [17,21,30,31,38]. The questionnaire uses semantic differential scales to rate the five categories anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety. Among the adapted versions of the Godspeed Questionnaire was the Robotic Social Attributes Scale (RoSAS) [28], also used in Benitez et al. [26] and Stroessner and Benitez [27]. Two papers [7,34] reported the use of the UTAUT questionnaire [2] or parts of it. The authors of the other quantitative studies reported self-generated scales or questionnaires, partly using items from other surveys (Table 1). To illustrate the image of a robot or to compare different designs, either real robots were presented or videos or pictures (in most cases of the head part of the robot) were shown. Most studies used pictures (Table 1). Many of the presented robots were commercially available [14, 15, 19-21, 25, 34-36] or at least had some name recognition at the time [7, 27, 31-33, 37, 41]. Others were in-house developments/prototypes [17,18,23,24,29,30,38-40] or visual designs [16,26,28].

Qualitative Results
The 29 selected publications were analyzed with regard to their aspects of appearance design. From the inductive content analysis, we grouped participants' preferences into seven categories: humanoid/humanlike, non-humanoid, design reflecting the robot's function, face, gender aspects, lights and motion, and texture (Table 2).

Overall Design
A number of papers discussed the question of what an assistive robot should generally look like. The main issue was whether or not it should look humanlike, or whether its appearance should reflect its functionalities (Table 3).
A trend towards a preference for a humanlike appearance was seen in five studies [7,20,22,27,34], of which two targeted older adults [20,34] and one healthcare professionals [22]. Esposito et al. [20], one of the studies with older adults, further examined humanoid versus anthropomorphic (resembling real humans) robots and found no preference.
Otterbacher and Talias [15] found that humanlike robots are perceived as more capable of feeling pain or fear and of planning and exercising self-control, which led to negative affective responses towards them. These reactions were influenced by gender-specific attributes of the robot as well as by the gender of the participants.
No clear preference between humanlike and non-humanlike robots could be found in two papers [25,41], of which one, Oehl et al. [25], targeted older adults. In Bedaf et al. [14], no preference was identified among older adults and informal caregivers, while professional caregivers tended towards a humanlike robot.
In Cavallo et al. [18], three robots with a similar mixed appearance between anthropomorphic and machinelike were rated highly by older adults. No alternative appearance was available for comparison.
Whether a robot is humanoid or not, four studies identified the importance of the robot design reflecting the functions/roles of the robot [7,18,22,41]. (Table legend: studies concerning older people are in bold print; the focus of a study is marked with a bold "X"; a smaller "x" stands for information that occurs in addition; "(x)" refers to humanlike faces with a machinelike pattern.)

Faces
In ten papers, the focus was on the face design of robots (Table 4).
On the question of whether a robot's face should be humanoid or mechanical, the Uncanny Valley Effect (UVE) was observed in three papers [32,33,37]. In Tu et al. [37], the UVE was only seen in younger and middle-aged participants, not in older ones.
In Prakash and Rogers [33], more than half of the older participants chose a picture of a human face for their home robot, but only a quarter of the young participants did; half of the young participants chose the presented robotic face. A similar result was seen in Tu et al. [37], where older adults ranked humanlike robots among their top five robot appearances while young adults preferred non-humanlike robots. The effect changed for social tasks, for which half of the young and 60% of the older participants chose a human face [33].
In two studies [26,28], humanlike faces were rated higher on the RoSAS factors warmth and competence and lower on discomfort than blended and machinelike robots, with machinelike faces rated lowest. Stroessner and Benitez [27] saw higher ratings for humanoid robots than for machinelike robots in warmth and competence; for discomfort, the effects could only be partly confirmed.
Three studies analyzed digital versions of a robot's eyes and mouth on a screen. Faces with no pupils or eyelids [31] and no mouth [17,31] were ranked as unattractive. Robots with a mouth (especially with a smile) were frequently chosen for education and entertainment. Blue eyes with pupils were seen as friendly and relatively trustworthy and were also favored for entertainment purposes. For the home context, more detailed eyes with eyebrows were chosen [31]. Iconic eyes were rated higher than abstract ones, and the possibility to express mood changes was favored [23]. A dynamic mouth was voted most positive, and a smile was seen as friendly and sociable [17].
A user-centered approach for robotic faces was described in Heuer [21]. In this paper, an individual digital face was created by the users and then projected onto the robot's head.

Gender Cues
Eight papers evaluated the effects of gender cues (Table 5). Jung et al. [30] found that robots tend to be perceived as male unless a female cue (in this case, a pink earmuff) is provided. This was not confirmed in Otterbacher and Talias [15], where different reactions from men and women were observed for robots with female and male cues but not for gender-neutral robots. Different preferences for male or female robots were seen in younger and older adults, with younger adults favoring female features and older adults not having any preference [33].
Male robots had been predicted to evoke more cognitive trust and to appear more competent [16,27,28], which was only confirmed in the second study of Stroessner and Benitez [27] and rejected by the others. More affective trust was ascribed to female robots in Bernotat et al. [16].
Quite different results were shown in Jung et al. [30], where the male robot elicited greater anthropomorphism and animacy and lower anxiety than the female robot.
Otterbacher and Talias [15] reported differences related to the participants' gender: both male and female participants perceived female robots as capable of experiencing pain and fear, but only female participants considered female robots as having agency (the capacity to plan and exercise self-control). For male robots, the opposite was true: humanoid robots with male gender cues were perceived as feeling pain and fear only by male participants, while agency was seen by both men and women. As both perceptions increased negative affective responses, the authors inferred influences of gender on the Uncanny Valley Effect.
There seem to be indications that gender cues influence the role older adults associate with robots. Three examples of different female robots were perceived as particularly suited to performing household tasks and less suited to protection/security, healthcare and front-office tasks [19]. The results were independent of participants' gender. Younger participants also preferred female robots for typically female tasks, but for typically male tasks no differences could be perceived [16]. Feminine faces were preferred for companion and entertainment robots [26].
In terms of how to design gender cues, a stronger gender effect was seen when the cues were shown on a screen than on the robot's body, but significance was only reached for female robots [30].

Light Effects and Movements
Five studies were identified that investigated the effect of lights and movements on human-robot interaction (Table 6).
In Baraka et al. [40], appropriate colours and light patterns were identified to communicate a robot's status to the user. A light blue colour with a slow siren-like pattern was seen as appropriate for the scenario "Waiting for input"; a red colour in a faded animation that turns on quickly and dies out more slowly was chosen for "Blocked"; and a green bottom-up progress bar was chosen to signal that the robot platform is completing a task [40].
Three papers reported the use of lights to express 'emotions' [35,36,39]. In Song and Yamada [35], blue light at a low frequency was chosen to convey attractiveness, while red light at a high frequency was chosen for hostility. In a follow-up study, the research group tried to specify light patterns for specific emotions and found that expressive lights alone were not able to convey emotions precisely [36]. By adding in-situ movements, some emotions (surprise, disgust, sadness, happiness) could be identified by the participants. Hoggenmueller et al. [39] reported good ratings for the emotions anger, happiness, and sadness with an animated pattern of colours and specific motions. In both studies, however, other emotions were also misinterpreted.
An attempt to develop a completely different form of expression was described in Gemeinboeck and Saunders [38]. The robots were white and cube-shaped and moved in a way adapted from dancers. Although they had little specific appearance, they were rated above average as affective and as having agency and intelligence. This might imply that other factors such as movement can influence user perception in a similar way to a specific appearance.

Texture
Two studies reported results on the haptic design of robot surfaces (Table 7). Hu and Hoffmann [29] assessed how goosebump and spike textures at different frequencies and amplitudes can represent key emotions. They identified goosebumps as being perceived more positively than spikes, and higher rates of texture change mapped to higher arousal levels. Most texture expressions could be linked to a specific emotion. McGinn and Dooley [24] studied participants' preferences for robot surfaces and found that participants preferred compliant surfaces (medium/soft) over the hard alternatives that are very common in assistive robots.

Discussion
The aim of this scoping review was to identify papers that report and evaluate potential embodiments of assistive robots. From this, design principles for an accepted embodiment of ROSWITHA, a robot developed at Frankfurt University of Applied Sciences in Germany, will be derived.
Twenty-nine studies were identified that met the criteria. Although this is not a high number, it reflects the fact that a relatively new field of research is being addressed, and it can be assumed that the number will increase in the future. Among these studies were ten papers that addressed the question of whether robots should resemble humans and, if so, to what degree. Although some papers showed a certain tendency towards humanoids, the results vary greatly in their statements.
With regard to the target groups of assistive robots, the group of older people is of particular importance. Specific differences between younger and older adults could be confirmed in two papers [33,37], with a higher proportion of older adults than younger adults preferring humanlike robots. In one paper, reporting only results from people of older age, a majority preferred a humanlike robot for service and companion-related tasks such as housework, finding/fetching things or chatting [34]. However, older adults were rather indifferent about human or non-human features in other studies [14,25]. In one paper, older people preferred the robot Pepper, which lay between the options of totally machinelike and anthropomorphic [20]. Due to that, and because the presented robots were not comparable, statements about the preferences of this specific age group are not possible.
Other studies evaluated more specific aspects of a robot, such as the face, gender-specific appearances, light patterns and movements, or texture. This allows statements about the design of faces, such as the eye and mouth areas. Indications on the use of coloured lights and movements of the robot to simulate emotions are also revealed, as well as suitable textures for different purposes. These aspects can shed light on how to design robots for better interaction.
However, the examined studies varied widely in terms of study design, number and characteristics of participants, methodology and stimuli, which may have influenced the results: there are no clear distinctions between embodiments or clear classification clusters, especially concerning humanoid or humanlike robots. The stimuli therefore cover a wide range of appearances and are not comparable. Robots with a similar design (functional, with some kind of body, head, and face) are titled 'humanoid' [19,20,25,34,41] in some studies, or as 'mixed appearance' between anthropomorphic and machinelike [18], 'technical design' [25], or 'caricatured' [7]. Very realistic robots resembling humans, such as Geminoid HI-1, HRP-4c, Erica or Sophia, are referred to either as android [15,19,41] or also as humanoid [7,27], whereas very realistic human faces combined with a mechanical-looking pattern are described as 'machinelike' [26,28].
Regarding methodology, many studies used images rather than real robots, which may have biased the evaluation because the actual size and proportions were not obvious. Apart from this aspect, it sometimes seems uncertain whether robots were rejected by participants because of their mechanical or humanlike design or simply because they looked less positive or more frightening. Such a potential bias in favor of the male robot was described in Jung et al. [30]. It also seems possible, e.g. with the images used in Benitez et al. [26] and Carpinella et al. [28], in which the eye area of the male stimuli appears more threatening than that of the female or androgynous ones. Especially in studies focusing on the relationship between gender and competences of a robot, other factors visible on the robot screen may affect participants' perceptions, such as the age of a robot's face or accessories like glasses [27]. Apart from that, gender preferences might depend on the participants' own gender [15].
However, other effects could also be relevant: Koschate et al. [44] showed that emotional expression can help to overcome the UVE in robots. This would mean that robots expressing emotions leave a better impression than robots with an indifferent appearance, regardless of their overall appearance. Kwon et al. [45] investigated expectations based on appearance and showed that participants tended to generalize social capabilities in a humanoid robot, which might cause an expectation gap. If participants see the robot during an activity in which it does not fulfill their expectations, this can lead to negative impressions that are not directly related to the robot's appearance. These aspects should be considered when planning evaluation tests.

Conclusion
Even if there are indications that certain humanlike features could be advantageous for a social and assistive robot, only a few general appearance patterns could be identified that lead to good acceptance values. Both the small number of studies found on the various characteristics and their low comparability make general statements difficult. As too many additional factors are of importance, such as the specific realization of an embodiment, human-robot interaction, the robot's functionalities and users' individual factors (age, gender), an approach that fits all target groups does not seem to be the solution. Instead, it might be useful for design teams to get to know the needs of the specific target group and the required functionalities of their robot very well, and to have a significant number of these people participating in the design process over a longer period of time.
Robot designers can extract some aspects from the papers; however, generalization, especially with respect to physical assistance, seems difficult. For the embodiment of ROSWITHA, various parameters contributing to acceptance have been derived and transformed into different 3D models. For the first phase of evaluation, three versions were designed, each with a head and a body to give the embodiment a basal form of human resemblance: a rather neutral head, a more playful one, and a head with a monitor as a face. The bodies vary from angular to conical to round. These designs will first be evaluated and further developed in an Augmented Reality environment with different target groups in order to achieve the most accepted embodiment.
Finally, it is important to mention that embodiment is strongly linked to functionality. The design should reflect the functions of the robot, and the functions themselves have to convince the user in order to achieve acceptance.