1 Introduction

Behaviour change interventions are often designed to change the behaviour of a group or a community by focusing on a specific behaviour in order to address an issue. Such interventions require an understanding of one or more modifiable influences on behaviour, which can then be addressed using behaviour change techniques. The science of behaviour change is complex, and the design of technologies to effect behaviour change is not well understood. One area of active research in behaviour change treatments involves using social robots as change agents [1, 2]. When we refer to a social robot, we mean an autonomous artificial intelligence agent with a physical embodiment that can have social interactions with humans [3]. In this study, we focussed on handwashing behaviour change as a case study to design an extensible social robot platform that can nudge children towards positive behaviours.

Poor hand hygiene practices among children have been identified as the primary cause of morbidity such as anaemia, respiratory illnesses, and diarrhoea [4]. Widespread adoption of a proper handwashing routine with soap can potentially prevent up to one million deaths worldwide [5]. It can lower the risk of respiratory infections by 16% [6]. When hand hygiene is practised in community school settings, research has shown that it has reduced absenteeism due to gastrointestinal illness [7, 8]. Research has also shown that handwashing effectively reduces the spread of viruses such as the newly discovered coronavirus causing COVID-19 [9, 10].

However, despite its simplicity and effectiveness, most handwashing campaigns conducted over the past three decades failed to establish handwashing behaviour as a regular habit. Hussam et al. [4] state in their work that handwashing behaviour can become effective only if it is practised often enough, and for this, it must become a habit. We believe that when handwashing regimens are introduced to children at a young age, they have the potential to instil regular habits. Thus, instead of effecting a handwashing behaviour change using models that focus on the fear of disease and epidemics, our effort is to focus on motivation, positive emotions, and habit formation through positive persuasion.

The primary purpose of our study was to determine children’s preferences for the embodiment of a handwashing assistive agent in their school. The social robot was co-designed with children with the intent of eventually fostering good handwashing practices among their peers. Based on our analysis of the inputs from the children, we present recommendations for a minimalist design of a social agent to promote handwashing behaviour. Our design recommendations are based on children’s inputs, prior research in Behaviour Change Support Systems (BCSS) and human–robot interaction (HRI), and practical constraints. In future work, we will conduct a behaviour change study with the children to evaluate our final prototype design in effecting habit formation.

2 Social Robots as Agents of Change

Prior research indicates that physically embodied agents are perceived to be more enjoyable, to have better social presence and acceptance, and to be more engaging than virtually simulated robots or teleconference agents. A teleconference agent refers to a setup where the physical robot is in a remote location and the users interact with it through a video conference. In a study of the role of the physical embodiment of robots for social interactions, Wainer et al. [11] compared a co-located physical agent with a remotely located physical agent and a simulated virtual agent. They found that the interaction with the co-located physical agent was more enjoyable, and that this agent was perceived to be more observant than a virtual agent.

Similarly, Bainbridge et al. [12] studied the effect of the physical presence of an agent and found that the study participants were more compliant with a physically present robot’s commands than with an on-screen robot’s commands. The participants also found the physically present robot more engaging than the virtual one. Moreover, the physically present robot was given more personal space, which could be evidence of respect. In another study by Leyzberg et al. [13], assistance in solving a puzzle was provided by the voice of a robot, an on-screen robot, and a physically embodied robot. A significant increase in learning gains was observed when a physical embodiment was used. The participants also rated the physically embodied robot as less annoying. According to Leyzberg et al. [13], this could be the effect of social presence and may indicate that physical embodiment causes a greater sense of social acceptance compared to a virtual agent. Research has also shown that a physical embodiment enables the robot to use multiple channels of communication, including proxemics, oculesics, and gestures [14]. We co-designed our robot as a physically embodied agent based on these research findings.

Breazeal [15] postulated that a sociable robot should be able to communicate with humans and understand and relate to humans personally. Using social cognitive capabilities, a sociable robot should be able to understand humans and itself in social terms. In turn, human beings should be able to understand the robot in the same social terms to relate to it and empathise with it. Such a robot must be able to adapt and learn throughout its lifetime, incorporating shared experiences with individuals into its understanding of itself, of others, and of the relationships they share. We use this as a guiding principle from HRI research and juxtapose it with the physical and interactional features children found preferable for such a robot.

2.1 Human Behavioural Change via Social Robots

The social robot we are designing will be a persuasive agent that provides necessary support so that the behaviour becomes a habit. To discuss the theoretical foundation for designing our social robot as a persuasive system, we will adopt the definition of persuasion proposed by Fogg [16]. Fogg defined persuasion as: “an attempt to change attitudes or behaviours or both (without using coercion or deception)”. The Persuasive Systems Design (PSD) model builds on this definition and proposes twenty-eight design principles grouped into four categories that can be used to guide the design of a persuasive system [17]. When an HRI system is built with the design principles of the PSD model, it can be referred to as a Behaviour Change Support System (BCSS). Lehto et al. [18] define BCSS as: “a socio-technical information system with psychological and behavioural outcomes designed to form, alter or reinforce attitudes, behaviours or an act of complying without using coercion or deception”. The authors also present an expanded PSD model for a BCSS with five factors that influence perceived persuasiveness which in turn influences intention to use. These are (i) primary task support; (ii) dialogue support; (iii) credibility support; (iv) design aesthetics; and (v) unobtrusiveness. We explore the first four factors in this empirical research study with children. The fifth factor pertaining to how developers can build a BCSS in such a manner that it will be unobtrusive with users’ primary tasks is an open research agenda that warrants a separate focus by itself [19].

For a social agent to bring about a behaviour change, long-term interaction with human subjects is a prerequisite. Several research studies have been conducted with social robots as long-term behaviour change agents. Billard et al. conducted a longitudinal study on the effect of an imitator robot, Robota, as an educative therapy platform for low-functioning children with autism in the age group of 5–7 years [20]. They found that children preferred and interacted more with Robota when it appeared simple-looking and plain than when it had detailed anthropomorphic features. Therefore, the authors conclude that a robot’s appearance plays a crucial role in how children respond to and engage with the robot. The authors also reported that Robota could elicit imitative behaviour in children over time. They observed increased social interaction skills (imitation, turn-taking, and role reversal) and communicative competence when the robot served as a salient object mediating joint attention with an adult.

Another study, conducted by Robins et al. with a minimalistic humanoid robot, KASPAR, and children with low-functioning autism, made similar observations about the interactional competencies of children with the robot [21]. The authors designed KASPAR with the awareness that the physical appearance of a robot needs to match the level of behavioural and structural complexity that autistic children can handle. They found that the children displayed remarkable signs of physical engagement, such as touch and gaze, with KASPAR and even with other unfamiliar adults co-present in the environment. The children also showed signs of awareness of the co-present adults’ perceptions by gazing at them whenever KASPAR performed certain actions, which they had never done before in their day-to-day activities.

In the context of education, two English-speaking robots behaved as tutors in a 9-day evaluation study [22]. The robots could recognize almost fifty different English words and could identify the children based on their RFID tags. At the end of the study, the children who interacted with these robots for more than a week showed significant improvement in their English-speaking skills.

To understand the extent to which Socially Assistive Robots (SAR) can help promote behaviour change and positive nutrition habits among children, Short et al. conducted a three-week-long study with the DragonBot robot and children aged 5-8 years [23]. The authors found that throughout the study children exhibited positive behaviour towards the DragonBot by promptly responding to the robot’s queries, maintaining longer engagement periods with the robot and displaying signs of high levels of enjoyment in the robot’s presence. Children also continued to make healthier food choices over time. By the end of the intervention, the authors found that the children had developed a good rapport with the robot, suggesting evidence of positive relationship-building in child–robot interaction.

In a related context, researchers studied the impact social robots have on dieting [24]. Their social robot, Autom, was capable of keeping a record of the weight loss journey of the participants while simultaneously interacting with them. Some participants maintained a paper log or a computer log to monitor their progress, while others depended on Autom. By the end of the study, the participants who interacted with the social robot during their dieting phase felt more encouraged to continue their weight loss journey, and they were therefore able to continue dieting and achieve their target weight loss, in contrast to the participants who did not interact with Autom. These studies suggest that social robots have great potential to interface with humans for extended periods and can act as active agents of behaviour change. They can effectively assist in learning new behaviours as tutors, play a supportive role in health interventions, and probably also serve as objective assessors of various indicators of health and hygiene.

2.2 Building on a Pilot Social Robot Intervention of Handwashing

In 2019, researchers from AMMACHI Labs, Amrita Vishwa Vidyapeetham, conducted a Wizard of Oz experiment among children in a rural Indian school with a social robot called Pepe [25, 26]. They leveraged the Hawthorne effect while designing the robot. The Hawthorne effect [27] refers to a type of reactivity in which individuals change their behaviour when they become aware that they are being observed. Research has shown that the presence of observers increases the frequency of handwashing [28]. In alignment with that, the handwashing intervention with the Pepe robot showed significant improvement (40%) in handwashing behaviour and technique. The quality of handwashing increased during the robot intervention, immediately after it, and six days after it, thereby showing a positive effect of the robot intervention on retaining good hand hygiene compliance. While this pilot study was short-term in nature, it validated the potential of such interventions and served as an inspiration for future studies that aim to influence handwashing behaviour change through a social robot.

2.3 Pertinent Aspects of Design of Social Robots for Behavioural Change

Turning to the design of our social robot, we laid down some boundary conditions early on. The intended functionality of promoting handwashing behaviour change in rural schools in developing countries like India meant that the design of the robot had to be minimal, that is, simple and low cost to allow for scalability. We therefore split the co-design process into two halves. In the first half, we let the children design their robots and gathered inputs from them on their designs, functionalities, and interactions. In the second half, we presented the children with eight caricatured anthropomorphic designs based on minimalistic principles and sought their input on our designs. We will now explain the considerations that went into creating the caricatured conceptual designs.

Fong et al. [29] stated that the morphology of a social robot must match its intended functionality. For our application, therefore, it is important that the robot can socially engage with children towards its intended function of promoting handwashing. They also stated that to develop this engagement, the robot must project a degree of humanness that the children will feel comfortable with and yet retain a degree of robotness so as not to create false expectations of the robot’s capabilities. Based on these considerations, we decided to adopt anthropomorphic caricatured forms for our conceptual designs.

2.4 Children as Co-designers

Our prior knowledge of objects in the world and interactions with those objects exists as mental models [30] that guide how we conceive of those objects and how we predict our interactions with them. When it comes to HRI and social robots, research has shown that even young children have mental models of how they expect robots to look and behave [31]. Many developers of robots ask parents and teachers what they think their children or students may need, rather than asking the children directly [32]. Sandoval et al. [33] conducted a study with a teleoperated anthropomorphic robot, POWI, and reported that children have a deep understanding of, and expectations for, robots in the future. Therefore, to design a successful social robot, we must know what children, as future users of the robot, would prefer regarding the designs we envisioned. In other words, as designers, we needed to know if our envisioned social robot morphologies would be able to produce an appropriate mental representation in the children about such technology and its use.

Additionally, research in social robots has shown that understanding the preferences of future users at an early stage of development is crucial for societal acceptance [34]. When children are involved in designing robots, they can play four main roles in the technology design process: user, tester, informant, and design partner, each bringing its own strengths and weaknesses to the design process [32]. As users, children contribute to the research by using the newly developed technology. As testers, children test prototypes of technology that have not been released to the world. This makes them feel empowered, and they participate actively because they perceive that adults value their opinions. As informants, children play an active part in the design process. They are motivated and challenged by the problem-solving and brainstorming experience during various technology design stages. As design partners, children are considered equal stakeholders in the design of new technologies throughout the entire design experience. As the involvement of children in the design process increases, so does their sense of empowerment. In the present study, the children played the role of informants.

Oliveria et al. discuss the co-design process of YOLO [35], a non-humanoid robot with interactive capabilities to boost creativity in children by engaging them in a storytelling activity. In their study, the children played the role of an informant to add social and creative behaviours to the robot. We adopted a similar approach for our study, with the children playing the role of an informant and providing inputs on the robot’s embodiment and its interactions. This approach was the easiest balance we could strike between maximising children’s contribution to the design process itself and working around limitations imposed by COVID-19 constraints, such as restrictions on face-to-face interactions with the children due to school closures.

In the field of human-machine interaction, Norman [36] postulated that mental models are naturally evolving models that need to be functional but need not be technically accurate. Based on the research done by Puerta-Melguizo et al. [37], and to understand children’s conception of the social robot as well as their expectations and needs of it, we designed this research study to collect inputs from the children even though the design of the system had not yet been specified. To remind the reader, the purpose of our co-design study was to find out children’s ideas about and preferences for the embodiment and interaction of a handwashing assistive agent in their school. Towards this, we asked the following research question:

According to children, what should a robot that can inspire children to handwash properly in their school look and sound like?

Fig. 1 Eight conceptual designs of caricatured robot morphologies

3 Methods

Breazeal stated that for human-style sociability, social robots have to be believable [15]. A caricatured embodiment can be perceived as believable even if it is not realistic. We took inspiration from Pixar and Disney, who created caricatured believable characters from everyday objects. We decided to base our robot’s morphology on practical objects that could suggest handwashing to the children, such as a soap dispenser, a human hand, and a water drop. We also included a design of a little boy and a little girl with ponytails resembling a cartoon character to see if children preferred human-like cartoon characters for the robot’s morphology.

Further, to understand if a zoomorphic caricatured design would affect children’s preference for the morphology of a handwashing robot, we added a caricatured cat design as well. The idea for including a zoomorphic design came from the previous study with Pepe [25] where the children suggested a cat’s face as an alternate design for the robot because of their familiarity with cats in their homes as naturally clean animals. We understand that by choosing to create these eight conceptual designs, we may have exhibited experimenter bias. Still, as the role of children is that of informants in our study, we wanted to take their responses and ideas into consideration and, at the same time, make the final decision about the prototype of the embodiment as designers, accounting for minimalism and reproducibility.

As stated earlier, to keep our conceptual designs minimal, we limited ourselves to designs for which a physical prototype could be made at minimum cost with off-the-shelf components. We also utilised form factors that could leverage fabrication technologies such as 3D printing for rapid prototyping. We drew inspiration from the work of [38] with the robotic face MiRAE and eliminated superfluous features such as eyelids and ears, keeping only the eyes, eyebrows, and mouth while designing the face in our conceptual designs. However, we chose a graphically rendered face for our caricatured believable characters instead of an actuated mechanical face to reduce the complexity of control and actuation while allowing for a wide range of expressions to depict the character’s personality. The eight conceptual designs we made (refer to Fig. 1) served as a foundation for co-designing our social robot with children, specifically in the second half of our study, as described in the section below.

Fig. 2 The smiley-meter used for the quantitative questionnaire

To reiterate, there are reasons to believe that a handwashing robot, co-designed with children, can be used effectively through long-term interventions to train children in handwashing hygiene. Co-designing with children helps to tap into children’s imagination of how a social robot should be, from their point of view [39]. While we have additional design constraints on the robot’s physical and technical features, involving children early in the design process and incorporating their ideas into our robot design can increase the acceptance of the robot. Also, by mapping the children’s feedback into the PSD model, we could understand which principal constructs and factors from the PSD model we need to incorporate into our final prototype design. As such, the twenty-eight design principles of the PSD model are regarded as central ideas in studies on persuasion, nudging, and influence. Even so, the model does not mandate that all the design principles be incorporated into a BCSS as increased elaboration could lead to decreased overall persuasiveness [19]. Thus, co-designing allowed us to both build acceptance for the robot design and to understand children’s perspectives on the robot design to take appropriate design decisions to maximise its persuasiveness. The present study reports on how we arrived at the recommendations for a minimalistic social robot design. We now turn to our study’s design.

To answer our research question, we adopted a convergent mixed-methods design approach wherein we analysed and interpreted both qualitative interview data and quantitative questionnaire data [40]. For the quantitative questions, we used a smiley-meter with a 5-point Likert scale, as shown in Fig. 2. We partly developed the smiley-meter using assets from the Twitter Open-Source emoji set [41].
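To illustrate how smiley-meter responses can be turned into numbers for analysis, the following is a minimal sketch. The five face labels and the 1 to 5 coding direction are assumptions made for illustration; the text does not specify the wording attached to each face.

```python
from statistics import mean

# Hypothetical labels for the five smiley faces (assumption: the study's
# actual labels are not given). Higher score = more positive response.
SMILEY_TO_SCORE = {
    "very sad": 1,
    "sad": 2,
    "neutral": 3,
    "happy": 4,
    "very happy": 5,
}


def code_responses(responses):
    """Convert a list of smiley labels into 1-5 Likert scores."""
    return [SMILEY_TO_SCORE[label] for label in responses]


def item_mean(responses):
    """Mean Likert score for a single questionnaire item."""
    return mean(code_responses(responses))


# Illustrative (made-up) responses from four children to one item.
example = ["happy", "very happy", "neutral", "very happy"]
print(code_responses(example))
print(item_mean(example))
```

Coding each face as an integer keeps the quantitative strand simple to summarise per item, while the interview data are analysed separately.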

Fig. 3 The research process to understand children’s mental model for a social robot to promote handwashing behaviour

We divided our research study into the following four phases. In the first phase, we tried to determine the children’s expectations regarding the robot’s appearance. For this, we selected the ‘appearance’ dimension of the multi-dimensional robot attitude scale [42] and adapted it for this phase. In the second phase, we tried to ascertain the mental models that the children had about their future interactions with a robot designed to promote handwashing behaviour. We did this in terms of how they thought other children would interact with the robot and how the robot would respond to the children. Towards this, we asked the children to create an abstract 2D or 3D mockup design representing their ideas of a robot for promoting handwashing using paper cutouts and then discussed their robot’s physical and interactional features. In the third phase, we tested our conceptual designs for a minimalistic social robot, forming the foundation for this study phase. For this, as discussed earlier, we presented animated versions of the eight conceptual designs shown in Fig. 1. Each animation was 10-12 seconds long and showed three emotions for each embodiment, happy, sad, and angry, without any sound (refer to Online Resource 1). In the animations, only the facial expressions were changed, and the robot’s body was static. We wanted to see how our conceptual designs aligned with children’s mental models. To conduct this test, we created a questionnaire using the likeability dimension of the widely used standard, the Godspeed scale. In the fourth and final phase, we asked the children to rank the conceptual designs based on the likeability dimension. Figure 3 shows the research process we followed. We will now describe each phase of our study in detail.

3.1 Phase 1: ‘appearance’ Dimension of Multi-dimensional Robot Attitude Scale

Our study was conducted online with children of primary school age. Young children in this age group typically have short attention spans. Since they were connecting online from their individual homes, it was challenging to control everyday distractions around them during the study. Hence, to seek inputs from the children on the robot’s morphology, we decided to restrict our questions to the ‘appearance’ dimension of the Multi-Dimensional Robot Attitude scale. HCI research states that how agents are visually perceived predicts users’ satisfaction, pleasure and enjoyment levels. In other words, design aesthetics significantly impacts the perceived use of persuasive systems [18]. The first five items of the ‘appearance’ dimension of the Multi-Dimensional Robot Attitude scale allowed us to explore the ‘design aesthetics’ factor for the BCSS at a deeper level.

The original Multi-Dimensional Robot Attitude scale reflected viewpoints of people who did not have sufficient experience in interacting with real robots but were expecting future experiences with real robots. Previous research we carried out with children has shown us that most primary and middle school children in rural and semi-urban schools in India have no prior experience in either working with or interacting with robots [25, 43]. Therefore, our participant demographic characteristics were similar to Ninomiya’s [42] research study. We dropped the question which said “I think the shape of a handwashing robot should have roundness.” because there was some ambiguity in the children’s minds about this statement—they could not understand what a rounded robot design meant. We also added a question “I think the voice of a handwashing robot should be like the voice of a child who is less than ten years old” to understand if they preferred a child’s vocalisation over an adult vocalisation for the robot. Table 1 presents the items included in this phase.

3.2 Phase 2: Handwashing Robot Buddy Design

We designed this phase to be an activity-based hands-on session. We began this phase by briefing the children about the activity. We also answered any questions children had about the activity itself. Next was the design activity step. For this, we requested parents to provide the children with pre-cut pieces of card stock paper in various shapes such as oval, rectangle, square, circle, semi-circle, triangle, diamond, star, and bubble shapes. We asked the children to create their own handwashing robot buddy. We asked them to draw features such as eyes, nose, ears, eyelashes, whiskers, and others on the shapes if they wanted to do so. We also encouraged the children to develop more than one robot design if they wanted to do so. The activity ended with a presenting and wrapping up step where the children presented their designs.

Inspired by the participatory design approach adopted by [44], for this section, we asked the children four prompting interview questions organised into three themes, namely Actions, Output, and Communication:

  1. Actions: What actions will children interacting with your robot do that affects your robot’s response?

  2. Output: How does your robot react to those actions?

  3. Communication:

    (a) What do you want to communicate to children through your robot?

    (b) How should children feel or respond when they interact with your robot?

Table 1 Items based on ‘appearance’ dimension of multi-dimensional robot attitude scale
Table 2 Items asked about each robot animation

Finally, we asked the children to describe the embodiment of the robot they designed. The open-ended measures for this phase permitted analysis of the PSD model factors brought up during the children’s interviews. The prompting questions were designed to elicit responses primarily for ‘primary task support’ and ‘dialogue support’ factors of the PSD model. The interviews were conducted in four languages: English, Hindi, Telugu, and Malayalam. We translated the interview questions from English to the Telugu language with the help of two native translators. The different translation transcripts were then compared to develop the best-translated version. Then with the help of two other native translators, we independently back-translated the questions to English and compared these new ones with the original questions. This exact process was repeated for Malayalam and Hindi languages.

3.3 Phase 3: Animations of Caricatured Robot Embodiments

In this phase, the questionnaire asked the children to rate the animation of each of the eight conceptual designs shown in Fig. 1 on the four items listed in Table 2. Both in this phase and the next phase, we explored the trustworthiness construct from the ‘credibility support’ factor of the PSD model. For this study, we operationalized credibility as a function of two constructs: likeability and trustworthiness. Other factors such as expertise and enthusiasm can be more easily explored after the children have a chance to interact with a BCSS. It is somewhat intuitive that people are more persuaded by a message coming from a likeable and trustworthy source. In exercise psychology research, when participants liked the exercise leader, they were more motivated to continue to attend classes of the exercise program [45]. In his book, Perloff stated that “just being likeable can help a communicator achieve his or her goals” [46]. Thus, to explore likeability, we adapted the likeability scale of the Godspeed model III [47] for our questionnaire. We took three out of the five items from the likeability scale to keep survey fatigue low. To understand children’s preferences on the trustworthiness of the robot, we added one item: “I will listen to the robot if it asks me to handwash properly”. The children were asked to rate each item on a 5-point Likert scale with a smiley-meter, as for the questions in Sect. 3.1.
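As a rough illustration of how the four items per animation could be summarised, the sketch below averages the three likeability items per child and reports the single trustworthiness item separately. This is only one plausible summary under stated assumptions: the text does not give a formula for combining the two constructs into a credibility score, so none is invented here, and the function name and sample ratings are hypothetical.

```python
from statistics import mean


def summarise_design(ratings):
    """Summarise ratings for one robot animation.

    ratings: list of per-child tuples (like1, like2, like3, trust),
    each value a 1-5 Likert score. The tuple layout is an assumption.
    """
    # Likeability: average each child's three likeability items,
    # then average across children.
    likeability = mean(mean(child[:3]) for child in ratings)
    # Trustworthiness: the single added item, averaged across children.
    trustworthiness = mean(child[3] for child in ratings)
    return {"likeability": likeability, "trustworthiness": trustworthiness}


# Made-up ratings from two children for one animation.
sample = [(5, 4, 3, 4), (4, 4, 4, 5)]
summary = summarise_design(sample)
print(summary)
```

Keeping the two constructs as separate scores avoids assuming any particular weighting between likeability and trustworthiness.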

3.4 Phase 4: Ranking the Caricatured Robot Embodiments

In this final phase, we asked the children to rank the eight caricatured embodiment designs we showed in the previous phase and answer the questions in Table 3.

Table 3 Items for rating the eight caricatured robot animations
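One simple way to combine individual rankings into a group-level ordering is the mean rank per design, sketched below. The text does not state how the rankings were aggregated, so the method is an assumption, and the design names and rankings are shortened, made-up examples.

```python
from collections import defaultdict


def mean_ranks(rankings):
    """Aggregate rankings by mean rank position (lower is better).

    rankings: list of lists; each inner list orders the same design
    names from most preferred to least preferred.
    """
    totals = defaultdict(int)
    for order in rankings:
        for position, design in enumerate(order, start=1):
            totals[design] += position
    n = len(rankings)
    # Sort designs from best (lowest mean rank) to worst.
    return sorted(((d, t / n) for d, t in totals.items()), key=lambda x: x[1])


# Made-up rankings from three children over three hypothetical designs.
rankings = [
    ["cat", "drop", "hand"],
    ["drop", "cat", "hand"],
    ["cat", "hand", "drop"],
]
result = mean_ranks(rankings)
print(result)
```

Mean rank is easy to explain, but it treats the gaps between rank positions as equal, which may not hold for children's preferences.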

4 Participants

The study was conducted with 39 children, 22 boys and 17 girls, between the ages of 5 and 10 years, who were selected by convenience sampling. Our choice of age group was informed by prior studies on habit formation, which state that children establish a firm learning habit by the age of 9–10 years [48, 49]. The participants are currently studying in schools in five different geographical locations in India, namely, Puthiyakavu, Thalassery, Tirupati, New Delhi, and Hyderabad, as shown in the map in Fig. 4.

Fig. 4 Geographical locations of the schools from where data was collected

Fig. 5 A face-to-face session with children before COVID-19 lockdown and an online session after COVID-19 lockdown

The initial sample consisted of 40 participants. After the first phase of the study, one participant opted out and another participant from a rural area in India left due to repeated technical issues he faced during the interview (poor network connectivity). The geographical locations covered both northern and southern regions of India, with most participants (n = 31) from the southern regions. Out of the total 39 participants, 18 were from Puthiyakavu, a rural village in Kerala; 6 were from Thalassery, a semi-urban region in Kerala; 6 were from Tirupati, an urban region in Andhra Pradesh; 8 were from New Delhi, the metropolitan (urban) capital of India; and 1 was from Hyderabad, a metropolitan (urban) city in Telangana. Thus, our participant sample was weighted more towards rural areas, with 24 participants versus 15 participants from the urban regions. We weighted our sample towards rural areas because awareness of basic sanitation and hygiene practices continues to be low among the rural population [50], causing more illnesses in these areas than in urban regions of India.
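The reported sample breakdown can be checked with a short tally. The sketch below uses the per-location counts from the text; grouping semi-urban Thalassery with the rural participants is an inference from the reported totals (18 + 6 = 24), not something the text states explicitly.

```python
# Per-location counts as reported in the text. The area and region
# labels are assigned here for the tally; treating semi-urban
# Thalassery as "rural" is an assumption inferred from the totals.
participants = {
    "Puthiyakavu": (18, "rural", "south"),
    "Thalassery": (6, "rural", "south"),
    "Tirupati": (6, "urban", "south"),
    "New Delhi": (8, "urban", "north"),
    "Hyderabad": (1, "urban", "south"),
}

total = sum(count for count, _, _ in participants.values())
rural = sum(c for c, area, _ in participants.values() if area == "rural")
urban = sum(c for c, area, _ in participants.values() if area == "urban")
south = sum(c for c, _, region in participants.values() if region == "south")

print(total, rural, urban, south)
```

Under this grouping the counts reproduce the reported figures of 39 participants in total, 24 rural, 15 urban, and 31 from the southern regions.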

As the participants hailed from various communities and regions of India, we translated the questionnaire into their native languages to ensure that they understood it, as we described earlier in the Methods Section. For the first six participants, we conducted a face-to-face session, as seen in Fig. 5. Then the second wave of the COVID-19 pandemic hit India, and we had to conduct individual online sessions with all the remaining participants. All the participants from the rural regions in our study had access to smartphones, as they were using them to attend online classes during the pandemic, and thus could participate in our online study. We acknowledge that this is a limitation of this research study because we had no way of reaching the lowest rung of the economic population in the rural areas, who typically do not own a smartphone.

Table 4 Frequency of children’s expectations about the appearance of the robot

The study was undertaken with the consent of the participants and of their parents or guardians. Confidentiality and anonymity of the data collected from the participants were maintained throughout the study. All the children’s interviews were recorded and transcribed.

5 Results and Discussion

We now present data for all four phases.

5.1 Phase 1: Children’s Expectations of Robot’s Appearance

To remind the reader, this phase determined the children’s expectations regarding the robot’s appearance.

Of the 40 children who answered questions on the ‘appearance’ dimension of the multi-dimensional robot attitude scale, 37 said they knew what a robot was. Nine children among the 40 had used robots as toys before. One child said he had spoken to Siri, the voice assistant on an Apple iPhone, which he considered a robot. The rest of the children had never spoken to a robot before.

As shown in Table 4, most children expressed expectations of attractiveness for the robot’s appearance by stating that the robot should be cute, cool and beautiful. Since the average attractiveness score was very high (4.3 out of 5) we can infer that visual ‘design aesthetics’ was an important design consideration for the children. Also, based on attractiveness research, the highly positive reactions from the children suggest anticipated bonding and attachment with the robot [51, 52].

To explore whether children preferred anthropomorphic or zoomorphic appearance, we performed a Wilcoxon signed-rank test between the items “I think handwashing robots should have animal-like shapes” and “I think handwashing robots should have human-like shapes”. The test was statistically significant (Z = 2.037, \(p = 0.041\)), which indicates that children preferred an anthropomorphic design over a zoomorphic design for the robot. This result is further validated in the next phase, where we asked the children to design a handwashing buddy robot and describe their design. Most children made anthropomorphic designs with body features such as torso, hands, and arms and facial features such as the face, eyes, head, and mouth. However, this result is also interesting because when we present the results of 5.4, the reader will see that most children selected the caricatured animation of a cat for the robot embodiment from our conceptual designs. We will discuss this dichotomy further in 5.5.
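As a rough illustration of the paired comparison described above, the following sketch computes the Wilcoxon signed-rank statistics and a Z value from the normal approximation. The ratings are hypothetical and are not the study data, the function names are our own, and no tie or continuity correction is applied, so the output is not expected to reproduce the reported Z.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, z) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd  # normal approximation, no tie correction
    return w_plus, w_minus, z

# Hypothetical 5-point ratings from eight children (illustration only).
human_like = [5, 4, 5, 3, 4, 5, 2, 4]
animal_like = [3, 4, 2, 4, 1, 3, 1, 2]
w_plus, w_minus, z = wilcoxon_signed_rank(human_like, animal_like)
print(w_plus, w_minus, round(z, 3))
```

A large positive Z in this sketch would indicate that the human-like item tends to be rated higher than the animal-like item for the same child.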

A Spearman’s rank-order correlation was run to determine the relationship between the preferred shape of the robot (human-like/animal-like) and the preferred voice of the robot (living creature/machine-like). We found a strong, positive correlation between the preferred voice and animal-like shape, which was statistically significant (\(r_s(8) = -0.486\), \(p = 0.0014\)). Similarly, there was also a strong, positive correlation between the preferred voice and human-like shape, which was statistically significant (\(r_s(8) = 0.360\), \(p = 0.0224\)). From this, we can say that for both the robot designs (zoomorphic and anthropomorphic), children preferred the robot to have a voice of a living creature.
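For readers unfamiliar with the rank-order correlation used above, the sketch below computes Spearman’s coefficient as the Pearson correlation of average ranks. The function names and the example scores are hypothetical and serve only to show the mechanics; they are not the study data.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)

# Hypothetical 5-point scores for "shape" and "voice" items (illustration only).
shape_score = [1, 2, 3, 4, 5]
voice_score = [2, 4, 6, 8, 10]
print(spearman_rho(shape_score, voice_score))
```

The sign of the coefficient shows the direction of the monotonic relationship: values near +1 mean the two items rise together, and values near -1 mean one rises as the other falls.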

5.2 Phase 2: Children’s Design of a Handwashing Robot Buddy

This phase was the qualitative part of our study. It was designed to determine the nature of children’s mental models about their future interactions with a robot that promotes handwashing behaviour. As stated earlier, of the 40 children who participated in Phase 1, two participants dropped out in this phase.

The parents had prepared, a priori, the cut-outs of basic shapes such as squares, circles, and triangles that we had requested. Using these pieces, the children made their robots. We observed that the time taken by the children to create their robot embodiment was independent of age: a few younger children finished creating their embodiment within 10 minutes while a few older children requested more time, up to 20–25 minutes, and vice versa. A few children used all nine shapes for their design, while a few others below the age of 6 used only 3–4 shapes and created relatively simple robot designs. Two of the participants wanted to make more than one design. Figure 6 shows some of the designs the children came up with. We asked the children to talk about their robot design and asked them the four prompting interview questions pertaining to the robot’s actions, output, and communication. Through this activity, we attempted to capture and summarize the latent needs of the children. We now discuss the qualitative analysis performed on the children’s responses to the interview questions.

Fig. 6 A sample of the robot designs made by the children

Table 5 Themes and categories emerging from children’s responses with its frequency count
Table 6 Themes and categories that emerged from children’s responses with its frequency count, continued

We added all the responses to a database and assigned a numerical identifier to each participant to anonymize and transcribe their responses. We adopted an inductive approach and carried out the analysis after completing all interviews. Two members of the research team independently constructed preliminary open codes from the interview responses of the first two participants; these included both descriptive codes and in vivo codes [53]. By codes, we mean distilled units of text that capture the exact meaning of the source text. The research team met to discuss the codes and condensed them into larger categories using axial coding to allow for the grouping of the raw data. After this, the researchers looked at the responses a second time and coded based on the research questions, focusing specifically on the morphology and interactions the children intended for their handwashing robot. The author team again reviewed the codes created, and all disagreements in preliminary coding were resolved. Then, two team members worked independently to code the remaining participants’ responses based on the initial categories, adding new codes to the preliminary list when required. The author team met again to combine categories into more abstract themes using the process of selective coding. We used Cohen’s kappa to establish inter-rater reliability, which was high between the coders (\(\kappa = 0.907\)). Based on children’s needs for the robot, their responses were categorized into three themes, namely:

  1. Children’s needs for robot’s communication

  2. Children’s needs for emotional intelligence in the robot

  3. Children’s needs for robot’s appearance and functionality
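The inter-rater reliability check mentioned above can be illustrated with a minimal sketch of Cohen’s kappa for two coders. The labels and the six coded responses below are hypothetical and are not taken from the study’s codebook.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    n = len(coder_a)
    # Observed proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical theme labels assigned by two coders to six responses.
coder_a = ["communication", "emotion", "appearance",
           "appearance", "communication", "emotion"]
coder_b = ["communication", "emotion", "appearance",
           "communication", "communication", "emotion"]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))  # 0.75
```

Here the coders agree on five of six responses (0.833), chance agreement is 0.333, and kappa is therefore 0.75. Kappa is a chance-corrected coefficient, so it is not the same quantity as the raw percentage of agreement.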

There were 375 responses from the sample of 38 children, categorized into 19 categories and then grouped into three themes. Each of the responses was also assigned a PSD principal construct that best reflected the response given by the children. These constructs were then grouped as per the categories. Table 5 and Table 6 show the categories, themes and the PSD principal constructs assigned for the children’s responses.

We now discuss the data from the three themes.

  1. Theme 1 - “Children’s need for robot’s communication”: This theme concerns the conversational AI aspect, which is an integral part of a robot’s communication. Based on the four categories that emerged from the data (refer to Table 5), the authors concluded that the participants preferred an expressive and highly interactive robot that is capable of encouraging children to practise good habits such as handwashing and of playing games with them. The categories “Can engage in pleasing conversations”, “Encourages behaviour change for wellness” and “Can listen and respond”, which have the highest frequency counts, suggest that the robot has to incorporate a conversational AI that is capable of listening to the children, entertaining them, and ultimately teaching good habits.

  2. Theme 2 - “Children’s need for emotional intelligence in the robot”: This theme concerns the requirements stated by the children for the robot’s emotional intelligence, which is its ability to perceive and understand different emotions (refer to Table 5). The first two categories, namely “Expresses emotions such as happiness, anger, and pain” and “Can detect human emotions”, suggest that participants prefer a robot with the intelligence to display different emotions and to detect the participant’s emotions in order to respond appropriately. For the third category, “Provides empathetic and compassionate responses”, a few of the participants’ responses, such as “If the kids are sad, it will make them laugh, and if they are very happy the robot will also feel happy”, inform us that there is an expectation that the robot should be capable of first recognising emotions and then mimicking compassionate behaviour.

  3. Theme 3 - “Children’s need for robot’s appearance and functionality”: This theme concerns children’s need for the robot’s appearance and functionality, that is, its embodiment design. After analysing the participants’ data, we came up with 12 categories that best suit the theme (refer to Table 6). Based on the data analysis, we deduced that the participants prefer an attractive anthropomorphic robot design capable of detecting a person’s presence. The participants also shared that the robot should be capable of moving from one place to another and should have a speaker and microphone installed to communicate with the children. From a few participant responses, such as “if a child touches the robot, it will not say anything”, we infer that touch detection sensors should also be a part of the robot’s design. The codes “Can promote handwashing”, “Has hands” and “Has an attractive robot design” have the highest frequency counts, which suggests that the children thought that the robot should have hands that could carry sanitisers and soap dispensers and should use hand gestures to demonstrate how to handwash.

The children also wanted the robot to perform various other functions, such as cooking, washing clothes, cleaning the house, helping the children with their lessons, and assisting elders. We grouped all of these within the code ‘Multiple functionalities’. Since the main objective of our paper is to develop a social robot to promote handwashing behaviour, adding other functionalities to the robot is left for future work.

Based on the data presented in Table 5 and Table 6, out of the twenty-eight principal constructs of the PSD model, a total of nine principal constructs were identified from the responses given by the children. They are:

  • Primary Task Support: Tunnelling, personalisation and tailoring.

  • Dialogue Support: Suggestion, reminders, social role and praise.

  • Credibility Support: Expertise.

  • Design Aesthetics

We will further discuss the above-identified nine principal constructs in Sect. 5.5.

5.3 Phase 3: Repeated-Measures ANOVA

This phase was the quantitative part of our study. Here, we present the data from the 33 participants who took part in this phase individually. We excluded the data collected from six children in the offline study because they completed it together as a group and not individually. In this section, we compared the responses given by children on the conceptual designs of caricatured robot morphologies (shown in Fig. 1) in terms of the two factors we grouped them into, namely likeability and trustworthiness.

We created a composite for likeability from the three items “I think the robot is friendly”, “I think the robot is nice”, and “I love this robot” and performed a one-way repeated-measures ANOVA on the responses. We define the null hypothesis as follows: there is no significant difference between the means of each animation in terms of children’s perception of likeability. We used \(\alpha = 0.05\) for the ANOVA. We used Levene’s test to investigate homogeneity of variances in the likeability composite. For trustworthiness, since the data was ordinal, we performed Friedman’s test on the responses. We define the null hypothesis as follows: there is no significant difference between the mean ranks of each animation in terms of children’s perception of trustworthiness. We used \(\alpha = 0.05\) for Friedman’s test.

  • Likeability: Levene’s test was not statistically significant, and hence the variance in the data was homogeneous. A one-way repeated-measures ANOVA was then conducted to determine whether there were any differences in likeability between the different robot morphologies. The results showed that there was no statistically significant difference in likeability between the different robot morphologies.

  • Trustworthiness (“I will listen to the robot if it asks me to handwash properly”): Friedman’s test showed that there was no statistically significant difference in the perception of trustworthiness among the animations.
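To make the Friedman analysis of the ordinal trustworthiness item more concrete, the following sketch computes the Friedman statistic from within-participant ranks. It is an assumption-laden illustration: the ratings are hypothetical, only three designs are used instead of the eight in the study, and no tie correction is applied. With eight animations the statistic would be compared against a chi-square distribution with 7 degrees of freedom.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def friedman_statistic(ratings):
    """ratings: one row per participant, one score per condition (same order)."""
    n = len(ratings)      # number of participants
    k = len(ratings[0])   # number of conditions (robot designs)
    rank_sums = [0.0] * k
    for row in ratings:
        # Scores are ranked within each participant, not across participants.
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3 * n * (k + 1)

# Chi-square critical value for df = k - 1 = 2 at alpha = 0.05.
CHI2_CRIT_DF2_ALPHA05 = 5.991

# Hypothetical ratings from five children for three designs (illustration only).
ratings = [[4, 3, 5], [2, 2, 4], [3, 1, 5], [4, 3, 4], [1, 2, 5]]
q = friedman_statistic(ratings)
reject_null = q > CHI2_CRIT_DF2_ALPHA05
print(round(q, 3), reject_null)
```

With so few hypothetical participants the chi-square approximation is rough; the example only shows how the within-child rankings feed into the test statistic.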

5.4 Phase 4: Ranking of the Conceptual Designs

In this final section, based on the children’s ranking of the eight caricatured embodiment designs (refer to Fig. 1), we constructed a bubble chart, shown in Fig. 7, to represent the frequency count for each animation, that is, the number of times the participants chose the animation for the various items from the likeability and trustworthiness factors listed in Table 3. Here too, we present the data from 33 participants, as explained in Phase 3. Two participants did not choose any of the eight caricatured robot embodiments as an answer, stating that they did not find them likeable or trustworthy. From Fig. 7, we can see that R8 (cat) is the embodiment most liked by the participants and R1 (soap dispenser) is the embodiment the children trusted the most.

Fig. 7 Children’s ranking of the eight caricatured robot animations for the likeability and trustworthiness factors

5.5 Summary

As stated in Sect. 5.1, it is interesting to note the dichotomy between the results on likeability in the first half and the second half of the study. In 5.1, the children stated that they preferred an anthropomorphic design significantly over a zoomorphic design. In 5.2, most children made their robot designs with a humanoid morphology. This contrasts with the ranking results from 5.4, where the zoomorphic cat design was the embodiment the children liked most, even though 5.3 found no statistically significant difference in likeability between the morphologies. We believe that the children’s preference for the robot’s morphology altered because they found the cat design more attractive than the humanoid design. This interpretation is supported by results from 5.1 and 5.2, where the children stated that the robot’s attractiveness was an important design consideration for them. Hence, our results suggest that visual ‘design aesthetics’, or attractiveness, is a stronger predictor of likeability than the aesthetic form (anthropomorphic, zoomorphic or caricatured) of the robot.

Fig. 8 Prototype design: a block diagram of its hardware and software components

Coming to the factors affecting perceived persuasiveness, in 5.2, we presented nine principal constructs and four factors that we identified from the children’s responses. To elaborate on them, the principal constructs belonging to the ‘primary task support’ factor are responsible for helping the user to carry out their primary tasks. Based on the results, children primarily wanted a robot that would guide them towards their end goal (tunnelling) and that can be tailored and personalised according to individual children’s needs (tailoring, personalisation). This means that the children wanted the robot to remember them and to interact with them using different content based on their emotional state, and using different content for different user groups such as the elderly and children.

The principal constructs belonging to the ‘dialogue support’ factor help provide verbal and non-verbal user feedback and motivate the user to achieve their target behaviour. From the results, we can see that children preferred a robot that could provide suggestions and nudge them towards the target behaviour (suggestions), provide timely reminders until their goal has been reached (reminders), adopt a specific social role such as a teacher or a mentor (social role), and provide verbal and non-verbal praise to the children whenever they achieve specific set targets (praise).

The principal constructs under ‘credibility support’ help us identify how the robot must be designed to increase its credibility. From the results, children wanted a robot that is knowledgeable in the topics it has been designed for (expertise) and in other domains, such as helping with their homework and assisting the elderly with their medicines.

Given these findings on persuasive design, in the next section, we summarise the key implications for the prototype design of a social robot to promote handwashing behaviour among primary school children.

6 Design Implications for a Minimalistic Design Robot Prototype for Long-term Handwashing Behaviour Change

As we stated in the previous sections, the purpose of our study was to co-design a minimalistic social robot to promote handwashing behaviour with children as informants. We stress minimalism to allow for large-scale deployments in developing countries like India, where the problem of handwashing is more acute than in other developing countries. We aim to keep the social robot affordable at approximately INR 50,000, or under USD 700. For this, we have to make several trade-offs on design choices to create a minimalistic yet functionally capable robot.

Based on the feedback and results from the children and keeping in mind children’s expectations of the robot’s behaviour, we will present a high-level overview of an ideal prototype design for the social robot as an embodied AI, as shown in Fig. 8. The physical prototype requirements are split into three categories, namely: input components, output components, and processing units.

Appearance. As stated previously, most children agreed that the robot should appear cute and beautiful. To make the robot’s appearance cuter, we suggest utilizing the baby schema while designing the rendered face of the robot. This means that, relative to the face, the eyes could be made bigger, the eyebrows higher, and the mouth smaller [54]. Prior research by Zebrowitz and Montepare [55] indicated that the impressions created by baby faces are attributed to traits such as being child-like, warm, and honest, and hence such faces are perceived as being trustworthy.

Due to the popularity of a humanoid robot with hands and legs in children’s designs, we suggest adding simple penguin-like flipper arms and avoiding legs for the robot. In a previous study, Li and Chignell [56] found that even simple arm and head movements suggested greater life-likeness for a robot. At the same time, adding legs that can be used in meaningful ways would increase design complexity and hence the overall cost, while adding legs merely as accessories can lead to unmet expectations that can in turn lead to frustration and adversely affect interactions. Simple arms without legs therefore balance cost and functionality. We also suggest keeping the robot’s appearance caricatured to avoid the uncanny valley effect.

In Fig. 9, we present our interpretation of one possible mechanical design of the robot. The robot has a head with two rotational degrees of freedom (DoF), which enable it to direct its gaze at salient objects and faces in the environment. A detailed view of this two-DoF pan-tilt mechanism and an exploded view of the prototype can be seen in Fig. 10.

Fig. 9 3-D model of the social robot

Fig. 10 (Left) Pan-tilt mechanism in the robot; (Right) individual components of the robot

Interaction capabilities. Our results showed that most children wanted the robot to be highly interactive and responsive. To fulfil the participants’ requirements, we suggest that the robot use conversational AI frameworks such as Riva [57] and RASA [58], which are application frameworks for building multi-modal conversational AI services.

Emotional intelligence. Since the children wanted the robot to be empathetic and compassionate, emotional intelligence is an essential functionality for the robot. In order to set an atmosphere that is conducive for children to learn, the robot’s behaviour model should be able to infer and interpret the child’s emotional state. We recommend augmenting the interaction abilities of the robot with affective computing elements so that the robot’s actions and behaviour are appropriate for the child’s emotional state. As emotion can be expressed through verbal modalities (emotion-driven speech) and non-verbal modalities (facial expressions, body movements), we suggest that these modalities be incorporated into the robot’s design by adopting various machine learning techniques [59, 60]. Further, by combining emotion recognition with the conversational AI framework through emotion recognition in conversation (ERC) algorithms, the robot can become capable of generating emotion-aware dialogues. We also recommend programming multiple interaction scenarios between the robot and children, such as playing musical notes, playing games, and listening and responding to children’s unrestrained conversations.

Persuasion capabilities. Besides engaging in natural conversations, the children expressed that the robot should be able to engage them in goal-directed conversations and provide reminders and praise. Therefore, we suggest that the robot have a built-in knowledge base from which it can provide appropriate responses that reinforce positive behaviours or suggest improvements. We recommend designing the robot with a persuasive model that contains linguistic strategies and vocabularies for generating the necessary feedback. The persuasive model should be designed to interface with the conversational AI to provide appropriate responses.

Regarding the robot’s voice, the children wanted the robot to have a human-like voice. However, the children were divided about their preferences for a child-like voice for the robot. This contrasts with the study by Sandygulova [61] wherein the children overwhelmingly preferred robots to have child-like voices during child–robot interaction. Our recommendation is to design a robot with the voice of a human child and keep the voice gender-neutral to avoid gender-related bias.

Mobility. Although many children preferred that the robot be able to move from one place to another, we posit that a stationary robot would fulfil the task of teaching handwashing behaviour equally well. Adding mobility to the robot’s design would again add design complications and increase the cost of the robot; hence, we do not foresee the need for a mobile robot design for this functionality.

We understand that, as designers, we need to make principled decisions about the extent to which the suggestions and ideas given by the children in this study are compatible with our evolving conceptual designs [62]. As children play the role of informants in our research, we took the technologically feasible suggestions with minimal component cost and incorporated them into our design recommendation. We also recognise that there is a mismatch between our proposed prototype design and children’s mental models in some areas, such as the robot’s mobility and its appearance without legs. Therefore, we must take appropriate measures during acceptance studies to build acceptance for this robot designed with minimalistic principles to promote the intended functionality.

The following three sections will detail the physical prototype requirements, namely the input, output, and processor components.

6.1 Input Components for the Social Robot

Below we provide the list of components and their description that can serve as input for the social robot.

6.1.1 Microphone

Hall [63] introduced proxemics as a category of human-human non-verbal communication and described four interpersonal distance zones: intimate, personal, social, and public distance. According to Human–Robot Interaction (HRI) studies, the robot must exhibit appropriate proxemic behaviour for seamless interactions [64]. Since this robot has to operate in a community setting, the social distance zone was chosen as the desired interpersonal distance zone. The social distance zone has a close phase of 1.2 m to 2.1 m. Normal human speech levels are at 60 dB [65]. A suitable microphone has to be selected based on these criteria.
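As a rough aid for the microphone selection criteria above, the sketch below estimates how the speech level falls off across the close phase of the social distance zone. It assumes a point source in a free field (inverse-square law, about 6 dB loss per doubling of distance) and assumes that the 60 dB figure is measured at 1 m; both are our assumptions and not stated in the text, and real rooms with reflections and background noise will differ.

```python
import math

def level_at_distance(level_ref_db, ref_distance_m, distance_m):
    """Free-field sound level at distance_m, given the level at ref_distance_m."""
    return level_ref_db - 20 * math.log10(distance_m / ref_distance_m)

SPEECH_LEVEL_DB = 60.0   # normal speech level from the text
REF_DISTANCE_M = 1.0     # assumed measurement distance (hypothetical)

near = level_at_distance(SPEECH_LEVEL_DB, REF_DISTANCE_M, 1.2)
far = level_at_distance(SPEECH_LEVEL_DB, REF_DISTANCE_M, 2.1)
print(f"Estimated level at 1.2 m: {near:.1f} dB")
print(f"Estimated level at 2.1 m: {far:.1f} dB")
```

Under these assumptions the microphone would need to capture speech in the mid-50s dB range at the far edge of the zone; the same relation can be applied in reverse when sizing the speaker.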

6.1.2 Touch Sensors

A significant number of children mentioned that the robot should be able to detect the presence of a person when they come near it and should be able to react if anyone touches it. We recommend incorporating a touch sensor into the robot’s head and torso shell based on these ideas. We also recommend using the camera for proximity detection.

6.1.3 Camera

We recommend fixing two cameras, one on the forehead of the robot and a second one above the sink. The camera on the forehead contributes to the robot’s ability to perceive children’s presence, while the camera over the sink can monitor the handwashing steps.

6.2 Output Components for the Social Robot

In this section, we present the components that serve as the output for the social robot.

6.2.1 LCD Screen

As discussed in the results, children preferred a robot with high emotional expressiveness. For this, a moulded, rigid face would not be ideal, and a mechanically actuated expressive face would be costly and complex. To avoid the drawbacks of these two alternatives, we propose an LCD display for rendering the robot’s face, as it offers the flexibility required for emotional expressiveness. We also suggest that the robot use the necessary gaze types (mutual, referential, or joint), as described by Admoni and Scassellati [66], to increase engagement with the children.

6.2.2 Speakers

Children expected the robot to play music and converse with them, which requires a speaker. As stated earlier, the social distance zone has a close phase of 1.2 to 2.1 m, based on which a suitable speaker has to be selected so that the robot’s voice is audible to the children.

6.2.3 Motor (Motion Actuators)

As the children wanted the robot to appear and behave like a human, mimicking human head motion by varying the speed of head motions, from slow to fast and fast to slow, can give a sense of friendliness and natural appearance [67]. Cuijpers [68] investigated the effects of the natural idle motion of robots, that is, the motions that a social robot performs when it is not performing a task or interacting. They discovered that the study participants perceived the robots executing natural idle motions to be alive and more empathetic than robots with no motion. So we recommend designing a robot with two rotational degrees of freedom for the neck. These degrees of freedom are for pan motion and tilt motion. Stepper motors can drive the neck mechanism, allowing it to operate at fast and slow speeds. Additionally, two more degrees of freedom should be added to each flipper arm so that the robot will have the ability to perform expressive arm gestures.

6.3 Processing Units in the Social Robot

This section discusses how the input data will be processed to provide the required outputs. These processing units act as a bridge between the input and output components. We present these results below.

6.3.1 Conversational AI Agent to Enable Autonomous Interactions between the Robot and the Child

As discussed in the preceding sections, conversational AI frameworks based on machine learning, such as RASA [58] and Riva [57], can be used to give the robot complex persuasive communication with children. Once the robot detects the user’s speech input, Google’s ASR (Automatic Speech Recognition) engine can be used to convert the vocal user input to textual user input. This textual input can then be sent to RASA’s NLG (Natural Language Generator). The NLG can then produce responses based on the text input detected by the RASA core. In our context, we recommend using a machine learning-based generator model so that the responses can be generated by detecting the dialogue state of the user in real time. Once a response has been generated, it can then be fed to a knowledge base within the robot. A knowledge base should ideally contain a wide range of information pertaining to the conversation’s domain of interest, and it can be built into the robot’s architecture. Information that the robot has to convey to the user can then be selected from this knowledge base based on the response identified by the NLG. Finally, we recommend using RASA’s NLG response component to generate the output text message from the knowledge base. We recommend using the RASA framework because it allows these output text messages to be customized based on the dialogue state of the user. In the end, Google’s TTS (Text-to-Speech) engine can be used to convert the textual output message into a verbal response to the user.
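The flow described above (speech in, text understanding, knowledge-base lookup, speech out) can be pictured with the following toy sketch. Every function and message here is a hypothetical stand-in written for illustration: no real ASR, RASA, or TTS API is called, keyword rules replace the trained dialogue model, and the knowledge-base entries are placeholders.

```python
# Hypothetical knowledge base: placeholder messages keyed by detected intent.
KNOWLEDGE_BASE = {
    "why_wash": "Washing with soap helps remove germs from your hands.",
    "remind_me": "I will remind you to wash your hands before lunch.",
    "fallback": "Let's wash our hands together!",
}

def speech_to_text(audio: str) -> str:
    """Stand-in for an ASR engine: the 'audio' is already a text transcript."""
    return audio.strip().lower()

def detect_intent(text: str) -> str:
    """Stand-in for the dialogue model: keyword rules, not a trained classifier."""
    if "why" in text:
        return "why_wash"
    if "remind" in text:
        return "remind_me"
    return "fallback"

def select_message(intent: str) -> str:
    """Pick the message for this intent from the knowledge base."""
    return KNOWLEDGE_BASE.get(intent, KNOWLEDGE_BASE["fallback"])

def text_to_speech(message: str) -> str:
    """Stand-in for a TTS engine: tags the text instead of synthesising audio."""
    return f"[spoken] {message}"

def respond(audio: str) -> str:
    """Chain the four stages: ASR, intent detection, lookup, TTS."""
    text = speech_to_text(audio)
    intent = detect_intent(text)
    message = select_message(intent)
    return text_to_speech(message)

print(respond("Why should I wash my hands?"))
```

Keeping each stage behind its own function is one possible design choice: it would let a real speech recogniser, dialogue framework, or speech synthesiser replace a stub without changing the rest of the chain.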

6.3.2 Affective AI Agent to Enable Affective Communication

As mentioned earlier in this section, ensuring that the child is in the right mental state or mood to learn is a vital design recommendation. When a child’s face is detected, the robot can use a behavioural model that detects the child’s verbal and facial expressions from the visual and speech inputs that are fed in. The model can be implemented to work in synchrony with the conversational AI to validate the robot’s perception of the child’s mental state. Emotion recognition in conversation (ERC) algorithms can be combined with emotion recognition through facial expressions to help detect the child’s emotion as the interaction between the child and the robot proceeds. Finally, at the end of the interaction, the robot can give task-specific praise.

6.3.3 Human Action Recognition System to Classify Handwashing Steps in Real Time

To provide feedback to the children about how correctly they perform the handwashing steps, we recommend implementing a deep learning neural network or a vision transformer that generalizes well across different hand sizes, skin tones, and lighting conditions. The overhead camera can capture visual data of the child’s hands while the child performs the handwashing steps. The deep learning model’s output can be fed to the conversational AI agent to provide verbal feedback to the children.
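One way to picture how the recognition output could drive verbal feedback is the sketch below. It assumes, hypothetically, that the vision model emits one step label per video frame; the label set, function names, and feedback wording are illustrative placeholders and not part of the proposed system.

```python
# Hypothetical label set that the vision model is assumed to emit per frame.
EXPECTED_STEPS = ["wet", "soap", "scrub", "rinse", "dry"]

def observed_steps(frame_labels):
    """Collapse per-frame labels into the ordered list of distinct steps seen."""
    steps = []
    for label in frame_labels:
        if label in EXPECTED_STEPS and label not in steps:
            steps.append(label)
    return steps

def feedback(frame_labels):
    """Build a short message that praises or names the steps that were missed."""
    seen = observed_steps(frame_labels)
    missing = [step for step in EXPECTED_STEPS if step not in seen]
    if not missing:
        return "Great job, you completed every step!"
    return "Almost there! Next time remember to: " + ", ".join(missing) + "."

# Example: the classifier never reported the "scrub" or "dry" steps.
frames = ["wet", "wet", "soap", "soap", "rinse", "rinse"]
print(feedback(frames))
```

The resulting text could then be passed to the conversational agent for speech output; a real system would likely also need to smooth noisy per-frame predictions before deciding that a step was missed.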

6.3.4 Microprocessor for the Robot

The recognition of individual handwashing steps using the human action recognition system, the conversational AI and the affective AI will require running multiple neural network classifiers. Our recommendation is to use a microprocessor such as NVIDIA Jetson Nano [69], which can run multiple neural networks in parallel for image classification, face and object detection, and natural language processing.

7 Conclusions and Future Work

Our work aimed to co-design the embodiment of a minimalistic autonomous social robot to promote proper handwashing practices among children through their participation in the design process as informants. Post the interview process with 40 children aged \(5-10\) years, we analysed and interpreted both qualitative and quantitative interview questions. The qualitative section of the interview process was based on the handwashing buddy robot designs children made from pre-cut geometrical shapes. We summarised their responses into the categories of children’s needs for the robot’s communication, emotional intelligence, and functionalities. We found that despite designing a robot for handwashing, most of the children wanted their robots to also perform multiple other functions, such as playing the role of a caregiver, friend, and tutor. The children’s impressions of robots seem to be heavily influenced by what they had seen in the media. Hence, they had a futuristic outlook for their robot regarding its functionalities.

Regarding the robot’s appearance, we observed that visual design aesthetics were a key influence on children’s preference for the robot’s embodiment. In fact, the results suggest that attractiveness was a stronger predictor of likeability than the robot’s aesthetic form (anthropomorphic, zoomorphic, or caricatured) itself. The children’s idea of a robot was one that is cute, beautiful, and cool in appearance, with a human-like voice. They wanted the robot to engage them in goal-directed conversations and to provide reminders and praise. They wanted the robot to have a wide domain of knowledge and to adopt familiar social roles such as a teacher or a friend. Based on all the feedback from the children, and keeping in mind their expectations of the robot’s behaviour, we presented a high-level overview of an ideal prototype design for the social robot as an embodied AI. We further discussed the input and output components and the processing units required for the social robot, such as microphones, speakers, cameras, and motors.

As with the majority of studies, the design of the current study is subject to limitations. The first is the online format in which the study had to be conducted owing to the ongoing COVID-19 pandemic. As a result, the sampling was selective, as we could not reach the lowest economic stratum of the population in rural areas. The second limitation, which could be addressed in future research, is that not all robot functionalities mentioned by the children could be considered, such as mobility and a full human-like embodiment. As designers, we chose to include those functionalities and responses given by the children that are technologically feasible today, while considering scalability requirements. The final limitation is that, since the study was conducted with children from different regions of India, their mental models of a robot might differ significantly from those of children from other developing countries. A broader study across multiple developing countries could provide richer data on the similarities and differences between children’s perceptions of a social robot that promotes behaviour change.

In future work, we plan to design a physical working prototype based on the design implications presented in this article. The children preferred an expressive and highly interactive robot that is capable of encouraging them to practise good handwashing behaviour and of playing games with them. To achieve this, we aim to incorporate into the social robot a conversational module based on machine learning and AI that can handle non-linear interactions with children, both on topics related to handwashing and in general conversation. Along with these conversational AI capabilities, we intend to use visual processing through the robot’s camera to add affective computing capabilities. We will then field-test the prototype across different schools in India by conducting a study that measures the children’s acceptance of our final design and the subsequent change in handwashing practices that this robot can bring about.