A motivational model based on artificial biological functions for the intelligent decision-making of social robots

Modelling the biology behind animal behaviour has attracted great interest in recent years. Nevertheless, neuroscience and artificial intelligence face the challenge of representing and emulating animal behaviour in robots. Consequently, this paper presents a biologically inspired motivational model to control the biological functions of autonomous robots that interact with and emulate human behaviour. The model is intended to produce fully autonomous, natural behaviour that can adapt to both familiar and unexpected situations in human–robot interactions. The primary contribution of this paper is to present novel methods for modelling the robot’s internal state to generate deliberative and reactive behaviour, how it perceives and evaluates the stimuli from the environment, and the role of emotional responses. Our architecture emulates essential animal biological functions such as neuroendocrine responses, circadian and ultradian rhythms, motivation, and affection, to generate biologically inspired behaviour in social robots. Neuroendocrine substances control biological functions such as sleep, wakefulness, and emotion. Deficits in these processes regulate the robot’s motivational and affective states, significantly influencing the robot’s decision-making and, therefore, its behaviour. We evaluated the model by observing the long-term behaviour of the social robot Mini while interacting with people. The experiment assessed how the robot’s behaviour varied and evolved depending on its internal variables and external situations, adapting to different conditions. The outcomes show that an autonomous robot with appropriate decision-making can cope with its internal deficits and unexpected situations, controlling its sleep–wake cycle, social behaviour, affective states, and stress, when acting in human–robot interactions.


Introduction
Designing machines with autonomous behaviour has been addressed from multiple perspectives in recent years [96]. A distinctive approach takes inspiration from ethology, i.e. the study of animal behaviour [24]. The inclusion of embodied robots in complex applications underscores the importance of endowing them with adaptive behaviour to fulfil their tasks [2]. Recent advances in neuroscience reveal significant theories about the origin of human behaviour and the critical role of the nervous and endocrine systems [90]. In spite of these advances, further research is necessary to characterise human decision-making and create fully autonomous intelligent robots.
Drawing on biology, this paper presents a novel motivational model for a social robot to exhibit intelligent decision-making. The model shapes the robot's behaviour, depending on the evolution of artificial biological variables affected by environmental stimuli that the robot perceives. The biological variables vary following circadian and ultradian rhythms that control the periodic evolution of biological functions. Consequently, the biological variables affect both physiological and psychological processes that, in our model, act as homeostatically controlled variables. Biological functions have optimal set-points for maintaining the robot's internal state in good condition. However, biological imbalances produce deviations in biological functions, causing deficits. Prominent deficits motivate the robot to execute specific behaviours to restore its state of welfare. In addition, the selection of behaviour is affected by other psychological factors, such as the affective state and the perception of arousing stimuli. Considering these biological foundations, our motivational model allows social robots to exhibit long-lasting, natural, autonomous behaviour based on the evolution of biological variables that simulate how human behaviour emerges. Furthermore, we present the design and implementation of a novel artificial neuroendocrine system based on 12 neuroendocrine substances that affect basic biological functions, defining the robot's motivational and affective states and, therefore, its behaviour.
The principal contribution to the memetics community is to present a method that allows the robot to select the most appropriate behaviour from a pool of available choices, adapting to different situations that change with a dynamic environment. In addition, the algorithms and functions we propose for generating biologically inspired behaviour generalise (although the dynamics of the robot's processes are manually adjusted in some cases) and can be reused to model new biological functions, endowing the robot with many units of behaviour that interact so as to produce complex behaviour. The robot's prior knowledge used to produce complex behaviours consists of its skills, units that can be combined and exchanged to produce different behaviours (e.g., executing a game combined with different emotional expressions). This paper continues in Sect. 2, introducing the biological foundation of our model. Section 3 applies the previous concepts to generating intelligent decision-making and appropriate expressiveness. Then Sect. 4 describes the experiment we conducted to evaluate the model. Section 5 shows the autonomous behaviour exhibited by our social robot Mini [76] using our model for four consecutive days. Sections 6 and 7 discuss the outcomes of the model and future avenues for research.

The neuroscience behind human behaviour
Human behaviour involves many biological processes that orchestrate our organism. This section describes essential human processes involved in the generation of behaviour, focusing on those that can be emulated in social robots.

The neuroendocrine system
In the last decade, many theories have described human motivation and affection [96]. The homeostatic model of physiological control [15] as the origin of motivated behaviour [39] constituted an important advance in modelling biological functions. Nowadays, the homeostatic and allostatic theories (Schulkin [79]) complement each other to clarify the operation of our organism.
The neuroendocrine system converts electric signals generated by the nervous system into chemical substances in charge of modulating many biological functions [90]. In this work, we have focused on modelling those functions that may allow social robots to exhibit biologically inspired and autonomous behaviour. Thus, we concentrate on the role of melatonin (MT) and orexin (OX) in sleep and wakefulness. We shape the influence of dopamine (DA), serotonin (SE), and brain norepinephrine (BNE) on our affective state. The model represents how oxytocin (OT) and arginine vasopressin (AVP) affect socialisation. Finally, we model the stress response managed by corticotropin-releasing hormone (CRH), adrenocorticotropin (ACTH), cortisol (CT), adrenal epinephrine (EPI), and adrenal norepinephrine (ANE). Table 1 summarises the primary functions of these substances in the human body, and the following sections present their implications for regulating our organism.

Sleep-wake rhythm
Circadian rhythms are periodic oscillations of biological processes with a period of a natural day (24h); ultradian rhythms are those with a period less than a natural day [48]. These rhythms occur in organs and endocrine glands sequenced by the suprachiasmatic nuclei in the brain [45]. One of the most significant inputs of the brain comes from the retina, affecting the sleep-wake cycle [5].
Both melatonin and orexin are affected by light, varying their levels during day-night phases exhibiting a precise circadian rhythm that controls the sleep-wake cycle [100]. In addition, both hormones indirectly influence many other physiological and psychological processes, as Table 1 shows.

The monoamine nuclei
The monoamine nuclei are a group of cells that communicate through monoamine neurotransmitters [28]: dopamine, serotonin, and brain norepinephrine, three primary modulators of human emotions. Although the most important effects of these substances occur in the brain, they are implicated in many other physiological and psychological processes, presented in Table 1.

The HPA axis
Significant research in neuroscience concludes that the hypothalamus, one of the most critical brain areas, presents many projections to the pituitary gland. The pituitary, in turn, projects to the adrenal glands. This pathway, commonly called the hypothalamic-pituitary-adrenal axis (HPA axis), controls many human processes related to stress regulation [57]. The stress response starts with the secretion of CRH and arginine vasopressin [8]. The hypothalamus projects to the pituitary gland, leading to the synthesis of CRH and arginine vasopressin into ACTH [66]. ACTH is a hormone that stimulates the adrenal glands, where it is possible to find most of its receptors [43]. Adrenocortical cells in the adrenal cortex produce cortisol, which is considered the principal stress hormone [65]. Most cortisol is synthesised by the adrenal medulla, producing norepinephrine and epinephrine [49]. These hormones are implicated in many other biological functions, such as the heartbeat or pupil size, as Table 1 shows.

Biological functions
The neuroendocrine responses presented above are a few of the many occurring in the human body. The maintenance of optimal levels of these substances is essential for the survival of the agent [87]. However, they control many other biological functions revealed in physiological and psychological processes.

Physiology
Physiology can be defined as the scientific study of the functions and mechanisms in a living system [34]. This section explores how neuroendocrine substances influence physiological processes such as sleep or wakefulness. Physiological processes take place throughout the body, principally in the organs. They vary, depending on the evolution of neuroendocrine substances and internal rhythms. For example, sleep follows a circadian pattern controlled by the secretion of melatonin [67] and indirectly by other hormones [68]. Wakefulness is directly controlled by orexin [74], but also by dopamine [23]. In addition, the brain controls the periodicity of many physiological processes, such as the heart rate [31]. According to these findings, most physiological variables present circadian patterns affected by neuroendocrine levels that define their optimal value. If a physiological process deviates from its optimal set-point, a deficit arises in the agent, leading the body to correct it by executing specific actions.

Psychology
Unlike physiological processes, psychological processes are specific to the brain [11]. Moreover, individuals differ in their psychological processes more than they do in their physiological processes. The assessment that each individual makes of environmental stimuli differs and depends on many cognitive and contextual factors [71]. First, as with physiological processes, the substances that influence the behaviour of agents are not produced and secreted equally in different organisms [88]. Second, genetic background, along with the individual's history, influences how different people have cognitively evolved [88]. Consequently, historical factors such as culture, ethnicity and habits, among others [71], define the cognitive response of the human brain. Our model addresses psychological processes mainly in emotion and motivation, as the following sections describe.

Emotion
Among the many studies in the last two decades concerning artificial life, we believe that the computational model of Velásquez [93] presents essential findings in modelling motivation and affection, drawing on outstanding theories of that time. In line with Velásquez, Cañamero [13] presented a model for artificial agents where the selection of actions depends on their physiological needs and the stimuli they perceive. In the study, neuroendocrine substances are physiological processes in the agent instead of substances controlling the agent's internal state. Some years later, Cañamero studied how autonomous agents should build up their emotional state towards presenting a reasonable action selection process [14], drawing on relevant research in affective generation and expressions, such as Ortony [64] on cognition and emotion and Sloman [83] on communication and emotion.
Influential theories of emotion have used a dimensional approach to situate emotion and mood. Russell's circumplex model of affective states [72] posits that an affective state arises from our interpretations and feelings as a result of two neurophysiological systems: valence (well-being) and arousal (alertness). Similarly, Plutchik developed a model including a third axis, representing the intensity of emotion [69]. More recently, Lövheim [54] presented a three-dimensional emotional model grounded on the levels of three monoamines: serotonin, dopamine, and norepinephrine. In the model, eight basic emotions, discovered by Tomkins, are placed on the vertices of a cube, representing the emotional states of the agent. The model does not include other substances involved in emotion, and does not explain how each monoamine is elicited. Derived from Lövheim's cube of emotions, new architectures have been developed using neurobiology as the starting point of the generation of emotions, most of them using the six basic emotions discovered by Ekman [25]. Thus, Vallverdú [92] presented a cognitive architecture based on Lövheim's cube for affective decision-making. Zhang et al. [99] designed an affective model based on emotion, mood, and personality. These affective phenomena are situated in a three-dimensional space whose axes correspond to valence, arousal, and dominance. Emotion represents short-term reactive responses, moods are long-lasting affective states derived from past experiences, and personality is an agent's trait settled from an invariant genetic component.
Our model draws on [29], considering mood as in Zhang et al. [99] and emotion as in Lövheim [54], but with subtle variations. Nine mood states are situated in a bidimensional valence-arousal space where the neutral mood is situated at the origin of both axes. Moods are long-lasting affective states derived from past experiences. For emotion, we use Lövheim's monoamine approximation combined with four of the basic emotions revealed by Ekman [26] for defining short-term reactions to unexpected events. It is worth noting that in this study we only analyse the emotional response of the robot, and not its mood.

Motivation
Research into human motivation has been carried out for many years, developing models to characterise how human behaviour emerges. In recent years, numerous models have been developed to determine how biological deficits motivate our behaviour and the influence of external stimuli on motivation [18,55,56,89]. In this vein, Fallatah and Syed [27] employed the Maslow pyramid to investigate how behaviour is driven, depending on the levels of the agent's deficits. While the pyramid's base contains physiological needs essential for survival, the top levels are more related to cognitive processes not critical for survival. This theory posits that human decision-making aims at maintaining an optimal internal state, and once we no longer have any critical physiological needs, we focus on other aspects, such as safety, love, knowledge, and self-actualisation. Making use of these ideas, much research has structured motivation as an organising behaviour in a hierarchical pyramid depending on the urgency of guaranteeing survival [33,70,97].
Complementary to these ideas, motivated behaviour has been intrinsically related to the stimuli we perceive from the environment [38]. An earlier theory of Lorenz is worth mentioning [53]: he defined human motivation using two pillars, internal needs (drives) and stimuli. This theory states that human motivation emerges from our internal deficits (drives), which are amplified by the perception of external stimuli (e.g. perceiving palatable food) that, in many cases, are essential to trigger specific behaviours (e.g. social behaviour requires perceiving other people). In addition, the theory asserts that if the stimulus is strong, it might not be necessary to have a significant deficit to execute the behaviour (e.g. eating for pleasure). The ideas of this theory have already been implemented in some studies [17,30,55] to endow robots with autonomous behaviour using biological concepts.
As we present later in our model's definition, we combine the previous concepts (hierarchical organisation of needs and the influence of stimuli) to produce motivated behaviour in social robots. In addition, considering the recent advances in human neuroscience, our model emulates the combination of both deliberative and reactive behaviour using the terms proactive and reactive motivation [47]. Proactive motivation arises from internal deficits and the perception of specific stimuli, typically leading to deliberative, voluntary behaviour (e.g. sleeping when tired). Unlike proactive motivation, reactive motivation embraces those behaviours elicited solely by the perception of strong stimuli (e.g. escaping from a lion).

Proposed model
The design of our motivational model addresses the problem of social robots exhibiting autonomous behaviour for long-lasting human-robot interaction (HRI) via intelligent decision-making. We believe that for robots to be deployed in complex dynamic environments, they have to include adaptive and biologically inspired behaviours like those exhibited by humans. The following sections describe the modules making up our motivational model, shown in Fig. 1, justifying the modelling methods we propose and the biological functions selected for controlling the behaviour of our robot.

The social robot Mini
The model we present in this paper controls the autonomous behaviour of our social robot Mini [76], the platform we have developed for our research in social robotics at the University Carlos III of Madrid. We opted for using Mini to integrate and evaluate the model we present in this paper since for assisting (older) people in their private homes we need a robot that can exhibit long-term autonomous behaviour. Since Mini is meant for HRI, our model draws on how human behaviour emerges so as to replicate it in robots so that people can feel that these machines are more natural and feel closer to them. In this context, Mini's main application is to drive cognitive stimulation, affective, and entertainment sessions adapted to each user while producing natural behaviour the rest of the time. Figure 2 shows Mini's external appearance.
Mini has five degrees of freedom (in the hip, arms, neck, and head), a speaker to play verbal and non-verbal sounds, including communications with the user, two expressive eyes, LEDs to simulate the blushing of the cheeks and heartbeat, and a touch screen to execute the activities. Regarding its sensors, Mini has three capacitive touch sensors in the belly and shoulders to perceive hits and strokes, a microphone to capture the speech of its users, and a 3D camera to perceive the user's presence.
Considering the application of our social robot Mini to prolonged interactions and its capabilities, we wanted to emulate and control the following high-level biological functions:
• Sleep-wake cycle Robots that work in long-term scenarios cannot be continuously performing activities. For this reason, we want to control the robot's periods of activity through its sleep and wakefulness. These processes are regulated by melatonin, orexin and the light intensity in our model, motivating the robot to sleep during the night and stay awake during the day.
• Social behaviour Social robots assist people in many applications. Therefore, they must exhibit proper communication skills. Our model regulates the robot's entertainment and social needs using dopamine, oxytocin, brain norepinephrine, and arginine vasopressin, four hormones involved in positive and negative human social behaviour. The levels of these substances depend on ...
• Emotional responses Similarly, we believe robots should incorporate convincing expressiveness to transmit their state to users and show how they feel. Emulating how emotion emerges in humans, we combine the perception of stimuli with the levels of dopamine, serotonin, and brain norepinephrine to control the emotions of anger, surprise, joy, and sadness in robots.
• Stress management Robots interacting with people may encounter negative situations where interaction may not be positive. To deal with this issue, we shape human stress management using hormones emulating the HPA axis. These hormones influence the stress level, making the robot avoid undesired situations like non-responsive users or users mistreating the robot.

Neuroendocrine responses
Neuroendocrine responses regulate many biological processes in humans, being the origin of our motivated behaviour [90]. Our model integrates twelve neuroendocrine substances to control essential processes in Mini. These substances were selected considering their influence on the high-level biological functions we wanted to control in our robot. Melatonin and orexin control the robot's sleep-wake cycle. Dopamine, serotonin, and brain norepinephrine control Mini's affective state. Oxytocin and arginine vasopressin regulate social behaviour. Corticotropin-releasing hormone (CRH), adrenocorticotropin (ACTH), cortisol, adrenal norepinephrine, and adrenal epinephrine modulate the robot's level of stress. We model neuroendocrine responses as time-dependent variables whose levels vary between 0.01 and 1 units. Thus, as we propose with Eq. 1 considering the description provided in [85], the level of each neuroendocrine substance l_i at time step t depends on circadian (cr) and ultradian (ur) rhythms, the effects of K stimuli (se_k), and P interactions with other substances (hi_p). The effect of an interaction can be either stimulatory or inhibitory, weighting the secretion value of the target substance.
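The dependencies just listed can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says the level "depends on" the four contributions, and the sketch assumes they are simply summed and then clamped to the 0.01 to 1 range. The function name and the example numbers are hypothetical.

```python
def substance_level(cr, ur, stimulus_effects, interactions, lo=0.01, hi=1.0):
    """Combine the four contributions named in the text into one level.

    Assumption: the contributions are added together; the source only
    states that the level depends on them. The result is clamped to the
    stated range of 0.01 to 1 units.
    """
    raw = cr + ur + sum(stimulus_effects) + sum(interactions)
    return max(lo, min(hi, raw))


# Hypothetical values for one substance at one time step.
level = substance_level(
    cr=0.40,                         # circadian contribution
    ur=0.05,                         # ultradian contribution
    stimulus_effects=[0.10, -0.02],  # se_k for K = 2 stimuli
    interactions=[0.01],             # hi_p for P = 1 interaction
)
```

Clamping keeps the level inside the stated bounds even when several stimulatory effects coincide.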
Next, we define how we address the modelling of how neuroendocrine substances interact with each other, with circadian and ultradian rhythms, and the effects of stimuli.

Neuroendocrine interactions
In living organisms, multiple substances control essential biological processes, while interacting with each other [90].
In this paper, we model neuroendocrine interactions between a source substance and a target substance using Eq. 2. The value of the interaction on the target substance, hi_p, depends on the level of the source substance, l_source, and an empirical weight δ_p ranging over [−1, +1] excluding 0 that modulates the intensity of the interaction. If δ_p is positive, we consider the interaction as stimulatory (STI) since it increases the levels of the target substance. If it is negative, we consider the interaction as inhibitory (INH) as it decreases the levels of the target substance. To simplify the modelling process, we define three typical values for most of the weights δ_p: ±0.001 if the interaction intensity is weak, ±0.05 if the interaction intensity is moderate, and ±0.01 if the interaction intensity is strong. However, we consider some exceptions to these three values in essential processes in the robot.
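As an illustration of the interaction just described, the following Python sketch assumes that the effect on the target is the product of the weight and the source level. The product form is an assumption, since the text only says the effect depends on both quantities; the function names are hypothetical.

```python
def interaction_effect(delta, source_level):
    """Effect of a source substance on a target substance.

    Assumption: effect = delta * source_level. The weight must lie in
    [-1, +1] and must not be zero, as stated in the text.
    """
    if delta == 0 or not -1.0 <= delta <= 1.0:
        raise ValueError("delta must be in [-1, +1] and non-zero")
    return delta * source_level


def interaction_kind(delta):
    """Label the interaction by the sign of its weight."""
    return "STI" if delta > 0 else "INH"


# Hypothetical example: a source at level 0.5 acting on a target
# with a positive and with a negative weight.
stimulatory = interaction_effect(0.01, 0.5)
inhibitory = interaction_effect(-0.01, 0.5)
```

With a positive weight the effect raises the target level and with a negative weight it lowers it, matching the STI and INH labels.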

Ultradian rhythms
Ultradian rhythms increase neuroendocrine levels of some substances with a periodicity below a natural day. We shape each ultradian rhythm with the following basic parameters: the time when it starts, the time when it ends, the period of stimulation, and the value ur(t) representing the increase in the substance level. Table 3 shows the ultradian rhythms of cortisol and ACTH included in our model considering [16,43].

Circadian rhythms
According to the neuroscientific studies included in Table 4, all the neuroendocrine substances in our model follow a circadian rhythm. Based on the references included in Table 4, we propose to approximate these rhythms using two mathematical functions: a cosine wave (Eq. 3) and the difference of two sigmoids (Eq. 4). The selection of the modelling function depended on the waveform of the substance proposed in its corresponding reference, that has been obtained from neuroendocrine studies, as Table 4 shows. Nevertheless, we manually tuned the values defining each function to be within the range 0.01-1, present the peak and nadir at specific times, and obtain the desired behaviour in our robot.
In the previous equations, br is the basal level of the substance, ar is the amplitude of the function, T_z is the time of the day when the peak level occurs, T_zd defines the time of the day when the secretion starts decreasing in the sigmoid function, dr is the rate of decrease of the second sigmoid function, and t is the time of the day in floating hours. Table 4 shows the circadian rhythm of the neuroendocrine substances included in the model. Circadian values range within [0, 1] units with a period of a natural day (24 hours).
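The two waveform families named above can be sketched as follows. This is a hedged illustration rather than a reproduction of Eqs. 3 and 4: the cosine is assumed to be centred on the peak time T_z, and the rising sigmoid needs an onset time and a rate that the text does not name, so t_rise and rise_rate are hypothetical parameters. All numeric values are invented for the example.

```python
import math


def circadian_cosine(t, br, ar, t_peak, period=24.0):
    """Cosine wave with basal level br, amplitude ar and peak at t_peak.

    t and t_peak are times of day in floating hours.
    """
    return br + ar * math.cos(2.0 * math.pi * (t - t_peak) / period)


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def circadian_sigmoids(t, br, ar, t_rise, rise_rate, t_zd, dr):
    """Difference of two sigmoids: a rise followed by a decrease.

    t_rise and rise_rate are hypothetical names for the first sigmoid;
    t_zd and dr follow the text (start and rate of the decrease).
    This simple form does not wrap around midnight.
    """
    rise = _sigmoid(rise_rate * (t - t_rise))
    fall = _sigmoid(dr * (t - t_zd))
    return br + ar * (rise - fall)


# Hypothetical substance peaking at 08:00 with a cosine profile.
cos_peak = circadian_cosine(8.0, br=0.5, ar=0.4, t_peak=8.0)
cos_nadir = circadian_cosine(20.0, br=0.5, ar=0.4, t_peak=8.0)

# Hypothetical substance with a plateau between 06:00 and 22:00.
plateau = circadian_sigmoids(14.0, 0.1, 0.8, t_rise=6.0, rise_rate=2.0, t_zd=22.0, dr=2.0)
night = circadian_sigmoids(2.0, 0.1, 0.8, t_rise=6.0, rise_rate=2.0, t_zd=22.0, dr=2.0)
```

The cosine suits substances with a smooth daily swing, while the sigmoid difference gives a flat plateau with separately tunable onset and offset.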

The influence of stimuli
As Fig. 1 shows, in our model, stimuli affect Mini's internal state at three different levels, as [20] suggests in their study.
• Neuroendocrine substances Stimuli urge our body to raise neuroendocrine levels in specific situations like fear or anger [90]. We propose Eq. 5 based on [20,35] to shape the effects of stimuli on neuroendocrine substances, represented as se_k. This effect depends on the intensity with which the robot perceives a stimulus, si_k, ranging within [0, 100], and on an empirical weight α_k.
• Biological functions The effect of stimuli on biological functions is specific and context dependent [71]. Section 3.7 describes the effect of stimuli on the primary biological functions emulated in Mini.
• Motivation Motivational states define our behaviour. As Sect. 3.8 describes, stimuli are important modulators of human motivation, affecting the decisions we make [56].
Using its embodied sensors, Mini can perceive the intensity of the illumination, the intensity of the ambient noise, when the user strokes or hits its body, when the user correctly or incorrectly responds to the robot's questions when playing a game together, when the robot fulfils or does not fulfil its goals, and when the user is in front of the robot. Table 5 shows the effects of the stimuli that our robot Mini can perceive on its neuroendocrine substances. Again, to simplify the modelling process, we define three typical values for the weight α_k: ±0.001 if the effect is low, ±0.05 if the effect is moderate, and ±0.01 if the effect is strong. Exceptions to these three values indicate essential processes in the robot.

Biological functions
The biological functions emulated in Mini evolve with time ranging over [0, 100] units. As Fig. 1 shows, Mini's biological functions depend on stimuli and neuroendocrine responses. Table 6 shows the physiological (PHY) and psychological functions (PSY) (including emotion) emulated in Mini, specifying the equation for calculating the value of each process (where PV is the previous value of the process). In the model, we emulate the robot's sleep and wakefulness cycle, the social need, level of entertainment for playing with the user or alone, and the level of stress. In addition, considering the affective aspect, we model joy, sadness, anger, and surprise, four of the basic emotions discovered by Ekman [25]. Finally, to model mood, we simulate the pleasantness and arousal of the robot.
As the next section describes, the deficits of our biological functions define our motivational states and, therefore, urge behaviour. We model the deficit d_i of each biological function as the absolute difference between its current value cv_i and its ideal value iv_i at time step t, as Eq. 6 shows.
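The deficit definition above translates directly into code. In the sketch below, only the absolute-difference rule comes from the text; the function names and the current and ideal values are hypothetical.

```python
def deficit(current_value, ideal_value):
    """Deficit of a biological function: |cv_i - iv_i|."""
    return abs(current_value - ideal_value)


# Hypothetical (current, ideal) pairs on the 0-100 scale used for
# biological functions.
functions = {
    "sleep": (35.0, 100.0),
    "social": (80.0, 70.0),
}
deficits = {name: deficit(cv, iv) for name, (cv, iv) in functions.items()}
```

Because the difference is taken in absolute value, a function that overshoots its ideal value produces a deficit just as one that falls short does.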

Motivation
As Fig. 1 shows, motivations are psychological states that drive behaviour [56]. Their intensity m_i at time step t depends on the intensity si_s of S ∈ N stimuli, on the values bp_b of B ∈ N biological functions, and on D ∈ N deficits denoted by d_d, as Eq. 7 shows.
In the model we propose, motivational states that depend solely on the intensity of a stimulus are called reactive and elicit brief, one-off behaviours. In all other cases, the motivation is called proactive and elicits a voluntary behaviour [47]. Mini has 7 proactive (P) and 5 reactive (R) motivations, each of them with a threshold level to become active, as Table 7 shows. If more than one motivation is simultaneously active, they compete to become dominant following a winner-takes-all approach.
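The threshold and winner-takes-all competition can be sketched as below. The motivation intensities are taken as given, since the form of Eq. 7 is not reproduced here. Treating a motivation as active when its intensity reaches its threshold (rather than strictly exceeding it) is an assumption, and the numeric values are invented.

```python
def dominant_motivation(intensities, thresholds):
    """Return the active motivation with the highest intensity.

    A motivation is considered active when its intensity is at or above
    its own threshold (assumption). Returns None if none is active.
    """
    active = {
        name: value
        for name, value in intensities.items()
        if value >= thresholds[name]
    }
    if not active:
        return None
    # Winner-takes-all: only the strongest active motivation is kept.
    return max(active, key=active.get)


# Hypothetical intensities and thresholds for two motivations.
thresholds = {"Sleep": 50.0, "Stay awake": 30.0}
at_night = dominant_motivation({"Sleep": 70.0, "Stay awake": 40.0}, thresholds)
idle = dominant_motivation({"Sleep": 10.0, "Stay awake": 5.0}, thresholds)
```

Filtering by threshold before taking the maximum means a strong but inactive motivation can never win the competition.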

Mini's behaviour
The purpose of our motivational model is to emulate the biology behind human behaviour so as to provide Mini with intelligent decision-making to exhibit natural and reasonable behaviour. When a motivational state becomes dominant, the robot executes one or more skills (behaviour units) to improve its internal state or attain a specific goal. These skills can be combined in some cases to produce complex behaviour like playing while expressing a particular emotion. Currently, Mini can sleep, stay awake waiting for new upcoming events, dance, play a quiz game with the user, request the user to stroke it, talk with people about different topics, meditate if it is stressed, welcome new users that approach it, congratulate users when they correctly answer a question of the game or encourage them if the answer was incorrect, complain when the user hits it, and thank the user for stroking it. Table 8 shows the behaviours that Mini can execute depending on the robot's dominant motivation. The execution of some behaviours produces effects on neuroendocrine responses, biological functions, and the way the robot perceives stimuli. These effects modify its internal milieu, typically improving its internal condition.

Emotion
The behaviour exhibited by Mini comprises a motivational component modulated by affection. Our model defines 4 basic emotions (anger, surprise, joy, and sadness) that evolve with time. Emotions are short-lasting psychological processes strongly influenced by the levels of the monoamine chemicals dopamine, serotonin, and brain norepinephrine, as Lövheim [54] states. In our model, the robot's affective state is the emotion with the highest level of intensity and determines the robot's expressiveness in the short run. Emotions can only become dominant and allow the robot to change its expressiveness if their intensity is above 20 units, a threshold set to avoid shallow emotions appearing.
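The selection rule for the affective state can be illustrated with a short Python sketch. The 20-unit threshold and the four emotion names come from the text; the function name and the intensity values are hypothetical.

```python
EMOTION_THRESHOLD = 20.0  # units, as stated in the text


def dominant_emotion(intensities, threshold=EMOTION_THRESHOLD):
    """Return the strongest emotion if it is above the threshold.

    Returns None when even the strongest emotion is too shallow to
    change the robot's expressiveness.
    """
    name = max(intensities, key=intensities.get)
    return name if intensities[name] > threshold else None


# Hypothetical emotion intensities on a 0-100 scale.
after_hit = dominant_emotion(
    {"anger": 55.0, "surprise": 30.0, "joy": 5.0, "sadness": 12.0}
)
calm = dominant_emotion(
    {"anger": 3.0, "surprise": 8.0, "joy": 15.0, "sadness": 2.0}
)
```

Here the comparison is strict, following the wording "above 20 units".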

Experimental setup
An evaluation of the motivational model presented in this paper was conducted to analyse the robot's behaviour over long periods of time. The social robot Mini was deployed in our laboratory for four consecutive days, exhibiting fully autonomous behaviour. During that time, the robot was not restarted nor was its software modified by its designers. For the first three days, the robot did not interact with people but focused on autonomously maintaining an optimal internal state and selecting the most appropriate behaviour. Then, from 3 pm to 8 pm on the fourth day, Mini interacted with two people who behaved differently. The first user mistreated the robot, continuously hitting it and not paying attention to its willingness to socialise. Then, the robot faced a second user who showed positive social behaviour, expressing a positive emotion (stroking the robot) and playing with Mini. These scenarios were designed to observe the robot's response and adaptability to different situations, especially its social behaviour and emotional responses during the interactions.
The analysis of the results presented in the following section is divided into four cases (sleep-wake cycle, social behaviour, emotional responses, and stress management). The figures shown to analyse these cases all present a similar structure to facilitate their comprehension since many different biological functions are involved in the robot's behaviour as a whole.

Results
This section presents the main results obtained from implementing the motivational model to control the robot's sleep-wake cycle, its social behaviour, its affective state, and its stress response. Figure 3 shows the evolution of the biological variables involved in the sleep and wakefulness circadian rhythms over three consecutive natural days. Figure 3a shows the evolution of the light intensity, while Fig. 3b shows the levels of melatonin and orexin, the two hormones implicated in the sleep and wakefulness biological functions. Figure 3c shows the sleep and wakefulness biological variables. Figure 3d represents the robot's motivation to sleep and to stay awake. Figure 3e shows its phases of sleeping and activity. Light influences the levels of melatonin and orexin. On the one hand, acute light promotes the secretion of orexin, keeping the robot active. On the other hand, melatonin levels decrease with light, reducing the robot's need to sleep. When the dark period arrives, melatonin levels increase, producing a notable rise in the robot's need to sleep. It is worth noting that melatonin and orexin have decoupled circadian patterns synchronised with the hours of natural light, as Fig. 3b shows.
Fig. 3 Circadian evolution during three consecutive natural days of the variables involved in the robot's sleep and wakefulness. Both processes present opposite patterns affected by light intensity, the primary stimulator of wakefulness and an inhibitor of sleep

Sleep-wake cycle
As shown in Fig. 3d (second panel from the bottom), the dominant motivation of the robot during the night is to Sleep, driving Mini to sleep. During the periods of light, the dominant motivation of the robot is to Stay awake, performing energetic behaviour like dancing. During the hours that the robot is sleeping (see Fig. 3e), the sleep deficit decreases in Mini (Fig. 3c), leading the robot to wake up once it has slept enough. The modelling that we have presented in Sect. 3 drives the robot's internal variables to evolve as represented in Fig. 3, exhibiting a natural behaviour that allows it to sleep during the dark phase and be awake during the light hours. In addition, the light intensity acts as an external stimulus that modifies the levels of melatonin and orexin, indirectly affecting the sleep-wake cycle.
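The qualitative effect of light on the two hormones can be illustrated with a toy simulation. The sketch below is not the model of Sect. 3: the first-order update rule, the `rate` parameter, the 0 to 100 scale, and the shortcut of comparing hormone levels directly (instead of going through the sleep and wakefulness variables of Fig. 3c) are all assumptions made for illustration. It only reproduces the direction of the effects stated in the text: light raises orexin and lowers melatonin, and darkness does the opposite.

```python
from typing import Iterable, Tuple


def step_hormones(melatonin: float, orexin: float, light: float,
                  rate: float = 0.1) -> Tuple[float, float]:
    """Move both hormones one step towards light-dependent targets.

    light is normalised to [0, 1]; hormone levels lie in [0, 100].
    The first-order dynamics and the targets are hypothetical.
    """
    melatonin_target = 100.0 * (1.0 - light)  # melatonin decreases with light
    orexin_target = 100.0 * light             # light promotes orexin secretion
    melatonin += rate * (melatonin_target - melatonin)
    orexin += rate * (orexin_target - orexin)
    return melatonin, orexin


def dominant_motivation(melatonin: float, orexin: float) -> str:
    """Simplified rule: the hormone with the higher level wins (assumption)."""
    return "Sleep" if melatonin > orexin else "Stay awake"


def simulate(light_profile: Iterable[float],
             melatonin: float = 50.0, orexin: float = 50.0) -> Tuple[float, float]:
    """Apply step_hormones once per sample of the light profile."""
    for light in light_profile:
        melatonin, orexin = step_hormones(melatonin, orexin, light)
    return melatonin, orexin


day = simulate([1.0] * 50)            # 50 steps of full light
night = simulate([0.0] * 50, *day)    # followed by 50 steps of darkness
```

In this toy version, `dominant_motivation(*day)` yields Stay awake and `dominant_motivation(*night)` yields Sleep, mirroring the opposite patterns of the two processes.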
A feature that was not programmed in the model and can be subtly perceived in Fig. 3d is the adaptation of the cycle to the light conditions. If light conditions change, the periods of sleep and wakefulness may vary accordingly, modifying these biological functions. In addition, it is possible to adapt this cycle to the robot's potential users using the circadian rhythms of melatonin and orexin, not overwhelming the user with continuous activities, but respecting their rest periods. It would be interesting to analyse whether acute light at night makes the robot wake up and assist the user in case they need it.
Figure 4 shows the evolution of Mini's social behaviour during the interaction with two different users (one with negative and the other with positive social behaviour). The figure focusses on the light hours of the fourth day (from 6 am to 8 pm) since that was the period when the biological functions related to social relationships took place. Figure 4a shows the intensity of the stimuli related to socialisation (user presence, hits, strokes, fulfilling a goal, and not fulfilling a goal). Figure 4b represents the evolution of those hormones and neurotransmitters implicated in regulating the robot's level of entertainment and social need, whose evolution is depicted in Fig. 4c. Figure 4d shows the robot's motivation to Play with the user, Socialise, Dance, and Request affection. Finally, Fig. 4e shows the behaviour the robot executed during the test.

Mini's social behaviour
The first part of the experiment finds the robot Mini sleeping. At 9:00 am, it wakes up. At that moment, the levels of the neuroendocrine substances dopamine, brain norepinephrine, oxytocin, and arginine vasopressin slightly increase. At the same time, the robot's desire for entertainment and socialisation starts rising. The speed with which these processes increase remains constant since the robot does not perceive the user. Thus, when the motivation to Play alone is above 30 units, the robot starts dancing to reduce its entertainment deficit. Note that the design of our model leads the robot to Dance if the user is absent and the entertainment deficit is high, making the Play alone motivational state dominant. However, if the user is present, our design prompts the robot to play with the user instead of dancing alone.
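The choice between dancing and playing described above can be summarised as a small decision rule. The following Python sketch is illustrative only: the function name and the use of `None` for "no entertainment behaviour" are hypothetical, and applying the same 30-unit threshold when the user is present is an assumption, since the text states the threshold only for the Play alone motivation.

```python
from typing import Optional

# Threshold stated in the text for the Play alone motivation.
PLAY_THRESHOLD = 30.0


def select_entertainment_behaviour(entertainment_motivation: float,
                                   user_present: bool) -> Optional[str]:
    """Pick a behaviour that reduces the entertainment deficit.

    Below or at the threshold, no entertainment behaviour is triggered
    (represented here as None, an assumption of this sketch).
    """
    if entertainment_motivation <= PLAY_THRESHOLD:
        return None
    # With a user present the model favours playing together over dancing alone.
    return "Play with the user" if user_present else "Dance"
```

For instance, a motivation of 35 units with no user in sight would select Dance, while the same motivation with a user present would select Play with the user.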
The Social need variable increases with time but much slower than the Entertainment variable. For this reason, the social need never presents a significant deficit to elicit the Request affection behaviour. As shown in Fig. 4a, b, the perception of the user results in a variety of events in the robot's internal state. In the first place, when the user is present, the robot changes its behaviour and instead of dancing, it reduces the entertainment deficit by Playing with the user. This fact leads the user to execute different actions, such as hitting or stroking the robot and responding to the questions when playing together. This experience provokes substantial variations in hormonal levels, leading the robot to react differently, depending on the stimuli it perceives.
Thus, when the user hits the robot, the brain norepinephrine and arginine vasopressin levels increase significantly. The size of the increase depends on the intensity of the stimulus. In contrast, a stroke increases oxytocin levels, leading to a considerable decrease in the robot's social need. If the user correctly answers the robot's questions, dopamine levels rise, fostering the continuation of play. Otherwise, dopamine levels drop, driving the robot to stop playing with the user.
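The stimulus-to-hormone relations in this paragraph can be written as a lookup table. The sketch below is a simplified illustration under stated assumptions: the unit gains, the linear scaling with intensity, the clamping to a 0 to 100 range, and all identifiers are hypothetical. Only the direction of each effect (which substance rises or falls with which stimulus) is taken from the text.

```python
from typing import Dict

# Sign of the effect comes from the text; the magnitudes (1.0) are hypothetical.
STIMULUS_EFFECTS: Dict[str, Dict[str, float]] = {
    "hit": {"norepinephrine": +1.0, "arginine_vasopressin": +1.0},
    "stroke": {"oxytocin": +1.0},
    "correct_answer": {"dopamine": +1.0},
    "wrong_answer": {"dopamine": -1.0},
}


def apply_stimulus(levels: Dict[str, float], stimulus: str,
                   intensity: float) -> Dict[str, float]:
    """Return new hormone levels after a stimulus of the given intensity.

    The change is proportional to the stimulus intensity (assumed linear)
    and levels are clamped to [0, 100] (assumed range).
    """
    updated = dict(levels)
    for hormone, gain in STIMULUS_EFFECTS.get(stimulus, {}).items():
        new_level = updated.get(hormone, 0.0) + gain * intensity
        updated[hormone] = min(100.0, max(0.0, new_level))
    return updated


# Hypothetical baseline levels.
baseline = {"dopamine": 50.0, "oxytocin": 50.0,
            "norepinephrine": 50.0, "arginine_vasopressin": 50.0}
after_hit = apply_stimulus(baseline, "hit", intensity=10.0)
after_wrong = apply_stimulus(baseline, "wrong_answer", intensity=5.0)
```

A stronger hit produces a larger rise in norepinephrine and arginine vasopressin, while a wrong answer lowers dopamine. Stimuli that are not in the table leave the levels unchanged.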
The social behaviour exhibited by the robot is strongly responsive to the user it faces. Thus, interacting with a hostile user will drastically affect its well-being, leading to a failure to reduce the social deficit of the robot (oxytocin is not adequately secreted). On the other hand, interacting with a friendly and positive user causes the opposite case (high oxytocin release), leading to a benefit for the robot since its social needs are correctly met. In these situations, the behaviour of the robot adapts to maintain the best possible conditions, attempting to play with the user when possible and complaining about the first user's negative behaviour. As we discuss later, it would be interesting to analyse the robot's social behaviour in greater depth, including more behaviours with which to react to adverse situations.
Figure 5 shows the evolution of the processes implicated in emotion for five consecutive hours of the fourth day (from 12 am to 7 pm), when Mini is interacting with people. Figure 5a shows the environmental stimuli that the robot can perceive and that influence Mini's emotional state. Figure 5b shows the levels of secretion of dopamine, serotonin, and brain norepinephrine, the three primary emotional regulators in our model. Figure 5c shows how emotions are triggered as a consequence of how stimuli vary the hormonal levels. Figure 5d depicts the emotional reactions of the robot that activate particular momentary behaviours.

Emotional responses
The model we propose in Sect. 3 defines that dopamine levels rise when the robot fulfils a goal, with the user's presence, when the robot receives strokes, and with correct user answers. Dopamine levels decrease when the robot does not fulfil its goals or the user provides wrong answers. Thus, as Fig. 5b shows, dopamine levels rise with positive social stimuli and drop with negative ones. Similarly to dopamine, serotonin also reacts to the stimuli the robot perceives. Considering these facts, serotonin levels increase with the user's presence, correct answers, strokes, and when the robot fulfils a goal. In contrast, the perception of negative stimuli, such as wrong answers or not fulfilling a goal, does not affect serotonin levels. Brain norepinephrine levels rise with negative stimuli such as hits or wrong answers, and this is linked to aggressive reactions. However, neuroendocrine studies [32] demonstrate that brain norepinephrine levels also rise with arousing stimuli, thus promoting social play. For this reason, in our model, brain norepinephrine rises when the robot perceives the user and fulfils a goal.
The previous variations suggest that both dopamine and serotonin lead to positive emotions like joy, high norepinephrine levels lead to anger, and abnormally low levels of these monoamines lead to sadness. In addition, surprise is triggered when both serotonin and brain norepinephrine are at high levels. Our model represents these relations in Table 6, taking inspiration from Lövheim's [54] cube of emotions. Figure 5b, c shows these relationships. As mentioned in the modelling section, each emotion has a threshold value of 20 units to prevent it from becoming active at low intensities. Figure 5d shows the robot's reactions to environmental changes combined with emotional responses. In our model, the definition of the motivational states we presented in Table 7 permits the robot to react to particular stimuli. Thus, if the stimuli elicit a specific emotion, the current dominant emotion will modulate the robot's expressiveness. An example of this can be seen by looking at Fig. 5 as a whole. The reaction to the user's presence in Fig. 5d occurs when the robot perceives the user after a while. Even though perceiving the user elicits joy, its level of intensity is below its activation threshold, so the reaction is not modulated by the emotion. Nevertheless, when the robot reacts to strokes and hits, the emotions joy and anger, respectively, become active, modulating the expression of both behaviours.
Drawing on affective computing studies and modelling the effects of dopamine, serotonin, and brain norepinephrine on the robot's emotions allows it to react emotionally to particular situations. However, including emotions requires not only their generation but also their expression, requiring specific gestures and expressions that correctly transmit what the robot feels. Considering the positive results we obtained while evaluating the robot's emotional expressions [29], we believe that including new emotions such as fear will improve the robot's responsiveness and HRI. However, this also requires us to endow our model with new variables in the loop, which will increase the system's complexity.

Managing stress
The last scenario we show regarding the operation of our neuroendocrine motivational system concerns stress management. Figure 6a shows the stimuli that stress the robot (user presence, hits, ambient noise, strokes). Figure 6b shows the evolution of the stress hormones. These hormones represent the interactions occurring in the HPA axis of a human being. First, CRH and arginine vasopressin activate as a response to arousing situations. These hormones stimulate ACTH, which in turn increases cortisol levels. Then, the cortisol elicits adrenal norepinephrine and epinephrine.
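The order in which the stress hormones activate can be illustrated with a toy cascade. The sketch below is not the stress model of Sect. 3: the discrete first-order update, the `gain` and `decay` parameters, the averaging of CRH and arginine vasopressin (abbreviated AVP here) as the drive for ACTH, and the zero initial state are all assumptions. Only the chain itself (arousal drives CRH and arginine vasopressin, which drive ACTH, which drives cortisol, which drives adrenal norepinephrine and epinephrine) follows the text.

```python
from typing import Dict, List

HORMONES = ("CRH", "AVP", "ACTH", "cortisol", "norepinephrine", "epinephrine")


def hpa_step(state: Dict[str, float], arousal: float,
             gain: float = 0.2, decay: float = 0.1) -> Dict[str, float]:
    """Advance the toy HPA cascade by one step (hypothetical dynamics).

    Every hormone is updated from the previous state, so a stimulus needs
    one step per stage to travel down the chain.
    """
    def update(name: str, drive: float) -> float:
        return max(0.0, state[name] + gain * drive - decay * state[name])

    return {
        "CRH": update("CRH", arousal),
        "AVP": update("AVP", arousal),
        "ACTH": update("ACTH", 0.5 * (state["CRH"] + state["AVP"])),
        "cortisol": update("cortisol", state["ACTH"]),
        "norepinephrine": update("norepinephrine", state["cortisol"]),
        "epinephrine": update("epinephrine", state["cortisol"]),
    }


# Start from rest and apply a constant, hypothetical arousal level.
state = {name: 0.0 for name in HORMONES}
history: List[Dict[str, float]] = []
for _ in range(4):
    state = hpa_step(state, arousal=10.0)
    history.append(state)
```

In this toy run, CRH and AVP respond in the first step, ACTH in the second, cortisol in the third, and the adrenal hormones in the fourth, which reproduces the ordering of the cascade described above.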
The stress level in the social robot Mini over four consecutive days is shown in Fig. 6c. This graph shows how the stress levels exhibit a precise circadian rhythm that peaks in the early morning (around 9 am). This rhythm is a consequence of the rhythms exhibited by the hormones implicated in the stress response (CRH, arginine vasopressin, ACTH, and cortisol). As can be seen in the graph, excessive ambient noise and other arousing stimuli, such as the user's presence or the robot being hit, notably increase the stress levels in the robot. However, the threshold level is not attained, so the motivation to Relax never becomes dominant in the robot during this experiment (Fig. 6).
Contrary to what we hypothesised regarding stress management, the threshold value to elicit relaxing behaviours was never exceeded. Consequently, the robot was never stressed enough to rest and avoid stressful situations. We believe this issue may be caused by the definition of the model and the high number of hormones that generate stress. We wanted to thoroughly emulate the human stress response in the model, which implied using four hormones (CRH, arginine vasopressin, ACTH, and cortisol). Nevertheless, this seems to be an important drawback since it makes it difficult to adjust the generation of the behaviour of the robot in this situation. To solve this problem, a precise adjustment of the hormones implicated in this process is required to make the robot more responsive to stressful environmental stimuli.

Resulting behaviour
The previous results analysed high-level biological functions emulated in the social robot Mini. Nevertheless, all these processes take place in the robot simultaneously. As Fig. 7 shows, during the four days of the experiment, the robot could maintain an optimal internal state even if no user interacted with it, reducing its internal deficits by selecting the most appropriate behaviour. In this situation, the robot tended to Dance as a mechanism to entertain itself because the user was not present. As we can see, the robot could not socialise or play, and spent most of the time awake but not doing any specific activity.
Fig. 6 The stress response in the social robot Mini over three consecutive natural days
User absence translates into a robot behaviour that looks more repetitive. As we mentioned in the previous sections, adaptation typically occurs with unexpected situations provoked by changes in the robot's perceptions. However, since in our scenario the user is the major stimulus for the robot and no interaction occurred during the first three days, the robot's behaviour shows a repetitive pattern that correctly reduces the robot's needs as they arise.
On the other hand, as we show in Fig. 8, user presence makes the robot adapt its behaviour to different situations. In this case, if the user negatively interacts with the robot, it reacts by complaining about the hits received, avoiding the interaction. However, when the robot interacts with a positive user who provides positive emotions, the interaction is more fruitful, driving cooperative playing and meeting the robot's social needs. Consequently, as Fig. 8c shows, when the user is present, the range of behaviours triggered by the robot is more varied as a consequence of the interaction.

Discussion and limitations
Recent advances in neuroscience have allowed researchers to achieve valuable results in modelling nature in robots. However, robots are still far from behaving and expressing themselves as humans do. The model presented in this work seeks to establish the primary relations in the main biological functions occurring in human beings and emulate them in our social robot Mini.
Unlike previous motivational architectures for artificial systems [13,54,92,93], our model is grounded on an artificial neuroendocrine system as the first step towards autonomous behaviour. It integrates the most important processes of human behaviour, defining robust relations between them, always supported by biological foundations. Thus, our architecture is different from those mentioned above in connecting several very particular types of processes, such as the perception of stimuli, neuroendocrine responses, physiological and psychological (emotional) processes, motivation, and behaviour. Our system anticipates unexpected stimuli using ultradian and circadian daily patterns inspired by animal studies. In addition, the relations between the different processes allow the agent to present an adaptive behaviour that depends on its internal and external circumstances.
One of the significant advantages of the system that we have presented is its scalability and modularity, being easily extendable to include new processes that improve the execution and expression of Mini's behaviour. We opted for modelling the functions considering the capabilities and behaviours of Mini [76], as it is the platform that we are currently working on. We believe that endowing social robots with biological processes may improve the naturalness and acceptance of the robot during a long-lasting interaction.
Regarding the limitations of our architecture, the most important one is related to the design phase. Although plenty of neuroscience literature describes human biology, there is no mathematical definition of the biological processes since they vary across individuals and depend on many factors. Thus, designers may need to approach the modelling of the system by considering the behaviour they want to emulate in the robot and empirically determining the attributes of the processes. Moreover, the complexity of the modelling increases with the number of processes included in the model. In addition, the processes we can model are limited by the software and hardware components, since the robot's behavioural capabilities depend on such devices.

Conclusion
In this paper, we have presented a robot architecture based on biological foundations. The model emulates an artificial neuroendocrine system as the origin of natural behaviour, affecting the evolution of multiple biological processes that control the robot's functions. These artificial biological functions define the motivational and affective components of behaviour, shaping how the robot acts according to its internal condition and environmental state.
The results demonstrate that modelling artificial life drawing on nature provides multiple benefits. For example, the robot's behaviour is not deterministic and rests upon different internal and external stimuli. Additionally, it resembles human behaviour, so the potential users of the robot might perceive the robot as a more natural and capable system. Finally, the model opens a wide range of possibilities to develop new lines of research towards endowing artificial embodied agents with variable and appropriate behaviours and expressions, seeking to improve human-robot interactions, especially in social robotics.
Our future research aims to expand the robot's possibilities by including new processes in the model. For example, it would be interesting to study the effect of the biological model during social human-robot scenarios, evaluating whether the robot's users perceive the robot as more autonomous and capable. Following this line, it would also be worthwhile to study how deficiencies in hormonal levels affect the robot's state, serving as case studies about how bodily disorders arise in human beings and how other humans interpret them when interacting with machines presenting these deficiencies.