Relationship-specific and relationship-independent behavioural adaptivity in affiliation and bonding: A multi-adaptive dynamical systems approach

Humans often adapt their behaviour toward each other when they interact. From a neuroscientific perspective, such adaptivity can involve mechanisms based on adaptive connections (synaptic plasticity) and adaptive excitability thresholds (nonsynaptic plasticity) within the mental or neural network concerned. It is, however, often left unaddressed which types of adaptation are specific to a relationship and which are more general across multiple relationships. We focus on this differentiation between relationship-specific and relationship-independent adaptation in social interactions. We analysed computationally how an interplay of adaptive relationship-specific and relationship-independent mechanisms occurs within the causal pathways for social interaction. As part of this, we also cover the context-sensitive control of these types of adaptation (adaptive speeds and strengths of adaptation), which is sometimes termed higher-order adaptation or metaplasticity. The model was evaluated by a number of explored simulation runs in which, within a group of four agents, each agent randomly has episodes of interaction with one of the three other agents. The analysis of the (stochastic) simulation results shows a strong dependence of adaptation on the extent of social interaction: more social interaction leads to more adaptation of the interaction behaviour. This holds both for the short-term and long-term first-order adaptation and for the second-order adaptation, which is long-term.


Introduction
During social interaction, humans often adapt their interaction behaviour toward each other. This behavioural adaptivity may concern short-term effects such as affiliation but also long-term effects such as bonding. Interaction can involve different modalities like movement, affect and verbal actions. Central mechanisms in the adaptive causal pathways for social interaction can have the form of synaptic plasticity based on adaptive connections (Bear & Malenka, 1994) and nonsynaptic plasticity based on adaptive excitability thresholds (Debanne, Inglebert, & Russier, 2019). An interesting issue is which of these mechanisms are specific for the other person and which are more general for multiple persons. In the latter case, transference takes place: what you learn in one relationship will also influence how you interact in another relationship. Attachment theory claims that adaptations acquired in one relationship also have their effects in other relationships (Salter Ainsworth & Bowlby, 1965; Salter Ainsworth, 1967; Salter Ainsworth, Blehar, Waters, & Wall, 1978; Salter Ainsworth & Bowlby, 1991; Bowlby, 2008). In accordance with that claim, we assume that at least part of the considered behavioural adaptivity in social interaction is relationship-independent. With this assumption as a point of departure, we analysed computationally how an interplay of adaptive relationship-specific and relationship-independent mechanisms occurs within the causal pathways leading to behavioural adaptivity during social interaction.
As part of this, in the current paper we analyse from a neuroscience perspective which learning or adaptation principles can apply to the mechanisms within the different types of causal pathways indicated above. We cover not only relationship-specific and relationship-independent adaptation as forms of first-order adaptation, but also second-order adaptation to control these types of first-order adaptation in a context-sensitive manner, in particular via adaptive speeds and strengths of adaptation. As such causal pathways in general involve connections between states and excitability of states, both synaptic and nonsynaptic forms of plasticity from neuroscience are exploited: adaptive connection weights and adaptive excitability thresholds. In addition, metaplasticity (second-order adaptation) is incorporated to control these forms of plasticity depending on the context. We evaluated the model by a number of explored runs where, within a group of four agents, each agent randomly has episodes of interaction with one of the three other agents and due to these episodes displays both short-term and long-term behavioural adaptivity.

Background knowledge
In this section, we elaborate on the main assumptions behind the multi-adaptive dynamical systems approach applied here. These main assumptions are grounded in the neuroscience literature. Interpersonal synchrony is related to both nonsynaptic plasticity (Debanne et al., 2019) and synaptic plasticity (Bear & Malenka, 1994). Such nonsynaptic plasticity functions on short-term time scales, whereas synaptic plasticity functions on long-term time scales. More specifically, we assume the following mechanisms for the causal pathways leading to behavioural adaptivity in social interaction.
Behavioural adaptation following interpersonal synchrony can take the form of both short-term and long-term changes in behaviour. While many studies have focused on short-term adaptive shifts in coordination (Hove & Risen, 2009; Tarr et al., 2016; Wiltermuth & Heath, 2009), there is also evidence for long-term effects of interpersonal synchrony. For example, developmental research has shown that movement synchrony between infants and caregivers can predict social interaction patterns in the child several years later (Feldman, 2007). Similarly, research on close relationships has found that early patterns of interpersonal synchrony can predict subsequent indicators of relationship functioning, such as converging patterns of cortisol variation in spouses over a period of years (Laws, Sayer, Pietromonaco, & Powers, 2015). Attachment theory also considers that behavioural adaptivity acquired in one relationship can influence interaction behaviour in other relationships, which can be seen as a form of transference; the theory has been studied in numerous settings and has commonly been applied to therapeutic contexts (Bowlby, 2008; Feeney, 2004; Fonagy, 2001; Fraley & Hudson, 2017; Fraley, Hudson, Heffernan, & Segal, 2015; Johnson, 2019; Marmarosh, Markin, & Spiegel, 2013; Salter Ainsworth & Bowlby, 1965; Salter Ainsworth, 1967; Salter Ainsworth et al., 1978; Salter Ainsworth & Bowlby, 1991). According to attachment theory, the first attachment relationship between children and their primary caregiver significantly impacts children's future functioning in relationships. While attachment theory was originally mainly applied to intimate, romantic relationships, more recently friendship relationships have also been addressed, for example (Cronin, Pepping, & O'Donovan, 2018; Heinze, Cook, Wood, Dumadag, & Zimmerman, 2018; Welch & Houser, 2010). Moreover, the neuroscientific mechanisms underlying attachment theory have gained attention (Beckes & Coan, 2015; Beckes, Ijzerman, & Tops, 2015; Coan, 2016; White, Kungl, & Vrticka, 2023).
The current article analyses this in relation to basic mechanisms from neuroscience and their plasticity. To enable an individual to adapt behaviour upon being in synchrony, it is assumed that the individual has the ability to detect synchrony patterns across different modalities, as introduced in (Hendrikse, Treur, Wilderjans, Dikker, & Koole, 2023c). From an overall conceptual analysis perspective, mental states representing detected synchrony can be considered mediating mental states in the pathway from synchrony patterns to changed interaction behaviour (Treur, 2007a; Treur, 2007b).

Using mechanisms from neuroscience: Synaptic and nonsynaptic plasticity, and metaplasticity
The field of neuroscience distinguishes between synaptic plasticity and nonsynaptic plasticity (also called intrinsic plasticity). Synaptic plasticity is a classical concept explaining how the strength of a connection between different neural states adapts over time (Bear & Malenka, 1994; Hebb, 1949; Shatz, 1992; Stanton, 1996). In contrast, nonsynaptic adaptation of the intrinsic excitability of neural states has been addressed more recently and has been linked to homeostatic regulation (Chandra & Barkai, 2018; Debanne et al., 2019; Zhang, Li, Gao, & Niu, 2021; Boot, Baas, Van Gaal, Cools, & De Dreu, 2017; Williams, O'Leary, & Marder, 2013; Lisman, Cooper, Sehgal, & Silva, 2018). Nonsynaptic plasticity and synaptic plasticity can work together, which allows the multi-adaptive dynamical systems model to cover simultaneously working mechanisms for different types of behavioural adaptivity: short-term adaptation and long-term adaptation. In this way, an interplay between synchrony, short-term adaptivity, and long-term adaptivity occurs via multiple circular pathways. Synchrony does not only lead to short-term adaptation but through this also intensifies interaction, leading to more synchrony and strengthening long-term adaptation. Conversely, long-term adaptivity strengthens interaction, leading to more synchrony and consequently stronger short-term adaptivity. Metaplasticity (Abraham & Bear, 1996) controls plasticity in a context-sensitive manner. Second-order adaptation (adaptation of the adaptation) has been included in our model to allow for more realistic and context-sensitive control of plasticity. This has been applied in particular to address adaptive speeds and strengths of the different types of first-order adaptation that form the plasticity.
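The interplay of time scales described above can be illustrated with a minimal two-variable sketch. All names, rates, and target values below are our own illustrative assumptions, not part of the model specification: a threshold tau stands in for fast nonsynaptic adaptation (it drops during interaction and recovers afterwards) and a weight w stands in for slow synaptic adaptation (it rises and persists).

```python
def two_timescales(episodes=3, on=100, off=100, dt=0.1):
    """Toy sketch of the two time scales: fast nonsynaptic adaptation
    (excitability threshold tau drops while interacting, recovers
    afterwards) combined with slow synaptic adaptation (connection
    weight w rises during interaction and persists between episodes)."""
    tau, w = 0.8, 0.2
    for _ in range(episodes):
        for _ in range(on):                 # interaction episode
            tau += 1.0 * (0.2 - tau) * dt   # fast drop of the threshold
            w += 0.02 * (1.0 - w) * dt      # slow strengthening of the weight
        for _ in range(off):                # no interaction
            tau += 0.25 * (0.8 - tau) * dt  # threshold recovers; w persists
    return tau, w
```

After a few episodes the short-term effect on tau has largely faded while the long-term effect on w remains, mirroring the affiliation/bonding distinction; running more episodes yields a larger long-term effect.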

Research questions and hypotheses
The general focus is on how interaction with other agents leads to adaptation of the interaction behaviour. The main research question considered is: How can an adaptive agent model for interacting agents be designed that captures how, depending on the interaction durations that occur with other agents, an agent achieves relationship-specific and relationship-independent adaptivity concerning interaction behaviour?
In the experimental setup chosen, the interaction episodes and their durations form the independent variable (chosen per simulation run in a stochastic manner), whereas the agent's processes and adaptations depend on that. The main research question can be detailed by the following more specific hypotheses that will be verified through simulations:

A. Adaptation in basic interaction behaviour can be observed.
C. Two types of adaptation will occur, (i) other-agent-specific and (ii) other-agent-independent: (i) More experiences with interactions with a given agent A lead to faster and stronger adaptation in interactions with A in the future. (ii) More experiences with interactions with any agent A lead to faster and stronger adaptation in interactions with any agent B in the future (transference).
D. Such adaptation occurs both (i) in the short-term and (ii) in the long-term: (i) interaction within episodes; (ii) interaction over multiple episodes.
E. The relation between the extents of interaction and adaptation can be observed in two ways: (i) within a given simulation run, over time adaptation becomes stronger after more interaction has taken place; (ii) in a comparative manner, simulation runs that show longer interaction durations will also show more adaptation compared to simulation runs with shorter interaction durations.
F. More experiences with interactions will in general not only lead to adaptations in interaction behaviour (first-order adaptation effect) but also to faster and stronger adaptation in interactions in the future (second-order adaptation effect). This happens (i) within a given simulation and (ii) more in simulations where more interaction occurs.

Adaptive dynamical systems modeled by Self-Modeling networks
The dynamical systems view as a useful perspective on cognition has been put forward, for example, in (Ashby, 1960; Port & Van Gelder, 1995; Schurger & Uithol, 2015). Dynamical systems are usually considered as state-determined systems, which are systems for which each (current) state at some time t determines the future states after t, e.g., (Ashby, 1960; Van Gelder & Port, 1995). For example: 'the fact that the current state determines future behaviour implies the existence of some rule of evolution describing the behaviour of the system as a function of its current state. For systems we wish to understand we always hope that this rule can be specified in some reasonably succinct and useful fashion.' (Van Gelder & Port, 1995, p. 6).
Concerning the issue of how 'this rule can be specified in some reasonably succinct and useful fashion', a first-order difference or differential equation format is often suggested (Ashby, 1960; Port & Van Gelder, 1995). However, causally oriented views have also been developed, which may provide more intuitive conceptualisations than mathematical difference or differential equations. For example, it has been shown by several applications how a causal format can be useful to model complex dynamic and adaptive processes, as long as dynamics of states (Treur, 2016) and adaptivity of causal relations and their characteristics (Treur, 2020a) are taken into account in that format. Moreover, it has been mathematically proven in a general manner in (Treur, 2021; Hendrikse, Treur, & Koole, 2023b) that any smooth dynamical system has a canonical representation in network format based on temporal-causal networks and that any smooth adaptive dynamical system has a canonical representation in self-modeling temporal-causal network format. It is this format that is used in the current paper.

Modeling dynamical systems by a temporal-causal network format
A temporal-causal network model, as characterized by Treur (2020a, 2020b), is composed of nodes (also referred to as states) denoted by X and Y with real numbers (usually in the [0, 1] interval) as activation values X(t) and Y(t) over time t. A specific model is defined by the following network characteristics:

• Connectivity characteristics
There exist connections from states X to states Y, each with a specified weight denoted by ω_X,Y.

• Aggregation characteristics
For any state Y, a combination function c_Y(..) specifies how the causal impacts ω_Xi,Y X_i(t) on Y from its incoming connections from states X_i are aggregated.

• Timing characteristics
Each state Y has a speed factor denoted by η_Y, which determines the rate at which it changes for a given causal impact.
Note that sometimes the parameters in combination functions are left implicit and c_π_Y,Y(V_1, …, V_k) is simply denoted by c_Y(V_1, …, V_k). For the network's dynamics, these network characteristics are incorporated into a canonical difference equation (or related differential equation) used for both simulation and analysis purposes:

Y(t + Δt) = Y(t) + η_Y [c_Y(ω_X1,Y X_1(t), …, ω_Xk,Y X_k(t)) − Y(t)] Δt   (1)

where X_1 to X_k represent the states from which Y receives incoming connections. Eq. (1) bears a resemblance to the format of recurrent neural networks.
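As a concrete illustration, Eq. (1) can be simulated for a single state Y receiving one incoming connection; all values below are arbitrarily chosen, and the identity function stands in for the combination function c_Y.

```python
def simulate_y(steps=2000, dt=0.05, eta_y=0.4, omega=0.8, x=1.0):
    """Euler simulation of Eq. (1) for one state Y with a single
    incoming connection from a constant state X: Y gradually moves
    toward the aggregated impact c_Y(omega * X), with c_Y = identity."""
    y = 0.0
    for _ in range(steps):
        agg_impact = omega * x                 # c_Y(omega_X,Y * X(t))
        y = y + eta_y * (agg_impact - y) * dt  # Eq. (1)
    return y
```

Y converges to the aggregated impact omega * x; the speed factor eta_y only affects how fast this equilibrium is approached.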
The software environment outlined in Treur (2020a, Ch. 9) includes a library of approximately 70 basic combination functions for use in the model design process. Examples of these functions are listed in Table 1. Note that when the function alogistic_σ,τ(V_1, …, V_k) is applied to intended positive values, it is cut off at 0 if its formula produces a negative value. However, this cut-off is not applied when the function is applied to intended negative values. Overall, these concepts enable the declarative design of network models and their dynamics based on mathematically defined functions and relations. By instantiating this general difference equation (1) by proper values for the network characteristics for all states Y, the software environment runs a system of n difference equations, where n is the number of states in the network.
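The exact formula of the advanced logistic sum is not given in this text; the closed form below is the form commonly used in this modeling tradition (stated here as an assumption), normalised so that an input sum of 0 gives 0 and large input sums approach 1, including the cut-off at 0 described above.

```python
import math

def alogistic(sigma, tau, values):
    """Advanced logistic sum alogistic_sigma,tau(V1, ..., Vk) with
    steepness sigma and threshold tau, cut off at 0 for intended
    positive values as described in the text."""
    s = sum(values)
    raw = ((1 / (1 + math.exp(-sigma * (s - tau))))
           - 1 / (1 + math.exp(sigma * tau))) * (1 + math.exp(-sigma * tau))
    return max(0.0, raw)
```

The normalisation terms guarantee alogistic(σ, τ, [0]) = 0 while keeping the output in [0, 1) for nonnegative input sums.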
The four functions in Table 1 are all used in the introduced model. To obtain stochastic generation of runs, a fifth function is used in addition, called scenario-generation. Over time this function determines in a stochastic manner the environmental circumstances: which pairs of agents can interact with each other over which time periods and for which modalities.
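The scenario-generation function itself is not specified in detail here. The following sketch is our own stand-in (the function name, the exponential duration distribution, and all parameters are assumptions) showing the kind of stochastic episode schedule such a function produces:

```python
import random

def generate_scenario(agents, t_end, mean_len=20, seed=0):
    """Illustrative stand-in for scenario-generation: repeatedly pick
    one agent and one random partner and give them an interaction
    episode of random duration. (The model's own function additionally
    selects which modalities are involved in each episode.)"""
    rng = random.Random(seed)
    episodes, t = [], 0
    while t < t_end:
        a = rng.choice(agents)
        b = rng.choice([x for x in agents if x != a])
        length = max(1, round(rng.expovariate(1.0 / mean_len)))
        episodes.append((t, t + length, a, b))  # (start, end, agent, partner)
        t += length
    return episodes
```

Because the schedule is the independent variable of the experiments, a fixed seed makes individual runs reproducible while different seeds give the stochastic variation across runs.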

Modeling adaptive dynamical systems by self-modeling networks
Table 1
The combination functions used in the introduced network model. (Columns: Notation, Formula, Parameters. The listed functions include the advanced logistic sum alogistic_σ,τ; parameters include order n and scaling factor λ.)

S.C.F. Hendrikse et al.

Dynamical systems for real-world cases often involve a number of parameters that have to be adaptive for changing contextual circumstances, resulting in an adaptive dynamical system. For network models representing adaptive dynamical systems, such parameters have the form of network characteristics: connection weights, combination functions and their parameters, and speed factors, as indicated in Section 3.1. To achieve a transparent declarative description for adaptive networks, self-modeling networks (also called reified networks) have been introduced (Treur, 2020a; Treur, 2020b). In a self-modeling network, an adaptive network characteristic is represented by an additional network state (a self-model state); for example, a first-order self-model state W_X,Y represents the adaptive weight ω_X,Y of the connection from state X to state Y. This concept corresponds to the notion of synaptic plasticity in neuroscience, which, for example, is described by (Bear & Malenka, 1994; Stanton, 1996). In contrast to this synaptic type of plasticity, an example of nonsynaptic plasticity (Chandra & Barkai, 2018; Debanne et al., 2019; Zhang et al., 2021) is an adaptive excitability threshold τ_Y of a state Y, which can be represented by a first-order self-model state T_Y. For a base state Y whose speed factor, combination function parameters, and incoming connection weights are represented by self-model states H_Y, P_i,Y, and W_Xi,Y, the dynamics of Y follow the universal difference equation

Y(t + Δt) = Y(t) + H_Y(t) [c_Y,P_Y(t)(W_X1,Y(t) X_1(t), …, W_Xk,Y(t) X_k(t)) − Y(t)] Δt   (2)

This universal difference equation is incorporated in the dedicated software environment. By instantiating this general difference Eq. (2) by proper values for the network characteristics for all base states Y and instantiating Eq. (1) for all self-model states, the software environment runs a system of n difference equations, where n is the number of base states plus self-model states in the network. For the adaptive dynamical system model introduced in the current paper, n = 357.
Note that this difference Eq. (2) is not exactly in the standard format of a temporal-causal network, as H_Y is not a constant speed factor and also the P- and W-values are not constant. However, it can be rewritten into the temporal-causal network format when the following general combination function c*_Y(..) is defined:

c*_Y(H, P_1, …, P_m, W_1, …, W_k, V_1, …, V_k, V) = H c_Y,P_1,…,P_m(W_1 V_1, …, W_k V_k) + (1 − H) V   (3)

Here the P_i are variables for adaptive parameters of the combination function, H is a variable for adaptive speed factors, the W_i are variables for adaptive connection weights, and the V_i and V are variables for state activations. Based on this universal combination function, consider the following difference equation:

Y(t + Δt) = Y(t) + [c*_Y(H_Y(t), P_1,Y(t), …, P_m,Y(t), W_X1,Y(t), …, W_Xk,Y(t), X_1(t), …, X_k(t), Y(t)) − Y(t)] Δt   (4)

This is indeed in temporal-causal network format (with speed factor 1) as defined by (1). Now note that using (3), Eq. (4) can be rewritten as follows:

Y(t + Δt) = Y(t) + H_Y(t) [c_Y,P_Y(t)(W_X1,Y(t) X_1(t), …, W_Xk,Y(t) X_k(t)) − Y(t)] Δt   (5)

where P_Y(t) = (P_1,Y(t), …, P_m,Y(t)).

This (5) is exactly difference Eq. (2) above; this confirms that with the chosen combination function c*_Y(..) from (3) the self-modeling network indeed has a temporal-causal network format as in (1).
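This equivalence can also be checked numerically with a small sketch; as our own simplification, the adaptive parameters P are omitted and the sum function stands in for c_Y.

```python
def c_star(h, weights, values, v):
    """Universal combination function c*_Y of Eq. (3), here without
    the adaptive parameters P and with c_Y = sum for simplicity."""
    return h * sum(w * x for w, x in zip(weights, values)) + (1 - h) * v

def step_eq2(y, h, weights, xs, dt):
    # Eq. (2)/(5): explicit adaptive speed factor H_Y(t)
    return y + h * (sum(w * x for w, x in zip(weights, xs)) - y) * dt

def step_eq4(y, h, weights, xs, dt):
    # Eq. (4): speed factor 1, with c*_Y doing the work
    return y + (c_star(h, weights, xs, y) - y) * dt
```

For any choice of values, the two one-step updates coincide, which is exactly the point of rewriting (4) into (5).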
As the above shows that a self-modeling network is itself also a temporal-causal network model, this self-modeling network construction can easily be applied iteratively to obtain multiple orders of self-models at multiple (first-order, second-order, …) self-model levels. For example, a second-order self-model may include a second-order self-model state H_W_X,Y representing the speed factor η_W_X,Y for the dynamics of first-order self-model state W_X,Y, which in turn represents the adaptation of connection weight ω_X,Y. So, this second-order self-model state H_W_X,Y represents the adaptation speed (or learning rate) for that connection weight. Similarly, a second-order self-model may include a second-order self-model state H_T_Y representing the speed factor η_T_Y for the dynamics of first-order self-model state T_Y, which in turn represents the adaptation of excitability threshold τ_Y for Y; so this second-order self-model state H_T_Y represents the adaptation speed for that excitability threshold. Moreover, also higher-order W-states can be used, of the form W_Z,W_X,Y representing the weight of the connection from a given state Z to state W_X,Y, or of the form W_Z,T_Y representing the weight of the connection from a given state Z to state T_Y. All such second-order self-model states can be used to modulate an adaptation process for W_X,Y or T_Y over time and thereby to exert control over different aspects of first-order adaptation processes, in line with the notion of metaplasticity from neuroscience to control plasticity in a context-sensitive manner, e.g., (Abraham & Bear, 1996). In this way such second-order adaptation effects contribute to (the speed and strength of) first-order adaptation.
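As a sketch of such control (the hebb_mu combination form and all rates below are assumptions chosen for illustration, not the model's actual parameter values), a first-order W-state can adapt Hebbian-style while a second-order H_W-state slowly raises its adaptation speed as long as the connected states are co-active:

```python
def co_adapt(x, y, steps=500, dt=0.1, mu=0.9, eta_h=0.05):
    """First-order adaptation of a connection-weight self-model state
    w (Hebbian, with persistence factor mu), whose speed is itself the
    adaptive second-order state h_w: a minimal metaplasticity sketch."""
    w, h_w = 0.2, 0.1
    for _ in range(steps):
        hebb = x * y * (1 - w) + mu * w     # hebb_mu combination value
        w += h_w * (hebb - w) * dt          # first-order: speed is h_w
        h_w += eta_h * (x * y - h_w) * dt   # second-order: co-activity raises speed
    return w, h_w
```

With sustained co-activation both the weight and its learning rate rise ("learning to adapt faster"); without activation the learning rate sinks back and the weight barely moves.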
In the current paper, this multi-level self-modeling network perspective will be applied to obtain a second-order adaptive network architecture addressing both first- and second-order adaptation. As an example, the adaptation control level can be used to make the adaptation speed context-sensitive, as addressed in metaplasticity literature such as (Abraham & Bear, 1996; Robinson, Harper, & McAlpine, 2016). This has indeed been done for the introduced model, as will be discussed in Section 4.

The multi-adaptive dynamical systems model
In this section, the agent model (modelled through multi-adaptive dynamical systems) is introduced. First, the conceptual assumptions underlying the model are discussed for the base behaviour (Section 4.1) and for the adaptivity (Section 4.2). Next, an overview of the model is presented (Section 4.3). After this, the states used in the model are discussed in more detail, together with the related network characteristics, subsequently for the base level (Section 4.4), the first-order self-model level (Section 4.5), and the second-order self-model level (Section 4.6).

The base multimodal interaction behaviour
All considered agents share the same design. For the base level of this design, see Fig. 1. An overview of explanations of all base states for agent A can be found in Table 2. Recall from Section 3 that all states have real numbers in the interval [0, 1] as activation values over time. Such states are depicted in Fig. 1 by circle or diamond shapes and their causal relations by arrows. All of the agents regularly receive input from a stimulus s in the world for certain time periods, meaning that the activation of the world state for s alternates between 0 (s absent) and 1 (s present), with s an instance of a stimulus (which is kept abstract). For the chosen scenario, this external stimulus influences any agent A by a causal relation to the sensing state indicated by sense_s,A and subsequently by that to the sensory representation state of this stimulus indicated by rep_s,A (a kind of mental image), which in turn affects the agent's own internal preparation states for the three considered modalities: prep_m,A, prep_b,A and prep_v,A, with m, b and v instances of a movement, affective response and a verbal action of agent A, respectively (also kept abstract). For any other agent B with which agent A interacts, the model for agent A also includes sensing and sensory representation states for the three considered modalities for B. For these states, it is modelled (1) how mirroring takes place (Iacoboni, 2008), and (2) how internal simulation of the prepared action with a backward effect takes place (Hesslow, 2002).
As discussed in Section 2.1, during interaction forms of interpersonal synchrony often emerge, and this affects the interaction behaviour in an adaptive manner. Therefore, in the model introduced here it is assumed that for each of the modalities and each other person such emerging synchrony can be detected. This is modeled by states such as intersyncdet_B,m,A, intersyncdet_B,b,A, intersyncdet_B,v,A (depicted by diamond shapes in Fig. 1). These states have high activation values in the interval [0, 1] if the indicated synchrony is detected and low values when no synchrony is detected. Fig. 2 shows how the agents interact with each other.
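The synchrony detection mechanism itself is kept abstract in this description. Purely as an illustration (our own simplification, not the model's actual detection mechanism), a detected-synchrony value in [0, 1] could be derived from how closely two recent activation traces track each other:

```python
def sync_score(trace_a, trace_b):
    """Map the mean absolute difference between two activation traces
    (values in [0, 1]) to a synchrony score: 1 for identical traces,
    0 for maximally different ones."""
    assert trace_a and len(trace_a) == len(trace_b)
    mad = sum(abs(a - b) for a, b in zip(trace_a, trace_b)) / len(trace_a)
    return 1.0 - mad
```

A state such as intersyncdet_B,m,A would then take high values when the movement traces of A and B stay close over a recent time window and low values otherwise.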

Conceptual assumptions behind the introduced adaptive dynamical system model
The aim of this article is to analyse in which ways agents change their behaviour during social interactions with multiple others over time and how these changes can be related to internal mental learning mechanisms. The chosen mental learning mechanisms are based on what is known from neuroscience, in particular on synaptic and nonsynaptic plasticity (Bear & Malenka, 1994; Chandra & Barkai, 2018; Debanne et al., 2019; Stanton, 1996) and metaplasticity (Abraham & Bear, 1996). Overall, the assumptions made on adaptive changes that can take place during social interaction concern three types of mental processes: (a) perceiving, (b) responding, and (c) executing or expressing actions. As discussed in Section 3, first-order self-model states can be used to model adaptation. In particular, W_X,Y-states and T_Y-states can be applied to model synaptic and nonsynaptic types of adaptation of connections from states X to Y and of excitability thresholds for states Y used to model the mental processes (a) to (c) mentioned above. Thus, stronger and more sensitive activation of the states used to model these three types of mental processes can be obtained.
More specifically, if sensor states and sensory representation states for others are used to model the way someone else is perceived, then strengthening the connections from sensor states to representation states and lowering the excitability thresholds for representation states can lead to stronger and more sensitive forms of perceiving (a). Furthermore, if in addition preparation states and the connections from representation states to them are used to model the basis of the responding process, then strengthening these connections and lowering the excitability thresholds for these preparation states will lead to stronger and more sensitive responding (b). Finally, if in addition (action) execution states and the connections from preparation states to them are used to model the execution process, then strengthening these connections and lowering the excitability thresholds for these execution states will lead to stronger and more sensitive acting and expressing (c).
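The combined effect of a stronger incoming connection and a lower excitability threshold can be seen in a one-line logistic response; the numeric values below are chosen only for illustration.

```python
import math

def response(omega, tau, sigma=8.0, stimulus=0.6):
    """Activation of a state receiving causal impact omega * stimulus,
    aggregated by a plain logistic with threshold tau: a higher weight
    omega and a lower threshold tau both push the activation up."""
    return 1 / (1 + math.exp(-sigma * (omega * stimulus - tau)))
```

For example, response(0.5, 0.7) (before adaptation) is weak, while response(0.9, 0.4) (a strengthened connection combined with a lowered threshold) is strong for the same stimulus.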
We assume here that synaptic plasticity is used to model long-term changes (over multiple episodes), as in bonding, whereas nonsynaptic adaptation of excitability thresholds is used to model short-term changes (within one episode), as in affiliation. For all of the above forms of adaptation, it has to be considered to what extent they are relationship-specific or relationship-independent. For example, when an agent learns to respond more strongly to an agent within an interaction, will that only change the behaviour in interaction with that specific agent, or also when interacting with other agents? In terms of the model to be designed, which (adaptive) state and connection characteristics play a role only when interacting with one and the same agent, and which play a role when interacting with any other agent? Below we make these distinctions for the different forms of learning addressed. Still other assumptions are made about second-order adaptivity. Within neuroscience this is described by metaplasticity, which can control plasticity in a context-sensitive manner (Abraham & Bear, 1996). In the first place, this concept is used to address adaptation of the adaptation strength for first-order self-model states W_X,Y and T_Y. As discussed in Section 3, second-order self-model states W_Z,W_X,Y and W_Z,T_Y can be used to make the strength of adaptation of W_X,Y and T_Y adaptive. This can be used to model how over time a person learns to adapt more strongly. Secondly, the concept of metaplasticity is used to address adaptation of the adaptation speed for W_X,Y and T_Y. Again, as discussed in Section 3, second-order self-model states H_W_X,Y and H_T_Y can be used to make the speed of adaptation of W_X,Y and T_Y adaptive. This can be used to model how over time a person learns to adapt faster. In this way, both stronger adaptation and faster adaptation over time are modelled by these second-order self-model states. This second-order adaptation effect can substantially contribute to first-order adaptation. In total, two first-order adaptation mechanisms have been identified, modelled through W-states for long-term adaptation and T-states for short-term adaptation. Moreover, these two learning mechanisms have been applied to the three types of mental processes (a) to (c) mentioned above, which makes six types of first-order adaptation. In addition, four second-order adaptation mechanisms have been identified, modelled through W_Z,W_X,Y-, W_Z,T_Y-, H_W_X,Y-, and H_T_Y-states. Further assumptions have been made about which of these 10 learning types work within a given episode and which over multiple episodes, and which are other-agent-specific and which relationship-independent; see Table 3 for an overview and Fig. 3 and Fig. 4 for a 3D graphical overview of the model. Here, the first four rows (in purple) indicate the assumptions that all four types of second-order adaptivity are relationship-independent and long-term. So their influence on the first-order adaptation is a long-term relationship-independent influence. Furthermore, rows 5 to 7 (in blue) indicate the assumptions that synaptic plasticity is used as a form of long-term learning, relationship-specific for representing and responding, and relationship-independent for executing actions. In contrast, the last three rows (in blue) indicate the assumptions that nonsynaptic plasticity of excitability thresholds is used as a form of short-term learning, again relationship-specific for representing and responding, and relationship-independent for executing actions. Note that the second-order adaptation also provides long-term relationship-independent effects, even on the relationship-specific short-term first-order adaptation modeled by the T-states for the representation states.

Overview of the model
The designed adaptive dynamical system model takes into account actions of agents for three different modalities: moving m, expressing affect b and talking v. In total, the model covers four agents and their interactions and adaptivity, modelled according to a second-order adaptive dynamical system and represented in a network-oriented format; see Fig. 3 and Fig. 4 for a graphical overview and Table 4 for explanations of all types of states. Overall, it has 357 states X_i, which in view of Section 3 means that its dynamics are based on a system of 357 difference or differential equations which are instantiations of (1) for the chosen network characteristics, including the specific combination functions shown in Table 1. Within the overall model, each of the four agents A to D is modelled by 79 states.

Table 3
Overview of the assumptions on first-order and second-order relationship-specific and relationship-independent adaptivity and on short-term and long-term learning. (Column headers include 'Order and type of adaptivity' and 'Long-term'.)

Timing characteristics for base states
• World state ws_s has speed factor 0.5.
• The context states have speed factor 0, which makes them constant.

Network characteristics for the First-Order Self-Model level
Next, some more details of the network model's (connectivity, aggregation, and timing) characteristics for the states at the first-order self-model level are discussed.
Connectivity characteristics for first-order self-model states

An overview of explanations of all first-order self-model states for agent A can be found in Table 5. An overview of explanations of all second-order self-model states for agent A can be found in Table 6.

Network characteristics for the second-order self-model level
For more details of the model specification, see the Appendix in Section 9. For a full specification of the model as Linked Data, see https://www.researchgate.net/publication/363802843.

Research question and hypotheses operationalised for the introduced model
Recall the main research question and hypotheses A to F introduced in Section 2.3. Now that the model has been described, these hypotheses can be operationalised by relating them to the states of the model. We go through them one by one. The base states for representing, responding and executing are the rep-, prep-, and exec-states (note that the (inter)action execution states move_m, express_b, talk_v are also indicated by exec_m, exec_b, exec_v). Their activation values over time will be analysed to verify to what extent behaviour is adapted; see Table 7.

C. Two types of adaptation will occur, (i) other-agent-specific and (ii) other-agent-independent: (i) More experiences with interactions with a given agent A lead to faster and stronger adaptation in interactions with A in the future. (ii) More experiences with interactions with any agent A lead to faster and stronger adaptation in interactions with any agent B in the future (transference).
D. Such adaptation occurs both (i) in the short-term and (ii) in the long-term: (i) interaction within episodes; (ii) interaction over multiple episodes.
E. The relation between the extents of interaction and adaptation can be observed in two ways: (i) within a given simulation run, over time adaptation becomes stronger after more interaction has taken place; (ii) in a comparative manner, simulation runs that show longer interaction durations will also show more adaptation compared to simulation runs with shorter interaction durations.
F. More experiences with interactions will in general not only lead to adaptations in interaction behaviour (first-order adaptation effect) but also to faster and stronger adaptation in interactions in the future (second-order adaptation effect). This happens (i) within a given simulation and (ii) more in simulations where more interaction occurs.
First-order self-model T- and W-states and second-order self-model W_T-, H_T-, W_W- and H_W-states are indicative of the different types of adaptation that occur. The above hypotheses can be observed at the first-order and second-order self-model level as follows (see also Table 8).
First-order self-model T-states represent the adaptive excitability thresholds for certain base states within the model: the rep-, prep- and exec-states. Lower thresholds imply that a state can get a stronger activation (more sensitive, higher extent of excitability). First-order self-model W-states represent adaptive connection weights between two states, e.g., between sense and representation states. Higher connection weights imply that the connection between two states is stronger, resulting in a stronger activation of the target state. Globally, for these first-order self-model states we expect that:
• Within a given simulation run, the activation values of the T-states will become lower within each interaction episode and higher when no interaction episode for a specific relationship occurs; in contrast, the activation values of the W-states will in general become higher over time. For the W-states for prep-exec connections this will be irrespective of which relationships have interactions; for the other W-states this will be relationship-specific.
• In a comparative manner, on average the activation values of the T-states will be lower and those of the W-states higher in simulation runs with longer interaction durations. For the W-states for prep-exec connections this will be irrespective of which relationships have interactions; for the other W-states this will be relationship-specific.
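The direction of these two first-order effects can be illustrated with a minimal numerical sketch (the logistic activation form and all parameter values here are illustrative assumptions, not the model's actual combination functions):

```python
import math

def logistic(v, steepness=5.0, threshold=0.5):
    """Simple logistic activation: a higher input v or a lower threshold gives a higher output."""
    return 1.0 / (1.0 + math.exp(-steepness * (v - threshold)))

incoming = 0.6  # some aggregated impact arriving at a state

# Nonsynaptic plasticity (T-states): lowering the excitability threshold
# makes the same input produce a stronger activation.
more_excitable = logistic(incoming, threshold=0.3)
less_excitable = logistic(incoming, threshold=0.7)
assert more_excitable > less_excitable

# Synaptic plasticity (W-states): a higher connection weight gives the target
# state a stronger impact from the same source activation.
source = 0.8
strong = logistic(0.9 * source)
weak = logistic(0.4 * source)
assert strong > weak
```

In both cases the same input yields a higher target activation, which is exactly what lower T-values and higher W-values are expected to produce in the simulations.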
Second-order self-model H_T- and H_W-states control the speed of the first-order adaptation of the T- and W-states, and second-order self-model W_T- and W_W-states represent the strengths of the connections from the synchrony detection states to the first-order T- and W-states. Globally, for these second-order self-model states we expect that:
• Within a given simulation run, for all relationships the activation values of all second-order self-model states will increase over time, irrespective of which relationships have interactions, except those of the W_T-states, which instead will decrease over time.
• In a comparative manner, on average the activation values of all second-order self-model states will be higher in simulation runs with longer interaction durations, irrespective of which relationships are concerned, except those of the W_T-states, which instead will be lower.
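The role of an H-state as an adaptive speed factor can be sketched as follows (a hedged illustration: the linear adaptation rule and all values are assumptions, not the model's full specification):

```python
# An H-state acting as a speed factor for the first-order state it controls:
# a higher H-value makes the controlled state adapt faster towards its target.
def adapt(first_order_value, target, H, dt=0.5, steps=100):
    """Euler integration of dV/dt = H * (target - V): a higher H means faster adaptation."""
    v = first_order_value
    for _ in range(steps):
        v += H * (target - v) * dt
    return v

slow = adapt(0.2, target=1.0, H=0.01)  # low second-order speed state: slow learning
fast = adapt(0.2, target=1.0, H=0.10)  # high second-order speed state: fast learning
assert fast > slow
```

This is why increasing second-order state values over time amount to faster and stronger first-order adaptation in later interactions.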

Simulation setup and example
In this section, the setup of the simulation experiments and one illustrative example simulation run are discussed. In Section 7, an extensive analysis is presented for a collection of 20 simulation runs performed according to this setup. For all settings of the simulations, see the full specification of the model available as supplementary material and as Linked Data at https://www.researchgate.net/publication/363802843.

Design of the simulation experiments
A stimulus occurs regularly during certain periods: 40 time units without the stimulus followed by 40 time units with the stimulus, and this pattern is repeated every 80 time units. Moreover, interaction-enabled periods happen in a random manner for certain modalities for randomly chosen pairs of agents. After each interaction-enabled period for two agents, a new pair of agents is chosen at random from A, B, C and D for the next interaction-enabled period (see Fig. 5). In addition, the duration until the next interaction-enabled period (interaction break length: the duration of the intervals between the purple boxes in Fig. 5) and the duration of the next interaction-enabled period (interaction length: the length of the purple boxes in Fig. 5) are both chosen at random from the interval [0, 50].
Furthermore, the enabled modalities for the interaction in the next interaction-enabled period are chosen. Each modality independently has a 5/6 ≈ 0.83 probability of being available. In other words, for each interaction-enabled period there is a 0.58 chance that all three modalities are available and a 0.42 chance that only one or two modalities are available (and a 0.005 chance that none is available, in which case the agents in principle would be able to interact during some time, but have no modality to do so).
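This stochastic episode generation can be sketched as follows (a simplified illustration of the setup described above; the function names and the use of Python's random module are our own choices, not part of the model specification):

```python
import random

AGENTS = ["A", "B", "C", "D"]
MODALITIES = ["m", "b", "v"]

def stimulus_on(t, period=80, on_len=40):
    """Common stimulus pattern: 40 time units off, then 40 on, repeated every 80."""
    return (t % period) >= (period - on_len)

def next_episode(rng):
    """One randomly generated interaction-enabled episode."""
    dyad = rng.sample(AGENTS, 2)                               # random pair of agents
    gap = rng.uniform(0, 50)                                   # break before the episode
    length = rng.uniform(0, 50)                                # episode duration
    mods = [m for m in MODALITIES if rng.random() < 5 / 6]     # each modality ~0.83
    return dyad, gap, length, mods

rng = random.Random(0)
timeline, t = [], 0.0
while t < 4000:                                                # one run of 4000 time units
    dyad, gap, length, mods = next_episode(rng)
    t += gap
    timeline.append((t, t + length, tuple(dyad), tuple(mods)))
    t += length
```

Each run in the experiment corresponds to one such randomly generated timeline of dyads, break lengths, interaction lengths, and enabled modalities.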

Table 8
The first-order and second-order self-model states that are examined to analyse adaptation.

We have conducted 20 independent runs, each with a total time duration of 4000 time units and step size Δt = 0.5. We chose this approach because of the stochastic setup of the experiment, in order to evaluate the consistency of behaviour under multiple circumstances.

Evaluation of a simulation run
First, we zoom in on how the agents' states develop over time. To this end, we evaluate the patterns of the activation values of the states of agent A from simulation 1. In the other 19 simulations (available on request), we have seen that the patterns of adaptivity are roughly the same across simulations. This means that, overall, the described findings from simulation 1 are representative for all simulations.

Base level states
The sensing states are only activated during interaction episodes, and their activation values generally become higher within interaction episodes later in time (Fig. 6). The representation states already get activated within the stimulus intervals, regardless of whether interaction is enabled (Fig. 6). During the interaction episodes when no stimulus is present, the representation states do not become higher, because both agents then do not have enough input to trigger their actions. However, later in time, for example around time unit 800, the representation states are extra activated during combined interaction and common stimulus episodes, when the sensing states are activated as well. This highlights the role interaction plays, in combination with the stimulus, in the increased activation of the representation states. The patterns of the preparation and execution states tend in general to be the same: they align with the activations of the common stimulus and, when on top of this common stimulus interaction between agents A and C happens, their activation levels elevate further (Fig. 7). The interpersonal synchrony states only get activated during the interaction episodes and seem to achieve higher peaks over time, indicating that agents A and C become more attuned to each other. These findings are in line with hypotheses A and B: adaptation of representing (rep-states), responding (prep-states) and expressing (exec-states) occurs for the three modalities. This adaptation is shown through the elevated activation values of the relevant states over the simulation when interaction between agents A and C (in combination with a common stimulus) happened.

First-order adaptation: T-states
The two types of first-order T-states roughly follow the same patterns for all modalities: they show downward jumps during the interaction episodes of agents A and C, and these jumps become larger over time until time 1600 (Fig. 8). In contrast, the T_exec-states sometimes display small downward jumps when no interaction between A and C occurs, for example around time 1950. An explanation for this might be that not only the interpersonal synchrony states of agents C and A influence the T_exec-states of agent A, but also the detected interpersonal synchrony states of agent A towards agents B and D (T_exec-states are other-agent-independent). These findings are in line with the expectations that within simulations the activation values of T-states become lower during interaction episodes with a specific agent and higher when the interaction episode has finished, meaning that during interaction with a specific agent the relationship-specific adaptivity emerges. This is in accordance with hypotheses D(i) and E(i) from Section 5.

First-order adaptation: W-states
All first-order self-model W-states show a kind of breakthrough around time 600: a sharp increase when two relatively long interaction episodes follow closely on each other (Fig. 9). After that sharp increase, the W_sense-rep,x,C,A- and W_rep-prep,x,C,A-states balance around an activation level of 0.8, with jumps towards 1 during interaction episodes. More extremely, the W_prep-exec,x,C,A-states already reach an activation level around 1 at time 700, at the end of the sharp increase, meaning their maximum has already been achieved and interaction episodes cannot cause an extra effect anymore. These results are in line with hypotheses C, D(ii) and E(i). Note, however, that this is not a gradual monotonic development in which the W-states become higher over time: the main increase is rather sudden (tipping points are reached around time 600), and the W_sense-rep,x,C,A- and W_rep-prep,x,C,A-states are not completely monotonic during the non-interaction episodes, as observed by the drops in activation values from the moment interaction ends (especially from time 800 onwards).

Second-order adaptation
The H_T,A-state of agent A increases during each interaction episode with any other agent, and drops again when no interaction with any agent occurs. Around time 650, there is a sharp increase in the activation values, and thereafter there are again fluctuations that show the same pattern between interaction and non-interaction episodes (Fig. 10). The patterns for the H_W,A-states and the W_W,A-states are roughly similar, although the oscillations are less pronounced and their increases more gradual. The W_T,A-state declines over time, towards negative activation levels around −0.2. All these patterns are in line with hypothesis F(i) that within a given simulation run the activation values of all second-order self-model states will increase over time, except those of the W_T-states, which instead will decrease over time. This indicates that relationship-independent adaptivity can emerge from interactions with specific agents. Namely, each of these second-order self-model states of agent A adapts during the simulation, regardless of the other agent with whom agent A interacts, and all second-order self-model states have their effect in a relationship-independent manner. Moreover, these second-order effects induce relationship-independent influences on first-order adaptivity.

Statistical analysis of the simulation outcomes
In Section 6, we evaluated the adaptation patterns of one typical simulation run at the base level, the first-order level and the second-order level. Within that simulation run it can be observed how over time more and more adaptations take place, based on the social interaction episodes and the synchronies emerging within them. Although this run was claimed to be typical, a single run cannot show to what extent the adaptations indeed depend on the extent to which social interaction takes place, as that extent is fixed within the run. Therefore, as an additional step, in the current section we quantify the main patterns for all 20 generated simulation runs in a statistical manner. These 20 runs do show variation in the extent of social interaction. In this way more evidence is obtained about the behaviour of the model in relation to the hypotheses formulated in Section 2 and related to the model in Section 5, in particular, for example, E(ii) and F(ii). This can be done especially in a comparative manner by comparing runs with more interaction to runs with less interaction, so that the extent of interaction can be considered an independent variable over the 20 runs and it can be analysed how other factors depend on it.

Variation in total interaction durations
The 80 agents in the 20 simulation runs had on average a total individual interaction duration of 2512 time units (range: 1580 time units), with a standard deviation of 345 time units, and their individual interaction durations seemed to be normally distributed; see Fig. 11. The smallest and largest total interaction durations over the 80 agents equaled 1650 and 3230 time units, respectively. These results indicate that there was enough variation in the interaction durations and that the average of 2512 time units over whole simulation runs of 4000 time units was only slightly above half of the run duration (2000 time units). These results enable a further evaluation of the performance of the agents' mechanisms in relation to their total interaction durations (see Section 7.3).

Averaged learning effects between two phases
To evaluate the hypotheses, in all runs we zoom in on the first half (time 0-2000, Phase 1) and the second half (time 2000-4000, Phase 2); see Fig. 12 and Table 9. All activation levels of the different types of states were averaged over time and over the three modalities m, b and v. Additionally, the synchrony detection activation levels were averaged over all interpersonal synchrony states. Regarding the base-level states (rep, prep, exec and sync), it appears that their activation levels have on average increased from Phase 1 to Phase 2 by a factor of more than 2.5 (see Table 9). Although the average activation values of the synchrony detection states are generally low in both phases (because each dyad only interacts during a small part of the time) and the differences are hardly visible in Fig. 12, the synchrony activation levels still increased by a factor of around 2, from 0.026 to 0.054. These results at the base level are in line with hypothesis A from Sections 2 and 5 that the activation values (between 0 and 1) of the base states will increase over time within simulations.
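The phase comparison above can be sketched in a few lines (a hedged illustration: the averaging scheme follows the description above, but the trace below is synthetic and not an outcome of the actual simulations):

```python
# Average an activation trace over each half of a run and compute the
# Phase 2 / Phase 1 increase factor, as done for Table 9.
def phase_factor(trace, dt=0.5, t_split=2000.0):
    """Mean over Phase 2 (t >= t_split) divided by the mean over Phase 1 (t < t_split)."""
    split = int(t_split / dt)
    phase1, phase2 = trace[:split], trace[split:]
    return (sum(phase2) / len(phase2)) / (sum(phase1) / len(phase1))

trace = [0.1] * 4000 + [0.3] * 4000    # 8000 samples at dt = 0.5 cover times 0-4000
assert abs(phase_factor(trace) - 3.0) < 1e-9
```

A factor above 1 then indicates that, on average, the state in question became more strongly activated in the second half of the run.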
Concerning the first-order self-model W- and T-states, the activation levels of all W-states display an increasing pattern and those of all T-states a decreasing pattern from Phase 1 to 2; see Fig. 12. The overall factors are approximately 3 and 0.9, respectively; see Table 9. These results are overall in line with our expectations formulated in hypotheses C to E. Higher activation levels of W-states demonstrate that the adaptive connection weights became stronger, and lower activation levels of T-states that the target states can get a stronger activation (more sensitive target states: enhanced excitability). Although the overall decrease by a factor of 0.92 for the T-states does not seem that large, it fits the hypothesis well. We expected that the values of the T-states of given agents would decline when they interact, but would become higher when they are not interacting, because this type of adaptation works in the short term, only within interaction episodes. Since the values of the T-states are averaged over all episodes (interaction and non-interaction episodes) and in the non-interaction periods they stay high as they should, the adaptation effects are flattened out in the averaging process.
The outcomes for all second-order self-model W_W-, H_W-, W_T- and H_T-states are in line with hypothesis F(ii) as well. The activation values of the W_W-, H_W- and H_T-states are all elevated in Phase 2 compared to Phase 1, with factors ranging from 2.19 to 3.09. In contrast, the mean (negative) values of the W_T-states dropped by a factor of 3.07 from Phase 1 to Phase 2, so decreasing over time. These results for the second-order self-model states have the following second-order adaptive effects, independent of the relationship:
• the speeds of adaptivity for the W-states and T-states increase over time,
• the strengths of the connections from the synchrony detection states to the W-states increase over time, and
• the strengths of the connections from the synchrony detection states to the T-states decrease over time.
This shows both second-order relationship-independent adaptivity and the relationship-independent influences on first-order adaptivity induced by it.
Note that all described differences between activation levels of states in Phase 1 and Phase 2 hold not only for the mean values, but also for the mean values minus and plus the standard deviation (see Fig. 12). This is an extra indication that the hypotheses are confirmed.

Averaged learning effects versus overall interaction duration
Next, we want to relate the adaptive effects more explicitly to the extent of interaction during a simulation run. Therefore, we created scatterplots for the 80 agents of the 20 runs, with for each agent the total interaction duration on the horizontal axis and the average activation levels of the different types of states on the vertical axis. Within each scatterplot we added the trendline and determined its slope and its R² coefficient.
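The trendline statistics used throughout this section can be sketched as follows (pure Python; the data points below are illustrative examples, not the actual simulation outcomes):

```python
# Least-squares trendline slope, R^2 and the Pearson correlation coefficient
# for a set of (interaction duration, mean activation) points.
def trend_stats(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # least-squares trendline slope
    r = sxy / (sxx * syy) ** 0.5          # Pearson correlation coefficient
    return slope, r * r, r                # for simple regression, R^2 = r^2

durations = [1650, 2000, 2500, 3000, 3230]      # example total interaction durations
activations = [0.20, 0.24, 0.27, 0.33, 0.35]    # example mean activation levels
slope, r2, r = trend_stats(durations, activations)
assert slope > 0 and 0.0 <= r2 <= 1.0
```

A positive slope and a substantial correlation then indicate that agents with more interaction also show stronger activation of the state concerned.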

Effects of the learning on the base states
First, we analyse how the adaptive effects on the base-level states relate to the interaction duration. For each of the four considered types of base states, scatterplots were created: in Fig. 13 for the representation and preparation states and in Fig. 14 for the execution and interpersonal synchrony states; see also Table 10. Indeed, the trendlines all show positive slopes. The R² coefficients are between 0.19 and 0.27. Moreover, as listed in Table 10, Pearson correlation coefficients were determined, which varied from 0.44 to 0.51. This shows that the way adaptivity results in changes for the base states strongly depends on the extent of interaction.

Fig. 14. The base-level effects for the mean values for the execution states (upper graph) and the interpersonal synchrony detection states (lower graph) against the interaction durations (both for times 0-4000) over all 80 individuals.

Table 10
Trendline slopes, R² coefficients and correlation coefficients for the mean values for all types of states against the interaction durations (both for times 0-4000) over all 80 individuals.

First-order adaptation: W-states
Next, the adaptivity shown in the first-order self-model states is addressed: see the scatterplots for the W-states in Fig. 15 and for the T-states in Fig. 16. All W-states show increasing trendlines. The R² coefficients in Table 10 are around 0.25-0.27. Note that the results for the W_sense-rep-states and W_rep-prep-states are the same, because for all four agents we have used a uniform structure for this, with only slight differences in the way they sense, as indicated in Section 8. The correlation coefficients are around 0.51-0.52; see Table 10. This also shows a clear dependence of the adaptations of the W-states on the extent of interaction.

First-order adaptation: T-states
Similarly, we analysed the T-states. Here the trendlines in Fig. 16 have negative slopes. These trends should indeed be negative, as the adaptive effect concerns lowering the T-values during interactions so that more sensitive responses can be generated. In this case the R² values in Table 10 vary from 0.17 to 0.31. Moreover, the correlation coefficients vary from −0.41 to −0.56. So, also here a strong dependence of the adaptivity on the extent of interaction is found.

Second-order adaptive effects
Finally, we analysed how the states for second-order adaptation depend on the interaction duration. For the scatterplots of the H_W- and H_T-states, see Fig. 17; for the W_W- and W_T-states, see Fig. 18. Also here it is found that for the H_W-, H_T- and W_W-states the trendlines are positive. The R² coefficients are around 0.28 and the correlation coefficients are around 0.53; see Table 10. Indeed, for these second-order adaptivity states a strong dependence on the extent of interaction is found as well. Regarding the fourth type of states, the W_T-states show a negative trend, which is indeed what they are supposed to display, as these activation values are negative and adaptation makes them more negative. Here the R² coefficient is 0.28 and the correlation coefficient is −0.53. These findings highlight how second-order relationship-independent adaptivity depends on the extent of social interaction, and thus the relationship-independent influences on first-order adaptivity induced by it also depend on this extent of social interaction.

Fig. 16. The first-order learning effects for the mean values for the T-states against the interaction durations (both for times 0-4000) over all 80 individuals: T_rep (upper graph), T_exec (lower graph).

Overall findings
We note that the adaptive changes in activation values for the T-states for first-order adaptation and the corresponding W_T-states for second-order adaptation may seem relatively small, as the T-states concern short-term adaptation to contextual circumstances that only occasionally occur. This implies that long periods are covered in the averages in which no adaptation takes place, as the context does not ask for it. Moreover, for threshold values (represented by these T-states) small differences often already have a substantial effect. Therefore, most indicative are the three W-states for first-order adaptation and the first three states H_W, H_T and W_W for second-order adaptation. The Pearson correlation coefficients show convincing numbers: around 0.51 for the first-order adaptation W-states, around −0.49 for the first-order adaptation T-states, and around 0.53 and −0.53, respectively, for the second-order adaptation states H_W, H_T, W_W and W_T. Similarly, the R² values are around 0.26 for the first-order adaptation W-states, around 0.24 for the first-order adaptation T-states, and around 0.28 for the second-order adaptation states H_W, H_T, W_W and W_T (with a negative trend for the W_T-states). All these data indicate a strong dependence of the different forms of adaptivity on the durations of the social interaction episodes. In particular, this holds for the second-order adaptation states, which have their effects on first-order adaptation in a relationship-independent manner.

Discussion
In this paper, an adaptive agent-based dynamical system model was introduced for how persons develop during the social interactions they have. It incorporates how interaction behaviour changes in the short term during interaction episodes and in the long term over multiple interaction episodes. Furthermore, it addresses social interaction in multiple relationships and transference between them: how behaviour learned in one relationship can also be carried over to other relationships, as described, for example, in attachment theory developed by Mary Salter Ainsworth and John Bowlby (Salter Ainsworth & Bowlby, 1965; Salter Ainsworth, 1967; Salter Ainsworth et al., 1978; Salter Ainsworth & Bowlby, 1991; Bowlby, 2008).

Fig. 17. The second-order learning effects for the mean values for the H_W- and H_T-states against the interaction durations (both for times 0-4000) over all 80 individuals: H_W (upper graph), H_T (lower graph).
The model was developed for four agents as an adaptive dynamical system specified by its canonical self-modeling network representation (Treur, 2021; Hendrikse et al., 2023b). The four agents do not interact all the time but only during episodes separated by periods without interaction. In each interaction episode only one dyad interacts, selected at random, while the modalities used, the durations of the interaction episodes, and the times between episodes are also chosen at random. In this way, we generated 20 simulation runs and statistically analysed, among other things, the dependence of the different types of adaptivity on the extent of social interaction. The outcomes of the analysis indeed show a strong dependence: more social interaction leads to more adaptation of the interaction behaviour, both for the short-term and long-term first-order adaptation and for the second-order adaptation, which is long-term.
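The canonical network format referred to above rests on a generic state update equation (cf. Treur, 2021); a minimal sketch, in which the state names, weights and speed factors are illustrative and not this model's actual settings:

```python
# Generic update: X_j(t+dt) = X_j(t) + eta_j * (c_j(impacts) - X_j(t)) * dt, where
# the impacts are the weighted activations of the incoming states omega_ij * X_i(t).
def step(acts, weights, speed, combine, dt=0.5):
    """One Euler step for all states of a connection-weighted network."""
    new = {}
    for j, x_j in acts.items():
        impacts = [w * acts[i] for (i, tgt), w in weights.items() if tgt == j]
        agg = combine(impacts) if impacts else x_j   # a state without inputs keeps its value
        new[j] = x_j + speed[j] * (agg - x_j) * dt
    return new

# Tiny two-state example: X2 follows X1 through a single connection of weight 0.8.
acts = {"X1": 1.0, "X2": 0.0}
weights = {("X1", "X2"): 0.8}
speed = {"X1": 0.0, "X2": 0.5}        # X1 is held constant here
for _ in range(50):
    acts = step(acts, weights, speed, combine=sum)
assert acts["X2"] > 0.7               # X2 converges towards 0.8
```

Self-models then arise by letting some states play the role of the weights or speed factors of other states, which is what the W-, T- and H-states do in this model.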
Many other modeling approaches in mental domains and beyond, such as (Samsonovich, 2020), use some form of dynamical system modeling. The modeling approach used in the current article, based on self-modeling networks, can cover any adaptive dynamical system, as is shown in (Treur, 2021; Hendrikse et al., 2023). As an illustrative example, equation (3) from (Samsonovich, 2020) can be remodeled in the network-oriented modeling approach used in the current article.
The extent of adaptation of the agents can be observed most clearly in their three W-states for first-order adaptation and the first three states H_W, H_T and W_W for second-order adaptation. It was found that the degree of adaptation of an agent depends significantly on the overall duration of the interaction episodes of this agent. In more detail, the collected simulation data indicate a strong dependence of the different forms of adaptivity on the durations of the social interaction episodes. This holds not only for the first-order adaptation states but also for the second-order adaptation states, which have their effects on first-order adaptation in a relationship-independent manner.
The work presented here has adopted some elements of earlier work. For example, modeling the emergence of synchrony during social interaction between agents was addressed in earlier work such as (Hendrikse, Treur, Wilderjans, Dikker, & Koole, 2022a; Hendrikse et al., 2023a). However, in these models no (subjective) internal detection of synchrony was incorporated, and in (Hendrikse et al., 2022a) no adaptivity was modeled, whereas in (Hendrikse et al., 2023a) another type of adaptivity was captured: of internal responding from representation to preparation. The idea of subjective synchrony detection in an agent-based model was introduced in (Hendrikse et al., 2023c), and subsequently the distinction between short-term and long-term behavioural adaptivity was introduced in (Hendrikse, Treur, Wilderjans, Dikker, & Koole, 2022b) and (Hendrikse et al., 2023b). However, these papers did not distinguish between relationship-specific and relationship-independent adaptation and were limited to a fixed dyad, as they did not include a context of interaction with multiple agents as studied here.
A specific computational model for attachment theory was contributed in (Hermans, Muhammad, & Treur, 2021; Hermans, Muhammad, & Treur, 2022). This model is based on the internal working models for the self and the other, following (Bartholomew & Horowitz, 1991). It does not address the distinction between short-term and long-term adaptivity, nor the differentiation between relationship-specific and relationship-independent adaptivity as in the current article. Moreover, there the second-order adaptation is limited to the speed of adaptation, whereas here second-order adaptation for the strength of adaptation is addressed as well.
For further work, note that in the model only a few variations for individual differences between the agents have been addressed. As follow-up research, it can be interesting to study such differences in much more detail. Furthermore, as mentioned, mechanisms for plasticity and metaplasticity from neuroscience have been used as a basis for the adaptive agent model. However, more work on the neuroscientific mechanisms behind attachment theory exists, for example in (Beckes & Coan, 2015; Beckes et al., 2015; Coan, 2016; White et al., 2023). This work can provide input for refinement of the presented model. As an example, Coan (2016) mentions that topics concerning neural systems supporting emotion, motivation and emotion regulation, filial bonding, familiarity, proximity seeking, and individual differences are important. Moreover, Beckes and Coan (2015) put forward processes such as person perception, familiarity, anticipatory motivation, behavioural organisation, consummatory behaviour, emotion regulation, and aversive motivation. Another perspective that might be interesting for further work is the multidimensional model of attachment proposed by Gagliardi (2022). This literature can thus provide many forms of inspiration to extend the current model. Our current adaptive agent model provides a solid basis to model further refinements of relationship-specific and relationship-independent social behaviour development.

Conclusion
Based on 20 runs, the outcomes of the analysis of the (stochastic) simulation results show a strong dependence of adaptation on the extent of social interaction: more social interaction leads to more adaptation of the interaction behaviour. This holds both for the short-term and long-term first-order adaptation and for the second-order adaptation, which is long-term. The modeling approach used is a general dynamical system modeling approach, which means that any other dynamical system model can be remodeled or interpreted using the network concepts used here.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1. Base-level connectivity for agent A: states and within-agent connections, with three modalities and (in dark pink) six synchrony detection states for interpersonal synchrony.
B's verbal actions v (indicated by sense_B,v,A and rep_B,v,A), emotions b (indicated by sense_B,b,A and rep_B,b,A) and movements m (indicated by sense_B,m,A and rep_B,m,A). The latter sensory representation states are directly connected in a reciprocal manner with the corresponding preparation states prep_m,A, prep_b,A and prep_v,A of agent A. These reciprocal causal links show (1) how the sensory representation states lead to mirroring of the same action by preparing for it. (a) Changes in the way others are perceived. (b) Changes in the way of responding to other persons. (c) Changes in the way of executing or expressing certain behaviours to other persons.

Fig. 2 .
Fig. 2. Base-level connectivity for three agents A, B, C: within-agent connections and between-agent connections.

Fig. 3 .
Fig. 3. The adaptive agent model: the base level and the first-order self-model level, with in the upper picture their upward interaction links (in blue) and in the lower picture their downward interaction links (in pink).

Finally, more details are discussed of the network model's characteristics (connectivity, aggregation, timing) for the second-order self-model level states.

Connectivity characteristics for second-order self-model states
• Connectivity for within-agent second-order self-model states
  o Within-agent connections from base-level within-agent states for sensing, execution and synchrony detection to the respective aggregation states have weight 0.1.
  o Within-agent connections from within-agent first-order self-model W-states for representing, responding and executing to the respective aggregation states have weight 0.1.
  o Within-agent connections from second-order within-agent self-model aggregation states for representing, responding and executing to the overall W-aggregation states have weight 0.1.
  o Within-agent connections from aggregation states for sensing, execution and synchrony detection and from overall W-aggregation states to H_W-, H_T-, W_W-, W_T-states have weight 1 for the first three (H_W, H_T, W_W) and weight −0.35 for the W_T-states. In addition, these H_W-, H_T-, W_W-, W_T-states have circular persistence connections to themselves with weight 1 for H_W- and H_T-states, 0.15 for W_W-states, and −1 for W_T-states.

Aggregation characteristics for second-order self-model states
• Aggregation for within-agent second-order self-model states
  o All second-order self-model aggregation states use the Euclidean function eucl from Table 1 of order n = 1 with a normalising scaling factor λ = the sum of the weights of the incoming connections.
  o All second-order self-model H_W-, H_T-, W_W-states use the logistic function alogistic from Table 1 with steepness σ = 5 and threshold τ = 0.8 for H_W- and H_T-states and threshold τ = 0.2 for W_W-states.
  o All second-order self-model W_T-states use the logistic function alogistic for negative values with steepness σ = 2 and threshold τ = 0.
Timing characteristics for second-order self-model states • Timing for within-agent second-order self-model states o All second-order self-model aggregation states use speed factor 1. o All second-order self-model H W -, H T -, W W -, W T -states use speed factors 0.005, 0.1, 0.005, and 0.009, respectively.
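The combination functions named above (eucl and alogistic) are defined in Table 1, which lies outside this excerpt; the sketch below renders them in Python using the standard forms from the self-modeling network literature, so the exact formulas should be read as assumptions rather than as the paper's own definitions:

```python
import math

def eucl(values, n=1, lam=1.0):
    # Euclidean combination function of order n with scaling factor lambda:
    # eucl(V1, ..., Vk) = ((V1^n + ... + Vk^n) / lambda)^(1/n)
    return (sum(v ** n for v in values) / lam) ** (1.0 / n)

def alogistic(values, sigma, tau):
    # Advanced logistic sum with steepness sigma and threshold tau,
    # shifted and rescaled so that an all-zero input maps to exactly 0
    # (assumed standard form; Table 1 is not part of this excerpt).
    s = sum(values)
    return ((1 / (1 + math.exp(-sigma * (s - tau))))
            - 1 / (1 + math.exp(sigma * tau))) * (1 + math.exp(-sigma * tau))
```

For instance, under this reading the H_W- and H_T-states above would aggregate their input with alogistic(values, sigma=5, tau=0.8), and the aggregation states with eucl(values, n=1, lam=sum of incoming weights).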

Fig. 4. The adaptive agent model: the base level and the first-order and second-order self-model levels with their downward interaction links (in pink).
A. Adaptation in basic interaction behaviour can be observed for: (a) representing the other agent; (b) responding to the representations of the other agent; (c) expressing (executing) interaction actions; (d) emergence of synchrony, shown by its detection. B. The adaptation can be considered for multiple modalities (movement m, express b, verbal v).

Fig. 5. Example of a timeline with interaction episodes based on randomly chosen dyads from {A, B, C, D}, up to three modalities from {m, b, v}, interaction durations from the interval [0, 50] and durations between interactions from the interval [0, 50].
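A random timeline of this kind can be generated in a few lines of code; the sketch below is our own minimal illustration of the sampling scheme in the caption (the function and variable names are ours, not from the model's implementation):

```python
import random

def generate_timeline(t_end=2000.0, agents=("A", "B", "C", "D"),
                      modalities=("m", "b", "v"), max_dur=50.0, seed=0):
    # Build interaction episodes: a randomly chosen dyad, a random
    # nonempty subset of up to three modalities, and episode durations
    # and between-episode gaps drawn uniformly from [0, max_dur].
    rng = random.Random(seed)
    t, episodes = 0.0, []
    while t < t_end:
        t += rng.uniform(0.0, max_dur)        # gap before the next episode
        dyad = tuple(rng.sample(agents, 2))   # who interacts
        mods = tuple(rng.sample(modalities, rng.randint(1, len(modalities))))
        dur = rng.uniform(0.0, max_dur)       # episode duration
        episodes.append((t, t + dur, dyad, mods))
        t += dur
    return episodes
```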

Fig. 7. The grey solid line is the common stimulus. The orange dashed lines are interaction episodes of agents A and C. The purple lines are the preparation states, the green lines the execution states and the red lines the interpersonal synchrony detection states for the three modalities (m, b and v) of A towards agent C.

Fig. 9. The grey solid line is the common stimulus. The orange dashed lines are interaction episodes of agents A and C. The overlapping red lines are the W_sense-rep,x,C,A and W_rep-prep,x,C,A states and the overlapping blue lines the W_prep-exec,x,C,A states.

Fig. 11. The distribution of individual total interaction durations over the whole population of 80 agents.

Fig. 12. Mean values and standard deviations for all types of states, compared for Phase 1 (time 0-2000, upper graph) and Phase 2 (time 2000-4000, lower graph).

Fig. 13. The base-level effects: the mean values for representation states (upper graph) and preparation states (lower graph) against the interaction durations (both for times 0-4000) over all 80 agents.

Fig. 15. The first-order learning effects: the mean values for the W-states against the interaction durations (both for times 0-4000) over all 80 individuals: W_sense-rep (upper graph), W_rep-prep (middle graph), W_prep-exec (lower graph).

Fig. 18. The second-order learning effects: the mean values for the W_W- and W_T-states against the interaction durations (both for times 0-4000) over all 80 individuals: W_W (upper graph), W_T (lower graph).

is modulation of excitability thresholds. This can be modeled for a given state Y by a self-model state T_Y which represents the excitability threshold τ_Y of Y. Similarly, the other network characteristics ω_X,Y, c_πY,Y(..) and η_Y can be made adaptive by including self-model states for them. For example, an adaptive speed factor η_Y can be represented by a self-model state named H_Y and an adaptive parameter π_i,Y can be represented by a self-model state P_i,Y. If for all network characteristics ω, π, η for all base-level states, respective self-model states W, P, H are introduced representing these network characteristics, then the canonical difference equation for the base-level states of the self-modeling network model is:

Y(t+Δt) = Y(t) + H_Y(t) [ c_P1,Y(t),P2,Y(t)( W_X1,Y(t) X1(t), …, W_Xk,Y(t) Xk(t) ) − Y(t) ] Δt

where X1, …, Xk are the states from which Y has incoming connections.
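One Euler simulation step of this canonical difference equation can be sketched as follows (a minimal sketch; the function and argument names are ours, and a single generic combination function stands in for the parameterised c_P1,Y,P2,Y family):

```python
def update_state(Y, X, W, H, combine, dt):
    # One step of the canonical difference equation:
    # Y(t+dt) = Y(t) + H_Y(t) * [ c(W_X1,Y(t) X1(t), ..., W_Xk,Y(t) Xk(t))
    #                             - Y(t) ] * dt,
    # where the speed factor H and the connection weights W may themselves
    # be values of (adaptive) self-model states at time t.
    impact = combine([w * x for w, x in zip(W, X)])
    return Y + H * (impact - Y) * dt
```

Feeding the current values of the H-, W- (and P-) self-model states into such a step at every time point is what makes the network self-modeling: the same equation covers both static and adaptive network characteristics.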

Table 2
Base states of the adaptive dynamical systems model for agent A. For the other agents, similar base states are used: X41 to X75 for B, X76 to X110 for C, X111 to X145 for D.

rep_B,v,A  Sensory representation state of A for verbal action v of B
X20 rep_C,m,A  Sensory representation state of A for movement m of C
X21 rep_C,b,A  Sensory representation state of A for expressed affective response b of C
X22 rep_C,v,A  Sensory representation state of A for verbal action v of C
X23 rep_D,m,A  Sensory representation state of A for movement m of D
X24 rep_D,b,A  Sensory representation state of A for expressed affective response b of D
X25 rep_D,v,A  Sensory representation state of A for verbal action v of D
X26 prep_m,A  Preparation state for movement m of A
X27 prep_b,A  Preparation state for affective response b of A
X28 prep_v,A  Preparation state for verbal action v of A
X29 intersyncdet_B,A,m  Interpersonal synchrony detection of A for executing m by B and A
X30 intersyncdet_B,A,b  Interpersonal synchrony detection of A for executing b by B and A
X31 intersyncdet_B,A,v  Interpersonal synchrony detection of A for executing v by B and A
X32 intersyncdet_C,A,m  Interpersonal synchrony detection of A for executing m by C and A
X33 intersyncdet_C,A,b  Interpersonal synchrony detection of A for executing b by C and A
X34 intersyncdet_C,A,v  Interpersonal synchrony detection of A for executing v by C and A
X35 intersyncdet_D,A,m  Interpersonal synchrony detection of A for executing m by D and A
X36 intersyncdet_D,A,b  Interpersonal synchrony detection of A for executing b by D and A

• From these 79 states per agent, 35 states are at the base level and model the base processes of perceiving other persons, responding to them and executing actions. These 35 states are sensor and execution states, representation and preparation states and synchrony detector states, all for different modalities.
• Furthermore, 33 states are at the first-order self-model level and model the agent's first-order adaptivity: W-states and T-states representing connectivity and aggregation characteristics of the base-level states.
• Aggregation for base agent states:
  o Sensing states use the Euclidean function eucl from Table 1 of order n = 1 and scaling factor λ = 1.1.
  o Interpersonal synchrony detector states use the synchrony detection function compdiff from Table 1.
  o All other base agent states use the logistic function alogistic from Table 1 with steepness σ = 5 and adaptive thresholds modelled by within-agent first-order self-model T-states.
• Connectivity for base agent states:
  o Within-agent connections for representing, responding and executing have adaptive weights modelled by within-agent first-order self-model W-states.
  o Within-agent connections for representing stimulus s and from representation states of s to preparation states have weight 1.
• Aggregation for base world states:
  o World state ws_s uses the stimulus repetition function stepmod from Table 1 with repetition ρ = 80 time units and step time δ = 40 time units.
  o Context states use the identity function via the Euclidean function eucl from Table 1 of order n = 1 and scaling factor λ = 1.
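The world-state and synchrony-detection functions mentioned above (stepmod and compdiff) are defined in Table 1, which lies outside this excerpt; the sketch below uses the forms commonly given for them (a periodic block stimulus and the complement of the activation difference), so both formulas should be read as assumptions:

```python
def stepmod(t, rho=80.0, delta=40.0):
    # Stimulus repetition: within each period of rho time units the
    # stimulus is 0 during the first delta time units and 1 afterwards
    # (assumed form of stepmod).
    return 0.0 if (t % rho) < delta else 1.0

def compdiff(v1, v2):
    # Synchrony detection as the complement of the difference between
    # two execution levels: 1 when the levels are equal, 0 when they
    # are maximally apart (assumed form of compdiff).
    return 1.0 - abs(v1 - v2)
```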

Table 4
Overview of the adaptive dynamical systems model.
• Second-order self-model H_W- and H_T-states representing the speed factors of the first-order self-model W-states and T-states

Table 5
First-order self-model T-states and W-states of agent A in the adaptive dynamical systems model: modelling excitability thresholds and connection weights. For the other agents, similar first-order self-model states are used: X179 to X211 for B, X212 to X244 for C, X245 to X278 for D.

W_rep-prep,v,B,A  First-order self-model state for the weight of A's internal connection from representing verbal action v of B to preparing for verbal action v
X158 W_rep-prep,m,C,A  First-order self-model state for the weight of A's internal connection from representing movement m of C to preparing for movement m
T_rep,C,v,A  First-order self-model state for the excitability threshold of A's sensory representation state rep_v,A for verbal response v of C
X173 T_rep,D,m,A  First-order self-model state for the excitability threshold of A's sensory representation state rep_m,A for movement m of D
X174 T_rep,D,b,A  First-order self-model state for the excitability threshold of A's sensory representation state rep_b,A for affective response b of D

Table 5 (continued)
T_rep,D,v,A  First-order self-model state for the excitability threshold of A's sensory representation state rep_v,A for verbal response v of D
X176 T_exec,m,A  First-order self-model state for the excitability threshold of A's execution state move_m,A for movement m
X177 T_exec,b,A  First-order self-model state for the excitability threshold of A's execution state express_b,A for affective response b
X178 T_exec,v,A  First-order self-model state for the excitability threshold of A's execution state talk_v,A for verbal response v

Table 6
Second-order self-model H_W-, H_T-, W_W- and W_T-states for agent A in the adaptive dynamical systems model: modelling the adaptation speed of the T-states and W-states and the connection weights of their incoming connections. For the other agents, similar second-order self-model states are used: X346 to X349 for B, X350 to X353 for C, X354 to X357 for D.
W_T state for A  Second-order self-model state for the weights of the incoming connections of the first-order self-model T-states for A (from the respective synchrony detector states)

Table 7
The interaction-related base states that are examined to analyse behaviour adaptation.
adaptivity for the T-states and W-states. Second-order self-model W_W-states and W_T-states control the strength of the incoming connections to the W-states and T-states, respectively, and through that the strengths of the activations of these W-states and T-states. For the second-order self-model states we expect that, likewise, more social interaction leads to stronger adaptation.

Table
Comparisons of mean values for types of states over all 80 individuals for Phase 1 and Phase 2.