Computational models of the “active self” and its disturbances in schizophrenia

The notion that self-disorders are at the root of the emergence of schizophrenia, rather than a symptom of the disease, is gaining traction in the cognitive sciences. This is in line with philosophical approaches that consider an enactive self, constituted through action and interaction with the environment. We therefore analyze different definitions of the self and evaluate various computational theories that bear on these ideas. Bayesian and predictive processing accounts are promising approaches for computational modeling of the "active self". We evaluate their implementation and challenges in computational psychiatry and cognitive developmental robotics. We describe how and why embodied robotic systems provide a valuable tool in psychiatry to assess, validate, and simulate mechanisms of self-disorders. Specifically, mechanisms involving sensorimotor learning, prediction, and self-other distinction can be assessed with artificial agents. This link can provide essential insights into the formation of the self and open new avenues in the treatment of psychiatric disorders.


Introduction
Computational psychiatry is a recent approach that aims at explaining psychiatric disorders on a computational level. Specifically, different disorders are modelled within a complex cognitive system by looking at aberrant computations in the brain. One disorder that has been of interest is schizophrenia, because of the effective translation of psychotic symptoms into a predictive coding framework (Sterzer et al., 2018; Heinz et al., 2019).
The World Health Organization defines schizophrenia as a "severe mental disorder characterized by profound disruptions in thinking, affecting language, perception, and the sense of self. It often includes psychotic experiences, such as hearing voices or delusions" (World Health Organization, 2001). Much research focus has been placed on its cognitive symptoms (Addington, Addington, & Maticka-Tyndale, 1991; Andreasen, Arndt, Alliger, Miller, & Flaum, 1995; Rector, Beck, & Stolar, 2005). Self-disorders can manifest as a withdrawal from observing and thinking (hyperautomaticity) (De Haan & Fuchs, 2010). In some patients this can lead to solipsistic delusions, in which only an individual ontological reality exists (Parnas & Sass, 2001), or the existence of other sentient beings and even of the external world outside one's conscious experience is denied completely (Bradley, 2016). For clinical case studies, see Parnas and Sass (2001). At the other extreme, self-disorders can also manifest in exaggerated sensation monitoring and self-consciousness (hyper-reflexivity) (Sass & Parnas, 2003). However, recent evidence suggests that the experience of hearing voices might be more common in the non-help-seeking population than previously thought. In the community of clairaudient psychics in particular, individuals were less distressed by their auditory hallucinations, and the reception from their peers was more likely to be positive, leading to less disruption of their social relationships. Interestingly, psychics also appeared to have more agency over their auditory hallucinations; specifically, they were able to control the onset and offset of their voice-hearing experience (Powers, Kelley, & Corlett, 2017). Overall, this implies that the continuum from health to disease might be larger than the WHO definition suggests.
While most psychiatrists treat schizophrenia as a mainly cognitive disorder, a growing body of literature views schizophrenia as a disorder rooted in a disconnectedness from one's body and a lack of intercorporeal attunement that ultimately leads to the loss of self and intentionality. Recognizing the fundamental role of embodiment is therefore crucial in understanding schizophrenia (De Haan & Fuchs, 2010; Fuchs, 2005). In fact, a phenomenological exploration in patients suffering from schizophrenia reveals multiple layers of disconnectedness from the world and a breakdown of their perception of the world into separate chunks rather than a synthesis thereof (De Haan & Fuchs, 2010; Stanghellini, 2004). This points to the importance of analyzing subjective experiences when investigating self-disorders.
In order to regain the connection to the world, Sass and Parnas (2003) argue, hyper-reflexivity and hyperautomatic behavior are used as coping strategies of a disembodied mind, rather than being mere symptoms (Sass & Parnas, 2003; Sass, 2004; Sass & Parnas, 2007). However, there is still an ongoing debate on the extent to which these mechanisms remain conscious or unconscious. Several authors have hypothesized that schizophrenia is characterized by a failure of automatic processing leading to abnormal contents of consciousness (Gray, Feldon, Rawlins, Hemsley, & Smith, 1991; Maher, 1983; Frith, 1979). For example, it has been proposed that features that are usually unconscious (e.g. physiological processes, automatisms) glide into conscious awareness in patients with schizophrenia due to aberrations in information processing (Gray et al., 1991). On the other hand, alternative approaches focus on unconscious dynamics that are distinct from conscious processes, and explain self-disturbances, such as a decreased feeling of having a minimal self, as being linked to general impairments in perception (Giersch & Mishara, 2017).
The rather new approach of computational psychiatry aims to characterize these mental disorders through multi-leveled aberrant neuronal computations (Montague, Dolan, Friston, & Dayan, 2012). By investigating aberrant computations in the brain, we could describe psychiatric disorders, disruptions of the self, and the senses of ownership and agency through these computational alterations.

Computational models of perception, action, and the self
Many ideas of modern computational psychiatry and cognitive robotics build on Bayes' theorem (Bayes, 1763) and Helmholtz's unconscious inference (Helmholtz, 1867). According to Helmholtz, the motor system serves as a tool for exploration: by trial and error, invariances can be deduced and real-world knowledge derived (Westheimer, 2008). In the following sections, the adaptation of these approaches in computational psychiatry will be described.

The Bayesian brain hypothesis
A growing body of literature (Barber, Clark, & Anderson, 2003; Khrennikov, 2004; Rao, Olshausen, & Lewicki, 2002) points to the notion that the brain represents sensory information in the form of probability distributions rather than in a deterministic "as it is" manner (Knill & Pouget, 2004).
In short, a "Bayesian brain" continuously improves the fit of its internal model by updating it to minimize prediction error (Friston, 2003; Friston, FitzGerald, Rigoli, Schwartenbeck, & Pezzulo, 2017). In Bayesian terms, perception can be regarded as the brain's inference about the origin of sensory information, calculated by comparing sensory information with predictions based on prior beliefs (Jardri & Deneve, 2013). Specifically, this Bayesian brain model consists of three distributions: the prior, the likelihood, and the posterior, each of which can be thought of as a Gaussian probability curve. The prior refers to the model's prediction, in other words the expectations of an agent. The likelihood refers to the sensory input. The posterior can be seen as the percept, representing a compromise between the prior and the sensory evidence. Each of the three distributions can be altered in precision, e.g. through clinical conditions like schizophrenia or certain drugs. Conceptually, precision corresponds to the inverse variance of a distribution, i.e. a steeper or flatter Gaussian curve, and it determines how strongly, for example, the sensory evidence influences perception. A mismatch between the prior belief and the likelihood results in a prediction error that can be used to update the model's priors. The higher the precision (inverse variance) of the likelihood, the stronger the error signal will be, and thus the more influential in updating the brain's model (Adams, Brown, & Friston, 2014). However, if the prior also has a high precision, it is computationally more resistant to being updated. The brain is therefore required to balance the precision of priors and likelihood to minimize prediction errors (Adams et al., 2014; Fletcher & Frith, 2009). Under these assumptions, schizophrenia can be understood as the result of an imbalance of prior and likelihood precision (Humpston & Broome, 2020; Horga & Abi-Dargham, 2019; Stephan & Mathys, 2014).
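The precision-weighted compromise between prior and likelihood described above can be sketched numerically. The following is a minimal illustration, not drawn from the original text; all parameter values are arbitrary assumptions, chosen only to contrast a balanced system with one in which prior precision is abnormally low, as hypothesized for schizophrenia.

```python
def posterior(mu_prior, prec_prior, x_obs, prec_lik):
    """Fuse a Gaussian prior with a Gaussian likelihood.

    Precision = inverse variance; the posterior mean is a
    precision-weighted average of prior mean and observation.
    """
    prec_post = prec_prior + prec_lik
    mu_post = (prec_prior * mu_prior + prec_lik * x_obs) / prec_post
    return mu_post, prec_post

# Balanced agent: prior expectation 0.0, observation 1.0, equal precision.
mu, prec = posterior(0.0, 1.0, 1.0, 1.0)
print(mu)  # 0.5 -> the percept lies midway between expectation and evidence

# "Weak prior" agent: same observation, but prior precision is low,
# so perception is dominated by the (possibly noisy) sensory input.
mu_weak, _ = posterior(0.0, 0.1, 1.0, 1.0)
print(round(mu_weak, 3))  # 0.909 -> the percept is pulled toward the evidence
```

In this toy setting, flattening the prior (lowering its precision) shifts the percept toward the raw sensory evidence, which is one way the precision-imbalance account of psychotic perception is often illustrated.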
This model will be discussed in more detail in Section 3.2.

Predictive processing models
Based on Bayesian statistics, the "predictive coding" framework and its hierarchical application in a cognitive context, the predictive processing approach, were developed. These postulate that the minimization of free energy is approximately equivalent to the maximization of model evidence, which corresponds to the maximization of the mutual information between sensory input and internal representations (Friston, 2005). Simply put, the brain tests the accuracy of its internal model by formulating hypotheses about the sensory origin of its sensations, and resolves discrepancies by either revising its model or by changing the bottom-up information to match the predictions (Limanowski & Blankenburg, 2013).
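The "revise the model" route of discrepancy resolution can be sketched as a simple iterative update. This is a toy illustration, not from the source: a single latent estimate is revised by gradient descent on the prediction error until its prediction matches the sensory input; the learning rate and step count are arbitrary assumptions.

```python
# Toy prediction error minimization: one latent belief is revised until
# its prediction matches the sensory input (model revision, not action).
def infer(sensation, belief=0.0, lr=0.1, steps=100):
    for _ in range(steps):
        prediction = belief             # generative model: identity mapping
        error = sensation - prediction  # bottom-up prediction error
        belief += lr * error            # top-down revision of the belief
    return belief

print(round(infer(2.0), 3))  # converges toward the sensory input, i.e. 2.0
```

The same loop run in reverse, i.e. changing the input rather than the belief, would correspond to the second route of discrepancy resolution: acting on the world so that sensations match the predictions.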
The predictive coding framework is nowadays widely used for research in many fields such as robotics, data science, neuroscience, and computational psychiatry. The strength of this framework is that it can be employed as a "common language" for modeling approaches, which facilitates interdisciplinary research. Also, with the help of the predictive coding framework, closer links between the phenomenology of the self and conscious experience and the mechanistic properties of their neural substrates can be analyzed (Hohwy & Seth, 2020). Specifically, the assumptions of predictive coding are well represented in the neuroanatomical structure of the cortex, indicating that the encoding of neuronal populations is indeed probabilistic (Clark, 2013; Bastos et al., 2012). As reviewed in detail by Hohwy and Seth (2020), the most promising approaches in the science of consciousness and the self indeed have uncertainty reduction and top-down signalling as a major foundation (e.g. integrated information theory, global neuronal workspace theory, recurrent processing theory; for a review, see Hohwy & Seth, 2020). Specifically, it seems that conscious systems have a tendency to settle in one unified, highly informative representational state (and maintain this homeostasis) by reducing uncertainty through learning and information integration (Tononi, Boly, Massimini, & Koch, 2016).
Furthermore, it seems unlikely that a self can emerge without top-down signalling complementing bottom-up signalling, as shown e.g. by anesthesia studies indicating a disruption of top-down signal loops (Boly et al., 2012). Domains in which predictive-coding-style surprise minimization is thought to take place include perception, action, attention, recognition, understanding, and exploration. Specifically, while the internal model continuously undergoes updates (perception), this process can be guided through active inference: by engaging in motor actions, specific sensory evidence can be acquired (action). Moreover, policies for behavior can be selected with the goal of reducing the expected prediction error; goals can thereby be seen as priors within the predictive coding framework, guiding the action commands of an agent. Sensory evidence can further be modulated by attention-guided alterations of precision: prediction error minimization is affected by the precision of sensory evidence, which can be increased by focusing on a specific target (attention). Lastly, following the Bayesian notion, model complexity is penalized, which ultimately leads to simpler models in which prediction errors can be minimized, as in the case of recognition, understanding, and exploration.
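The idea that policies are selected to reduce expected prediction error, with goals acting as priors, can be sketched in a few lines. This is an illustrative toy, not a model from the literature; the candidate actions, their predicted outcomes, and the goal value are all hypothetical.

```python
# Minimal active-inference-style policy selection: the agent prefers the
# action whose predicted sensory outcome lies closest to its goal prior,
# i.e. the one with the smallest expected (squared) prediction error.
def select_action(goal, predicted_outcomes):
    def expected_error(action):
        return (goal - predicted_outcomes[action]) ** 2
    return min(predicted_outcomes, key=expected_error)

# Hypothetical example: goal state 1.0, two candidate motor policies with
# predicted outcomes from the agent's forward model.
outcomes = {"reach_left": 0.2, "reach_right": 0.9}
print(select_action(1.0, outcomes))  # "reach_right"
```

Note how the goal enters the computation exactly like a prior expectation: changing it reorders the ranking of actions without any change to the forward model itself.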
It is hypothesized that through these statistical inferences, the brain generates an embodied self-model that, in the long run, behaves as a representational system. Since the error dynamics of the world are ever-changing, this system has to remain adaptive to maintain homeostasis, meaning it has to be hierarchically constructed and capable of forming meta-expectations (Hohwy & Seth, 2020).

The comparator model
In order to have a subjective experience of oneself in the world, it is crucial that we can implicitly draw a boundary between ourselves and others. While the source of one's movement is important for self-identification, self-other distinction in terms of action intention attribution requires goal orientation in order for an agent to understand and attribute the actions of others (Jeannerod, 2007). Tsakiris (2010) gives a neurocognitive model for the development of body ownership. According to this model, ownership develops through the interaction of multisensory input and multiple self-related internal models of the agent. First, a model of one's body enables the distinction between that which belongs to one's body and that which does not. Moreover, the representation of the location of one's own body parts in space modulates the sensory information, which might lead to a recalibration of the visual and tactile positions. Ultimately, the resulting coordinate system of tactile sensations leads to a subjective experience of body ownership (Tsakiris, 2010).
According to the comparator model, whenever the brain initiates a new movement, an internal prediction model for this movement is generated which is subsequently compared to the actual movement achieved. If there is a match between the actual movement and its prediction, action authorship is more strongly attributed to one's own body movement and thus sense of agency is perceived (David, Newen, & Vogeley, 2008). While the comparator model is an approach that is often used as an objective indicator to describe whether an agent subjectively experiences agency based on a mismatch between predicted and actual action (i.e. a low prediction error is interpreted as indicating high sense of agency), this approach has been criticized as being too simplistic (Lanillos, Pagès, & Cheng, 2020b; Zaadnoordijk, Besold, & Hunnius, 2019; Synofzik, Vosgerau, & Newen, 2008). Some authors stressed the more complex interplay between sensory input and efference copy signals (Synofzik, Vosgerau, & Voss, 2013). More specifically, the comparator model has been criticized as lacking important top-down components, i.e. sensorimotor contingency detection and causal inference, that are important for a more realistic model of self-detection (Lanillos et al., 2020b). Based on a model by Wegner (2017), Lanillos et al. (2020b) proposed to extend the current comparator model to the "double comparator" model that also takes processes like spatiotemporal contingency into account, to perform a distinction between the self and other.
Motor commands activate neuronal discharges (the efference copy or corollary discharge) that affect activity in both sensory and motor pathways, allowing an organism (fish, insect, mammal, etc.) to monitor and, if necessary, alter motor activity before muscle contraction actually takes place. Furthermore, it informs an agent whether a movement is self-generated or caused by an external source, providing a mechanism for self-other distinction. According to this theory, sensory illusions experienced by patients with schizophrenia (and their alterations in self-other distinction) could result from a disordered internal feedback loop to high-level sensory representations. Since most drugs given to treat symptoms of schizophrenia are capable of producing extrapyramidal syndromes (motor disorders such as dystonia, akathisia, parkinsonism, and in some cases tardive dyskinesia), they might alter internal striatal pathway communication (Feinberg, 1978). Specifically, efference copies are sent via pyramidal tract neurons to the dorsal striatum, synapsing with inhibitory GABAergic neurons (Fee, 2014; Shipp, 2017). The hypothesis is that excessive dopaminergic signalling leads to a stronger inhibition of the striatal transmission of the efference signal, thus impeding the internal monitoring of one's movement (efference copy) and producing extrapyramidal syndromes (McCutcheon, Abi-Dargham, & Howes, 2019). This efference copy model has also been proposed as an explanation for why we cannot tickle ourselves: by attenuating self-produced tactile sensations through accurate sensory predictions of a forward model, we interpret the sensation as being caused by ourselves and thus as non-harmful, whereas if the sensory evidence was not predicted (and therefore surprising), we would be ticklish. This predictive mechanism is thought to be aberrant in patients with hallucinations or passivity experiences (e.g. schizophrenia), who have been found to be able to tickle themselves (Pynn & DeSouza, 2013). Similar to the corollary discharge, the (double) comparator model has been proposed to explain the sense of agency: whenever the brain initiates a new movement, an internal prediction of this movement is generated and compared to the actual movement; if they match, action authorship is attributed to one's own body and a sense of agency is perceived (David et al., 2008). A graphical depiction of the classic comparator model is shown in Fig. 1.
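The classic comparator logic can be condensed into a short sketch. This is a minimal illustration only: the identity forward model, the threshold value, and the example numbers are assumptions for demonstration, not parameters from the literature.

```python
def sense_of_agency(motor_command, actual_feedback, threshold=0.3):
    """Classic comparator: forward-model prediction vs. actual feedback.

    Assumes an identity forward model (predicted feedback equals the
    motor command) for simplicity; a small mismatch is read as a
    self-generated action, a large one as an externally caused event.
    """
    predicted_feedback = motor_command        # efference copy -> forward model
    error = abs(actual_feedback - predicted_feedback)
    return error < threshold                  # True -> agency attributed to self

print(sense_of_agency(1.0, 1.1))  # True: small mismatch, self-attributed
print(sense_of_agency(1.0, 2.0))  # False: large mismatch, external cause
```

The criticisms summarized above amount to saying that this single error comparison is too coarse: a double comparator would additionally weigh top-down cues such as spatiotemporal contingency before attributing agency.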

Neuroscientific research on human self-models
There is ongoing discussion about the influence of top-down vs. bottom-up processes on sense of agency. While bottom-up approaches describe sub-systems that lead to the emergent property, top-down approaches gain insight by breaking down a system into its sub-systems.
Top-down influences such as psychotherapy, and bottom-up approaches like pharmacological interventions, can substantially modify subjective experience (Fuchs, 2009). Famous examples of drugs that can lead to profound self-transformations are classic psychedelics like lysergic acid diethylamide (LSD), dimethyltryptamine (DMT), psilocybin, and mescaline (Baumeister & Exline, 2002; Preller et al., 2019; Timmermann et al., 2018; Mason et al., 2020; Hermle et al., 1992), but also clinically used substances such as ketamine (Vlisides et al., 2018). Nonetheless, it seems most likely that we perceive the world largely through top-down predictions that are then fine-tuned by bottom-up sensory experience (Hohwy & Seth, 2020). Ultimately, it is the interaction of bottom-up and top-down processes that shapes beliefs and percepts, which cannot be attributed to a single factor alone. Specifically, there is evidence that glutamate may mediate top-down signals at N-methyl-D-aspartate (NMDA) receptors and bottom-up signals at α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors, while their integration may be mediated by dopamine (Wilson, Humpston, & Nathan, 2021; Sterzer et al., 2018).
On the other hand, much research has been conducted to investigate what underlies the self-model in terms of neuronal activity and brain function. Evidence for a strong top-down influence of certain brain areas that mediate the emergence and constitution of the self comes from ample optogenetic, transcranial magnetic stimulation (TMS), electrophysiological, and brain imaging studies. Some examples will be described below.
By using electrical pulses or light impulses, specific areas in the neuronal chain can be (de-)activated, offering the possibility to investigate the self empirically. Of interest is, for example, the cortical midline structure (CMS), which is involved in mechanisms of self-reflection; if the CMS is damaged, patients show impairments in evaluating problems they encounter. Recent studies using electroencephalography (EEG) have tried to disentangle the neuronal bottom-up and top-down loops that are required for perception. Specifically, frequency-tagging may allow unravelling the loops for prior expectation and attention, and it could be shown that expectation and attention increase perceptual top-down and bottom-up integration (Gordon, Tsuchiya, Koenig-Robert, & Hohwy, 2019).
Many forward and backward loops between neuronal populations in the human visual cortex share the same frequency: microcircuits along feedforward and feedback projections show synchrony in their α, β, and γ frequency bands, which is in line with circular inference models of brain dynamics (Leptourgos, Denève, & Jardri, 2017; Michalareas et al., 2016). Furthermore, semi-synchronous γ-band activity (40-70 Hz) between neuronal populations poses a potential candidate for a neural correlate of consciousness (Michalareas et al., 2016). Magnetoencephalography (MEG) experiments also show evidence for the influence of probabilistic top-down priors on perception (Aru, Rutiku, Wibral, Singer, & Melloni, 2016). The sense of self is a system-wide phenomenon thought to involve many brain regions with varying degrees of interconnectedness (Northoff et al., 2006; Tsakiris, Hesse, Boy, Haggard, & Fink, 2007; Knyazev, 2013; Tsakiris, 2017). Therefore, by inducing disturbances of the self through stimulation of the implicated brain areas, one can pinpoint locations in the brain that are arguably involved in the sense of self. Evidence from TMS studies shows that different body parts are represented along the motor cortex (the homunculus); by stimulating these areas, certain movements can be forced, e.g. the twitching of a finger or involuntary arm movements (Barker, Jalinous, & Freeston, 1985; Ziemann, Wittenberg, & Cohen, 2002). In another TMS study, Blanke et al. (2005) showed that stimulating the temporoparietal junction (TPJ) impairs mental transformation of the bodily self. Evoked potential mapping also revealed that the TPJ is active 330-400 ms after stimulus onset when participants are asked to imagine themselves in the position and visual perspective usually reported by people experiencing out-of-body experiences (OBEs).
In a case study of an epileptic patient with OBEs originating from the TPJ, partial activation of the seizure area was seen during mental transformation of her body and visual perspective, whereas activation of other cortical sites was not found (Blanke et al., 2005). It has therefore been suggested that the TPJ is a central brain area for conscious non-corporal self-experience, mediating the spatial unity of self and body, while damage to this region could lead to a pathological (temporary) loss of self, as in an OBE (Blanke et al., 2005; Blanke, Landis, Spinelli, & Seeck, 2004; Blanke, Ortigue, Landis, & Seeck, 2002). In another case study, Blanke et al. (2002) succeeded in repeatedly invoking corporal OBEs (as opposed to non-corporal visual hallucinations of the TPJ) in a patient undergoing epilepsy treatment by electrically stimulating the patient's right angular gyrus. The results suggest that the angular gyrus could be a crucial node in a large neural circuit involved in mediating complex own-body perceptions, and that the experience of dissociation from one's body arises through a failure to integrate complex somatosensory and vestibular information. These findings are in line with other studies that induced the feeling of a "proximal sentient being", a sensed presence, within the laboratory: by applying pulsed magnetic fields over the temporoparietal region of participants wearing opaque goggles in a quiet room, a sense of presence and an experience of "another consciousness" could be induced in two thirds of the participants. This indicates that an altered sense of self, as well as ephemeral phenomena like visitations by so-called "spirits" or "gods", have a top-down-guided neural correlate of consciousness (NCC) (Persinger & Healey, 2002).
Recent fMRI research with psilocybin indicates that glutamate might play an important role in the ego-dissolution that is common in psilocybin experiences. Specifically, higher levels of medial prefrontal cortical glutamate were associated with negatively experienced ego-dissolution, while lower levels of hippocampal glutamate were associated with positively experienced ego-dissolution (Mason et al., 2020). Additionally, gamma-aminobutyric acid (GABA) concentration deficits found in the occipital cortex of patients with schizophrenia correlate with impaired visual inhibition (Yoon et al., 2010). GABA acts specifically at inhibitory synapses, and its deficit is believed to cause the cognitive impairments seen in patients with schizophrenia (Cho, Konecky, & Carter, 2006). Multiple studies indicate that the posterior insular cortex is a central node for interoception and for interactions with motor, somatosensory, and limbic systems (Ebisch & Gallese, 2015; Augustine, 1996; Craig, 2002; Craig & Craig, 2009), and it probably contributes to self-awareness (Tsakiris et al., 2007). Specifically, neuronal activation of the posterior insular cortex is positively associated with the experience of the "rubber hand illusion" (Tsakiris et al., 2007). In this famous illusion, a rubber hand is placed in front of the participant while their real hand is hidden from view; if the rubber hand and the participant's own hand are stroked synchronously with a brush, a multimodal conflict is induced which can lead to the experience that the rubber hand feels like one's own hand. Studies investigating an impaired sense of agency in patients with schizophrenia revealed aberrant activation of the posterior insular cortex (Farrer et al., 2004).
In a computational sense, the rubber hand illusion is thought to result from the integration of visual, tactile, and proprioceptive information and can be explained by the inference of a common cause thereof within a Bayesian causal inference with optimal multisensory integration (Samad, Chung, & Shams, 2015). This points to the notion that computational aberrations in patients with schizophrenia may lead to an altered experience of disembodiment and depersonalization. In fact, active inference as well as predictive processing have already been put forward as a promising account of symptoms of depersonalization and disembodiment and as a model for enacted existence in general (Deane, Miller, & Wilkinson, 2020;Seth, Suzuki, & Critchley, 2012;Gerrans, 2019;Hesp et al., 2021). Recently, emotions and "valenced bodily feelings" have also been proposed to represent a feedback source of information about the predictive success of an agent that all fundamentally shape the disturbances of the "minimal self" (Gerrans, 2019;Deane et al., 2020;Hesp et al., 2021).
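The common-cause inference behind the rubber hand illusion can be illustrated with a simplified one-dimensional sketch. This is not the actual model of Samad et al. (2015): the Gaussian noise level, the prior on a common cause, and the workspace width are illustrative assumptions, and only spatial (not temporal) discrepancy is considered.

```python
import math

def gauss(x, mu, sigma):
    """Gaussian probability density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def p_common_cause(visual_pos, tactile_pos, sigma=3.0, prior_common=0.5, span=60.0):
    """Posterior probability that seen and felt stimulation share one cause.

    Under a common cause (C=1), both cues measure one position, so their
    discrepancy is Gaussian with combined sensory noise; under separate
    causes (C=2), the cues are independent and uniform over a workspace
    of width `span` (all values are illustrative assumptions).
    """
    like_common = gauss(visual_pos - tactile_pos, 0.0, math.sqrt(2) * sigma) / span
    like_separate = 1.0 / span ** 2
    num = like_common * prior_common
    return num / (num + like_separate * (1 - prior_common))

# Spatially close stroking -> common cause likely, illusion plausible
print(round(p_common_cause(0.0, 2.0), 2))
# Large spatial discrepancy -> separate causes inferred, no illusion
print(round(p_common_cause(0.0, 30.0), 2))
```

On this account, an aberrant weighting of these terms, e.g. an overly strong common-cause prior or miscalibrated sensory noise, is one way to formalize how multisensory binding could be altered in patients with schizophrenia.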

Guiding principles of computational psychiatry
The rather new field of computational psychiatry formalizes brain functions mathematically or computationally in order to characterize mechanisms of psychopathology (Friston, Stephan, Montague, & Dolan, 2014). Computational psychiatry aims to deliver explanations especially of aberrant mental conditions through computational methods. In other words, the approach is to examine psychiatric conditions by looking at disruptions in information flow, and to gather knowledge about the principles governing brain function with the help of computational and mathematical models.
The incentive is that psychiatric conditions such as schizophrenia can be better understood by artificially altering computational models. Especially with newer theories such as the free energy principle and the Bayesian brain hypothesis, these mechanisms can be examined more directly in artificial agents, where the calculations are more easily accessible, compared to biological agents.
The natural link between computational psychiatry and cognitive developmental robotics arises from the fact that both fields are concerned with the information flow in a complex cognitive system. Cognitive developmental robotics provides the advantage of a physical body. First, embodiment is important not just because it is closer to reality, but especially for self-disorders, since self-disorders are believed to arise from a fundamental disconnectedness from one's body rather than being a mere symptom.
Second, the developmental aspect is crucial in that a self is arguably an emergent property coming from developmental processes (Wolputte, 2004; Piaget, 1954; Erikson, 1950; Rochat, 2003). Therefore, one needs to have a developmental approach when modeling self-disorders in a robot. While schizophrenia is mainly diagnosed as a mental disorder characterized by its symptoms of altered perception, thoughts, mood, and behavior (National Collaborating Centre for Mental Health, 2014), its behavioral prodromes, felt disconnectedness towards one's body, and alterations in brain physiology are seen much earlier. In fact, the neurodevelopmental model of schizophrenia posits that the symptoms of schizophrenia are the end state of an aberrant neurodevelopmental process rather than a degenerative process (Rapoport, Giedd, & Gogtay, 2012; Murray & Lewis, 1987; Weinberger, Berman, & Zec, 1986; Insel, 2010; Owen, O'Donovan, Thapar, & Craddock, 2011). Even prenatally, placental pathology is an indicator of the risk of developing schizophrenia (Rapoport et al., 2012). Therefore, these aberrant developmental processes can be simulated more thoroughly by using cognitive developmental robotics as opposed to non-embodied simulations.
An example where computational psychiatry is especially interesting is the case of apperceptive agnosia. Patients with apperceptive agnosia have an intact visual field and functional, consciously perceived low-level visual perceptions, but fail to recognize objects they are looking at, to distinguish between different shapes, or to copy a given shape. This disorder is often the result of selective lesions in the occipital and temporal cortex caused by a lack of oxygen or by carbon monoxide poisoning (Heider, 2000). Even though patients with apperceptive agnosia have an integrated and coherent world-model, certain gestalt cues for organizing their visual perception are lacking (Metzinger, 2014). Further evidence comes from disorders like autotopagnosia, in which patients cannot name, identify, or even localize their own body parts, and which is also caused by cortical brain damage. Another interesting clinical picture in which multimodal integration fails is disjunctive agnosia: patients with this disorder cannot merge their visual with their auditory sensory input (Metzinger, 2014).
By taking a computational approach, these syndromes can be explained as emerging from a disconnectedness of certain neural pathways and corresponding alterations of prior beliefs about underlying likelihood distributions (Parr, Rees, & Friston, 2018). This approach can provide explanations for various clinical conditions, e.g. phantom limb syndrome (De Ridder, Vanneste, & Freeman, 2014), in which patients who underwent the amputation of a limb still experience phantom sensory percepts, in some cases even pain. One of the most commonly cited explanations for this syndrome is cortical reorganization (Baron, Binder, & Wasner, 2010; Ramachandran, Brang, & McGeoch, 2010; Flor, Nikolajsen, & Jensen, 2006). One non-pharmacological treatment is mirror therapy (Ramachandran & Rogers-Ramachandran, 1996). In this treatment, a mirror is placed parasagittally between the intact and the missing limb and the patient is instructed to move the missing and the intact limb in the same manner. The patient thereby sees the reflection of the intact limb's movements in the mirror as movements of the phantom limb. This might resolve the visual-proprioceptive dissociation in the brain and ultimately lead to a symptom reduction (Ramachandran & Rogers-Ramachandran, 1996; Feinberg, 2011; Subedi & Grossberg, 2011). Overall, computational psychiatry can be utilized to deliver explanations and offer targets for finding possible treatments or coping strategies for the symptoms of various disorders.

Using a computational approach to describe complex cognitive phenomena is not entirely new. Already in 1982, computational approaches to investigating visual processes were described in David Marr's Vision (Marr, 1982). His general framework has sparked debates about levels of explanation, the nature of computation, externalism vs. internalism in computational theories of mind, the association of computation and content, as well as top-down versus bottom-up methodology (Marr, 1982; Shagrir, 2010). Marr argues that a computational theory consists of two building blocks, the "what" and the "how". Concretely, the "what" describes the function that is computed, while the "how" specifies the algorithm used to calculate that function (Marr, 1982; Shagrir, 2010).

Table 1
Marr's three-level computational approach to information processing tasks. Adapted from Marr (1982). The computational level refers to what the device is doing and why; the algorithmic level to how input, output, and transformations are represented; the implementational level to how these processes are implemented physically.

Framework
- Computational theory: What is the goal of the computation, why is it appropriate, and what is the logic of the strategy by which it can be carried out?
- Representation and algorithm: How can this computational theory be implemented? In particular, what is the representation for the input and output, and what is the algorithm for the transformation?
- Hardware implementation: How can the representation and algorithm be realized physically?

Schizophrenia
- Computational theory: Phenomenological anomalies of selfhood, e.g. the free energy principle framework.
- Representation and algorithm: Aberrant predictive coding and active inference.
- Hardware implementation: Neurobiological anomalies (e.g. dopamine, glutamate, GABA); neurophysiological anomalies (e.g. gray matter loss, enlarged lateral brain ventricles).

Robotics
- Computational theory: Bayesian inference of the latent causes of sensory evidence.
- Representation and algorithm: Predictive coding artificial neural networks.
- Hardware implementation: Using the robot's physical body, e.g. arms, legs, sensors.
In his framework, which fundamentally influenced the philosophy of neuroscience but was not without critics, he postulates three levels of information processing. Specifically, he loosely couples three different aspects: "computational theory", "representation and algorithm", and "hardware implementation". His description of the three levels is presented in Table 1. Bayesian inference of the latent causes of sensory evidence as a goal can thereby be seen as Marr's first level (computational theory) (Marr, 1982; Aitchison & Lengyel, 2017). The predictive coding framework can be seen as one biologically plausible mechanism reflecting the second level (representational/algorithmic implementation) as a utilization of Bayesian inference (Aitchison & Lengyel, 2017; Valton, Romaniuk, Steele, Lawrie, & Seriès, 2017). The third level (hardware implementation) entails that every algorithm must be executed within a physical system. This physical system can be of biological origin (e.g. humans) but, importantly, can also be an artificial agent (e.g. robots). Particularly in the case of schizophrenia in humans, the physical system often exhibits neurobiological and neurophysiological anomalies, e.g. structural deviations and neurotransmitter irregularities.

Explaining the self-disorders in schizophrenia
Previous attempts at explaining the self-disorders in schizophrenia investigated them on the symptom level, with the goal of explaining symptoms rather than viewing them as the root cause of the mental disorder itself. Specifically, current diagnostic schemas for schizophrenia often involve the diagnostic tools "Structured Clinical Interview for DSM Disorders" (SCID), the "Scale for the Assessment of Positive Symptoms" (SAPS), and the "Scale for the Assessment of Negative Symptoms" (SANS), which assess schizophrenia in terms of positive, negative, and cognitive symptoms. However, newer research points to the notion that schizophrenia rather represents a fundamental self-disturbance (or "ipseity disorder") out of which the positive, negative, and cognitive symptoms emerge. Nonetheless, these self-disturbances are often neglected in contemporary psychiatry. Neglected symptoms include a long-persisting feeling of an identity void and of self-transformation, as well as disturbances of the stream of consciousness, self-awareness, corporeality, demarcation, and existential reorientation, all of which are interrelated. The Research Domain Criteria (RDoC) can provide a more modern taxonomical framework that addresses the heterogeneity of mental disorders (Insel & Lieberman, 2013). Furthermore, schizophrenia can be assessed in more phenomenological detail using the "Examination of Anomalous Self-Experience" (EASE) instrument (Parnas et al., 2005), a semi-structured clinical interview focusing more strongly on the experiential and phenomenological anomalies of schizophrenia spectrum disorders. Different explanatory approaches have been used, namely machine learning approaches, comparator model-based approaches, biological approaches, predictive coding-based approaches, and circular inference-based approaches (Section 3.2.3). In this section, we present example studies based on these approaches.
Machine learning approaches have been used to identify variables with predictive value for the clinical prognosis of patients with schizophrenia. Specifically, Koutsouleris et al. (2016) used Kaplan-Meier log-rank analyses to predict treatment discontinuation and hospital readmission in patients with poor versus good predicted treatment outcomes. In addition, generalized linear mixed-effects models were used to identify factors that predict a positive or negative treatment outcome in patients with schizophrenia after 4 and 52 weeks. Concretely, previous depressive episodes, male sex, and suicidality were all identified as risk factors for the one-year period. Additionally, unemployment, poor education, functional deficits, and unmet psychosocial needs predicted a bad outcome at both the 4-week and one-year marks. Furthermore, their analysis was used to assess the efficacy of, and comparison between, certain antipsychotic medications (Koutsouleris et al., 2016).
Besides general prognostic tools, many models that focus on explaining self-disorders have been proposed. One of the earliest was the comparator model (Feinberg, 1978; Wolpert, Ghahramani, & Jordan, 1995) (see Section 2.3). With the help of the comparator model, studies aimed at explaining self-disorders such as those of patients with schizophrenia who report thought insertion originating from external sources. According to comparator model-based approaches, if the efference copy is not correctly transmitted, this causes a mismatch between the representation or prediction of a movement and the movement that was actually executed. However, some patients with schizophrenia experience an increased sense of agency, and the role of dopamine is missing from the model. Specifically, the neurotransmitter dopamine is involved in the encoding of salience (precision) during the processing of information. Since this crucial part of encoding the salience of a stimulus is lacking, the comparator model can be seen as reductionist and unable to explain certain symptoms (e.g. thought insertion) (Frith, 2012).
The investigation of aberrant glutamate signaling between brain regions has also been a focus of computational psychiatry studies.
For example, in one study, synaptic disinhibition through NMDA receptor disturbances on interneurons was modelled in a network model of spatial working memory, using behavioral data from participants who performed a spatial working memory task. Some of these participants had been given ketamine, thereby inducing NMDA receptor disinhibition and simulating certain symptoms of schizophrenia.

Predictive processing-based models
Studying disorders in the perception of the bodily self has been recognized as a promising avenue for modeling mechanisms hypothesized to underlie symptoms of schizophrenia. In a comprehensive review, Lanillos et al. (2020a) analyzed neural network models of autism spectrum disorder and schizophrenia, and compared Bayesian approaches such as circular inference and predictive coding models. They discussed neural network implementations that reproduced phenomena observed in autism spectrum disorder and schizophrenia based on underlying mechanisms such as circular belief propagation (Section 3.2.3), weak priors and aberrant precision weighting leading to an imbalance in the integration of priors and sensory signals (Philippsen & Nagai, 2020b), and network dysconnectivity leading to motor phenomena similar to those observed in schizophrenia (Yamashita & Tani, 2012).
In predictive coding, perception is the result of a balanced integration of priors with sensory input. This balance is disturbed if the precision of the prior is higher (hyper-prior), leading to a stronger reliance of the posterior on the prior and less on the sensory evidence. Low prior precision, in turn, leads to a stronger reliance on sensory evidence. This has been implemented in a series of studies with computational neural network models (Philippsen & Nagai, 2020b) that showed how aberrant reliance on priors during training could result in impairments in the network's internal representation, and that replicated behavioral findings from humans and chimpanzees in a representational drawing task. These results are discussed in the context of autism spectrum disorder, as the heterogeneity in symptoms could be explained by this demonstrated mechanism of over-reliance on either predictions or sensory evidence in the course of development.
In a purely mathematical sense, and as displayed in Fig. 3, in the case of higher prior precision, perception is computationally more influenced by the prior, with less reliance on the sensory evidence. Conversely, in the case of lower prior precision, perception relies more on the sensory evidence and the posterior shifts towards it, i.e., there is a trade-off between the precision of the prior and the variance of the sensory evidence.
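This precision-weighted trade-off can be sketched for the simplest case of a Gaussian prior and a Gaussian likelihood, where the posterior precision is the sum of the two precisions and the posterior mean is their precision-weighted average. The function name and the numerical values below are illustrative only, not taken from any of the cited models:

```python
def fuse_gaussians(mu_prior, var_prior, mu_like, var_like):
    """Precision-weighted integration of a Gaussian prior and likelihood.

    Precision is the inverse variance; the posterior mean is the
    precision-weighted average of prior mean and sensory evidence.
    """
    pi_prior = 1.0 / var_prior   # prior precision
    pi_like = 1.0 / var_like     # likelihood (sensory) precision
    var_post = 1.0 / (pi_prior + pi_like)
    mu_post = var_post * (pi_prior * mu_prior + pi_like * mu_like)
    return mu_post, var_post

# Balanced precisions: the posterior lands midway between prior and evidence
print(fuse_gaussians(0.0, 1.0, 2.0, 1.0))   # mean 1.0, variance 0.5

# Hyper-prior (precise prior, vague evidence): the posterior is pulled
# strongly toward the prior (mean 2/11, roughly 0.18)
print(fuse_gaussians(0.0, 0.1, 2.0, 1.0))
```

Increasing `var_like` (noisier sensory evidence) has the same qualitative effect as decreasing `var_prior`, which is the trade-off described above.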
Nowadays, computational explanations of self-disorders rely strongly on the predictive coding framework. However, the classical interpretation of predictive coding struggles to explain symptoms that are detached from kinematics and sensations (e.g. thought insertion) (Frith, 2012). Specifically, it is debated how predictions can be made and how the prior is updated when sensory feedback of internal processes (e.g. thoughts) is lacking. This is because it is challenging to view thoughts as actions in the context of the generative predictive model (forward model) (see Section 3.3 for discussion).

Circular inference
Circular inference or circular belief propagation is a promising computational approach that builds on Bayesian modeling and the predictive coding framework (Jardri & Deneve, 2013). In order to make sense of the world, the brain uses a generative model that hierarchically represents the causal links between variables that underlie events. This model consists of nodes that represent these (latent) variables, and edges that represent conditional dependencies between them. Bottom-up processing is the process of sensory information going up the hierarchy in a feedforward way. At the same time, top-down processing refers to prior information passing down the hierarchy as feedback.
The sum-product algorithm, or belief propagation, is one way to perform inference in this generative model. Information is propagated through the system in a feedforward and feedback manner, in the form of beliefs. These beliefs regarding the underlying variables (or causes) are calculated at each node, which sends messages to neighboring nodes. Each node then sums up the information from all its neighbors. The message passing from node to node depends on the belief of the sending node, after subtracting the effect that the receiving node has on the sending node. It is important to subtract the message from the receiving node because otherwise the algorithm would produce loops, i.e. bottom-up and/or top-down information would reverberate. In such loops, causes are treated as effects and vice versa. The brain continuously integrates top-down and bottom-up information across different entangled feedback loops. If an effective control mechanism is lacking, processed top-down information might be reused as sensory evidence and thus over-count the actual sensory evidence. The same could hold for bottom-up processes: by reverberating sensory information to a lower hierarchical level, this information can be mistakenly used as top-down expectation, leading to multiple accounting of "old" priors and sensory information. This might ultimately lead to aberrant belief formation, i.e., prior beliefs that are echoed to lower hierarchical levels might be misinterpreted as sensory evidence, which could explain the "jumping-to-conclusions" bias of patients with schizophrenia in the sense of an over-representation of weak sensory evidence (Huq, Garety, & Hemsley, 1988).
In this framework, bottom-up evidence and top-down predictions are echoed back, leading to a repeated use of already processed information which can explain hallucinations and delusions. Using the same information multiple times is usually avoided when every excitatory loop is compensated by an equally strong inhibitory loop that predicts and cancels out informational redundancies. However, if that is not the case (e.g. in schizophrenia) circular inferences might occur.
In its simplest form, the model consists of three nodes that pass messages up and down the stream. Redundant information is created by sending information both upwards and downwards the information stream; this redundancy is normally subtracted. However, according to this model, some information is not removed in the neuronal system of patients suffering from certain clinical conditions, such as schizophrenia (Jardri & Deneve, 2013; Leptourgos et al., 2017). Fig. 2 presents the model and examples of behavioral consequences of circular inference. We use the paranoid experience of being followed by the CIA or another secret service as an example, because it is very common in patients suffering from schizophrenia. In the case of climbing circular inferences (stronger likelihood, weaker priors), sensory evidence is reused; thus, the person expects what they see. Even weak evidence for a siren (e.g. reinterpretation of a similar sound) will result in an expectation of a siren, reinforcing itself. In the case of descending circular inferences (stronger priors, weaker likelihood), the person sees what they expect (e.g. expecting to be in danger results in hearing sirens). The case of both ascending and descending reverberating loops results in the formation of a "frustrated network". In such a network, mutually exclusive facts might be experienced at the same time, for example, hearing a siren and not hearing a siren from the same car. The strength of the belief update is proportional to the prediction error, weighted by the precision ratio of the inverse variances of the likelihood and the prior (Petzschner, Weber, Gard, & Stephan, 2017).
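The over-counting mechanism can be illustrated with a deliberately minimal log-odds sketch. This is a stand-in for, not a reproduction of, the Jardri and Deneve (2013) model; the function `belief_log_odds` and its `loop_gain` parameter are hypothetical constructs for this example. When the message a node previously sent is fully subtracted, the belief is simply prior plus evidence; when a fraction of the belief reverberates unsubtracted, weak evidence is counted many times over:

```python
def belief_log_odds(prior, evidence, loop_gain=0.0, iters=50):
    """Toy circular-inference sketch in log-odds units.

    loop_gain = 0: the returned message is fully subtracted, so the
    belief settles at prior + evidence (ordinary belief propagation).
    loop_gain > 0: a fraction of the belief reverberates and is
    re-counted at each iteration as if it were fresh evidence.
    """
    belief = 0.0
    for _ in range(iters):
        belief = prior + evidence + loop_gain * belief
    return belief

weak_evidence = 0.2   # faint, siren-like sound (log-odds units)
flat_prior = 0.0

print(belief_log_odds(flat_prior, weak_evidence, loop_gain=0.0))  # 0.2
print(belief_log_odds(flat_prior, weak_evidence, loop_gain=0.8))  # ~1.0
```

With a loop gain of 0.8 the belief converges towards evidence / (1 - gain), a fivefold amplification of the weak input, mirroring how a faint sound could consolidate into a firm conviction.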
In sum, the repeated use of sensory evidence as higher order top-down expectation leads to a sensory over-representation, which could explain the origin of hallucinations and delusions, and the subsequent consolidation of delusional beliefs that is common in patients with schizophrenia. An example is the common experience of patients with schizophrenia of being followed by a secret intelligence service. Even little evidence for a siren (e.g. a similar sound) could be reused multiple times, resulting in hearing a siren and forming the belief that one is being followed by a secret intelligence service and is in imminent danger. At the same time, the feeling of being surveilled could enhance the perception of seeing black cars everywhere (as are often used by secret intelligence services), which leads to hallucinating a siren, further enhancing the paranoia.
Some patients are also impaired in their downward loops and thus over-count their priors, which could explain the diversity in patients' behavior (Jardri & Deneve, 2013; Tandon, Nasrallah, & Keshavan, 2009). Specifically, some patients with schizophrenia assume to have significantly more agency over their own actions, while other patients experience significantly less agency compared to their objective level of agency (Proust, 2006). The circular belief propagation model aligns well with the dysconnectivity hypotheses of imaging studies in patients with schizophrenia (Jardri & Deneve, 2013; Amad et al., 2014). Evidence for this comes, e.g., from the latent inhibition paradigm (Lubow & Moore, 1959; Lubow, 1973; Swerdlow, Braff, Hartston, Perry, & Geyer, 1996), which tests the capability to filter irrelevant stimuli, a capability that is impaired in patients with schizophrenia. This is an expected outcome when upward inhibitory loops are impaired, leading to over-counting of unreliable sensory information (Jardri & Deneve, 2013).
Belief propagation is considered a biologically plausible mechanism because the algorithm is comparable to mechanisms of propagation and integration of neuronal activity in neural microcircuits (Leptourgos et al., 2017). Specifically, inhibitory loops can remove information redundancies either by reducing the feedforward information of the top-down flow, or by diminishing bottom-up information. Belief propagation can be implemented on the neuronal level by balancing excitation and inhibition. Pathways in the human visual cortex show different frequency band synchronicity (microcircuits), influenced by feedforward and feedback projections (Michalareas et al., 2016). These findings provide support for the possible implementation of belief propagation on the neuronal level, since imbalances in excitation and inhibition in these microcircuits could lead to reverberations, leading to circular inference.
Imbalances in excitation and inhibition could occur in local and global inhibitory loops. In fact, inhibitory deficits are present in patients with schizophrenia. These have been correlated with GABA deficits and have been shown to predict the formation of an aberrant belief system (Denève & Jardri, 2016). These findings are also in line with brain imaging findings in the same population. Regarding the long-range inhibitory loop, the thalamic and limbic loops have been identified as being involved in neocortical inhibition (Maffei, 2017). Of special interest are the neocortical-striatal pathways, which are involved in hallucinations and, more broadly, in psychosis (Howes et al., 2011; Rolland et al., 2015), and could be related to the inhibition of feedback signals.
The inhibition of feedforward signals could be associated with thalamocortical inhibitory loops. Dysfunctions in long-range inhibition could be due to a dysconnectivity between the thalamus and the visual cortex, which has been found in patients with schizophrenia (Yang et al., 2014). The thalamus constitutes a junction for sending and receiving information to and from the cortex, therefore it could be implicated in inhibition of feedforward signals (Rolland et al., 2015). In patients with schizophrenia, the effective connectivity between the thalamus and the visual cortex is reduced, leading to a disruption of causal information flow (Iwabuchi & Palaniyappan, 2017).

Open Questions and challenges in modeling self-disorders in schizophrenia
While the comparator model has had success in explaining certain symptoms of schizophrenia, including certain aspects of the self-disorders (Frith, 2012), some open questions remain. It is still unclear why specific symptoms in patients with schizophrenia often differ markedly from one individual to another. The comparator model does not explain inter-individual differences such as an elevated sense of agency in some patients and a reduced sense of agency in others. This model also does not take into account the elevated role of dopamine, which is involved in precision weighting in patients with schizophrenia (Frith, 2012). In addition, the model struggles to explain thought insertion, a first-rank symptom experienced by half of all patients with schizophrenia. The most important question is whether thoughts can be treated the same way as actions, since thoughts are not linked with intentions. The argument is that the intention to think would itself need to be preceded by an intention, leading to an infinite regress (Akins & Dennett, 1986; Stephens & Graham, 2000; Heinz, 2014). Others question whether a comparator is at all a valid model to explain thought patterns (Gallagher, 2004), since it is counter-intuitive that thinking needs to "explain away" sensory signals. Furthermore, one has access only to self-generated thoughts, so a distinction from others' thoughts is usually not necessary. Additionally, the comparator model encounters problems in explaining the difference between thoughts transferred by another individual and intrusive, unwanted thoughts in general (Stephens & Graham, 2000; Heinz, 2014; Gallagher, 2004; Vosgerau & Newen, 2007; Brewin, Gregory, Lipton, & Burgess, 2010).
However, there are some concerns about the predictive coding framework. A major concern is methodological. As put forward by Popper (1959), a theory like predictive coding must prove useful in predicting the results of observations. Inferring the values of parameters from observations constitutes an "inverse problem" and is invalid, since observations should only be used to falsify possible solutions (Tarantola, 2006); predictive coding, however, does exactly this. Another criticism of the predictive coding approach is that it does not take the enactive, embodied, and encultured aspects of phenomenology into account (Humpston & Broome, 2020; Allen & Friston, 2018). Specifically, the enactive approach emphasizes the social, cultural, and relational aspects of the illness, which are mostly neglected by current predictive coding models (Kiverstein, 2020).
Another critique is that there is no agreement in the scientific community on whether predictive coding sufficiently explains thought insertion. Some argue its explanatory value applies only to action and perception and not to thoughts, questioning whether we can treat a thought like an action (Frith, 2012). Arguing in favor of a predictive coding account of thought insertion, Sterzer et al. (2016) assert that inserted thoughts are interpreted as surprising and in no logical continuity with prior thoughts, and are thus interpreted as "coming from nowhere" and inserted by another agent. They argue that thoughts constitute prior beliefs about what thought is likely to arise next, which fits the predictive coding framework. Their second argument, in contrast to Frith (2012), is that thoughts, just like perceptions and actions, can be reduced in sensory prediction and increased in sensory precision, leading to a higher prediction error and thus surprising thoughts. In other words, the salience of thoughts is the basis for the perception of thought insertion; it is the high salience of the thought caused by imprecise prior beliefs, rather than the content itself (or disturbed interoception as in the comparator model (Campbell, 1999)), thus providing evidence for the predictive coding approach. Benrimoh, Parr, Vincent, Adams, and Friston (2018) noted that the increase in sensory precision may lead to the realistic and "loud" quality of hallucinations. Kaminski, Sterzer, and Mishara (2019) give a case report in which a patient suffering from schizophrenia describes "seeing rain" that demarcates him from his surroundings. They conclude that self-disturbances can be seen as adaptive coping strategies to compensate for the rupture in the perception-action cycle in patients with schizophrenia.
Interestingly, many patients report uncertainty when asked to indicate whether the voices they are hearing are auditory hallucinations, thought phenomena, or something in between, revealing an underlying spectrum between hearing voices and interpreting them as thoughts (Humpston & Broome, 2016). Overall, the phenomenology of psychotic experience can be explained well by the dynamics of hierarchical predictive coding, since both utilize the perception-action cycle within the hierarchical formalism (Kaminski et al., 2019). However, on a computational level, the posterior perception can be shifted either through the precision of the prior belief or through the precision of the likelihood. This makes it unclear whether patients with schizophrenia have stronger or weaker priors. There is indication that, in general, priors are weaker for delusions and stronger for hallucinations (Corlett et al., 2019; Stuke, Weilnhammer, Sterzer, & Schmack, 2019). The formation of imprecise prior beliefs occurs either through a faulty acquisition of prior beliefs or through an inability to use prior beliefs as inferences. Faulty prior beliefs can be consolidated either by misleading sensory information (e.g. actually being followed by black cars from the secret service) or if the encoding and/or detection of sensory information is disturbed (Lisman & Grace, 2005). Another possibility for faulty prior beliefs is that the detection and/or computation of higher order prediction errors is perturbed or inadequately precise (Adams et al., 2013; Adams, Huys, & Roiser, 2016). The inability to use prior beliefs as inferences, on the other hand, could occur if higher order beliefs contain erroneous predictions of the volatility of new sensory stimuli (e.g. mistaking a tree branch for a dangerous snake in the park). These different possibilities and individual differences in impairment could explain the heterogeneity in the symptoms of schizophrenia (Sterzer, Voss, Schlagenhauf, & Heinz, 2019).
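The contrast between a hallucination-like regime (stronger priors) and a delusion-like regime (weaker priors, stronger sensory reliance) can be reproduced numerically with simple Gaussian fusion. The function `posterior_mean` and all precision values below are arbitrary illustrations, not fitted to any data from the cited studies:

```python
def posterior_mean(mu_prior, prec_prior, mu_sens, prec_sens):
    """Posterior mean of two fused Gaussians: the precision-weighted
    average of the prior mean and the sensory evidence."""
    return (prec_prior * mu_prior + prec_sens * mu_sens) / (prec_prior + prec_sens)

mu_prior, mu_sens = 0.0, 1.0   # prior expectation vs. incoming evidence

neurotypical = posterior_mean(mu_prior, 1.0, mu_sens, 1.0)    # balanced: 0.5
hallucination = posterior_mean(mu_prior, 4.0, mu_sens, 0.5)   # precise prior dominates
delusion = posterior_mean(mu_prior, 0.25, mu_sens, 2.0)       # evidence dominates

print(neurotypical, hallucination, delusion)
```

The same function shows that either raising the prior precision or lowering the sensory precision pulls the posterior toward the prior, which is why the direction of the imbalance is hard to infer from behavior alone.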
For a simplified graphical overview of the changes in priors, posteriors, and likelihoods in patients with schizophrenia compared to neurotypical persons, see Fig. 3. Notably, this simplified model is a moment-to-moment depiction of a highly dynamic process with continual adaptation, and the relationship between delusions and the use of prior and sensory evidence is more complex (Stuke et al., 2019; Schmack et al., 2013; Teufel et al., 2015; Alderson-Day et al., 2017; Powers, Mathys, & Corlett, 2017). In contrast, there is evidence that higher order priors have a stronger impact on delusions, while lower-order priors are reduced (Stuke et al., 2019; Schmack et al., 2013; Schmack, Rothkirch, Priller, & Sterzer, 2017). More concretely, while in the formation of delusions the lower-level prior is decreased, higher order priors may be increased to compensate.

Fig. 3. The computational foundations of perception in psychosis. Top: In both neurotypical states and in preclinical states of psychosis, the computational mechanisms are undisturbed. Bottom left: In hallucinations (state abnormalities), either the prior precision is increased, the likelihood precision is decreased, or both (indicated by the difference between the dashed and solid lines of the respective color; the precision is represented as the alpha of the Gaussian curve). This shapes perception (black curve) towards an over-reliance on top-down predictions. Bottom right: In the formation of delusions (trait abnormalities), either the prior precision is decreased, the likelihood precision is increased, or both. This shapes perception towards an over-reliance on bottom-up sensory evidence. Subsequently, the low-level prior suppression that initially leads to the formation of delusions is maintained and strengthened by higher order priors (Stuke et al., 2019).
The circular inference model can explain the origin of hallucinations and delusions, and their consolidation. Additionally, possible disruptions in the feedforward and feedback loops can explain the diversity of symptoms at either extreme. Another advantage is the biological plausibility of this theory, which falls in line with the dysconnectivity hypothesis of schizophrenia (Jardri & Deneve, 2013; Notredame et al., 2014; Liang et al., 2006; Amad et al., 2014). However, there is still a need to validate the framework on the neurophysiological level.

Guiding principles of CDR
Cognitive robotics is concerned with providing robots with a cognitive architecture that allows them to engage in complex interactions with a complex world and to adapt to changes in the environment. A branch of this field, developmental robotics, focuses on the processes and mechanisms that allow lifelong and open-ended learning of new knowledge and skills in an embodied system, modeled after human infant development.
An important design principle in developmental robotics is modeled after the human developmental process, i.e. a process of incremental acquisition of knowledge from experience in a physically embodied system (Lungarella et al., 2003; Asada et al., 2009; Asada, MacDorman, Ishiguro, & Kuniyoshi, 2001; Stoytchev, 2009; Cangelosi & Schlesinger, 2015). The focus here is on identifying and implementing the basic behavioral and computational building blocks that enable the autonomous bootstrapping of motor and cognitive skills in artificial agents. Developmental robotics can therefore provide new understanding regarding the emergence of higher cognitive functions in humans. Researchers in this field face several issues. Incremental learning is a challenge yet to be solved by the artificial intelligence community (Nguyen et al., 2020). Updating an artificial neural network in an online fashion typically deteriorates the knowledge that has been previously acquired. Current measures employed to balance the stability and plasticity of these models still barely resemble those of the human brain.
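The deterioration of previously acquired knowledge under naive online updates can be demonstrated with even the simplest learner. The toy tasks and the `sgd` helper below are illustrative only and are not taken from any of the cited studies:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgd(w, data, lr=0.1, epochs=200):
    """Plain per-sample gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            err = dot(w, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Task A and task B demand conflicting outputs for the same input
task_a = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.5)]
task_b = [([1.0, 0.0], -1.0)]

w = sgd([0.0, 0.0], task_a)
err_a_before = abs(dot(w, [1.0, 0.0]) - 1.0)   # ~0 after learning task A

w = sgd(w, task_b)                             # naive online update on task B only
err_a_after = abs(dot(w, [1.0, 0.0]) - 1.0)    # ~2: task A has been forgotten

print(err_a_before, err_a_after)
```

Sequential training on task B overwrites the weight shared with task A, a minimal instance of the stability-plasticity problem described above.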
Another issue regards the problem of scalability. This arises when hidden assumptions in pre-programming make it increasingly difficult for the robot to autonomously adapt to situations that violate these assumptions. From this emerges the principle of verification in cognitive developmental robotics (Stoytchev, 2009). Verification requires that the robot be in charge of testing and verifying everything that it learns. Furthermore, for a robot to verify what it learns, it needs to act upon the world, and for this the robot needs a body (Stoytchev, 2009).

Modeling the self in CDR
While cognitive models of self-recognition in biological agents used to be very different from self-recognition models in artificial agents, novel algorithms and techniques of machine intelligence are coming closer to biological processes, i.e. timing, spatial information, and sensorimotor contingencies, and even to biological hardware (e.g. neuromorphic computing: Schuman et al., 2017), ultimately simulating the mechanisms thought to underlie the self in humans more thoroughly (Lanillos et al., 2020b; Stoytchev, 2009; Nguyen et al., 2020; Lanillos, Dean-Leon, & Cheng, 2016; Gold & Scassellati, 2009). In the following sections we review some of the modeling approaches used in CDR.

Comparator models in CDR
Comparator models are widely used in cognitive robotics. Optimal control theories (Wolpert & Kawato, 1998) identify two types of internal models: the inverse model maps a desired sensory state to the motor action that will most likely achieve it, and the forward model maps a motor action to a sensory outcome.
The inverse model is used for action selection; the forward model is used for mapping action-effect pairs and for comparing the achieved sensory state to the predicted one. The sensory prediction is based on the motor command (the efferent signal) and is then compared to the afferent sensory signals. According to the comparator model theory of the sense of agency, the congruence between the predicted and observed sensory consequences of an action gives rise to the sense of agency (Gallagher, 2000).
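This comparator logic can be condensed into a short sketch. The following toy example (all mappings, noise levels, and the agency threshold are our own illustrative choices, not taken from the cited works) shows how a learned forward model and a simple match criterion separate self-generated from externally caused sensory events:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sensorimotor mapping the agent has already learned perfectly.
W_true = np.array([[0.8, 0.1], [0.2, 0.9]])  # true motor-to-sensory mapping
W_fwd = W_true.copy()                         # learned forward model

def forward_model(motor_cmd):
    """Predict the sensory outcome from the efferent (motor) signal."""
    return W_fwd @ motor_cmd

def comparator(predicted, observed, threshold=0.5):
    """Attribute agency when prediction and afference match closely."""
    error = float(np.linalg.norm(observed - predicted))
    return error, error < threshold

cmd = np.array([1.0, 0.5])

# Self-generated action: afference matches the efferent prediction.
self_obs = W_true @ cmd + rng.normal(0.0, 0.05, size=2)
err_self, agency_self = comparator(forward_model(cmd), self_obs)

# Externally perturbed outcome, e.g. another agent pushes the arm.
ext_obs = self_obs + np.array([1.5, -1.0])
err_ext, agency_ext = comparator(forward_model(cmd), ext_obs)

print(agency_self, agency_ext)  # True False
```

In a real robot the forward model would be a learned function over high-dimensional sensory streams, but the structure of the comparison step stays the same.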
A study by Lang, Schillaci, and Hafner (2018) investigated learning in a humanoid robot that learned the visual sensory outcomes of self-generated movements through a self-exploration behavior. The sensorimotor experience acquired through this self-exploration was used as training data for a deep neural network integrating convolutional layers. The deep forward model mapped proprioceptive data (e.g. initial arm joint positions) and motor data (motor commands) onto the visual outcomes of these actions. This forward model was then used in two experiments. First, the forward model generated visual predictions of self-generated movements, which were compared to actual visual outcomes to compute a prediction error. Higher prediction errors occurred when an external subject was performing actions in front of the robot than when the robot was only observing itself performing the same arm movements (Lang et al., 2018). This is in line with the notion that prediction errors can be utilized for self-other distinction by way of body ownership, which is thought to be linked to the sense of agency (Braun et al., 2018; Ma & Hommel, 2015b; Ma & Hommel, 2015a; Möller, Braun, Thöne, Herrmann, & Philipsen, 2020). The results showed that prediction can be used to attenuate self-generated movements and to create enhanced visual perceptions: the sight of objects was maintained even when the view of an object was occluded by the robot's arm movement (Lang et al., 2018). This indicates that similar processes could be used to further our understanding of object permanence and short-term memory systems in humans.
In a biologically inspired model proposed by Schillaci, Ritter, Hafner, and Lara (2016), multimodal body representations were acquired by learning and predicting the robot's ego-noise, i.e. the auditory noise of its own motors during movement. In an ego-noise attenuation experiment, a predictive process was implemented by a forward model that took as inputs coherent and incoherent proprioceptive and motor information. The effects of coherent versus incoherent information showed in the performance of the ego-noise suppression: attenuation was stronger when the robot was the owner of the action, and less pronounced when the robot was only listening to the noise of a simulated moving robot. This is because greater prediction errors occurred when motor and proprioceptive information was incongruent with the predicted ego-noise. The "surprise" caused by this incongruence allowed the artificial agent to classify self-generated actions differently from those generated by other subjects.
Utilizing the fact that self-produced signals are likely to be maximally predictable, Schillaci and colleagues developed an artificial system that used the prediction of self-generated auditory information for sensory attenuation (Pico, Schillaci, Hafner, & Lara, 2016; Schillaci et al., 2016; Bechtle, Schillaci, & Hafner, 2016). This allowed the robot to classify self- and other-generated auditory signals. Concretely, when predictions match the self-generated signals, the comparator model filters them out, so that external auditory information stands out as more salient (or surprising).
However, the explanatory value of the comparator model for the sense of agency is debated. Zaadnoordijk et al. (2019) argue that the sense of agency requires not only a representation of the match between the predicted and observed sensory consequence of an action (the prediction error), but also a representation of an action performed by the agent (an "ownership predicate") and a representation of the causal relation between the (own) action and its effect. Lanillos et al. (2020b) used a "double comparator" model for robot body estimation, self-recognition, and self/other distinction tasks. In this model, the first comparator is used when the robot needs to infer the most plausible location of its arm (via the learned forward model) from the prediction error between observed and predicted sensory input. The second comparator considers the spatiotemporal contingencies between visual input from optical flow and the motor actions (joint velocities) of the robot to compute the probability that the sensor values were generated by the robot itself.

Active inference models
Predictive coding questions the need for an inverse model, and the resulting efference copy, for the achievement of goals. In fact, optimal control theories present difficult issues, among them the ill-posed problem of learning such inverse models (Pickering & Clark, 2014; Dogge, Custers, & Aarts, 2019). In predictive coding there are no reward or cost functions to optimize behaviors; instead, these are replaced by priors about sensory states and their transitions (Friston, Samothrakis, & Montague, 2012).
An overview of the differences between internal models in optimal control theory and in predictive coding can be seen in Fig. 4 and Fig. 5. Fig. 4 depicts the classical approach, in which an inverse model provides an efference copy of the motor command to an auxiliary forward model. In the integral forward model à la predictive coding, motor commands are replaced by top-down proprioceptive predictions (see Fig. 5). These can be viewed as control states that are translated into muscle-based coordinates fulfilled by classical reflex arcs.
Fig. 4. The auxiliary forward model. In this architecture, the inverse model outputs a motor command, which serves as input to the forward model that predicts the sensory feedback. (Adapted from Pickering and Clark, 2014.)

Under the Bayesian brain hypothesis, the brain is an inference machine that makes sense of the world from partial information. According to active inference, empirical information is reconciled with the world model either by changing the belief or by performing an action that alters the world in line with predictions. Active inference models in cognitive robotics usually entail a generative model that makes predictions, minimizing the prediction error between the expected and observed sensory effects of an action and thereby minimizing the free energy. Minimizing free energy, prediction error, or "surprise" can be achieved either by adapting the (generative) model that makes the predictions (perceptual inference) or by acting on the world, thus changing the sensory information (active inference). Active inference is implemented when actions are selected such that the free energy is minimized. For example, Tani and White (2020) reviewed a series of studies that employed analogous models for minimizing the free energy.
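The dual route of free energy minimization can be illustrated with a one-dimensional toy agent. This is a deliberately minimal sketch (quadratic free energy, noiseless sensing, hand-picked precisions; none of these choices come from the cited works): the agent reduces its prediction error both by updating its belief (perceptual inference) and by acting on the world (active inference):

```python
# 1-D toy agent: it expects (prior) a sensed value of 20 (e.g. temperature).
prior_mu = 20.0       # prior belief about the hidden state
pi_sensory = 1.0      # sensory precision (inverse variance)
pi_prior = 0.5        # prior precision

world = 26.0          # actual state generating sensations
belief = 22.0         # current posterior estimate

def free_energy(sense, belief):
    """Quadratic free energy: sum of precision-weighted prediction errors."""
    return 0.5 * (pi_sensory * (sense - belief) ** 2
                  + pi_prior * (belief - prior_mu) ** 2)

lr = 0.1
for _ in range(300):
    sense = world  # noiseless sensation, for clarity
    # Perceptual inference: gradient descent of F with respect to the belief.
    belief -= lr * (-pi_sensory * (sense - belief)
                    + pi_prior * (belief - prior_mu))
    # Active inference: act so that sensations come to fulfil the prediction.
    world -= lr * pi_sensory * (world - belief)

# Both routes together pull the world and the belief towards the prior.
print(round(world, 2), round(belief, 2))  # 20.0 20.0
```

With only the perceptual route the belief would settle between sensation and prior; adding the active route lets the agent change the world until its prior expectation is fulfilled, which is the defining move of active inference.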

Robotic modeling of self-disorders in schizophrenia
It has been hypothesized that schizophrenia symptoms, and the disorder itself, stem from system-level dysconnectivity (Friston & Frith, 1995; Stephan, Baldeweg, & Friston, 2006). Specifically, aberrant prediction-error signals stemming from underconnected neural networks would change the goal-orientation of the network, even without an overt change in behavior. Yamashita and Tani (2012) simulated such functional network dysconnectivity in a neural network-driven model in a humanoid robot. They used a hierarchical model of top-down and bottom-up networks representing the intention/goal level and the lower sensorimotor level, respectively. To test network dysconnectivity, they slightly modified the connective weights between the higher and lower levels of the model by adding varying levels of random noise, representing the changes in synaptic connectivity thought to occur in patients with schizophrenia. The task for the robot was to repeatedly move an object in two different ways, depending on the location of the object. The position of the object was changed by the experimenter at unpredictable times, which produced a temporary increase in prediction error. In turn, this increase in prediction error produced a modulation of the robot's intention state to minimize the prediction error, causing a flexible switching of behavior. Different levels of network dysconnectivity were manipulated to observe how the robot would deal with a surprising event on both the computational and the behavioral level, by observing the spike in prediction error, the switching between intentional states, and the observed patterns of motor behavior. The findings showed that mild network dysconnectivity produced an increase in prediction error but outwardly normal behavior.
However, higher, more severe levels of network dysconnectivity produced spikes in prediction error and irregular switching in intention states, and overt behavioral deficits such as disorganized actions, cataleptic (stopping or freezing) or stereotypic (repetitive) behavior. These findings were consistent with similar phenomena observed in patients with schizophrenia (Yamashita & Tani, 2012).
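The spirit of this manipulation can be conveyed with a heavily simplified stand-in (this is not Yamashita and Tani's multiple-timescale recurrent network; the two-level structure, noise injection, and error measure are our own illustrative choices): noise added to a top-down connection corrupts how well the lower level realizes the higher-level intention, and the mean prediction error grows with the noise level:

```python
import numpy as np

rng = np.random.default_rng(1)

def run_trial(noise_std, intention=0.7, steps=100):
    """One movement episode with a corrupted top-down connection."""
    w_topdown = 1.0 + rng.normal(0.0, noise_std)      # "dysconnected" weight
    t = np.arange(steps)
    target = intention * np.sin(0.2 * t)              # intended trajectory
    output = w_topdown * intention * np.sin(0.2 * t)  # produced trajectory
    return float(np.mean((target - output) ** 2))     # prediction error

# Mean prediction error over 50 episodes per dysconnectivity level.
errors = {s: np.mean([run_trial(s) for _ in range(50)])
          for s in (0.0, 0.1, 0.5)}
print({s: round(float(e), 4) for s, e in errors.items()})
```

In the original study the noise perturbs learned recurrent weights and the behavioral consequences (disorganized, cataleptic, or stereotypic actions) emerge from the closed sensorimotor loop; the sketch only reproduces the monotonic relation between dysconnectivity and prediction error.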
Artificial systems can also enter delusional states as a result of dysfunctions in predictive learning relating to the forward model. Predicting the outcomes of one's own actions on the body and on the environment is a crucial part of predictive learning in both biological and artificial systems. This requires the system to disambiguate self-induced sensory input from externally generated sensory information, using a forward model to filter the sensory signal, attenuating the reafferent component and learning from the residual signal. When sensory information is filtered too strongly, over-reliance on priors can cause a "delusional loop", in which the forward model is overly weighted and learning is stalled (Kneissler, Drugowitsch, Friston, & Butz, 2015). In this case learning does not reach proper convergence because the uncertainty of the forward model is not addressed. Specifically, if the forward prediction filters the sensory input too strongly, hyper-confidence in the prediction can lead to a delusional state that completely ignores new incoming information and stalls learning. To disentangle this problem, Kneissler et al. (2015) proposed a Bayes-optimal linear forward model, which they call "Predictive Inference and Adaptive Filtering" (PIAF). This method filters incoming sensory information while simultaneously improving the forward model, thus preventing delusional states. When attempting to model self-disorders, it is important to incorporate in the architecture the underlying mechanisms thought to be involved in the emergence and constitution of the self, such as body representations, multimodal integration, and predictive processes (Nguyen et al., 2020). In addition, one should also consider developmental aspects. In simple terms, if we want to study a self-disorder in an artificial agent, we need to provide a plausible model of an embodied, emerged self that underwent an iterative and interactive developmental process.
This is especially necessary when considering self-disorders as resulting from system-level impairments. It follows, then, that we also need to consider the measures and metrics for an artificial self (Georgie, Schillaci, & Hafner, 2019; Hafner et al., 2020). With the advantage of being able to look inside the "black box", researchers can analyze both computational and behavioral measures and indices (Lanillos et al., 2020b; Hinz, Lanillos, Mueller, & Cheng, 2018) of different aspects of the self. Zaadnoordijk et al. (2019) argue that the match between predicted and observed sensory signals is necessary, but not sufficient, to understand the emergence of the sense of agency in humans, or to bring about the sense of agency in a robot. The argument relies on the question: how does the robot know that the action was produced by itself? There are several possible options: (i) The robot does not know that the action was produced by itself. The categorization of signals as self-generated or non-self-generated comes about through the similarity between the observed and predicted signals at the signal level, and the interpretation of these categories as "self" and "other" is done by the researchers. (ii) "Self" and "other" are labels that were hard-coded into the robot, and as such, after categorizing the signals (as in (i)), the robot assigns the labels based on its pre-programming. (iii) The match between the observed and predicted sensory signals is used by the robot to infer the cause of that match (here, its own actions) and to infer that it is a distinct agent. Option (iii) is the only one that could lead to the sense of agency in a robot, Zaadnoordijk et al. argue. In this view, it is not the match between prediction and observation, nor a comparison between sensory signals, that leads to the sense of agency, but the process of inferring the cause of the match.
Therefore, in order to gain insight into the emergence of the sense of agency in the developmental sciences, and to develop a sense of agency in a robot, one needs to focus on the additional inferential process regarding the authorship of the action that caused the observed sensory signal (Zaadnoordijk et al., 2019). Yet it is not clear whether such an authorship attribution would be a pre- or post-reflective process. An unanswered question in this study is how this attribution could be achieved in an artificial agent. If one assumes that self-perception results from learning sensorimotor contingencies and cause-effect regularities, such an attribution could rely on the quality of the predictions of the learned forward models. A binary self-other labelling depending on this match is perhaps too simplistic, but the same machinery could be used to achieve the additional inferential process presumed by Zaadnoordijk et al. (2019).

Open questions and challenges in modeling the self in CDR
Ciria, Schillaci, Pezzulo, Hafner, and Lara (2021) provide a comprehensive review of robotic works employing predictive coding and active inference schemes, highlighting their limitations and suggesting avenues for further research. They point out that several challenges in applying active inference schemes in robotics remain unsolved. Learning is among the most evident ones: several methods are in use, but only a few are equivalent to the formulations of the free energy principle, and learning and testing are often decoupled. Furthermore, little attention has been paid to how multiple modalities, beyond proprioception and vision, can be integrated into a learning mechanism under an active inference scheme. Limited research has addressed scaling up the predictive coding paradigm towards higher cognitive capabilities, and the long-term possibilities of using generative models for perception, action, and planning in cognitive robotics remain unclear.

Crossing computational psychiatry and cognitive developmental robotics
The main advantage of linking computational psychiatry with cognitive developmental robotics is that cognitive and developmental processes in humans can be modelled more thoroughly. Not only is a representation of a body closer to reality, but a body might be an indispensable prerequisite for developing a self, be it biological or artificial (Neisser, 1988; Gallagher, 2006; Varela, Thompson, & Rosch, 2016; Pfeifer & Bongard, 2006). By detecting the crucial developmental aspects that lead to the emergence of a self, and by analyzing vulnerabilities that might lead to disruptions of self-experience, robots allow us to test hypotheses regarding aberrant self-development and functioning in embodied agents. Increasing evidence points to the notion that self-disorders such as schizophrenia arise from a fundamental disconnectedness from one's body rather than being a mere epiphenomenal symptom of the disease. Because cognitive processes are deeply entangled with the body that acts upon the world (for a review, see Wilson, 2002), only embodied artificial systems allow investigating this crucial embodiment aspect of such disorders. Robots can therefore be utilized as a testing ground for computational theories that also incorporate the enactive approach, as opposed to purely simulated computational approaches such as disembodied neural networks and AI. Furthermore, by simulating the development of human behavior, cognition, and emotions in artificial agents, we might gain a better understanding of human mental skills and their evolution, as well as of how to build more intelligent artificial agents (Weng et al., 2001).
An example of a suitable experiment in which computational psychiatry and cognitive developmental robotics could be linked is the "force matching task" (Shergill, Bays, Frith, & Wolpert, 2003; Shergill, Samson, Bays, Frith, & Wolpert, 2005; Bays, Wolpert, & Flanagan, 2005; Bays, Wolpert, Haggard, Rosetti, & Kawato, 2008). According to the comparator model, an agent needs the ability to differentiate one's body and actions from the sensations and events of the environment in order to perceive oneself. To distinguish one's body, the agent additionally needs to integrate multisensory afferent signals. To predict and attenuate the sensory feedback stemming from movement, an agent relies on efferent information that depends on forward model processing (Fig. 1) (Kilteni & Ehrsson, 2017). The brain usually decreases the salience of self-generated sensations (sensory attenuation) relative to externally generated ones, to avoid cognitive overload and to direct attention to external information that may carry more predictive value (Bays & Wolpert, 2007; Voss, Ingram, Haggard, & Wolpert, 2006; Voss, Ingram, Wolpert, & Haggard, 2008). This principle can be seen, for example, in the fact that healthy persons usually cannot tickle themselves: a touch with the same physical properties feels less intense when it is caused by oneself than when it is caused by another person, or even a machine (Weiskrantz, Elliott, & Darlington, 1971; Blakemore, Frith, & Wolpert, 1999; Wolpert, Ghahramani, & Flanagan, 2001). This reveals that the sense of ownership is a determining factor in the attenuation of somatosensory information (Kilteni & Ehrsson, 2017). Specifically, the sense of ownership updates the internal body state representation, which in turn sends information to the forward model, generating predictions during voluntary action (Kilteni, Maselli, Kording, & Slater, 2015).
Therefore, somatosensory attenuation can also be seen as an indicator for body ownership if the task entails the integration of active movement (Kilteni & Ehrsson, 2017).
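The behavioral signature of this attenuation, the overshoot observed in the force-matching task, follows from a one-parameter model. In this sketch (the attenuation factors are illustrative, not fitted values), a participant reproduces a perceived external force; because the self-generated force is perceived as attenuated, the reproduced force overshoots, and weaker attenuation, as reported for patients with schizophrenia, yields more accurate matching:

```python
# One-parameter attenuation model of the force-matching task.
# alpha: fraction by which a self-generated force is perceptually attenuated
# (illustrative values, not fitted to the cited experiments).

def matched_force(external_force, alpha):
    """Force produced so that its attenuated percept equals the target."""
    return external_force / (1.0 - alpha)

target = 2.0  # externally applied force, in newtons

healthy = matched_force(target, alpha=0.3)   # strong self-attenuation
patient = matched_force(target, alpha=0.05)  # weak attenuation

# Healthy participants overshoot; weaker attenuation gives more
# accurate matching, as reported for patients with schizophrenia.
print(round(healthy, 2), round(patient, 2))  # 2.86 2.11
```

The same two-line readout (produced force versus target force) is directly measurable on a force-sensing robot, which is what makes the task attractive as a shared human-robot paradigm.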
Sensory attenuation for self-perception has been studied in robots in the visual (Lang et al., 2018) and auditory domains, but has received limited exploration in the tactile domain. Existing robotics studies using the tactile modality in the development of multi-modal body representations and internal models (Gama, Shcherban, Rolf, & Hoffmann, 2020), in fact, do not address the role of prediction and sensory attenuation in self-perception. We encourage further exploration in this research direction. Modeling the self and its disorders in computational and robotic systems poses challenges in both computational psychiatry (Section 4.4) and CDR (Section 3.3). We have reviewed some of the relevant challenges in each field. Yet crossing these two fields may involve further challenges, rooted in both theory and implementation.
Most models used in computational psychiatry are probabilistic models that aim to closely represent the neuroanatomical structure of the cortex (e.g. predictive coding, circular inference, dynamic Bayesian networks, Kalman filters, variational Bayes recurrent neural networks; Friston, 2005; Clark, 2013). Indeed, many findings in neuroscience and quantum dynamics (e.g. the EPR paradox, Bell's inequality) point to the notion that the world, and the perception thereof, is of a probabilistic nature (Knill & Pouget, 2004; Wetterich, 2020; Mückenheim, 1983; Bell, 1964; d'Espagnat, 1979). Probabilistic and variational aspects of such frameworks can be implemented in robotics and AI, although at the cost of introducing difficulties in scaling up; probabilistic modeling of multi-modal integration and incremental learning is still challenging in this field. The probabilistic formalism elegantly explains mechanisms in computational psychiatry. However, several processes can be synthesized in robotics and AI by means other than Bayesian modeling. For instance, precision weighting, i.e. modulating the inverse variance of a given distribution, could be modelled through gating systems in deep sensor fusion models.
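The correspondence between precision weighting and learned gating can be made concrete with the underlying Bayesian operation: inverse-variance weighting of two estimates (a textbook cue-combination sketch; the numbers are illustrative):

```python
def fuse(mu_a, var_a, mu_b, var_b):
    """Precision-weighted (inverse-variance) fusion of two estimates."""
    pi_a, pi_b = 1.0 / var_a, 1.0 / var_b   # precisions
    mu = (pi_a * mu_a + pi_b * mu_b) / (pi_a + pi_b)
    var = 1.0 / (pi_a + pi_b)               # fused uncertainty shrinks
    return mu, var

# Vision is reliable here (low variance); proprioception is noisier.
mu, var = fuse(mu_a=0.0, var_a=0.1, mu_b=1.0, var_b=0.4)
print(round(mu, 2), round(var, 2))  # 0.2 0.08 -- the estimate leans on vision
```

A gating network in a deep sensor fusion model learns input-dependent mixing weights that can play the same role as these fixed precisions, without committing to an explicit probabilistic formalism.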

Guiding principles in linking computational psychiatry and robotics, in the study of schizophrenia as a self-disorder
In Section 3, we described Marr's three levels for understanding complex information processing systems and how to apply them to the study of patients with schizophrenia and to the development of self-models in robotics. For both human and robotics studies, the free energy principle can be applied at the first, computational level. At the second, algorithmic level, the predictive coding framework and active inference can be applied (Tani & White, 2020). At the last level, the hardware implementation, we have the human brain, in which computations are carried out mainly via electrical and neurotransmitter signaling. In patients with schizophrenia, the brain often shows anomalies such as grey matter loss and dysconnectivity in certain brain areas. In robots, on the other hand, the implementation is represented by neural network models running on the robot's processing hardware. Analogous to the disconnections in patients with schizophrenia, the artificial neural network models in the cognitive architecture of the robot can be disturbed (e.g. by adding noise, see Yamashita & Tani, 2012) to develop robotic lesion studies. Overall, Marr's framework provides a common language for designing comparable experiments between the human and robotic cognitive sciences. These implementations can be validated and may ultimately deliver insights into cognitive phenomena like the self and the loss thereof (Tani & White, 2020).
We argue that state abnormalities like hallucinations usually stem from an increase of prior precision, while trait abnormalities like delusions are more likely to stem from a decrease of prior precision (Sterzer et al., 2018). The formation and maintenance of delusions is thereby likely supported by dysregulated activity of dopaminergic neurons. This additional noise leads to a higher "aberrant salience attribution", thus drowning relevant stimuli in noise (Heinz, 2002; Miller, 1976; Kapur, 2003). Moreover, this noise inflation prevents relevant information from gaining enough novelty and salience to be incorporated into one's belief system (Lisman & Grace, 2005; Adams et al., 2016).

Deriving novel research topics for robotics from computational psychiatry
Patients with schizophrenia show consistent impairment in tasks that require explicit learning and memory, while implicit processing and learning (especially motor learning) seem to remain relatively intact (Horan et al., 2008). Illusions can be seen as the difference between objective and perceived object properties and are products of rational Bayesian inference, present both in healthy humans and in humans suffering from pathological disorders (Notredame et al., 2014). Evidence points to the notion that patients with schizophrenia are less susceptible to visual illusions than healthy persons (Notredame et al., 2014).
Besides learning, another interesting aspect of schizophrenia that could be studied with robots are dream states. It has been shown that patients suffering from schizophrenia not only experience sleep problems more frequently, but their dream states seem to be altered as well. Specifically, patients with schizophrenia experience significantly more nightmares compared to healthy controls which is also positively correlated with their subjective distress (Michels et al., 2014).
It has been suggested that acute schizophrenia can be described as a mind's in-between state of waking life and dreaming. While both the waking and the sleeping state are functional, an in-between state is dysfunctional, since the brain attempts to be in two conflicting states at the same time (Llewellyn, 2009). Interestingly, many researchers have noted phenomenological and neurobiological similarities between dream states and schizophrenia symptoms such as delusional beliefs, sensory hallucinations, instinctual behaviors, emotional disturbances, orientational instability, and bizarre imagery (Skrzypińska & Szmigielska, 2013; Hobson, Stickgold, & Pace-Schott, 1998). In both dream states and acute schizophrenia, the person is involved in internal, cognitive events characterized by an incongruity and discontinuity of cognition and perception, with rather limited connection to the outside world (Skrzypińska & Szmigielska, 2013; Hall, 1953; McCreery, 2008). Furthermore, a control mechanism that monitors the source of (internal or external) stimulation is lacking (Windt & Noreika, 2011). A study by Noreika, Valli, Markkula, Seppälä, and Revonsuo (2010) showed that the dream states of patients with schizophrenia are even more bizarre than those of a non-clinical population. At the neurobiological level, there are striking similarities between schizophrenia and dream states (REM sleep) from electrophysiological, topographic, and pharmacological perspectives. In both cases there is an impairment of inhibitory processes (Gottesmann, 2006) and suppressed gamma rhythms in visual areas and in prefrontal and frontal cortices (Pérez-Garci, del Río-Portilla, Guevara, Arce, & Corsi-Cabrera, 2001). Similar alterations in cerebral blood flow and decreased activation of the dorsolateral prefrontal cortex might further explain disturbances in mentation and self-reflectiveness (Callicott et al., 2000; Maquet et al., 2000).
The reduced thalamocortical gamma activity has previously been linked to the occurrence of hallucinations (Gottesmann, 2005; Behrendt & Young, 2004); higher dopamine activity during REM sleep might explain the loss of reflectiveness (Gottesmann, 2006; Gottesmann, 2005); and increased activity of the amygdala, driven by higher levels of glutamate, might lead to problems with the perception of emotions (Gottesmann, 2006). Lastly, the levels of noradrenaline, serotonin, and acetylcholine are significantly decreased in both REM sleep and schizophrenia, and acetylcholine in particular might be associated with hallucinations (Llewellyn, 2009) (for an in-depth review of the similarities, see Skrzypińska & Szmigielska, 2013). Despite the striking parallels, some questions remain unanswered. For example, it is yet unclear why visual hallucinations are more vivid in dreams, while auditory hallucinations dominate in schizophrenia (Skrzypińska & Szmigielska, 2013).
Patients with schizophrenia also show a specific deficit of sleep spindles (Manoach & Stickgold, 2019), i.e., the neural oscillatory activity occurring during a stage of non-REM sleep that presumably mediates long-term memory consolidation. This deficit correlates with impaired sleep-dependent memory consolidation. Consolidated memories are malleable and can be destabilized and reconsolidated (Sinclair & Barense, 2018). The rate of consolidation seems to be driven by prediction errors: a surprising experience, when incongruent with prior knowledge, destabilizes episodic memories and promotes their updating (Sinclair & Barense, 2018). Experiments have shown that prediction error-driven memory consolidation improves learning performance in artificial systems as well (Schillaci, Schmidt, & Miranda, 2020). Evidence links sleep-dependent memory consolidation with dreaming (Wamsley, 2014) and suggests that novel experience influences dream content, especially in the visual domain (de Koninck, Christ, Hébert, & Rinfret, 1990; Kussé, Shaffii-Le Bourdiec, Schrouff, Matarazzo, & Maquet, 2012; Wamsley, Tucker, Payne, Benavides, & Stickgold, 2010). One claim is that delusions and dreams can both be seen as states of deficient "reality testing" (Gerrans, 2014). Specifically, the model proposes that hallucinations and false memories emerge from faulty reality testing (Moulin, 2013; Bentall, 2003; Hobson, 1999), while dreams can be seen as hyperassociative default processing resulting from activation of the default mode network (DMN), which instantiates the "raw material" for delusions, confabulations, and narrative context for waking cognition. In a healthy brain, these confabulations can be overwritten and (dis-)confirmed by evidence, which is thought to be associated with activity in the right dorsolateral prefrontal cortex (Gerrans, 2014).
However, if these circuits are lesioned or hypoactive, as may be the case in patients suffering from schizophrenia, the hypothesis cannot be overwritten, so that beliefs go unevaluated (Gerrans, 2014). This link between dreams and delusions highlights the role of belief contextualization in the emergence of delusions: because these confabulations cannot be contextualized as confabulations or hypotheses, and implausible beliefs cannot be overwritten, decontextualized memories might explain why patients with schizophrenia experience psychotic symptoms (Gerrans, 2014).
In AI research, deep generative models have already been used to simulate dream states (e.g. Google's DeepDream). Such neural networks are trained to detect patterns in pictures even under high variance. Combining such an architecture with the iterative enhancement of the activation of certain layers creates pictures that look increasingly alien and far from reality the more cycles are applied. Over-excitation of neurons in similar generative models has been shown to lead to artificial hallucinations (Reichert, Series, & Storkey, 2013). Some of these "dreaming AIs" are based on aberrant salience attribution, which resembles the psychotic states explained above. Deep dreaming neural networks have therefore been proposed as mechanistic models of the pathogenesis of schizophrenia that can be used to generate and test predictions about psychosis (Keshavan & Sudarshan, 2017). Limited research has focused on implementing generative models in robots for studying similar phenomena. As multi-modal embodied agents, robots are ideal test-beds for these investigations. In fact, dreaming is not a uni-modal phenomenon, as it is often associated with strong sensorimotor activity (Hobson, Pace-Schott, & Stickgold, 2000; Speth & Speth, 2018). Empirical studies have shown that motor imagery can even be induced during REM sleep through transcranial direct current stimulation (tDCS) of the motor cortex (Speth & Speth, 2016). Studying multi-modal generative models and memory systems in robots could provide insights into the nature of dreams and hallucinations.
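The iterative enhancement underlying DeepDream-style models reduces to gradient ascent on a layer's activation with respect to the input. The following NumPy caricature (a fixed random linear-ReLU "layer" standing in for a trained network; not the actual DeepDream implementation) shows how the loop amplifies whatever patterns the layer happens to respond to:

```python
import numpy as np

rng = np.random.default_rng(0)

# A fixed random linear-ReLU "layer" stands in for a trained network;
# the 64-dimensional input plays the role of an 8x8 image.
W = rng.normal(size=(16, 64))

def activation(x):
    """Total energy of the layer's rectified responses."""
    return float(np.sum(np.maximum(W @ x, 0.0) ** 2))

img = rng.normal(scale=0.1, size=64)  # start from faint noise
act_start = activation(img)

step = 0.01
for _ in range(50):
    h = W @ img
    grad = 2.0 * W.T @ np.maximum(h, 0.0)  # d activation / d img
    img += step * grad                      # enhance what the layer "sees"

act_end = activation(img)
print(act_end > 100 * act_start)  # True: the responses are massively amplified
```

Real DeepDream backpropagates through a trained convolutional network and adds multi-scale regularization, but the amplification loop, and its resemblance to aberrant salience attribution, is the same.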

Conclusions
In this review, we aimed to unravel the interactions between different informational sources in the construction of the self, and how to synthesize them in robotic agents. Modeling self-disorders computationally, as in computational psychiatry, allows a better understanding of how they arise. To that end, we discussed several models (e.g., circular inference, predictive coding) that furnish a framework for explaining how and why some humans feel a disruption in their stream of consciousness or a demarcation from their body. We reported evidence for the different models from computational and empirical human studies, and assessed their biological plausibility.
Some evidence points to the notion that neuronal Bayesian dynamics might encode expected change in the environment (Hohwy & Seth, 2020; Hohwy, Paton, & Palmer, 2016). We stress that these computational foundations apply to biological and artificial agents alike: in both, a probabilistic generative model reduces uncertainty through top-down messaging (Hohwy & Seth, 2020). Applying predictive coding in human studies seems promising for capturing a fuller phenomenological picture of the self, by matching neural correlates and neural computations with the underlying conscious experience (Hohwy & Seth, 2020). In addition, employing this framework in cognitive developmental robotics shows promise for developing artificial agents with more advanced self-models (Georgie et al., 2019). These artificial self-models can in turn serve phenomenological research on the self and its disturbances. Specifically, predictive coding can explain multimodal integration through Bayesian-optimal approximation and offers explanations for an altered sense of ownership and sense of agency, as well as for hallucinations and delusions in clinical populations (e.g., over-reliance on perceptual priors).
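The Bayesian-optimal multimodal integration mentioned above can be illustrated with a minimal Gaussian cue-combination sketch, a textbook model in which two noisy estimates of the same quantity (say, hand position from vision and proprioception) are fused with precision (inverse-variance) weights. The function name and all numbers below are illustrative, not drawn from any cited study.

```python
def fuse(mu_v, var_v, mu_p, var_p):
    """Precision-weighted fusion of a visual and a proprioceptive estimate.

    Each cue is a Gaussian with mean mu and variance var; the posterior
    weights each cue by its precision (1 / variance).
    """
    w_v = (1 / var_v) / (1 / var_v + 1 / var_p)   # precision weight of vision
    mu = w_v * mu_v + (1 - w_v) * mu_p            # posterior mean
    var = 1 / (1 / var_v + 1 / var_p)             # posterior variance
    return mu, var

mu, var = fuse(mu_v=10.0, var_v=1.0, mu_p=14.0, var_p=4.0)
# mu = 10.8 (pulled toward the more precise visual cue), var = 0.8
```

On this view, an altered sense of ownership could correspond to pathologically shifted precision weights: down-weighting one modality drags the fused estimate toward the other, regardless of which cue is actually more reliable.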
One strength of the predictive coding framework is that the phenomenology of being an embodied agent with a self-concept is arguably created through active inference of an active self and its interoceptive states (Gallagher, 2000; Northoff, 2013; Neisser, 1988; Jeannerod, 2007; Jardri & Deneve, 2013; Huq et al., 1988; Lanillos, Dean-Leon, & Cheng, 2017; Dayan, Hinton, Neal, & Zemel, 1995; Tsakiris, 2017; Hafner et al., 2020). Moreover, inference might be driven by the regulation of the hidden causes of states rather than by their accurate representation (Hohwy & Seth, 2020; Seth, 2015a; Seth, 2015b; Wiese, 2014). These mechanisms, so far mostly applied to the study of biological agents, can be simulated and analyzed in cognitive developmental robotics, with the added value of access to the "black box". Understanding how neurocognitive processes shape phenomenology, and investigating their impairments in psychiatric disorders, might bring about new approaches to the emergence of the self in humans and spark ideas for developing a more sophisticated self-model in embodied artificial agents.

Funding
The work of T.J.M., L.K., and M.V. was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) in the project "Functional aspects of the minimal self-the case of schizophrenia" (DFG KA 4920/1-1 VO 1744/2-1). The work of Y.K.G. and V.V.H. was also funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), in the project "Prerequisites for the Development of an Artificial Self" (402790442). Both projects are within the Special Priority Program "SPP-The Active Self" (SPP 2134). G.S. has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 838861 ("Predictive Robots"). Predictive Robots is an associated project of the "SPP-The Active Self" (SPP 2134).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.