Stress and its sequelae: An active inference account of the etiological pathway from allostatic overload to depression

Survival requires the implementation of adaptive changes that demand energy resources. The efficient regulation of energetic resources thus plays a critical role in enabling systems to adapt to the demands of their internal and external environments. The framework of active inference explains how living organisms can build probabilistic models that enable them to predict, track, and regulate energy expenditure in the short and long run. The aim of the paper is to characterize the physiological changes that accompany stress, and the relationship between these changes and the loss of confidence in a system's predictions about its internal and external milieu, ultimately manifesting as depressive symptomatology. We identify the systems that underwrite goal-directed behavior, and the neuroendocrine and immunological systems, as the hierarchical controller that regulates energy resources. In doing so, we establish an etiological pathway from allostatic overload to depression via active inference.


Introduction
Living organisms must resist the dissipative influences of a constantly changing environment to survive. Homeostasis is the name given to the capacity of living creatures to maintain their internal environments within the relatively narrow range of states that ensure survival (Cannon, 1929). However, given that the demands of existence are ever changing, homeostasis is, for many creatures, not the complete story. Sterling and Eyer (1988) introduced the concept of allostasis, which they defined as the process of achieving stability through change. This construct acknowledges anticipatory bodily responses to expected alterations in one's situation. Allostasis requires a model of the self, the environment, and the interactions between them. The anticipatory role of allostasis (at least arguably) calls for the use of predictive theories of the mind (Hohwy, 2014).
The aim of the present work is to explore the etiological pathway from allostatic load to depression. We review theories of allostatic regulation in the context of a new framework for behavioral and neural modelling called active inference, which may furnish the first integrated framework to study behavior, cognition, learning, goal-motivated behavior, and evolution by natural selection. The purpose of this paper is to study the consequences of prolonged exposure to changes to allostatic setpoints (known as allostatic load) that occur because of environmentally induced stress. These adjustments have significant effects on the motivation and behavior of the individual. When sustained over time, allostatic load leads to depressive symptoms and results in a 'helpless' system that is unable to learn and is therefore confined to (or obliged to construct) predictable, nonvolatile environments.
We argue that there are three independent but layered and interconnected (i.e., heterarchical) systems that work together in allostasis and that are implicated in the development of allostatic load. At the top level of the heterarchy, there is goal-directed behavior that maps overt action to its consequences. When these higher-level predictions fail, the system falls into a state of uncertainty, wherein the neuroendocrine system triggers a stress reaction. This mid-layer anticipates that learning during stress will restore confidence over goal-directed policies. We argue that when stress is repeated over time and becomes chronic, confidence in the efficacy of the neuroendocrine system is lost, calling for the intervention of bottom level immune responses (long-term policies) that replace neuroendocrine control (short-term policies). In summary, the goal of the paper is to review the allostatic mechanisms that are involved in the development of allostatic load and the possible motivational and cognitive symptomatology associated with the different stages of the etiological pathway from allostatic load to depression.

Depression
According to the World Health Organization, major depressive disorder (MDD) affects more than 300 million people worldwide and is a prominent cause of disability. Negative schemas, or beliefs, have been consistently identified as an essential component of depression (Beck, 1967; Chekroud, 2015). There is agreement that MDD is characterized by negative expectations about the self and the world (Beck, 1967; Klein et al., 1976; Paulus and Stein, 2010; Northoff, 2007). However, the manifestation of clinical symptoms varies considerably among individuals (see Table 1 for the DSM-V diagnostic criteria for depression for adults and adolescents), and the search for their neurobiological and socio-environmental roots remains a key priority for psychiatric research.
At its core, MDD is an anhedonic-mood disorder associated with several cognitive deficits and motor alterations, which are thought to result from a complex interplay of affective, motivational, and cognitive factors (Austin et al., 2001;Ravnkilde et al., 2002). Cognitively, MDD is associated with increased attention towards negative stimuli and withdrawal from positive cues (Leppänen, 2006), as well as enhanced and persistent processing of negative emotional information (Siegle et al., 2002). The increased salience (and consequent attentional capture) of negative stimuli enhances the probability that negative information will be stored and recalled, resulting in bias towards negative stimuli (Disner et al., 2011).
At the behavioral level, depression is also characterized by diminished goal-directed activity and psychomotor retardation (Lemke et al., 1999). These behavioral aspects have been linked to anhedonia, namely, the inability to feel pleasure and the loss of interest in previously enjoyed activities. Anhedonia has been identified as the central symptom for the diagnosis of depression (Aoki et al., 2019;Leentjens et al., 2008). Notably, it often precedes depressive episodes, and it modulates the development of depression in anxiety disorders (Winer et al., 2017). Clinically depressed patients lack a self-enhancing attributional style observed in healthy subjects. Their attributional cognitive style drives them to internalize responsibility over negative events, and externalize agency for positive ones (Alloy and Abramson, 1979;Seligman et al., 1979). They also tend to estimate their errors more accurately, but underestimate their achievements (Dunn et al., 2007). Alloy and Abramson (1988) coined the term depressive realism for this phenomenon, as they observed that depressed patients make more accurate inferences (i.e., contingencies of events) about the world than non-depressed patients. Overall, these biases reinforce ruminative patterns of thought, which refer to repetitive self-referential thoughts often involving ideas of inadequacy. They occur without control of the individual and they are associated with decreased executive control affecting goal-directed behavior (Kraft et al., 2017;Matthews and Wells, 2004).
Recent approaches in computational neuroscience have extended Beck's original negative schemas of depression to autonomic regulation, as they posit that depression is rooted in maladaptive (non-propositional, probabilistic) beliefs about the internal and external environment, and about the interaction between them. Before discussing these ideas, we first need to introduce the basic concepts of this framework, called active inference.

Active inference
The active inference framework (AIF) is a new framework for modelling behavior in living systems (Friston, 2019) that inherits from Bayesian approaches to perceptual synthesis, such as predictive coding (Rao and Ballard, 1999). Heuristically, the AIF asserts that life is a self-fulfilling prophecy of sorts. Technically, the AIF describes the dynamics (i.e., the behavior) of living systems as a gradient descent on, or minimization of, variational free-energy (Ramstead et al., 2018). Very roughly, variational free-energy quantifies the difference between those sensory states that a system expects to register, based on beliefs about the causes of those observations, and those that it actually observes (Friston, 2010; Friston et al., 2016). In the AIF, an adaptive action is defined as an action that reduces this discrepancy between expected and observed sensory states, thereby realizing the system's preferences about the kind of sensorium that it experiences. In short, both perception and action work together to minimize free-energy, or the discrepancy between predicted and observed outcomes.
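As a minimal sketch of the quantity described above, the following example computes variational free-energy for a discrete model with two hidden states. The function name, the prior, and the likelihood values are hypothetical and chosen only for illustration. The sketch shows the standard property that free-energy is an upper bound on surprise (the negative log evidence), with equality when the approximate posterior equals the exact Bayesian posterior.

```python
import math

def free_energy(q, prior, likelihood_o):
    """Variational free-energy F = sum_s q(s) * [ln q(s) - ln p(o, s)].

    q            : approximate posterior over hidden states, q(s)
    prior        : prior over hidden states, p(s)
    likelihood_o : likelihood of the observed outcome under each state, p(o | s)
    """
    total = 0.0
    for q_s, p_s, l_s in zip(q, prior, likelihood_o):
        if q_s > 0.0:  # terms with q(s) = 0 contribute nothing
            total += q_s * (math.log(q_s) - math.log(p_s * l_s))
    return total

# Hypothetical two-state world (illustrative numbers only).
prior = [0.5, 0.5]
likelihood_o = [0.9, 0.2]  # p(o | s) for the outcome that was actually observed

evidence = sum(p * l for p, l in zip(prior, likelihood_o))  # marginal likelihood p(o)
exact_posterior = [p * l / evidence for p, l in zip(prior, likelihood_o)]
surprise = -math.log(evidence)

f_prior = free_energy(prior, prior, likelihood_o)            # beliefs not yet updated
f_exact = free_energy(exact_posterior, prior, likelihood_o)  # beliefs fully updated
```

Here `f_prior` exceeds `f_exact`, and `f_exact` equals `surprise`: updating beliefs (perception) lowers free-energy until it reaches the bound set by the evidence for the model.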
In active inference models, this activity entails the formation of a probabilistic model of the process that generated sensory outcomes: the so-called generative model. By 'probabilistic model', here, we mean a statistical (generative) model that captures probabilistic beliefs (in the form of sub-personal probability distributions or density functions) about the causal structure of the environment and how that structure generates observations. Active inference accounts of neural dynamics posit that the human brain entails a multi-layered generative model that minimizes the variational free-energy of its sensory states through hierarchical message passing. The advantage of using such a layered architecture is that the system can extract causal regularities that are pitched at different spatial and temporal scales; with 'higher' layers more sensitive to regularities and events occurring at slower timescales and pertaining to larger events, and with 'lower' layers responding preferentially to regularities at faster and smaller scales. Under this scheme, prior beliefs concerning future sensory inputs are passed from higher to lower levels within the neural hierarchy, while discrepancies between predicted and actual sensory states are passed back up the hierarchy to update higher-order predictions. Under simplifying assumptions, these signals that are shuffled up and down the hierarchy correspond to prediction errors and predictions of the sort that characterize predictive coding (Rao and Ballard, 1999). In this setting, minimizing free-energy reduces to minimizing prediction errors at each level of the hierarchy, based upon expectations about external states of affairs at multiple levels of abstraction.
In this paper, we will mostly speak of heterarchies rather than hierarchies (Pessoa, 2017, 2019). This is because the scientific image of the neural system as a well ordered, unambiguous layering of neural systems is an idealization. In reality, neural mechanisms play several different roles in different contexts, and their ordering is far from linear. This speaks to the notion of a heterarchy, which is a layering in which there is not one single allowable ordering. A key concept for hierarchical prediction error minimization is precision, which controls the relative weighting afforded to prediction errors during active inference. Precision can be viewed as a measure of confidence or reliability; mathematically, it corresponds to the inverse of the uncertainty (i.e., the inverse variance) of a distribution (Feldman and Friston, 2010). The brain needs to select which signals it should attend to, and it does so by adjusting their relative 'volume'. Prediction errors to which high precision is assigned will have more impact on inference, and will thus penetrate more deeply into the processing hierarchy.
A plausible biological mechanism for precision-weighting is the control of synaptic gain of the prediction error units; for example, modulation by serotonin, acetylcholine, dopamine, or norepinephrine (NE). This aspect of predictive processing has been related to attentional control in sensory processing (Feldman and Friston, 2010; Peters et al., 2017). On the other hand, sensory attenuation, or reducing the precision of afferent information, occurs when one is highly confident about the sensory consequences of one's own action (Clark et al., 2018). In other words, assigning high precision to prior beliefs reduces the impact of incoming sensory prediction error signals, and vice versa.
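The weighting described above can be illustrated with the textbook update for a Gaussian belief, in which the prediction error is scaled by the ratio of sensory precision to total precision. This is a minimal sketch under a single-level, one-dimensional Gaussian assumption; the function name and the numerical values are hypothetical.

```python
def precision_weighted_update(prior_mean, prior_precision, observation, sensory_precision):
    """Combine a Gaussian prior belief with one Gaussian observation.

    Precision is the inverse variance. The prediction error is weighted by
    the share of total precision that belongs to the sensory channel.
    """
    posterior_precision = prior_precision + sensory_precision
    prediction_error = observation - prior_mean
    gain = sensory_precision / posterior_precision
    posterior_mean = prior_mean + gain * prediction_error
    return posterior_mean, posterior_precision

# The same prediction error (observation 1.0 against a prior mean of 0.0)
# under two hypothetical precision settings.
confident_prior, _ = precision_weighted_update(0.0, 9.0, 1.0, 1.0)   # precise prior
confident_senses, _ = precision_weighted_update(0.0, 1.0, 1.0, 9.0)  # precise senses
```

With a precise prior the belief barely moves (`confident_prior` is about 0.1), whereas with precise sensory input the belief moves most of the way to the observation (`confident_senses` is about 0.9).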
Finally, a crucial element of the active inference framework is the reduction of expected free-energy (i.e., expected prediction errors) through the selection of beliefs about action and plans (known as policies). Essentially, a complementary way to minimize the discrepancy between expected sensory input and actual sensory input is to act in such a way as to bring about the sensory inputs that you prefer. Predictions about future sensory inputs can be derived from the generative model, such that inferences about how an action will affect external states can be made (Stephan et al., 2016). An active inference agent is essentially trying to work out "what must I be doing, given what I am perceiving and what I believe about the process that generated my observations." Active inference is then said to have a counterfactual or conditional aspect, since adaptive actions require the system to have expectations about how sensory signals will change under a specific policy (Corcoran et al., 2020; Seth and Tsakiris, 2018).

Neuromodulation, neuroendocrine, and immunological processes in AIF
Having established the fundaments of active inference, we are now in a position to rehearse the basic idea upon which this review is based. Active inference entails belief updating about states of the world and the most likely courses of action that will reduce uncertainty. Central to this formulation of sentient behaviour is the estimation and encoding of uncertainty or precision; in the sense that any probabilistic (Bayesian) belief entails a precision (Ainley et al., 2016;Clark, 2013;FitzGerald et al., 2015;Moran et al., 2013).
The idea here is that if generative models are equipped with representations of precision over increasing temporal scales, then certain phenotypes (i.e., people) can regulate belief updating and learning over distinct timescales. In this setting, the precision of beliefs is informed by (and informs) a prior precision that changes more slowly than the posterior estimates of precision that fluctuate from moment to moment. In turn, prior beliefs about precision are themselves informed by (and inform) hyperpriors over precision that change slowly over extended periods of time.
We conjecture that the precision of various beliefs, at three temporal scales, is encoded by neuromodulatory, neuroendocrine, and immunological states, respectively. For example, at the fast scales of active inference, dopamine may encode the precision of beliefs about policies that change at a timescale commensurate with fluctuations in attention, motor vigor, arousal, and so on (Parr and Friston, 2017). At a slower timescale, associated with fluctuations in emotions (Smith et al., 2019) and related dispositions, a high-dimensional neuroendocrine status encodes priors over various precisions. Similarly, one's immunological status supplies hyperpriors over neuroendocrine priors that may fluctuate over days and months, commensurate with the time constants of changes in mood (Badcock et al., 2017; Clark et al., 2018).
Crucially, each level contextualises and is informed by the level below. For example, situations in which no policies provide a definitive reduction in expected free-energy will lead to imprecise beliefs over courses of action and a sense of uncontrollability (Friston et al., 2014). This will be recognised by the neuroendocrine level, where belief updating will reduce the (empirical) prior over the kinds of policy in question. The ensuing reduction in the prior precision now allows hitherto less likely policies to compete for selection, effectively extending the repertoire of policies. In a similar way, the immunological encoding of hyperpriors over precision will recognise long-term trends in controllability and the precision or confidence in particular policies and adjust suitably.
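The effect of policy precision on the repertoire of policies can be sketched with the softmax form commonly used in discrete active inference schemes, in which beliefs about policies are a softmax of negative expected free-energy scaled by a precision parameter. The function name, the expected free-energy values, and the two precision settings below are hypothetical.

```python
import math

def policy_posterior(expected_free_energy, precision):
    """Softmax over negative expected free-energy, scaled by policy precision."""
    scores = [-precision * g for g in expected_free_energy]
    max_score = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical expected free-energies for three candidate policies.
G = [1.0, 2.0, 3.0]

high_precision = policy_posterior(G, precision=4.0)  # confident policy selection
low_precision = policy_posterior(G, precision=0.5)   # after a loss of confidence
```

Under high precision almost all probability mass sits on the first policy. Lowering precision flattens the distribution, so that policies with higher expected free-energy, which were previously improbable, can now compete for selection.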
There are some interesting aspects of this formulation of precision or uncertainty encoding in the embodied brain. First, the belief updating may not necessarily involve neuronal message passing. In other words, from an embodied perspective, the message passing at slower (neuroendocrine and immunological) timescales may rest on neurohumoral and metabolic pathways that operate at a much slower timescale than fast neuronal message passing. Having said this, as noted above the central nervous system (CNS) and peripheral neuroendocrine (and immunological) systems are intimately coupled in both directions. For example, the release of various neuroendocrine signals via the hypothalamic-pituitary-adrenal (HPA) axis speaks to the influence of the CNS on endocrinology. Conversely, as we will see below, neuroendocrine and immunological status can have profound effects on synaptic gain and plasticity, which contextualises neuronal message passing. Please see (Bhat et al., 2021) for an example.
Second, the implicit separation into three timescales speaks to the distinction between inference, plasticity, and structure learning in the brain. Active inference, as a free-energy minimising process, unfolds at many levels. These range from fast inference processes, through to slow updates in model parameters associated with synaptic plasticity and learning, to structure learning or Bayesian model selection operating at a much slower timescale (e.g., neurodevelopment or, indeed, synaptic regression during sleep) (Smith et al., 2020). This suggests that neuromodulatory encoding of precision may be expressed at the fast timescale of sentient interactions over seconds, while neuroendocrine fluctuations may be reflected in plasticity and learning. Finally, immunological status may be more associated with structure learning and reconfiguring generative models over extended periods of time.
The distinction between inference and learning will figure centrally in what follows, and has an interesting relationship to precision. In general, when states of affairs or actions become uncertain, the precision of prior beliefs will fall, and the relative precision afforded sensory evidence will increase. This means that more attention is paid to sensory information in situations that are more volatile or unpredictable. The same phenomenon occurs at the temporal scale of learning, where precision can be regarded as the learning rate. For example, increases in sensory precision induce greater associative plasticity, such that the Bayesian brain can learn about new contingencies in a volatile or novel environment. This means that getting the precision right at various hierarchical levels (and temporal scales) becomes crucial in terms of optimising both inference and learning. Below, we will exploit this by linking uncontrollability to learned helplessness.
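To illustrate the reading of precision as a learning rate, the following sketch tracks a contingency that switches halfway through a sequence of observations. It assumes a simple delta-rule learner in which a fixed learning rate stands in for the relative sensory precision; the function name, the observation sequence, and the two rates are hypothetical.

```python
def track(observations, learning_rate, initial_estimate=0.0):
    """Delta-rule learner: move the estimate towards each observation
    by a fraction (the learning rate) of the prediction error."""
    estimate = initial_estimate
    history = []
    for obs in observations:
        estimate += learning_rate * (obs - estimate)
        history.append(estimate)
    return history

# Hypothetical volatile environment: the contingency switches from 0 to 1.
observations = [0.0] * 10 + [1.0] * 10

slow = track(observations, learning_rate=0.1)  # low relative sensory precision
fast = track(observations, learning_rate=0.6)  # high relative sensory precision
```

Ten observations after the switch, the fast learner has almost fully adopted the new contingency, while the slow learner is still well short of it. A learner that cannot raise its learning rate when the environment changes therefore remains tied to outdated beliefs.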
Finally, an important facet of increasing the learning rate (or paying more attention to sensory information) is the computational and implicit energy costs. Technically, enabling a greater degree of belief updating (or learning) through a rebalancing of sensory and prior precision allows posterior beliefs to move further away from prior beliefs. This is known as a computational complexity cost that translates directly into thermodynamic or metabolic energy expenditure. One can see this in terms of brain activation, when measured with fMRI. The bottom line here is that Bayes optimal adjustments to precision, prior precision, and hyperpriors over precision will look as if the brain is allocating energy resources, in the sense of optimising the degree and timing of belief updating during inference and learning. It is in this sense that we refer to the control or allocation of energy resources by neuromodulatory, neuroendocrine, and immunological precision control.
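The complexity cost mentioned above is conventionally measured as the Kullback-Leibler divergence between posterior and prior beliefs. As a minimal sketch, assuming one-dimensional Gaussian beliefs parameterised by mean and precision, the example below shows that the same observation incurs a larger complexity cost when sensory precision is higher. The function names and numerical values are hypothetical.

```python
import math

def gaussian_kl(post_mean, post_precision, prior_mean, prior_precision):
    """KL divergence from a Gaussian prior to a Gaussian posterior,
    KL[q || p], with both densities parameterised by mean and precision."""
    return 0.5 * (
        math.log(post_precision / prior_precision)
        + prior_precision / post_precision
        + prior_precision * (post_mean - prior_mean) ** 2
        - 1.0
    )

def complexity_after_update(observation, sensory_precision,
                            prior_mean=0.0, prior_precision=1.0):
    """Complexity cost of one precision-weighted Bayesian update."""
    post_precision = prior_precision + sensory_precision
    post_mean = (prior_precision * prior_mean
                 + sensory_precision * observation) / post_precision
    return gaussian_kl(post_mean, post_precision, prior_mean, prior_precision)

# The same surprising observation under two hypothetical sensory precisions.
low_cost = complexity_after_update(observation=2.0, sensory_precision=0.5)
high_cost = complexity_after_update(observation=2.0, sensory_precision=4.0)
no_update = complexity_after_update(observation=2.0, sensory_precision=0.0)
```

When sensory precision is zero the posterior equals the prior and the cost vanishes; as sensory precision rises, the posterior moves further from the prior and the cost grows.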
In what follows, we now unpack this idea in relation to existing theories and the wealth of empirical work in this area. This material can be regarded as a selective review of material in the field that is synthesised under the active inference formulation above.

Active inference accounts of depression
In recent years, several theorists working under the umbrella of predictive coding and active inference have proposed that MDD is a consequence of atypical internal modelling (e.g., Barrett et al., 2016;Seth and Friston, 2016;Stephan et al., 2016). Notably, more than 40 years ago, Beck (1979) drew attention to the negative expectations about the self, the world, and the future exhibited by depressed individuals-the triad of beliefs that jointly constitute a major part of the brain's internal model. If the brain regulates organismic activity through the modelling of such beliefs, then depression may be rooted in the effects of maladaptive beliefs (Chekroud, 2015).
Broadly speaking, most proposals that apply ideas from the AIF to depression point to miscalibrations in precision weighting in the interoceptive and autonomic domain as the common root for the development of MDD. Due to the repeatedly reported and large variability in physiological alterations in depressed patients, most of these proposals focus on miscalibrations of interoceptive signaling. For instance, Paulus and Stein (2010) suggested that (propositional) negative views about the self amplify sub-threshold interoceptive signals. A failure to attenuate interoceptive prediction errors sinks the individual into a state of uncertainty due to relative loss in the confidence placed in predictive engagement with others, motivating individuals with depression to withdraw from their environments.
Later proposals have emphasized that the allocation and regulation of energy resources is at the core of the brain's functioning, and must be implicated in depression (Barrett et al., 2016; Stephan et al., 2016). These accounts suggest that an individual will be at risk for depression if they fail to allocate energy resources efficiently over an extended period of time. Stephan and colleagues (2016) defined dyshomeostasis as "a state of elevated interoceptive surprise (...) indexed by increased precision-weighted prediction errors about viscerosensory inputs" (p. 16). The inability to restore the body's optimal state might materialize in fatigue, a pivotal symptom in depression (Stephan et al., 2016). On this account, if fatigue fails to reduce interoceptive surprise (i.e., by motivating the individual to rest and withdraw from unpredictable situations), the system enters a second phase of generalization. Chronic dyshomeostasis indicates to the system that its viscerosensory predictions are defective, and it is thought to result in generalized feelings of uncontrollability and worthlessness, or what they called a metacognitive lack of self-efficacy. In the following section, we will unpack these ideas in further detail.
Similarly, the 'lock in' brain hypothesis posits that the depressed brain is unable to adaptively learn. The construction of an inefficient internal model that is extremely reactive (i.e., sustained cortisol levels after exposure to a stressor in MDD patients) can result in increased false alarms and overly precise (e.g., spurious) prediction errors. Miscalibrations in precision signaling underlie the inability of the system to learn, which creates a positive feedback loop of inefficient autonomic regulation, inducing pervasive negative affect and anhedonia. Some authors have proposed that the reallocation of energy resources during stressful events results in aberrant interoceptive predictions and biased beliefs, which are associated with alterations of the HPA axis (Heim et al., 2008).
Nevertheless, it remains unclear how the inability to regulate the internal environment impacts affective regulation and goal-motivated behavior, and the control mechanisms involved in interoceptive inference; particularly, modulating interoceptive precision. The present paper builds on the above proposals to provide an integrative active inference account of energetic regulation that rests on three distinct modules: the systems that underwrite motivated, goal-seeking behavior, the neuroendocrine system that is responsible for autonomic stress responses, and the immunological system. These modules work on different timescales and with high individual variability, and might therefore help explain the heterogeneity of depressive symptomatology. However, before pursuing these ideas further, we first need to tease apart different aspects of physiological regulation as construed under the AIF. We address these issues next.

Homeostatic control and homeostatic goals
As intimated above, the AIF is inspired by the insight that adaptive biological organisms must act in ways that bring them as close as possible to their preferred (i.e., phenotypic) sensory states, or risk perishing. This follows from the consideration that surprising sensory states-that is, those states that elicit a large prediction error, relative to the sensory evidence that is consistent with continued survival-cannot be sustained for very long before the system begins to break down. For instance, failure to maintain interoceptive signals, such as blood oxygen concentration and pressure levels, within their expected bounds will drive the system into a perilous region of state space, precipitating hypoxia and eventual death. Active inference is thus deeply concerned with the maintenance of biological variables within a window of viability through loops of adaptive action, i.e., homeostasis.
Homeostasis is classically conceived in terms of closed-loop control, whereby essential physiological variables are monitored in relation to prespecified setpoints (i.e., goal states) (Cannon, 1929). Deviations beyond setpoint bounds are countered through corrective autonomic actions designed to reinstate acceptable parameter values. Under active inference, these setpoints are replaced by prior preferences about the range of sensory inputs. This reformulation of the traditional homeostatic feedback loop effectively absorbs both the evaluation of interoceptive states and the prescription of corrective actions under a single, unified scheme, in which deviations from homeostatic setpoints automatically drive corrective action via the recruitment of autonomic reflexes (Pezzulo et al., 2015). This arrangement is formally identical to active inference accounts of classical motor reflexes, which reformulate motor commands in terms of proprioceptive expectations (Adams et al., 2013; Friston, 2011), only now extended to actions in the internal environment.
Figure caption: The state x represents a constant state of the body, and the brain holds a belief p(x) about that state of the body. This belief will be updated sequentially over time through viscerosensory input (blue lines), which conveys the current available evidence. The posterior belief here results from a precision-weighted prediction error, which represents the difference between prior predictions p(x) and the current physiological state of variable x. This update rule can be turned to trigger corrective action (red lines), by which physiological variable x can be influenced through another precision-weighted prediction error. Allostatic predictions (green lines) represent beliefs that can change the mean and/or precision of homeostatic beliefs over time, to avoid dyshomeostasis in the future. Adapted from Stephan et al. (2016) Fig. 1.
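The control loop described above can be sketched as a simple proportional controller, in which the setpoint plays the role of a prior preference and corrective action is driven by a precision-weighted prediction error. This is an illustrative toy under stated assumptions, not the model of Stephan et al. (2016): the function name, the gain, the temperature-like values, and the constant disturbance are all hypothetical.

```python
def regulate(x0, setpoint, precision, steps, disturbance=0.0, action_gain=0.5):
    """Simulate a physiological variable x under reflex-like control.

    At each step the prediction error (preferred minus sensed state) is
    weighted by precision and converted into a corrective action.
    """
    x = x0
    trajectory = [x]
    for _ in range(steps):
        prediction_error = setpoint - x
        action = action_gain * precision * prediction_error
        x = x + action + disturbance  # the body responds to action and to the environment
        trajectory.append(x)
    return trajectory

# Homeostasis: a perturbed variable is brought back to its preferred value.
homeostatic = regulate(x0=39.0, setpoint=37.0, precision=1.0, steps=20)

# A persistent disturbance leaves purely reactive control with a standing error.
reactive = regulate(x0=37.0, setpoint=37.0, precision=1.0, steps=50, disturbance=-0.5)

# Allostasis (sketch): the mean of the homeostatic belief is shifted in
# anticipation of the disturbance, so the variable itself stays at 37.
anticipatory = regulate(x0=37.0, setpoint=38.0, precision=1.0, steps=50, disturbance=-0.5)
```

In this toy setting the reactive controller settles one unit below the preferred value, whereas shifting the setpoint in advance keeps the variable where it should be. The price is a standing prediction error relative to the shifted belief, which hints at how sustained allostatic adjustment can become costly.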

Anticipation and allostasis
Organisms are not merely reactive agents, but are also endowed with the capacity to adapt in anticipation of changing physiological or environmental conditions. Such predictive or prospective modes of regulation are captured by the concept of allostasis. Allostasis emphasizes the importance of anticipatory control mechanisms that serve to initiate compensatory physiological adaptations in advance of (rather than in response to) homeostatic perturbation (Sterling and Eyer, 1988). Although a variety of allostatic theories have been developed over the past three decades (see Corcoran and Hohwy, 2018), notions of top-down, anticipatory regulation remain integral to its contemporary usage (see, e.g., Ramsay and Woods, 2014;Schulkin and Sterling, 2019).
Effective allostatic regulation calls for a model of the way sensory states are likely to evolve through time, and the way potential actions are likely to impinge on sensory flows. While some of these causal dependencies might be captured in model parameters endowed by genetically inherited information (Allen and Tsakiris, 2018), many will need to be acquired (or updated) through the course of the agent's interactions with its environment. Learning mechanisms thus play a pivotal role in allostasis, enabling the organism to tap into predictive (internal or external) environmental cues and infer the policies most conducive to the long-term maintenance of homeostasis (for recent reviews, see Ramsay and Woods, 2016;Schulkin and Sterling, 2019).
Some forms of allostatic regulation are deeply embedded within the organism's phenotype, reflecting the enduring stability and salience of these environmental regularities over the course of phylogenetic development (Ramstead et al., 2018). One such example is the circadian rhythm, which in humans (and many other animals) regulates diurnal patterns of physiological variation (e.g., temperature and blood pressure levels) and coordinates adaptive behavior (e.g., sleep-wake cycle; Moore-Ede, 1986). While such rhythms are genetically encoded within most (if not all) biological organisms, they are also subject to online modulation to ensure the organism's ongoing attunement with its environment (e.g., adjusting to seasonal variation in the photoperiod; or, more abruptly, jetlag). Under the AIF, such adaptation corresponds to a form of structure learning or model parameter updating (see Corcoran et al., 2020).
Homeostatically-relevant model updating may also occur over much shorter timescales in response to a few highly salient events. Allen and Tsakiris (2018) construe an episode of food poisoning as an example of one-shot interoceptive learning, in which a single experience is sufficient to form a deep and enduring aversion to the offending dish. The origins of such profound aversion lie in the 'hyper-precision' afforded to interoceptive prediction errors, on account of their relevance for ongoing homeostasis and survival. To prevent repeated engagements in potentially fatal activities, these interoceptive warning signals compel rapid and dramatic model updates that label sensory features of the event as aversive or harmful. In other words, the agent updates its beliefs about the sorts of things it expects (prefers) to consume, and those it expects (prefers) to avoid.
Affect and emotion are at the core of interoceptive inference (Barrett, 2017;Barrett and Simmons, 2015;Seth, 2013;Seth and Friston, 2016). Under the AIF, the brain is continuously updating an internal model of the body in the world. These models derive interoceptive predictions and "their consequences for allostasis are made available to consciousness as affect" (Barrett, 2017, p. 7). In other words, emotions do not arise from the interoceptive signals per se, but rather emerge as an attribute of visceral predictions or inferences (Allen and Tsakiris, 2018;Seth and Friston, 2016). Recent work has argued that emotional valence, which represents the positive and negative attributes of emotional states, emerges from the perceived fit between an organism and their environment (Hesp et al., 2021). Particularly, emotional states reflect changes in expected uncertainty given the current action, which results in changes to the confidence (precision) ascribed to an action and its sensory consequences (Clark et al., 2018).
To sum up, according to active inference, in order to continue existing, living systems attempt to minimize the discrepancy between the data that they expect (or prefer), and the data that they sense (or anticipate contingent on action). The difference between the expected and actual data is quantified by variational free-energy (while the difference between preferred and anticipated outcomes is quantified by expected free-energy). It follows that a good model is one that minimizes free-energy (i.e., that selects actions which lead to predictable outcomes). Interestingly, free-energy can be thought of as a summary of how good the model is, in the sense that a good model endowed with predictive power will generate less free-energy on average than a poor model (Technically, a good model is a model that maximises the marginal likelihood of observable outcomes, which is the same as minimising variational free-energy). In active inference, the policy that ends up being selected is the one that minimizes the expected free-energy, i. e., the free-energy that is expected to arise as a result of pursuing a specific course of action or policy.
Recently, these constructs were used to operationalize emotional valence (Hesp et al., 2021; Miller et al., 2021). In brief, in these active inference formulations, valence arises from "error dynamics," or fluctuations in the difference between the free-energy expected under the pursuit of some policy and the free-energy that is actually registered. This quantity has been labelled "affective charge" (Hesp et al., 2020), with positive affective charge corresponding to lower than expected free-energy, and vice versa. Heuristically, in these accounts, valence can be mapped onto inferences about 'how well I am doing, in predicting outcomes'. Positive valence arises from a greater reduction in free-energy than was expected under that policy. In other words, 'if my actions generate less free-energy than I expected, my model is probably a good model'; so the agent can infer that 'I am doing well'. On the other hand, negative valence is associated with actions that generate more free-energy than was expected, with the implication that the model is not a good predictor of outcomes.
An important result of this formalization of valence is its implications for a model's estimations of confidence in its own inferences and plans: fluctuations in affective charge can be used to estimate the reliability of the free-energy generated by the model. Positive affective charge (i.e., generating less free-energy than expected on average under some policy) will endorse the model's belief in its own reliability by increasing precision (i.e., 'I know exactly what I'm doing'), while negative affective charge will reduce the precision afforded to beliefs about plans (i.e., 'I don't know what to do').
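As a purely illustrative sketch of this logic (and not the update scheme used by Hesp et al., 2021), affective charge can be computed as the difference between the free-energy expected under a policy and the free-energy actually registered, and then used to adjust the precision afforded to beliefs about policies. The function names, the linear update rule, and the `rate` and `floor` parameters are assumptions introduced here for illustration only.

```python
def affective_charge(expected_fe, registered_fe):
    """Positive when less free-energy was registered than was expected."""
    return expected_fe - registered_fe


def update_precision(precision, charge, rate=0.5, floor=1e-3):
    """Toy linear update (an assumption): positive charge raises precision,
    negative charge lowers it, and precision never drops below `floor`."""
    return max(floor, precision + rate * charge)


# 'I am doing well': less free-energy than expected, so confidence increases.
better_than_expected = update_precision(1.0, affective_charge(2.0, 1.5))

# 'I don't know what to do': more free-energy than expected, so confidence decreases.
worse_than_expected = update_precision(1.0, affective_charge(1.5, 2.0))
```

In this toy setting, a run of negative affective charge drives precision towards its floor, which is one simple way to picture the loss of confidence in plans described above.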

Motivated action control
The link between interoceptive predictions and affective states sets the basis for the emergence of motivated action control by modulating approach and avoidance behaviors, allowing the development of more elaborate allostatic models. Motivated control refers to the regulation and coordination of behavior oriented towards the attainment of affectively valenced goals (Pezzulo et al., 2018). Adaptive motivated control relies on learned associations that (1) relate to the contingencies in the environment (control domain), and (2) inform the desirability of outcomes (motivational domain).
Classically, the study of learning is associated with Pavlovian conditioning that was later expanded to operant conditioning, which refers to the process by which a behavior is strengthened or extinguished through positive reinforcement or punishment (Skinner, 1938). The acquisition of conditioned and operant responses rests on the principles of contiguity (i.e., events co-occur in time) and contingency (i.e., a causal relationship between events has a predictive value for the organism) (Elsner and Hommel, 2001).
Classical and operant conditioning allow organisms to learn about their environments and to develop adaptive behaviors that can be exploited to their benefit. However, preferences about events are necessary to navigate complex environments. In their mathematical model of homeostatic reinforcement learning, Keramati and Gutkin (2014) describe how the rewarding value of an outcome depends on the fulfilment of the homeostatic needs of the organism. In this way, organisms learn to navigate their environments as a function of the potential consequences that external events have for their homeostatic states. For instance, we will take a blanket when we feel cold (i.e., the observed temperature is lower than the desired one), but we are very unlikely to do so if we feel warm.
Notably, for learning to occur, stimulus-stimulus associations need to reflect reliable statistical contingencies. That is, stimuli (causes) come to acquire value when they are reliable predictors of other (aversive or pleasant) stimuli (consequences). There is substantial evidence from animal studies suggesting that motivational aspects are largely mediated by controllability over events. Controllability is defined there as the conditional probability (P) of an outcome (O) provided that an action (A) or lack of action (A') is taken. Given this definition, an agent will be in control if and only if P(O | A) ≠ P(O | A') (Maier and Seligman, 1976). Control thus refers to the anticipation that current policies will bring the desired states (and have actual counterfactually relevant effects).
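The definition of control given above can be illustrated with a minimal sketch that estimates the two conditional probabilities from observed trials and compares them. The function name, the use of an absolute difference as a graded index, and the example data are assumptions made here for illustration; they are not taken from Maier and Seligman (1976).

```python
def controllability(outcomes_given_action, outcomes_given_no_action):
    """Estimate |P(O | A) - P(O | A')| from binary trial records.

    Each argument is a list of 0/1 flags marking whether the outcome
    (e.g., the shock stopped) occurred on a trial with or without the action.
    """
    p_action = sum(outcomes_given_action) / len(outcomes_given_action)
    p_no_action = sum(outcomes_given_no_action) / len(outcomes_given_no_action)
    return abs(p_action - p_no_action)


# Escapable condition: the action usually produces the outcome, inaction never does.
escapable = controllability([1, 1, 1, 0], [0, 0, 0, 0])

# Inescapable condition: the outcome is equally likely with or without the action.
inescapable = controllability([1, 0, 1, 0], [0, 1, 1, 0])
```

A value of zero corresponds to the case P(O | A) = P(O | A'), in which the agent has no control over the outcome.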
In the late 1960s, a group of psychologists exposed healthy animals to unpredictable and inescapable shocks (Overmier and Leaf, 1965). Later, these same animals did not attempt to escape escapable shocks. This is referred to as learned helplessness, which arises from the previously experienced uncontrollability of actions over outcomes (i.e., electric shocks) (Maier and Seligman, 1976). Helplessness is produced only by previous exposure to unpredictable and unavoidable stressors (Lieder et al., 2013). The conditions of contiguity (i.e., no cues that predict presentation of shock), and contingency (i.e., pressing lever has no effects), necessary for attributing motivational aspects to stimuli are not met in learned helplessness paradigms. Under such conditions, prior beliefs are not suited to guide action, and therefore they should not be consolidated. Computationally, uncontrollability implies lack of contingency between an action and the attainment of the desired outcomes (Lieder et al., 2013). This means that motivated actions that were employed to bring the desired states are afforded very low precision. In this way, action and inaction predict the same sensory consequences, so that energy resources are not allocated to futile policies, i.e., when P(O | A) = P(O | A').
The attribution of incentive salience has been directly linked to the production of dopamine, which guides the consolidation of certain associations (i.e., positive reinforcement) (Berridge and Robinson, 1998). Dopamine is thought to mediate the 'wanting', that is, the motivational aspect of stimuli, but not the 'liking' itself. Notably, the presence of uncontrollable stressors is associated with reductions in dopamine concentrations in the nucleus accumbens, halting positive reinforcement (Cabib and Puglisi-Allegra, 2012). Recent computational work supported the idea that dopamine indexes the expected confidence about the desired outcomes (Schwartenbeck et al., 2015). Only when we feel we are in control do we become motivated to engage in action (Pezzulo et al., 2015).
In the face of an uncontrollable stressor, serotonin is thought to be secreted, which has been directly linked to freezing behavior and the inhibition of dopamine production (Christianson et al., 2009;Maier and Watkins, 2005). As a result, different components of reward processing can be altered: (1) the experience of pleasure; (2) evaluation of reward and cost-benefit analysis; (3) prediction anticipation and motivation (Der-Avakian and Markou, 2012). The take home message is that motivational processes, and consequently behavior and learning, will be modulated according to the degree of confidence in action-outcome contingencies (control). A recent active inference model tested this hypothesis further to show that apathy (or lack of motivated action) results from reduced prior precision about the consequences of actions (Hezemans et al., 2020).
However, stable environments facilitate learning by providing reliable stimulus-stimulus contingencies to the organism. Increased confidence about policies allows for the emergence of habitual behavioral controllers, which enable some behavioral patterns to be stabilized so new ones can be acquired (Pezzulo et al., 2015). In other words, because of their high precision, prior policies override posterior beliefs about reward contingencies in the environment (i.e., sensory prediction errors are assigned low precision). This allows the emergence of more complex behavioral patterns by allocating belief updating resources to the learning of new action-outcome contingencies. Belief updating refers to changes in neuronal activity encoding (usually some personal) beliefs or expectations about states of affairs or policies in play. The degree of belief updating rests upon the precision of prediction errors that drive revisions to expectations and subsequent predictions at each level of the heterarchy. Technically, the energy expenditure that underwrites belief updating increases with the precision of prediction errors (Jarzynski, 1997). In other words, in changing one's mind from prior beliefs to posterior beliefs, there is an inevitable computational cost that is reflected directly in terms of metabolic activity. Therefore, one can regard the control of precision as controlling the computational resources for perceptual synthesis and policy selection.

Complex goal-motivated behavior and the emergence of conflict in active inference models
So far, we have seen how low-level motivational drives (visceral signals), along with low-level control processes (sensorimotor possibilities), complement each other to support learning. In this way, simple forms of homeostatic regulation, like autonomic reflexes (e.g., salivation), lay the foundation for the development of more complex conditioned responses (e.g., lever pressing).
Conditioned responses can be contextualized by a further hierarchical (or heterarchical) layer that integrates contextual cues with instrumental responses. Goal-directed behavior is thus distinct from conditioned responses in that it allows systems to anticipate future states, and thus adjust preferences or goal states as a function of the outcomes that are predicted (Pezzulo et al., 2015). Most complex forms of allostasis require generative models that can predict the sensory consequences of potential actions across multiple modalities, imbuing active inference with a counterfactual aspect (Corcoran et al., 2020;Seth and Tsakiris, 2018;Tschantz et al., 2021). These generative models also reflect the ability to integrate more distal, long-term contingencies, as allostatic models increase in complexity.
Complex allostatic models emerge over the course of the organism's developmental maturation and its interactions with its environment. An especially relevant (but often overlooked) aspect of this developmental trajectory is the role of the social environment in motivational control: generative models come to encode the policies and goals (i.e., goal states and prior preferences about sensory states) that are relevant for the society or cultural niche in which the organism is embedded.
A key aspect of motivated control is the complexity of the decision problem. This is because the coordination of behavior is a drive-to-goal decision problem, in which one must prioritize what is most relevant for well-being (i.e., taking a nap versus studying for the exam), and then select what is the best policy accordingly (i.e., going to the library, or studying at home) (Pezzulo et al., 2018). Complex goal-directed behavior involves the prioritization of some prior preferences over others, and the selection of policies that will ensure their attainment.
Low levels of complexity in the control domain reflect the conflict between current sensorimotor affordances (i.e., lying in bed, sitting on a chair); while at higher levels, this requires the coordination of plans on higher temporal scales and executive control (i.e., going to the library, or studying at home). The motivational domain differentiates between low-level visceral drives (i.e., sleepiness) and higher-order goals (i.e., getting good grades). In this way, these two dimensions complement each other to propagate prior preferences or goals (in the control domain), and their precision or relative influence is informed by the motivational domain.
Allostatic models are therefore built as a self-scaffolding structure that grows in complexity. This complexity, however, opens more possibilities for dissonance within each layer, and across levels. For instance, conflict is more likely to arise within the same level as a function of the number of current preferences or action plans (i.e., studying in your room, going to the library, studying with a friend, etc.). Different levels can also come into conflict as higher-level drives, or distant goals (i.e., 'I want to have good grades'), preclude lower-level ones or current states (i.e., 'I want to sleep'), or vice versa. Functional integration of goals and drives calls for models that can make accurate inferences about state transitions, as they account for the effects that a certain policy would have at different levels of the hierarchy. In other words, action selection needs to track the way in which policies will reduce expected free-energy in the short term and the expected free-energy in the long run.
Adaptive, motivated control largely depends on the ability to implement suitable policies that guarantee the attainment of goals through homeostasis and adaptive action, as well as the ability to flexibly adapt preferences to the situation through allostatic control. The student will have a nap only if she believes that setting an alarm would prevent her from sleeping all night. Here, we tap again into a key aspect of goal-directed behavior; namely, controllability. If an agent is to believe that its actions will bring the desired consequences, it must also trust (i.e., have confidence in) its (counterfactual) predictions.
Control refers to the ability of an action, or lack of action, to alter the probability of an outcome (Maier and Seligman, 2016). Therefore, controllability is reflected in the confidence in (a.k.a., the precision of) predictive signals that are expected to bring the system closer to its homeostatic needs. We observe two important components for the system to be able to implement goal-directed behavior: (1) the availability of policies that guarantee the attainment of goals (i.e., setting an alarm), and (2) the ability to flexibly adapt preferences (i.e., it is okay to lose some hours of studying, because I need to rest if I want to do well on the exam).

Pathways to thwarted allostasis
So far, we have highlighted the imperative for living organisms to exert control over themselves and over their environments. Survival demands adaptive actions that optimize the fit between individual and environment. In active inference, adaptation may be mediated by actions that bring about preferred sensory states (i.e., pragmatic action), or by actions that procure information about the state of the world (i.e., epistemic action), thus facilitating pragmatic decision-making in the future. The distinction between pragmatic and epistemic imperatives for policy selection inherits in a straightforward way from the nature of expected free-energy. In statistical terms, free-energy can be expressed as complexity minus accuracy. When considering expected free-energy in the future, the expected complexity becomes risk, and the expected inaccuracy becomes ambiguity. This means that policies that minimize expected free-energy necessarily have risk-reducing, pragmatic and ambiguity-reducing, epistemic aspects. Risk, in this setting, is simply the difference between anticipated outcomes and prior preferences. The degree to which different policies minimize expected free-energy determines their relative likelihood of selection. If all policies reduce expected free-energy to the same degree, then there is an inherent uncertainty about which policy is the most likely and a concomitant loss of controllability. This goes hand-in-hand with a loss of confidence (i.e., precision) about 'what to do next'.
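The decomposition of expected free-energy into risk and ambiguity, and its role in policy selection, can be sketched for a small discrete model. The sketch below assumes the formulation commonly used in discrete active inference schemes: risk is the Kullback-Leibler divergence between anticipated outcomes and prior preferences, ambiguity is the expected entropy of the likelihood mapping, and policies are scored by a softmax of negative expected free-energy weighted by a precision parameter `gamma`. All function names and numerical values are hypothetical and chosen for illustration.

```python
import math


def kl_divergence(q, p):
    """KL[q || p] for discrete distributions (terms with q_i = 0 contribute 0)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def entropy(p):
    """Shannon entropy of a discrete distribution, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def expected_free_energy(q_states, likelihood, preferences):
    """Risk plus ambiguity for one policy.

    q_states[s]      : predicted probability of hidden state s under the policy
    likelihood[s][o] : probability of outcome o given state s
    preferences[o]   : prior preference (as a probability) over outcome o
    """
    n_states, n_outcomes = len(q_states), len(preferences)
    q_outcomes = [
        sum(q_states[s] * likelihood[s][o] for s in range(n_states))
        for o in range(n_outcomes)
    ]
    risk = kl_divergence(q_outcomes, preferences)
    ambiguity = sum(q_states[s] * entropy(likelihood[s]) for s in range(n_states))
    return risk + ambiguity


def policy_posterior(efe_per_policy, gamma=1.0):
    """Softmax over negative expected free-energy, scaled by precision gamma."""
    weights = [math.exp(-gamma * g) for g in efe_per_policy]
    total = sum(weights)
    return [w / total for w in weights]


# When every policy has the same expected free-energy, the posterior is flat:
# the agent has maximal uncertainty about 'what to do next'.
flat = policy_posterior([1.0, 1.0, 1.0])

# When one policy clearly reduces expected free-energy, it dominates selection.
peaked = policy_posterior([0.5, 2.0, 2.0])
```

In this toy setting, the entropy of the flat posterior equals log 3, the maximum possible for three policies, which corresponds to the loss of confidence described above.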
Uncontrollability, or the unavailability of an adequate response to the demands of a situation, is regarded as a necessary condition for the perception of an event as aversive (Averill, 1973). Under active inference, stress occurs when the system is surprised about its sensory data and is therefore unsure about "what to do to safeguard its physical, mental or social wellbeing" (Peters et al., 2017, p. 184). Stress is conceptualized as a response to predictive uncertainty, where the agent is unsure how to return to a state compatible with its homeostatic drives. Stress alerts the system to the need to radically reallocate energy resources, through belief updating and subsequent action, in order to achieve the desired goals.
Controllability can then be understood on a continuum, ranging from the highest levels of the cortical hierarchy (i.e., goal-directed behavior) to lower levels of allostatic and homeostatic machinery. In the following section, we will unpack the process by which perceived uncontrollability outstrips the unavailability of overt action and extends to the inability of the neuroendocrine system to reduce expected uncertainty. We will detail how this cascade of mechanistic failures to restore homeostasis will sink the individual into a depressive state.

Stress and learning what to do next
In active inference models, an acute stress reaction is evinced when there are no policies available to realize preferred (goal) states, or similarly when the individual fails to attenuate the precision of those goal states (i.e., 'It is okay to get lower grades on this exam').
As goal-motivated behavior fails to bring about desired consequences, control is delegated to a series of autonomic reactions that ensure the realization of allostatic drives. In hierarchical (viz. heterarchical) models of active inference, uncontrollability is understood as an inability of higher levels to contextualize lower levels, indicating that available policies are unlikely to reduce expected free-energy (Pezzulo et al., 2015).
The acute stress response is orchestrated by higher levels of the heterarchy to induce changes that guarantee efficient allocation of energetic resources (Peters et al., 2004). Cortical visceromotor areas, such as the cingulate cortex, send projections to lower levels to control the reflex arcs of the autonomic, endocrine, and immunological systems (Stephan et al., 2016). When 'available actions' and 'no action' policies are expected to result in the same outcomes, the sympathetic nervous system and the HPA axis are activated to provide additional energy to the brain (Hitze et al., 2010; Peters et al., 2017). The autonomic nervous and endocrine systems oversee the energy supply during stressful situations (Rotenberg and McGrath, 2016). They can induce hyper-alert states, where energy resources are allocated for the collection of new information, so better predictions can be made, which will hopefully make acting at least marginally better than doing nothing. Here, we tap into the adaptive value of stress, whereby three crucial processes for the control of predictive uncertainty are modulated: attention, learning, and habituation, which will determine the course of allostatic changes (Peters et al., 2004).
An integrative hub of goal-directed behavior is the anterior cingulate cortex (ACC), which is thought to monitor the degree of uncertainty or the difference between 'goal states' and 'attainable states' (Barrett and Simmons, 2015). The ACC outputs connections that influence visceral and autonomic functions by issuing signals (thought to carry predictions) to amygdala, midbrain, and brainstem (Barrett and Simmons, 2015; Rolls, 2019). The ACC is considered to have a role in the regulation of predictive uncertainty, as it tracks action-reward contingencies to guide action-outcome learning (for a review, see Rolls et al., 2019). The sympathetic system facilitates the classical 'fight or flight' response. The ACC-amygdala complex has descending connections to the locus coeruleus, a brainstem region that secretes the neurotransmitter norepinephrine (NE) to prepare the system for action (i.e., increase arousal, alertness, and attention/vigilance) (Berridge and Waterhouse, 2003; Jedema and Grace, 2004). It is thought to be in charge of the transition from an energy-efficient mode to an energetically expensive mode that equates to full information processing capacity (Peters et al., 2017). It has been directly linked to 'waking' states, independently of the affective valence of the stimulus. Indeed, this catecholamine is thought to increase precision of prediction errors, facilitating model updates about environmental changes and flexible learning (Parr and Friston, 2017; Sadacca et al., 2017). In addition, the activation of the noradrenergic system by stress-induced hormones facilitates the consolidation of declarative memories during emotional events (Ferry, Roozendaal, and McGaugh, 1999; Hu et al., 2007).
For its part, the HPA axis is a hormonal system, which is activated by descending predictions to the ventromedial hypothalamus (Hitze et al., 2010; Peters et al., 2017). The neuroendocrine response begins with the release of chemical messengers that act in a non-linear manner to induce adaptive changes in tissues and organs by cellular activity (i.e., acting on ion channels) (McEwen and Wingfield, 2003). Some of these primary mediators include hormones of the HPA axis, such as catecholamines (i.e., dopamine, NE) that are secreted by the adrenal gland (McEwen and Wingfield, 2003) (see Fig. 2). Ultimately, HPA axis activation results in glucocorticoid secretion. Glucocorticoids (e.g., cortisol) have been a primary focus in stress research as they are seen as the main physiological correlate of stress. Apart from their role in inducing 'fight or flight' responses, they also pass through the blood-brain barrier, activating adrenal steroid receptors MRs (mineralocorticoid receptors) and GRs (glucocorticoid receptors) in the brain to modulate synaptic plasticity, and therefore learning (Peters et al., 2017; Rotenberg and McGrath, 2016).
These two types of receptors are present throughout the brain, but areas such as the hippocampus and the amygdala, which are both key regions for memory encoding, have the most numerous populations (Finsterwald and Alberini, 2014;Smith and Vale, 2006). The effects of glucocorticoids are still debated, as data suggests they have disparate repercussions in long-term potentiation and long-term depression, especially in the hippocampus (Korte et al., 2005;Maggio and Segal, 2009;McEwen et al., 1986). MR receptors tend to be more occupied at basal levels of cortisol, while GR receptors are generally occupied when stress levels are higher (for a review, see Sapolsky et al., 2000).
Intermediate activation of GR receptors is necessary for memory consolidation, but saturation of GRs results in memory impairments (Finsterwald and Alberini, 2014; Peters et al., 2017; Sorrells et al., 2009). As stress levels rise, glucocorticoids decrease hippocampal excitability, impair hippocampal-dependent learning, inhibit glucose uptake, and induce pyramidal cell loss (McEwen and Sapolsky, 1995). In sum, a relatively small amount of glucocorticoid facilitates learning; however, suboptimal concentrations have the reverse effect. This leads us to the question: When is it worth learning? If higher-order structures try to regain control by promoting model updates, when does learning stop being successful?
Fig. 2. The hypothalamic-pituitary-adrenal axis (HPA axis). The HPA axis is a major neuroendocrine system in charge of controlling stress reactions and the regulation of several autonomic processes such as digestion and the immune system. The hypothalamus contains neuroendocrine neurons that secrete corticotropin-releasing hormone (CRH), which in turn stimulates the secretion of adrenocorticotropic hormone (ACTH) by the pituitary gland. ACTH modulates the production of glucocorticoids, mainly cortisol (CORT) in the adrenal cortex, which act back on the hypothalamus and pituitary by negative feedback mechanisms (from Pariante and Lightman, 2008).

Thwarted allostasis: maladaptive neuroendocrine profiles
Perceived uncontrollability halts behavioral controllers oriented towards overt action, at which point goal-directed behavior is relegated in favor of simpler controllers, such as autonomic reflexes. If the inability to reduce uncertainty is extended over time, the system falls into a state of allostatic load (Peters et al., 2017). The term allostatic load was coined by McEwen and Stellar (1993) to refer to the fact that some protective changes initiated by allostatic mechanisms can be very costly for the organism when overused or maintained over extended periods of time. Allostatic load captures the cumulative physiological effects of allostatic responses to a stressor.
Allostatic mediators are secreted at first to induce adaptive short-term adjustments (i.e., larger weight on afferent information). However, chronic production can lead to imbalances of these primary mediators, which are referred to as allostatic states (e.g., glucocorticoid imbalances) (Koob and Le Moal, 2001; McEwen, 2003). An allostatic state refers to the changes in allostatic setpoints induced by chronic deviations from the 'normal' state, which alter the previously expected setpoint (Koob and Le Moal, 2001). Allostatic states are thus learned in response to chronic environmental stress. The glucocorticoid cortisol has been identified as one of the four main primary mediators of this effect (Seeman et al., 1997). High glucocorticoid concentrations are believed to index states of high uncertainty, where the model is not fit to make adequate predictions; therefore, associations are not consolidated during times of uncertainty, when the system is stressed, as they do not represent reliable contingencies (Peters et al., 2017). For instance, glucocorticoid secretion during acute stress has been directly linked to cessation of dopamine production in limbic areas necessary for learning action-outcome contingencies (Butts et al., 2011).
Repeated engagement of stress reactions can result in maladaptive neuroendocrine profiles, for example, by hampering learning via dopamine inhibition. These stressed models of the world can have four types of deleterious consequences: (1) failure to predict the need for an adaptive response; (2) predicting the need, but not implementing the necessary response; (3) employing a standard response when it is no longer required or adaptive; (4) lack of habituation to a recurrent stressor (McEwen, 2006; McEwen and Gianaros, 2011). The influence of glucocorticoids and neurotransmitter alterations on cortical plasticity shapes brain-based priors such that they become unable to adaptively anticipate rewards in relation to the environment (Der-Avakian and Markou, 2012).
Habituation to a chronic stressor can be impaired by exposure to stressors of high intensity, chronic unpredictable stress, or repeated social stress (Herman et al., 2011). In these cases, the stimuli are strong enough to interfere with feedback-inhibition mechanisms, and thus enable the release of stress related hormones. Facilitation of glucocorticoid responses provides an adaptive mechanism that prevents the system from habituating to potentially threatening stimuli. In this way, chronic stress profiles can result in increased basal glucocorticoid secretion together with extended stress response activation. Biologically, this is associated with loss of glucocorticoid receptors (GR) that interferes with the inhibitory influences over the HPA axis activity (Boyle et al., 2005;Sapolsky et al., 1984).
Consequently, this HPA activation pattern facilitates and prompts stress reactions, making individuals more susceptible to stress (Burke et al., 2005; Danese and McEwen, 2012; Hasler et al., 2004; Herman et al., 2011). Failure to habituate to chronic stress produces allostatic changes (i.e., lower GR expression, which in turn causes sustained HPA activity) that cause a shift from motivated control to habitual control. When stress is chronic and repeated, the statistical regularity that is learned is that 'it is better to have an automatic response'. The energizing aspect of motivated control (i.e., HPA activity) becomes established as a habitual reaction.
In the active inference framework, the establishment of HPA activation as a habitual controller corresponds to a change in the precision at higher levels of the hierarchy (i.e., goal-directed behaviors), which is downregulated relative to lower levels of the motivated control hierarchy (i.e., HPA activity, arousal/vigilance). That is: "placing a high precision on sensory prediction errors produces habitization (i.e., the shift from goal-directed to habitual control), where habits directly activate reflexes and preclude (unnecessary) inference at higher hierarchical levels" (Pezzulo et al., 2015, p. 25).
When we look at the allostatic changes that follow repeated stress, it becomes apparent that neuroendocrine alterations can be understood as disturbances in motivated behavior. The effects of glucocorticoids on cortical plasticity can be seen as a correlate of precision adjustments in the motivated control hierarchy. It is known that the exhaustion of GRs affects limbic and frontal function. For instance, repeated glucocorticoid secretion enhances connections of the ventral hippocampus with the hypothalamus and amygdala, an effect that has been associated with the consolidation of emotional memories and directly linked to fear potentiation (Bannerman et al., 2003; Maggio and Segal, 2009; Korte et al., 2005; Korte, 2001). On the other hand, the connections between the dorsal hippocampus and neocortex are functionally associated with cognitive functions, such as working memory and spatial maps (Bannerman et al., 2003). Acute stress, via glucocorticoid production, impairs long-term potentiation between these areas, resulting in less executive control over autonomic reactivity (Maggio and Segal, 2009).
Notably, the inhibitory influence of the hippocampus on the HPA axis is key to regulating stress responses, and alterations in this area cause glucocorticoid hypersecretion in animals (Jacobson and Sapolsky, 1991; Smith and Vale, 2006). We observe that higher-order areas lose their ability to downregulate stress reactions (or, similarly, become unable to cancel prediction errors). Note that the hippocampus has been identified as a key hub in the regulation of goal-motivated behavior due to its role in episodic control and in memory consolidation (Pezzulo et al., 2015, p. 29).
This suggests that stressful circumstances shape beliefs about learned helplessness and inefficacy: agents become confident about the uncontrollability of the environment (i.e., glucocorticoid effects on fear potentiation), while inhibiting the consolidation of generative models that facilitate goal-directed behavior (i.e., lack of positive reinforcement via dopamine inhibition). This is key, because the adaptive value of stress lies in its goal-directed effect towards the obstacles in the environment (i.e., increased computational resources deployed to learning). Nevertheless, when habitual stress responses overrule goal-directed actions, autonomic reflexes become inflexible (insensitive to action-outcome contingencies), and energy resources are overused by continuous limbic activation (Le Heron et al., 2019). We now enter a second phase of allostatic load: the re-regulation of energy resources by the immune system.

From allostatic load to allostatic overload
Over time, other systems compensate for the imbalance of primary mediators, leading to a series of metabolic, inflammatory, and cardiovascular biomarkers, known as secondary outcomes (i.e., inflammatory markers such as cytokines) (Juster et al., 2010). Disturbances in the immune system are especially relevant in allostatic load, as they might mark the transition to allostatic overload.
Adaptive responses to stress, such as the classic 'fight or flight' response orchestrated by the neuroendocrine system, are also accompanied by changes in the immune system that prevent potential infections and promote wound repair. There are two types of immune reactions: natural and specific; both are necessary to restore homeostasis at different timescales (Coutinho and Chapman, 2011). Natural immune responses coordinate host inflammatory cascades that act as a primary defense mechanism against infection and tissue damage (for an active inference model of the immune system and its relation to psychiatric conditions, see Bhat et al., 2021). These processes are therefore not specific and are associated with a congregation of cells that promote inflammation and fever. On the other hand, specific immune responses confer the ability to learn and develop adaptive responses over time, such as specific antigen responses (Coutinho and Chapman, 2011;Cruz-Topete and Cidlowski, 2015). Thus, during stressful events, specific immunity relies on host natural immune reactions to contain infections, while it learns how to deal with various pathogens.
Cytokines are among the main primary mediators of the immune system. They are the proteins that mediate the balance between humoral and cell-based immunity, and they are produced by two different types of T helper cells. Th1 cells are in charge of specific immunity, also known as cell-based, and they trigger the production of other proinflammatory cytokines (Chen, 2007). On the other hand, Th2 cells elicit humoral reactions and stimulate the production of anti-inflammatory cytokines to create inhospitable environments that respond in a general manner to any pathogen (Finkelman et al., 1997). Similar to the MR-GR-receptor mechanism described above, whereby glucocorticoids exert their influence in the body, a good balance between these two types of cytokines is necessary for optimal immune activity.
There is continuous crosstalk between the neuroendocrine and immune systems through a number of hormonal and neuropeptide mediators that allow the two systems to interact, and consequently allow the immune system to be activated in response to a stressor. One of the key axes for these exchanges is the HPA axis, which secretes adrenal hormones such as epinephrine, norepinephrine, and cortisol that bind to various receptors on white blood cells to influence immune function (Ader et al., 2001). Importantly, most glucocorticoid-mediated actions on the immune system are associated with the transcriptional effects of GR binding, which affects T cell production (Ashwell et al., 2000; McEwen et al., 1997). Therefore, alterations in the immune system emerge to a large extent as a function of the allostatic changes that take place in the neuroendocrine system, such as the loss of GR expression.
Acute time-limited stressors are associated with an enhancement of the immune response, while chronic stress has immunosuppressive effects (Coutinho and Chapman, 2011). For instance, the secretion of catecholamines by the medulla of the adrenal gland has been directly linked to the activation of pro-inflammatory responses early during the acute stress response (Flierl et al., 2008). Stress-mediated increases in CORT secretion facilitate immune mobilization to injured areas. Natural immune reactions are more rapid and less energy consuming, and thus they are better suited to respond to the demands of the impending stressors. As stress responses prepare cardiovascular and endocrine systems for 'fight or flight' responses, the immune system should also be readying to provide protection following wounding (Dhabhar, 2014). On the other hand, glucocorticoids can also exert anti-inflammatory effects following initial immune activation during stress to restore homeostasis (Munck et al., 1984; Cruz-Topete and Cidlowski, 2015), and optimal glucocorticoid levels are necessary to prevent excessive immune activation (Sorrells et al., 2009). This led some theorists to posit that CORT secretion downregulates immune activation during early stress (Munck et al., 1984). The immunosuppressive and anti-inflammatory properties of glucocorticoids have been exploited for therapeutic use in immune diseases such as asthma, allergies, and autoimmune diseases, among others (Vandewalle et al., 2018).
Nevertheless, glucocorticoids can have adverse effects for the immune system (Dhabhar, 2009;McEwen et al., 1997). Chronic cortisol secretion is thought to suppress Th1 cytokines that regulate cellular immunity while activating Th2 cytokines that promote humoral reactions (Hou et al., 2013;Seidl et al., 2011). Notably, the negative feedback loop that is formed between the HPA axis and the immune system is compromised. Reduced GR receptors might underlie the switch in the effects of glucocorticoids for immune activation, as occupancy of GR receptors is necessary to inhibit proinflammatory factors (Danese et al., 2007;Sterling and Eyer, 1988).
The aggregation of multiple physiological dysregulations and their secondary outcomes ultimately results in allostatic overload, leading to clinical manifestations (a.k.a. tertiary outcomes, such as depression) (McEwen and Stellar, 1993; Juster et al., 2010). The severity and extent of allostatic load is measured by assessing primary mediators and biomarkers, which are used to predict the vulnerability for tertiary outcomes such as depression (McEwen, 2000a, 2000b; McEwen and Seeman, 1999). Therefore, chronic stress can result in imbalances of pro-inflammatory responses that contribute to immunopathology and other pathologies like depression (Dantzer et al., 2008; Dhabhar, 2014).
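To make the idea of measuring allostatic load from biomarkers concrete, the following is a minimal sketch of a count-based index, in which each biomarker contributes one point when it reaches a high-risk cutoff. This is only one possible scoring convention and is not proposed here as the method used in the cited studies. The function name, biomarker names, cutoffs, and values are all hypothetical and chosen purely for illustration.

```python
def allostatic_load_index(subject, high_risk_cutoffs):
    """Count the biomarkers whose value meets or exceeds its high-risk cutoff.

    Simplifying assumption: higher values always mean higher risk. Biomarkers
    for which low values are risky would need an inverted comparison.
    Biomarkers missing from `subject` are not counted.
    """
    return sum(
        1
        for name, cutoff in high_risk_cutoffs.items()
        if name in subject and subject[name] >= cutoff
    )


# Hypothetical cutoffs and measurements (arbitrary units), illustration only.
cutoffs = {"cortisol": 20.0, "crp": 3.0, "il6": 2.5, "systolic_bp": 140.0}
subject = {"cortisol": 24.0, "crp": 3.4, "il6": 1.1, "systolic_bp": 128.0}

score = allostatic_load_index(subject, cutoffs)
print(score)
```

In this toy case two of the four markers (cortisol and CRP) exceed their cutoffs. A count of this kind treats primary mediators and secondary outcomes alike; a weighted or level-specific score would be needed to distinguish neuroendocrine from immune contributions.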
In sum, we argue that stress is monitored by three distinct systems that work on different timescales: neuromodulatory, neuroendocrine and immune. The presence of external uncontrollability puts goal-directed behavior on hold and delegates its activity to the neuroendocrine system. The HPA system responds by triggering a stress reaction that is meant to inhibit itself when the stressor is not sustained over time. This negative feedback mechanism depends on the availability/expression of GR receptors. Chronic stress leads to reduced expression of GR receptors, resulting in sustained glucocorticoid reactions. We refer to this stage as "neuroendocrine regime dominance", a state characterised by high levels of circulating cortisol.
In parallel, there is continuous cross-talk between the endocrine and immune systems. There is bidirectional communication between the two systems whereby cortisol inhibits the production of immune cells, while cytokines stimulate glucocorticoid release. A burst of pro-inflammatory cytokines associated with acute stress aims to stimulate the negative feedback loop that downregulates HPA axis activity via cortisol. Nevertheless, for these circuits to function correctly, they need to be in balance. As GR receptor expression decreases, inhibitory effects over the immune system are reversed (via glucocorticoid transcriptional effects), resulting in a period of hypercortisolism and inflammation. These two markers speak to a transition from neuroendocrine dominance to immune dominance. At some point the adrenal glands stop secreting cortisol as they reach their production threshold, and the system is left in a state of hypocortisolism and inflammation.
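The dependence of the negative feedback loop on GR expression can be illustrated with a toy simulation. This is a minimal sketch, not a physiological model: the functional form of the feedback, the clearance rate, and all parameter values are assumptions introduced here, and the function name `simulate_hpa` is hypothetical. The sketch only shows that, for the same stressor, lower GR expression weakens self-inhibition and leaves cortisol at a higher sustained level.

```python
def simulate_hpa(stress, gr_expression, steps=200, dt=0.1):
    """Toy Euler simulation of cortisol under GR-mediated negative feedback.

    All quantities are in arbitrary units; parameters are illustrative
    assumptions, not fitted values.
    """
    cortisol = 0.0
    trace = []
    for _ in range(steps):
        # The stressor drives secretion; the drive is damped by a feedback
        # term that grows with both GR expression and circulating cortisol.
        drive = stress / (1.0 + gr_expression * cortisol)
        # Cortisol is cleared at an assumed constant rate of 1.0.
        cortisol += dt * (drive - cortisol)
        trace.append(cortisol)
    return trace


# Same sustained stressor, intact versus reduced GR expression.
intact = simulate_hpa(stress=2.0, gr_expression=1.0)
reduced = simulate_hpa(stress=2.0, gr_expression=0.1)
print(round(intact[-1], 2), round(reduced[-1], 2))
```

The sketch covers only the stage described above as neuroendocrine dominance. It does not include cytokine signalling, nor the later stage in which the adrenal glands reach their production threshold and cortisol falls.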

Allostatic pathways to depression
We have shown how the interaction between the neuroendocrine and immunological systems acts as an interface that coordinates the allocation of energy resources in the short and long term, respectively. As the activation of the HPA response mobilizes energy to the system with the intent to reduce expected free-energy (i.e., cortisol, NE), inflammatory reactions reflect the enactment of long-term immunological policies that evince the agent's learning about the inefficiency of short-term adjustments. HPA over-reactivity and its markers act as hidden states for the immunological system to infer the inability of neuroendocrine mechanisms to reduce uncertainty. The immunological system tries to compensate for the imbalances of primary mediators by triggering inflammatory reactions (i.e., indexed by cytokine levels) that result from glucocorticoid over-reactivity (Juster et al., 2010; McEwen and Gianaros, 2011). Notably, inflammatory processes affect motivation and motor circuits that can result in anhedonia, fatigue, and psychomotor impairment, as well as increased threat sensitivity (hyper-vigilance) (Morris and Cuthbert, 2012). Inflammatory processes are immunological reactions oriented to the preservation of energy, clinically manifesting as 'sickness behaviors' (Raison and Miller, 2013). The secretion of humoral cytokines acts as an immunological hyperprior over the neuroendocrine system to compensate for its over-reactivity (i.e., HPA as a habitual controller).
Life-stress is largely recognized as a predisposing and precipitating factor for depression (Caspi et al., 2003;Kendler and Karkowski-Shuman, 1997;McEwen, 2003). The HPA axis is a major system in charge of controlling stress reactions, and its sustained activity results in allostatic load and allostatic overload (McEwen, 2003). Importantly, depression has been conceived as the tertiary outcome, or final stage of allostatic load (Juster et al., 2010;McEwen, 2003). There are several allostatic states in MDD that have been reported as a consequence of alterations in the HPA axis, while increased biomarkers of allostatic load correlate positively with the severity of depressive symptoms (Juster et al., 2011;Kobrosly et al., 2014;McEwen, 2003).
Nevertheless, depression has many faces, and the allostatic markers associated with the same disorder vary widely across individuals. Therefore, individuals who are diagnosed with depression might find themselves at different stages of the allostatic transition. Although it is well known that stressful experiences underlie the emergence of depression, it remains unclear by which mechanisms these events result in depression. Here we draw some hypotheses regarding the type and the timing of the stressor and the consequences it has for the allostatic machinery.

Etiology of the stressor
To understand the development of depression, we need to explore the etiology and the conditions that surround the presence of the stressor. Learned helplessness and mild chronic stress models are well-established animal models of depression (Vollmayr and Henn, 2003). Both models use uncontrollable stressors to trigger allostatic alterations observed in depression, such as weight gain, excess sleep, and locomotor disturbances, as well as anhedonia, a cardinal symptom of depressive symptomatology. In these experiments the condition of uncontrollability may lead to an overgeneralization of perceived uncontrollability to other settings, or foster the belief that all actions are equal (i.e., glucocorticoids that increase limbic connectivity, and decrease prefrontal control) by affecting associative-learning processes. Clinically, this has been associated with anhedonia (Miller and Seligman, 1975; Beck, 1979).
Early life adversity is a robust predictor of depression and the main risk factor in its development (Heim et al., 2008; Hostinar et al., 2018). Parental maltreatment and low socioeconomic status during childhood are considered severe and chronic stressors that might trigger depression (Hostinar et al., 2018). In particular, childhood trauma and the subsequent development of depression are related to HPA axis dysregulation evidenced by glucocorticoid resistance, increased corticotropin-releasing hormone (CRH) activity, immune activation, and reduced hippocampal volume (Danese and McEwen, 2012; Heim et al., 2008). Note that childhood maltreatment is associated with an increase in inflammation biomarkers, and concurrent MDD accentuates these inflammatory indicators in individuals with a history of childhood abuse (Danese and McEwen, 2012).
Nevertheless, chronic environmental stressors are also important risk factors for the development of allostatic load markers, and consequently depression. For instance, individuals with lower socioeconomic status experience higher chronic stress (Steptoe et al., 2003) and lower perceived control at work (Warren et al., 2004), which correlate positively with allostatic load (Szanton et al., 2005). Other social inequalities, such as discrimination based on race and ethnicity, are associated with more severe allostatic changes (Geronimus et al., 2006), implying that discrimination is a major source of stress amongst individuals from minority communities. For instance, homosexual and bisexual women show increased cortisol reactivity compared to heterosexual females (Juster et al., 2015). Curiously, disclosure to family and friends has been shown to be a protective factor against psychopathology and cortisol hyperactivity (Juster et al., 2019).

Conditions of perceived uncontrollability
It is therefore reasonable to argue, as we shall presently, that perceived uncontrollability will depend upon the developmental phase at which the stressor is experienced, as well as the environmental conditions that surround it, and consequently it will have different repercussions for the individual.
Early life social stress, especially emotional or physical neglect, is a major predisposing factor to MDD (Heim et al., 2008; Kendler and Karkowski-Shuman, 1997; McEwen, 2003), and remains the best predictor of depression (Watson et al., 2014). In animals, early-life stress, generally studied through maternal deprivation, is associated with lower GR expression causing longer and sustained stress reactions (Ladd et al., 2004). Two factors are key for mapping the conditions of perceived uncontrollability: (1) the availability of precise goal-motivated policies; and (2) the confidence in, or relevance of, homeostatic goals. During infancy and early ontogenetic periods living organisms have minimal prior experience of the world. This translates into highly imprecise priors that are waiting to be informed, and few but highly precise homeostatic drives (i.e., maternal care) (Fotopoulou and Tsakiris, 2017).
Furthermore, as we have seen, changes in allostatic states can have non-linear repercussions in other processes, such as neurotransmitter metabolism, neuroendocrine function, synaptic plasticity, and circuits involved in the regulation of mood and motor activity (Capuron and Miller, 2011). The maturation of the neuroendocrine system involves the acquisition and learning of the contingencies in the environment. Hence, early adversities will shape imprecise beliefs that rely on the activation of the stress machinery, as precise priors (i.e., about the consequences of action) can only emerge in stable, predictable environments (Clark et al., 2018). Furthermore, glucocorticoid imbalances have been shown to directly affect dopamine neurotransmission, especially during childhood and adolescence, since brain maturation will be dependent upon those experiences (Sinclair et al., 2014). Here, we argue that depressive episodes rooted in early life experiences would correspond with more severe and advanced stages of allostatic load.
There are other stressors that might appear later in life, which by an interaction with individual vulnerabilities might result in depression. The conditions for perceived uncontrollability might take a different nuance since there is a wider range of goal-directed actions that are available, and the homeostatic goals that are endangered are frequently more culturally shaped. Depression has been called a disease of modernity due to the increased prevalence in countries that display greater 'modernity' markers, such as higher GDP per capita (Hidaka, 2012).
We argue that environments that encourage individual achievements, e.g., in economic terms, will result in more complex motivated action systems that are difficult to keep up with. The creation of stimulus-stimulus associations to achieve homeostatic goals (i.e., social approval) conforms to more complex maps of action-outcome contingencies where uncertainty is more difficult to reduce. For instance, data show that social support is a modulating factor in the development of depression for groups that suffer racial discrimination (Noh and Kaspar, 2003). Although a full model of this hypothesis is beyond the scope of this paper, we speculate that this type of stress will establish a continuous crosstalk between goal-motivated systems and the HPA axis that will index states of anxiety, but the allostatic changes will be less severe (i.e., less involvement of the immunological system).

Atypical versus melancholic depression
In recent years, efforts have been directed towards the construction of reliable differential diagnoses that would help explain the heterogeneity of depressive symptomatology. Even if the findings are still debated, there are two main subtypes in DSM-5 with opposing allostatic profiles: atypical and melancholic depression.
Atypical depression is characterized by a hypoactive HPA system, whereas melancholic depression is associated with a hyperactive HPA system (Lamers et al., 2013;Stetler and Miller, 2011). Melancholics have significantly higher levels of cortisol compared to atypicals, whereas atypical depression represents a pattern of relative hypocortisolemia compared to melancholic depression (Lamers et al., 2013). On the other hand, atypical patients show increased inflammation indexed by high levels of proinflammatory cytokines, and increased body-circumference (Lamers et al., 2013;Lee and Kim, 2015). These biological mechanisms result in reversed vegetative symptoms such as hypersomnia and weight gain, while melancholic depression is characterized by loss of appetite and sleep (American Psychiatric Association, 2013).
Although the differential etiology of both subtypes has not been clearly elucidated, some studies suggest that atypical depression has generally an earlier onset and has a more chronic course (Stewart et al., 1993). We speculate that early life adversity would condemn individuals to highly imprecise priors about the consequences of neuroendocrine activation that calls for the immune system to take over (i.e., immunological hyperprior). The predominance of inflammatory biomarkers in atypical depression has been linked to HPA hypoactivity, since reduced cortisol secretion disinhibits immune function (Gold and Chrousos, 2013). Experimental studies, for instance, have shown that individuals who suffered early-life adversity display stronger increases of proinflammatory cytokines in response to stress (Pace et al., 2006).
The presence of such a strong oversight by the immune system speaks to the believed inability of more immediate mechanisms to reduce uncertainty driven by HPA activation. We argue that the different types of HPA activation in atypical compared with melancholic depression represent different degrees of confidence in the availability of neuroendocrine policies, partially due to the developmental stage at which the stressors were experienced. Atypical depression would represent a system that has given up due to unsuccessful implementation of HPA machinery in the control of uncertainty as a child. Energy control is relegated to an immune system that would ensure the protection of the individual in the long run with sickness behaviors that force the individual to withdraw. On the other hand, a hyperactive HPA system in melancholic patients suggests some confidence in its ability to restore homeostasis by increasing energy supply. Increased HPA activation via CRH secretion has been linked to hyperarousal and anxiety in animals, and it triggers melancholic symptoms such as loss of appetite (Sutton et al., 1982). The difference in the allocation of energy resources speaks of two different stages of allostatic load that correspond to varying degrees in the confidence placed on neuroendocrine policies for the resolution of uncertainty.

Network connectivity in depression and allostatic overload
Overall, studies suggest that atypical depression has more severe correlates of allostatic (over)load, since allostatic mechanisms have been overused at earlier stages and with higher frequency than those with melancholic features. The following section attempts to track brain network variability that could account for the different stages of allostatic load.
There are two main intrinsic networks involved in allostasis, interoception, and consequently stress: the default mode network (DMN) and the salience network (SN), which comprise most of the limbic cortices (Barrett, 2017; Kleckner et al., 2017). The activity and connectivity of these networks is central to psychological functions that support survival (i.e., perception, emotion, action) (Kleckner et al., 2017). Notably, depression is characterized by abnormal functional connectivity in these networks (Mulders et al., 2015; Pannekoek et al., 2014). Briefly, the DMN is a pivotal system in self-referential processes, as it is activated during resting states when individuals are engaged in processes such as autobiographical memory retrieval and prospective planning (Sheline et al., 2009). Relevant for our discussion, the medial prefrontal cortex (mPFC) is one of the core nodes of the anterior DMN, while the hippocampus and subgenual anterior cingulate cortex (sgACC) have important functional associations to the DMN (Mulders et al., 2015).
On the other hand, the SN is thought to modulate attentional shifts towards relevant information (Barrett, 2017). It comprises the anterior cingulate cortex (ACC), anterior insular cortex (AIC), amygdala and other subcortical structures (Menon, 2015;Seeley et al., 2007). It plays an important role in the detection of salient and novel stimuli by controlling the precision associated with prediction errors (Feldman and Friston, 2010).
Some of these hubs comprise the so-called visceromotor areas (VMAs), which encompass the AIC, ACC, sgACC, and the orbitofrontal cortex (OFC) (Barrett and Simmons, 2015). They are situated at the top of an interoceptive hierarchy, and they are thought to encode the viscerosensory generative model that is necessary to maintain allostasis (Barrett and Simmons, 2015; Seth, 2013; Seth and Friston, 2016; Stephan et al., 2016). They are highly connected areas that exchange information with effector regions that control homeostatic reflex arcs such as midbrain, brainstem, and spinal cord nuclei (partially relayed by the amygdala), periaqueductal gray (PAG), and basal ganglia, to coordinate autonomic, immune and endocrine systems (Craig, 2003; Stephan et al., 2016; Barrett and Simmons, 2015; Barrett, 2017).
Acute stress is associated with increased connectivity in the SN, prompting attention and vigilance at the cost of executive control, a third well-defined network that is involved in cognitive processing (Hermans et al., 2014;Bressler and Menon, 2010). The release of catecholamines (i.e., NE, DA) marks the activation of the salience network relegating executive control that interacts with the more rapid effects of corticosteroids (Hermans et al., 2014;Clark et al., 2018). However, the slower effects of corticosteroids, or genomic action, favor the activation of dorsolateral prefrontal cortex (DLPFC, an important node of the executive network) at the cost of amygdala activation to restore 'network homeostasis' (Henckens et al., 2011). Nevertheless, the wear and tear of stress (i.e., GR loss) prevents corticosteroids from exerting their inhibitory influences on HPA axis, and consequently SN activation, resulting in imbalances between the networks.
Depressed individuals show increased responses in the SN to negative stimuli. Specifically, aberrant connections in the right anterior insula have been linked to emotional reactivity in depression (Manoliu et al., 2014). Notably, impairments in the SN dominate alterations in reward and motivation processes of depressed subjects due to its direct connection with the reward system (Beck, 1967; Disner et al., 2011). For instance, decreased SN intrinsic connectivity characterizes depressed individuals who score high on apathy (Yuen et al., 2014). Sustained hyperactivation of the amygdala and hypoactivation of the DLPFC are thought to underlie the increased salience of negative stimuli in MDD, which in turn decreases the salience of rewarding stimuli (Menon, 2015; Disner et al., 2011). These plastic changes would explain anhedonic symptomatology that often precedes depressive episodes and characterizes melancholic depression (Dryman and Eaton, 1991).
Furthermore, the alterations in goal-motivated behavior observed in depressed individuals can be traced to the difficulties of executive areas in exerting control over lower-level areas. For instance, rostral ACC activation in depressed patients has been associated with difficulties in the inhibition of negative stimulus processing (Disner et al., 2011; Elliott et al., 2002). A lack of top-down regulatory control mechanisms, reflected in DLPFC and ACC abnormalities, is associated with executive dysfunctions and learning impairments that prevent flexible behavior (Hasler et al., 2004; Ravnkilde et al., 2002). In particular, experimental research demonstrated that depression impairs performance adjustments following negative feedback Pizzagalli, 2007, 2008). The error-related and feedback-related negativities (ERN and FRN) are event-related potentials involved in the processing of errors and negative outcomes, and are thought to be generated by the ACC (Olvet and Hajcak, 2008; Thoma and Bellebaum, 2012). Notably, enhanced FRN and ERN to negative feedback were observed in clinically depressed patients (Mies et al., 2011; Olvet and Hajcak, 2008), which suggests a hypervigilant ACC action monitoring system that is weighting sensory precision over prior expectations. Some have argued that these components are directly involved in empathic responses, and they may explain why depressive individuals report increased affective empathy manifested as enhanced empathic distress (Thoma and Bellebaum, 2012).
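The notion of weighting sensory precision over prior expectations can be made concrete with a minimal sketch of a precision-weighted belief update. We assume, purely for illustration, a single Gaussian prior and a single Gaussian observation, for which the standard Bayesian update applies; the function name, variable names, and numerical values are hypothetical and do not correspond to any fitted model of the ACC.

```python
def precision_weighted_update(prior_mean, prior_precision,
                              observation, sensory_precision):
    """Combine a Gaussian prior with a Gaussian observation.

    The prediction error is scaled by the relative precision of the
    sensory evidence, so higher sensory precision moves the belief
    further towards the observation.
    """
    posterior_precision = prior_precision + sensory_precision
    gain = sensory_precision / posterior_precision
    prediction_error = observation - prior_mean
    posterior_mean = prior_mean + gain * prediction_error
    return posterior_mean, posterior_precision


# The same negative outcome (-1.0) under balanced versus inflated
# sensory precision; the prior expects a neutral outcome (0.0).
balanced, _ = precision_weighted_update(0.0, 1.0, -1.0, 1.0)
hypervigilant, _ = precision_weighted_update(0.0, 1.0, -1.0, 9.0)
print(balanced, hypervigilant)
```

With equal precisions the belief moves halfway towards the negative outcome, whereas with inflated sensory precision it is almost entirely captured by it. On this reading, an enhanced response to negative feedback corresponds to a larger gain on the prediction error.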
In addition, acute stress is also associated with increased connectivity between SN and DMN. Notably, a consistent finding in the literature is the increased connectivity within the DMN, mainly involved in self-referential processing, in adults with depression (Scalabrini et al., 2020; Mulders et al., 2015), and in children with a history of depression (Gaffrey et al., 2012). More specifically, subgenual ACC (sgACC) has been identified as a key hub in depression, and shows aberrant connectivity with amygdala and insula in depressed adolescents (Connolly et al., 2013). Furthermore, the intra-connectivity measures of this region correlate positively with systemic inflammation markers (Marsland et al., 2017).
Recent approaches suggest that these abnormalities stem from internetwork abnormalities and not necessarily from within-network alterations (Scalabrini et al., 2020). We interpret these allostatic changes as an attempt to compensate for the excessive attempts to coordinate adequate responses towards relevant stimuli (SN) by favoring inferences on already acquired information (DMN). In particular, excessive DMN activation and hyperconnectivity with the sgACC have been linked to ruminative patterns of thought in depressed individuals (Berman et al., 2011). Rumination is considered as "a mode of responding to distress that involves repetitively and passively focusing on symptoms of distress, and on the possible causes and consequences of these symptoms" (Nolen-Hoeksema et al., 2008, p. 400). We argue that DMN abnormalities and increased inflammation focus on optimizing probabilistic mappings about the self and the based on past events, avoiding the engagement with the impending events (i.e., external and interoceptive) that are considered uncontrollable.
Prior to this work, active inference accounts of depression as a disorder of inefficient energy regulation had yet to identify and locate the maladaptive allostatic predictions in neural systems. This paper contributes to a better understanding of depressive symptomatology as a result of the allostatic changes that occur at different levels of organization when stress is sustained over time. Research in allostatic load provides the physiological correlates that track the attempts of the allostatic system to regain control over the internal and external environments. The understanding of the neuroendocrine and immunological systems as two mid- and long-term systems in the regulation of energy resources in the face of stressors helps establish different profiles of depression and the severity of the symptoms. With this review we would like to highlight the need for future research into the relationship between neuronal network connectivity and immune responses.

Conclusions and future directions
This paper reviewed and explained the physiological changes that occur in biological systems as a result of stress from the point of view of the active inference framework. We examined the relationship between these changes and the system's loss of confidence in its predictions about its internal and external milieu. We described the etiological pathway from allostatic overload to depressive symptomatology. We identified the neural systems that underwrite goal-directed behavior, and the neuroendocrine and immunological systems, as the hierarchical controller that regulates energy resources. We proposed a model of the pathway to depressive symptomatology where these systems interact at different timescales. We assert that long-term strategies (embodied by the immune system) are deployed as a means to respond to internal uncertainty, which emerges in response to the hyperactivation of short-term control processes (in the neuroendocrine system). We considered some hypotheses regarding how these adaptive strategies mark transitions in the severity of depressive symptomatology. We further explained how depressive symptomatology arises from a series of adaptations that represent an attempt by the nervous system to control external and internal uncertainty. More specifically, we argued that the dominance of immune adaptations is indicative of efforts to control the expenditure of energy resources.
Future research is necessary to unpack much of this framework and to test the active inference account of the allostatic pathway to depression. There are many unanswered questions regarding the etiology of the stressor, which are necessary to draw a mechanistic map of the development of different subtypes of depression. The literature focuses mainly on the relationship between early-life stress and immune alterations. Nevertheless, having more data regarding the etiology of the stressor (timing and nature) can help draw a mechanistic map of the onset and development of depression. In addition, more light should be shed on the prior beliefs (i.e., individual differences) of these three different components to have a better understanding of what leads to the preponderance of one of the systems over the others. Furthermore, this paper postpones a more in-depth analysis of the role that the social environment has in the regulation of energy resources. We believe it is essential to understand the external environment as a constituent part of our bodies. Therefore, relationships, institutions, and socio-economic status, among other things, act as important mechanisms for the reduction of uncertainty. We want to underline that a plausible computational model of depression must account for the external and internal factors that regulate the reduction of uncertainty under first (e.g., free-energy) principles. Therefore, a formal model of depression should accommodate the external environment, which is key to understanding the development of depression.
To foreground the implications of this theoretical synthesis from a clinical perspective, we address three key questions (formulated by our reviewers) to reiterate some key points: 1) What is the pathophysiology of depressive disorders? Our model involves three components that are well known to be dysregulated in depressive disorders (neuromodulatory, endocrine, and immune systems). The theoretical contribution is the proposal that these systems are organised as a heterarchy, where they inform each other at different timescales. This construction allows one to understand how stress is tracked physiologically by three different systems that operate over different timescales, and how these correspond to changes in arousal, emotion, and mood. 2) How can we diagnose depression? While it was not within the scope of this review to develop new diagnostic tools or criteria, we have attempted to map the account of pathophysiological processes to formal model properties on the one hand, and depressive symptomatology on the other. One advantage of this formal, model-based approach is that it may help engender novel predictions about the different profiles of depressive symptomatology that can be expected to manifest under differing disease profiles. This may constitute a step towards the development of personalised, computationally informed approaches to psychiatry. Notably, the proposed model points towards areas where we select a single neurotransmitter (dopamine), hormone (cortisol), or protein (cytokine) that can be used as multimodal markers of allostatic load at each level, and that may offer a diagnostic tool for depressive disorders. 3) How can we treat depression? One might anticipate that this kind of model will generate insights about the computational mechanisms underpinning depression and guide the development of novel biomarkers of these processes.
Taken together, we would expect these theoretical and diagnostic advancements to enable better-designed, more efficacious interventions for the prevention and/or treatment of depression. For instance, one might motivate personalised treatment plans, depending on whether individual profiles of depressive symptomatology derive predominantly from stress-induced HPA activity versus inflammatory activity.