Degeneracy and stability in neural circuits of dopamine and serotonin neuromodulators: A theoretical consideration

Degenerate neural circuits perform the same function despite being structurally different. However, it is unclear whether neural circuits with interacting neuromodulator sources can themselves degenerate while maintaining the same neuromodulatory function. Here, we address this by computationally modeling the neural circuits of neuromodulators serotonin and dopamine, local glutamatergic and GABAergic interneurons, and their possible interactions, under reward/punishment-based conditioning tasks. The neural modeling is constrained by relevant experimental studies of the VTA or DRN system using, e.g., electrophysiology, optogenetics, and voltammetry. We first show that a single parsimonious, sparsely connected neural circuit model can recapitulate several separate experimental findings that indicated diverse, heterogeneous, distributed, and mixed DRNVTA neuronal signaling in reward and punishment tasks. The inability of this model to recapitulate all observed neuronal signaling suggests potentially multiple circuits acting in parallel. Then using computational simulations and dynamical systems analysis, we demonstrate that several different stable circuit architectures can produce the same observed network activity profile, hence demonstrating degeneracy. Due to the extensive D2-mediated connections in the investigated circuits, we simulate the D2 receptor agonist by increasing the connection strengths emanating from the VTA DA neurons. We found that the simulated D2 agonist can distinguish among sub-groups of the degenerate neural circuits based on substantial deviations in specific neural populations’ activities in reward and punishment conditions. This forms a testable model prediction using pharmacological means. Overall, this theoretical work suggests the plausibility of degeneracy within neuromodulator circuitry and has important implications for the stable and robust maintenance of neuromodulatory functions.

Degenerate neural circuits perform the same function despite being structurally different. However, it is unclear whether neural circuits with interacting neuromodulator sources can themselves degenerate while maintaining the same neuromodulatory function. Here, we address this by computationally modeling the neural circuits of neuromodulators serotonin and dopamine, local glutamatergic and GABAergic interneurons, and their possible interactions, under reward/punishmentbased conditioning tasks. The neural modeling is constrained by relevant experimental studies of the VTA or DRN system using, e.g., electrophysiology, optogenetics, and voltammetry. We first show that a single parsimonious, sparsely connected neural circuit model can recapitulate several separate experimental findings that indicated diverse, heterogeneous, distributed, and mixed DRNVTA neuronal signaling in reward and punishment tasks. The inability of this model to recapitulate all observed neuronal signaling suggests potentially multiple circuits acting in parallel. Then using computational simulations and dynamical systems analysis, we demonstrate that several different stable circuit architectures can produce the same observed network activity profile, hence demonstrating degeneracy. Due to the extensive D2-mediated connections in the investigated circuits, we simulate the D2 receptor agonist by increasing the connection strengths emanating from the VTA DA neurons. We found that the simulated D2 agonist can distinguish among sub-groups of the degenerate neural circuits based on substantial deviations in specific neural populations' activities in reward and punishment conditions. This forms a testable model prediction using pharmacological means. Overall, this theoretical work suggests the plausibility of degeneracy within neuromodulator circuitry and has important implications for the stable and robust maintenance of neuromodulatory functions.

Introduction
The nervous system can be modulated by endogenous chemical messengers, called neuromodulators (Levitan, 1987). Neurons or synapses, and hence neural circuits, that succumb to neuromodulation can change circuit configuration and function (Marder, 2012;Marder et al., 2014). It is also known that neural circuits can degenerate, that is, circuits can comprise different elements and/or structure while performing the same function or yielding the same output (Cropper et al., 2016). This can allow robust maintenance of functions and behavior in the face of changes in the underlying structure (Edelman and Gally, 2001;Whitacre, 2010). It should be noted that the degeneracy defined here should be distinguished from other definitions e.g., having different states with the same energy level in quantum physics or a decline in health condition (Griffiths and Schroeter, 2018;Hocking et al., 2019). Although it has been shown that neuromodulators can selectively regulate degenerate neural circuits (Marder et al., 2014;Cropper et al., 2016), it is unclear whether neural circuits with neuromodulator-containing neurons can themselves be degenerate, which could in turn provide stable widespread neuromodulator influences on targeted neural circuits ( Figure 1A).
In this theoretical study, we investigate the plausibility of degenerate and stable neuromodulator circuits by focusing on the neural circuits in the midbrain which are the source of ascending pathways of two highly studied monoaminergic neuromodulators, serotonin (5-hydroxytryptamine; 5-HT) and dopamine (DA). These neuromodulators have major roles in modulating cognition, emotion, and behavior, and are linked to the pathogenesis and pharmacological treatment of many common neuropsychiatric and neurological disorders (Muller and Cunningham, 2020). The majority of 5-HTproducing neurons are found in the dorsal and median raphe nuclei (DRN and MRN), while most DA-producing neurons reside in the ventral tegmental area (VTA) and substantia nigra compacta (SNc) (Muller and Cunningham, 2020).
At the functional level, DA and 5-HT are known to play important roles in reward-and punishment-based learning and decision-making (Hu, 2016). For example, there is strong evidence that the DA neuronal activity signals reward prediction error (difference between predicted and actual reward outcome) to guide reinforcement learning (Doya, 2002). Specifically, DA neurons are phasically excited upon unexpected reward outcomes or rewardpredictive cues and inhibited upon unexpected reward omission or punishment (Schultz et al., 1997;Cohen and Uchida, 2012;Watabe-Uchida et al., 2017), although there is heterogeneity among DA neurons in this regard (Cohen and Uchida, 2012;Cohen et al., 2015).
In comparison to DA neurons, DRN 5-HT neurons exhibit greater complexity in function, with studies reporting that 5-HT neuronal activity encodes both reward and punishment. For instance, it has been shown that certain 5-HT neurons (labeled "Type I") were phasically activated only by reward-predicting cues, but not punishment in a classical conditioning paradigm (Cohen et al., 2015). Yet, in the same study, a different population of 5-HT neurons ("Type II") signaled both expected reward and punishment with sustained elevated activity toward reward outcome (Cohen et al., 2015). The latter study also found that, unlike DA neurons, baseline firing of Type-I 5-HT neurons was generally higher in rewarding than punishment trials, and this effect lasted across many trials, suggesting information processing over a long timescale, possibly used to perform meta-learning via learning rate modulation (Grossman et al., 2022). Similar 5-HT neuronal responses to reward and punishment were reported in other rodents (Liu et al., 2014;Li et al., 2016;Matias et al., 2017;Zhong et al., 2017) and non-human primates (Hayashi et al., 2015) studies. These differential responses of DA and 5-HT neurons to reward and punishment cannot be trivially reconciled with a simple model, e.g., using two opposing neuromodulatory systems as previously proposed (Boureau and Dayan, 2010).
Other studies reveal further complexity in reward/punishment processing, specifically in the form of altered activity of non-5-HT/DA midbrain neurons. For example, DRN neurons utilizing gamma-aminobutyric (GABA) were tonically inhibited during reward waiting with further inhibition during reward acquisition but phasically activated by aversive stimuli (Li et al., 2016). On the contrary, GABAergic neuronal activity in the VTA exhibited sustained activity upon rewarding cue onset but no response to the presence or absence of an actual reward outcome (Cohen and Uchida, 2012). Furthermore, other studies found that VTA GABAergic neuronal activity was potently and phasically activated by punishment outcome, which in turn inhibited the VTA DA neuronal activity (Tan et al., 2012;Eshel et al., 2015). Another study showed that glutamatergic (Glu) neurons in the DRN reinforced instrumental behavior through VTA DA neurons (McDevitt et al., 2014).
Taken together, information on reward and punishment signaled by neuronal activities within the DRN and VTA seems to be diverse, heterogeneous, distributed, and mixed. Some of these signaling responses are illustrated in Figure 1B and summarized in Supplementary Table 1. This led to the following questions ( Figure 1A). First, can these experimental findings from separate studies be reconciled and understood in terms of a single, parsimonious neural circuit model encompassing both the DRN and VTA? Second, can there be several degenerate DRN-VTA circuits, which are stable, that can signal reward and punishment information as observed in experiments? Third, if there are degenerate and stable DRN-VTA circuits, can some of them be distinguishable from others, for example, through perturbative means?
To address these questions, we developed a biologically plausible DRN-VTA computational neural circuit model based on our previous multiscale modeling framework for neuromodulators (Joshi et al., 2017;Wong-Lin et al., 2017), constrained by some of the observed neuronal signaling profiles under reward and punishment tasks. The modeling considers known direct and indirect pathways between DRN 5-HT and VTA DA neurons. Upon simulating the model under reward and punishment conditions, we found that many, but not all of the experimental findings, could be captured in a single, parsimonious DRN-VTA model. Furthermore, several distinct model architectures could replicate the same neural circuit activity response profile, hence reflecting the possibility of degeneracy. Applying Degenerate neuromodulator circuits are constrained by observed neuronal signaling. (A) Multiple neuromodulators that influence neural circuits, cognition, and behavior, may be embedded within degenerate neural circuits. Neuromod: Specific neuromodulator type. (B) Schematic of DRN and VTA activity profiles in reward and punishment tasks. Activities (firing rates) aligned to the timing of unexpected punishment outcome (left, vertical red dashed lines) and learned reward-predictive cue (right, vertical green dashed lines) and reward outcome (right, vertical red dashed lines). Top-to-bottom: VTA DA neural activity exhibits phasic excitation (inhibition) at reward-predictive cue (punishment) outcome (e.g., Cohen and Uchida, 2012;Tan et al., 2012). VTA GABAergic neural activity shows phasic excitation upon punishment (e.g., Tan et al., 2012;Eshel et al., 2015) while exhibiting post-cue tonic activity which is not modulated by the presence/absence of actual outcome (e.g., Cohen and Uchida, 2012). DRN Type-I 5-HT neurons show phasic activation by a reward-predicting cue (right) but not punishment (left). DRN Type-II 5-HT neurons signal punishment outcome (left) and sustained activity toward expected reward outcome (right) (e.g., Cohen et al., 2015). DRN GABAergic neurons have phasic activation upon punishment but have tonic inhibition during waiting and reward delivery (e.g., Li et al., 2016). DRN glutamatergic neurons were deduced to be excited by reward-predicting cues, in line with VTA DA neural activation (McDevitt et al., 2014), and assumed not to respond to punishment outcome. Baseline activity for Type-I 5-HT DRN neurons is higher in reward than punishment tasks (e.g., Cohen et al., 2015).
dynamical systems theory, we found that all these circuits were dynamically stable. To distinguish among these degenerated models, we simulated the drug effects of DA D2-receptor-based agonists and were able to distinguish sub-groups of these degenerate model architectures. Overall, our study provides theoretical support for the degeneracy and stability of neural circuits of neuromodulators.

DRN-VTA network modeling
To develop the DRN-VTA neural circuit models, we made use of our dynamic mean-field (neuronal population-based) modeling framework (Joshi et al., 2017) for neuromodulator interactions. The modeling approach was constrained by data from known electrophysiological, neuropharmacological, and voltammetry parameters [see (Joshi et al., 2017) and below]. Each neural circuit model architecture investigated consisted of DRN 5-HT, VTA DA, VTA GABA, DRN GABA, and DRN Glu neuronal populations. Direct and indirect interactions among these five neuronal populations were then explored. The main aim of this work was to evaluate the plausibility of neuromodulator circuit degeneracy and stability rather than replicate every neuronal population in these brain regions.
Directed interaction between two neuronal populations, in which the source was a neuromodulator, was mathematically described by the neuromodulator neuronal population firing (rate) activity, followed by the release-and-uptake dynamics of the neuromodulator, which in turn induced certain population-averaged currents on the targeted neuronal population (Joshi et al., 2017). An induced current, I x , which could be effectively excitatory or inhibitory depending on experimental findings, can be described by: where x was some neuromodulator (5-HT or DA), τ x the associated time constant, k x some constant that determined the current amplitude, and constants g x and [x] 0 that controlled the slope and offset of the function on the right-hand-side of Eq. (1). The releaseand-uptake dynamics for a neuromodulator x are described by: where [x] was the concentration of x, [x] p the release per neural firing frequency (Joshi et al., 2017), and constants V max,x and K m,x were constants determined from voltammetry measurements. If the source of the interaction came from GABAergic (or Glu) neurons, then for simplicity, we assumed an instantaneous inhibitory (or excitatory) current-based synaptic influence on the targeted neuronal populations. This reflected the faster ionotropic receptorbased synaptic dynamics compared to metabotropic receptorbased neuromodulatory effects, while also reducing the number of free model parameters. Similarly, we also ignored the relatively faster neuronal membrane dynamics. Threshold-linear input-output function for each neuronal population was used (Joshi et al., 2017) and described by: where F was the neural population firing rate (output), I was the total input current into a neural population, g was the slope, and I 0 was some threshold current, and with [z] + = z if z ≥ 0, and 0 otherwise. Here, I is a function of the summed currents, including the neuromodulator-induced currents I x s and ionotropic receptor-based currents (proportional to presynaptic neural firing rates). For simplicity, fast co-transmission of neurotransmitters was only considered in one modeling instance (co-release of 5-HT and glutamate via fast 5-HT 3 and ionotropic receptors) based on findings by Wang et al. (2019). From a computational modeling perspective, the DRN (Type I) 5-HT and Glu neuronal populations, which have rather similar activity profiles ( Figure 1B, 3rd and 6th rows) could also be effectively grouped and considered as a single 5-HT neuronal population that "co-transmit" both 5-HT and Glu to DA neurons (Wang et al., 2019). For each model circuit architecture that we investigated, we considered separately the inclusion of either Type I or II 5-HT neurons in the circuit (Cohen et al., 2015), and the possibility of excitatory and inhibitory projections from 5-HT to DRN Glu/GABA and DA neurons, and from DA to GABA neurons in the VTA. To allow tractability in the search for the many possible connectivity structures, the models' connections were largely based on known experimental evidence. For example, the connections between VTA GABAergic and DRN Glu neurons were not considered as, to date, there is little experimental support. We also focused only on learned rewards (with reward-predicting cue followed by reward outcome) and unexpected punishment conditions, simulated using a combination of tonic and/or phasic afferent inputs. It should be noted that we did not consider other conditions and network learning effects (e.g., Hu, 2016) as the main aim was to demonstrate the plausibility of DRN-VTA circuit degeneracy and stability, constrained by specific observed neuronal signaling in reward and punishment tasks. In particular, we investigated various DRN-VTA circuit architectures with network activity profiles that closely resembled those shown in Figure 1B. Note also that all activity profiles in Figure 1B were based on experimental studies except VTA Glu neural activity, which we deduced to be similar to that of DA neural activity in the rewarding task (McDevitt et al., 2014) and assumed to be non-responsive in the punishment task. See below for further details regarding the modeling procedures, parameters, simulations, and analyses.

Input-output functions of neural population firing rates
The computational models developed were based on our previous mean-field, neural population-based modeling framework for neuromodulator circuits (Joshi et al., 2017), in which the averaged concentration releases of neuromodulators (e.g., 5-HT) were monotonic functions of the averaged firing rate of (e.g., 5-HT) neuronal populations via some neuromodulator induced slow currents. All five neural populations' firing rates were described by threshold-linear functions [general form in Eq.
where [z] + = z if z ≥ 0, and 0 otherwise. The threshold input values for I 0,DA was −10 (a.u.) for DA neurons, and I 0,5−HT was 0.13 (a.u.) for 5-HT neurons, to allow spontaneous activities mimicking in vivo conditions. A 5-HT neurons had a threshold-linear function with a gain value g 5−HT of about 1.7 times higher than that for DA neurons, and so we set there for DA and 5-HT neurons to be 0.019 and 0.033 (Hz), respectively (e.g., Shepard and Bunney, 1991;Richards et al., 1997;Crawford et al., 2010;Wong-Lin et al., 2012;Challis et al., 2013). For simplicity, we assumed the same currentfrequency or input-output function in either tonic or phasic activity mode (Jalewa et al., 2014;Joshi et al., 2017).

Afferent currents and connectivity
The afferent current, I, for a neural population consisted of summed contributions from external excitatory inputs I ext including those induced by reward or aversive stimuli, and recurrent interactions with other neural populations (see below) e.g., I 5−HT,DA for effective DA-induced currents in 5-HT neurons. Additionally, for a neuromodulator population, auto receptor-induced current, I auto , was included.
Due to limited experimental evidence, and to reduce the parameter search space, we did not consider the following connections from (i) DRN GABA to VTA DA neurons; (ii) DRN GABA to VTA GABA neurons; (iii) VTA GABA to DRN Glu neurons; (iv, v) DRN Glu to DRN GABA neurons, and vice versa; and (vi) VTA DA to DRN Glu neurons. Then the total (populationaveraged) afferent input currents to DA and 5-HT neurons were, respectively, described by: where the first terms on the right-hand sides of Eqs. (6) and (7) were autoreceptor-induced self-inhibitory currents, and the second terms were the 5-HT-to-DA (labeled with subscript DA, 5 − HT) and DAto-5-HT (with subscript 5 − HT, DA) interactions, the third terms were excitatory interactions from DRN Glu neurons, the fourth/fifth terms were inhibitory interactions from local DRN/VTA GABAergic neurons, and the last terms were additional external constant biased inputs from the rest of the brain and the influence of behaviorally relevant stimuli (due to rewards or punishments; see below) 5-HT neurons have a possible additional negative interaction from VTA GABA neurons (second last term on the right-hand-side) (Li et al., 2019). Negative or positive signs in front of each term indicated whether the interaction was effectively inhibitory or excitatory. The ±

FIGURE 2
A parsimonious, sparsely connected DRN-VTA circuit model. Gray: Brain region. Colored circle: Neuronal population. Legend: network's afferent inputs. Model architecture implicitly encompasses either Type I or II 5-HT neurons with two different inputs for reward/punishment task (bright red arrows if Type I; blue arrows if Type II; black arrows denote common inputs for reward/punishment task for both Types). Circuit connections: triangular-end arrows (excitatory); circle-end arrows (inhibitory). Thicker arrows: Stronger connection weights. Constant long-term reward inputs simultaneously to 5-HT and DA neurons to alter baseline activities. Sustained activity for the expectation of reward outcome implemented with tonic input between cue and reward outcome. All other inputs are brief, at cue or reward/punishment outcome, producing phasic excitations/inhibitions. Note: Self-inhibitory (self-excitatory) connections within GABAergic (Glu) neurons, and auto-receptor inhibitions within 5-HT or DA neurons were implemented but not shown here (see section "2. Materials and methods" and Supplementary Figure 1). This is the most basic, sparsely connected model architecture considered.
sign indicated effective excitatory (+) or inhibitory (−) interactions which we investigated (Figures 2-5), given their mixed findings in the literature [see also ]. This form of summed currents was consistent with some experimental evidence that showed different afferents modulating the tonic and phasic activation (e.g., Floresco et al., 2003;Tian et al., 2016). Similarly, the total (population-averaged) afferent current to the glutamatergic (Glu), and VTA and DRN GABAergic neurons can, respectively, be described by: were typically faster than currents induced by (metabotropic) 5-HT or DA currents. Thus, we assumed the former currents to reach quasi-steady states and described (Jalewa et al., 2014) and represented by I e/i = ±J e/i F e/i , where the subscript e/i denoted excitatory/inhibitory synaptic current, J the connectivity coupling strength, F the presynaptic firing rate for neural population e/i, and the sign ± for excitatory or inhibitory currents. Furthermore, dimensionless coefficients or relative connectivity weights, W s (with values 0), were later multiplied to the above neuromodulatorinduced current terms [right-hand-side of terms in Eqs. (9-13); see Eqs. (20)(21)(22)(23)(24)]. Both the J s and W s were allowed to vary to fit the network activity profiles of Figure 1B within certain tolerance ranges (see below) while exploring different neural circuit architectures (Figure 4 and see Supplementary Table 2 for specific values). The self-connection weights J s within the DRN Glu, DRN GABA, and VTA GABA neurons were set at 0.5, 0.5, and 10, respectively, for all network activity's response profiles.
Similarly, we assumed the sigmoid-like influence of [5 − HT] ([DA]) on DA (5-HT) neural firing activities between the DRN and VTA populations such that the induced current dynamics could be described by Wang and Wong-Lin (2013), Joshi et al. (2017): with the time constants τ 5−HT,DA = 1s and τ DA,5−HT = 1.2s (Haj-Dahmane, 2001;Aman et al., 2007). We set k 5−HT,DA = k DA,5−HT = 0.03 a.u. and g 5−HT = g DA = 20µM −1 , [5 − HT] 1 = 0.3 nM, [DA] 1 = 0.1 nM such that the neural firing activities and baseline neuromodulator concentration levels were at reasonable values (see below) and similar to those in experiments (e.g., Bunin et al., 1998;Hashemi et al., 2011). For simplicity, we assumed Eqs. (16) and (17) to be applied equally to all targeted neural . This will be used as a standard activity profile template to test for degeneracy. Time label from cue onset. Green (red) vertical dashed-dotted lines: Cue (outcome) onset time (as in Figure 1B). Top-to-bottom: VTA DA, DRN 5-HT, DRN GABAergic, DRN Glu, and VTA GABAergic neural populations. (C) Hypothesis for multiple different DRN-VTA circuits operating in parallel, which may consist of different clusters of neuronal sub-populations and different sets of afferent inputs, leading to different outputs. Vertical dots denote the potential of having more than two distinctive circuits.
populations, but with their currents multiplied by their appropriate weights w s (see above).

Reward and punishment conditions with type I and type II 5-HT neurons
To limit the size of the parameter search space, we focused on only the classical, fully learned reward conditioning task, and unexpected punishment task. For each simulated trial or realization within a set of conditions (reward/punishment, excitatory/inhibitory connection), we set the cue onset time at 4.5 s to allow the network time to stabilize and reach a steady state. The within-trial protocol for the external input current, I ext , was implemented as a function of time t as followed, depending on the simulated conditions. Note that, for simplicity, all external input currents were assumed to be excitatory, regardless of reward or punishment task, unless stated.
For reward tasks with Type-I 5-HT neurons (i) constant input currents I 5−HT,ext and I DA,ext to 5-HT and VTA DA neurons, respectively, of amplitude 50 a.u. to simulate long-term reward; (ii) brief 0.2 s pulse I Glu,ext to DRN Glu neurons of amplitude 1,000 a.u. at 4.5 s; (iii) constant step input current I GABA−VTA,ext to VTA GABAergic neurons with amplitude 200 a.u. from 4.5 to 5.7s to simulate sustained activity; and (iv) no external input I GABA−DRN,ext to DRN GABAergic neurons. For punishment task with Type-I 5-HT neurons (i) no external input to VTA DA, 5-HT neurons, and DRN Glu neurons; (ii) brief 0.2 s pulse to VTA (I app ) and DRN GABAergic (I app5i ) neurons with amplitude 1,000 a.u. at 5.7 s. Neural circuit model architectures with similar network activity profiles. The activity profiles were similar to that in Figures 2A, B, satisfying the inclusion criterion. Model architectures (A-K) denote architectures of decreasing connectivity, with Figure 2 as architecture (K). Model architecture (L) has an asterisk to denote that it was the only model with fast 5-HT to VTA DA connection, simulating fast 5-HT 3 or Glu receptor-mediated connection or their combination (co-transmission). Architectures (A,L) have additional inhibitory input to DRN GABA neurons in reward tasks. All labels, connections, and nomenclature have the same meaning as that in Figure 3, except that, for simplicity, the relative connection weights (thickness) are not illustrated, and the diamond-end arrows denote connections that are either excitatory or inhibitory, with both explored. Self-connectivity is not illustrated [see the general example in Supplementary Figure 1   Negative real eigenvalues at steady states of degenerate models. (A) Complete set of the real part of the eigenvalues for model #1 (Supplementary Tables 2-4) with architecture "A" in Figure 4. Horizontal axis: Eigenvalues ranked from the largest to the smallest (magnitude wise). Blue (red): More negative eigenvalues with phasic (blue) than tonic (red) input. Asterisk: Maximal eigenvalue (largest magnitude) for each condition. (B) For each of the 84 models, only the real part of the eigenvalue with the largest magnitude is plotted under phasic (blue cross) and tonic (red circle) input conditions. Model architectures "A" to "L" refer to the different architectures as in Figure 4, in which each has its own distinctive model types (e.g., different 5-HT neuronal or excitatory/inhibitory connectivity types). Eigenvalues for all model types have negative real parts, indicating dynamically stable. For most models, the eigenvalues are generally more negative during phasic than tonic activities.
For reward task with Type II 5-HT neurons (i) constant input current to 5-HT and VTA DA neurons with amplitude 50 a.u. to simulate long-term reward; (ii) additional step input current to 5-HT neurons with amplitude 100 a.u. from 4.5 to 5.7 s to simulate sustained activity; (iii) brief 0.2 s pulse to DRN Glu neurons of 1,000 a.u. at 4.5 s; and (iv) no external input to VTA and DRN GABAergic neurons. For punishment task with Type II 5-HT neurons (i) no external input to VTA DA and GABAergic neurons, and DRN Glu D 2 receptor agonists can distinguish subsets of DRN-VTA neural circuits. Simulated drug administered during punishment and reward tasks with efficacy factor X increments of 1, 10, 40, 70, and 100 times. Baseline condition: X = 1. Colors other than black in the pie chart denote that at least one neural population activity in specific model architecture(s) (labeled in yellow, and as in Figure 4) has deviated (increased or decreased) beyond the inclusion criterion. Specific colors denote changes in specific neural population activities (see Supplementary Tables 5-9 for details.).
neurons; (ii) brief 0.2 s pulse to DRN GABAergic and 5-HT neurons with amplitude of 1,000 a.u. at 5.7 s.
In addition, for transient inputs, we have used multiplicative exponential factors exp (-t/τ) with τ of 50 ms to smooth out activity time courses (e.g., afferent synaptic filtering), but they do not affect the overall results. To simulate long-term, acrosstrial reward/punishment signaling, we assumed a higher constant excitatory input to both 5-HT and DA neurons in reward than punishment trials. When searching for the neural circuit architecture using either Type I or II 5-HT neurons, we limited ourselves as much as possible to the same internal DRN-VTA circuit structure. This also reduced the complexity of the parameter search space.

Baseline neural activities and inclusion criterion
We define the neural circuit activities under baseline conditions (right before cue onset) to follow that in Figures 3A, B. Namely, the baseline firing rates for 5-HT, DRN GABA, DRN Glu, DA, and VTA GABA neurons in the punishment task were 3.0, 21.5, 4.1, 4.8, and 13.5 Hz, and those in the reward task were 4. 5, 19.4, 4.1, 4.8, and 16.3 Hz, respectively (compared with Supplementary Table 3). Baseline [5 − HT] and [DA] levels were constrained to be at 10 and 1.5 nM, respectively. However, it is known that these activities can vary widely across subjects, species, and studies (Supplementary Table 3).
While searching for degenerate neural circuit architecture or investigating substantial changes due to simulated D 2 agonists, we had to define acceptable ranges of neural activities to evaluate whether the variant neural circuit still behaved similarly to that of the model in Figures 3A, B. Specifically, an inclusion criterion was set that evaluated the time-averaged percentage change in the neural population activities of any new model with respect to that of the template activity profiles (Figures 3A, B); the timeaveraged percentage change in activities of the DA, 5-HT, DRN GABA, VTA GABA, and Glu neural populations for any qualified (degenerate) circuits had to be less than 10, 10, 16, 16, and 10% from those of the template activity profiles, respectively. These mean percentage changes were calculated within a 3 s time duration, from 1 s before cue onset to 1 s after outcome onset, encompassing both baseline and stimulus-evoked activities. Any neural circuit architecture which did not satisfy the inclusion criterion for any neural population activity was discarded when in search of degenerate neural circuits or was highlighted (embedded in non-black region in Figure 6) when investigating simulated D 2 agonist (see below). As this part of the study was to prove a concept, specific values of the inclusion criterion were theoretical and not critical-the general results, i.e., overall trends, remained even if these values were altered.

Simulating the effects of D2 agonist
DA and 5-HT induced currents can lead to overall excitatory or inhibitory effects, depending on receptor subtype(s) and the targeted neurons. In particular, DA enhances VTA GABAergic neuronal activity via D 2 receptors and depolarizes the membrane of 5-HT neurons (e.g., Haj-Dahmane, 2001;Aman et al., 2007;Ludlow et al., 2009;Courtney et al., 2012;Ford, 2014). DA also regulates the activity of other DA neurons via D 2 auto-inhibitory receptors (Adell and Artigas, 2004). To study how D 2 receptor-mediated drugs can affect DRN-VTA architecture differently, we simulated different D 2 agonist dosage levels simultaneously by multiplying the connection weights of D 2 receptor-mediated currents (see above; Figure 4; orange connections in Supplementary Figure 1) by a factor "X" (Figure 6) of 10, 40, 70, and 100 times (default of X=1). Then for each dosage, we separately observed the deviation in activity profiles for each neural population with respect to the allowed range. Again, we considered the time-averaged percentage changes due to the drug to be substantial if the mean percentage change in at least one neural population activity had violated the inclusion criterion (see above, and Supplementary Tables 3-8).

Network stability analysis
The five neural population firing rates [Eqs. (4-8)], when combined with their associated afferent currents [Eqs. (9-13)] with explicit parameter values and relative connectivity weights, W s and J s (Supplementary Figure 1 and Supplementary Table 2), can be rewritten, respectively, as: We also inserted the explicit parameter values to the dynamical equations ] to obtain: To check for network stability for each of the considered degenerate neural circuits, we first find each network's steady state (or fixed point) by setting the rate of change for all the dynamical equations to zero, i.e., Using only the linear parts of the above threshold-linear functions (which were validated post hoc), and after some algebraic manipulations, we obtained the following system of equations, in matrix form: But by setting Eqs. (29) and (30)  Next, to check whether the system is stable at a steady state, we compute the 5-by-5 Jacobian matrix M Jacobian (Strogatz, 2018) for the five dynamical equations ], which can be computed using partial derivatives on the right-hand-side of these equations: The eigenvalues of this Jacobian matrix were computed for each steady state for each model under each simulated condition (e.g., reward/punishment task). If the real parts of all the eigenvalues of the Jacobian matrix were negative at a given steady state, then the model was considered to be dynamically stable at that steady state (Strogatz, 2018).

A DRN-VTA model can reconcile many signaling patterns
We began with a parsimonious, sparsely connected DRN-VTA model architecture and adjusted the strength of the afferent inputs and the internal connection weights of the network (section "2. Materials and methods") such that the network activity profiles attained (Figure 3) were readily comparable to the stereotypical profiles as illustrated in Figure 1B. We started the simulations using a minimal network configuration, the most sparsely connected DRN-VTA circuit model investigated in this work (Figure 2). All other subsequent architectures considered would henceforth be derived from this basic architecture. This minimal network architecture readily recapitulated many of the neuronal signaling changes in the DRN and VTA in separate experimental studies, both for reward (Figure 3, black dashed) and punishment (Figure 3, orange bold) tasks. The neural activity generated from the model (Figures 3A,  B) will be used as a standard activity profile template to later test for degeneracy. Deviations from this template will be later used to distinguish different subgroups of the DRN-VTA degenerate networks (see below).
In particular, in the punishment task with Type I 5-HT neurons, brief excitatory inputs to GABAergic neurons in the DRN and VTA led to their punishment-based phasic activation (Figure 3A, 3rd, and 5th rows, orange) (Tan et al., 2012;Li et al., 2016). To replicate the activity profiles for a circuit with Type II 5-HT neurons, brief excitatory inputs to the DRN GABAergic and 5-HT neurons were implemented instead, leading to punishment-based phasic activation for both (Cohen and Uchida, 2012;Tan et al., 2012;Li et al., 2016; Figure 3B, 2nd and 3rd rows, orange). With both Type I and II 5-HT neurons, there was also a phasic inhibition of the VTA DA activity via VTA GABAergic neurons (Figure 3, 1st row, orange), in line with findings from previous studies (Schultz et al., 1997;Cohen and Uchida, 2012;Tan et al., 2012;Watabe-Uchida et al., 2017). It should be noted that with Type II 5-HT neurons, the phasic activation of VTA GABAergic neurons was not as potent due to the filtering by the slow excitatory connection from 5-HT to VTA GABA neurons. Alternatively, a phasic input might instead be sent to VTA GABAergic neurons, leading to more potent activation of the latter.
In the reward task, to replicate a sustained Type II 5-HT neuronal reward signaling between cue onset and reward outcome ( Figure 3B, 2nd row, black dashed) (Liu et al., 2014;Cohen et al., 2015;Li et al., 2016), tonic excitatory input to DRN Type II 5-HT was implemented. The resulting sustained 5-HT activity led to gradual suppression of DRN GABAergic activity ( Figure 3B, 3rd row, black dashed) (Liu et al., 2014;Li et al., 2016) but also a gradual rise in VTA GABAergic activity regardless of reward outcome ( Figure 3B, last row, black dashed) (Cohen and Uchida, 2012;Eshel et al., 2015). The effects on these two GABAergic populations were, respectively, due to 5-HT's inhibitory connection to DRN GABAergic neurons and excitatory connection to VTA GABAergic neurons. With regard to the latter, evidence of such excitatory influence via 5-HT 2C receptors was reported .
Excitatory connection from DRN Glu to VTA DA neurons was particularly strong in the models. This, together with no (or weak) 5-HT-to-DA connection, was consistent with previous work (McDevitt et al., 2014). In the rewarding task, brief excitatory input to the DRN Glu neuronal population led to its reward-sensitive phasic activation (Figures 3A, B, 4th row, black dashed). Alternatively, we could have directly implemented phasic excitatory input to the VTA DA neurons. In any case, the phasic activation of DA activity was consistent with the reported DA neuronal response to a fully learned reward-predicting cue (Schultz et al., 1997;Cohen and Uchida, 2012;Watabe-Uchida et al., 2017).
Variations in some of the model parameters could still recapitulate these profiles, exhibiting model robustness (Supplementary Table 2). Moreover, only modifications to the afferent inputs to DRN 5-HT and VTA GABA neurons (section "2. Materials and methods") were needed to replicate the signaling of Type I or II 5-HT neurons (Figures 3A, B, respectively), while maintaining the same internal connectivity structure. For example, a lack of sustained reward-based activity of Type I 5-HT activity ( Figure 3B, 2nd row, black dashed) required additional external input to sustain VTA GABAergic neural activity ( Figure 3A, bottom row, black dashed; Figure 2). It should be noted that this is assumed that the activity profiles for these non-5-HT neurons were qualitatively similar, regardless of the 5-HT neuronal types (Figure 2).
To understand across-trial reward vs. punishment effects in the model, we implemented a higher constant excitatory input into both DRN 5-HT and VTA DA neurons under reward compared to punishment conditions (Figure 2, long black arrows). This particular model required differential inputs to 5-HT and DA neurons (section "2. Materials and methods") such that the overall tonic 5-HT neural activity was higher for reward than punishment trials, while DA neural activity remained unchanged (Figure 3, black dashed vs. orange bold lines in top two rows), again consistent with experimental observation (e.g., Cohen et al., 2015). In the model, although both 5-HT and DA neurons directly received constant across-trial reward-based excitatory inputs, the indirect inhibitory pathway from 5-HT neurons through VTA GABA neurons onto VTA DA neurons nullified the overall effects on DA neurons (Figures 2, 3). In other words, increased firing of 5-HT neurons could be activating VTA GABAergic neurons to a level sufficient to inhibit VTA DA neurons and thereby canceling out the net long-term reward signals (Figure 2). Note that this is just one out of several possible models; e.g., a more complex case (with more model parameters) which we are not considering could be that the VTA DA, VTA, GABA, and DRN 5-HT neurons individually receive very different inputs.
However, under these conditions, with both Type I and II 5-HT neurons, the baseline DRN and VTA GABAergic activities in the rewarding task were slightly different than those in the punishment task (Figure 3, 3rd and 5th rows, black dashed vs. orange bold), which have yet to be observed in experiments. The model's inability to recapitulate this specific phenomenon might perhaps suggest that there are more complex features in the system, such as further division of neuronal subgroups. This also holds for other model architectures (see below, Figure 4). For example, it could be possible that a subpopulation of DRN GABAergic neurons is directly connected to 5-HT neurons (as in Figure 1B), but not another DRN GABAergic neuronal sub-population, such that across-trial reward signal inputs are distributed differently than that of Figure 1B. Moreover, high chemical and functional diversity among DRN 5-HT neurons are now well recognized (Okaty et al., 2019). Hence, it might be possible that there could exist multiple neural circuits with different circuit architectures operating in parallel, as illustrated in Figure 3C. Another possibility could be due to merely different inputs from e.g., different sources.
In summary, we have shown that, under reward and punishment conditions, many of the observed signaling patterns in different DRN and VTA neuronal types could readily be reconciled within a single sparsely connected DRN-VTA circuit model. However, not all the signaling patterns could be captured, suggesting that multiple different neuronal sub-populations and circuits may be operating in parallel within the DRN-VTA system. Next, we shall investigate whether various different DRN-VTA neural circuits could produce the same output, i.e., be degenerate.

Multiple degenerate DRN-VTA circuits
To search for degenerate DRN-VTA circuit models, we evaluated various combinations of connections within and between the DRN and VTA, and where necessary, adjusted any afferent inputs (section "2. Materials and methods"). We used the network activity profile template as illustrated in Figures 3A, B as output "targets" to check whether other different circuits could replicate similar activity profiles.
Given the variability of neuronal firing rates reported in the literature (Supplementary Table 3), to be considered a degenerate neural circuit, we set an inclusion criterion that allowed the timeaveraged percentage changes of the neural population firing activities to be within certain acceptable ranges as compared to the activities in Figures 3A, B. Specifically, the time-averaged percentage change in activities of the DA, 5-HT, DRN GABA, VTA GABA, and Glu neural populations for any qualified (degenerate) circuits had to be less than 10, 10, 16, 16, and 10% from those of the template activity profiles, respectively. These mean percentage changes were calculated within a 3 s time duration, from 1 s before cue onset to 1 s after outcome onset, encompassing both baseline and stimulus-evoked activities. Any neural circuit architecture which did not satisfy the inclusion criterion for any neural population activity was discarded. Note that here, we aimed to prove a concept and the specific values of the inclusion criterion were theoretical and not critical-the general results, i.e., overall trends, remained even if the values were altered (not shown).
Various neural circuits were created by systematic addition and modification of connections of our parsimonious, sparsely connected DRN-VTA circuit model (now presented as model architecture "K" in Figure 4). For each model architecture, model parameters were searched, in both reward and punishment tasks using both Types I and II 5-HT neurons, until the activity outputs lie within the inclusion criterion. Both excitatory and inhibitory connections from DRN 5-HT to DRN Glu/GABA and VTA DA neurons were also explored. Once the inclusion criterion was met, the model parameter values were varied to check for robustness (Supplementary Table 2). This was repeated for different model architectures until we reached a high connectivity model structure (architecture "A" in Figure 4).
Based on this extensive search process, we obtained a total of 84 different neural circuit model architectures that recapitulated the activity profiles in Figures 3A, B. Their high-level model architectures were illustrated in Figure 4 (see also Figure 5 and Supplementary Tables 2, 3). For instance, model architecture "A" in Figure 4 (see Supplementary Figure 1, for detailed architecture) actually consisted of 32 distinctive models with different 5-HT neuron types and connectivity signs (Supplementary Table 4). Interestingly, this architecture's model parameters remained the same with either excitatory or inhibitory connections (Figure 2, diamond connections; Supplementary Table 2).
We observed that all models with connection from 5-HT to DA neurons were relatively weak or not required, consistent with previous work (McDevitt et al., 2014) [but see below, (di Giovanni et al., 2008;Wang et al., 2019)]. Furthermore, a substantial number of other connections in the DRN-VTA model were identified to be redundant or relatively weak, at least in the context of the conditions investigated (Figure 4). In particular, the relatively weak VTA DA-to-GABA connection was consistent with studies that showed either a weak or non-existent direct effect of VTA DA on VTA GABA neurons (Morales and Margolis, 2017).
With these degenerate models, we could also investigate the effects of specific DRN-VTA connectivity in which there are mixed findings or a lack of evidence in the literature. For example, the influence of 5-HT on DRN GABA neurons could be excitatory or inhibitory (Hernández-Vázquez et al., 2019). In the case when this connection was inhibitory, and not excitatory, we found it to involve several degenerate circuits (model architectures "B" to "L"). In another example, with model architecture "B, " when the connection from DRN 5-HT to DRN Glu neurons was inhibitory, it led to more restricted values in its connectivity strength (0-0.1 a.u.) and hence less robust than when it was excitatory (0-1 a.u.) (Supplementary Table 2). It has been reported that there are mixed findings of 5-HT on VTA DA neurons (de Deurwaerdère and di Giovanni, 2017). In our simulations, we found that an inhibitory connection from 5-HT to VTA DA neurons could lead to several degenerate models (architectures "A"-"D, " "I, " and "L"). Note that in model "L" (labeled with an extra asterisk in Figure 4), we also successfully simulated its fast connectivity version, mimicking either 5-HT 3 receptormediated transmission or 5-HT-glutamate co-transmission (Wang et al., 2019).
In terms of compensatory network effects, we found that upon the removal of the 5-HT to DRN GABA connection (from architecture "B" to "C"), be it excitatory or inhibitory, the excitatory connection from DRN Glu to 5-HT neurons had to be weakened by ∼16% to satisfy the inclusion criterion. With additional removal of the connection from VTA DA to VTA GABA neurons, we found that the strength of the inhibitory connection from 5-HT to DRN GABA neurons had to be increased by ∼150% (architecture "D"). Interestingly, unlike other models, only the models with architectures "A" and "L" had to include an inhibitory connection from VTA GABA to DRN GABA neurons, in which such a connection was observed (Li et al., 2019).

Degenerate DRN-VTA circuit models are dynamically stable
After identifying the theoretical existence of degenerate models, we used dynamical systems theory to determine whether they were dynamically stable, i.e., whether (local) perturbation from their steady states would eventually cause a return to their initial steady states (see section "2. Materials and methods"). Specifically, the stability of each neural circuit could be determined by first finding the possible steady state(s) [i.e., fixed point(s)]. This was achieved by setting all the dynamical (differential) equations to zero and finding the algebraic solutions for the dynamical variables (section "2. Materials and methods"). Then the eigenvalues of the system's Jacobian matrix at the steady states were computed (see section "2. Materials and methods" for the mathematical derivation of the steady states and the Jacobian matrix). For a neural circuit to be dynamically stable, the real part of all the eigenvalues associated with the steady state has to be negative. This was exactly what we found for all the identified degenerate neural circuits in (Figures 4, 5; section "2. Materials and methods"), with no imaginary part in the eigenvalues. Figure 5A displayed the complete set of the real part of the eigenvalues for model #1 with architecture "A" in Figure 4. This model had inhibitory connections from VTA DA to VTA GABA neurons and from 5-HT to VTA DA/Glu neurons, using Type I 5-HT neurons and under punishment conditions (Supplementary Table 4). It was observed that the eigenvalues with phasic input (blue) were generally larger, magnitude-wise, than those with tonic input (red). This was more pronounced for the eigenvalues with the largest magnitude (maximal eigenvalues) (e.g., see Figure 5A). Importantly, all the eigenvalues were negative, indicating a dynamically stable network model even in the presence of additional phasic stimulus input.
We repeated the analysis for all 84 models, under both phasic and tonic input conditions. This analysis was presented in Figure 5B only for the maximal eigenvalues (red circles and blue crosses). The non-maximal eigenvalues had similar trends across the other models. In general, with phasic activities (blue crosses), the models were more stable than with tonic activities (red circles). However, during phasic activations, there were 18 models with rather small (close to zero) eigenvalues (magnitude wise), albeit still negative. Hence, they are easy to lose the stability. This was not observed for tonic activations, where the (most negative) eigenvalues were found to hover within a small range of values (−0.017 to −0.016), except for models with architecture "L" (∼−0.03). In fact, the latter models, which were the only ones with a fast 5-HT-to-DA connection ( Figure 5B, models #77-84), were the most stable under both phasic and tonic conditions. Furthermore, there was no difference identified between the excitatory (models #77-80) and inhibitory (models #81-84) connections. Moreover, most of their eigenvalues under phasic conditions were substantially more negative than their tonic counterparts.

D2-mediated drugs can distinguish some degenerate DRN-VTA circuits
Given the large number of degenerate and stable DRN-VTA circuits identified, how could one distinguish among at least some of them? To address this, we investigated the neural circuit responses to simulated dopaminergic D 2 receptor agonists due to the extensive D 2 receptor-mediated connectivity within the degenerate DRN-VTA circuits (Figure 4 and Supplementary Figure 1). In particular, in the degenerate models, D 2 receptor-mediated connections involved those from VTA DA neurons to DRN 5-HT, DRN GABA, and VTA GABA neurons, and also the self-inhibitory connection (D 2 autoreceptor-mediated) of DA neurons (see section "2. Materials and methods" for references). To mimic the effects in the model of a D 2 receptor agonist, we gradually increased the strengths equally on the connections mediated by D 2 receptors (Supplementary Figure 1, connections emanating from DA neurons) by some factor (X) and observed how the network activity changed.
As we gradually increased the strengths of these specific sets of connections, subsets of the degenerate models gradually behaved differently from the activity profile template in (Figures 3A, B,  6A, model architectures embedded in non-black regions), i.e., not satisfying the inclusion criterion, thus, possibly allowing us to distinguish them. (see Supplementary Tables 5-9 for details). When at a low dose with a 10-fold (X=10) increase in D 2mediated connections, all models showed substantial changes to their DA activities, beyond the inclusion criterion, regardless of reward/punishment condition and 5-HT neuronal type composition. Hence, they could not be distinguished at this dosage.
For moderately higher D 2 agonist doses (X = 40 and 70), models with architecture "A" can be distinguished from others, with substantial alteration to the DRN GABA and 5-HT activities ( Figure 6A, light blue). Models with architectures "B" and "C" were also beginning to reveal substantial activity changes under reward conditions with Type II 5-HT neurons. Furthermore, with an increased factor of 70 under punishment conditions, a model with architectures "D, " "E, " "H, " and "L" could be distinguished from others with additional changes to their 5-HT activities. With a factor of 100, a maximum of five subsets of model architectures could be distinguished based on an activity enhancement of different combinations of neuronal types. An interesting observation we found was that for the same composition of 5-HT neuronal type (I or II), the subsets of distinguishable neural circuits were different between reward and punishment tasks.
Overall, our drug simulation predicted that a gradual increment of the level of D 2 receptor activation could lead to differential enhancements of firing rate activities that could be used to distinguish subsets of the degenerate DRN-VTA circuits.

Discussion
In this work, we have shown that neural circuits which are sources of neuromodulators can degenerate. In particular, the work focuses on computationally modeling and analyzing DRN-VTA circuits, as the DRN and VTA share structural and functional relationships among their constituent neuronal types. In particular, these circuits are involved in the regulation of several cognitive, emotional, and behavioral processes, and are implicated in many common and disabling neuropsychiatric conditions (Muller and Jacobs, 2009). Moreover, their neuronal signaling in reward and punishment tasks are well studied (Hu, 2016).
In this work, we developed biologically based mean-field computational models of the DRN-VTA circuit (Figure 2) with several neuronal types and tested them under classic conditions of (learned) reward and (unexpected) punishment. The modeling was partially constrained by known connectivity within and between the DRN and VTA regions, and their inputs from multiple other brain regions, including mixed combinations of inputs (Watabe-Uchida et al., 2012Ogawa et al., 2014;Pollak Dorocic et al., 2014;Beier et al., 2015;Tian et al., 2016;Ogawa and Watabe-Uchida, 2018). Our work demonstrated that degenerate and stable DRN-VTA neural circuits are theoretically plausible.
We found that a parsimonious, sparsely connected version of the DRN-VTA model could reconcile many of the diverse phasic and tonic neural signaling events reported in the DRN and VTA in punishment and reward tasks observed across separate experimental studies (Figures 1, 3A, B). This model was evaluated using Type I and Type II 5-HT neurons in the DRN as defined electrophysiologically in a previous study (Cohen et al., 2015). In the case of Type II 5-HT neurons in reward condition, the model suggested that sustained 5-HT neuron activity between cue and reward outcome (Cohen et al., 2015) would lead to the gradual inhibition of DRN GABA neuron activity and enhancement of VTA GABA neuron activity (Cohen and Uchida, 2012;Li et al., 2016). The sparsely connected model could also reproduce experimental observations (Cohen et al., 2015) of an increase in baseline firing of Type I 5-HT neurons across several trials in the rewarding task, without similar effects on VTA DA neurons, or in the punishment task (Figure 3). Thus, this specific model suggested that slow, across-trial reward-based excitatory inputs could potentially be directly targeted to both DRN 5-HT and VTA DA neurons and that inhibitory 5-HT to GABA to DA connectivity could cancel out the effects of the direct input to DA neurons, rendering only long timescale changes to the baseline activity of 5-HT neurons but not DA neurons (Figure 2). It should, however, be noted that this is just one possible theoretical speculation offered out of many. For instance, another trivial possibility would be simply to have different separate inputs to DRN 5-HT and VTA DA and GABA neurons.
Despite the success of readily recapitulating the main observed phenomena, this relatively simple model was unable to capture some specific activity profiles with Type I 5-HT neurons, namely, the differential baseline activities of VTA GABA neurons and DRN GABA neurons between reward and punishment conditions. We later found that this was the case even for the more complex neural circuit models. Thus, perhaps additional neuronal populations and DRN-VTA circuits could be operating in parallel (Figure 3C) or it could just be these neurons receive different inputs from e.g., different sources. Such parallel circuits could be validated experimentally in the future, for example, using gene-targeting of specific DRN and VTA neuron subtypes and projections. Indeed, there is increasing evidence that DRN 5-HT neurons are more chemically diverse than previously expected, and that there is a high level of functional diversity in output pathways of the DRN and VTA (Watabe-Uchida et al., 2012Wong-Lin et al., 2012;Ogawa et al., 2014;Pollak Dorocic et al., 2014;Weissbourd et al., 2014;Beier et al., 2015;Fernandez et al., 2016;Tian et al., 2016;Morales and Margolis, 2017;Zhou et al., 2017;Ogawa and Watabe-Uchida, 2018;Ren et al., 2018;Okaty et al., 2019). Further features will need to be incorporated including additional pathways and mechanisms as they are being uncovered, and this could include VTA glutamatergic neurons (McGovern et al., 2021). Clearly, the number of possible connections and models will be increased.
To demonstrate degeneracy in the DRN-VTA system, we showed that several variants of the DRN-VTA circuit model could readily recapitulate the same neuronal signaling profiles, with only occasional slight changes made to their afferent inputs (Figure 4). Our results also suggest experimental studies on the DRN and/or VTA system, e.g., using tracing methods, to search for more than one neural circuit configuration.
Consistent with the previous work (McDevitt et al., 2014), the degenerate models showed a relatively weaker direct connection from DRN 5-HT to VTA DA neurons as compared to the connection from DRN Glu to VTA DA neurons. However, previous studies had also demonstrated a direct influence of DRN 5-HT on VTA DA neuron activity (e.g., de Deurwaerdère and di Giovanni, 2017). More recent work has shown that DRN 5-HT terminals in the VTA corelease glutamate and 5-HT, eliciting fast excitation (via ionotropic receptors) onto VTA DA neurons and increased DA release in the nucleus accumbens to facilitate reward (Wang et al., 2019). Hence, we also incorporated a model with a fast 5-HT-to-DA connection (model architecture 'L * ' in Figure 4) and found such circuits to be plausible in terms of capturing the stereotypical reward and punishment signaling.
Next, we applied dynamical systems theory and showed that all the found degenerate circuits were dynamically stable ( Figure 5). Interestingly, we found that model architecture 'L * ' with fast 5-HT-to-DA connection was dynamically more stable than all other architectures investigated ( Figure 5B). Future computational modeling work could explore the effects of co-transmission of neurotransmitters on neural circuit degeneracy and functioning and using more biologically realistic spiking neuronal network models across multiple scales (Wong-Lin et al., 2012. Finally, we simulated the gradual increase in dopaminergic D 2 receptor activation, due to extensive D 2 -mediated connections in the investigated circuits. This was done by increasing the connection strengths emanating from the VTA DA neurons. This allowed us to distinguish subsets of the degenerate DRN-VTA circuits by identifying substantial and differential deviations in specific neural populations' activities ( Figure 6). Interestingly, we found that for the same composition of 5-HT neuronal type (I or II), the subsets of distinguishable DRN-VTA circuits were different between reward and punishment tasks ( Figure 6A). Future experimental work could verify this model prediction.
As neuromodulators can selectively regulate degenerate neural circuits (Marder et al., 2014;Cropper et al., 2016), our theoretical work showed that even if the neuromodulator sources vary in physical and functional forms, the effects of neuromodulation on targeted brain areas may remain the same. For example, the comodulatory effects of DA and 5-HT on prefrontal cortical rhythms (Wang and Wong-Lin, 2013) may be robust to specific interactions between the VTA DA and DRN 5-HT neurons. From a more general perspective, our computational modeling and analytical framework could be applied to the study of degeneracy and stability of neural circuits involving the interactions of other neuromodulators such as norepinephrine/noradrenaline e.g., Joshi et al. (2017). It should be also noted that in the development of the models, we resorted to a minimalist approach by focusing only on sufficiently simple neural circuit architectures that could replicate experimental observations. Future modeling work may investigate the relative relevance of these connections with respect to larger circuits involving cortical and subcortical brain regions across multiple scales, especially during adaptive learning e.g., Wong-Lin et al. (2017) and Zhou et al. (2018).
Taken together, our computational modeling and analytical work suggest the existence of degeneracy and stability in DRN-VTA circuits. Some of these degenerate circuits can be dynamically more stable than others, and subsets of the degenerate circuits can be distinguished through pharmacological means. Importantly, our work opens up a new avenue of investigation on the existence, robustness, and stability of neuromodulatory circuits, and has an important implication for the stable and robust maintenance of neuromodulatory functions.

Data availability statement
The original contributions presented in this study are publicly available. This data can be found here: GitHub, https://github.com/ ckbehera/degeneracy.