Neural mechanisms of economic choices in mice

Economic choices entail computing and comparing subjective values. Evidence from primates indicates that this behavior relies on the orbitofrontal cortex. Conversely, previous work in rodents provided conflicting results. Here we present a mouse model of economic choice behavior, and we show that the lateral orbital (LO) area is intimately related to the decision process. In the experiments, mice chose between different juices offered in variable amounts. Choice patterns closely resembled those measured in primates. Optogenetic inactivation of LO dramatically disrupted choices by inducing erratic changes of relative value and by increasing choice variability. Neuronal recordings revealed that different groups of cells encoded the values of individual options, the binary choice outcome and the chosen value. These groups match those previously identified in primates, except that the neuronal representation in mice is spatial (in monkeys it is good-based). Our results lay the foundations for a circuit-level analysis of economic decisions.


Introduction
Economic choice entails computing and comparing the subjective values of different goods. In primates, considerable evidence accumulated in the past 15 years indicates that these operations involve the orbitofrontal cortex (OFC). In humans, OFC neurodegeneration or dysfunction is associated with abnormal decision making (Barch et al., 2016;Bechara et al., 1994;Camille et al., 2011;Fellows, 2011;Hodges, 2001;Rahman et al., 1999;Waltz and Gold, 2016;Yu et al., 2018). Furthermore, neural signals recorded in OFC during choices correlate with subjective values (Arana et al., 2003;Chaudhry et al., 2009;Gottfried et al., 2003;Hare et al., 2008;Howard et al., 2015;Howard and Kahnt, 2017;Klein-Flugge et al., 2013;Peters and Buchel, 2009). In non-human primates, neurons in OFC appear intimately involved with the decision process. One series of studies recorded from head-fixed monkeys choosing between different juices. Relative values were inferred from choices and used to interpret neuronal activity (Padoa-Schioppa and Assad, 2006). Different groups of cells in OFC were found to encode the value of individual goods (offer value), the binary choice outcome (chosen juice) and the chosen value. These variables capture both the input (offer value) and the output (chosen juice, chosen value) of the choice process, suggesting that the groups of neurons identified in OFC constitute the building blocks of a decision circuit. Supporting this hypothesis, trial-by-trial fluctuation in neuronal activity correlates with choice variability (Padoa-Schioppa, 2013), the activity dynamics of neuronal populations in OFC reflects an internal deliberation (Rich and Wallis, 2016), and suitable electrical stimulation of OFC biases or disrupts economic decisions (Ballesta et al., In preparation). 
Complementing these empirical findings, computational work showed that neural networks whose units resemble the groups of cells identified in OFC can generate binary choices (Friedrich and Lengyel, 2016;Rustichini and Padoa-Schioppa, 2015;Solway and Botvinick, 2012;Song et al., 2017;Zhang et al., 2018). In summary, primate studies consistently implicate the OFC in the generation of economic decisions.
In comparison, the picture emerging from research in rodents is more heterogeneous. Anatomically, the primate OFC (areas 13/11) appears homologous to the rodent lateral orbital (LO) area (Ongur and Price, 2000). Consistent with this understanding, LO lesions in rodents (Gallagher et al., 1999;Gremel and Costa, 2013) and OFC lesions in primates (Izquierdo et al., 2004;Reber et al., 2017;Rudebeck and Murray, 2011;West et al., 2011) induce similar effects on goal-directed behavior under reinforcer devaluation. A series of studies found evidence consistent with a neuronal representation of value and other decision variables in LO (Feierstein et al., 2006;Hirokawa et al., 2017;Roitman and Roitman, 2010;Sul et al., 2010;van Duuren et al., 2008;van Duuren et al., 2009;Zhou et al., 2019). For example, van Duuren and colleagues showed that neurons in LO reflect both the magnitude and the probability of upcoming rewards. As a caveat, these experiments did not include real economic choices or trade-offs between competing dimensions. Other experiments showed that LO lesions disrupted choices, but some results have been contradictory (Mobini et al., 2002;Rudebeck et al., 2006;Winstanley et al., 2004). Concurrently, several studies cast doubt on the notion that neurons in LO encode economic values (McDannald et al., 2014;Roesch et al., 2006), or that values represented in this area affect economic decisions (Miller et al., 2018;Stott and Redish, 2014). Again, an important caveat is that these experiments did not include any dimensional trade-off. However, in one recent study, rats performed a juice choice task similar to that used in primates (Gardner et al., 2017). Surprisingly, the authors found that optogenetic inactivation of LO did not disrupt choices.
This result raised the possibility that economic decisions operate in fundamentally different ways in rodents and primates, or perhaps that seemingly minor differences in task design induce different decision mechanisms (see Discussion).
Assessing whether economic decisions in primates and rodents are supported by homologous brain areas is valuable from a comparative perspective. More importantly, many aspects of the decision circuit identified in non-human primates and discussed above remain poorly understood. For example, it is unclear whether different groups of neurons identified in relation to behavior correspond to different anatomical cell types, or whether these neurons reside in different cortical layers. It is also unclear whether these groups of neurons are differentially connected with each other and with other brain regions. Addressing these questions in monkeys is technically difficult. However, these issues can in principle be addressed using genetic tools available in mice. For this reason, establishing a credible mouse model to investigate this decision circuit is potentially transformative.
Here we present such a model. More specifically, the study describes three primary results. First, we developed a mouse version of the juice choice task. In the experiments, head-fixed mice chose between two liquid rewards (juices) offered in variable amounts. Each juice was associated with an odor, and the odor concentration indicated the juice quantity. In each trial, the two odors indicating the offers were presented simultaneously from two directions, and the animal revealed its choice by licking one of two liquid spouts. Apart from using olfactory stimuli instead of visual stimuli, this task closely resembled that used for monkeys (Padoa-Schioppa and Assad, 2006). Mice learned the task rapidly and generally exhibited choice patterns similar to those measured in primates. Second, we used optogenetics to examine the effects of LO inactivation on choices. Inactivation was induced by optically activating GABAergic interneurons expressing channelrhodopsin-2 (ChR2). LO inactivation severely disrupted the decision process, in the sense that it altered the relative values of the two juices and consistently increased choice variability. Hence, economic decisions in mice require area LO. Third, we recorded and analyzed the spiking activity of neurons in LO. Different groups of cells encoded the value of individual offers, the binary choice outcome and the chosen value. In this respect, neuronal responses in LO closely resembled those recorded in the primate OFC. The main difference between the two species was that neurons in LO represented offers and values in a spatial frame of reference, whereas the representation in the primate OFC was good-based (Padoa-Schioppa and Assad, 2006). Furthermore, neurons in LO preferentially encoded the value offered on the ipsi-lateral side, suggesting that economic decisions in mice ultimately involved a competition between the two hemispheres.
These results address outstanding questions and establish a new and powerful approach to study the neural circuit underlying economic decisions.

Economic choice behavior in mice
We developed a behavioral paradigm similar to that previously used for monkeys. In essence, we let mice choose between two liquid rewards (juices) offered in variable amounts. During the experiments, mice were head-fixed, and two liquid spouts were placed close to their mouth, on the two sides. For each juice, the offered quantity varied from trial to trial. A key aspect of the experiment was to effectively communicate to the animal the two options available on any given trial. We represented offers using olfactory stimuli because mice can easily learn to make subtle olfactory discriminations (Wachowiak et al., 2009). We used odor identity to represent a particular juice type, and odor concentrations to represent juice quantity. In each trial, the animal was presented with two odors from the two directions (left, right). The odors, representing the offers, were presented for 2.8 s (offer period), at the end of which the animal heard an auditory 'go' signal. The animal indicated its choice by licking one of the two liquid spouts, and the corresponding juice was delivered immediately thereafter (Fig.1a). Throughout the experiments, we used 5 odor concentrations, corresponding to 5 quantity levels for each juice. Juice quantities varied on a linear scale, while odor concentrations varied roughly linearly on a log2 scale. Juice quantities and left/right positions varied pseudo-randomly from trial to trial (see Methods).
Animals' choices reliably presented a quality-quantity trade-off. Fig.1b-f illustrate the behavior observed in five representative sessions. We refer to the two juices as A and B, with A preferred. If the two juices were offered in equal amounts (1B:1A), the animal would reliably choose juice A (by definition). However, if juice B was offered in sufficiently large amount against 1A, the animal chose B. For example, in Fig.1f, the mouse was roughly indifferent between 1A and 2B. For a quantitative analysis, we ran a logistic regression (see Methods, Eq.1). The logistic fit provided measures for the relative value of the two juices (ρ) and for the sigmoid steepness (η), which is inversely related to choice variability. For example, for the session in Fig.1f, we measured ρ = 2.2 and η = 2.1.
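As an illustration of how a relative value and a sigmoid steepness can be obtained from choices, the following is a minimal sketch and not the analysis code used in the study. It assumes a log-ratio parameterization, P(choose B) = 1/(1 + exp(-X)) with X = a0 + a1 log(qB/qA), from which ρ = exp(-a0/a1) and η = a1; the exact form of Eq.1 is given in Methods and may differ. The function name, the simulated session and the quantity range 1-5 are hypothetical.

```python
import math
import random

def fit_choice_sigmoid(q_a, q_b, chose_b, n_iter=25):
    """Fit P(choose B) = 1 / (1 + exp(-(a0 + a1 * log(qB / qA)))) by Newton's method.

    Assumed parameterization (not taken from the paper's Methods):
    eta = a1 (steepness) and rho = exp(-a0 / a1) (relative value).
    """
    x = [math.log(b / a) for a, b in zip(q_a, q_b)]
    a0, a1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, chose_b):
            p = 1.0 / (1.0 + math.exp(-(a0 + a1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a0 += (h11 * g0 - h01 * g1) / det
        a1 += (h00 * g1 - h01 * g0) / det
    return math.exp(-a0 / a1), a1

# Simulated session (made-up data): both quantities drawn from 1..5.
rng = random.Random(0)
true_rho, true_eta = 2.2, 2.1
q_a, q_b, chose_b = [], [], []
for _ in range(5000):
    a, b = rng.randint(1, 5), rng.randint(1, 5)
    x_true = true_eta * (math.log(b / a) - math.log(true_rho))
    p_b = 1.0 / (1.0 + math.exp(-x_true))
    q_a.append(a)
    q_b.append(b)
    chose_b.append(1 if rng.random() < p_b else 0)

rho_hat, eta_hat = fit_choice_sigmoid(q_a, q_b, chose_b)
print(f"rho = {rho_hat:.2f}, eta = {eta_hat:.2f}")
```

Under this parameterization the animal is indifferent when qB/qA = ρ, so ρ reads directly as the quantity of B that is worth one unit of A. Trials in which only one juice is offered would need separate handling, because the log ratio is undefined when a quantity is zero.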
The present study is based on 19 mice (see Methods) and a total of 335 sessions. Animals typically performed for 250-400 trials per session. Fig.2 illustrates the whole behavioral data set (excluding trials with optogenetic inactivation; see below). Across sessions and across mice, relative values varied broadly (mean(ρ) = 2.39; Fig.2a). Similarly, the sigmoid steepness varied from session to session (mean(η) = 1.19; Fig.2b). In general, ρ>1 implies that choices are based on values that integrate juice type and juice quantity. In this sense, mice reliably presented non-trivial choice patterns, as previously observed for monkeys (Padoa-Schioppa and Assad, 2006). Conversely, the sigmoid steepness measured in mice was generally lower (higher choice variability) than that recorded in monkeys.
In most of our experiments, higher odor concentrations represented larger juice amounts. One concern was whether low odor concentrations were hard to discern, and whether choices were ultimately dictated by perceptual ambiguity. One argument against this hypothesis follows from the fact that we observed similar choice patterns independently of the pairing between odors and juice type. For an additional control, we trained two animals in a "flipped" version of the task, in which higher odor concentrations represented smaller juice amounts. These two mice learned the task and eventually performed very similarly to the other animals (Fig.1g). More specifically, the relative value and sigmoid steepness measured for these two mice were statistically indistinguishable from those measured for the other 17 mice (p = 0.12 and 0.15 for ρ and η, respectively; t test; Fig.2, asterisks).

Inactivation of area LO disrupts economic decisions
To shed light on the role of LO in economic decisions, we examined how inactivation of this area affects choices. In a series of experiments, we inactivated LO by optically activating GABAergic interneurons expressing ChR2. We chose this protocol because exciting inhibitory cells often induces a more complete shut-down of projection neurons (Wiegert et al., 2017;Zhao et al., 2011).
First, we tested the effects of LO inactivation in two VGAT-ChR2 mice, which (in principle) express ChR2 in all GABAergic cells throughout the brain. In each session, inactivation and control trials were pseudo-randomly interleaved. Inactivation was induced by shining blue light bilaterally in area LO during the offer and choice periods (see Methods). This manipulation consistently disrupted choices. Across sessions, LO inactivation altered the relative value in a seemingly erratic way, and consistently reduced the sigmoid steepness (i.e., it increased choice variability). These effects were quantified with logistic analyses (Methods, Eq.5), which provided measures of relative value (ρstim OFF, ρstim ON) and sigmoid steepness (ηstim OFF, ηstim ON). In both mice, the distribution of relative values measured across sessions was significantly broader under LO inactivation than in normal conditions (both p<10^-5, F test for equality of variance; Fig.3a; Fig.4ab). Conversely, the sigmoid steepness was consistently lower under LO inactivation than in normal conditions (both p<10^-4, paired t test; Fig.3a; Fig.5ab).
These results suggested that decisions relied on the neuronal activity in area LO. One concern was the extent of the inactivated region. Since VGAT-ChR2 mice express ChR2 in interneurons throughout the brain, and since the light spreads to some extent through the brain, the behavioral effects described above could in principle be due to the inactivation of neighboring areas such as the olfactory bulb or the secondary motor cortex. To address this issue, we conducted a second experiment. In this case, we tested PV-Cre mice, in which we injected AAV-DIO-ChR2 specifically in area LO (Fig.3g). PV neurons are inhibitory cells that target the perisomatic domain of local pyramidal cells (Tremblay et al., 2016). In this preparation, optical stimulation activated PV neurons exclusively in the injected region, resulting in specific inactivation of area LO. We repeated optogenetic inactivation experiments in five PV-Cre mice. Confirming our initial results, in each mouse the distribution of relative values was significantly broader under LO inactivation than in normal conditions (all p<0.002, F test for equality of variance; Fig.3; Fig.4c-h). Furthermore, in each mouse, the sigmoid steepness was consistently lower under LO inactivation than in normal conditions (all p<10^-3, paired t test; Fig.3; Fig.5c-h).
Another concern was the fact that area LO is not far from the eyes. Thus the blue light shone during stimulation trials might be seen by the mouse. In principle, such visual input could distract the animal or otherwise interfere with its choosing. To address this issue, we conducted a control experiment, in which we repeated the same light stimulation protocol of area LO in three mice that did not express ChR2 (see Methods). In this case, the stimulation did not affect choices in any appreciable way. In particular, the distributions of relative values under stimulation were indistinguishable from those measured without stimulation (all p>0.2, F test for equality of variance; Fig.4i-l). Similarly, the sigmoid steepness was generally indistinguishable from that measured without stimulation (all p>0.09, paired t test; Fig.5i-l).
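The two session-level comparisons used in this section (an F test for equality of variance on relative values, and a paired t test on sigmoid steepness) reduce to simple statistics. The sketch below computes only the test statistics and their degrees of freedom; p-values would then be read from the F and t distributions, for example through a statistics library. The function names and the numbers are made up for illustration and are not data from the study.

```python
import math
import statistics

def variance_ratio_f(x_on, x_off):
    """F statistic for equality of variance: sample variance ON / sample variance OFF."""
    f = statistics.variance(x_on) / statistics.variance(x_off)
    return f, len(x_on) - 1, len(x_off) - 1

def paired_t(x_on, x_off):
    """Paired t statistic on within-session differences (ON minus OFF)."""
    d = [a - b for a, b in zip(x_on, x_off)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

# Hypothetical per-session estimates for one mouse (illustrative values only).
rho_off = [2.1, 2.3, 2.0, 2.4, 2.2, 2.5]
rho_on = [1.2, 3.6, 0.9, 4.1, 2.8, 1.5]
eta_off = [1.3, 1.1, 1.4, 1.2, 1.0, 1.5]
eta_on = [0.6, 0.7, 0.9, 0.5, 0.6, 0.8]

f_stat, df_on, df_off = variance_ratio_f(rho_on, rho_off)
t_stat, df_t = paired_t(eta_on, eta_off)
print(f"F({df_on},{df_off}) = {f_stat:.1f}; t({df_t}) = {t_stat:.2f}")
```

A variance ratio well above 1 corresponds to a broader distribution of relative values under stimulation, and a negative t corresponds to shallower sigmoids (higher choice variability) under stimulation.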

Reversion to stereotyped behavior
We investigated more specifically possible ways in which LO inactivation might disrupt decisions. In general, choices can be influenced by other factors besides the juice types and juice quantities. For example, previous work in unrestrained pigeons and rats found consistent side biases (Kagel et al., 1995). Similarly, other things equal, monkeys tend to choose on any given trial the same juice chosen in the previous trial (choice hysteresis) (Padoa-Schioppa, 2013). Here we examined three possible sources of choice biases related to the spatial configuration of the offers (side bias) and to the outcome of the previous trial (choice hysteresis, direction hysteresis). Each effect was examined separately and with a logistic analysis (see Methods, Eqs.2-4).
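One way to picture such an analysis is to add a single bias regressor to the logistic model of choices. The sketch below is an assumption-laden illustration, not the authors' Eqs.2-4: it takes X = a0 + a1 log(qB/qA) + a2 s, where s = +1 if juice B is offered on the right and s = -1 otherwise, and reads the side bias as ε = a2. The coding of s, the sign convention, and all function names are hypothetical. The same template would apply to the two hysteresis terms by replacing s with a regressor built from the previous trial.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_logistic(rows, y, n_iter=25):
    """Maximum-likelihood logistic regression by Newton's method."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        g = [0.0] * k
        H = [[0.0] * k for _ in range(k)]
        for xr, yi in zip(rows, y):
            z = sum(b * v for b, v in zip(beta, xr))
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            for i in range(k):
                g[i] += (yi - p) * xr[i]
                for j in range(k):
                    H[i][j] += w * xr[i] * xr[j]
        beta = [b + s for b, s in zip(beta, solve(H, g))]
    return beta

# Simulated session with a built-in side bias (made-up data).
rng = random.Random(1)
true_rho, true_eta, true_eps = 2.0, 1.5, -0.66
rows, chose_b = [], []
for _ in range(4000):
    a, b = rng.randint(1, 5), rng.randint(1, 5)
    s = rng.choice([-1.0, 1.0])          # hypothetical coding: +1 if B is on the right
    x = math.log(b / a)
    z = true_eta * (x - math.log(true_rho)) + true_eps * s
    rows.append([1.0, x, s])
    chose_b.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-z)) else 0)

a0, a1, a2 = fit_logistic(rows, chose_b)
rho_hat, eta_hat, eps_hat = math.exp(-a0 / a1), a1, a2
print(f"rho = {rho_hat:.2f}, eta = {eta_hat:.2f}, epsilon = {eps_hat:.2f}")
```

Fitting each bias term in its own model, rather than all three at once, mirrors the statement that each effect was examined separately.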
We first examined choice biases under normal conditions. Fig.S1 illustrates the results for one representative session. In this session, the animal presented a sizeable side bias (ε = -0.66; Fig.S1b), negligible choice hysteresis (ξ = 0.13; Fig.S1c), and some direction hysteresis (θ = 0.42; Fig.S1d). Similar results held across sessions and across animals. In any given session, mice could present some bias favoring either the left or the right option. However, the direction of the side bias varied across sessions and across animals (Fig.2c), indicating that side biases did not reflect asymmetry in the experimental apparatus. Choice hysteresis was generally low (Fig.2d), and mice presented a small but consistent direction hysteresis (Fig.2e). In summary, under normal conditions, choice biases were relatively modest, as choices were dominated by the trade-off between juice type and juice amount.
We next examined how LO inactivation affected choice biases. We found that optical stimulation significantly increased the side bias in 5 of 7 mice (all p<0.05, F test for equality of variance), an effect not observed in any of the control animals (Fig.S2). Similarly, LO inactivation significantly increased choice hysteresis in 5 of 7 mice (all p<0.01, F test for equality of variance; Fig.S3). Finally, LO inactivation significantly increased direction hysteresis in 5 of 7 animals (all p<0.05, F test for equality of variance; Fig.S4). Interestingly, in some sessions, choice hysteresis (ξ) and direction hysteresis (θ) became negative under optical stimulation, indicating that LO inactivation induced choice alternation rather than choice repetition.
In summary, LO inactivation reduced performance by introducing a variety of choice biases. Normally, economic decisions take place through the computation and comparison of subjective values. Absent LO, animals seem to revert to stereotyped behaviors, whereby choices are dictated by the spatial location (side bias) or by the recent history (hysteresis).

Neuronal activity in area LO during economic decisions
The behavioral effects observed under LO inactivation reveal that this area is necessary for economic decisions. In another set of experiments, we examined the spiking activity of neurons in LO while mice performed the choice task. One of our aims was to compare the results to those previously obtained for central OFC in monkeys. We recorded the activity of 717 cells from 8 mice. Of these cells, 197 and 520 were from left and right hemispheres, respectively. We pooled data from different mice and analyzed them similarly to how we analyzed data from monkeys (see Methods). Specifically, we defined five time windows aligned with the beginning of offer presentation and with juice delivery (see Methods). A preliminary assessment indicated that neurons in LO were modulated by the spatial contingencies of the choice task. In the analysis, an "offer type" was defined by two quantities of juices A and B; a "trial type" was defined by an offer type, a spatial configuration of the offers, and a choice. For each time window and for each trial type, we averaged spike counts across trials. A "neuronal response" was defined as the firing rate of one cell in one time window, as a function of the trial type. Thus the response seemed to encode the variable offer value ipsi. Similarly, the response in Fig.6b, recorded in the right hemisphere, seemed to encode the variable offer value ipsi. The response in Fig.6c was nearly binary. The firing rate was high when the animal chose the offer on the right, and it was low when the animal chose the offer on the left, independent of the chosen juice and the chosen quantity. Thus the response seemed to encode the variable chosen side. Fig.6d shows another neuronal response encoding the chosen side. The response in Fig.6e seemed to encode the variable chosen value. Its activity increased as a function of the value chosen by the animal, independent of the juice type or the chosen side. 
Finally, the response in Fig.6f seemed to encode the variable position of A. Its activity was roughly binary: high when juice A was offered on the left and low when juice A was offered on the right, independent of the offered quantities and of the animal's choice.
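The definitions of "trial type" and "neuronal response" given above can be sketched as a simple grouping operation. This is a minimal illustration under assumptions: the trial record layout, the field names and the window duration are hypothetical, and the actual time windows are defined in Methods.

```python
from collections import defaultdict

def neuronal_response(trials, window_s):
    """Mean firing rate (spikes/s) per trial type, for one cell in one time window.

    A trial type is keyed by the offer type (q_a, q_b), the spatial
    configuration (side on which juice A is offered) and the chosen side.
    """
    counts = defaultdict(list)
    for t in trials:
        key = (t["q_a"], t["q_b"], t["a_side"], t["chosen_side"])
        counts[key].append(t["spike_count"])
    return {k: sum(v) / len(v) / window_s for k, v in counts.items()}

# Made-up spike counts for three trials in a hypothetical 0.5 s window.
trials = [
    {"q_a": 1, "q_b": 2, "a_side": "left", "chosen_side": "left", "spike_count": 4},
    {"q_a": 1, "q_b": 2, "a_side": "left", "chosen_side": "left", "spike_count": 6},
    {"q_a": 1, "q_b": 2, "a_side": "right", "chosen_side": "left", "spike_count": 1},
]
response = neuronal_response(trials, window_s=0.5)
for trial_type, rate in sorted(response.items()):
    print(trial_type, rate)
```

Keeping the spatial configuration and the chosen side in the key is what allows spatial variables such as offer value ipsi or chosen side to be distinguished from juice-based variables in later analyses.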
For a quantitative analysis of the whole data set, we proceeded in steps. First, the activity of each neuron in each time window was examined with a 3-way ANOVA (factors: offer type × position of A × chosen side). This analysis confirmed that many cells were modulated by the offer type, the spatial configuration of the offers, and/or the movement direction (Table S1). We also conducted a 1-way ANOVA with factor trial type (which recapitulates information about the offers, their spatial locations and the chosen side). We imposed a significance threshold p<0.001. In total, 565 responses from 301 cells satisfied this criterion and were included in subsequent analyses.
Second, we defined numerous variables neurons in LO could conceivably encode. These included variables associated with individual juices (offer value A, offer value B, chosen value A, chosen value B, chosen juice), variables associated with spatial locations (offer value ipsi, offer value contra, chosen value ipsi, chosen value contra, chosen side), the variable position of A capturing the spatial configuration of the offers, and the variable chosen value. Of note, variables associated with spatial locations may be defined in Euclidean space (e.g., offer value left, offer value right; allocentric representation) or in relation to the recording hemisphere (e.g., offer value ipsi, offer value contra; egocentric representation). If the internal representation was in Euclidean space, cells encoding the offer value left and cells encoding the offer value right should be found in roughly equal proportions in each hemisphere. In contrast, preliminary observations revealed that offer value responses most often encoded the value presented on the ipsi-lateral side. Thus spatial variables included in the analysis were defined in relation to the recording hemisphere (Table S2).
Each neuronal response was separately regressed against each variable. Variables that provided a significantly non-zero slope (p<0.05) were said to "explain" the response. We then generated two population plots. Fig.7a illustrates for each time window the number of responses explained by each variable. Since responses could be explained by more than one variable, each response may contribute to multiple bins in this plot. For each response, we also identified the variable that provided the best explanation (highest R^2). Fig.7b illustrates the population results for this analysis. In this case, each response contributes at most to one bin. Fig.7b reveals that few variables were most effective in explaining neuronal responses. To identify a small number of variables that best accounted for the whole data set, we used a stepwise procedure and a best-subset procedure. In the stepwise procedure, we imposed that the marginal explanatory power of each selected variable be at least 5% (see Methods). In the first four iterations, the procedure selected variables offer value ipsi, chosen side, position of A and chosen value, and variables selected in subsequent iterations did not meet the 5% criterion (Fig.8a). Together, these four variables explained 457 responses, corresponding to 91% of the responses collectively explained by the 12 variables (Fig.8b). The best-subset procedure confirmed this result, showing that the selected variables formed the best possible subset of four variables (Fig.8c). We concluded that neurons in LO encode variables offer value ipsi, chosen side, position of A and chosen value.
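The logic of this variable-selection analysis can be sketched as follows. The code is an illustration under stated assumptions, not the procedure specified in Methods: significance testing of the slope is omitted, and the stepwise rule is assumed to be a greedy one that at each iteration takes the variable explaining the most not-yet-explained responses and stops when that gain falls below 5% of a reference number of responses. All names and numbers are hypothetical.

```python
def r_squared(x, y):
    """R^2 of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def best_variable(response, variables):
    """Name of the candidate variable with the highest R^2 for one response."""
    return max(variables, key=lambda name: r_squared(variables[name], response))

def stepwise_select(explained, n_responses, min_marginal=0.05):
    """Greedy selection over sets of explained response ids (assumed stopping rule)."""
    remaining = dict(explained)
    covered, selected = set(), []
    while remaining:
        name = max(remaining, key=lambda v: len(remaining[v] - covered))
        gain = len(remaining[name] - covered)
        if gain < min_marginal * n_responses:
            break
        selected.append(name)
        covered |= remaining.pop(name)
    return selected, covered

# One made-up response over four trial types, and two candidate variables.
variables = {"offer value ipsi": [1, 2, 3, 4], "chosen side": [0, 1, 0, 1]}
best = best_variable([2.1, 3.9, 6.2, 8.0], variables)

# Made-up sets of response ids explained by each variable.
explained = {
    "offer value ipsi": set(range(0, 40)),
    "chosen side": set(range(30, 70)),
    "chosen value": set(range(65, 90)),
    "chosen juice": set(range(88, 91)),
}
selected, covered = stepwise_select(explained, n_responses=100)
print(best, selected, len(covered))
```

In this toy example the last variable adds only one new response, so it fails the 5% criterion and is not selected. A best-subset check would instead enumerate all subsets of a given size and keep the one that explains the most responses.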

Discussion
We presented three main results. First, we developed a mouse model of economic choice behavior. The task was very similar to that used in monkey studies, as animals chose between different juices offered in variable amounts. Choice patterns were comparable to those measured for monkeys, although choice variability was generally higher. Of note, mice who had not even experienced head-fixation learned the task within a few weeks. Second, we showed that economic decisions in mice depend on area LO. Specifically, optogenetic inactivation of LO induced erratic changes of relative value and consistently increased choice variability. Third, we showed that neurons in LO encode different variables reflecting the input (offer values) and output (choice outcome, chosen value) of the decision process. This neural representation closely resembles that previously identified in primates, except that the reference frame in the mouse LO was spatial while that in the monkey OFC was good-based. We next elaborate on each of these results.

LO is necessary for economic decisions
Previous work found that lesion or inactivation of orbital cortex in primates and rodents disrupts performance under reinforcer devaluation. This observation is often interpreted as relevant to neuroeconomics, under the assumption that values driving goal-directed behavior are equivalent to values driving economic decisions (O'Doherty, 2014;Padoa-Schioppa and Schoenbaum, 2015). Our results demonstrate more directly that economic decisions critically depend on the orbital cortex.
Interestingly, our results on the effects of LO inactivation stand in contrast to those of a recent study that failed to disrupt economic decisions through optogenetic inactivation of area LO in rats (Gardner et al., 2017). The discordance is striking because the choice task used in the other study is very similar to ours. Furthermore, as a positive control, the other study reported that LO inactivation affected performance under reinforcer devaluation. Several considerations are in order. In itself, their failure to disrupt economic decisions is not particularly informative. Viral infection is typically more reliable in mice than rats (Witten et al., 2011). Furthermore, inactivation through ChR2 stimulation of interneurons is often more effective than inactivation through halo-rhodopsin (Raimondo et al., 2012;Wiegert et al., 2017), which they used. Thus the most cogent questions pertain to their positive control. In this respect, two elements seem most relevant. First, their experiments were not temporally counterbalanced. All their rats were initially trained in the economic choice task, and tested under LO inactivation. Subsequently, animals were trained in the reinforcer devaluation task, and tested under LO inactivation. Viral injections were performed after the initial training. Since viral infection takes time (Witten et al., 2011), LO inactivation was almost certainly more effective in reinforcer devaluation experiments than in economic choice experiments. Second, a close observation of their behavioral data reveals that in the last 3 training sessions, the performance of the experimental group was higher than that of the control group (their Fig.S2a). If data from these three sessions were combined, the difference between the two groups of rats would presumably be comparable to the difference observed under LO inactivation (their Fig.S2c). 
These various factors, possibly in addition to subtle differences in task design (Gardner et al., 2017), can explain the discrepancy between the two studies.

Representation of subjective values in LO
Our recordings revealed that neurons in LO encoded different variables intimately related to the decision process. Specifically, neurons encoded the spatial configuration of the offers (position of A), the decision input (offer value ipsi), and the decision output (chosen side, chosen value). Comparing our results to those of monkey studies, there are notable similarities and interesting differences. On the one hand, three variables represented in LO (offer value ipsi, chosen side and chosen value) are analogous or identical to those previously identified in the primate OFC (offer value, chosen juice and chosen value). On the other hand, a major difference is that input and output in LO are represented in a spatial frame of reference. In contrast, in the primate OFC, input and output are defined in a non-spatial, good-based reference frame (Padoa-Schioppa, 2011).
The finding that neurons in LO represent options and values in a spatial reference frame confirms previous reports (Feierstein et al., 2006;Roesch et al., 2006). However, a striking aspect of our results is that LO cells in our experiments represented values offered on the ipsilateral side. The majority of cortical representations are entirely or preferentially contra-lateral. One exception is the olfactory system, as sensory neurons in the olfactory epithelium send their axonal projections ipsi-laterally to the olfactory bulb (de Olmos et al., 1978;Royet and Plailly, 2004). Offers in our task were represented by olfactory stimuli. Thus one possibility is that the ipsi-laterality found in LO was "inherited" from the spatial representation in the olfactory bulb. Alternatively, ipsi-laterality in LO might be "endogenous". In other words, the encoding of offer value might be ipsi-lateral even if offers are represented by visual stimuli. Future research should examine this intriguing issue.

Two groups of responses identified in LO encoded variables offer value ipsi and chosen value.
Importantly, these variables reflect the subjective values of the options and integrate the two dimensions varied in our experiments, namely juice type and quantity. Our result matches a large body of work in human and non-human primates. It also confirms and extends previous observations in rodents (Gremel and Costa, 2013;Hirokawa et al., 2017;Roitman and Roitman, 2010;Sul et al., 2010;van Duuren et al., 2009;Zhou et al., 2019). Interestingly, one earlier study in rats failed to find any systematic relation between the effects induced by changes in reward quantity and those induced by changes in delay (Roesch et al., 2006). The reasons for their negative result are not clear, but several factors might contribute. First, in the earlier study, most of the analysis focused on the time window following reward delivery. However, we found that value-encoding responses are most prominent immediately after the offer, and that neuronal responses at juice delivery mostly represent the binary choice outcome. Second, in the earlier study, reduced activity at juice delivery in long delay trials might be due to the animal's inability to predict the reward timing (because of the long delay), as opposed to temporal discounting per se. In fact, at the time of delivery, the subjective value of the juice should no longer depend on the preceding delay. Third, in their study, the two dimensions were never manipulated at the same time. In fact, trials were blocked, both dimensions were fixed within a block, and firing rates were compared across blocks. If the value representation in LO is range adapting as is the value representation in the primate OFC (Cox and Kable, 2014;Padoa-Schioppa, 2009), neurons in the blocked design might appear untuned. Thus range adaptation might explain why varying the reward quantity had only a modest effect on neuronal activity. 
In conclusion, future work should re-examine the representation of temporally discounted values in area LO with more suitable task design and data analysis.

Mechanisms of economic decisions in mice
As noted above, the variables represented in the mouse LO closely resembled those encoded in the primate OFC, except for the fact that they were defined in a spatial reference frame. Modeling work has shown that the three groups of cells identified in the primate OFC are computationally sufficient to generate binary choices, suggesting that economic decisions are formed in a neural circuit within this area (Padoa-Schioppa and Conen, 2017). Importantly, current models may be formulated equally well in spatial terms. In other words, neurons encoding the variables identified here are also sufficient to generate binary choices, indicating that economic decisions in mice could be formed within area LO. With this premise, one aspect of our results is noteworthy.
In the primate OFC, neurons encoding different variables, including offer value A and offer value B, are found in close proximity to each other. This salt-and-pepper distribution suggests that the competition between values happens throughout the OFC and in both hemispheres. In contrast, in the mouse LO, neurons in each hemisphere predominantly represent the value of the good offered on the ipsi-lateral side. This lateralized distribution seems to imply that decisions in our task ultimately involve a competition between the two hemispheres, as previously observed in other domains (Asanuma and Okuda, 1962;Bloom and Hynd, 2005;Cazzoli et al., 2009;Ferbert et al., 1992;Forss et al., 1999;Hilgetag et al., 2001;Palmer et al., 2012;van der Knaap and van der Ham, 2011). Future work should test this hypothesis more directly, and assess whether the same lateralized organization holds when binary decisions are made in a spatially richer environment (e.g., with freely moving animals).
In conclusion, we established a genetically tractable animal model of economic choice behavior, and we demonstrated a clear homology between the mouse LO and the primate central OFC.
We gathered strong evidence that economic decisions depend on LO. We also showed that neurons in LO represent the input and the output of the choice process, suggesting that decisions emerge from a neural circuit within this area. With respect to the decision mechanisms, the main difference between mice and primates appears to be that decision variables in LO are represented in a spatial reference frame. The fact that neurons in each hemisphere predominantly represent the value offered on one side suggests that economic decisions in mice involve a competition between the two hemispheres.

Animals and surgical procedures
This study reports on 19 mice of different strains, including B6 (N=2, Jackson Laboratory, stock #000664), PV-Cre knock-in (N=7, Jackson Laboratory, stock #008069) and VGAT-ChR2 (N=10, Jackson Laboratory, stock #014548). Both male and female animals were used for neuronal recordings and optogenetic inactivation of area LO. All mice were >10 weeks old at the time of the experiments. Animals were housed individually and the experiments were conducted in the dark phase of a 12-hr light/dark cycle. Mice were under water restriction. On testing days, they had access to water or sucrose water only during the experiments. All experimental procedures conformed to the NIH Guide for the Care and Use of Laboratory Animals and were approved by the Institutional Animal Care and Use Committee (IACUC) at Washington University in St Louis.
Neuronal recordings were conducted with N=8 VGAT-ChR2 mice. The optogenetic inactivation experiments were conducted on N=2 VGAT-ChR2 mice and on N=5 PV-Cre mice with AAV-DIO-ChR2 injections. The control group included N=2 B6 mice (no injection) and N=1 PV-Cre mouse injected with saline. The same cannulas were implanted in non-injected and in injected mice, and surgery damage was very similar in the two groups.
All surgeries were conducted under general anesthesia induced with Isoflurane, alone or in combination with Ketamine. A titanium head plate implanted on the bregma was used to restrain the animal and as a landmark for the neuronal recordings. The craniotomy, slightly larger than the target recording area, typically spanned 2.0-3.5 mm anterior to the bregma and 0.5-1.8 mm lateral to the midline. For the optogenetic experiments, we implanted cannulas with an optical fiber (200 µm core diameter) bilaterally at 2.7 mm anterior, 1.4 mm lateral. The tip of the cannula was placed at 1.1 mm ventral to the brain surface. In relevant experiments, adeno-associated virus (AAV.EF1a.DIO.hChR2(H134R)-eYFP.WPRE.hGH; Addgene 20298) or saline was injected at 2.7 mm anterior, 1.4 mm lateral, 1.8 mm ventral.

Economic choice task
We designed the choice task to resemble as much as possible the task used for monkeys (Padoa-Schioppa and Assad, 2006). During the experiment, the mouse was placed in a plastic tube with the head fixed. Two odor delivery systems and two liquid spouts were placed symmetrically on the left and on the right of the animal's head, and close to the mouth. In each session, the animal chose between two liquid rewards ("juices") offered in variable amounts, delivered from left and right lick ports. On any trial, the offered juice types were signaled with different odors, and juice amounts were signaled by the corresponding odor concentration. Before each trial, a vacuum sweep removed the odors remaining from the previous trial. Immediately thereafter, the two odors (the offers) were presented simultaneously from two directions (left, right). The odor presentation lasted for 2.8 s, after which the animal indicated its choice by licking one of the two spouts. The response period started with an auditory 'go' cue, and licking before the go cue was disregarded. Licking was detected by two photodiodes located posterior to the liquid spouts. If the animal did not respond within 4 s, the trial was aborted (Fig.1). In forced choices, where only one juice was offered, trials in which the animal licked the wrong spout were considered errors. Such cases were almost nonexistent in monkey experiments, but occurred sometimes in mouse experiments (Fig.1). On error trials, we aborted the trial and repeated the same offer in the subsequent trial.
Throughout the experiments, juices were water and 12% sucrose water. Odors were mint and 4-Pentenoic acid. The association between the juices and the odors varied pseudo-randomly across mice. The juice amount offered to the animal varied from 0 to 6 drops (8 drops for mice #39 and #53). The amount was monotonically related to the represented odor concentration, which varied roughly linearly on a log2 scale (e.g., odor levels 1, 2, 4 and 8 ppm representing quantities 1, 2, 3 and 4 of juice). In each session, the left/right location of the two juices varied pseudo-randomly from trial to trial. Juice delivery took 150-900 ms depending on the juice amount (~150 ms per quantum). Independent of the amount, the reward period was kept constant and equal to 900 ms, to maintain a consistent trial duration. As an exception to this rule, when mice #39 and #53 received 8 drops of juice, the reward period lasted 1.2 s. Odor presentation for the next trial started 650 ms after the end of the reward period. For most mice, we used higher odor concentrations to represent larger amounts of juice. However, we also trained two mice in a "flipped" version of the task, in which higher odor concentrations represented smaller amounts of juice. Although training took longer, the two mice reached a similar level of performance in the choice task (Fig.2).
Typically, mice performed the task for ~30 min each day, during which they completed ~300 correct trials and received 0.8-1.6 ml of liquid reward. The multiplicity of offer types was fixed within each session and the offer type was randomly selected at the beginning of each trial. Sessions typically included ~40% of forced choices (20% for each juice). In forced choice trials, error trials were followed by a 3 s additional delay (with white noise). (For some sessions in mice #36 and #41, the delay lasted 5 s; in mouse #39, the delay lasted 1 s.)

Training protocol
With experience, our training protocol became more standard. Eventually, training developed in four steps. (1) Mice were trained in a direction discrimination task. We presented the odor from the left or from the right in random alternation, and we delivered the juice only when the animal licked the corresponding spout. Mice typically took ~10 days to reach 80% accuracy. (2) We introduced the association between odor concentration and juice quantity. We used the same scheme as in (1), but the odor was presented at different concentrations and coupled with different juice quantities. Mice typically took ~3 days to reach 80% accuracy. (3) We introduced the association between different odors and different juice types. Specifically, we repeated steps (1) and (2) using a second odor and a second juice type (in some cases, we experimented with 3 or 4 juice types). Mice typically took 6-10 days to reach 80% accuracy. (4) We presented mice with the full choice task, in which animals chose between the two juice types offered in variable amounts (Fig.1). We trained the animal on the choice task for at least 5 days before starting the experiments.
Of the two mice trained in the "flipped" version of the task, one (#39) was naïve while the other (#53) had already been trained in the standard task. Both mice took ~20 extra days to reach performance level.

Optical stimulation, neuronal recordings, and histology
Optical stimulation was performed with blue light (473 nm Blue DPSS Laser, Shanghai Laser). Core fibers (200 µm, Doric) were connected through two cannulas inserted bilaterally in area LO. To precisely control the stimulation timing, we used an acousto-optic device (AO modulator/shifter, Optoelectronics) and an associated RF Driver MODA110-B4-30 (Optoelectronics). To inactivate area LO, we typically used 3-9 mW intensity, 10 ms pulses and 10-33 Hz frequency. The stimulation started at the beginning of odor presentation and lasted 3.8 s (i.e., throughout the offer period plus 1 s). In most sessions, stimulation conditions (OFF or ON) varied pseudo-randomly on a trial-by-trial basis. In a subset of sessions, trials were divided in blocks of 20-30 trials. The optical inactivation experiments were conducted on N=2 VGAT-ChR2 mice (29 sessions total), N=5 PV-Cre mice injected with AAV-DIO-ChR2 in area LO (78 sessions total), and N=3 control mice (39 sessions total).
We recorded the spiking activity of individual neurons from LO of eight mice. Recording locations spanned 2.5-3.1 mm anterior, 1.0-1.7 mm lateral and 1.0-2.0 mm ventral. Extracellular activity was recorded with a 16-channel array, with or without optical fiber (Neuronexus). The array was advanced before each session. Electric signals were amplified (10,000 gain) and band-pass filtered (300 Hz-6 kHz; Neuralynx). Spikes were identified with a threshold, digitized (40 kHz; 1401, Cambridge Electronic Design), and stored to disk for off-line spike sorting (Spike2, Cambridge Electronic Design).
At the end of the recording experiments, we injected a dye (DiI) roughly at the center of the recording region. The animal was then perfused. The brain was extracted, mounted in optimal cutting temperature (OCT) compound and frozen. Subsequently, the brain was sliced (approximately 33 µm sections) with a low-temperature cryostat (Leica Biosystems) and the sections were mounted on cover glass. Sections were then examined and photographed under a fluorescence microscope (Leica DMI6000 B). Fig.S5 illustrates the reconstructed locations of recordings.

Data analysis, behavior
All the analyses of behavioral and neuronal data were conducted in Matlab (MathWorks). Behavioral choice patterns were analyzed using logistic regression. The basic logistic model was written as follows:

$$\text{choice B} = \frac{1}{1 + \exp(-X)}, \qquad X = a_0 + a_1 \log(q_B / q_A) \tag{1}$$

where choice B equals 1 if the animal chose juice B and 0 otherwise, qA and qB are the quantities of juices A and B offered to the animal, and A is the preferred juice. Forced choice trials were excluded. The logistic regression provided an estimate for parameters a0 and a1. By construction, a0<0 and a1>0. The relative value of the two juices was defined as ρ = exp(−a0/a1). In essence, in any given session, ρ was the amount of juice B that, if offered against 1A, made the animal indifferent between the two juices. The sigmoid steepness was defined as η = a1. The steepness (also termed inverse temperature) is inversely related to choice variability.
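As an illustration of the basic logistic analysis, the following is a minimal sketch in Python rather than the Matlab code used in the study. It assumes per-trial lists of offered quantities and choices, fits the two-parameter model by Newton's method, and returns ρ and η. The function names (`fit_logistic`, `relative_value_and_steepness`) and the data layout are hypothetical and chosen only for this example.

```python
import math

def fit_logistic(x, y, n_iter=50):
    """Fit P(y=1) = 1 / (1 + exp(-(a0 + a1*x))) by Newton's method.

    x: list of regressor values, y: list of 0/1 outcomes.
    Returns the maximum-likelihood estimates (a0, a1).
    """
    a0, a1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a0 + a1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # negative Hessian (2 x 2, symmetric)
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        a0 += d0
        a1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return a0, a1

def relative_value_and_steepness(q_a, q_b, chose_b):
    """Return (rho, eta) for one session.

    q_a, q_b: offered quantities of juices A and B on each trial.
    chose_b: 1 if the animal chose juice B, 0 otherwise.
    Forced choices (one quantity equal to 0) are excluded, as in the text.
    """
    x, y = [], []
    for a, b, c in zip(q_a, q_b, chose_b):
        if a > 0 and b > 0:
            x.append(math.log(b / a))
            y.append(c)
    a0, a1 = fit_logistic(x, y)
    rho = math.exp(-a0 / a1)   # relative value: indifference point
    eta = a1                   # sigmoid steepness (inverse temperature)
    return rho, eta
```

Newton's method is used here only to keep the sketch free of external dependencies; any standard logistic regression routine should give the same estimates when the choice data are not perfectly separable.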
In further analyses, we examined the possible effects on choices of three additional factors. First, the side bias was examined with the following model:

$$\text{choice B} = \frac{1}{1 + \exp(-X)}, \qquad X = a_0 + a_1 \log(q_B / q_A) + a_2 (\delta_{A,\text{right}} - \delta_{B,\text{right}}) \tag{2}$$

where δJ, right = 1 if juice J was offered on the right and 0 otherwise, and J = A, B. This logistic fit returned two sigmoid functions with the same steepness and different flex points. The side bias was quantified as ε = − a2/a1. A measure of ε>0 indicated that, other things equal, the animal tended to choose the juice offered on the right.
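To make the meaning of ε concrete, the short derivation below sets X = 0 in the side-bias model separately for the two spatial configurations. The symbols ρ with subscripts "A right" and "B right" are introduced only for this illustration and denote the indifference points of the two sigmoids; ρ = exp(−a0/a1) is computed from the coefficients of the same model.

```latex
% Juice A offered on the right: \delta_{A,right} - \delta_{B,right} = +1
0 = a_0 + a_2 + a_1 \log \rho_{A\,\mathrm{right}}
\;\Longrightarrow\;
\log \rho_{A\,\mathrm{right}} = -\frac{a_0}{a_1} - \frac{a_2}{a_1} = \log \rho + \varepsilon

% Juice B offered on the right: \delta_{A,right} - \delta_{B,right} = -1
0 = a_0 - a_2 + a_1 \log \rho_{B\,\mathrm{right}}
\;\Longrightarrow\;
\log \rho_{B\,\mathrm{right}} = -\frac{a_0}{a_1} + \frac{a_2}{a_1} = \log \rho - \varepsilon
```

Under this reading, ε>0 means that more juice B is needed to reach indifference when juice A is on the right, and less when juice B is on the right, which is consistent with a bias toward the right side. The two flex points are separated by 2ε on the log scale.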
Second, choice hysteresis (Padoa-Schioppa, 2013) was examined with the following model:

$$\text{choice B} = \frac{1}{1 + \exp(-X)}, \qquad X = a_0 + a_1 \log(q_B / q_A) + a_2 (\delta_{n-1,A} - \delta_{n-1,B}) \tag{3}$$

where δn-1, X = 1 if in the previous trial the animal chose juice X and 0 otherwise, and X = A, B. Choice hysteresis was quantified as ξ = − a2/a1. A measure of ξ >0 indicated that, other things equal, the animal tended to choose the same juice chosen in the previous trial.
Third, direction hysteresis was examined with the following model:

$$\text{choice B} = \frac{1}{1 + \exp(-X)}, \qquad X = a_0 + a_1 \log(q_B / q_A) + a_2 (\delta_{n-1,\text{pos A}} - \delta_{n-1,\text{pos B}}) \tag{4}$$

where δn-1, pos B = 1 if juice B was offered in the same spatial position as that chosen in the previous trial and 0 otherwise, and δn-1, pos A = 1 − δn-1, pos B. Direction hysteresis was quantified as θ = − a2/a1. A measure of θ >0 indicated that, other things equal, the animal tended to choose the juice offered on the same side as the side chosen in the previous trial.
All the logistic models described so far analyzed choices in the absence of optogenetic manipulations. To quantify the effects of these manipulations, we constructed additional logistic models. The basic effects of LO inhibition on the relative value and choice variability were quantified with the following model:

$$\text{choice B} = \frac{1}{1 + \exp(-X)}, \qquad X = (a_0 + a_1 \log(q_B / q_A))\, \delta_{\text{stim,OFF}} + (a_2 + a_3 \log(q_B / q_A))\, \delta_{\text{stim,ON}} \tag{5}$$

where δstim, ON = 1 in stimulation trials and 0 otherwise, and δstim, OFF = 1 − δstim, ON. In essence, Eq.5 repeats Eq.1 twice, once for trials without stimulation and once for trials with stimulation. Hence, the logistic fit returns two sigmoid functions that can differ in both their flex point and their steepness. The effects of stimulation were quantified by comparing ρstim OFF = exp(−a0/a1) and ρstim ON = exp(−a2/a3), and by comparing ηstim OFF = a1 and ηstim ON = a3.
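The stimulation model can be sketched in Python as follows, again as an illustration rather than the Matlab code used in the study. The sketch builds the four regressors of the model explicitly and fits them with a generic Newton solver. The function names (`solve`, `fit_logit`, `stim_effects`) and the input layout (per-trial log quantity ratio, a 0/1 stimulation flag, and a 0/1 choice) are assumptions made for this example.

```python
import math

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def fit_logit(rows, y, n_iter=50):
    """Maximum-likelihood logistic fit by Newton's method.

    rows: list of regressor vectors (one per trial), y: list of 0/1 outcomes.
    """
    k = len(rows[0])
    a = [0.0] * k
    for _ in range(n_iter):
        g = [0.0] * k
        H = [[0.0] * k for _ in range(k)]
        for r, yi in zip(rows, y):
            z = sum(ai * ri for ai, ri in zip(a, r))
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(k):
                g[i] += (yi - p) * r[i]
                for j in range(k):
                    H[i][j] += p * (1.0 - p) * r[i] * r[j]
        step = solve(H, g)
        a = [ai + si for ai, si in zip(a, step)]
        if sum(abs(s) for s in step) < 1e-10:
            break
    return a

def stim_effects(log_ratio, stim_on, chose_b):
    """Fit the stimulation model and return rho and eta for OFF and ON trials.

    log_ratio: log(qB/qA) per trial (forced choices already excluded).
    stim_on: 1 for stimulation trials, 0 otherwise.
    chose_b: 1 if the animal chose juice B, 0 otherwise.
    """
    # Regressors: [d_OFF, d_OFF * log(qB/qA), d_ON, d_ON * log(qB/qA)]
    rows = [[1 - s, (1 - s) * x, s, s * x] for x, s in zip(log_ratio, stim_on)]
    a0, a1, a2, a3 = fit_logit(rows, chose_b)
    return {"rho_off": math.exp(-a0 / a1), "eta_off": a1,
            "rho_on": math.exp(-a2 / a3), "eta_on": a3}
```

Because each trial activates only the OFF pair or the ON pair of regressors, the likelihood separates into two independent parts. Fitting the single four-parameter model is therefore equivalent to fitting the basic two-parameter model separately on OFF and ON trials.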

Data analysis, neuronal activity
Preliminary observations revealed that the activity of neurons in LO varied as a function of the offered and chosen juices, but also depended on the spatial contingencies of the choice task. This is unlike what we found in the primate OFC, where firing rates are independent of the spatial contingencies (Grattan and Glimcher, 2014;Padoa-Schioppa and Assad, 2006). Thus the present analysis was designed to capture the spatial components of the choice task. Apart from this aspect, our analyses closely resembled those previously conducted in monkey studies. Neuronal activity was examined in 5 time windows: pre-offer (0.6 s preceding the offer); post-offer (0.2-0.8 s after offer on); late delay (0.6-1.2 s after offer on); pre-juice (0.6 s preceding juice delivery); post-juice (0.6 s following the beginning of juice delivery). An "offer type" was defined by two juice quantities offered to the mouse, independent of the spatial configuration and the animal's choice. A "trial type" was defined by an offer type, a spatial configuration (e.g., juice A on the left), and a choice. A "neuronal response" was defined as the activity of one neuron in one time window as a function of the trial type. Sessions typically included 10-12 offer types (including forced choices), and ~30 trial types. Error trials in forced choices were excluded from the analysis. We also excluded from the analysis trial types with ≤2 trials. Neuronal responses were constructed by averaging spike counts across trials for each trial type.
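The construction of a neuronal response from single-trial spike counts can be sketched as follows. This is a minimal Python illustration with a hypothetical data layout: each trial is a tuple of offer type, spatial configuration, chosen side, and the spike count of one neuron in one time window. The function name `neuronal_response` is likewise an assumption.

```python
from collections import defaultdict

def neuronal_response(trials, min_trials=3):
    """Average spike counts across trials for each trial type.

    trials: iterable of (offer_type, position_of_A, chosen_side, spike_count).
    A trial type is the triple (offer_type, position_of_A, chosen_side).
    Trial types with fewer than min_trials trials (i.e., <= 2) are dropped.
    Error trials in forced choices are assumed to be removed beforehand.
    """
    counts = defaultdict(list)
    for offer, pos_a, side, spikes in trials:
        counts[(offer, pos_a, side)].append(spikes)
    return {tt: sum(v) / len(v) for tt, v in counts.items()
            if len(v) >= min_trials}
```

The returned dictionary maps each retained trial type to a mean spike count, which is the quantity entering the ANOVAs and regressions described in the following paragraphs.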
Our analyses aimed at identifying the variables encoded in area LO and proceeded in steps. First, the activity of each cell in each time window was examined with a 3-way ANOVA (factors offer type × position of A × chosen side), followed by a 1-way ANOVA (factor trial type). The latter was used to identify "task-related" responses. Specifically, we imposed a significance threshold of p<0.001, and neuronal responses that passed this criterion were included in subsequent analyses.
Second, we defined a series of variables that neurons in LO might potentially encode (Table S2). For each neuronal response, we performed a linear regression against each variable (separately). If the regression slope differed significantly from zero (p<0.05), the variable was said to "explain" the neuronal response. For each variable, the regression also provided an R². For variables that did not explain the neuronal response, we arbitrarily set R² = 0.
Third, we conducted population analyses to identify a small subset of variables that best accounted for our data. Based on the regressions, we identified for each response the variable that provided the best fit (highest R²). We then computed the number of responses best explained by each variable, separately for each time window (Fig.7). To identify the variables that best accounted for the whole population, we proceeded as in previous studies, using two methods of variable selection: stepwise and best subset (Glantz and Slinker, 2001;Padoa-Schioppa and Assad, 2006). The stepwise method is an iterative procedure. In the first step, we selected the variable that provided the highest number of best fits within any time window, and we removed from the data set all the responses explained by this variable. In the second step, we repeated the procedure with the residual data set. We defined the "marginal explanatory power" of a variable X as the percent of task-related responses explained by X and not explained by any other selected variable. At each step, we imposed that the marginal explanatory power of each selected variable be ≥5%. We then continued the procedure until additional variables failed to meet the 5% criterion. In contrast, the best subset method is an exhaustive procedure. For this analysis, we pooled responses from different time windows. For each possible subset of d variables, we computed the number of responses explained in the data set, and we identified the subset that explained the highest number of responses. We repeated this procedure for d = 1, 2, 3...

Table S1. Results of ANOVAs. The table reports the results of two ANOVAs. Each column represents one factor, each row represents one time window, and numbers represent the number of cells significantly modulated by the corresponding factor (p<0.001). The bottom row indicates the number of cells that pass the criterion in at least one of the 5 time windows. The 3 left-most columns report the results of a 3-way ANOVA. Notably, many cells were modulated by each of the three factors. The right-most column reports the results of a 1-way ANOVA (factor trial type). In total, 301/717 (42%) cells passed the p<0.001 criterion in at least one time window. Neuronal responses that passed this test (N = 565) were identified as task-related and included in subsequent analyses.
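The stepwise and best-subset procedures described for the population analysis can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: responses are pooled across time windows in both methods, and the 5% criterion is applied only to the candidate variable at each step (as the fraction of all responses it explains beyond those already explained). The input `r2` is a hypothetical list with one dictionary per task-related response, mapping variable names to R², where variables that do not explain the response are absent or set to 0.

```python
from itertools import combinations

def best_fit_counts(r2, responses):
    """Count, for each variable, the responses for which it has the highest R^2."""
    counts = {}
    for i in responses:
        if not r2[i]:
            continue
        var = max(r2[i], key=r2[i].get)
        if r2[i][var] > 0:
            counts[var] = counts.get(var, 0) + 1
    return counts

def stepwise(r2, min_marginal=0.05):
    """Greedy (stepwise) selection of variables.

    At each step, select the variable with the most best fits among the
    responses not yet explained, then remove all responses it explains.
    Stop when the candidate explains less than min_marginal of all responses
    beyond those already explained by previously selected variables.
    """
    n = len(r2)
    remaining = set(range(n))
    selected = []
    while remaining:
        counts = best_fit_counts(r2, remaining)
        if not counts:
            break
        var = max(counts, key=counts.get)
        newly = {i for i in remaining if r2[i].get(var, 0) > 0}
        if len(newly) / n < min_marginal:
            break
        selected.append(var)
        remaining -= newly
    return selected

def best_subset(r2, d):
    """Exhaustive search for the d variables that explain the most responses."""
    variables = sorted({v for resp in r2 for v in resp})

    def n_explained(subset):
        return sum(any(resp.get(v, 0) > 0 for v in subset) for resp in r2)

    return max(combinations(variables, d), key=n_explained)
```

The stepwise search is fast but greedy, whereas the best-subset search examines every combination of d variables. Agreement between the two, as reported for the data in the text, suggests that the selected variables do not depend on the order of selection.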