Amphetamine reduces utility encoding and stabilizes neural dynamics in rat anterior cingulate cortex

The anterior cingulate cortex (ACC) appears to support decisions by encoding the effort-reward utility of choice options. We show here that d-amphetamine (AMPH) has dose-dependent effects on this encoding and on neural dynamics in rat ACC that are concordant with its behavioral effects. Low-dose AMPH increased task engagement and had mild effects on neural encoding, whereas high doses disrupted utility signaling and decreased task engagement. The disruption involved reduced reward signaling and compressed effort-reward encoding of utility cells, which corresponded with reduced reward consumption behaviors. Furthermore, low-dose AMPH stabilized and accelerated trajectories of neural activity in state-space, whereas high-dose AMPH destabilized trajectories. We propose that low-dose AMPH increases both excitability and stability, which preserves information and accelerates evolution of a neural ‘script’ for task execution. Excessive excitability at high doses overcomes stability enhancement to suppress weakly encoded features (e.g. reward) and cause deviation from the script, which interrupts task performance.

Significance Statement

Amphetamine reduced reward signaling by individual neurons in rat prefrontal cortex, but increased the stability of ensemble dynamics. These effects account for animals’ increased task engagement, despite reduced reward intake.


Rats typically choose to exert increased effort, such as barrier climbing or lever pressing, if the associated reward is of considerably higher value than that of lower-effort choice options. [...] transmission in these areas (Cousins et al., 1996; Schweimer & Hauber, 2006; Mai et al., 2012). Therefore, drugs such as d-amphetamine (AMPH) that increase extracellular dopamine levels in these structures (Chiueh & Moore, 1973; Pum et al., 2007) will likely influence effort-reward choices. Indeed, choice preference is biased toward high-effort, high-reward options by systemic AMPH administration (Floresco et al., 2008; Bardgett et al., 2009). It remains unclear what aspects of neural information processing are affected to produce this effect.

Effort, reward, and other features pertinent for economic decisions are often formalized within the concept of utility (Phillips et al., 2007). Options requiring low effort but yielding a large food reward have high utility to hungry animals, whereas options requiring high effort or yielding little/unwanted food have low utility. Because utility can be expressed as the expected value discounted by its associated costs, the bias toward high-effort high-reward options under AMPH could reflect a reduction (or 'discounting') of effort, or an increase in expected reward

Subjects and surgical procedure

In this study, four adult male Fischer Brown Norway (FBN) hybrid rats aged 6 to 10 months were used. Rats were born and raised on-site, housed individually in a 12h-12h reverse light cycle, and habituated to handling for two weeks prior to surgery. The fabrication and surgical implantation of head-mounted drives was completed as previously described (Euston & McNaughton, 2006), but is explained briefly here. Surgeries were carried out prior to any training. Animals were deeply anesthetized with isoflurane throughout the procedure (1-1.5% by volume in oxygen at a flow rate of 1.5 L/min). Each animal was implanted with a "hyperdrive" consisting of 12 independently movable tetrodes (McNaughton et al., 1983; Wilson & McNaughton, 1993) and 2 reference electrodes. The hyperdrive bundle was centered at 3.00 mm AP and 1.3 mm ML of left mPFC and angled 9.5 degrees toward the midline. A craniotomy was made around the electrode exit site of the drive, and the drive bundle was lowered to the brain surface. The dura was retracted, and the hyperdrive body was secured to the skull with anchor screws embedded in dental acrylic (Lang Dental, Wheeling, US). Following surgery, rats were administered daily injections of 1 mg/kg Metacam (analgesic) for 3 days and 10 mg/kg Baytril (antibiotic) for 5 days. Tetrodes were lowered 950 μm from the skull surface after the surgery and then gradually lowered over the next 2-3 weeks to reach the target depth. Food restriction started after the animal recovered from the surgery (7 days) and was monitored daily to ensure the weight was at least 85% of the free-feeding weight for the duration of the experiment. [...]

The behavioural apparatus and the method of data collection used in this study were described previously (Mashhoori et al., 2018).
Briefly, we used an automated figure-8 maze (Fig.1A), which is a modified version of the classic T-maze frequently used in studies of effort-reward decision-making (Salamone et al., 1994; Walton et al., 2002). The track of the maze was 15 cm wide, and configured into a rectangular pathway measuring 102 x 114 cm. The maze contained a central feeder on the T-stem from which a trial was initiated. Two side feeder wells were located on two platforms in the upper corners of the maze. The platforms could move vertically to require a variable height climb to reach the feeder. Animals descended from the platform by a ramp to return to the starting feeder. The elevation of the platform was 0 cm for the low-effort condition (level with the track) and was 23 cm for the high-effort condition. The reward was Ensure beverage (chocolate flavoured), and the volume was 0.03 ml for low-reward, and 0.12 ml for high-reward. The same small volume of Ensure was delivered at the center feeder in all trials so as to motivate the rat to return to the start position, and to serve as a control.

Figure 1. [...] represents the amount of food reward at the side feeder. The same trial sequence is repeated two times per session. Dashed lines show the subset of trials used for testing the effort (red) and reward (green) encoding. C, Histological brain sections showing endpoint locations of electrodes. The electrode marks indicated by arrows for three rats are shown on stained coronal brain slices. The estimated electrode location for another rat is also shown superimposed on a figure adapted from a stereotaxic atlas (Paxinos & Watson, 2014).

Four gates were located on the entry points of the T-stem and T-arms to prevent animals going backward. These were also used on some trials to force animals to select one target feeder. On other trials, animals were free to choose either target feeder. Over a course of roughly three weeks, animals were trained on the maze with a mixture of forced and free trials. Animals were identified as well-trained when they chose high-reward options over the low-reward ones on more than 80% of the trials in which the barrier height was the same on both sides. In the following sessions, the animals were required to perform two full blocks of forced trials as shown in Figure 1B. Each block of trials consisted of five groups of 20 forced trials where the effort-reward conditions were constant. Each of these trial groups was designed to investigate one of the three decision features (effort, reward, path) by manipulating only one of these and keeping the other two constant. However, only effort and reward groups were used in this study. Trials alternated between left and right throughout the task. One of the recorded rats used a different task design, consisting of either four or six groups of 16 trials (10 alternate-side forced followed by 6 free trials) in which another level of effort was also added (23 cm for medium-effort, and 46 cm for high-effort).

All rats were administered saline or AMPH during a 5-10 minute interval after finishing the first block of the experiment, before starting the second block. Three doses of AMPH (0.5, 1, 1.5 mg/kg) were used in this study. Only one dose (or saline) was given per daily session. The order of drug/saline injection across sessions was: saline, AMPH 0.5 mg/kg, AMPH 1 mg/kg, saline, AMPH 1.5 mg/kg. The rat performing the alternate task design received the saline and drugs in a different order: saline, AMPH 0.5 mg/kg, AMPH 1.5 mg/kg, saline, AMPH 1 mg/kg, saline, AMPH 0.5 mg/kg.

Following completion of the study and 2-5 days before transcardial perfusion, recording sites were marked by passing 10 μA direct current for 10 seconds through one electrode of each tetrode. Then, rats received lethal injections of sodium pentobarbital (100 mg/kg i.p.) and were perfused with PBS and 4% paraformaldehyde (PFA). The brains were post-fixed for 24 hours in 4% PFA and then transferred and stored in 30% sucrose and PBS solution with sodium azide (0.02%). After at least 24 hours, the brains were coronally sectioned at 40 μm thickness using a CM3050 S freezing cryostat (Leica, Germany) and mounted on glass microscope slides, then stained with cresyl violet. Digital images of the prepared brain sections were produced with a Nano-Zoomer slide scanner (Hamamatsu, Japan) and visually inspected to determine the location of marking lesions. As shown in Figure 1C, electrode marks were identified for three brains. The [...] Fig.1B). For instance, to compute the effect of effort on neural signaling, we made conditional means from trials in which reward size and reward location were the same. A neuron was considered to be responsive to effort if it significantly discriminated low- and high-effort conditions in at least one of the two effort groups shown in Figure 1B

We recorded ensembles of single neuron activity from the ACC of well-trained rats (n=4) performing a forced-alternation task in which the reward volume and the effort (barrier climbing) to enter either of two reward zones were manipulated independently. The yield was 556.7 [...] The running trajectory of the rats, estimated by video tracking of head-mounted LEDs, showed decreasing path smoothness as the dose of AMPH increased (Fig.2A). We quantified this by the Hausdorff fractal dimension, which assesses the distance between spatial data points in order to [...]

We next examined if AMPH administration affected single unit encoding of effort or reward in a manner that may explain its behavioural effects. We first linearized the maze by partitioning the track into 36 two-dimensional bins in order to facilitate analysis of neural signaling during specific epochs of the task (Fig.3A inset). The percentage of recorded neurons discriminating the ramp height (i.e. encoding effort) gradually rises from about 10% at the central feeder to a peak near 35% at the barrier, then sharply drops after climbing (Fig.3A). The percentage of cells [...] quadrants. Figure 4B shows the distribution of neurons in the effort-reward coding plane at four locations of the maze. Neurons of quadrants Q II and Q IV have opposing signaling of utility.

Neurons of Q IV tend to generate more action potentials for high-utility conditions (i.e. when the effort is low or the reward volume is high), and fire less in low-utility conditions (i.e. high-effort or low-reward). Neurons in Q II exhibit the inverse relationship between firing and value. The slope of the first PC becomes steeper with increasing AMPH (Fig.4C), indicating a loss of neural firing correlation with reward. Circular statistical analysis reveals that the post-injection PCs are significantly different from the pre-drug condition for AMPH, whereas saline injection has no effect (Kuiper two-sample test, test statistic k = 160, p = 0.028 for 0.5 mg/kg; k = 176, p = 0.007 for 1 mg/kg; k = 240, p < 10^-3 for 1.5 mg/kg AMPH). To ensure that this is not an effect of heterogeneous sample sizes among conditions, we ran a bootstrap analysis (100 repetitions) in which we randomly sub-sampled the data to obtain the same number of neurons for each condition. The PCs were highly stable, and the results did not change with downsampling. Moreover, the percentage of total variance explained by the first PCs remains above 60% for all conditions, which further indicates that the results of the PCA are reliable. The rotation of neural tuning toward the vertical effort axis indicates that the encoding of reward is 'compressed' more than is the encoding of effort. Furthermore, the explained variance by the first principal [...]

In sum, high doses of AMPH induce a significant impairment in the encoding of utility by single units. This effect appears to be caused predominantly by a reduction in reward signaling, rather than effort signaling, which is consistent with our behavioural observations that AMPH-treated rats are less interested in consuming the reward. These results therefore suggest that reward is devalued by AMPH to a greater extent than is effort.
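
As an illustration of the analysis above, the orientation of the first PC in the two-dimensional effort-reward coding plane can be computed in closed form from the 2 x 2 covariance matrix, and the subsampling check can be mimicked by drawing equal-sized random subsets of neurons. The following Python sketch is not the code used in this study; the function names, the (reward, effort) ordering of each point, and the toy data are assumptions made for illustration only.

```python
import math
import random


def _covariance_2d(points):
    """Sample variances and covariance of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, syy, sxy


def first_pc_angle(points):
    """Angle of the first PC in degrees, in [0, 180).

    Assumption: each point is (reward_coding, effort_coding), so an angle
    near 90 degrees means the first PC lies along the effort axis.
    """
    sxx, syy, sxy = _covariance_2d(points)
    # Closed-form orientation of the major axis of a 2 x 2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.degrees(theta) % 180.0


def first_pc_explained_variance(points):
    """Fraction of total variance carried by the first PC."""
    sxx, syy, sxy = _covariance_2d(points)
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    return (half_trace + radius) / (sxx + syy)  # largest eigenvalue / trace


def subsampled_angles(points, n_neurons, n_reps=100, seed=0):
    """First-PC angle for repeated random subsamples of n_neurons points."""
    rng = random.Random(seed)
    return [first_pc_angle(rng.sample(points, n_neurons)) for _ in range(n_reps)]


# Hypothetical toy data: neurons on the Q II-Q IV diagonal, and a cloud
# stretched along the effort axis (reward coding compressed).
diagonal = [(-2.0, 2.0), (-1.0, 1.0), (0.0, 0.0), (1.0, -1.0), (2.0, -2.0)]
steep = [(-1.0, 4.0), (0.0, 0.0), (1.0, -4.0)]
```

On the toy data, points lying along the Q II-Q IV diagonal give an angle of 135 degrees, whereas the cloud stretched along the effort axis gives an angle closer to 90 degrees, which corresponds to the steepening of the first PC described above.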

AMPH contracts ensemble state space

The analysis above suggests that the encoding of utility by single units is compressed by AMPH. This presumes that the primary carrier of information is the mean firing rate of cells, and does not take into account coordination of firing among neurons. We therefore conducted a state- [...] (Fig.3). This indicates that the encoding is specific to task elements other than spatial location. This is further evidenced by the large deviations in ensemble encoding when animals spontaneously engage in off-task behavior during trials of the task (Fig.7).

Injection of saline had little effect on the trajectories in the reduced-dimension state-space (Fig.6A), whereas AMPH appears to cause a modest contraction in the space (Fig.6B). To better visualize the encoding of task features and the effects of AMPH, we next independently plotted the first four GPFA factors to compare encoding on trials with high- vs low-reward (and constant effort; Fig.8). The first factor in each session shows a clear discrimination of reward at the feeder zones. In some cases, the second factor does as well. AMPH does not appear to strongly affect the discrimination by the ensemble, but does appear to reduce the overall variance of factors (peak-to-peak amplitude). Note that the roughness of some trajectories (e.g. Fig.8B and C as compared to A and D) might be due to the lower number of cells in those sessions or increased population correlation and is not an effect of drug condition, as it appears even before the injections.

We next sought to quantitatively test these observations. Using only data from bins between the central feeder and the target feeders, we tested if the level of effort or reward could be discriminated by the mean value of either the first or second GPFA factor across each task epoch (ANOVA, significant at p<0.05). We then computed the fraction of sessions with significant effort or reward encoding. The effort is well discriminated at all epochs (Fig.9A) in most sessions (>50%), but peaks significantly near the barrier (ANOVA, F(3,28) = 6.6, p = 0.0016 for first, and F(3,28) = 19.01, p = 6 × 10^-7 for second factor). The reward, on the other hand, is only discriminated well at the target feeders (Fig.9B) (ANOVA, F(3,28) = 47.55, p = 4 × 10^-11 for first, and F(3,28) = 31.41, p = 4 × 10^-9 for second factor). AMPH does not affect the discrimination of effort or reward (Fisher's exact test, p>0.05 for pre- vs post-injection of saline or drug in each epoch).
It is interesting to note that although the relative fraction of cells encoding reward is low and does not change across task epochs (Fig.3), the information encoded by the population is strongly modulated by task epochs, and discriminates reward amount with high accuracy (>80%) at the side feeder.

Figure 10. [...] high- and low-reward trials, as well as its SEM. The volume is calculated as the state-space enclosed by the neural trajectory from central to side feeders in the 3D space formed by the first three latent factors of the pre-injection condition. The volumes of all post-injection trials were then compared with the mean volume of the same group of trials in pre-injection phase. This comparison occurs in the same space, which is specific to the session, and then the relative change is compared across the sessions as shown here. Negative and positive values in relative change demonstrate contraction and expansion in the state-space volume, respectively. (RM-ANOVA, *: significant at p<0.05, **: p<0.01, ***: p<0.005 and ****: p<10^-6)

To test if AMPH contracted the trajectories in state-space, we computed the change by AMPH of the volume occupied by the GPFA factor trajectories in high- and low-reward trials projected into the same state-space (Fig.10). As compared to saline, the volume is significantly [...] shifted upward, indicating that the patterns are more correlated across the entire maze (Fig.11A). Only the 1.5 mg/kg dose showed a statistically significant difference from the saline condition (ANOVA, F(3,18) = 6.97; p = 0.0026). For this high dose, the half-amplitude of the decorrelation is higher, meaning that the rate of change of correlation is lower (Fig.11B). [...] and an increase in self-similarity over longer distances. This is consistent with the analyses of single units, which indicate that the contraction may be more pronounced in the reward domain.
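
The relative change in state-space volume described above can be illustrated with a short Python sketch. It is not the procedure used in this study: the way the enclosed volume was computed is not specified here, so the sketch substitutes the axis-aligned bounding box of each trajectory as a crude volume proxy, and all function names and toy trajectories are hypothetical.

```python
def bounding_box_volume(trajectory):
    """Volume of the axis-aligned box around a list of (f1, f2, f3) points.

    Assumption: this is only a stand-in for the 'volume enclosed by the
    neural trajectory'; the actual volume measure may differ.
    """
    f1, f2, f3 = zip(*trajectory)
    return (max(f1) - min(f1)) * (max(f2) - min(f2)) * (max(f3) - min(f3))


def relative_volume_changes(pre_trials, post_trials):
    """Relative change of each post-injection trial volume versus the mean
    pre-injection volume; negative values indicate contraction."""
    pre_mean = sum(bounding_box_volume(t) for t in pre_trials) / len(pre_trials)
    return [(bounding_box_volume(t) - pre_mean) / pre_mean for t in post_trials]


# Hypothetical trajectories in the space of the first three latent factors.
pre = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]
post_contracted = [[(0.0, 0.0, 0.0), (0.5, 1.0, 1.0)]]
post_expanded = [[(0.0, 0.0, 0.0), (2.0, 1.0, 1.0)]]
```

With these toy trajectories, halving the extent along one factor gives a relative change of -0.5 (contraction), and doubling it gives +1.0 (expansion), matching the sign convention of the volume comparison above.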

A loss of reward-related variance without an increase in variance related to other task features will manifest as increased self-similarity because reward signaling is not uniform across epochs. [...] sensitivity to reward omission (Wong et al., 2017). Consistent with these data, we found that animals spent less time at feeders, and were more likely to forgo the reward, as the AMPH dose increased. Thus, the behavioural effects appear dichotomous; animals increased task engagement, but were less interested in the reward. This pattern of behavioural effects may reflect discounting of both the physical effort and reward value, but then it remains unclear why an animal would work for a highly discounted outcome, rather than engage in some other behaviour. Indeed, animals did not engage in the task at higher doses ( 2.0 mg/kg), and instead engaged in off-task behaviours such as grooming, sniffing, and exploring the edges of the track.

A similar pattern of behavioural effects has been reported elsewhere. For instance, moderate AMPH significantly improves working memory and increases locomotor activity, whereas higher doses do not (Shoblock et al., 2003). We later speculate why AMPH has this dose dependency and seemingly paradoxical effect on reward value.

We cannot infer from our data where in the brain AMPH may be affecting decision-related processing. We can, however, use neural activity in ACC as a window into network [...] It is possible that the animals anticipate the effort and reward of the upcoming trial because of the task design; the animal was directed to alternate between left and right options, and the effort and rewards were static over blocks of 20 trials. Animals are thus not able to choose among the options, so the 'choice' is between performing and not performing the task. It is therefore possible that modulation of neural activity reported here serves to track expectations or keep the animal engaged in the task, rather than generating optimal cost-benefit decisions between the right and left feeders. AMPH may either change the value and/or neural mechanisms of task engagement, or could affect the mechanisms by which the animal maintains engagement. This function is more akin to attention and vigilance, which are both increased by AMPH in a variety of task settings and species (Sostek et al., 1980; Ridley et al., 1982; Koelega, 1993 [...]. We likewise found that the ensembles followed a stereotyped trajectory in latent space, and that the trajectory deviated widely when rats were not performing the task (Fig.7). These data suggest that the ACC encoding evolves [...], 2006). This provides an explanatory framework by which AMPH could produce the observed effects in our task. Neuromodulation by low-dose AMPH stabilizes task-related patterns, which is predicted by the models to decrease the probability of task disengagement.
Note that this mechanism requires no notion of reward value or utility; rather, it is a property of changing the neural dynamics in the network. Furthermore, AMPH has been reported to increase the separation between distinct ensemble activity patterns during distinct task states at a moderate dose (1 mg/kg), whereas it reduces the distance between such distinct neural activity states at high doses (3.3 mg/kg) (Lapish et al., 2015). Our results are consistent with this, but with the added feature that the occupancy of trajectories in state space is also affected. At low doses (0.5-1.0 mg/kg), we found a slight contraction of ensemble state space. This became an expansion of state-space at high doses (1.5 mg/kg). Moreover, the decorrelation time increased at high doses.

These observations are consistent with those of Lapish and colleagues. Although the state space contracts under low doses, the variance from trial to trial also appears to reduce, which can increase pattern separation from samples at different points of the trajectory (different task epochs). At high doses, the state space expands, but so does the variance across trials. The [...] sniffing, and licking with repetitive head movement (Randrup & Munkvad, 1967; Fog, 1970; Schiorring, 1971). Our interpretation presumes that neural activity in ACC reflects neural dynamics elsewhere in the brain that produce behavior. We cannot rule out the inverse relationship: that some other brain system drives the behavior, and the ACC encoding tracks the animal's state. Our data are consistent with past results showing that rat ACC activity is sensitive to small deviations of position on a track (Euston & McNaughton, 2006). It is therefore possible that AMPH acts on a brain system uncorrelated with ACC dynamics to produce increased variance in the running path, which then increases variance in ACC dynamics. Our data provide some evidence against this possibility. AMPH has a monotonically increasing effect on running path roughness, but reduces ACC variance at low-to-moderate doses. This suggests that ACC encoding is not dictated by the positional state of the animal.

Although our data are purely correlational, they suggest novel linkages between the effects of AMPH on neural activity and behaviour. We propose that the reason rats perform the task even though they lose interest in the reward under moderate doses of AMPH is this drug's combined effects on reward encoding and dynamics. Specifically, rats have less interest in the reward because reward information is attenuated. Nonetheless, rats continue to engage in the task at moderate AMPH levels because the ensemble dynamics are stabilized such that rats are less likely to disengage. The result is that the brain runs quickly through the task script, even though the reward is not desired.