Introduction

The ability to predict outcomes of biological significance is essential to fitness and survival. To do so, animals, including humans, must identify which stimuli among the many present provide relevant predictive information1; that is, they must solve the problem of structural credit assignment2. A dramatic example of this type of problem is provided by the COVID-19 pandemic, as societies around the world scrambled to ascertain which stimuli posed a risk of contagion and which did not. Failure to assign predictive credit to relevant stimuli (e.g., close physical contact with other individuals, shared meals, enclosed spaces) has led to dire consequences3, while misassigning credit to irrelevant stimuli (e.g., 5G mobile networks, mosquitoes, bleach) has likewise fostered maladaptive behaviors4. Given the intimate relationship between credit assignment and decision making, it is essential to elucidate how credit is apportioned among cues under various learning conditions.

Credit assignment among environmental cues has traditionally been regarded as a competitive process in which the best predictor of the outcome acquires substantial credit over the course of learning at the expense of other predictors1,5,6. In support of this notion is evidence that cues compete for credit in a range of tasks7,8,9,10 and species, from C. elegans to humans11,12,13,14,15,16. However, it has also long been known that cue competition is not ubiquitous17,18,19 and can be disrupted across multiple learning conditions20,21,22. One such condition is experiencing the trials in massed fashion; that is, separated by an intertrial interval (ITI) that is typically shorter than twice the duration of the cue23, as reported in rats24,25, pigeons26, and humans27. This finding has profound implications because in our ever-complex world we are routinely bombarded with information presented in close succession. Diminished cue competition in such situations implies that a host of incidental stimuli may hijack the learning system and wrongfully gain control over behavior.

Recently, Reverte et al.28 reported that granting rats agency over trial presentations protects cue-reward learning from the well-known deleterious effects of massed training29,30,31. Here, we sought to determine whether agency over learning could specifically rescue cue competition under massed trial conditions. To this end, we embedded a trial self-initiating procedure within a powerful master-yoked design that allowed us to vary the degree of agency over learning while keeping the exposure to cue-reward relationships identical28. Using a variety of well-established and novel cue competition tasks, we found robust evidence of cue competition only in animals that had agency over learning. Importantly, this effect cannot be explained by differential levels of engagement, general discrimination competence, or ability to process compounded stimuli concurrently. Our data provide the first demonstration of a critical role for agency in how credit is apportioned among predictive cues and open up new lines of neural and theoretical inquiry.

Results

Agency rescues the blocking effect from the deleterious effects of massed training

We first set out to test whether agency over learning rescues the blocking effect7 from the deleterious effects of massed training in rats (Fig. 1B). This and the remainder of the studies employed a within-subject design embedded within a between-subject, master-yoked procedure (n = 8; Fig. 1A). Master rats (Agency group) were allowed to self-initiate their trials by performing a nose poke into a nose port at any point during a period of trial availability (max = 20 s) signaled by a nose-port light28. On any given trial, a nose poke would turn on one or more 10-s cues of the visual and auditory modality. In contrast, yoked rats (Passive group) received an identical sequence of events to their master counterparts (including the trial availability cue), but trial presentations were noncontingent on their behavior (i.e., standard Pavlovian conditioning).

Figure 1

Agency over learning rescues competitive credit assignment from the deleterious effects of massed training in a blocking task. (A) Trial structure in this and the remainder of the studies. On each trial offer, a nose-port light was presented in both groups signaling trial availability to Agency rats. A trial-initiating response (nose poke) by an Agency rat immediately resulted in a 10-s cue being presented to that rat as well as to its yoked animal in the Passive group. On reinforced trials, a sucrose US was presented at the dipper magazine following cue offset. Trial offers were separated by a 10-s variable intertrial interval (ITI). (B) Experimental design. Letters A and B denote visual stimuli, whereas X and Y denote auditory stimuli. Digits in brackets represent the probability of reward for each trial type. The pretraining phase involved a simple discrimination between A and B. During the compound phase, these trials were interleaved with compounds AX (where A should block X) and BY (where B should not block Y), both continuously reinforced. To test for blocking (i.e., less responding to X than Y), two daily probe trials with X and Y were introduced on session 9 of the Compound phase. (C,D) Behavioral results in groups Agency and Passive, respectively. The left and center line plots depict performance during the Pretraining and Compound phases, respectively. The bar graphs on the right show average responding to X and Y on probe trials across the last four sessions. Conditioned responding is measured as mean number of head entries (± SEM).

In both groups, a sucrose reward delivered in a dipper cup was made available on reinforced trials immediately after the termination of the cues. Conditioned responding was measured as the number of anticipatory head entries made by the rat at the dipper recess during the last 5 s of cue presentation32 (“Materials and Methods”). Critically, the ITI was programmed to be only 10 s on average (range 5–15 s). Since Agency rats could forgo trial offers, the mean ITI was effectively longer (“Materials and Methods”), but still considerably shorter than the mean ITI typically used in studies of conditioned magazine approach featuring 10-s cues (in the order of minutes).
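The master-yoked timing described above can be illustrated with a minimal sketch. It is not the authors' control software: the uniform 5–15 s ITI distribution, the initiation probability `p_initiate`, the uniform response latency, and all function names are assumptions made purely for illustration. The sketch shows why forgone offers and response latencies lengthen the interval between cues beyond the programmed mean of 10 s, and that the yoked (Passive) schedule is simply a copy of the master (Agency) schedule.

```python
import random

def master_session(n_offers, p_initiate=0.7, seed=0):
    """Cue-onset times (s) generated by a hypothetical Agency (master) rat."""
    rng = random.Random(seed)
    t, onsets = 0.0, []
    for _ in range(n_offers):
        t += rng.uniform(5, 15)        # programmed ITI (assumed uniform, mean 10 s)
        if rng.random() < p_initiate:  # assumed chance that the rat accepts the offer
            t += rng.uniform(0, 20)    # assumed latency within the 20-s availability window
            onsets.append(t)
            t += 10                    # 10-s cue
        else:
            t += 20                    # offer expires unanswered after 20 s
    return onsets

master = master_session(200)
yoked = list(master)  # the Passive rat receives cues at the same times

# Interval from the offset of one cue to the onset of the next.
gaps = [nxt - (prev + 10) for prev, nxt in zip(master, master[1:])]
mean_gap = sum(gaps) / len(gaps)
print(f"effective mean interval between cues: {mean_gap:.1f} s")
```

Under these assumed parameters the effective interval exceeds the programmed 10 s; the actual values for the experiments are those reported in "Materials and Methods".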

Both groups underwent two phases of training. In the first phase, rats were pretrained with a simple discrimination involving two visual cues, A and B. Cue A, which would serve as the blocking stimulus in the following phase, was reinforced with a probability of 1 [henceforward symbolized by A(1)], while B was never reinforced [B(0)]. Training continued for 14 days to allow the opportunity for asymptotic discrimination learning (Fig. 1, Panels C and D, left). A group x session block x cue mixed ANOVA revealed a significant main effect of cue (F(1,182) = 398.24, p < 0.001) and group (F(1,14) = 8.10, p = 0.013), and a group by cue interaction (F(1,182) = 15.65, p < 0.001). Bonferroni-corrected post-hoc analyses revealed that this interaction was likely driven by the slightly higher level of responding on A(1) trials in the Passive group (t(17.1) = − 3.95, p = 0.006), as both groups significantly discriminated between A(1) and B(0) (Agency: t(182) = 11.31, p < 0.001; Passive: t(182) = 16.91, p < 0.001).

In the second, compound phase (Fig. 1, Panels C and D, center), training with A(1), B(0) continued for another 20 sessions, but, in addition, two novel auditory cues, X and Y, were presented in compound with A and B on separate, reinforced trials. Specifically, X accompanied A as the stimulus to be blocked, whereas Y accompanied B as the control cue for blocking [AX(1), BY(1)]. Panels C and D (center) of Fig. 1 show that in both groups the compounds evoked similar levels of conditioned responding as A. A group x session block x cue mixed ANOVA revealed a main effect of cue (F(3,266) = 187.35, p < 0.001), session block (F(4,266) = 4.28, p = 0.0002) and group (F(1,14) = 7.07, p < 0.019), and a group by cue interaction (F(3,266) = 10.33, p < 0.001). Bonferroni-corrected post hoc analyses revealed that both groups successfully discriminated between reinforced and nonreinforced trial types (Agency: A(1) vs. B(0), t(266) = 10.66, p < 0.001; AX(1) vs. B(0), t(266) = 10.26, p < 0.001; BY(1) vs. B(0), t(266) = 10.60, p < 0.001; Passive group: A(1) vs. B(0), t(266) = 16.95, p < 0.001; AX(1) vs. B(0), t(266) = 17.24, p < 0.001; BY(1) vs. B(0), t(266) = 16.41, p < 0.001), indicating that the interaction was likely driven by the higher asymptote of responding in the Passive group.

In order to monitor the emergence of blocking (i.e., less responding to X than Y), two probe trials with each of X and Y were randomly interleaved daily from session 9 onward (Fig. 1, panels C and D, center). Inspection of the results suggests that a blocking effect emerged at the end of the compound phase in the Agency, but not the Passive group. This impression was confirmed by a further group x session x cue mixed ANOVA that focused on the mean responding to X and Y across the last four probe sessions (Fig. 1, panels C and D, right). This analysis revealed significant main effects of cue (F(1,98) = 7.80, p = 0.006) and group (F(1,14) = 6.42, p = 0.024), and a significant group by cue interaction (F(1,98) = 6.28, p = 0.014). Exploration of this interaction with Bonferroni-corrected simple main effects confirmed a significant difference in responding to the cues in the Agency (t(98) = 3.75, p < 0.002), but not the Passive group (t(98) = 0.20, p ≈ 1). The results thus provide evidence that agency over learning rescues competitive credit assignment to cues from the adverse effects of massed trials.
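For readers unfamiliar with blocking, the expected pattern under competitive credit assignment can be sketched with a summed-error (Rescorla-Wagner-style) learning rule applied to the design of Fig. 1B. This is an illustrative sketch only: the learning rate, the number of training cycles, the fixed trial order, and the function name `rw_train` are assumptions, and the code is not the authors' analysis or simulation.

```python
def rw_train(trial_types, n_cycles, V=None, alpha=0.1):
    """Summed-error updates; each cycle presents every trial type once."""
    V = dict(V or {})
    for _ in range(n_cycles):
        for cues, reward in trial_types:
            # One prediction error shared by all cues present on the trial.
            pe = reward - sum(V.get(c, 0.0) for c in cues)
            for c in cues:
                V[c] = V.get(c, 0.0) + alpha * pe
    return V

pretraining = [("A", 1.0), ("B", 0.0)]               # A(1), B(0)
compound = pretraining + [("AX", 1.0), ("BY", 1.0)]  # plus AX(1), BY(1)

V_pre = rw_train(pretraining, n_cycles=200)
V_end = rw_train(compound, n_cycles=200, V=V_pre)
blocking_effect = V_end["Y"] - V_end["X"]
print({cue: round(v, 3) for cue, v in V_end.items()})
```

Because A already predicts the reward on AX trials, the shared error is near zero and X gains almost no value, whereas Y gains value because B predicts nothing. A positive `blocking_effect` (Y greater than X) corresponds to the pattern observed in the Agency group.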

Agency rescues competitive credit assignment in a novel cue competition task

To further examine the influence of agency on competitive credit assignment under massed trials, we next compared the performance of Agency and Passive groups in a novel cue competition design. This design creates a conflict between the expected pattern of responding to two cues, X and Y, when credit assignment is competitive relative to when it is noncompetitive. By creating this conflict, this design maximizes the chances of detecting differences between competitive and noncompetitive learning. This makes this design ideally suited for examining the full impact of behavioral and neural manipulations on credit assignment. Given the novelty of the design, we piloted it in a standard Pavlovian magazine-approach setting with spaced trials (Supplementary Materials, Exp. S1).

The details of the experimental design are shown in the table of Fig. 2A. Two groups were trained with the same master-yoked procedure used in the previous study (Fig. 1A). In the pretraining phase (Fig. 2, panels B and C, left), rats received 10 sessions of discrimination training with two visual cues, A(1) and B(0), and two auditory cues, X(0.75) and Y(0.25), where, once again, the numbers in parentheses represent the probability of reward associated with each cue. A group x session block x cue mixed ANOVA revealed a main effect of cue (F(3,266) = 24.27, p < 0.001) and a cue by session block interaction (F(12,266) = 2.03, p = 0.022). Bonferroni-corrected post-hoc analysis of this interaction revealed that the discrimination between A(1) and B(0) was solved from session block 3 onward (t(266) = [3.78–4.64], p < 0.01), and that between X(0.75) and Y(0.25) from session block 4 onward (t(266) = [2.94–3.06], p < 0.04). No effect of group nor any interaction involving that factor was found.

Figure 2

Agency over learning rescues competitive credit assignment from the deleterious effects of massed training in a novel cue-competition task. (A) Experimental design. The coefficient 3 denotes three times as many presentations of the AX(1) and BY(0) compounds as of the rest of trial types. During the Pretraining phase, all rats received discrimination training between A and B and X and Y. In the Compound phase, these stimuli continued to be presented with the same probability of reward. However, X was presented in compound with A on the 75% of trials in which it was rewarded (allowing A to take away its credit), whereas Y was presented in compound with B on the 75% of trials in which it was not rewarded (allowing B to take the blame for reward omission). Trials with X(0) and Y(1) permitted continual monitoring of the predictive status of these stimuli as the discrimination developed. (B,C) Behavioral results in groups Agency and Passive, respectively. The left and right line plots depict performance during the Pretraining and Compound phases, respectively. Conditioned responding is represented as mean number of magazine head entries (± SEM).

In the second, compound phase, cues X and Y had the same probability of reward as in the previous phase, but were subject to opposing competing forces (Fig. 2, Panels B and C, right). Specifically, X was presented in compound with A on the 75% of trials in which it was followed by reward [3AX(1), X(0); where 3 indicates the proportion of trials]. This allowed A to compete with X as a predictor of reward and steal its credit33. A’s ability to serve as a competitor was further bolstered by continuing to present it by itself followed by reward [A(1)]. In addition, cue Y was presented in compound with B on the 25% of trials in which Y was not reinforced [3BY(0), Y(1)], allowing B to compete with Y for predicting reward omission. Casually put, this training was intended to ensure that B rather than Y would take the blame for the omission of reward on 3BY(0) trials34. Throughout this phase, B(0) trials continued to be presented.
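The bookkeeping behind this design can be checked with a few lines of arithmetic. The sketch below assumes, from the 3AX(1), X(0) and 3BY(0), Y(1) notation, that compound trials occur three times as often as the corresponding element-alone trials; the variable and function names are illustrative only.

```python
from fractions import Fraction

# (cues, relative frequency, reward) for every trial type containing X or Y.
trials = [("AX", 3, 1), ("X", 1, 0), ("BY", 3, 0), ("Y", 1, 1)]

def reward_probability(cue):
    """Overall probability of reward across all trials on which `cue` appears."""
    total = sum(n for cues, n, _ in trials if cue in cues)
    rewarded = sum(n * r for cues, n, r in trials if cue in cues)
    return Fraction(rewarded, total)

p_x = reward_probability("X")
p_y = reward_probability("Y")
print(p_x, p_y)
```

X therefore keeps its overall reward probability of 3/4 and Y its probability of 1/4, even though X is never rewarded when presented alone and Y always is.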

A key advantage of this design is that X(0) and Y(1) trials permit online monitoring of the impact of competition on responding to these cues. If credit assignment is noncompetitive35, X should be expected to evoke more responding than Y given its higher probability of reward. Conversely, to the extent credit assignment is competitive, Y should be expected to evoke more responding than X33. To examine the role of agency over learning, we focused our analysis on responding on X(0) and Y(1) trials. Inspection of Fig. 2 (Panels B & C, right) suggests that cue competition prevailed in the Agency, but not the Passive group. This impression was confirmed by a group x session block x cue mixed ANOVA, which revealed significant group by cue (F(1, 266) = 14.69, p < 0.001), cue by session block (F(9, 266) = 2.19, p < 0.001), and group by cue by session block interactions (F(9, 266) = 2.16, p = 0.025). A Bonferroni corrected post-hoc analysis of the three-way interaction revealed that, consistent with competitive credit assignment, rats in the Agency group responded to Y significantly more than to X on session blocks 9 (t(266) = 3.69, p < 0.002) and 10 (t(266) = 3.78, p < 0.002). In contrast, in the Passive group, the difference between X and Y was marginally significant only on session block 9, but in the opposite, noncompetitive direction (X > Y) (t(266) = 1.97, p = 0.05).

A group x session block x cue mixed ANOVA on responding to the remainder of the cues in the compound phase revealed a significant main effect of cue (F(3,546) = 138.82, p < 0.001) and a group by cue interaction (F(3,546) = 3.23, p = 0.022). A Bonferroni corrected post-hoc analysis, however, confirmed that both groups discriminated between A(1) and B(0) [Agency: t(546) = 10.81, p < 0.001; Passive: t(546) = 9.26, p < 0.001] as well as between 3AX(1) and 3BY(0) trials [Agency: t(546) = 8.79, p < 0.001; Passive: t(546) = 11.42, p < 0.001]. A likely contributor to this interaction was the greater responding observed on 3AX(1) than A(1) trials in the Passive (t(546) = − 3.86, p = 0.004), but not the Agency (t(546) = 0.47, p = 1) group. This difference can be explained by the fact that learning about X was not blocked in the Passive group, allowing this cue to contribute, along with A, to conditioned responding on AX compound trials (i.e., summation).

This study thus provides further evidence of a profound impairment in cue competition under massed Pavlovian training, rendering the rescuing effects of agency over learning all the more striking. One interpretation, however, is that Agency rats did not apportion credit any more competitively than Passive rats, but rather treated cues X and Y as radically distinct when presented alone vs. in compound. In other words, Agency, but not Passive rats may have treated the compounds AX and BY as separate, configural stimuli distinct from their constituent elements. To rule out this interpretation, we used a variant of the above design that does not afford such an explicit configural solution.

Agency rescues competitive credit assignment in the absence of an explicit configural solution

Figure 3A shows the experimental design. In the pretraining phase, all rats received 8 sessions with a simple visual discrimination of the form A(1), B(0) (Fig. 3, panels B and C, left). A group x session block x cue mixed ANOVA revealed that both groups solved this discrimination as evidenced by greater magazine-approach responding on A(1) than B(0) trials over the course of training (main effect of cue: F(1,98) = 88.72, p < 0.001; cue by session block interaction: F(1,98) = 5.32, p < 0.002). A Bonferroni-corrected post hoc analysis of this interaction revealed greater responding to A(1) than B(0) across all rats from session block 2 onward (t(98) = [2.21–2.92], p < 0.004). No significant effects of group or interactions involving that factor were found.

Figure 3

Agency over learning rescues competitive credit assignment under massed training conditions in the absence of an explicit configural solution. (A) Experimental design. The pretraining phase involved a simple discrimination between A(1) and B(0). During the compound phase, these trials continued to be presented, but were interleaved with AX(.75) and BY(.25) trials. Note that X has a higher probability of reward than Y. However, X signals a net decrement in the probability of reward when considered against the backdrop of A, whereas Y signals a net increase in the probability of reward when considered against the backdrop of B. Thus, to the extent credit assignment is competitive, Y should evoke more responding than X, but the opposite should be true if credit assignment is noncompetitive. To test this, two daily probe trials with X and Y were interleaved with training trials, starting on session 13 of the Compound phase. (B,C) Behavioral results in groups Agency and Passive, respectively. The left and right line plots depict performance during the Pretraining and Compound phases, respectively. Conditioned responding is represented as mean number of magazine head entries (± SEM).

In the second, compound phase, all rats continued to receive the A(1), B(0) discrimination, but novel compound trials AX(0.75) and BY(0.25) were introduced, where X and Y were again auditory stimuli (Fig. 3, panels B and C, right). A group x session block x cue mixed ANOVA on responding to A(1), B(0), AX(0.75) and BY(0.25) trials throughout this phase revealed a significant effect of cue [F(3,434) = 142.33, p < 0.001] and a group by cue interaction [F(3,434) = 7.88, p < 0.001]. Bonferroni corrected post-hoc analyses of this interaction revealed that both groups solved the A(1) vs. B(0) [Agency: t(434) = 10.94, p < 0.001; Passive: t(434) = 13.28, p < 0.001] as well as the AX(0.75) vs. BY(0.25) [Agency: t(434) = 4.36, p < 0.001; Passive: t(434) = 9.96, p < 0.001] discriminations. A likely contributor to the group by cue interaction is the fact that Agency rats discriminated better between BY(0.25) and B(0) trials [t(434) = − 7.21, p < 0.001, Cohen’s d = 0.79] than Passive rats [t(434) = − 3.30, p = 0.029, Cohen’s d = 0.53]. The implications of this result are deferred to the Discussion.

Note that, in the absence of competitive credit assignment, this training should result in X evoking more responding than Y, given their respective probabilities of reward (0.75 and 0.25). On the other hand, if credit assignment is competitive, more responding to Y than to X should be observed. This is because X signals a decrement in the probability of reward if considered against the backdrop of A, which is otherwise always reinforced, whereas Y signals an increment in the probability of reward if considered against the backdrop of B, which is otherwise never reinforced. Thus, considered in the context of other cues present, Y is a better predictor of reward than X.
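The opposing predictions can be made concrete with a small sketch that contrasts a summed-error (competitive) update with a separate-error (noncompetitive) update on the A(1), B(0), AX(0.75), BY(0.25) design. Several simplifications are assumptions of this sketch rather than features of the study: the expected reward (0.75 or 0.25) stands in for the probabilistic outcome, and the learning rate, cycle counts, and function name `train` are arbitrary.

```python
def train(trial_types, n_cycles, competitive, V=None, alpha=0.1):
    """Each cycle presents every trial type once, using its expected reward."""
    V = dict(V or {})
    for _ in range(n_cycles):
        for cues, reward in trial_types:
            total = sum(V.get(c, 0.0) for c in cues)  # summed prediction on this trial
            for c in cues:
                # Competitive: all cues share one error. Noncompetitive: each cue has its own.
                prediction = total if competitive else V.get(c, 0.0)
                V[c] = V.get(c, 0.0) + alpha * (reward - prediction)
    return V

pretraining = [("A", 1.0), ("B", 0.0)]
compound = pretraining + [("AX", 0.75), ("BY", 0.25)]

results = {}
for competitive in (True, False):
    V = train(pretraining, 100, competitive)
    results[competitive] = train(compound, 300, competitive, V)

V_comp, V_noncomp = results[True], results[False]
print("competitive:    X = %.2f, Y = %.2f" % (V_comp["X"], V_comp["Y"]))
print("noncompetitive: X = %.2f, Y = %.2f" % (V_noncomp["X"], V_noncomp["Y"]))
```

Under the summed-error rule, X settles near −0.25 (a net decrement relative to A) and Y near 0.25 (a net increment relative to B), so Y exceeds X. Under the separate-error rule each cue simply tracks its own reward probability, so X exceeds Y.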

To test this prediction, we randomly interspersed two daily nonreinforced probe trials with X and Y [X(0), Y(0)] starting on session 13 (Fig. 3, panels B and C, right). A group x session block x cue mixed ANOVA on responding during the probe trials revealed a marginally significant effect of cue (F(1,126) = 3.86, p < 0.052) and a significant group by cue interaction (F(1,126) = 5.43, p < 0.021). A Bonferroni-corrected simple main analysis of the interaction confirmed that the Agency group responded significantly more to Y than X (t(126) = 3.04, p = 0.006). In contrast, the Passive group responded equally to both cues (t(126) = − 0.26, p = 1), as expected if cue competition is disrupted. The results thus provide further evidence that agency over learning rescues competitive credit assignment.

Ruling out alternatives to the role of agency in competitive credit assignment

The results so far can be readily interpreted by assuming that agency over learning simply enhances the animals’ attention to task, general discrimination competence, or ability to process compounded stimuli concurrently. To test these interpretations, we compared performance between Agency and Passive rats in a patterning task in which opposite credit must be assigned to compound cues and their constituent elements (Fig. 4A). One such problem was a negative-patterning discrimination where two cues, a visual (A) and an auditory (X) stimulus, were rewarded when presented individually, but not in compound [A(1), X(1), AX(0)]. A second problem involved a positive-patterning discrimination in which another pair of visual (B) and auditory (Y) stimuli were rewarded when presented in compound, but not individually [B(0), Y(0), BY(1)]. If any of the aforementioned interpretations is correct, Passive animals should find this discrimination particularly difficult.

Figure 4

Passive rats perform better than Agency rats in a patterning task, suggesting spared attention to task, compound processing, and general discrimination ability. (A) Experimental design. A negative- and positive-patterning discrimination involving two visual stimuli (A,B) and two auditory stimuli (X and Y) were trained contemporaneously. (B,C) Behavioral results in groups Agency and Passive, respectively. The left and right panels show discrimination performance in the negative- and positive-patterning discriminations, respectively. Conditioned responding is measured as the mean number of magazine head entries (± SEM).

Although both discriminations were trained concurrently, for simplicity’s sake we treated them separately when displaying and analyzing the data (Fig. 4, Panels B and C). Inspection of the figure suggests that Passive animals solved both patterning problems at least as well as, if not better than, Agency rats. To further simplify the analysis, we averaged responding on elemental trials [mean of A(1) and X(1) trials and of B(0) and Y(0) trials]. A group x session block x cue mixed ANOVA on the negative-patterning discrimination revealed a main effect of group [F(1,378) = 91.60, p < 0.001] and significant cue by session block [F(13,378) = 5.11, p < 0.001] and group by cue [F(1,378) = 33.48, p < 0.001] interactions. Bonferroni-corrected simple main effects analysis of the latter interaction confirmed that both Agency [t(378) = − 2.68, p = 0.016] and Passive [t(378) = − 10.86, p < 0.002] animals solved the A(1)/X(1) vs. AX(0) discrimination, although the effect size was larger in Passive than Agency rats (Cohen’s d = 0.93 and 0.23, respectively). A parallel mixed ANOVA of the positive-patterning discrimination revealed only a significant effect of cue [F(1,378) = 106.56, p < 0.001], indicating that both groups solved the B(0)/Y(0) vs. BY(1) discrimination similarly (Agency: Cohen’s d = 0.65; Passive: Cohen’s d = 0.68). Taken together, these findings suggest that the deficits in competitive credit assignment previously observed in Passive rats were unlikely to be due to an inferior level of engagement or an inferior ability to solve complex discriminations, process compounded stimuli concurrently, or form configural representations.

To buttress these conclusions, we also presented this patterning problem to the rats from the first study at the end of blocking training (Supplementary Materials, Exp. S2). Since those rats had already experienced the two auditory stimuli as X and Y, two novel auditory stimuli were used. The results of this replication confirmed those with naïve animals. That is, the same Passive rats that exhibited deficits in blocking were if anything better at solving complex nonlinear discriminations than their Agency counterparts. This finding is consistent with prior evidence that cue competition between the elements of a compound can hinder the solution of nonlinear discriminations36.

Discussion

Competitive credit assignment among environmental cues is the backbone of associative and reinforcement learning models of Pavlovian conditioning, to the point that an inability to account for cue-competition phenomena renders a model obsolete37. Yet converging evidence indicates that competition is not automatically determined by the presence of other cues, but also by learning conditions such as trial spacing20,21,22. Specifically, when information is presented in massed fashion, cue competition is diminished24,25,26,27. Here, we provide evidence that granting rats agency over trial presentation can rescue competitive credit assignment from the detrimental effects of massed training.

The beneficial effects of agency, often referred to as free (vs. forced) or self-determined (vs. imposed) choice, on performance have been documented in domains such as education38 and creativity39. In the human cognitive literature, agency over what task40,41 or what feature of a task to engage with42 has been shown to enhance performance, and the neural bases of this phenomenon are receiving increasing attention42,43,44,45. Our data add to this literature by showing that the way predictive credit is negotiated among environmental cues is dramatically altered by whether the individual has control over the occurrence of those cues, even when there is no knowledge of the specific cue being presented.

Taken together, our results allow us to rule out various trivial explanations. Firstly, the beneficial effects of agency on competitive credit assignment are not simply the product of a heightened ability to process compounded stimuli concurrently or learn complex discriminations. Evidence for this comes from the superior performance of Passive groups in the patterning task. Secondly, neither can the contribution of background (contextual) cues to competitive learning explain our full pattern of results. While contextual conditioning could summate with responding to the target cues46 and mask any differences when responding is at ceiling (e.g., Passive group in our blocking study), such masking would not occur when responding is below asymptote (e.g., in our novel cue-competition task studies). Thirdly, our data are likewise difficult to explain by a differential role of eligibility traces in Agency and Passive groups2. Specifically, in Passive animals, massed training might allow eligibility traces of recently presented cues to spill over the subsequent trial and contaminate credit assignment. This effect would be weaker in Agency rats if trial-initiating responses serve to precipitate the decay of eligibility traces, essentially fulfilling the role of a long ITI. The issue with this hypothesis is that it also predicts poorer performance for Passive rats in the patterning task, which our data disconfirmed.

Our findings speak to the necessity of incorporating agency into theories of associative and reinforcement learning. This raises the question of what mechanisms might be responsible for the effects observed. One possibility is that agency alters the computation of prediction errors (PE). Recently, various authors have posited that self-determined choices induce a positivity bias in PEs47, either because positive PEs are amplified44 or because negative PEs are discounted42. Such an imbalance would give excitatory learning (i.e., cue-reward) the upper hand over inhibitory learning (cue-no reward). To explore this possibility, we conducted a series of simulations of our experimental designs using standard associative theory5, inspired by Chambon et al.47. We assumed that positive PEs are weighted significantly more during learning than negative PEs under Agency, but not Passive conditions (Figure S3, Supplementary Materials). Notably, the general prediction of a greater asymptote of responding in Agency animals was not confirmed by our data, although this may be due to ceiling effects in our dependent measure or differential learning-to-performance functions in the groups. More importantly, our simulations (Fig. S3) provided a proof of concept that a PE positivity bias can explain the basic effects we report. In our first blocking study (Fig. 1), for instance, a positivity bias predicts faster conditioning of the blocking cue A during the Pretraining phase in Agency than Passive rats. Assuming that learning about A is preasymptotic at the end of this phase (asymptotic learning would produce complete blocking in both groups), this cue will be in a better position to block X in the Compound phase in the Agency group. In our third study (Fig. 3), a positivity bias would readily accommodate the greater level of responding on BY(0.25) trials during the Compound phase in Agency relative to Passive rats, which, combined with the ongoing extinction of B on B(0) trials, would ensure that Y accrues more credit in the former group. Assuming that A blocks X from acquiring substantial credit on reinforced AX(0.75) trials in both groups, this would explain why Agency, but not Passive rats, responded more to Y than X on probe trials. A similar argument could be applied to the results of our second study (Fig. 2). Therein, Y(1) trials, which were relatively infrequent in each session, may have enjoyed faster excitatory learning in Agency rats, which combined with the inability of X to acquire significant credit (as it is being blocked by A), would yield greater responding to Y than X in those animals. Finally, our simulations also showed that, in accordance with our findings, a positivity bias will disrupt the acquisition of a negative patterning discrimination in the Agency condition, although our data do not show a greater response summation on AX(0) trials as anticipated by the model. Overall, however, the positivity bias hypothesis44,47 provides a parsimonious account of our results, and one that need not assume that the mechanisms directly responsible for cue competition (e.g., the computation of aggregate PEs5,6) operate any differently in the presence or absence of agency.
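The core of the positivity-bias idea can be sketched as a summed-error rule with separate learning rates for positive and negative PEs. The code below is not the simulation reported in Fig. S3: the learning rates, the deterministic reward schedule (AX rewarded on three of every four presentations, BY on one of every four), the absence of a pretraining phase, and the function name `train_biased` are all illustrative assumptions. It shows only that, on the design of the third study, weighting positive PEs more heavily raises the value that Y accrues.

```python
def train_biased(alpha_pos, alpha_neg, n_blocks=200):
    """Summed-error learning with separate rates for positive and negative PEs."""
    V = {"A": 0.0, "B": 0.0, "X": 0.0, "Y": 0.0}
    # Deterministic stand-in for probabilistic reward within each block of four presentations.
    ax_rewards = [1.0, 1.0, 1.0, 0.0]  # AX(0.75)
    by_rewards = [1.0, 0.0, 0.0, 0.0]  # BY(0.25)
    for _ in range(n_blocks):
        for r_ax, r_by in zip(ax_rewards, by_rewards):
            for cues, reward in [("A", 1.0), ("B", 0.0), ("AX", r_ax), ("BY", r_by)]:
                pe = reward - sum(V[c] for c in cues)
                alpha = alpha_pos if pe > 0 else alpha_neg  # the assumed positivity bias
                for c in cues:
                    V[c] += alpha * pe
    return V

V_symmetric = train_biased(alpha_pos=0.1, alpha_neg=0.1)  # hypothetical Passive-like setting
V_biased = train_biased(alpha_pos=0.2, alpha_neg=0.05)    # hypothetical Agency-like setting
print("Y, symmetric rates:", round(V_symmetric["Y"], 2))
print("Y, biased rates:   ", round(V_biased["Y"], 2))
```

With these assumed parameters Y ends at a clearly higher value under the biased rates, in line with the verbal argument that Y accrues more credit when positive PEs dominate. Mapping these values onto responding would require further assumptions about the learning-to-performance function.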

Admittedly, agency over learning might regulate other mechanisms besides the relative impact of PEs on learning; for instance, by modulating the allocation of attention to cues. According to selective attention accounts1,48,49, cue competition results from paying increasing attention to the most accurate predictors of an outcome while simultaneously learning to ignore poor predictors. Conceivably, agency over learning might facilitate this attentional divergence in the face of cognitively challenging massed training conditions. However, the beneficial effects of agency on credit assignment need not be limited to learning and attention, but could also work at the level of memory retrieval, in line with comparator mechanisms50.

The current findings have important implications both for normal functioning and mental health. In pedagogical settings, where massed instruction has long been known to be detrimental51,52,53, our data suggest the possibility that agency over the presentation of information might promote more competitive, and therefore selective, learning. In the context of mental health, our findings open up opportunities for therapeutic interventions based on enhancing the individual’s perceived sense of agency in disorders characterized by attenuated cue competition, including psychotic54,55, attentional56, anxiety57, and substance use disorders58.

For the present, much work is needed to elucidate the complex role that agency is likely to play in learning and psychopathology. Consider the case of substance use disorders, where drug self-administration is known to mitigate some of the more dramatic and aversive effects of drugs of abuse59,60. In addition to allowing a more accurate prediction of drug receipt, self-administration might also foster cue competition and thereby preclude irrelevant stimuli from contributing to cue reactivity in the future. We speculate that as the sense of agency over drug consumption wanes and drug-related behaviors transition from voluntary and goal-directed to habitual and compulsive61, credit assignment might also become less competitive. This transition would exacerbate the individual’s vulnerability to drug abuse and relapse by drastically expanding the set of stimuli capable of inducing cue reactivity. Consistent with this, evidence suggests that long-term exposure to potent rewards such as cocaine, heroin, and sucrose undermines competitive credit assignment62,63. In light of such implications, the present findings call for a closer investigation of the role of agency in credit assignment among predictive cues.

Materials and methods

For the sake of convenience, the four studies above will be referred to in this section as Exps. 1–4, and correspond, respectively, to the blocking task (Fig. 1), the novel cue-competition task (Fig. 2), its second variant (Fig. 3), and the patterning task (Fig. 4).

Experimental animals

Each study used 16 experimentally naïve, sex-balanced, adult Long-Evans rats, for a total of 64 animals. The ages and weights of the rats at the outset of each experiment were as follows. In Exp. 1, rats were ~20 weeks old (wo) and weighed 441–516 g (males) and 257–290 g (females); in Exp. 2, rats were ~13 wo and weighed 342–388 g (males) and 234–269 g (females); in Exp. 3, rats were ~20 wo and weighed 448–529 g (males) and 269–298 g (females); in Exp. 4, rats were ~22 wo and weighed 475–554 g (males) and 284–325 g (females). All animals were bred at Brooklyn College from commercially available populations (Charles River Laboratories). They were housed individually in standard clear-plastic tubs (10.5 in. × 19 in. × 8 in., Charles River Laboratories) with woodchip bedding. The colony room was maintained on a 14:10 h light/dark cycle. Behavioral sessions were conducted 7–10 h after the onset of the light phase of the cycle. Throughout training, water access was restricted to 1 h/day following each experimental session, while food was provided ad libitum. All animal care and experimental procedures were carried out in compliance with the ARRIVE guidelines64 and the National Institutes of Health’s Guide for the Care and Use of Laboratory Animals65, and were approved by the Brooklyn College Institutional Animal Care and Use Committee (Protocol #303).

Apparatus

Behavioral training was conducted in eight modular conditioning chambers (32 cm long × 25 cm wide × 33 cm tall, Med Associates, Inc.). Each chamber was enclosed in a ventilated sound-attenuating cubicle (74 cm × 45 cm × 60 cm) fitted with an exhaust fan that provided a background noise level of 50 dB. All reported locations of stimulus and response apparatus were measured from the grid floor of the conditioning chamber to the lowest point or edge of the apparatus. The left wall of the chamber housed two white jewel lamps (28 V DC, 100 mA) mounted on the left and right panels 9.3 cm above the grid floor. Above each of these lamps was a speaker located 20.6 cm above the grid floor and connected to a dedicated tone generator capable of delivering a 2.5-Hz, 80-dB clicker (left panel) and a 70-dB white noise (right panel). Two additional speakers were located on the left and right panels of the right wall of the chamber, 24.8 cm above the grid floor. Each of them was also connected to a dedicated tone generator capable of delivering a 12-kHz, 70-dB tone (left panel) and a 1-kHz, 80-dB tone (right panel). The right wall also housed a third jewel lamp located on the center panel 17.2 cm above the grid floor. Below this lamp, 4.6 cm above the grid floor, was a circular nose port 2.6 cm in diameter, equipped with a yellow LED light and an infrared sensor for detecting nose entries. This nose port was flanked by a recessed liquid reward magazine (aperture: 5.1 cm × 15.2 cm) located on the right panel, 1.6 cm above the grid floor. This magazine was equipped with an infrared sensor for detecting head entries, and connected to a liquid dipper that could deliver a 0.04-cc droplet of a 10% sucrose solution. The chambers remained dark throughout the experimental session except during presentations of the visual stimuli. In the same room, a computer running Med-PC IV software (Med Associates Inc., St. Albans, VT, USA) on Windows controlled and automatically recorded all experimental events via a Fader Control Interface.

Procedure

Magazine training

Prior to the beginning of each study, rats were randomly assigned to either the master or the yoked group (labeled Agency and Passive, respectively), with the constraint that each group be sex-balanced. Each animal assigned to the Agency group was paired with an age- and sex-matched animal from the Passive group. All sessions began with a 2-min acclimation period in the conditioning chambers. Rats initially received a session of magazine training in which they learned to retrieve a sucrose reward from the dipper cup. This session lasted 62 min and consisted of 60 trials. For the first 10 trials, sucrose was made available for 30 s every 30 s; for the next 20 trials, it was available for 20 s every 40 s; and for the last 30 trials, it was available for 10 s every 50 s.
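
As a consistency check on this schedule, the sketch below tallies the trial count and session length. It assumes that "available for 30 s every 30 s" means a 30-s availability period followed by a 30-s gap (and likewise for the other two blocks), and that the 2-min acclimation period counts toward the reported session length; both readings are assumptions, as are the variable and function names.

```python
# Each block: (number of trials, seconds of sucrose availability, seconds of gap).
# The availability-plus-gap reading of the schedule is an assumption.
MAGAZINE_BLOCKS = [
    (10, 30, 30),
    (20, 20, 40),
    (30, 10, 50),
]
ACCLIMATION_S = 2 * 60  # 2-min acclimation period at the start of the session


def magazine_session_summary(blocks=MAGAZINE_BLOCKS, acclimation_s=ACCLIMATION_S):
    """Return (total number of trials, total session length in minutes)."""
    n_trials = sum(n for n, _, _ in blocks)
    trial_time_s = sum(n * (on + off) for n, on, off in blocks)
    return n_trials, (acclimation_s + trial_time_s) / 60


n_trials, minutes = magazine_session_summary()
print(n_trials, minutes)  # 60 trials, 62.0 min
```

Under this reading every trial occupies 60 s, so the 60 trials plus acclimation add up to the stated 62 min.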

Shaping

In all four studies, Agency rats went on to receive five shaping sessions in which they learned to self-initiate trials, following the procedure developed by Reverte et al.28. In the first shaping session, the nose-port light was turned on for a maximum of 20 s, during which a nose poke at the nose port immediately resulted in the termination of the nose-port light and a 10-s period of sucrose availability. Trials were separated by a variable ITI (mean: 10 s; range: 5–15 s). Failure to respond at the nose port resulted in the nose-port light turning off and the trial being repeated after a regular ITI. Over the following four shaping sessions, we introduced and progressively increased a delay of 2, 4, 6, and 8 s between the rat’s response at the port and sucrose availability. During this delay, the nose-port light flashed at a 1-Hz frequency (on for 0.5 s, off for 0.5 s). Concurrently, reward availability was progressively shortened (8, 6, 4, and 3 s). Throughout shaping (and for the remainder of the experiment), Passive rats were yoked to their Agency counterparts to ensure that they received exactly the same sequence of events at the same times, except, of course, for the trial-initiation response.

Trial structure

The trial structure was common to all four studies (Panel A, Fig. 1). Following shaping, experimental sessions began with a 30-s acclimation period. Agency rats would then be presented with their first opportunity to start a trial, as signaled by the trial-availability cue (the onset of the nose-port light). The duration of this cue was 20 s, during which a response at the nose port would immediately turn off the nose-port light and turn on one of various possible visual, auditory, or audiovisual compound cues, which were always 10 s long. The trial types specified by these cues were selected from a pseudorandom list built with the constraint that no trial type could be presented more than three times in succession. On reinforced trials, the cues’ offset coincided with the delivery of a 0.04-cc bolus of sucrose, which remained available for 3 s, after which a short ITI followed (mean: 10 s; range: 5–15 s). As during shaping, failure to self-initiate a trial terminated the nose-port light after 20 s and led to a regular ITI period (mean: 10 s; range: 5–15 s). Passive rats received the same sequence of events (including the same trial types at the same time and in the same order) as their Agency counterparts, but in standard Pavlovian fashion (i.e., noncontingent on any response). For any yoked pair of rats, a session terminated once the Agency rat completed all scheduled trials or timed out after 90 min.
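
One way to build a pseudorandom list satisfying the run-length constraint described above is to shuffle the scheduled trials repeatedly until no trial type occurs more than three times in a row. The sketch below illustrates this rejection approach; it is not necessarily the algorithm used to generate the actual lists, and the trial labels, function names, and seed are hypothetical.

```python
import random


def max_run_length(seq):
    """Length of the longest run of identical consecutive items in seq."""
    longest = current = 0
    previous = object()  # sentinel that equals no trial label
    for item in seq:
        current = current + 1 if item == previous else 1
        previous = item
        longest = max(longest, current)
    return longest


def make_trial_list(counts, max_run=3, seed=None, max_attempts=100_000):
    """Shuffle the scheduled trials until no type repeats more than max_run times."""
    rng = random.Random(seed)
    trials = [name for name, n in counts.items() for _ in range(n)]
    for _ in range(max_attempts):
        rng.shuffle(trials)
        if max_run_length(trials) <= max_run:
            return list(trials)
    raise RuntimeError("no valid order found; relax max_run or raise max_attempts")


# Example with the Exp. 1 compound-phase trial counts (labels are hypothetical).
session = make_trial_list({"A+": 36, "B-": 36, "AX+": 12, "BY+": 12}, seed=0)
print(session[:10], max_run_length(session))
```

Rejection sampling keeps every admissible order equally likely, at the cost of discarding many shuffles when one trial type dominates the session.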

Since Agency rats self-paced their training, they could take breaks that elongated the effective average ITI. Such pauses were of course also applied to their yoked counterparts. Supplementary Table S1 provides the effective ITI durations as well as the total session durations for each experiment. Furthermore, granting agency over trial presentation necessarily entailed the risk that rats would not complete all scheduled trials within the imposed time limit of 90 min, in which case the session would time out. Once again, the use of a master-yoked procedure ensured that this issue affected both groups equally. Supplementary Table S2 lists all incomplete sessions for all studies presented.

Discrimination training

Experiment 1

Training comprised two phases (see table in Panel B, Fig. 1). In the first, pretraining phase, rats in both groups received 14 sessions of A(1) vs. B(0) discrimination training, where A and B were visual cues and the numbers in parentheses represent the probability of reward. One visual cue was constructed by flashing the two jewel lamps on the left wall alternately at a 2-Hz frequency (on for 0.25 s, off for 0.25 s), whereas the other was provided by the steady illumination of the white jewel lamp located on the right wall. These cues were counterbalanced and were presented 48 times each in a session.

The second, compound phase comprised 20 sessions, during which rats continued to receive A(1) and B(0) trials, presented 36 times each per session. In addition, AX(1) and BY(1) compound trials were introduced, where X and Y represent two auditory cues. These auditory cues were provided by a 12-kHz, 70-dB tone and a 70-dB white noise, counterbalanced. There were 12 presentations of each of the AX and BY compounds per session. From session 9 to the end of the compound phase, two probe trials with each of the target cues, X (i.e., the cue to be blocked) and Y (the control cue), were additionally administered. This increased the total number of trials per session in Phase 2 from 96 to 100.

Experiment 2

Training consisted of two phases (see table in Panel A, Fig. 2). In the first, pretraining phase, rats received 10 sessions of A(1) vs. B(0) and X(0.75) vs. Y(0.25) discrimination training, where A and B were the same visual cues and X and Y were the same auditory cues used in Exp. 1, also counterbalanced. Once again, the numbers in parentheses indicate the probability of reward. Each cue was presented 24 times in a session.

In the second, compound phase, the probability of reward for each cue trained in Phase 1 was maintained constant, but cue A was added to all trials in which X was reinforced, whereas B was added to all trials in which Y was not reinforced. Thus, the compound phase consisted of the following trial types: 10A(1), 10B(0), 30AX(1), 10X(0), 30BY(0), 10Y(1), where the coefficients represent the number of trials presented in a session (100 trials in total). Phase-2 training proceeded for 20 sessions.

Experiment 3

The study comprised two phases (see table in Panel A, Fig. 3). In the pretraining phase, rats received 8 sessions of A(1) vs. B(0) discrimination training, where A and B were the same visual cues used in Exp. 1. Each of these trial types was presented 48 times in a session.

In the second, compound phase, rats received 32 sessions of discrimination training. During this phase, A(1) vs. B(0) training continued, but audiovisual compounds AX(0.75) and BY(0.25) were added, with X and Y being the same auditory stimuli used in Exp. 1. Specifically, the compound phase consisted of the following training trials: 24A(1), 24B(0), 18AX(1), 6AX(0), 6BY(1), 18BY(0), where the coefficients denote the number of trials presented in a session. Starting on session 13, two probe trials with each of cues X and Y were interleaved with training trials in every session, raising the total number of trials per session from 96 to 100.

Experiment 4

Rats received a single phase of training consisting of 42 sessions with two concurrently trained nonlinear discrimination problems. One of these problems was an A(1), X(1), AX(0) negative-patterning discrimination, whereas the other was a B(0), Y(0), BY(1) positive-patterning discrimination. Cues A and B were the same visual cues, and X and Y were the same auditory cues used in the previous experiments, counterbalanced within modality. All trial types were presented 16 times in a session, making a total of 96 trials.
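
The per-session trial counts and reward probabilities given above for the four designs can be cross-checked with a short tally. The sketch below encodes only the training trials stated in the text (probe trials are left out); the data layout and function names are illustrative assumptions.

```python
# Each entry: (trial type, trials per session, rewarded trials per session).
# Probe trials are deliberately excluded from these tallies.
DESIGNS = {
    "Exp. 1 compound": [("A", 36, 36), ("B", 36, 0), ("AX", 12, 12), ("BY", 12, 12)],
    "Exp. 2 compound": [("A", 10, 10), ("B", 10, 0), ("AX", 30, 30),
                        ("X", 10, 0), ("BY", 30, 0), ("Y", 10, 10)],
    "Exp. 3 compound": [("A", 24, 24), ("B", 24, 0), ("AX", 24, 18), ("BY", 24, 6)],
    "Exp. 4": [("A", 16, 16), ("X", 16, 16), ("AX", 16, 0),
               ("B", 16, 0), ("Y", 16, 0), ("BY", 16, 16)],
}


def session_total(design):
    """Total number of training trials per session."""
    return sum(n for _, n, _ in design)


def reward_probability(design, cue):
    """P(reward | cue present), pooling every trial type that contains the cue."""
    trials = sum(n for name, n, _ in design if cue in name)
    rewarded = sum(r for name, _, r in design if cue in name)
    return rewarded / trials


for label, design in DESIGNS.items():
    print(label, session_total(design),
          reward_probability(design, "X"), reward_probability(design, "Y"))
```

In the Exp. 2 compound phase, for instance, X is rewarded on 30 of its 40 trials and Y on 10 of its 40, which preserves the X(0.75) and Y(0.25) contingencies of pretraining even though X is only rewarded in compound with A and Y only when presented alone.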

Behavioral measures

Conditioned responding was measured in both groups as the number of head entries into the sucrose magazine during the last 5 s of the 10-s cues. Focusing the analysis on the latter half of the cue has two advantages. First, it provides a cleaner measure of goal-tracking behavior, as sign-tracking behavior (which we did not measure, and which may have differed between the groups) tends to concentrate in the first half of a 10-s cue32. Second, it filters out any bias in conditioned behavior resulting from the fact that Agency and Passive rats began their trials at different locations in the chamber relative to the sucrose magazine. Indeed, whereas Agency rats necessarily had their snouts in the adjacent nose port at the time of cue onset, Passive rats were free to roam in the chamber and approach the magazine at all times, compromising any between-group comparison at the start of the cue period.
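
For concreteness, the measure can be sketched as a count of head-entry timestamps falling within the final 5 s of a cue. The timestamp format, the function name, and the treatment of the window boundaries (start included, cue offset excluded) are assumptions made for illustration; they do not describe the recording software's output.

```python
def count_late_entries(entry_times_s, cue_onset_s, cue_duration_s=10.0, window_s=5.0):
    """Count head entries during the last window_s seconds of a cue.

    The window includes its start and excludes the moment of cue offset
    (an assumed convention).
    """
    window_end = cue_onset_s + cue_duration_s
    window_start = window_end - window_s
    return sum(1 for t in entry_times_s if window_start <= t < window_end)


# Hypothetical head-entry timestamps (s) around a cue that starts at t = 100 s.
entries = [100.8, 103.2, 105.0, 106.4, 109.9, 110.0, 112.5]
late_count = count_late_entries(entries, cue_onset_s=100.0)
print(late_count)  # 3 (entries at 105.0, 106.4, and 109.9 s)
```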

Statistical analysis

To prepare for analysis, we averaged the number of magazine entries across trials in each session, producing a single response value for each subject and cue in that session. Where session blocks are reported, we then averaged these values over the appropriate number of sessions. These data were then analyzed using Group × Session (or Session-block) × Cue mixed ANOVAs, with the exception of Experiment S1, which used a Cue × Session-block repeated-measures ANOVA. All analyses were conducted using the GAMLj package for Jamovi66,67, which employs the Satterthwaite method for calculating degrees of freedom (https://gamlj.github.io/). The regression intercept of each subject was treated as a random effect to control for subject differences. Significant interactions were explored using either Bonferroni-corrected simple effects analyses or post hoc tests, both reported as t-values. Statistical tables are provided in the Supplementary Materials.
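
The two averaging steps that precede the analysis can be sketched as follows. This covers only the data preparation, not the mixed models themselves, which were fit in GAMLj; the record layout (subject, session, cue, entries), the 1-based session numbering, and the function names are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def session_means(trials):
    """Average per-trial entry counts into one value per (subject, session, cue)."""
    grouped = defaultdict(list)
    for subject, session, cue, entries in trials:
        grouped[(subject, session, cue)].append(entries)
    return {key: mean(values) for key, values in grouped.items()}


def block_means(per_session, block_size):
    """Average session values into blocks of block_size sessions (numbered from 1)."""
    grouped = defaultdict(list)
    for (subject, session, cue), value in per_session.items():
        block = (session - 1) // block_size + 1
        grouped[(subject, block, cue)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical records: (subject, session, cue, entries in the last 5 s of the cue).
records = [
    ("rat01", 1, "X", 2), ("rat01", 1, "X", 4),
    ("rat01", 2, "X", 6), ("rat01", 1, "Y", 1),
]
per_session = session_means(records)
per_block = block_means(per_session, block_size=2)
print(per_session)
print(per_block)
```

Averaging within sessions first gives each session equal weight in a block, regardless of how many trials of a given type the rat completed in that session.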