Chronic corticosterone improves perseverative behavior in mice during sequential reversal learning

Background: Stressful life events can both trigger development of psychiatric disorders and promote positive behavioral changes in response to adversities. The relationship between stress and cognitive flexibility is complex, and conflicting effects of stress manifest in both humans and laboratory animals. Objective: To mirror the clinical situation where stressful life events impair mental health or promote behavioral change, we examined the post-exposure effects of stress on cognitive flexibility in mice. Methods: We tested female C57BL/6JOlaHsd mice in the touchscreen-based sequential reversal learning test. Corticosterone (CORT) was used as a model of stress and was administered in the drinking water for two weeks before reversal learning. Control animals received drinking water without CORT. Behaviors in supplementary tests were included to exclude non-specific confounding effects of CORT and improve interpretation of the results. Results: CORT-treated mice were similar to controls on all touchscreen parameters before reversal. During the low accuracy phase of reversal learning, CORT reduced perseveration index, a measure of perseverative responding, but did not affect acquisition of the new reward contingency. This effect was not related to non-specific deficits in chamber activity. CORT increased anxiety-like behavior in the elevated zero maze test and repetitive digging in the marble burying test, reduced locomotor activity, but did not affect spontaneous alternation behavior. Conclusion: CORT improved cognitive flexibility in the reversal learning test by extinguishing prepotent responses that were no longer rewarded, an effect possibly related to a stress-mediated increase in sensitivity to negative feedback that should be confirmed in a larger study.


Introduction
Stress is a multi-level response to current threats that may potentially overwhelm the present state of an organism [1]. Stressful life events are a risk factor for the development of psychiatric disorders, and for the worsening of pre-existing ones, such as depression [2,3], anxiety [4], OCD [5], schizophrenia [6][7][8], and substance use disorder [9,10]. A shared trait across these disorders is impaired cognitive flexibility [11][12][13][14][15][16][17][18]. Reduced neurogenesis has been suggested as the link between stress, cognitive inflexibility, and development of mental disorder [19]. Conversely, stress can also serve as a helpful facilitator of change, by enabling rapid learning that, in the appropriate context, can promote positive behavioral changes in response to adversities [20,21].
This complex relation between stress and cognitive flexibility is also mirrored in rodent studies. A widely used translational rodent test of cognitive flexibility is the reversal learning task, which probes the ability to extinguish a previously learned response-reward contingency, while learning a new response-reward contingency [22]. In this task, various types of stressors have been shown to either impair [23][24][25][26][27][28][29][30][31][32][33] or improve [34][35][36] cognitive flexibility. Hence, the relationship between stress and cognitive flexibility requires further investigation.
This study investigated the effects of the rodent stress hormone corticosterone (CORT) on cognitive flexibility in the sequential reversal learning test [37], which was recently developed to better discriminate between the two main processes underlying successful reversal: extinction of the previously learned reward contingency and acquisition of the new reward contingency. We modelled stress by administration of CORT, which is a key physiological marker of the rodent stress response [38,39], via the drinking water, and subjected the mice to reversal learning after removal of CORT. There are various methods of modeling stress in mice; the CORT model is easy to control and implement, and it generally produces reliable results across laboratories [40]. Chronic CORT was originally developed as a rat model of depression [41], and was recently validated as a mouse model of depression [42]. In mice, this model produces lasting changes, including depression-like behavior, molecular changes in the brain, and altered neurotransmission in the prefrontal cortex [40,43], but its effects on reversal learning are relatively uncharted [33]. Other types of stressors have been shown in rodents to affect reversal learning after cessation of the stressor [27,30,31,34]. The lasting effects of CORT and other stressors in rodents are interesting in the context of psychopathology in humans, where stressful life events can cause psychological and behavioral changes long after the actual stress exposure [2,4,5,7,9,20,21]. This study therefore evaluated the post-exposure effects of CORT on reversal learning.

Animals
Female C57BL/6JOlaHsd mice (15 total) were purchased from Envigo and included in this study. The mice were 24-35 weeks old during behavioral testing. The mice were housed 3-4 mice per cage on a reversed 12/12 h light cycle with lights off at 7 a.m. The cages were individually ventilated GM500 Plus cages (Tecniplast) and were enriched with a plastic mezzanine-style mouse house, a plastic tube to serve as a second house, a paper rope ladder, a wooden chewing stick, and nesting material. The mice were food-restricted to 85-90% of free feeding body weight, and lab chow pellets were administered daily, after any behavioral testing. Regular tap water was provided ad libitum, except during the CORT treatment period where water bottles were replaced with vehicle or CORT bottles. Behavioral testing was performed five days per week between 8 a.m. and 12 p.m. Touchscreen testing was performed in the room where the mice were housed, while supplementary tests were performed in separate rooms, following at least one hour of acclimatization. All procedures were performed in accordance with the Directive 2010/63/EU of the European Parliament and of the Council of 22 September 2010 on the protection of animals used for scientific purposes and the Danish Law on animal experimentation LBK nr 474 of 15/05/2014, and the findings were reported according to ARRIVE guidelines. All efforts were made to minimize animal suffering and reduce the number of animals used.

Corticosterone treatment
Corticosterone (CORT) was purchased from Sigma-Aldrich (Søborg, Denmark) and was dissolved in limiting amounts of 99% ethanol and mixed with tap water to a final concentration of 0.1 mg/mL CORT and 1% ethanol according to a protocol by Moda-Sava et al. [43]. This corresponded to an average CORT dose of 29.7 mg/kg/day, based on measurements of water consumption. Water containing 1% ethanol was used as vehicle for the control group. CORT or vehicle solutions were administered to mice as the only water sources for 14 days after all animals passed behavioral training criteria. The solutions were replaced with regular tap water after the last refresher session, on the day before reversal learning so that animals did not receive CORT during the reversal learning phase of training (see Fig. 1).
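The conversion from solution strength and water consumption to a daily dose can be sketched as follows. This is a minimal illustration, not the authors' calculation: the function name, the water intake, and the body weight are hypothetical values, chosen only so that the result lands close to the reported average of 29.7 mg/kg/day.

```python
def cort_dose_mg_per_kg_per_day(concentration_mg_per_ml,
                                water_intake_ml_per_day,
                                body_weight_g):
    """Daily dose from solution strength, water intake, and body weight."""
    body_weight_kg = body_weight_g / 1000.0
    return concentration_mg_per_ml * water_intake_ml_per_day / body_weight_kg


# Hypothetical mouse (assumed values, not from the study):
# 25 g body weight, drinking 7.425 mL/day of the 0.1 mg/mL CORT solution.
dose = cort_dose_mg_per_kg_per_day(0.1, 7.425, 25.0)
print(round(dose, 1))
```

In practice, intake is usually measured per cage, so a per-animal dose of this kind is an average across cage mates.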

Behavioral testing
We trained the mice in the recently developed touchscreen-based reversal learning test, where images are presented individually in a sequential manner [37]. This way of presenting images makes it easier to discriminate extinction learning from acquisition of the new reward contingency that both occur during reversal learning, and the free-choice nature of the test may be more representative of real-life situations where organisms have the possibility of both action and inaction at every choice [44]. We trained the mice until successful visual discrimination before administering a 14-day treatment with CORT. After removal of CORT, the reward contingency was reversed, and mice were trained on the reversed schedule for 35 sessions to detect any post-exposure effects of CORT on reversal learning behavior. As the main endpoints of this study (behavior during reversal learning) were during the post-exposure phase of CORT, we included standard tests of anxiety, compulsive-like behavior, and motor activity during CORT administration to validate the efficacy of CORT and account for potential confounding effects on behavior. Behavior during the last refresher session of visual discrimination, when the animals still received CORT, was also included to explain and support findings during reversal learning. Fig. 1 displays a timeline of CORT administration and behavioral testing.

Touchscreen training
Touchscreen apparatus.
The touchscreen apparatus consisted of individually sound- and light-insulated boxes, each containing one touchscreen test chamber. The boxes were equipped with ventilation, cameras, light and sound stimuli, and infrared sensors for detection. The trapezoid chambers had stainless-steel grid floors and were equipped with a front-panel touchscreen (W × H: 24.5 × 18.5 cm) and a back-corner reward tray. During test runs, a lid was placed on the top of the chamber to prevent escape. The touchscreen was covered by a black acrylic mask with three square windows (W × H: 6.9 × 6.9 cm). Images were displayed and responses recorded at the central window, while the flanking windows remained black. Yazoo® No Added Sugar Strawberry Milk was dispensed in the reward tray by a peristaltic pump (0.025 mL/s rate) at 20 µL per reward. Reward dispensing was accompanied by an auditory signal (3 kHz, 1 s duration) and illumination of the reward tray. Incorrect responses were accompanied by illumination of the chamber.
The data were collected using WhiskerServer software (Cambridge University Technical Services, Cambridge, UK), and ABET II software (Campden Instruments Ltd., Leicester, UK) was used as a user interface to operate the chambers.
Fig. 1. Timeline of behavioral tests and corticosterone (CORT) treatment. CORT was administered on the day after the last animal passed discrimination training, after the first refresher session of visual discrimination. During the period of CORT treatment, the animals were tested in four additional behavioral tests: the spontaneous alternation behavior (SAB) test, the elevated zero maze (EZM) test, the locomotor activity (LA) test, and the marble burying (MB) test. Before removal of CORT bottles, all animals received their last visual discrimination refresher session. One day after removal of CORT the reward contingencies were reversed, and all animals had their first session of reversal learning. The red line represents CORT treatment. Numbers represent days since administration of CORT.

Habituation.
Habituation to handling, strawberry milk rewards, and test chamber was performed before any touchscreen training and lasted 11 sessions. The experimenter responsible for touchscreen training habituated the mice to handling by cupping each mouse three times for 20 s on 11 daily sessions [45]. To habituate to the reward, the mice received 3 mL Yazoo® No Added Sugar Strawberry Milk reward in a petri dish in their home cage on sessions 5-10. On the last session, the animals were placed in the touchscreen apparatus and allowed to explore the environment for 30 min. One 20 µL milk reward was provided in the reward chamber, but no images were presented on the screen.

Pre-training.
The animals were trained to touch the touchscreen to obtain rewards, before proceeding to discrimination training. Initially, this involved responding to a white square presented in the central window within the stimulus duration (SD) of 10 s (touch training). Then the image was replaced with a snowflake, and the SD was reduced to 5 s (reaction training) to ensure that animals responded quickly enough to qualify for discrimination training. All sessions lasted 30 min and the criterion for passing each stage was 40 hits in a session.

Visual discrimination.
The mice were trained to discriminate between two images, S+ and S- ("Left" and "Right" diagonal stripes), on daily sessions lasting 30 min. We assigned S+ and S- pairs by counterbalancing treatment group, housing, and order of testing to avoid any influence of image bias. During visual discrimination, the images were randomly presented for 5 s, with a 50/50% probability of S+ and S-, and a 5 s intertrial interval (ITI) between trials, during which a black screen was shown. If a mouse touched the S+, the response was recorded as a hit, and a 20 µL strawberry milk reward was delivered in the reward tray. After reward dispensing, there was an additional pause of 2 s (in addition to the ITI) to allow mice enough time to consume the reward. The animals could respond to images during a limited hold of 5.5 s (the SD of 5 s plus an extra 0.5 s to account for late responses). If the animal did not respond to the S+ within the limited hold period, the lack of responding was recorded as a miss, and a new image was presented after the ITI. No response to an S- was recorded as a correct rejection. If the animal touched the S-, a mistake was recorded and triggered a correction trial, which consisted of an ITI restart followed by a presentation of the same S- image. A response during a correction trial was recorded as a correction trial mistake and elicited another correction trial. This continued in a loop until the animal refrained from responding to the S-. Correction trials were not included in the count of regular trials. Therefore, a mistake during a correction trial did not count towards mistake rate but was defined as a perseverative response. The average number of correction trial mistakes per initial mistake was defined as the perseveration index. This parameter is mostly relevant for the reversal phase (see Reversal learning).
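The scoring rules described above can be sketched in a few lines. This is an illustrative sketch only: the function names and the label `correction_loop_ended` are hypothetical, and returning `None` when there are no initial mistakes is an assumption, since the text does not say how an undefined index was handled.

```python
def classify_response(stimulus, touched, correction_trial=False):
    """Label one trial according to the scoring rules described in the text."""
    if stimulus not in ("S+", "S-"):
        raise ValueError("stimulus must be 'S+' or 'S-'")
    if correction_trial:
        # Correction trials always re-present the S-.
        # 'correction_loop_ended' is a hypothetical label for a withheld response.
        return "correction_trial_mistake" if touched else "correction_loop_ended"
    if stimulus == "S+":
        return "hit" if touched else "miss"
    return "mistake" if touched else "correct_rejection"


def perseveration_index(correction_trial_mistakes, initial_mistakes):
    """Average number of correction trial mistakes per initial mistake."""
    if initial_mistakes == 0:
        return None  # assumption: index is undefined without initial mistakes
    return correction_trial_mistakes / initial_mistakes


print(classify_response("S-", touched=True))  # mistake
print(perseveration_index(6, 4))              # 1.5
```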
The criterion of passing visual discrimination was ≥ 70% average accuracy on two consecutive sessions, with ≥ 65% accuracy and ≥ 50% hit rate on both sessions (see Table 1). After passing visual discrimination, each animal received individually planned weekly refresher sessions to ensure that the learning of the reward contingency was maintained, while avoiding overtraining. If an animal scored less than 70% accuracy or 50% hit rate on a refresher session, it received another refresher session on the following day. When the last animal passed visual discrimination, all mice received a refresher session on the following day, and water bottles were replaced with CORT or vehicle solutions (day 0). During the CORT treatment period, the animals received weekly refresher sessions (on day 7 and day 14), with the last refresher session performed before removal of CORT bottles, on the day before reversal learning.
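As a minimal sketch of the passing criterion, the check below scans consecutive session pairs. The function name and the input layout (a list of accuracy and hit rate percentages per session) are assumptions made for illustration; the accuracy values themselves would come from the formulas in Table 1.

```python
def first_passing_session(sessions, mean_accuracy=70.0,
                          min_accuracy=65.0, min_hit_rate=50.0):
    """Return the 1-based number of the session on which criterion is met.

    sessions: list of (accuracy %, hit rate %) tuples in chronological order.
    Criterion: two consecutive sessions averaging >= 70% accuracy, with
    >= 65% accuracy and >= 50% hit rate on both sessions.
    Returns None if the criterion is never met.
    """
    for i in range(1, len(sessions)):
        (acc_a, hit_a), (acc_b, hit_b) = sessions[i - 1], sessions[i]
        if ((acc_a + acc_b) / 2 >= mean_accuracy
                and min(acc_a, acc_b) >= min_accuracy
                and min(hit_a, hit_b) >= min_hit_rate):
            return i + 1
    return None


# Hypothetical animal: criterion is met on the third session.
print(first_passing_session([(60, 80), (66, 55), (74, 60), (80, 45)]))  # 3
```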

Reversal learning.
We reversed the reward contingency so that the previous S+ was now S- and vice versa, and trained the mice on reversal learning following the same schedule as visual discrimination. All animals were trained for 35 sessions, irrespective of when they passed reversal learning criteria, which were the same as the criteria for visual discrimination (≥70% average accuracy on two consecutive sessions, with ≥65% accuracy and ≥50% hit rate on both sessions). The main endpoint was trials to pass reversal learning criterion, and secondary endpoints were behaviors during the reversal learning phase: hit rate, mistake rate, perseveration index, premature response rate, and response criterion. Latencies to collect rewards and respond to correct and incorrect images were also recorded to deepen the understanding of behavior.

Spontaneous alternation behavior test.
The spontaneous alternation behavior (SAB) test was included to evaluate perseverative behavior and deficits in working memory. We performed the test on day 10 of CORT treatment in a red-lit room to reduce the influence of anxiety, as previously described [46]. The mouse was placed in one of the arms of the maze and was allowed to freely explore the three-armed maze (L × W: 40 cm × 7 cm) for 10 min. A treatment-blinded observer recorded arm visits during the test period. The mice were considered to make an arm visit when they entered an arm with all four paws. Consecutive arm visits to three different arms were considered a correct alternation. Alternation rate was calculated according to Table 1 from the number of correct alternations relative to the maximum possible number of correct alternations and served as a measure of compulsive-like behavior [46] and working memory [47]. The maze was cleaned with water and paper towels between animals.
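The alternation rate can be illustrated with a short sketch. Since Table 1 is not reproduced here, the sketch assumes the common convention that the maximum possible number of alternations equals the number of arm visits minus two (one candidate triplet per visit from the third onward); the function name and arm labels are hypothetical.

```python
def alternation_rate(arm_visits):
    """Percentage of overlapping visit triplets that cover three different arms.

    arm_visits: sequence of arm labels in visiting order, e.g. "ABCAC".
    Returns None when fewer than three visits were made.
    """
    triplets = [arm_visits[i:i + 3] for i in range(len(arm_visits) - 2)]
    if not triplets:
        return None
    correct = sum(1 for triplet in triplets if len(set(triplet)) == 3)
    return 100.0 * correct / len(triplets)


print(alternation_rate("ABCABC"))  # 100.0
print(alternation_rate("ABABAB"))  # 0.0
```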

Elevated zero maze test.
The elevated zero maze (EZM) test was performed to evaluate the effect of CORT on anxiety-like behavior. We performed the test on day 11 of CORT treatment as previously described [48]. The EZM test was performed using a circular plexiglass runway consisting of two open and two closed quadrants. The runway was situated 50 cm above floor level and the inner diameter was 47 cm. Experiments were performed under regular light (400 lux) and behavior was recorded for 5 min by a camera with the experimenter situated at least 2 m from the maze. The number of stretch-attend postures (SAPs) [49], head dips (HDs) [50], the latency to enter the open quadrants, the total number of entries, and the total time spent in open quadrants of the maze (TIO) served as measures of anxiety-like behavior and were scored from the video by a treatment-blinded observer. These five behaviors were combined into one Z-score that served as a combined measure of anxiety (see Table 1). Mice were considered to have entered an open quadrant when all four paws were in the open quadrant. The maze was cleaned with water and paper towels between animals.
Table 1. Formulas used to calculate composite behavioral parameters used in this study.

Locomotor activity test.
The locomotor activity (LA) test was performed to assess gross effects of CORT on motor activity. We performed the test on day 12 of CORT treatment as previously described [51]. The LA test was performed using empty transparent cages (L × W × H: 42.5 × 26.5 × 18 cm) on a white background in a dimly lit room (50-100 lux). A camera mounted above the cages and coupled to EthoVision XT (Noldus) software recorded motor activity for 45 min. The total distance traveled served as a measure of locomotor activity. The cages were cleaned with water and paper towels between animals.

Marble burying test.
The marble burying (MB) test was performed to assess the effects of CORT on compulsive-like behavior [52]. We performed the test on day 13 of CORT treatment as previously described [51]. The MB test was performed using transparent cages (L × W × H: 42.5 × 26.5 × 18 cm) filled with ~5 cm sawdust in a dimly lit room (100-200 lux). A second cage was placed bottom-up as a lid to prevent escape. We placed twenty glass marbles with equal distance in a 4 × 5 pattern, keeping a distance of at least 2 cm from the borders of the cage. A treatment-blinded observer counted the number of marbles visible after 30 min. The number of marbles buried served as a measure of compulsive-like behavior.

Data analysis
Prior to analysis, we checked all data for normality (by QQ plot) and variance homogeneity, as they are model assumptions of the parametric analyses used in this study (ANOVA and t-test). Perseveration index in the reversal learning test was square root transformed, while total distance in the LA test and latency to enter open quadrants in the EZM test were log10 transformed prior to analysis to comply with model assumptions. Composite behavioral parameters were calculated according to Table 1. When calculating composite behavioral parameters across several reversal learning sessions, equal weights were given to the number of responses and not the number of sessions. It is not mathematically possible to calculate response criterion when hit or mistake rates are 0 or 100%. To be able to calculate response criterion we therefore added one hit, one miss, one mistake, and one correct rejection to all mice on all sessions, and thereby computed hit and mistake rates that could not reach the natural limits of 0% and 100%. These modified numbers were only used to calculate the response criterion. The number of hits, mistakes, misses, and correct rejections used to calculate all other parameters were not changed.
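The correction described above can be sketched as follows. The sketch assumes the standard signal detection formula for the response criterion, c = −0.5 × (z(hit rate) + z(mistake rate)), which may differ from the exact expression in Table 1; the function name is hypothetical. It shows why the added counts are needed: the inverse normal function z is undefined at rates of exactly 0 or 1.

```python
from statistics import NormalDist


def response_criterion(hits, misses, mistakes, correct_rejections):
    """Response criterion with one count added to each of the four outcomes.

    Assumes c = -0.5 * (z(hit rate) + z(mistake rate)); positive values
    indicate a conservative (low-responding) strategy.
    """
    # Adding one hit and one miss keeps the hit rate strictly between 0 and 1;
    # the same applies to mistakes and correct rejections.
    hit_rate = (hits + 1) / (hits + misses + 2)
    mistake_rate = (mistakes + 1) / (mistakes + correct_rejections + 2)
    z = NormalDist().inv_cdf
    return -0.5 * (z(hit_rate) + z(mistake_rate))


# An animal that never responds would give rates of 0 without the correction;
# with it, the criterion is finite and positive (conservative).
print(response_criterion(hits=0, misses=20, mistakes=0, correct_rejections=20) > 0)
```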
Behavioral readouts in the EZM test were normalized into one Z-score of anxiety-like behavior, as this is more reliable than looking at individual readouts in isolation [53]. Behaviors that had high values for anxiety-like behavior (SAPs and latency to enter open quadrants) were used in their original form, while behaviors that had low values for anxiety-like behavior (entries, head dips, time in open) were inverted by multiplying by −1 so that high Z-values represent more anxiety. The results were normalized to the control group. Equal weights were given to the five behaviors in Z-score normalization.
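A minimal sketch of this kind of composite Z-score is given below. It is an assumption-laden illustration, not the formula from Table 1: it standardizes each behavior against the control group's mean and sample standard deviation, flips the sign of the three inverted behaviors, and averages the five values with equal weights. All names and data values are hypothetical.

```python
from statistics import mean, stdev

# +1 keeps the sign (high value = more anxiety); -1 inverts it.
DIRECTION = {
    "saps": 1,
    "latency_log10": 1,
    "entries": -1,
    "head_dips": -1,
    "time_in_open": -1,
}


def anxiety_z_scores(animals, control_ids):
    """animals: {animal_id: {behavior: value}}; returns {animal_id: composite Z}."""
    per_animal = {animal_id: [] for animal_id in animals}
    for behavior, sign in DIRECTION.items():
        control_values = [animals[cid][behavior] for cid in control_ids]
        mu, sd = mean(control_values), stdev(control_values)  # sample SD (assumption)
        for animal_id, values in animals.items():
            per_animal[animal_id].append(sign * (values[behavior] - mu) / sd)
    return {animal_id: mean(zs) for animal_id, zs in per_animal.items()}


# Hypothetical data: three control animals and one treated animal.
data = {
    "c1": {"saps": 4, "latency_log10": 1.0, "entries": 10, "head_dips": 12, "time_in_open": 60},
    "c2": {"saps": 6, "latency_log10": 1.2, "entries": 8, "head_dips": 10, "time_in_open": 50},
    "c3": {"saps": 5, "latency_log10": 1.1, "entries": 9, "head_dips": 11, "time_in_open": 55},
    "t1": {"saps": 9, "latency_log10": 1.5, "entries": 4, "head_dips": 5, "time_in_open": 20},
}
scores = anxiety_z_scores(data, control_ids=["c1", "c2", "c3"])
print(scores["t1"] > 0)
```

By construction, the control group's composite scores average to zero, so a positive group mean in treated animals reads directly as more anxiety-like behavior than controls.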
We recorded data during reversal learning on daily sessions and grouped the data into low and high accuracy stages of reversal learning based on the accuracy condition. Sessions with < 50% accuracy were labeled low accuracy-stage, while sessions with ≥ 50% accuracy were labeled as high accuracy-stage reversal learning. The labeling of sessions was individual for each mouse, as the mice had different learning rates. We analyzed behavior across all sessions by Repeated Measures (RM) ANOVA with CORT treatment as the independent factor and session as the repeated factor. Unpaired t-tests were used to analyze behaviors during low and high accuracy stages of reversal learning, behaviors in the additional tests, and behaviors on the last session of visual discrimination. Trials to reach reversal learning criterion were analyzed by the Mantel-Haenszel survival test. Results were considered significant when p < 0.05, but non-significant trends with 0.05 < p < 0.10 were also reported. All statistical analyses were performed in InVivoStat version 4.0.1 [54].
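The stage labeling and the response-weighted pooling described in this section can be sketched together. The sketch below uses the perseveration index as the example parameter; the function names and the session dictionary layout are hypothetical, and the session data are invented for illustration.

```python
def label_stage(accuracy_percent):
    """Sessions below 50% accuracy are 'low' stage; all others are 'high'."""
    return "low" if accuracy_percent < 50.0 else "high"


def pooled_perseveration_index(sessions):
    """Perseveration index per stage, pooling counts across sessions.

    sessions: list of dicts with keys 'accuracy', 'mistakes', and
    'correction_trial_mistakes'. Counts are summed within each stage before
    dividing, so sessions with more responses carry more weight.
    """
    totals = {"low": [0, 0], "high": [0, 0]}
    for session in sessions:
        stage = label_stage(session["accuracy"])
        totals[stage][0] += session["correction_trial_mistakes"]
        totals[stage][1] += session["mistakes"]
    return {stage: (ctm / m if m else None) for stage, (ctm, m) in totals.items()}


# Hypothetical sessions for one mouse.
sessions = [
    {"accuracy": 30, "mistakes": 10, "correction_trial_mistakes": 20},
    {"accuracy": 45, "mistakes": 30, "correction_trial_mistakes": 30},
    {"accuracy": 50, "mistakes": 5, "correction_trial_mistakes": 2},
]
print(pooled_perseveration_index(sessions))  # {'low': 1.25, 'high': 0.4}
```

In this example the two low accuracy sessions have indices of 2.0 and 1.0, so a per-session average would give 1.5, whereas pooling by responses gives 1.25.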

Corticosterone did not affect trials to reversal criterion
The mice were all trained on the reversed schedule for 35 sessions, irrespective of when they passed the reversal learning criterion (≥70% average accuracy on two consecutive sessions, with ≥65% accuracy and ≥50% hit rate on both sessions). One mouse from the CORT group and two mice from the control group did not pass reversal learning within the timeframe of the experiment. Survival analysis showed no significant difference between treatment groups on the trials required to pass reversal criterion (χ² = 0.02; p = 0.879) (Fig. 2a). RM ANOVA of accuracy over time showed a significant main effect of session (F(34,442) = 53.22; p < 0.001), but no significant main effect of treatment (F(1,13) = 0.05; p = 0.823) or treatment by session interaction (F(34,442) = 0.82; p = 0.762) (Fig. 2b).

Corticosterone reduced perseverative responses during low accuracy reversal learning
Behavioral recordings during reversal learning over 35 sessions were divided into low and high accuracy reversal stages, as defined by a performance below or above 50% accuracy, respectively. CORT did not produce any lasting effects on behavior, except for a drop in perseveration index during low accuracy reversal (Fig. 4b), where perseveration is highest [55]. CORT numerically increased the response criterion (Figs. 4e and 4f), mainly through a gradual drop in mistake rate (Figs. 3c and 3d), which is used to calculate the response criterion, but neither effect was large or statistically significant. The pattern of behavior for vehicle mice is consistent with previous observations [37]. The animals exhibited highly conservative behavior after reversal, followed by increasingly liberal trial and error strategies until they reached accuracy around 50%, after which behavior was characterized by optimization of responses.

Hit rate
The RM ANOVA of hit rate across all sessions revealed a significant main effect of session (F(34,442) = 67.14; p < 0.001), but no main effect of treatment (F(1,13) = 0.28; p = 0.607) or treatment by session interaction (F(34,442) = 0.36; p = 1.000) (Fig. 3a). Unpaired t-tests similarly did not show any significant effects of CORT on hit rate during low accuracy (t(13) = 1.40; p = 0.184) or high accuracy (t(13) = 1.13; p = 0.277) stages of reversal (Fig. 3b).

Corticosterone did not affect response or reward retrieval latency
CORT did not significantly affect the latency to collect rewards or to respond at the correct image. Analysis of latencies to respond at the incorrect image revealed a non-significant increase in response time during low accuracy reversal, but no effect during high accuracy reversal (Table 2).

Corticosterone increased anxiety-like and compulsive-like behavior
The supplementary behavioral tests performed while the animals still received CORT revealed that CORT did not affect alternation rate in the SAB test, suggesting that working memory may be intact and that CORT did not lead to perseverative repetitive arm visits (t(13) = 0.02; p = 0.987) (Fig. 5a). CORT caused anxiety-like behavior in the EZM test, as reflected by a significantly higher Z-score of anxiety-like behavior (t(13) = −3.56; p = 0.004) (Fig. 5b). Analysis of base parameters in the EZM test showed that the anxiety-like effect of CORT was driven by fewer entries, less time in open, more stretch-attend postures, and fewer head dips, while CORT did not significantly increase latency to enter the open quadrants (Supplementary material I). CORT also decreased spontaneous locomotor activity in the LA test (t(13) = 2.72; p = 0.018) (Fig. 5c). Finally, CORT increased repetitive marble burying (t(13) = −2.23; p = 0.044) (Fig. 5d).

Corticosterone did not affect behavior during visual discrimination
To test if CORT caused non-specific effects on touchscreen chamber activity that could confound the reversal learning results, we investigated behavior on the last refresher session of visual discrimination. On this session, the animals were exposed to a well-known environment with the expected reward contingency. The animals were still under the acute influence of CORT on this session and had received it for 14 days at this time point.

Discussion
We tested the post-exposure response to two weeks of CORT treatment in the reversal learning test. Behavioral measures during visual discrimination and in supplementary behavioral tests were used to better understand the results collected during reversal learning. During administration, CORT promoted anxiety-like behavior in the EZM test and repetitive digging in the MB test but did not affect touchscreen chamber activity during visual discrimination on any parameter. Surprisingly, after removal of CORT and reversal of the reward contingency, we found that CORT-treated mice showed fewer perseverative responses during the low accuracy phase of reversal learning, suggestive of improved extinction learning.

Corticosterone reduced perseverative responding during low accuracy reversal
We did not detect any lasting changes in overall learning of the new reward contingency as represented by accuracy or trials to reach reversal criterion. Similarly, CORT did not affect the rate of first-touch correct (learning of the new reward-contingency) and incorrect responses (un-learning of the old reward-contingency) or premature response rate, which served as a measure of waiting impulsivity [56,57]. However, CORT reduced the perseveration index, which is a measure of perseverative responses [55]. Perseveration index is, mathematically, the ratio of correction trial mistakes to the number of first-choice mistakes, and animals with high perseveration index scores perform many repetitive mistakes at the same image in the correction trial loop, indicating that they fail to extinguish prepotent responses at the previously correct image. We recently found that this type of repetitive responding relates to impulsive-like behaviors like premature responses at a blank screen, short incorrect response latency, and motor hyperactivity [37], suggesting that a high perseveration index represents the inability to inhibit initiation of learned responses that are no longer beneficial. CORT also caused a non-significant increase in the latency to respond at incorrect images, which is consistent with the corresponding decrease in perseveration index, as these behaviors are inversely correlated [37], and represents a hesitancy to show prepotent responding at the previously correct image.
The drop in perseverative errors is consistent with a similar study in mice that were exposed to stress before the reversal learning stage [34], but that study found decreased perseverative errors primarily during the late stage of reversal learning. Furthermore, they also found a reduction in mistakes and trials to reach the reversal criterion. The main methodological differences between our studies lie in the reversal schedule (pairwise vs. sequential image presentation), type of stressor (three days of swim stress vs. CORT) and sex of the animals (males vs. females). Although the stimulus presentation in the reversal learning test is different, we have previously found that results obtained with the new schedule are comparable to those in the pairwise test [37]. Considering that the effect observed during low accuracy reversal disappears over time, our results indicate that the downstream effects of CORT gradually wear off across the seven weeks of reversal learning. CORT, especially when administered orally, is a relatively mild type of stressor [40] compared to repeated swim sessions. Mice previously exposed to swim stress may learn to associate handling by the experimenter with the stressful situation, while mice in our study that received CORT-containing drinking water did not learn this association. A potential difference in the experimenter-stress association would mean that the animals in the swim stress study were exposed to a stressful situation before each test session, when animals are retrieved from the home cage and transferred to the test chamber. Finally, there is evidence that sex hormones can interact with the stress response and cause differences between males and females, which should also be considered in the evaluation of the results [58].
In direct contrast to our findings, Dieterich and colleagues recently tested chronic CORT in mice in a lever-press reversal learning paradigm that involved serial reversals and a probabilistic reinforcement schedule and found that CORT impaired reversal learning performance [33]. A lower dose was used in that study (approx. 1/6 of the dose used in the present study), but for a period that lasted throughout reversal. Other groups also find that chronic CORT can impair reversal learning in rats in the Morris water maze [27,32]. Hence, it appears that effects of CORT on reversal learning are sensitive to differences in dose, schedule, and species.
Had the animals continued to receive CORT during the test period, it might also have affected measures other than perseveration index, such as mistake rate and response criterion, on which CORT-treated animals appeared to make slightly fewer mistakes and adopt a more conservative response strategy.

Corticosterone caused anxiety-like and repetitive behavior but did not induce cognitive or motivational deficits
Non-specific effects on cognition, motivation, or general activity of the animals could potentially cause false positive or negative results in the reversal learning test. For example, sedated animals or animals that lack motivation to participate in the task would show reduced responding to the images, which could cause a drop in perseveration index without necessarily affecting compulsive tendencies. Therefore, behaviors in other tests and during visual discrimination were included in this study. The responses in these tests also verified the effect of CORT on anxiety-related measures and helped understand the behavior during reversal learning.
Fig. 4. Effects of corticosterone (CORT) on perseveration index across all sessions (a) and during low and high accuracy stages of reversal learning (b), premature response rate across all sessions (c) and during low and high accuracy stages of reversal learning (d), and response criterion across all sessions (e) and during low and high accuracy stages of reversal learning (f). CORT significantly reduced perseveration index during the low accuracy stage of reversal learning while a nonsignificant trend was observed across all sessions. No significant effects were observed on premature response rate and response criterion. Data are presented as means with corresponding S.E.M. on the original scale. Dots represent individual animal values. **p < 0.01, statistically significant difference from vehicle control group. n = 7-8.
Chronic administration of CORT promoted anxiety-like behavior in the EZM test and repetitive digging in the MB test, which aligns with other reports of this type of stressor in mice [33,59-61]. Repetitive marble burying is often described as a compulsive-like behavior [52], and the increased digging by CORT mice conflicts with the anti-compulsive effect in reversal learning. Notably, the results from the SAB test do not suggest that CORT caused perseverative behavior, i.e., a reduction in alternation rate below the chance level of 50% [46]. Based on this, we suggest that the increased digging by CORT-treated mice reflects a type of repetitive behavior that appears to be unrelated to the repetitive responses expressed as perseveration index [37]. Compulsive-like digging behavior by CORT mice could reflect an increased sensitivity to aversive situations, as rodents exhibit defensive burying behavior in response to aversive situations [62]. In accordance with previous findings [33], CORT-treated mice also showed slightly reduced spontaneous locomotor activity, a reduction that is unlikely to confound the results in the reversal learning test.
Despite the reduced locomotor activity, analysis of data recorded on the last refresher session during visual discrimination shows that the behavior of CORT animals was similar to their vehicle-treated controls on all parameters. The CORT mice were equally active in the chamber and equally good at responding at the correct image. Furthermore, CORT did not affect the latency to respond at images, which is a measure of motivation [63], or latency to collect rewards, which relates to the perceived hedonic value of the reward [64].
Based on these supplementary results, we conclude that CORT-treated mice were equally motivated and able to perform the test, and that CORT did not cause behavioral deficits that non-specifically affected reversal learning measures.

Stress as a facilitator of change
We found that CORT-treated mice had higher levels of anxiety-like behavior and repetitive digging behavior, which could potentially influence their response to changes in the reward contingency. Reversal learning requires both extinction of the previously learned response strategy and simultaneous exploration of alternative options [22,65]. While exploration is driven by unexpected reward of the new correct responses and resulting associative learning, extinction relies on encoding of negative prediction errors when the expected reward is not delivered [66]. The salience of prediction errors depends on both the frequency [67] and magnitude of the reward [65,68]. We found that the drop in perseveration index caused by CORT was specific to the reversal learning phase and not related to a non-specific deficit in chamber activity, which suggests improved extinction learning. The affective bias test can detect positive or negative biases in laboratory animals. Rats treated with CORT during the affective bias test respond in a pessimistic manner during the following preference test, which indicates a deficit in reward processing [69,70]. Likewise, high-anxious mice have been shown to process ambiguous stimuli in a negative way in the judgement bias test, compared to low-anxious mice [71], and similar deficits in processing of ambiguous cues have been reported in rats treated with glucocorticoids [72,73]. We observed that the CORT-treated mice showed an increase in anxiety-like behavior in the EZM test and increased digging in the MB test. The EZM test is a conflict-based test in which animals balance their drive to explore the new maze against an aversion to the risky open areas of the maze [74]. In the MB test, the animals are exposed to a novel environment that they may find aversive, and the repetitive digging may reflect a coping strategy to temper the aversiveness [62,74,75].
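The dependence of extinction on negative prediction errors can be illustrated with a minimal delta-rule sketch. This is not a model fitted in this study: the function name `trials_to_extinguish`, the learning rate `alpha_neg`, the starting value, and the threshold are all hypothetical choices made for illustration. The sketch only shows that, under these assumptions, a larger weight on negative prediction errors makes the value of a no-longer-rewarded response decay in fewer trials.

```python
def trials_to_extinguish(alpha_neg, v0=1.0, threshold=0.5):
    """Count unrewarded trials until the learned value drops below threshold.

    alpha_neg: hypothetical learning rate applied to negative prediction errors.
    v0:        assumed value of the previously rewarded response at reversal.
    threshold: assumed value below which the response counts as extinguished.
    """
    if not 0.0 < alpha_neg <= 1.0:
        raise ValueError("alpha_neg must be in (0, 1]")
    value = v0
    trials = 0
    while value >= threshold:
        prediction_error = 0.0 - value         # expected reward is omitted
        value += alpha_neg * prediction_error  # delta-rule update
        trials += 1
    return trials


# Two illustrative sensitivities to negative feedback (arbitrary values).
baseline = trials_to_extinguish(alpha_neg=0.1)
sensitive = trials_to_extinguish(alpha_neg=0.2)
print(baseline, sensitive)
```

With these arbitrary parameters, the more sensitive learner needs fewer unrewarded trials to abandon the old response, which is the qualitative pattern that a reduced perseveration index would correspond to.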
Table 2. Effects of corticosterone (CORT) on latencies to collect rewards, and to respond at correct and incorrect images during low and high accuracy stages of reversal learning. CORT did not affect latencies, except for a non-significant increase in incorrect response latency during the low accuracy stage of reversal learning. Data are presented as means with corresponding S.E.M. n = 7-8.
CORT increased measures of anxiety- and compulsive-like behavior in the EZM and MB tests, which could suggest that the animals were more sensitive to aversive situations. Although there is a difference between acute effects on anxiety-like behavior and post-exposure effects on reversal learning, the acute response to CORT can partly explain behavior during the reversal phase. We therefore propose that the drop in perseverative responding by CORT mice during reversal learning could reflect an increased sensitivity to negative prediction errors that facilitated extinction learning. In conflict with our hypothesis is a study by Dieterich et al. that evaluated the effects of CORT on probabilistic reversal learning [33]. Analysis of choices after individual wins or losses found no effects of CORT on win-stay or lose-shift strategies, which would be expected if CORT changes the sensitivity to negative feedback.
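For readers unfamiliar with win-stay and lose-shift measures, the sketch below shows one common way to compute them from trial-by-trial data. It is an illustration only and not necessarily the exact analysis used in [33]; the function name `win_stay_lose_shift` and the example data are hypothetical.

```python
def win_stay_lose_shift(choices, rewarded):
    """Return (P(stay | win), P(shift | loss)) for one animal.

    choices:  sequence of chosen options, one per trial (e.g. "A" or "B").
    rewarded: sequence of booleans, True if that trial was rewarded.
    The last trial is not scored because it has no following choice.
    """
    if len(choices) != len(rewarded):
        raise ValueError("choices and rewarded must have the same length")
    wins = stays_after_win = 0
    losses = shifts_after_loss = 0
    for t in range(len(choices) - 1):
        stayed = choices[t + 1] == choices[t]
        if rewarded[t]:
            wins += 1
            stays_after_win += int(stayed)
        else:
            losses += 1
            shifts_after_loss += int(not stayed)
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift


# Made-up example session, not data from this study.
example_choices = ["A", "A", "B", "B", "A", "A"]
example_rewarded = [True, False, False, True, True, False]
ws, ls = win_stay_lose_shift(example_choices, example_rewarded)
print(ws, ls)
```

Under the hypothesis proposed here, an increased sensitivity to negative feedback would be expected to appear as a higher lose-shift proportion, with win-stay left unchanged.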

Strengths and limitations
This study was designed to investigate the post-exposure effects of a period of chronic stress on cognitive flexibility and included supplementary behavioral measures to verify the pharmacological treatment, elucidate possible confounders, and aid interpretation of the results. Due to the small sample size, we only saw non-significant decreases in mistake rate that corresponded with complementary increases in the response criterion towards a more conservative response strategy. Although first-choice mistakes differ from perseverative mistakes, mistake rate relates to perseveration index and is driven by a common behavioral phenotype [37]. A larger experiment may have shown effects on mistake rate and response criterion. It is also critical to consider the risk of false positives with small sample sizes, although this seems unlikely considering the low p-value and general trend of reduced perseveration in CORT animals. The effect of CORT was restricted to the low accuracy phase of reversal learning, and there was no effect on overall acquisition of the new reward contingency, which is possibly related to a decay in effect across the seven weeks of training. Future studies where CORT is administered daily throughout the reversal period would provide better insight into the effects of CORT during high accuracy stages of reversal learning and show how continuous administration differs from post-exposure. Although this study included behavioral measures of anxiety-like and repetitive behavior that strengthened the interpretation of results, specific tests of decision-making and reward processing, such as the affective bias test [69] or the judgement bias test [71], would have provided a more exhaustive characterization. Finally, it would have strengthened this study and its interpretation if the experiments had been performed in both sexes, as sex hormones can interfere with stress responses [58].

Conclusion
This study examined the effects of a period of chronic mild stress on future cognitive flexibility in the sequential reversal learning test. We found that, after removal of the stressor, the mice showed fewer perseverative responses at the previously rewarded image, which suggests improved extinction learning, possibly due to an enhanced sensitivity to negative prediction errors during reversal learning. Due to the small size of this study, these findings need to be confirmed in further investigations.

Data Availability
Data will be made available on request.

Appendix A. Supporting information
Supplementary data associated with this article can be found in the online version at doi:10.1016/j.bbr.2023.114479.