Introduction

Cognitive inflexibility is widely associated with depression (Dickstein et al, 2010), schizophrenia (Leeson et al, 2009), obsessive-compulsive disorder (OCD) (Chamberlain et al, 2006; Remijnse et al, 2006), and addiction (Ersche et al, 2008). The capacity to flexibly switch responding to changing stimulus-response (S-R) contingencies is widely assessed using reversal learning procedures, for example, in humans (Fellows and Farah, 2003; Murphy et al, 2002), non-human primates (Butter, 1969; Clarke et al, 2007; Dias et al, 1996; Groman et al, 2013), and rodents (Boulougouris et al, 2007; Chudasama and Robbins, 2003; McAlonan and Brown, 2003). Effective reversal learning requires a new S-R contingency to be learnt while ignoring competing interference from a previously learnt response. A failure to suppress previously learned responses is expressed behaviorally as increased response perseveration (Iversen and Mishkin, 1970).

Convergent evidence indicates that reversal learning is modulated by orbitofrontal–striatal mechanisms (Roberts, 2011). The orbitofrontal cortex (OFC) receives a dense serotonergic innervation from the dorsal raphé nucleus (DRN) and, in turn, provides regulatory input to the DRN (Azmitia and Segal, 1978; Peyron et al, 1998; Santana et al, 2004). In humans, the OFC is selectively activated during reversal learning (Hampshire and Owen, 2006) and damage to this region disrupts reversal learning in experimental animals (Bissonette et al, 2008; Boulougouris et al, 2007; Burke et al, 2009; Dias et al, 1996; Fellows and Farah, 2003). In contrast, a recent study by Rudebeck et al (2013) found that excitotoxic, fiber-sparing lesions of the macaque OFC had no effect on reversal learning performance. The basis for this discrepancy is unclear but may reflect cross-species differences in OFC anatomy and function together with variation in the methods used to assess reversal learning in different species. A role for 5-HT in reversal learning is substantiated by studies in humans involving dietary tryptophan depletion (Rogers et al, 1999) and in experimental animals depleted of 5-HT, both globally (Mobini et al, 2000) and locally in the OFC (Clarke et al, 2004). In rats, 5-HT2A and 5-HT2C receptors bidirectionally modulate reversal learning (Boulougouris et al, 2008), putatively at the level of the OFC (Boulougouris and Robbins, 2010). Research also links the dorsomedial striatum (DMS) and its dopamine (DA), but not 5-HT, innervation to reversal learning (Castane et al, 2010; Clarke et al, 2011; O'Neill and Brown, 2007). Optimal DA levels in the striatum are associated with improved reversal learning (Clatworthy et al, 2009; Cools et al, 2009), and in non-human primates flexible behavior depends in part on 5-HT and DA interactions in the OFC and striatum (Groman et al, 2013).

Recent evidence indicates that gene products associated with the metabolism and transport of 5-HT and DA may have a role in behavioral flexibility. Thus, variants of the 5-HT transporter (5-HTT) gene, SLC6A4, and of the dopamine transporter (DAT) gene, SLC6A3, predict reversal learning performance in humans (den Ouden et al, 2013), and Slc6a4-deletion mice more rapidly reverse visual discriminations than their unaffected littermates (Brigman et al, 2010). However, less is known about how the activity of the two isoforms of monoamine oxidase (MAO-A and MAO-B), tyrosine hydroxylase (TH), and tryptophan hydroxylase (TPH2) influences reversal learning, despite the key role of these enzymes in the synthesis and degradation of biogenic amines (Shih and Thompson, 1999). By controlling 5-HT and DA homeostasis, MAO, TH, and TPH2 may critically regulate flexible, goal-directed behavior.

Here we investigated the relationship between interindividual variation in spatial reversal learning in rats and the natural heterogeneity that exists in 5-HT and DA functional markers in the OFC and DMS. Specifically, we tested the hypothesis that MAO, TH, and TPH2 dysfunction in orbitofrontal–striatal circuitry may be linked to individual variation in spatial-discrimination serial reversal learning.

Materials and Methods

Subjects

Subjects were 192 male Lister-hooded rats (Charles River, Kent, UK), weighing 250–300 g at the start of the experiment, and maintained at 85–95% of their free-feeding weight. Water was available ad libitum. Animals were group-housed, four per cage, and kept under a reversed light/dark cycle (white lights on/red light off from 1900 to 0700 hours). Testing took place between 0800 and 1600 hours. Four cohorts of rats were used for this study, each comprising 48 animals. These were destined for systemic drug administration (cohort 1), post-mortem monoamine analysis (cohort 2), in vitro autoradiography (cohort 3), and quantitative reverse transcription-polymerase chain reaction (qRT-PCR) analysis (cohort 4). Cohorts 2–4 consisted of drug-naive animals only. All experiments were carried out in accordance with the UK Animals (Scientific Procedures) Act 1986. Ten subjects were excluded from the study (four animals each from cohorts 2 and 3, and one animal each from cohorts 1 and 4) because they failed to acquire the spatial discrimination during the acquisition phase of the task, as described below. In cohort 4, the posterior section of the brain was lost from two animals; these were excluded from the analysis of MAO expression in the DRN and VTA.

Behavioral Apparatus

Testing was carried out in twelve 5-hole operant chambers (Med Associates, Georgia, VT), each enclosed in a sound-attenuating box fitted with a fan for ventilation and masking of external noise. An array of five square nose-poke holes was set in the curved wall of each box. An infrared detector was positioned across each nose-poke aperture. A yellow light-emitting diode stimulus light was located at the rear of each aperture. A food magazine, into which rodent food pellets (TestDiet; Purina, UK) were delivered, was located on the adjacent wall. The three inner apertures of the chamber were blocked using metal inserts, so only the two outermost holes remained unobstructed. The testing apparatus was controlled by Whisker Control software (Cardinal and Aitken, 2010).

Behavioral Training

Subjects were initially habituated to the test apparatus over two days, with each daily session lasting 20 min. During each session, the two stimulus lights, the house-light, and the magazine light were illuminated, and the food magazine was filled with pellets. After the habituation phase, animals were trained to nose poke in the magazine to trigger the illumination of the stimulus lights and to respond in the holes for food delivery. This phase of training took place successively in each hole under a fixed ratio-1 schedule of reinforcement (FR1) to a criterion of 50 correct trials in 20 min, and thereafter, under FR2 and FR3 schedules to the same criterion. This schedule was used to eliminate the possibility of random, accidental nose poke responses. Responses in the unrewarded hole were not punished, but omission errors resulted in a 5 s time-out period during which all lights were extinguished. After the initial nose poke to trigger illumination of the stimulus lights, animals were required to make a response at the nose poke apertures within a 30 s limited hold period. An intertrial interval of 5 s was introduced when responding had stabilized under an FR3 schedule.

Acquisition of Spatial Discrimination

After the initial training stage, subjects were trained on a two-hole discrimination task. A nose poke in the food magazine triggered the illumination of both stimulus lights. A sequence of three nose pokes in one of the holes resulted in reward (see Figure 1). Three nose pokes in the ‘incorrect’ hole resulted in a time-out and no reward. Rats were trained across sessions until they achieved a criterion of 9 correct trials across the previous 10 trials. ‘Correct’ and ‘incorrect’ holes were designated randomly and counterbalanced across subjects.
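The acquisition criterion (9 correct trials across the previous 10) can be illustrated with a minimal sketch. The function name, the 0/1 coding of outcomes, and the requirement that a full window of 10 trials has accrued before the criterion can be met are assumptions made for illustration; the original task was run in Whisker Control, not with this code.

```python
from collections import deque

def trials_to_criterion(outcomes, window=10, required=9):
    """Return the 1-based trial number at which criterion is first met, else None.

    outcomes: iterable of 1 (correct trial) or 0 (incorrect trial).
    Assumption: the criterion can only be met once `window` trials have accrued.
    """
    recent = deque(maxlen=window)  # keeps only the most recent `window` outcomes
    for trial_number, correct in enumerate(outcomes, start=1):
        recent.append(bool(correct))
        if len(recent) == window and sum(recent) >= required:
            return trial_number
    return None

# Hypothetical session: two early errors, then consistently correct responding
session = [0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(trials_to_criterion(session))  # 11
```

In this hypothetical session the window covering trials 1–10 contains only 8 correct trials, so the criterion is first met on trial 11.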

Figure 1

Schematic illustration of the spatial discrimination reversal learning task. Rats were trained under a fixed-ratio (FR) schedule of reinforcement such that three consecutive nose pokes in the same aperture resulted in the delivery of a food pellet in the magazine. A failure to respond within 30 s (an ‘omission’) resulted in a 5 s time-out period. Following the acquisition of a spatial discrimination, the contingency was reversed such that responses in the previously incorrect aperture were now correct (and vice versa). Animals completed three reversals (‘×3’) in a single 1 h session.

Within-Session Reversal Learning

This session began with the illumination of both the house-light and magazine light. For individual rats, ‘correct’ and ‘incorrect’ holes were kept the same as those experienced in the acquisition of the spatial discrimination. After rats had reached criterion on this retention phase, the ‘correct’ and ‘incorrect’ holes were reversed such that the previously rewarded response now resulted in a time-out period, and the previously unrewarded response resulted in the delivery of a food pellet (see Figure 1). Subjects completed three reversals, but no more, during the 1 h session. We used this within-session serial reversal design because many animals display marked perseveration on the first reversal that they experience. Consequently, a single reversal does not effectively differentiate between good, middle, and poor learners. Allowing animals to complete a second and third reversal in the same session provided a more sensitive method to categorize animals on the basis of perseverative responding.

Systemic Drug Administration

The selective 5-HT and DA reuptake inhibitors citalopram hydrobromide and GBR12909 dihydrochloride were purchased from Sigma (UK) and evaluated in the same subjects following a 1-week wash-out period between each compound. Drugs were administered intraperitoneally (1 ml/kg, phosphate-buffered saline, PBS), starting with citalopram (PBS, 1, 3, and 10 mg/kg), followed by GBR12909 (distilled-deionized water, 1, 3, and 10 mg/kg). Doses were selected according to previous research findings in Lister-hooded rats (Baarendse and Vanderschuren, 2012) and administered according to a fully randomized Latin square design. Drugs were administered 20 min before reversal learning, in a different room to the operant testing room. Each experiment started with a baseline retention session (day 1), followed by the test session where the drug was administered (day 2), and a third day where animals were maintained in their home-cages. This cycle was repeated for each dose of drug administered. The criterion for retention and reversal sessions was the same as in initial testing.
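A Latin square assigns every dose once to each subject sequence and once to each test cycle. The sketch below shows one simple way to generate such a square by permuting the rows, columns, and labels of a cyclic square. The function name and seed are hypothetical, and the source does not state how the authors' square was constructed, so this should be read as an illustration of the design only.

```python
import random

def randomized_latin_square(treatments, seed=None):
    """Build a Latin square by shuffling a cyclic square.

    Rows are dosing sequences (one per subject or subject block);
    columns are successive test cycles. Each treatment appears exactly
    once in every row and every column. This does not sample uniformly
    from all possible Latin squares.
    """
    rng = random.Random(seed)
    n = len(treatments)
    # Cyclic base square: cell (i, j) holds treatment index (i + j) mod n
    square = [[(i + j) % n for j in range(n)] for i in range(n)]
    rng.shuffle(square)              # permute rows
    column_order = list(range(n))
    rng.shuffle(column_order)        # permute columns
    labels = list(treatments)
    rng.shuffle(labels)              # permute which treatment each index maps to
    return [[labels[row[c]] for c in column_order] for row in square]

doses = ["vehicle", "1 mg/kg", "3 mg/kg", "10 mg/kg"]
square = randomized_latin_square(doses, seed=0)
for sequence in square:
    print(sequence)
```

With four treatments (vehicle plus three doses), each row gives the order in which one subject or block of subjects would receive the doses across the four test cycles.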

Ex Vivo Neurochemistry

Subjects were killed by CO2-induced asphyxiation and cervical dislocation. Brains were rapidly removed and placed on a steel dissection plate, cooled on dry ice, with the dorsal surface uppermost before being frozen at −80 °C. Brains destined for qRT-PCR analysis were flash frozen in isopentane, at −30 °C, to ensure minimal RNA degradation and stored at −80 °C. Brains were sectioned in the coronal plane using a Jung CM300 cryostat (Leica, Wetzlar, Germany). For autoradiography, consecutive 20 μm slices throughout the OFC and striatum were mounted on Superfrost Plus microscope slides (Fisher Scientific, UK). Sections were stored at −80 °C before being thawed at room temperature for processing. Samples destined for analysis by high-performance liquid chromatography (HPLC) and electrochemical detection (ECD) were sectioned into 150 μm consecutive slices and mounted on chilled microscope slides. Aliquots of tissue were removed using a micropunch of 1.2 mm diameter. Tissue from the medial and lateral OFC (mOFC, lOFC) and DMS (see Figure 2) was extracted and frozen at −80 °C. For qRT-PCR analysis, tissue was collected as described above for the HPLC-ECD study, and placed in RNAlater stabilization reagent (Qiagen, UK) for at least 1 h at room temperature before being frozen at −20 °C.

Figure 2

Coronal (a) and sagittal (b) sections showing the regions of interest used for neurochemical assessment in post-mortem tissue of rats stratified according to low-, mid-, and high-perseverative behavior on a spatial discrimination serial reversal task. Adapted from Paxinos and Watson (1998).

Neurochemical Analysis

Samples were placed in 75 μl of 0.2 M perchloric acid and kept on ice. Tissue samples were homogenized using an ultrasonic cell disrupter (QSonica LLC, Newton, CT) and subsequently centrifuged at 6000 r.p.m. for 10 min at 4 °C. Twenty-five microliters of the supernatant was collected for analysis. DA, 3,4-dihydroxyphenylacetic acid (DOPAC), 5-HT, and 5-hydroxyindoleacetic acid (5-HIAA) were measured by HPLC-ECD, as described previously (Dalley et al, 2002). Quantification was achieved using a Coulochem II detector with an analytical cell (ESA model 5014B) and two electrodes in series (E1 −250 mV, E2 +250 mV). The signal from E2 was integrated using computer software (Dionex Chromeleon, v.6.8). The limit of detection varied between 5 and 10 fmol for DA and DOPAC, and between 10 and 20 fmol for 5-HT and 5-HIAA.

Ex Vivo Receptor Autoradiography

[3H]Citalopram (3127 GBq/mmol), [3H]ketanserin (1976 GBq/mmol), [3H]GBR12935 (1480 GBq/mmol), and [3H]raclopride (2812 GBq/mmol) were purchased from Perkin-Elmer (UK). Fluoxetine, mianserin, and mazindol were purchased from Sigma-Aldrich (UK); haloperidol was purchased from Tocris (UK). Duplicate, consecutive slides were prewashed for 15 min at room temperature in 150 mM Tris-HCl (pH 7.4). Slides were incubated in a buffer containing the radioligand. For nonspecific binding, additional cold ligand was added to the incubation buffer. Ligand concentrations and incubation times are given in Supplementary Table S1. Following incubation, slides were washed two times in fresh 4 °C buffer for 2 min and then rinsed in distilled-deionized water. Slides were air-dried for at least 2 h before being fixed in paraformaldehyde vapor. These were subsequently apposed with tritium microscale standards (Amersham Biosciences, Freiburg, Germany) to a tritium-sensitive phosphor-imaging plate (FujiFilm, Tokyo, Japan). The plates were scanned using a FLA-5000 Bioimaging Analyzer (Fujifilm) to digitize autoradiographs at 16-bit gray scale for image analysis. Region-of-interest analysis was conducted using ImageJ (Abramoff et al, 2004).

Gene Expression

Messenger RNA was extracted from the frozen samples using the miRNeasy Micro Kit (Qiagen) with additional DNAse digestion. First-strand cDNA was synthesized from 5 ng total RNA with random hexamer primers using the RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, UK). SYBR green-based quantitative RT-PCR was performed on the CFX96 Touch Thermal Cycler (Bio-Rad, UK). PCR was performed using 0.25 μM of each primer. The primer pairs, designed using Primer-BLAST software (NCBI) and purchased from Sigma-Aldrich, are given in Supplementary Table S2. PCR conditions were as follows: 95 °C for 5 min, followed by 40 cycles of 95 °C for 10 s, 60 °C for 10 s, and 72 °C for 1 min.

PCR efficiencies for each gene were calculated using LinRegPCR (Freeware, HRFC, The Netherlands). Normalized relative quantities for all genes of interest were calculated using multiple reference genes (tubulin, actin, GAPDH, and RPL19) and by adjusting for differences in PCR efficiency. The stability of each reference gene was assessed by calculating the gene stability value (M) and coefficient of variation (CV) in qBase+ (Biogazelle, Belgium). Reference genes that had mean CV and M values >25% and >0.5, respectively, were excluded from further normalization calculations.
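The normalization step can be sketched with the commonly used efficiency-corrected, multiple-reference-gene model, in which each gene's relative quantity is E^(Cq_calibrator − Cq_sample) and the target quantity is divided by the geometric mean of the reference-gene quantities. Whether qBase+ was configured in exactly this way is not stated in the text, and all function names, efficiencies, and Cq values below are hypothetical.

```python
import math

def relative_quantity(efficiency, cq_calibrator, cq_sample):
    """Efficiency-corrected relative quantity.

    efficiency is the amplification base per cycle (2.0 = perfect doubling).
    """
    return efficiency ** (cq_calibrator - cq_sample)

def normalized_relative_quantity(target, references):
    """Divide the target quantity by the geometric mean of reference quantities.

    target and each item of references are
    (efficiency, cq_calibrator, cq_sample) tuples.
    """
    target_rq = relative_quantity(*target)
    reference_rqs = [relative_quantity(*ref) for ref in references]
    # Geometric mean computed on the log scale
    mean_log = sum(math.log(rq) for rq in reference_rqs) / len(reference_rqs)
    normalization_factor = math.exp(mean_log)
    return target_rq / normalization_factor

# Hypothetical values: target amplifies one cycle earlier than the calibrator,
# while both reference genes are unchanged
target_gene = (2.0, 25.0, 24.0)
reference_genes = [(2.0, 20.0, 20.0), (2.0, 18.0, 18.0)]
nrq = normalized_relative_quantity(target_gene, reference_genes)
print(round(nrq, 3))  # 2.0
```

Using the geometric rather than the arithmetic mean keeps a single highly expressed reference gene from dominating the normalization factor.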

Statistical Analyses

Inferential statistics were carried out using SPSS for Windows (v.21). The main dependent variables analyzed were the total number of trials and errors to criterion. Errors were further classified to isolate ‘perseverative’ errors. Data were analyzed in moving windows of 10 trials. When 7 or more commission errors (errors arising from an incorrect response rather than an omission) were made in a window of 10 trials, and when this error rate was statistically significant (Pearson’s χ2 test, p<0.05), the errors were classed as ‘perseverative’. Non-perseverative errors were very few in number, with many animals making no learning errors at all. Therefore, our analysis focused on perseverative errors as an index of behavioral flexibility. Dependent variables were measured across three reversals and the mean values used for statistical analysis. Subjects from each cohort were ranked for perseverative responses and divided into high-, mid-, and low-perseveration groups based on the following criteria: high (upper quintile); mid (middle quintile); and low (lower quintile). Behavioral data were analyzed using repeated-measures analysis of variance (ANOVA). When significant main effects or interactions were found, post hoc analysis using Fisher’s LSD test was performed. When the assumption of homogeneity of variance could not be met, a Games–Howell test was used. One-way ANOVA was used to compare ex vivo monoamine and receptor levels and mRNA expression in high-, mid-, and low-perseveration groups. Neurochemical variables were also regressed against perseverative responses for the low-, mid-, and high-perseveration groups combined to determine the proportion of variance explained by the general linear model (R2). Statistical significance was set at α=0.05.
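The windowed classification of perseverative errors can be sketched as follows. Several details are assumptions because the text does not specify them: the window is advanced one trial at a time, omissions are dropped before analysis, and the χ2 test is a one-degree-of-freedom goodness-of-fit test against a 50:50 split of errors and correct responses. Function names are hypothetical.

```python
import math

def chi_square_vs_chance(errors, window=10):
    """Pearson chi-square goodness-of-fit (1 df) against a 50:50 split.

    Returns (statistic, p_value).
    """
    expected = window / 2
    correct = window - errors
    statistic = ((errors - expected) ** 2 + (correct - expected) ** 2) / expected
    # For 1 df, P(X > statistic) = erfc(sqrt(statistic / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

def perseverative_windows(trials, window=10, min_errors=7, alpha=0.05):
    """Return start indices of windows classed as perseverative.

    trials: list of 1 (commission error) or 0 (correct response),
    with omissions assumed to have been removed beforehand.
    """
    flagged = []
    for start in range(len(trials) - window + 1):
        errors = sum(trials[start:start + window])
        _, p_value = chi_square_vs_chance(errors, window)
        if errors >= min_errors and p_value < alpha:
            flagged.append(start)
    return flagged

# Hypothetical post-reversal run: nine errors followed by correct responding
post_reversal = [1] * 9 + [0] * 5
print(perseverative_windows(post_reversal))  # [0]
```

Under this assumed 50:50 null, windows with 7 or 8 errors give χ2 values of 1.6 and 3.6, below the 3.84 critical value, so only windows with 9 or 10 errors satisfy both conditions. The authors' exact null hypothesis and window stepping may differ from this sketch.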

Results

Behavioral Screening

Rats were segregated into three groups according to their perseverative behavior on the first serial reversal learning session: (i) low-perseveration (first quintile); (ii) mid-perseveration (third quintile); and (iii) high-perseveration (fifth quintile). The segregation of rats into quintiles allowed the inclusion not only of rats in the lower and upper regions of the distribution but also of rats in the center of the distribution. Figure 3a–d show the frequency distributions of perseverative responses, trials to criterion, and errors to criterion for the four cohorts of animals used in this study. Numerical data are shown in Supplementary Table S3. The mean, median, and interquartile ranges of perseverative responses were: 33.4, 36.6, and 52.6 (cohort 1); 35.2, 35.9, and 30.2 (cohort 2); 28.4, 29.4, and 47.1 (cohort 3); and 43.1, 41.6, and 23.9 (cohort 4). There was no significant difference in perseverative responses between the four cohorts (F3,181=2.39; p=0.07). The overall distributions of perseverative responses (Figure 3b), total trials (Figure 3c), and total errors (Figure 3d) were positively skewed (skewness: 0.073, 0.81, 0.91, respectively). Response latencies to initiate a new trial following a correct response were highly variable between high-, mid-, and low-perseveration groups (cohort 1: means±SEM: 4.33±1.76 s, 3.74±1.31 s and 3.10±1.46 s, respectively). Although the data suggest that highly perseverative animals were slower to initiate a new trial, this was not significant (F2,24=0.158; p=0.855). Following an incorrect response, the latency values were 8.46±2.54 s (highs), 7.57±1.11 s (mids), and 6.33±1.43 s (lows). Again, there was no main effect of group (F2,24=0.365; p=0.699), indicating that perseveration was not accompanied by apparent changes in motor behavior.
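The quintile-based grouping can be illustrated with a short sketch that ranks subjects by perseverative responses and labels the first, third, and fifth quintiles. The function name, subject identifiers, and scores are hypothetical, and ties are broken by input order here, which may not match how the authors handled them.

```python
def quintile_groups(scores):
    """Label subjects in the 1st, 3rd, and 5th quintiles of perseveration.

    scores: dict mapping subject id -> perseverative responses.
    Returns a dict mapping subject id -> 'low', 'mid', 'high', or None
    (None for subjects in the 2nd and 4th quintiles).
    """
    ranked = sorted(scores, key=scores.get)  # ascending perseveration
    n = len(ranked)
    labels = {1: "low", 3: "mid", 5: "high"}
    groups = {}
    for rank, subject in enumerate(ranked):
        quintile = rank * 5 // n + 1  # 1 (lowest) to 5 (highest)
        groups[subject] = labels.get(quintile)
    return groups

# Hypothetical cohort of ten rats with evenly spaced scores
cohort = {f"rat{i}": i * 5 for i in range(10)}
groups = quintile_groups(cohort)
print(groups["rat0"], groups["rat4"], groups["rat9"], groups["rat2"])
# low mid high None
```

Subjects in the second and fourth quintiles receive no label, mirroring the selection of only the lower, middle, and upper quintiles for neurochemical comparison.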

Figure 3

Cumulative frequencies of perseverative responses, total trials, and errors to reach criterion, during initial testing, in all four cohorts. Cohort 1: systemic drug administration (n=48); cohort 2: HPLC-ECD analysis (n=44); cohort 3: autoradiography (n=44); and cohort 4: qRT-PCR (n=47).

5-HT Reuptake Inhibition Facilitates Reversal Learning Performance, Similar to the Effect of Low-Dose DA Reuptake Inhibition

To validate the sensitivity of the spatial-discrimination serial reversal task to altered levels of 5-HT and DA, rats from the low-, mid-, and high-quintile response-perseveration groups were tested after the administration of either citalopram or GBR12909. Citalopram produced a dose-related improvement in reversal learning as reflected by a decrease in the number of trials to reach criterion on the task (dose: F3,63=3.38, p=0.023; Figure 4a). Post hoc tests revealed that the lowest (1 mg/kg; p=0.035) and highest (10 mg/kg; p=0.014) doses of citalopram significantly decreased total trials to criterion compared with the vehicle group. However, no significant dose × group interaction was observed (F6,63=0.61, p=0.720). In contrast, GBR12909 produced a dose-dependent, biphasic effect on reversal learning (dose: F3,60=4.544, p=0.006; dose × group interaction: F6,60=0.89, p=0.511; Figure 4b). Post hoc analysis demonstrated that the highest dose of GBR12909 significantly increased total trials to criterion compared with the vehicle group (p=0.038), whereas the lowest dose (1 mg/kg) significantly decreased the total number of trials required to reach criterion (p=0.049).

Figure 4

Effect of citalopram (n=24) on (a) the number of trials to criterion; (c) incorrect trials to criterion; and (e) percentage perseverative responses. Effect of GBR 12909 (n=23) on (b) the number of trials to criterion; (d) incorrect trials to criterion; and (f) percentage perseverative responses on the spatial reversal learning task. Data are means±SEM from a single reversal learning session. Asterisks denote a significant difference between the groups indicated: *p<0.05 and **p<0.01.

Elevated Perseverative Behavior is Associated with Altered 5-HT Metabolism in the OFC

Although no significant group differences were observed in 5-HT levels in the lOFC (group: F2,27=2.05, p=0.150; Figure 5a), a planned comparison between low- and high-perseverative rats in this region approached statistical significance (p=0.059). Levels of the 5-HT metabolite 5-HIAA were significantly decreased in both the mOFC (group: F2,27=4.13, p=0.028) and lOFC (group: F2,27=4.23, p=0.026) (Figure 5b). A dimensional analysis of all three perseveration groups revealed that response perseveration was inversely correlated with 5-HIAA levels in the mOFC (R2=0.17, p<0.01) and lOFC (R2=0.12, p<0.05). Post hoc analysis using Fisher’s LSD test revealed a significant decrease in levels of 5-HIAA in the mOFC of high-perseverative rats with respect to mid-perseverative (p=0.028) and low-perseverative rats (p=0.014). In the lOFC, 5-HIAA levels were significantly decreased in high-perseverative rats compared with low-perseverative rats (p=0.008). However, indices of DA function in the OFC and DMS were not significantly different between low- and high-perseverative rats (see Supplementary Table S4).

Figure 5

(a) Levels of 5-HT and (b) 5-HIAA (pmol/mg) in the medial and lateral orbitofrontal cortex (OFC) of high- (n=9), mid- (n=10), and low- (n=9) perseverative animals. (c) 5-HT2A receptor and (d) 5-HTT binding in the medial and lateral OFC of high- (n=9), mid- (n=10), and low- (n=15) perseverative groups. Data are means±1 SEM. Asterisks denote a significant difference between the groups indicated: *p<0.05 and +p=0.059.

Elevated 5-HT2A Receptor Binding in the OFC is Associated with Reduced Perseveration

5-HT2A receptor binding significantly varied in the mOFC (group: F2,33=4.42, p=0.021) and lOFC (group: F2,33=4.01, p=0.028) of low-, mid-, and high-perseverative rats (Figure 5c). Post hoc Fisher’s LSD tests showed that binding at 5-HT2A receptors was significantly increased in the mOFC of low-perseverative animals compared with high- (p=0.029) and mid- (p=0.013) perseverative animals. Increased 5-HT2A receptor binding was also present in the lOFC of low-perseverative rats compared with mid- (p=0.023) and high- (p=0.028) perseverative rats. Decreased 5-HT2A receptor binding in the OFC was not accompanied by significant changes in 5-HTT binding (Figure 5d) nor was there a significant difference in binding of the DA transporter or D2 receptors in either the OFC or DMS (see Supplementary Table S4). The lack of relationship between perseverative behavior and expression of 5-HTT and DAT was further supported by a lack of associated changes in Slc6a4 and Slc6a3 expression in the DRN and VTA (Figure 6a and b).

Figure 6

Expression of (a) Tph2 and Slc6a4 in the dorsal raphé nucleus (DRN) and (b) Th and Slc6a3 in the ventral tegmental area (VTA) of high- (n=8), mid- (n=8), and low- (n=9) perseverative groups. Data are means±1 SEM. Asterisks denote a significant difference between the groups indicated: *p<0.05.

Increased Perseveration is Associated with Decreased TPH2 and MAO Expression in the DRN and Increased MAO Expression in the OFC

TPH2 mRNA expression was significantly decreased in the DRN of highly perseverative rats (F2,23=5.59, p=0.011; Figure 6a). Post hoc Fisher’s LSD tests indicated a significant decrease in TPH2 expression in the DRN of highly perseverative rats compared with mid- (p=0.007) and low-perseverative (p=0.013) rats; however, correcting for a lack of homogeneity of variances, a Games–Howell post hoc analysis revealed that TPH2 expression in the DRN of highly perseverative rats differed significantly from mid-perseverative rats (p=0.048) but not from low-perseverative rats (p=0.081). TH mRNA expression in the VTA was not significantly different between the three groups (Figure 6b). As shown in Figure 7a and b, MAO-A and MAO-B mRNA expression was significantly decreased in the DRN of highly perseverative rats (MAO-A: F2,24=5.03, p=0.016; MAO-B: F2,24=4.15, p=0.030). A dimensional analysis of all animals (low, mid, high groups) revealed that response perseveration was inversely related to MAO-A mRNA expression in the DRN (R2=0.23, p<0.05). Post hoc Fisher’s LSD tests showed a significant decrease in MAO-A expression in the DRN of highly perseverative rats compared with mid- (p=0.023) and low-perseverative (p=0.007) rats. MAO-B expression was significantly decreased in the DRN of high-perseverative rats compared with mid- (p=0.020) and low-perseverative (p=0.024) rats. Conversely, MAO-A and MAO-B expression was increased in the lOFC of highly perseverative rats, as shown in Figure 7c and d (MAO-A: F2,27=5.49, p=0.011; MAO-B: F2,27=11.1, p<0.001). In addition, response perseveration was positively correlated with MAO-B mRNA expression in the lOFC (R2=0.13, p<0.05). Post hoc Fisher’s LSD tests showed a significant increase in MAO-A expression in the lOFC of highly perseverative rats compared with mid- (p=0.025) and low-perseverative (p=0.004) rats. 
Similarly, MAO-B expression was significantly increased in the lOFC of high-perseverative rats compared with mid- (p<0.001) and low-perseverative (p<0.001) rats. No significant differences were found for other 5-HT- (5-HT2A–2C receptors) and DA-related (D1/2 receptors) transcripts in either the OFC or DMS (see Supplementary Table S5).

Figure 7

Expression of (a) MAO-A and (b) -B in the dorsal raphé nucleus (DRN) and ventral tegmental area (VTA) of high- (n=8), mid- (n=8), and low- (n=9) perseverative groups. (c) MAO-A and (d) -B in the medial and lateral OFC of high- (n=8), mid- (n=10) and low- (n=9) perseverative groups. Data are means±1 SEM. Asterisks denote a significant difference between the groups indicated: *p<0.05 and **p<0.01.

Discussion

The main findings indicate that naturally occurring perseverative behavior on a spatial-discrimination serial reversal learning task is associated with diminished 5-HT function and abnormal MAO-A and MAO-B expression in the OFC and DRN. Our findings implicate increased constitutive MAO-A and MAO-B mRNA expression, specifically in the lateral OFC, and decreased expression of these transcripts and TPH2 in the DRN as putative novel substrates underlying perseverative behavior. Although the reversal design was somewhat different from those used by other studies, in that there were up to three reversals within the session, rather than the typical single reversal, it is evident that performance was still dependent on dopaminergic and serotonergic modulation by selective reuptake inhibitors. We found that levels of the 5-HT metabolite, 5-HIAA, were significantly reduced in the OFC of highly perseverative rats compared with rats in the lower quintile of the perseveration-response distribution. In addition, low levels of perseveration were also associated with increased 5-HT2A receptor binding in the mOFC and lOFC. Whilst prior studies have demonstrated a role for 5-HTT (Holmes and Fam, 2013; Nonkes et al, 2012) and striatal DAergic mechanisms in reversal learning performance (Clarke et al, 2011; Collins et al, 2000; O'Neill and Brown, 2007), we found no evidence of abnormalities in binding at 5-HTT in high- or low-perseverative rats nor any alterations in several key indices of DA transmission in the DMS. These findings indicate that natural variation in serotonergic tone and MAO-A and MAO-B gene expression in the OFC and DRN, together with reduced TPH2 mRNA expression in the DRN, may underlie poor spatial-discrimination reversal learning in rats. 
Attenuated serotonergic function may thus be an endophenotype that biases behavior toward perseveration when S-R contingencies are reversed, a notion consistent with the effects of direct interventions that decrease central 5-HT function (Rogers et al, 1999; Mobini et al, 2000; Clarke et al, 2004).

The 5-HT metabolite, 5-HIAA, was significantly decreased in the mOFC and lOFC of rats selected for highly perseverative behavior; this was accompanied by a trend reduction in 5-HT levels in the lateral OFC. These findings, together with the demonstration of improved behavioral flexibility after citalopram treatment, suggest a role for 5-HT in spatial reversal performance. Our results thus accord with the disruptive effects of dietary tryptophan depletion on reversal learning in humans (Rogers et al, 1999), which decreases central 5-HT transmission (Chase et al, 2011), as well as the effects of selective focal destruction of 5-HT terminals in the OFC of the marmoset monkey (Clarke et al, 2004). The observation that spatial reversal learning is facilitated by local administration of a 5-HT2C receptor antagonist in the OFC (Boulougouris and Robbins, 2010) lends further support to an involvement of orbitofrontal 5-HT mechanisms in spatial reversal performance. Interestingly, animals exhibiting highly flexible behavior in the present study showed the highest levels of 5-HT2A receptor binding in the OFC. Activation of 5-HT2A receptors on pyramidal projection neurons in the PFC has previously been reported to increase the activity of serotonergic neurons in the DRN (Puig et al, 2003). The resultant increase in 5-HT release in the PFC (Puig et al, 2003) may be linked to enhanced frontostriatal signalling and diminished perseverative responding (Roberts, 2011). Thus, rats in the present study may have exhibited improved behavioral flexibility as a result of increased 5-HT2A receptor binding in the OFC, a notion supported by evidence that 5-HT2A receptor antagonists disrupt spatial reversal learning in rats (Boulougouris et al, 2008).

In the present study, differential binding at 5-HT2A receptors in the mOFC and lOFC of low and highly perseverative rats was not accompanied by changes in 5-HT2A mRNA expression. This apparent anomaly suggests that the differences in binding associated with perseverative behavior may reflect alterations in the binding affinity of 5-HT2A receptors or a change in the total pool of 5-HT2A receptors available for binding in the OFC. At this point, it is difficult to discount the impact of factors such as receptor internalization affecting receptor density as distinct from (i) regulatory mechanisms involved in gene expression (Hitzemann et al, 2007) and (ii) effects on transcript levels in projections to the OFC from non-serotonergic fibers, notably those arising from the mediodorsal nucleus of the thalamus (Scruggs et al, 2000) and implicated in reversal learning performance (Chudasama et al, 2001).

MAO is the main enzyme responsible for the catalytic degradation of monoamines in the brain and is present in the synaptic cleft, in axon terminals, and in some glial cells (Shih and Thompson, 1999). The normal intraneuronal function of MAO is the catabolism of monoamine transmitters not contained within synaptic vesicles. The novel finding of decreased MAO-A and MAO-B gene expression in the DRN may be indicative of a general reduction in serotonergic tone, a notion supported by the concurrent decrease in TPH2 expression in highly perseverative animals, suggestive of reduced 5-HT synthesis in these animals. This interpretation is further supported by the accompanying reduction in 5-HIAA levels in the OFC and is consistent with the general view that reduced serotonergic transmission underlies poor reversal learning (Clarke et al, 2004; Kehagia et al, 2010). Although the mechanism underlying the hypothesized reduction in serotonergic tone in highly perseverative rats is unknown, it is possible that decreased MAO activity in the DRN resulted in reduced 5-HT breakdown and consequently increased autoinhibition of 5-HT neurons by somatodendritic 5-HT receptors (Liu et al, 2005). Intriguingly, highly perseverative rats exhibited increased MAO-A and MAO-B expression in the OFC. The mechanism underlying these strongly contrasting effects on MAO expression in the DRN and OFC is presently unknown. The hypothesized decrease in serotonergic tone in highly perseverative animals would, however, lead to long-term compensatory effects on 5-HT transmission in the OFC, including alterations in 5-HT release and local metabolism by MAO present in the synapse and surrounding glial cells (Shih and Thompson, 1999). Our results suggest the presence of at least two functionally distinct populations of MAO involved in 5-HT catabolism: one linked to DRN serotonergic neurons and the other putatively linked to extraneuronal processes in the OFC, possibly involving glial function.

We found that performance on the spatial-discrimination reversal task was dose-dependently affected by the DA reuptake inhibitor, GBR12909, with low doses improving performance and higher doses impairing performance. This implies that DA may have a biphasic effect on reversal learning performance similar to dopaminergic modulation of other behaviors such as locomotor activity (Eilam and Szechtman, 1989). Such divergent effects may be mediated by inhibitory presynaptic D2 receptors responsible for controlling the rate of neuronal firing, synthesis, and release of DA (Aghajanian and Bunney, 1977), with higher doses affecting postsynaptic DA receptors. However, despite these biphasic effects, we found no differences between high- and low-perseverative animals in levels of DA or its metabolite DOPAC in the DMS. Striatal mechanisms have previously been linked to behavioral flexibility through selective lesion studies (Castane et al, 2010) and local DA depletion (Clarke et al, 2011; O'Neill and Brown, 2007), which have the common effect of impairing reversal learning. Striatal DA levels have also been shown to predict performance on outcome-specific reversal-learning tasks (Clatworthy et al, 2009; Cools et al, 2009). However, despite the DMS being a major output region of the OFC (Mailly et al, 2013; Schilman et al, 2008), our results suggest that absolute variations in post mortem DA content and DA transporter are not associated with natural variation in perseverative behavior following repeated spatial reversals. A similar conclusion was reached by a recent study in non-human primates (Groman et al, 2013), which found that interactions between 5-HT levels in the OFC and DA levels in the putamen predicted behavioral flexibility during reversal learning. Specifically, reversal of a novel visual discrimination in monkeys was impaired by relatively low levels of OFC 5-HT and putamen (but not caudate) DA and by relatively high levels of OFC 5-HT and putamen DA. The lack of similar interactions between 5-HT and DA in the present study may reflect differing task demands (ie, spatial vs visual discrimination), possibly engaging associative- as opposed to motor-related regions of the dorsal striatum (ie, the putamen).

Research in primates and rats suggests that the OFC can be functionally segregated into medial and lateral subregions (Elliott et al, 2000; Iversen and Mishkin, 1970; Kringelbach and Rolls, 2004; Mar et al, 2011). The lateral OFC is implicated in cognitive control when previously rewarded responses require suppression (Elliott et al, 2000; Iversen and Mishkin, 1970), whereas the medial OFC has been hypothesized to have a role in assigning and adjusting subjective value to delayed and uncertain rewards (Kable and Glimcher, 2009). Our findings show that both subregions of the OFC are affected by abnormalities in their 5-HT innervation but only the lateral OFC shows constitutively increased MAO expression and a stronger trend toward reduced 5-HT content. Collectively, therefore, abnormalities in the serotonergic modulation of the lOFC may account for impaired flexibility during reversal learning.

In conclusion, our research adds to the extensive body of literature implicating orbitofrontal 5-HT in flexible goal-directed behavior (Clarke et al, 2007; Hampshire and Owen, 2006; Schoenbaum et al, 2007). The main findings of this investigation support the novel hypothesis that subjects who naturally perseverate when S-R contingencies are reversed have reduced 5-HT tone in the OFC as a putative consequence of impaired afferent input from the DRN. In the present study, the index of perseverative responding was used to stratify the subjects according to inflexible behavior. These errors may reflect compulsive responding, as expressed in brain disorders such as OCD, where acts are performed in a repetitive and habitual manner (Fineberg et al, 2009). Individuals diagnosed with OCD show impaired reversal learning and aberrant task-related OFC–striatal activity (Remijnse et al, 2006). Moreover, OFC hypoactivity and impaired reversal learning are reported in OCD patients and their first-degree relatives (Chamberlain et al, 2008). As 5-HT2A receptor availability, specifically in the OFC, predicts clinical outcomes in OCD (Perani et al, 2008), our findings suggest that naturally occurring response perseveration in rats may have utility as an endophenotype to investigate the neural basis of OCD and other compulsive brain disorders.

FUNDING AND DISCLOSURE

TWR consults for Cambridge Cognition, Lilly, Merck Sharp and Dohme, Lundbeck, Teva, Shire Pharmaceuticals, Otsuka, and Chempartners, and has held recent research grants with Lilly and Lundbeck. TWR has also received royalties from Cambridge Cognition (CANTAB), Springer, and Elsevier. All the other authors declare no conflict of interest.