Appetitive Operant Conditioning in Mice: Heritability and Dissociability of Training Stages

To study the heritability of different training stages of appetitive operant conditioning, we carried out behavioral screening of 5 standard inbred mouse strains, 28 recombinant-inbred (BxD) mouse lines and their progenitor strains C57BL/6J and DBA/2J. We also computed correlations between successive training stages to study whether learning deficits at an advanced stage of operant conditioning may be dissociated from normal performance in preceding phases of training. The training consisted of two phases: an operant nose poking (NP) phase, in which mice learned to collect a sucrose pellet from a food magazine by NP, and an operant lever press and NP phase, in which mice had to execute a sequence of these two actions to collect a food pellet. As a measure of magazine oriented exploration, we also studied the nose poke entries in the food magazine during the intertrial intervals at the beginning of the first session of the nose poke training phase. We found significantly heritable components in initial magazine checking behavior, operant NP and lever press–NP. Performance levels in these phases were positively correlated, but several individual strains were identified that showed poor lever press–NP while performing well in preceding training stages. Quantitative trait loci mapping revealed suggestive likelihood ratio statistic peaks for initial magazine checking behavior and lever press–NP. These findings indicate that consecutive stages toward more complex operant behavior show significant heritable components, as well as dissociability between stages in specific mouse strains. These heritable components may reside in different chromosomal areas.

Introduction
Appetitive operant learning is often studied in a conditioning chamber in which animals learn to lever press and nose poke in order to receive a food or liquid reward delivered at a specific reward site. Operant learning usually results in complex behavior that depends on a multitude of capacities such as forming place-reward and action-outcome associations, and chaining together actions such as lever pressing and nose poking (NP) into a food magazine. Little is known about the genetic basis of appetitive operant conditioning in general, or of the consecutive training stages that lead up to it. The first general goal of our study was to examine whether the performance of a large group of mouse strains at consecutive training stages leading up to, and including, operant conditioning contains a heritable component. Secondly, for genetic mouse models of learning and memory, it would be an asset to identify strains that show a dissociation between poor performance at advanced training stages but good performance at preceding stages. Such dissociations may indicate which mouse strains present models specifically targeting more complex operant learning, or alternatively show deficits in more basic behaviors and learning processes.
In the past years, animal models with targeted mutations together with clinical findings in human populations have increased our understanding of the role of genes in cognitive processes such as memory and learning (The Dutch-Belgian Fragile X Consortium et al., 1994; Khelfaoui et al., 2007; Morice et al., 2008). However, standard inbred mouse strains, such as the 129 strains, commonly used for transgenic studies differ in genetic background and behavioral phenotype, stressing the importance of characterization and selection of the background strain (Crusio, 1996; Gerlai, 1996).
Despite thorough genetic and behavioral characterization, the genetic components of the appetitive learning abilities of standard inbred mice (e.g., strains C57BL/6J and DBA/2J, A/J, Balb/c, C3H/HeJ, NOD/Ltj and 129S1/Sv) are largely unknown (but see Isles et al., 2004, for estimates of the genetic effect on choice bias for immediate rewards based on four standard inbred mouse lines, and Baron and Meltzer, 2001, and McKerchar et al., 2005, for strain comparisons in an operant nose poke task and lever press task, respectively). In particular, the behavioral characterization of learning in NOD/Ltj mice is far from complete. Overall, this makes it difficult to select a background strain if one aims to develop a genetic mouse model for appetitive learning disabilities.
Susceptibility to cognitive dysfunctions is mostly affected by quantitative effects of groups of genes, rather than single genes (Valdar et al., 2006). Taking an approach opposite to studying targeted mutations, a behavior-to-genes approach with genome-wide scanning for linkages between behavioral traits and chromosomal areas aims at elucidating the roles of genes in complex traits such as cognitive functions. Complementing the vast number of genetically engineered mouse lines, several sets of recombinant-inbred mouse lines have been created, perhaps most remarkably the BxD lines developed from the popular inbred laboratory mouse strains C57BL/6J and DBA/2J (Peirce et al., 2004). The high number of unique chromosomal recombinations in BxD lines, resulting in highly variable phenotypes, makes them very suitable for studying heritable components of cognition and behavior.
However, many of the neuroscientific studies on recombinant-inbred mouse lines published so far have focused on brain morphology (e.g., Martin et al., 2006; Badea et al., 2009) or behavioral traits associated with substance abuse, such as sensitization and tolerance to alcohol and cocaine (Tolliver et al., 1994; Gehle and Erwin, 1998; Jones et al., 1999; Kirstein et al., 2002). While some studies have managed to correlate morphology with behavior (Yang et al., 2008), research on traits pertaining to learning behavior motivated by naturalistic rewards has been scarce and, to our knowledge, heritability of operant tasks of different complexity, and motivated by appetitive rewards, has not been studied before.
In this study we characterized seven commonly used inbred mouse lines and 28 recombinant-inbred (BxD) lines in their behavior across several consecutive training stages toward the acquisition of an appetitive, composite operant response, consisting of a lever press followed by a nose poke. The training stages preceding this final stage were magazine checking behavior very early during training, in part reflecting exploratory behavior, and learning to nose poke for food reward. To lay a foundation for the development of novel mouse models for operant learning disabilities, we examined whether performance at these stages has a heritable background. We also investigated whether performance at more advanced stages of training can be dissociated from preceding stages, which may yield more specific mouse models deficient in operant learning. To expand the heritability analysis, we carried out a QTL mapping study based on the data from 28 BxD lines and their progenitor lines to study whether these task stages are regulated by different chromosomal areas.

Materials and methods
Animals
The BxD recombinant-inbred mouse lines used in this study were originally created in The Jackson Laboratory. Both BxD and standard inbred mouse lines were bred locally at Harlan Netherlands. The socially housed male mice (8-9 weeks of age at the beginning of experimental training) were kept in a reversed day-night cycle (lights off at 7.00, lights on at 19.00). Each tested strain (standard mouse lines A/J, Balb/c/ByJ, C3H/HeJ, NOD/Ltj and 129S1/Sv; BxD lines 1, 2, 8, 14, 16, 19, 21, 23, 27, 29, 31, 32, 33, 36, 39, 40, 42, 43, 51, 61, 62, 65, 68, 69, 73, 75, 87, 90; and their progenitor strains C57BL/6J and DBA/2J; N = 5-19 mice per line, in total 343 animals, on average 9.8 animals per strain) consisted of several batches of mice with at least two litters from separate mothers. Prior to the beginning of the experiments, mice were habituated to the colony room for 4 weeks. Food and water were available ad libitum. In the week preceding the experiments, the mice were handled daily by the experimenter, habituated to the operant boxes for 1 h per day and given samples of food pellets (14 mg dextrose-sucrose precision pellet produced by Bio-Serv, Frenchtown, NJ, USA) in the home cage. During the course of experiments, the mice were food-restricted by removing the food prior to the beginning of each training session to achieve about 5% weight loss. After the training session (once daily), food was available ad libitum until the beginning of the next restriction period on the following day. Water was provided in the home cages ad libitum at all times. All experimental procedures were approved by the institution's Animal Welfare Committee and were in compliance with the European Council Directive (86/609/EEC) and Principles of laboratory animal care (NIH publication No. 86-23, revised 1985).

Behavioral apparatus
Mouse operant boxes (classical mouse modular test chamber, model ENV-307A, inside dimensions 15.9 cm × 14.0 cm × 12.7 cm) were equipped each with two stimulus lights above two retractable levers (model ENV-312-2W). The levers protruded 1 cm into the operant chamber, were 2.2 cm above the floor and had a reward tray in between them (see Figure 1).
Each of the eight boxes was positioned inside a sound-attenuating cubicle (standard medium-density fiberboard cubicle, model ENV-022MD, inside dimensions 55.9 cm × 38.1 cm × 40.6 cm; the chambers were placed in parallel on two shelves, each holding four boxes). Control of the operant boxes and recording of behavioral data were carried out by a MED-PC research control and acquisition system (version IV). Behavioral hardware and controlling software were provided by MED Associates, St. Albans, VT, USA.

Behavioral training and parameters
Every training session began with a habituation period during which the mice were placed in the operant box for 1 h before training onset. To avoid decreased motivation by satiety, mice were able to collect a maximum of 30 pellets in one training session. Each training session was terminated when the mice reached this maximum or when the training session exceeded the maximum length of 60 min. The mice were trained for one session per day.

Operant nose poke task
Trial onset was marked by one of the green LED lights on the front panel of the operant box being lit up for 30 s or until the mouse collected a reward (one 14 mg sucrose pellet) with pseudorandom intertrial intervals (ITI) of 5-25 s (15 s on average). While the light was on, the mice were able to collect a reward by the operant behavior of approaching and poking the food magazine in the front panel. Food magazine entries were detected by a photobeam detector. To prevent accumulation of sucrose pellets in the magazine tray, the pellet was only delivered at the moment the mouse put its nose into the tray. The mice were trained on this task for three sessions.
We will refer to this task as operant NP and not as (Pavlovian) cue conditioning (i.e., to the LED light) because no evidence was obtained that specific cue-reward associations were formed and expressed in behavior, and furthermore the mice had to poke their nose in the feeder tray in order to obtain a pellet. We assessed the occurrence of cue conditioning by measuring the selectivity ratio, defined as the nose poke rate during stimulus light onset divided by nose poke rate during the ITI. Cue conditioning should lead to a ratio clearly above one (Nordquist et al., 2003), but this was not the case in any of the mouse strains studied. Initial magazine checking behavior was defined as the number of nose poke entries in the food magazine per minute of ITI during the first ten trials of the first session. Nose poke success was defined as the number of trials in the third, last session of training in which the mouse collected the reward during the trial by approaching and poking the food magazine, divided by the total number of trials in which pellet acquisition was possible. Lever press-nose poke performance was defined as the percentage of action sequences leading to reward deliveries relative to the total number of trials in the fifth, last session of training.
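The parameter definitions above can be sketched in code. This is a minimal illustration only, not the authors' MATLAB analysis: the function names and the per-session count and duration inputs are hypothetical, and the sketch assumes that poke counts and durations have already been extracted from the session logs.

```python
def selectivity_ratio(cue_pokes, cue_seconds, iti_pokes, iti_seconds):
    """Nose poke rate while the stimulus light is on divided by the rate during the ITI.

    A ratio clearly above one would indicate conditioning to the cue light.
    """
    cue_rate = cue_pokes / cue_seconds
    iti_rate = iti_pokes / iti_seconds
    if iti_rate == 0:
        raise ValueError("ITI poke rate is zero; ratio undefined")
    return cue_rate / iti_rate


def magazine_checking_rate(iti_pokes, iti_seconds):
    """Initial magazine checking: magazine entries per minute of ITI."""
    return iti_pokes / (iti_seconds / 60.0)


def success_percentage(rewarded_trials, total_trials):
    """Percentage of trials in which the reward was collected.

    Applies to nose poke success (session 3) and lever press-nose poke
    performance (session 5), using the respective trial counts.
    """
    return 100.0 * rewarded_trials / total_trials
```

For example, `magazine_checking_rate(5, 150)` describes a mouse that entered the magazine 5 times during 150 s of cumulative ITI, i.e. 2 pokes per minute of ITI.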
To quantify correlations between behavioral parameters, we computed standard Spearman's rank correlation coefficients and partial correlation coefficients on strain means. To assess the overall significance of behavioral differences between strains, we carried out one-way ANOVAs and post hoc t-tests (Tukey's least significant difference procedure). All analyses were carried out in MATLAB (MathWorks, Natick, MA, USA).
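As an illustration of the partial correlation step, the sketch below applies the standard first-order partial correlation formula to the three pairwise strain-mean correlations reported in the Results (r = 0.63, 0.52 and 0.53). It assumes that the partial coefficients were derived from the pairwise rank correlations in this way; the function name is hypothetical and this is not the authors' MATLAB code.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y with the influence of z partialed out."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Pairwise correlations between strain means, as reported in the Results
r_mc_np = 0.63  # initial magazine checking vs nose poke success
r_np_lp = 0.52  # nose poke success vs lever press-nose poke performance
r_lp_mc = 0.53  # lever press-nose poke performance vs initial magazine checking

mc_np_given_lp = partial_corr(r_mc_np, r_lp_mc, r_np_lp)
np_lp_given_mc = partial_corr(r_np_lp, r_mc_np, r_lp_mc)
mc_lp_given_np = partial_corr(r_lp_mc, r_mc_np, r_np_lp)
print(mc_np_given_lp, np_lp_given_mc, mc_lp_given_np)
```

Because the inputs are rounded to two decimals, the results agree with the reported partial correlations (0.48, 0.29 and 0.31) only to within rounding error.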

Heritability
Behavioral parameters from BxD lines, progenitor and standard inbred mouse strains were pooled to estimate narrow-sense heritabilities, which reflect the proportion of total phenotypic variation

Operant lever press-nose poke task
In the lever press-nose poke task, the two levers (one on each side of the food magazine) protruded from the operant box wall. While the lever was in a protruded position, the mouse could obtain a sucrose pellet by first pressing the lever and subsequently poking its nose in the food magazine, thus expressing a chaining of two operant behaviors. Following a lever press or timeout after 150 s without response, the levers were retracted to prevent possible extinction behavior during the course of training, and a pseudorandom ITI of 5-25 s followed. The mice were trained on this more complex operant task for five sessions in total. LED lights were not in use during this task.

Behavioral parameters
The following three parameters were analyzed in this study: (1) Initial magazine checking behavior at the beginning of the first session of nose poke training; this behavior is taken to reflect primarily environmental exploration although early nose poke learning may also contribute; (2) nose poke success at the end of nose poke training, and (3) lever press-nose poke performance at the end of training. Heritability of appetitive operant conditioning C57BL/6J and/or 129S1/Sv (the most commonly used strains in transgenic studies) regarding the parameters presented in Figure 2 are marked with a plus sign and an asterisk, respectively.

Nose poke task
During the third and last session of the nose poke task, the performance scores, defined as the number of trials in which the mouse collected the pellet divided by the total number of trials, ranged from 20.0% in BxD-73 to 93.2% in NOD/Ltj, the mean being 55.0 ± 3.5% (Figure 2B). As with initial magazine checking, the majority of the BxD mouse lines showed lower performance levels than either of the progenitor lines, but three BxD lines (2, 16, and 32) achieved higher levels than either progenitor line. The heritability for nose poke success in the last session of training was 19.6% (p < 0.001). The nose poke task also showed a significant strain effect in one-way ANOVA [F(34,315) = 5.85, p < 0.001], with 254 out of 595 possible pair-wise comparisons differing significantly.
Lever press-nose poke task
Over the course of training on this task, one line (BxD-90) failed to complete any trials despite showing clear nose poke behavior in the earlier training phase (Figure 2C). Most strains showed improving performance over the five training sessions (Figure 3). In the fifth and last session of training, performance in the operant task varied remarkably across strains, from 0.0% (BxD-90) to 99.6% (NOD/Ltj), the average being 46.4 ± 4.9% (Figure 2C). A number of BxD lines (27, 8, 2, 33, 51, and 43) outperformed both of the progenitor strains, but none of these lines exceeded the progenitor strain DBA/2J in initial magazine checking behavior. Similarly, BxD lines 69, 31, and 16, which showed the highest initial magazine checking activity among the BxD lines, performed worse in the lever press-nose poke task than either progenitor line. For many strains, a clear dissociation was found between initial magazine checking and lever press-nose poke performance, or between nose poke success and lever press-nose poke performance. For instance, BxD-43, the top BxD line for lever press-nose poke learning, was among the lines expressing the least initial magazine checking behavior and was below average in the nose poke task.
Of most interest from the viewpoint of deficient operant learning were the lines showing poor lever press-nose poke learning but moderate or normal levels of NP behavior; these lines included BxD strains 23, 19, 21, and 32, and to a lesser extent C3H. For instance, BxD-32 had a low lever press-nose poke performance of about 20% despite being among the top lines during operant nose poke learning.
We found a significant (p < 0.001) heritable component (21.3%) in lever press-nose poke performance in the fifth session. Moreover, we found a significant strain effect in one-way ANOVA [F(34,309) = 3.22, p < 0.001], with 277 significantly different pair-wise comparisons out of 595.

Correlations between different training stages
There was a significant positive correlation between the strain means of initial magazine checking behavior and the nose poke task (r = 0.63, p = 0.00006; Table 1), the nose poke task and lever press-nose poke performance (r = 0.52, p = 0.00143) and lever press-NP versus initial magazine checking behavior (r = 0.53, p = 0.00116).
that is due to the allelic effects of genes, thus excluding environmental factors, epistatic interactions, etc. (h²; Hegmann and Possidente, 1981) for behavioral patterns. To estimate the heritability, we used a procedure which controls for variable group sizes in different strains (Isles et al., 2004), where N is the total number of animals, S is the total number of tested strains, n_s is the number of animals for a given strain s, t_s is the trait average for a given strain, v_s is the trait variance for a given strain and T refers to the trait average across all strains: The p-levels of the heritability estimates were calculated by a permutation test with 1000 permutations (Moore and McCabe, 2000). Both the heritability estimates and their significance were calculated with a custom MATLAB script (Heimel et al., 2008), available at http://www.nin.knaw.nl/~heimel/software/heritability.
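The following is a sketch of how a heritability estimate and its permutation p-level could be computed for strain-grouped data. It is not the authors' script. It assumes a one-way ANOVA variance-component estimator with a correction for unequal group sizes, combined as h² = ½V_A / (½V_A + V_E), a form commonly used for inbred strains; the exact estimator used in the study may differ. All function and variable names are hypothetical.

```python
import random
from statistics import mean


def heritability(groups):
    """Assumed estimator: h2 = 0.5*V_A / (0.5*V_A + V_E) from one-way ANOVA.

    groups maps a strain name to a list of trait values, one per animal.
    """
    sizes = [len(v) for v in groups.values()]
    n_total, n_strains = sum(sizes), len(groups)
    if n_strains < 2 or n_total <= n_strains:
        raise ValueError("need at least two strains and replicate animals")
    grand_mean = mean(x for v in groups.values() for x in v)
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum(sum((x - mean(v)) ** 2 for x in v) for v in groups.values())
    ms_between = ss_between / (n_strains - 1)
    ms_within = ss_within / (n_total - n_strains)
    # Effective group size, correcting for unequal numbers of animals per strain
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (n_strains - 1)
    v_a = max((ms_between - ms_within) / n0, 0.0)  # between-strain component
    v_e = ms_within                                # within-strain component
    denom = 0.5 * v_a + v_e
    return 0.0 if denom == 0 else 0.5 * v_a / denom


def permutation_p(groups, n_perm=1000, seed=0):
    """Shuffle animals across strains; p = share of shuffles with h2 >= observed."""
    rng = random.Random(seed)
    observed = heritability(groups)
    labels = [s for s, v in groups.items() for _ in v]
    values = [x for v in groups.values() for x in v]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        shuffled = {}
        for s, x in zip(labels, values):
            shuffled.setdefault(s, []).append(x)
        if heritability(shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm
```

Shuffling trait values while keeping the strain labels fixed preserves the group sizes, so the null distribution reflects the same unbalanced design as the observed data.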

Mapping of quantitative trait loci
Mapping of quantitative trait loci (QTLs) was performed by standard interval mapping scripts available at the WebQTL interface at http://www.genenetwork.org/ that link the observed behavioral traits with chromosomal areas with the help of established single nucleotide polymorphism (SNP) data. Likelihood ratio statistics (LRS) were calculated for each marker locus (Chesler et al., 2003, 2004; Wang et al., 2003). The whole-genome significance threshold for QTLs was defined by using a 1000× permutation test. We did not enable use of parent strains in order not to bias the permutation test.
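The permutation logic behind genome-wide thresholds can be sketched as follows. This is not WebQTL's implementation. The sketch assumes simple single-marker regression rather than interval mapping, an LRS of the form n·ln(SS_total/SS_residual), genotypes coded as -1 and +1 for the two parental alleles, and the conventional genome-wide p-values of 0.63 (suggestive) and 0.05 (significant). All names are hypothetical.

```python
import math
import random


def marker_lrs(genotypes, phenotypes):
    """LRS for regressing strain means on one marker coded -1/+1."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    ss_total = sum((p - mean_p) ** 2 for p in phenotypes)
    if sxx == 0 or ss_total == 0:
        return 0.0  # uninformative marker or constant trait
    # Floor the residual to avoid log of infinity on a perfect fit
    ss_resid = max(ss_total - sxy ** 2 / sxx, 1e-12)
    return n * math.log(ss_total / ss_resid)


def permutation_thresholds(markers, phenotypes, n_perm=1000, seed=0):
    """Genome-wide thresholds from the maximum LRS of each permuted data set."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype link
        maxima.append(max(marker_lrs(m, shuffled) for m in markers))
    maxima.sort()
    suggestive = maxima[int(0.37 * n_perm)]   # exceeded by ~63% of permutations
    significant = maxima[int(0.95 * n_perm)]  # exceeded by ~5% of permutations
    return suggestive, significant
```

Taking the maximum over all markers within each permutation is what makes the thresholds genome-wide: an observed peak is compared against the strongest chance association anywhere in the genome, not against chance at a single locus.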

Initial magazine checking behavior
The average nose poke rate during the ITI of the first 10 trials of reward collection training varied from 0.53 pokes/min (129S1/Sv) to 10.28 pokes/min of ITI (NOD/Ltj), the mean ± SEM being 3.35 ± 0.34 pokes/min of ITI (Figure 2A). While none of the BxD lines expressed more initial magazine checking behavior than the progenitor line DBA/2J, the majority of the mouse lines showed less initial magazine checking than either of the progenitor lines, demonstrating transgressive segregation in the trait (see, e.g., Jones and Mormède, 2007, chapter 25).
The heritability of this behavior was 10.4% (p < 0.001). A one-way ANOVA test revealed a significant strain effect [F(34,315) = 3.31, p = 0.15 × 10^-7]. Post hoc testing indicated that out of 595 possible pair-wise comparisons between strains, 123 were significantly different from each other. Mouse lines that differed significantly from November 2010 | Volume 4 | Article 171 | 5 Malkki et al.
We also examined how performance levels at two of the three task stages were correlated when the influence of the third stage was taken into account (Table 1). The partial correlation for initial magazine checking and the nose poke stage per strain mean remained significant (r = 0.48, p = 0.00382; with lever press-nose poke learning partialed out). In contrast, the partial correlations between nose poke and lever press-nose poke performance (r = 0.29, p = 0.09828; with initial magazine checking partialed out) and between initial magazine checking and lever press-nose poke performance (r = 0.31, p = 0.07664; with reward collection partialed out) showed a slight trend toward correlation but were not significant. Especially the lack of a significant partial correlation
Lever press-nose poke performance at the end of training. Performance in the last (fifth) session of training is presented as the percentage of trials during which the mouse presses the lever and nose pokes into the magazine to collect the sucrose pellet during the trial period (150 s following trial onset).

Nose poke task
Mapping the nose poke performance at the end of the training either as percentage of correct trials or percentage of correct trials normalized on the total number of nose pokes failed to reveal suggestive QTLs.
Lever press-nose poke task
QTL mapping of lever press-nose poke performance in the last session of training resulted in a suggestive peak on chromosome 9 (58 Mb; Figure 5). Normalizing lever press-nose poke performance on the total number of nose pokes in the preceding phase did not cause a notable change in the location or significance of the LRS. Due to the relative flatness of the peak combined with a high number of genes situated under the peak area being expressed in the mouse central nervous system, it was not feasible to point out a single candidate gene for this final stage of operant learning. Genes found under the peak are listed in Table 2.
between the nose poke and lever press-nose poke stages is notable, because these were contiguous in time and both represent a form of operant conditioning.

Initial magazine checking behavior
The QTL map for initial magazine checking behavior showed suggestive peaks on chromosomes 4 and 6 (Figure 4A). When zoomed in further, the LRS was above the threshold for a suggestive QTL around 47-48 Mb on chromosome 4 (Figure 4B) and around 93-95 Mb on chromosome 6 (Figure 4C). In both cases, the peaks were relatively flat and had several genes expressed in the central nervous system under them, meaning that it was not possible to point out a single candidate gene. Genes found under the peak are listed in Table 2 for chromosomes 4 and 6, respectively.
traits in separately trained subgroups of the same strain. In our paradigm, where consecutive learning stages could be monitored in the same mouse, strong correlations were found between all three stages, but the correlations between initial magazine checking and lever press-NP, and between NP and lever press-NP, became non-significant when the third stage was factored out. The dissociability of NP and lever press-NP was most clearly illustrated by several BxD lines (especially BxD-32, but also, e.g., 21, 19, 23, and 90) showing high performance on operant NP but low success on lever press-NP.

Operant nose poke learning and initial magazine checking behavior
Although seemingly simple, the stage of operant NP for food reward may allow several associations to be formed. Apart from action-outcome (nose poke-food) learning, the animal may have formed cue-outcome (Ito et al., 2005; for a review see, e.g., Savage and Ramos, 2009) as well as place-outcome associations (McAlonan et al., 1993; Ito et al., 2008), but in the current study no evidence for conditioning to the cue light was found. Initial magazine checking behavior marks the very beginning of learning to approach the magazine and NP into it, and this stage is likely to be dominated by environmental exploration, as it was measured during the ITI of the first 10 trials. We found that both magazine checking and nose poke performance had a significant heritable component. The positive correlation between nose poke performance and initial magazine checking behavior (Table 1) remained significant when lever press-nose poke performance was partialed out. This can be explained, at least in part, by the notion that exploratory behavior is an essential early step in operant nose poke behavior: nose pokes in the feeder tray are required for the animal to discover a reward. Furthermore, during the course of training the animals expressing more pronounced initial magazine checking behavior are likely to visit the reward site at a higher rate, which directly facilitates performance success. Thus, our measures of magazine checking and final nose poke performance can be taken to reflect a continuum of learning, sampled at extreme time points, making the high correlation between these two stages a logical and expected result.
Of the standard inbred mouse lines, NOD/Ltj mice expressed the most and 129S1/Sv the least initial magazine checking (Figure 2A, indicated with arrows), which is in accordance with previous open field exploration and locomotor activity studies (for NOD/Ltj, see Bothe et al., 2005; for various 129 substrains, see Baron and Meltzer, 2001; Isles et al., 2004; Bothe et al., 2005).
Lever press-nose poke performance
To our knowledge, the heritability of appetitive lever press-nose poke learning has not been previously studied. Isles et al. (2004) studied several inbred mouse lines using an appetitive operant delayed-reinforcement paradigm in which mice were trained to respond to visual stimuli with a nose poke in order to get condensed milk as a reward, but this study did not focus directly on heritability of operant conditioning but on heritability of choice bias for immediate reward (15.8% and 16.5% depending on parameter definition; Isles et al., 2004). Studies using a non-appetitive, escape/avoid-

Discussion
The main results of this study can be summarized as follows: First, all three task stages studied here showed significant levels of heritability, ranging from 10.4% for initial magazine checking behavior to 21.3% for the final and most complex stage, lever press-nose poke learning. A significant strain effect due to multiple differences between mouse lines, not just a few outlier strains, could be seen at all stages. In our QTL mapping analysis, suggestive LRS peaks were found for initial magazine checking (on chromosomes 4 and 6) and for the lever press-nose poke task (chromosome 9), but not for nose poke learning. When analyzing correlations and dissociations between task stages, it should be emphasized that the analysis of heritability and QTL hinges on high-throughput screening of many mouse lines, making it unfeasible to study different learning
ing factors and associated behavioral variability, the significantly heritable component of 21.3% in lever press-nose poke performance may be considered rather high.

Correlation analysis and dissociations between subsequent operant tasks
While initial magazine checking behavior, nose poke success and lever press-nose poke performance all appeared to have a positive correlation with each other, the individual positive correlations between lever press-nose poke performance and the two other stages disappeared when taking into account the effect of the third trait. This is less remarkable for initial magazine checking and the lever press-nose poke task because the variable performance on NP for food was temporally situated in between these two stages. Of more interest is the finding that the nose poke and lever press-nose poke stages lost their significant correlation when the influence of initial, exploratory magazine pokes was taken out, because these stages were temporally contiguous and both represent forms of operant conditioning.
In our experiments neither high expression of initial magazine checking behavior nor nose poke success provided reliable predictive power for the outcome of the lever press-nose poke training, a finding which is also reflected by some individual mouse lines. While lines BxD-90 and NOD/Ltj were clearly on the lower or higher end of performance at each training stage, a dramatic dissociation was seen in mouse line BxD-43. When comparing initial magazine checking activity and lever press-nose poke performance, BxD-43 (Figure 2, indicated with an arrow) was among the lines having the lowest initial magazine checking activity, and its nose poke success was below average. Nevertheless, it had the highest lever press-nose poke performance of all the BxD lines, second only to NOD/Ltj. Conversely, C3H and BxD-32 mice performed well above average on the nose poke task, but clearly below average on the subsequent lever press-nose poke task. A similar trend was observed in BxD lines 21, 19, 23, and 90, which were notably worse in the lever press-nose poke task.
Several explanations may be noted for a poor performance on lever press-NP given moderate to high levels of nose poke learning. First, animals may be neophobic toward protruding levers and may have difficulty forming a trace between the act of lever pressing and obtaining a reward later on in the trial. Second, the previously established nose poke-reward association may impair or even block acquisition of a novel lever press-reward association. Third, animals may lack the capacity to "chain" lever press and nose poke actions in the correct order (Balleine and Dickinson, 1998; Graybiel, 1998; Suri and Schultz, 1998; Corbit and Balleine, 2003). In all three scenarios, C3H and the BxD lines 32, 21, 19, 23, and 90 may be regarded as interesting models for further exploring deficiencies in more advanced forms of instrumental learning and cognitive flexibility.
ance lever press task (Brennan, 2004) and an appetitive task using condensed milk as reward (Baron and Meltzer, 2001; McKerchar et al., 2005) reported notable differences between inbred mouse lines but did not estimate the heritability.
To study the possible confounding effect of strain differences in basal activity levels, we computed correlations between operant performance as measured in our study and basal activity data from another study, which assessed general behavioral measures, among others, in 25 BxD lines in common with our study (Philip et al., 2009). We could not find a significant correlation with any of their basal activity measures. This might indicate either that basal locomotor activity does not significantly confound operant learning performance, or that the lack of correlation reflects varying conditions between laboratories, such as testing the mice in different phases of the day/night cycle.
Learning to perform an operant sequence of lever press-NP can be argued to be a complex process that depends on multiple components. It is influenced, first, by several background factors such as basal exploratory activity, neophobia toward protruding levers, stress, motivational variables and incentive learning of pellet value (Luksys et al., 2009). These factors are not specific to our task per se, but affect learning indirectly, for instance by limiting the number of trials the animal will engage in. A second group of factors is discussed below and relates to learning that a previously effective operant response (nose poke) has to be preceded in this subsequent stage by a second, novel operant response (lever pressing). Given this complexity, it is not too surprising that acquisition of operant behavior varied highly across the strains. Considering all contribut-
It is difficult at present to link the different stages of training to neuroanatomical substrates. Nevertheless, for both the nose poke and lever press-nose poke tasks, the learning of action-outcome associations is important, and this learning depends on medial prefrontal-dorsomedial striatal systems (Balleine and Dickinson, 1998; Dalley et al., 2004; Yin et al., 2005). Secondly, these systems have also been implicated in the process of chaining or "chunking" two or more actions together (Ostlund et al., 2009; Pennartz et al., 2009). Thirdly, success on the nose poke task also depends on learning place-reward associations, and in general appetitive contextual conditioning is mediated at least in part by the hippocampal-ventral striatal system (Schacter et al., 1989; Sutherland and Rodriguez, 1989; Ito et al., 2008). These dorsal and ventral striatal systems are anatomically and physiologically linked in multiple ways, e.g., via the dopaminergic mesencephalon and via connected cortico-basal ganglia loops (Haber, 2003; Voorn et al., 2004).
Although the dissociability of lever press-nose poke performance and initial magazine checking may not be entirely surprising, it is of note that the QTL maps for initial magazine checking and lever press-nose poke learning showed no overlapping loci (Figures 4 and 5). This result, together with the heritability and correlation analyses, indicates that the neural processes mediating these two task stages have a heritable background and suggests that they are genetically dissociable.

Genetic loci potentially contributing to operant learning performance
To complement the behavioral and heritability analysis, we also conducted QTL mapping. For initial magazine checking behavior, we found suggestive LRS peaks on chromosomes 4 and 6 (Figure 4). These peaks indicate areas containing a number of genes expressed in the mouse central nervous system, but we could not point out single candidate genes for this task stage. For nose poke performance, no LRS peaks were found. For lever press-nose poke learning, we identified a suggestive LRS peak on chromosome 9 (Figure 5). This peak was also relatively flat, due to a low number of local SNPs and/or an insufficient number of unique recombinations in the area of interest. Although it was not possible to point out a single candidate gene, it is interesting to note that the genes under the peak region included Bbs4, a locus known to be associated with human Bardet-Biedl syndrome type 4, which is characterized by deficits in sensory function and learning disabilities in addition to physiological symptoms such as obesity (Beales et al., 1997).
Despite careful standardization of the experiments, none of the observed LRS peaks exceeded the conservative threshold for significance. Even with low-variability, high-heritability traits such as morphometric data on brain area size, QTLs may not be detected because each individual QTL makes a relatively small contribution (Crusio, 2004), which often makes it difficult to find highly significant QTLs for complex traits. Despite these limitations, the present QTL results are useful for comparisons with further studies, which may help to trace the polygenic nature of learning behavior to specific gene groups. Furthermore, the finding that the QTL maps for initial magazine checking and lever press-nose poke learning showed no overlap supports our hypothesis that these processes are genetically dissociable.
According to the www.ensemble.org database, the progenitor lines C57BL/6J and DBA/2J show a small difference in the coding region of Bbs4 (human Bardet-Biedl syndrome type 4 gene, the mouse homolog of which is located on chromosome 9, under the QTL peak for operant performance). This raises the possibility that some of the BxD lines may also differ at this locus. However, this information is not available in the BXD Genotypes Database (www.genenetwork.org), so we could not correlate the Bbs4 sequence in BxD lines with their learning performance. No studies describing the cognitive-behavioral abilities of an existing Bbs4 knock-out mouse line (Mykytyn et al., 2004) have been published, leaving open the question of whether this locus may partially explain the observed variance in our lever press-nose poke task.
Even though the brain areas involved in regulating operant conditioning have been studied extensively, the genetic background of operant conditioning remains far from unraveled. One of the few identified genes known to regulate operant learning is Gpr6 on mouse chromosome 10, deletion of which facilitates acquisition of lever press behavior in mice. It encodes G protein-coupled receptor 6, which is known to be selectively expressed in striatal neurons projecting to the pallidum (Lobo et al., 2007). However, C57BL/6J and DBA/2J do not differ at this locus, so we could not assess its role in the operant performance variability by QTL mapping based on BxD mouse lines. Interestingly, 129S1/Sv, which had the worst lever press-nose poke performance of the standard mouse lines in our study and also showed poor lever press performance in previous studies, differs in amino acid sequence from C57BL/6J and DBA/2J at this locus. The QTLs found in previous contextual and auditory-cued fear conditioning studies (Owen et al., 1997; Reijmers et al., 2006) did not appear in our study, which suggests that appetitive operant learning may be genetically dissociable from these aversively motivated types of learning, a subject awaiting further examination.

Comparisons to other studies of common inbred mouse lines
Given the importance of standard inbred mouse lines as disease models and as background in gene-targeting studies, it is interesting to note that NOD mice, for which relatively few behavioral data are available, showed not only the highest initial magazine checking activity (consistent with their reportedly high exploratory activity; Bothe et al., 2005) and nose poke success among the strains, but also the highest lever press-nose poke performance. It is also notable that 129S1/Sv mice had little success on the lever press-nose poke task, with only one of the eight tested subjects making any lever press responses in this task. Although the large variety of 129 (sub)strains used in various studies makes interpretation of the behavioral data obtained in different laboratories challenging, our results on the 129S1/Sv strain are in agreement with previous reports of poor performance of this strain in an appetitive lever press task (McKerchar et al., 2005), and in an aversively motivated escape/avoidance lever press task (129S6/SvEvTac; Brennan, 2004). McKerchar et al. (2005) also reported a positive correlation between locomotor activity and operant performance, which is in line with our data on magazine checking and operant learning.
In a study using a delayed-reinforcer task, the 129S2/SvHsd strain learned to respond to a light cue with a nose poke in order to receive a reward. Despite showing lower spontaneous locomotor activity, its start latency, choice latency and number of non-started trials in that task were not significantly different from those of the highly active BALB/c and C57BL/6J mice (Isles et al., 2004). In a touchscreen-based appetitive operant task, 129S1/SvIMJ performed similarly to C57BL/6J (Hefner et al., 2008). In a task where the mice had to make a nose poke into an illuminated hole, the latency of 129X1/SvJ and 129X1/SvJ mice to emit 50 operant responses was at the same level as that of C3H/Hej and DBA/2J mice, while the difference from DBA/2J and Balb/cByJ was significant, but not as dramatic as was the case in operant lever press tasks (Baron and Meltzer, 2001). Together with our finding that 129S1 mice performed moderately on the operant nose poke task (Figure 2B), these findings suggest that the poor operant lever press-nose poke performance of 129 strains may be due to a specific learning deficit related to lever pressing rather than insensitivity to a reinforcer or low activity levels. Alternatively, 129 strains may be capable of acquiring high asymptotic levels of lever press performance after more prolonged autoshaping or training than was done in our study (5 sessions; cf. Isles et al., 2004).
In conclusion, this study showed, first, that various task stages leading up to appetitive learning of sequential operant actions have a heritable component. Second, lever press-nose poke performance was only poorly predictable from the preceding task stage of operant nose poke learning, as illustrated by BxD line 43, which showed poor nose poke performance but high lever press-nose poke learning, and by BxD lines 32, 21, 19, 23, and 90 as well as strain C3H, which showed the opposite tendency. The latter lines especially may provide mouse models to study deficiencies in learning more complex, chained operant responses. Together, the correlation analysis and novel QTL maps suggest that different task stages of appetitive operant learning are regulated by different sets of genes.

Acknowledgments
This work was sponsored by SenterNovem BSIK grant 03053 and NWO-VICI grant 918.46.609. We would like to thank Ruud Joosten for help with setting up the MED-PC system; Tobias Kalenscher for valuable comments on the manuscript; Sabine Spijker, Maarten Loos and their colleagues at the Vrije Universiteit Amsterdam for helpful discussions and for supporting the transport of BxD mice; and Alexander Heimel and Jasper Poort at The Netherlands Institute for Neuroscience for sharing analysis scripts and giving useful tips for graphical presentation of the results.