Large-scale assessment of olfactory preferences and learning in Drosophila melanogaster: behavioral and genetic components

In the Evolve and Resequence method (E&R), experimental evolution and genomics are combined to investigate evolutionary dynamics and the genotype-phenotype link. Like other genomic approaches, this method requires many replicates with large population sizes, which imposes severe restrictions on the analysis of behavioral phenotypes. Aiming to use E&R for investigating the evolution of behavior in Drosophila, we have developed a simple and effective method to assess spontaneous olfactory preferences and learning in large samples of fruit flies using a T-maze. We tested this procedure on (a) a large wild-caught population and (b) 11 isofemale lines of Drosophila melanogaster. Compared to previous methods, this procedure reduces the environmental noise and allows for the analysis of large population samples. Consistent with previous results, we show that flies have a preference for orange vs. apple odor. With our procedure, wild-derived flies exhibit olfactory learning in the absence of previous laboratory selection. Furthermore, we find genetic differences in olfactory learning with relatively high heritability. We propose this large-scale method as an effective tool for E&R and genome-wide association studies on olfactory preferences and learning.


INTRODUCTION
Ongoing evolutionary dynamics and genotype-phenotype mapping can be studied during experimental evolution through subsequent phenotyping and genomic sampling (Travisano & Lenski, 1996; Wiser, Ribeck & Lenski, 2013). This method is known as Evolve and Resequence (E&R) (Turner et al., 2011) and can be applied to entire populations by sequencing hundreds of individuals of the same population at the same time (Pool-seq, see Futschik & Schlötterer, 2010). Thanks to advances in high-throughput sequencing techniques, the E&R method has been used to track changes in genomic composition not only across thousands of generations in bacteria (Wiser, Ribeck & Lenski, 2013) but also in eukaryotes with a fast life cycle, such as yeast (Barrick & Lenski, 2013) and fruit flies (Schlötterer et al., 2015). This approach provides a very promising opportunity to investigate the evolution of complex traits and their genetic architecture with a limited budget (Schlötterer et al., 2014), thus paving the way to the analysis of the evolutionary dynamics of traits that cannot be inferred from the fossil record, including complex behavioral phenotypes (Versace, 2015).
The first implementation of E&R on a complex behavior focused on phenotypic and genomic change in response to artificial selection for shorter/longer inter-pulse interval in male courtship song in Drosophila melanogaster (Turner & Miller, 2012). In this study, thousands of loci were identified that responded to artificial selection and differed between populations selected for different behaviors. Similar outcomes, with thousands of alleles that change significantly in frequency between generations and treatments, have also been found for morphological or physiological traits (Orozco-terWengel et al., 2012; Tobler et al., 2014), showing that the same methodological issues apply to behavioral and other traits. Despite the success in identifying some causative genes (e.g., Zhou et al., 2011; Martins et al., 2014), theoretical (Schlötterer et al., 2015; Kofler & Schlötterer, 2014) and empirical evidence (Tobler et al., 2014; Franssen et al., 2015) has clarified that many of the significantly changed variants are in fact false positives arising from short- or long-distance linkage disequilibrium.
Another limitation that E&R shares with other genome-wide approaches is low statistical power in identifying unknown causative variants (Rockman, 2012; e.g., Kofler & Schlötterer, 2014). Although haplotype blocks can be used to study the dynamics of selected genomic regions during experimental evolution (Franssen et al., 2015), an effective E&R study should primarily minimize the false positive rate and maximize statistical power. As shown in recent theoretical and simulation work (Baldwin-Brown, Long & Thornton, 2014; Kessner & Novembre, 2015; Schlötterer et al., 2015; Kofler & Schlötterer, 2014), to reach this aim several issues have to be taken into account in the design of the experiment: (a) use a large starting population (possibly hundreds or thousands of individuals); (b) use a large population size; (c) use at least 5-10 replicate populations; (d) run the experiment for dozens of generations; (e) reduce linkage disequilibrium. In light of this, during E&R researchers should phenotype and propagate thousands of individuals in multiple replicate populations for many generations (Kofler & Schlötterer, 2014; Versace, 2015). In E&R studies, a practical limitation is imposed by the stages of propagation and phenotyping: when flies have to be individually phenotyped and manipulated, the time and workload required can force a reduction of the census size. For this reason, investigating behavioral traits, which often require a large phenotyping effort, poses a methodological challenge.
In this work, we focus on the development of a fast and reliable method for phenotyping and propagating medium-size and large-size populations of fruit flies assessed for olfactory behavior. This approach can be effectively used in E&R of olfactory behavior, in particular olfactory preferences and olfactory learning after conditioning with an aversive stimulus, as well as for any large-scale assay. Fruit flies show complex behaviors, can be easily maintained at a large census size, have a fast generation cycle and low linkage disequilibrium (Mackay et al., 2012), and thus are a convenient model to investigate the evolutionary dynamics and genotype-phenotype map of behavioral traits.
Olfactory behavior (olfactory preferences and olfactory learning) is a good candidate for E&R investigation of both spontaneous and learned responses, because of its remarkable conservation between Drosophila and vertebrates (Davis, 2005; Wilson, 2013) and the presence of standing variation for both olfactory preferences and olfactory learning (Mery et al., 2007; Ruebenbauer et al., 2008; Rollmann et al., 2010; Van den Berg et al., 2011). Genetic variability in the experimental population is crucial to apply E&R to fruit flies, because within the time scale of feasible experiments (50 generations of selection take about two years to complete) new mutations have little impact on evolutionary change.
Different preferences for specific odors and odor concentrations have been documented in Drosophila using T-mazes and olfactometers (e.g., Kutsukake et al., 2000; Wang et al., 2003; Suh et al., 2004; Revadi et al., 2015). Phenotypic variability in olfactory behavior is associated with polymorphisms that influence reactions to different compounds (Wang et al., 2007; Wang et al., 2010), but to date E&R has not been applied to odor preferences (see Rhodes & Kawecki, 2004 for a similar idea). Olfactory learning has been extensively studied at the behavioral, genetic and neurobiological level (McGuire, Deshazer & Davis, 2005; Davis, 2005; Fiala, 2007; Tabone & de Belle, 2014). Wild and mutant flies have been tested in associative conditioning tasks, typically the association between an olfactory conditioned stimulus and an electric or mechanical aversive stimulus. In a first paradigm developed to measure olfactory learning, Quinn and colleagues (1974) tested groups of about 40 flies. From a starting tube, flies could approach a light source at the end of a second tube painted with odor A (or B). When flies entered the second tube, an electric shock was delivered and could be associated with odor A (or B). After being returned to the starting tube, flies could enter a third tube containing odor B (or A), which was not associated with any electric shock. At test, flies could choose to enter either a tube with odor A or one with odor B. A performance index compared the fraction of flies that avoided the unpaired odor and those which avoided the shock-paired odor. Not all flies explored the tubes, and their performance was also affected by phototaxis; thus this small-scale assay produced low learning scores. Tully & Quinn (1985) modified the paradigm to test groups of about 100 flies in an apparatus in which odor A matched with pulses of electric shock was followed by odor B in the absence of electric shock. The odors were delivered by vacuum so that all flies were exposed to the odor-shock contingencies. After training, the flies were tested in a T-maze where they could choose to approach either odor A or B. The learning scores for this procedure ranged between 0.7 and 0.9 (McGuire, Deshazer & Davis, 2005), but the need for dedicated machines and hands-on operations on the flies limits the application of this method to large-scale long-term experiments. Other methods used to score olfactory behaviors require individual handling/scoring of the flies (Swarup et al., 2013) and/or air flow and automated systems (Steck et al., 2012; Brown et al., 2013) that can hardly be scaled up to the large scales required for long-term evolutionary experiments.
To date, the main method used in experimental evolution for enhanced learning (Mery & Kawecki, 2002; Dunlap & Stephens, 2014) is the oviposition paradigm. This method is a medium/large-scale procedure based on the habit of flies to use the same medium for foraging and egg laying. The procedure starts by exposing hundreds of free-ranging flies to olfactory (or visual) stimuli associated with palatable or aversive media placed in Petri dishes located in a box (e.g., orange juice smell associated with palatable food, apple juice smell associated with aversive flavor). After exposure to olfactory (or visual) and associated gustatory stimuli, flies are tested for their olfactory (or visual) preferences for stimuli previously associated or not associated with the aversive flavor. Compared to flies that do not remember the association, flies that remember the contingencies presented during the exposure phase are expected to lay more eggs in the substrate whose smell (or color) was never associated with aversive flavor. The proportion of eggs laid in the substrate associated with the palatable flavor is used as a proxy for learning.
To select for enhanced learning across generations, Mery & Kawecki (2002) rinsed, moved to a neutral medium and propagated only the eggs laid in the medium previously associated with palatable food (alternatively, Dunlap & Stephens (2009) displaced eggs individually with a needle). As an effect of this regime, in about 15 generations the proportion of flies that made the "correct" choice significantly increased. In spite of this, several aspects make the oviposition procedure less than ideal for E&R studies: (a) this method is prone to experimental noise, as shown by the lack of learning effects before selection; (b) only females are exposed to selection, because learning is measured using laid eggs, thus reducing the selective pressure to half of the propagated individuals and preventing the investigation of sex effects and of behavioral domains other than oviposition; (c) the oviposition paradigm imposes selection for fertility, egg laying during the few hours of the test and resistance to egg washing; (d) this paradigm does not control for the experience provided during the conditioning phases (many flies might not experience all stimuli before making a choice in the test phase); (e) extensive work is required to rinse/displace the eggs and propagate the flies, in turn reducing the number of experimental replicates that can be propagated.
Aiming to use E&R for investigating the evolution of behavior in fruit flies, we have developed a simple and effective method based on a T-maze to assess olfactory preferences and learning in large samples (hundreds) of fruit flies of both sexes as well as in smaller samples (dozens of individuals). This method can hence be used for any behavioral or genomic approach which requires medium or large samples. We find evidence for olfactory preferences and learning in a large population of D. melanogaster originally caught in South Africa and in a population of inbred lines originally caught in Portugal. Our procedure reduces the impact of undesired selective pressures and the effort in propagation and phenotyping. Furthermore, this method is sensitive enough to detect olfactory preferences and learning, and the heritability of these traits. We discuss the relevance of this T-maze based procedure for E&R and genome-wide association studies as well as behavioral/chemical ecology studies.

Subjects
All experiments were run on isofemale lines of the same species, D. melanogaster. Flies were maintained on standard cornmeal-soy flour-syrup-yeast medium, except during the experimental assays. Before the beginning of the experiments we kept all lines for at least two generations at 22 °C in a constant 14:10 h light:dark cycle.
The population-wide experiments were run on 670 lines derived from a natural population of D. melanogaster collected in Paarl (South Africa) in March 2012. In each trial we used a group of 250 adult flies (males and females 2 days old or older), randomly picked from the 670 isofemale lines of the South African population.
The individual-line experiments were run on 11 inbred lines derived from a D. melanogaster population of 113 lines originally collected in Povoa de Varzim (northern Portugal) in July 2008 and maintained as isofemale lines. From these isofemale lines, 11 inbred lines were generated through full-sibling mating of 3 lines of the base stock population (B101, B192 and B211) and 8 lines of a population maintained for 53 generations in a hot regime at 25 °C (R1, R2, R3, R5) or for 33 generations in a cold regime at 18 °C (R6, R7, R9 and R10). The hot and cold temperature regimes are described in detail in Orozco-terWengel et al. (2012) and Tobler et al. (2014). Inbred lines were generated through full-sib mating for 17 or more generations (B101: 17 generations; B192: 18 generations; B211: 19 generations; R1 and R3: 27 generations; R2: 29 generations; R5: 21 generations; R6 and R9: 20 generations; R7 and R10: 22 generations). For each line a virgin female and a randomly collected male were allowed to mate, and from their offspring another virgin female and a random male were used to create the next generation. After inbreeding, these lines were kept as isofemale lines until the experimental assays. In each trial, we used a group of 40 flies (males and females 2 days old or older) of the same line.

Apparatus and stimuli
The T-maze (31 × 17.5 cm) used for the experimental assays (Fig. 1A) consisted of a starting chamber and a central chamber (12 × 8 × 1.5 cm) connected on each side to a food chamber. The starting chamber (9.5 × 2.5 cm) contained the flies at the beginning of each experimental phase. Food chambers (9.5 × 2.5 cm) were filled with 4 ml of experimental food. In each experimental phase, flies began the exploration of the apparatus from the starting chamber. The central chamber was connected to the food chambers with a funnel that prevents flies from re-entering the central chamber once they have approached the food. A similar trapping technique has been previously used for fruit flies.
Experimental media were prepared with fruit juice (either orange or apple juice from 100% concentrate) and agar (14 g/l). Aversive flavor was obtained by adding 8 g/l of quinine to the experimental medium. The use of complex mixtures of odorants, such as those found in fruit juice, is justified both by the previous literature in the field (Mery & Kawecki, 2002) and by chemical ecology studies which showed that in D. melanogaster most olfactory receptors are responsive to complex mixtures of fruit odors and many fruit odors strongly activate multiple receptors (Hallem & Carlson, 2006). Hence, complex fruit odorants provide a wide evolutionary basis for experimental evolution studies.

Olfactory preferences
The olfactory preferences assay is based on overnight starvation (15-16 h) followed by 2 h exposure to the first odor (Exposure 1), 2 h exposure to the second odor (Exposure 2), 4 h of starvation, and 2 h of Test.
We assessed preferences for apple and orange odor in the absence of punishment by using the same procedure described for the learning assays (see below), with the only difference that no food supplemented with quinine (aversive stimulus) was provided during the exposure phases. By comparing the results obtained with this procedure and those obtained with the aversive stimulus we could evaluate the role of the aversive stimulus in determining olfactory choice.

Olfactory learning assays
The learning assay is based on overnight starvation (15-16 h) followed by 2 h exposure to the first odor-flavor contingency (Exposure 1), 2 h exposure to the second odor-flavor contingency (Exposure 2), 4 h of starvation and 2 h of Test.
We used CO2 anesthesia to collect flies and starve them 15-16 h before the beginning of the conditioning procedure. After starvation, flies were moved to the starting chamber for Exposure 1.
In a pilot study we observed that over a time course of two hours flies tend to explore only one arm of the T-maze. Hence we decided to expose flies serially to the olfactory and taste stimuli, presenting the first contingency in Exposure 1 and the second contingency in the subsequent Exposure 2. During Exposure 1, for two hours flies were exposed to the odor associated with the aversive flavor and could approach the aversive flavor (e.g., orange odor and orange juice supplemented with quinine) located in the food chambers. To make sure we scored the first choice of all the flies and to reduce the exposure to food, after one hour of exposure we substituted the food chambers with new vials filled with the same type of experimental food and moved the trapped flies into an empty vial. We repeated the same procedure at the end of Exposure 1. Flies that entered the food chambers during Exposure 1 were moved to the starting chamber for Exposure 2. During Exposure 2, for two hours flies were exposed to the odor associated with the palatable flavor and the palatable flavor (e.g., apple odor and apple juice). After one hour of exposure, we substituted the food chambers with new vials filled with the same type of experimental food and moved the trapped flies into an empty vial. We repeated the same procedure at the end of Exposure 2. Flies that entered the food chambers during Exposure 2 were starved for four hours prior to the Test.
In half of the trials we conditioned flies on apple odor associated with aversive flavor and orange odor associated with palatable food (A-/O); in the other half we conditioned flies on orange odor associated with aversive flavor and apple odor associated with palatable food (O-/A). It has been shown that in D. melanogaster appetitive long-term memory occurs after single-cycle training also in the absence of fasting (Krashes & Waddell, 2008), while aversive long-term memory by single-cycle training requires previous fasting (Hirano et al., 2013). For this reason, the exposure to the aversive stimulus was always conducted immediately after fasting (Exposure 1).
Flies began the Test from the starting chamber. Unlike in the exposure phases, during the Test the odor associated with the aversive flavor and the odor associated with the palatable flavor were presented simultaneously, each in a different food chamber (Fig. 1C). Across different trials, we alternated the right/left side in which the two odors were presented. No food was supplemented with quinine during this phase. We counted flies that chose to enter either the orange odor side or the apple odor side.
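The timing of the assay can be checked with simple arithmetic. The following is a minimal sketch that lays out the daytime phases described above and verifies that they fit within the 14 h light phase reported for fly maintenance; the 08:00 start time and the names `PHASES` and `build_timeline` are illustrative assumptions, not part of the original protocol.

```python
# Phase durations in hours, taken from the assay description above.
PHASES = [
    ("Exposure 1", 2),
    ("Exposure 2", 2),
    ("Starvation", 4),
    ("Test", 2),
]
LIGHT_PHASE_HOURS = 14  # from the 14:10 h light:dark cycle


def build_timeline(start_hour, phases=PHASES):
    """Return (phase, start, end) tuples, in hours on a 24 h clock."""
    timeline = []
    current = start_hour
    for name, duration in phases:
        timeline.append((name, current, current + duration))
        current += duration
    return timeline


total_hours = sum(duration for _, duration in PHASES)
fits_in_light_phase = total_hours <= LIGHT_PHASE_HOURS

# Hypothetical start at 08:00, i.e., right after the overnight starvation.
timeline = build_timeline(8)
for name, start, end in timeline:
    print(f"{name}: {start:02d}:00-{end:02d}:00")
```

Under these assumptions the daytime phases add up to 10 h, so they fit within a single light phase once the overnight starvation is completed.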

Data analysis
In the test for olfactory preferences between the orange and apple odor, we compared the proportion of flies that chose the orange odor with the random choice level using a one-sample t-test against the random choice proportion of 0.5. Beforehand, we checked for deviations from the normal distribution of the data using the Shapiro-Wilk normality test.
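To investigate the effect of the order in which the two odors/flavors were presented, we calculated an order effect score (o) as the difference in the proportion of orange odor choices between flies exposed first to apple and then to orange (A/O) and flies exposed first to orange and then to apple (O/A):

o = (proportion orange choices A/O flies) − (proportion orange choices O/A flies) (1)
An order score significantly different from zero was expected if the order of presentation of the two odors/flavors had an effect in determining subsequent choices.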
Similarly, in the test for conditioned preferences between the orange and apple odor, we compared the proportion of flies that chose the orange odor with the chance level using a one-sample t-test against the random choice proportion of 0.5. Beforehand, we checked for deviations from the normal distribution of the data using the Shapiro-Wilk normality test.
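As an illustration of the comparison against chance, the sketch below computes the one-sample t statistic and its degrees of freedom for a set of per-trial proportions of orange choices. The proportions are made-up example values, and the function name `one_sample_t` is ours; the p-value is not computed here because it requires the t distribution.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, mu0=0.5):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    standard_error = stdev(values) / math.sqrt(n)  # stdev uses n - 1
    t_stat = (mean(values) - mu0) / standard_error
    return t_stat, n - 1


# Hypothetical per-trial proportions of flies choosing the orange odor.
orange_proportions = [0.55, 0.60, 0.48, 0.62, 0.58, 0.51]
t_stat, df = one_sample_t(orange_proportions)
print(f"t({df}) = {t_stat:.3f}")
```

In practice, a statistics package would typically be used for both steps, for example `scipy.stats.shapiro` for the normality check and `scipy.stats.ttest_1samp` for the test and its p-value.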
To obtain a measure of learning (learning score, l) we calculated the difference in the proportion of flies that in each trial chose orange odor after being conditioned with apple as the aversive flavor (A-/O) vs. orange as the aversive flavor (O-/A):

l = (proportion orange choices A-/O flies) − (proportion orange choices O-/A flies) (2)
A learning score significantly different from zero was expected if the conditioning had an effect.
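The learning score can be illustrated with a short sketch. It follows the convention used for the inbred-line results (proportion of orange choices after A-/O exposure minus proportion after O-/A exposure), so that positive values indicate that flies avoid the odor previously paired with quinine. The fly counts and the pairing of trials are hypothetical, and the function names are ours.

```python
def orange_proportion(n_orange, n_apple):
    """Proportion of choosing flies that entered the orange odor side."""
    total = n_orange + n_apple
    if total == 0:
        raise ValueError("no fly made a choice in this trial")
    return n_orange / total


def learning_score(a_minus_o_counts, o_minus_a_counts):
    """Orange proportion after A-/O minus orange proportion after O-/A."""
    return (orange_proportion(*a_minus_o_counts)
            - orange_proportion(*o_minus_a_counts))


# Hypothetical (orange, apple) counts for paired A-/O and O-/A trials.
paired_trials = [
    ((130, 70), (100, 100)),
    ((120, 80), (110, 90)),
]
scores = [learning_score(a, o) for a, o in paired_trials]
print([round(s, 2) for s in scores])
```

The resulting per-pair scores are the values that would then be tested against zero.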
To investigate the heritable component of learning, we repeatedly tested the same inbred lines (n = 10) and derived the intraclass correlation t, which is an estimate of the genetic heritability (h²) of learning for the tested population, using the variance between (Vb) and within (Vw) inbred lines (Hoffmann & Parsons, 1988). As discussed by David et al. (2005), the intraclass correlation can be used as a proxy for heritability (see also Hoffmann & Parsons, 1988; Betti, Soto & Hasson, 2014).

Population experiments: olfactory preferences and learning assays
To evaluate the sensitivity of our method in detecting olfactory preferences and learning in large groups of naturally derived fruit flies, we investigated the differences in olfactory preferences and learning in a large South African D. melanogaster population (see Data S1 and S2). In each trial, we tested 250 flies of both sexes.
In the olfactory preference experiment, across 28 test trials the overall population showed a marginally significant preference for the orange odor (t(27) = 1.953, p = 0.06). This preference is consistent with the preference for citrus previously documented in D. melanogaster (e.g., Mery & Kawecki, 2002; Dweck et al., 2013; but see Betti, Soto & Hasson, 2014), which is likely a behavioral strategy against the attack of parasitic wasps (Dweck et al., 2013).
Before testing flies, we exposed them to both odors/flavors: in half of the trials flies were exposed first to orange then to apple (O/A), in the other half first to apple then to orange (A/O). We derived the order effect score o to investigate the effect of the order in which the orange/apple stimuli had been presented. We observed a significant order effect score (t(13) = 3.09, p = 0.009; Fig. 2B), indicating that A/O flies (flies first exposed to Apple, then to Orange) had a significantly higher preference for orange odor than O/A flies (flies first exposed to Orange, then to Apple). A post-hoc t-test on A/O and O/A flies vs. the chance
In the conditioning experiment, the overall population showed a preference for the orange odor (mean = 0.59, t(57) = 3.95, p < 0.001). Flies that previously experienced Apple as aversive/Orange as palatable (A-/O) were more likely to choose orange than flies exposed to the opposite contingency (O-/A) (t(56) = 2.24, p = 0.029; Fig. 3A). The population showed a significant learning score (t(28) = 2.88, p = 0.007; Fig. 3B).
We also checked for differences in the proportion of orange choices after the conditioning procedure and after the olfactory preference exposures. Overall, after conditioning flies had a stronger preference for orange odor than in the absence of conditioning (t(81) = 2.50, p = 0.014), indicating a significant effect of the conditioning procedure. These results suggest that exposure to aversive stimuli can influence the preferences of flies towards specific odors/flavors. Given that conditioning with the aversive stimulus was done more than 4 h prior to the test, this shows that our procedure is sensitive to memory capabilities that last at least 4 h.
When comparing samples that had the same order of presentation of apple and orange odor in the olfactory preference and conditioning procedures, we observed a significant difference for the A/O but not for the O/A presentation (A-/O: t(40) = 2.49, p = 0.017; O-/A: t(39) = 1.24, p = 0.22). These results indicate that the conditioning procedure is more effective in the A-/O exposure than in the O-/A exposure. As suggested by an anonymous reviewer, this could result from the unconditioned preference for the orange stimuli, which produces an asymmetry in the strength of conditioning, with the O- stimulus less effective in conditioning than the A- stimulus.

Inbred lines: olfactory preferences
To study olfactory preferences in 11 inbred lines of D. melanogaster derived from a population collected in Portugal, we used the same procedure adopted for the large population, using 40 flies from the same isofemale line in each trial (see Data S3). Line R6 consistently did not enter the food chambers in both experiments, so we excluded this line and ran the analyses on the ten remaining lines.
In the olfactory preference assay, the overall distribution of the orange odor choices was significantly different from the normal distribution (Shapiro-Wilk normality test: W = 0.98, p = 0.03) and we analyzed the data using non-parametric tests (Wilcoxon signed-rank test and Kruskal-Wallis test). Overall, the group of ten responsive lines showed an olfactory preference for the orange odor (mean = 0.56; V = 7862, p < 0.001; Fig. 4A). We did not observe significant differences across lines (Kruskal-Wallis chi-squared = 14.14, df = 9, p = 0.12). We calculated the order effect score (proportion of orange odor choices after A/O exposure − proportion of orange odor choices after O/A exposure) for the overall sample of ten lines tested and found no significant effect (V = 1300.5, p = 0.23).

Inbred lines: learning assays
To study learning in the 11 inbred lines of D. melanogaster derived from a population collected in Portugal, we used the same procedure adopted for the large population, using 40 flies from the same isofemale line in each trial (see Data S4). Line R6 consistently did not enter the food chambers in both experiments, so we excluded this line and ran the analyses on the ten remaining lines.
In the learning assay, the overall distribution of the orange odor choices was significantly different from the normal distribution (Shapiro-Wilk normality test: W = 0.98, p = 0.005) and we analyzed the data using non-parametric tests (Wilcoxon signed-rank test and Kruskal-Wallis test). Overall, after the conditioning procedure the ten responsive lines showed a preference for orange odor (mean = 0.59, Wilcoxon signed-rank test, V = 8630, p = 9.159 × 10^−6; Fig. 5A). No significant difference in the overall choices was observed between the olfactory preference and the learning assay (W = 13655.5, p = 0.30). Unlike in the olfactory preference assay, though, significant differences in the proportion of orange choices were apparent between the two conditioning treatments (A-/O vs. O-/A: Kruskal-Wallis chi-squared = 34.93, p = 3.424 × 10^−9; Fig. 5A), and we documented a significant learning effect (Fig. 5B). We also detected significant differences in the proportion of orange choices between lines (Kruskal-Wallis chi-squared = 23.45, p = 0.005).
We ran post-hoc tests to evaluate the performance of each line in the A-/O and O-/A conditioning procedure. For each line we measured the proportion of orange choices after conditioning with A-/O and O-/A (Fig. 6). After applying the Bonferroni-Holm correction for multiple comparisons, we found that in the A-/O procedure three lines (R10, R7, R9) had a preference for orange significant at the 5% level and four lines (R1, R2, R3, R5) had a preference significant at the 10% level, whereas in the O-/A procedure only R5 had a preference significant at the 10% level. These results indicate that most of the tested inbred lines are able to discriminate between apple and orange odor.

Figure 6 Proportion of orange odor choices for each tested line conditioned with Orange aversive/Apple palatable (O-/A exposure) and Apple aversive/Orange palatable (A-/O exposure).
We calculated the learning score (proportion of orange odor choices after A-/O exposure − proportion of orange odor choices after O-/A exposure) for the overall population and found a significant effect of learning (mean = 0.22, V = 2752.5, p < 0.001). Since there was a significant effect of Line in the learning score (Kruskal-Wallis chi-squared = 22.15, p = 0.008), we also ran post-hoc tests on each line (Fig. 6). All lines exhibited a higher proportion of orange choices after being conditioned with Orange as palatable stimulus (and all lines increased the proportion of orange odor choices after conditioning compared to the olfactory preference assay, see Fig. 4B). Using the Bonferroni-Holm correction for multiple comparisons, we found that three lines (R3, R7, R9) showed a learning score significant at the 5% level and three lines (R1, R2 and R5) showed a learning score significant at the 10% level. These results suggest that most of the tested lines are able to learn through our conditioning procedure.

Inbred lines: heritability of olfactory behavior
We derived an estimate of the genetic heritability of olfactory preferences and olfactory learning using the variance between (Vb) and the variance within (Vw) lines and calculating the intraclass correlation t as a proxy for heritability in inbred lines (Hoffmann & Parsons, 1988;David et al., 2005).
For olfactory preferences, the variability between lines (Vb = 0.09) was larger than the variability within lines (Vw = 0.06), and the intraclass correlation was t = 0.6. The same pattern holds for olfactory learning: the variability between lines (Vb = 0.017) was much higher than the variability within lines (Vw = 0.004), leading to t = 0.80. The high intraclass correlations indicate a moderate to high heritability of olfactory preferences and learning and suggest that our method is suitable to investigate these traits.
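The relation between these variance components and t can be illustrated with a minimal sketch. It assumes the simplest variance-ratio form, t = Vb / (Vb + Vw); the exact estimator used in the analysis (following Hoffmann & Parsons, 1988) may include sample-size corrections and is not reproduced here. The function name is ours.

```python
def intraclass_correlation(v_between, v_within):
    """Share of total variance due to differences between lines.

    Assumed simple form: t = Vb / (Vb + Vw).
    """
    total = v_between + v_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return v_between / total


# Variance components reported above.
t_preference = intraclass_correlation(0.09, 0.06)
t_learning = intraclass_correlation(0.017, 0.004)
print(f"preferences: t = {t_preference:.2f}")
print(f"learning:    t = {t_learning:.2f}")
```

With the reported variance components this simple ratio gives 0.60 for olfactory preferences and about 0.81 for learning, close to the values reported above; the small difference for learning may reflect rounding of the variance components or a different estimator.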

DISCUSSION
Historically, the evolutionary dynamics of behavioral traits have been particularly hard to catch. This is not only due to lack of fossil record as a tool to help reconstructing evolutionary change but also to limits in investigating organisms with a complex behavior for hundreds or thousands of generations, as can be done with yeast (e.g., Goddard, Godfray & Burt, 2005) and bacteria (Wiser, Ribeck & Lenski, 2013). Drosophila is a model system which shows either complex behavior and a life cycle fast enough for being studied in experimental evolution. For instance, in few generations of targeted selection it has been possible to obtain a significant increase in learning in a wild-derived population of Drosophila melanogaster (Mery & Kawecki, 2002), or to change its responsiveness to specific (odor-flavor or color-flavor) associations (Dunlap & Stephens, 2014). These behavioral findings have not been accompanied by a correspondent genomic investigation, partly due to the costs and difficulties associated until recently to this enterprise. The recent development of high-throughput sequencing technologies, together with advancements in statistical and bioinformatics tools, has changed this scenario. In particular, using the Evolve and Resequence method (Turner et al., 2011) entire populations can be propagated and investigated for genomic changes at subsequent time points by sequencing collectively hundreds of individuals (a method known as Pool-seq, see Futschik & Schlötterer (2010)). This approach paves the way to the analysis of complex behavioral phenotypes such as olfactory preferences and learning in Drosophila (Versace, 2015).
Empirical (Turner & Miller, 2012;Orozco-terWengel et al., 2012;Tobler et al., 2014;Franssen et al., 2015) and theoretical studies (Schlötterer et al., 2015;Kofler & Schlötterer, 2014) have shown the current limits of E&R in terms of false positive and false negative errors. Different strategies have been suggested to reduce the error rate and increase the efficiency of this method in investigating evolutionary dynamics and the genotype-phenotype link (Baldwin-Brown, Long & Thornton, 2014;Schlötterer et al., 2015;Kofler & Schlötterer, 2014), including propagating and phenotyping large samples of several replicate populations for multiple generations. The oviposition method (Mery & Kawecki, 2002) is the experimental paradigm currently used for experimental evolution of learning in fruit flies (Mery & Kawecki, 2002;Dunlap & Stephens, 2009;Dunlap & Stephens, 2014). This procedure, however, is not optimal for E&R because of the effort required for propagation, the reduced sample size, and the selective pressures it imposes on traits other than those of interest. To overcome these limitations, we have established a method based on exposure followed by a test in a T-maze, which can be used to assess olfactory preferences and learning in large (hundreds or even thousands of individuals) and medium/small (dozens) samples of fruit flies. With our experimental schedule (4 h of exposure following starvation, 4 h of starvation, 2 h of test) it is possible to conduct the assay in a single day, during the light cycle of fruit flies that are not sleep-deprived. This simple procedure can impose selection on both sexes and does not entail selection for fast egg laying or require egg washing to propagate flies.
We have used the T-maze procedure to investigate olfactory preferences and learning in a large population of D. melanogaster originally caught in South Africa and in 11 inbred lines of another population of D. melanogaster originally caught in Portugal. Overall, both populations show a preference for orange vs. apple odor. The preference for citrus media has been previously documented in D. melanogaster (Mery & Kawecki, 2002;Dweck et al., 2013;but see Betti, Soto & Hasson, 2014). We find a significant effect of learning in both populations. The presence of olfactory learning in the absence of selection for this trait shows the sensitivity of our method. The fact that we detect learning and olfactory preferences in fruit flies of both tested populations, in the absence of selective pressures, suggests that this method can be successfully applied to different gene pools.
Small population size and inbreeding negatively affect the resolution of genomic scans (Schlötterer et al., 2015;Kofler & Schlötterer, 2014;Franssen et al., 2015), thus limiting the power of E&R and genome-wide association studies. This limitation could be overcome by using the T-maze procedure on large samples, since this method requires neither individual handling of eggs nor the separation of male and female flies.
Moreover, by repeatedly testing inbred lines, we have detected genetic differences in olfactory behavior between lines. Differences between wild-derived inbred lines have been previously documented (Nepoux, Haag & Kawecki, 2010;Nepoux et al., 2015). We have also calculated the intraclass correlation t (Hoffmann & Parsons, 1988;David et al., 2005) as an estimate of heritability, which indicates a medium/high heritability for the investigated traits.
We have shown that our method is suitable for investigating the evolution of spontaneous preferences and learning performance in large groups of fruit flies with limited effort. The ease of implementation of our procedure will enable researchers to investigate the perceptual/learning abilities of their population of interest, including variability between strains, the phenotypic distribution of large populations, and the heritability of these traits, before starting long-term projects. On this basis, we suggest using T-mazes in large-scale experiments as a tool for E&R and genome-wide association studies on olfactory preferences and learning and on other traits. Our experimental paradigm can be easily adapted to the needs of chemical ecology and pest management to assess olfactory behavior with different odors and odor concentrations in a comparative perspective; to test visual behaviors by changing the color/texture of the central and food chambers; and to investigate navigation performance by increasing the number of food chambers, as well as the role of social information transmitted (Kohn et al., 2013; see for instance Battesti et al., 2015) by one or both sexes. By varying the delay between conditioning and test, it is possible to investigate the duration of memory. We suggest the use of large-scale T-mazes to widen the