Introduction

The emerging field of ecological genomics strives to uncover the genetic and molecular mechanisms influencing organisms' responses and adaptations to their natural environments (Feder and Mitchell-Olds, 2003; van Straalen and Roelofs, 2006; Ouborg and Vriezen, 2007; Ungerer et al., 2008). Achieving this aim requires species with a well-developed genomic tool-kit and a good understanding of ecologically important phenotypes. Rarely can both requirements be met simultaneously. Molecular biological and genomic tools have been developed primarily for traditional model and agricultural organisms representing a narrow spectrum of phenotypes, whereas many organisms that are the focus of ecological research have had limited genomic resources devoted to them and/or are genetically intractable. Two solutions for uniting genomic and ecological approaches are to explore the natural variation found within model or crop species, or alternatively, to turn to close relatives for which a broader range of adaptive traits can be investigated. These approaches have been successfully applied to understanding plant chemical defenses against insect herbivores by using the model species A. thaliana and Brassica crops (both members of the family Brassicaceae), as well as their near relatives (Koornneef et al., 2004; Alonso-Blanco et al., 2005; Clauss and Koch, 2006; Padilla et al., 2007; Schranz et al., 2007a). In this report, we examine chemical defenses and their effects on insect herbivory in Boechera stricta, a close relative of A. thaliana, by quantitative trait locus (QTL) analysis for a set of glucosinolates not present in Arabidopsis or Brassica (Windsor et al., 2005).

The major chemical defense in the Brassicaceae and related families is the two-component glucosinolate–myrosinase complex (Rodman et al., 1998; Windsor et al., 2005) that can influence levels of feeding damage and oviposition by many insect herbivores (particularly Lepidoptera) (Burow et al., 2006; Beekwilder et al., 2008). Generalist insects are sensitive to glucosinolate-based plant defenses, whereas specialists can cope with these compounds (Kliebenstein et al., 2005; Windsor et al., 2005; Grubb and Abel, 2006; Halkier and Gershenzon, 2006). Levels of herbivory generally decrease the fitness of the host plants and it has been shown that herbivores can act as agents of natural selection on plant chemical defenses (Mauricio and Rausher, 1997). When variable secondary compounds are found within a plant species, it is tempting to assume that there will also be differences in defense towards herbivores. However, this must be experimentally validated particularly when the defensive compound(s) being tested are not well studied, such as glucosinolates derived from branched-chain amino acids (BCAA).

In Arabidopsis, naturally occurring genetic polymorphisms and loss-of-function mutations have allowed identification of genes encoding glucosinolate biosynthetic enzymes (recent reviews by Grubb and Abel, 2006; Halkier and Gershenzon, 2006). The biosynthesis of glucosinolates occurs in three steps: chain-elongation of the amino acid (for Met and Phe only), formation of the core structure and side-chain modifications. Glucosinolates are synthesized from one of eight amino-acid precursors. The chemicals derived from Met, Val, Ile, Leu or Ala are classified as aliphatic glucosinolates, those derived from Trp are called indole glucosinolates and those derived from Phe or Tyr are referred to as aromatic glucosinolates. Two consecutive reactions in the core pathway, catalyzed by specific cytochrome P450s (CYP79 and CYP83 gene families), convert the amino acid to an S-alkylthiohydroximate (Kliebenstein et al., 2005; Halkier and Gershenzon, 2006). Although structural variation among glucosinolates is derived from the precursor amino acids, modifications of the R-groups also contribute enormous diversity to this class of compounds. The carbon chain-elongation of methionine-derived glucosinolates in particular gives rise to much of the diversity of glucosinolates found in the Brassicaceae. Chain elongation is controlled by the GS-Elong locus, which encodes several methylthioalkylmalate synthases (MAM). The final step in glucosinolate biosynthesis is possible side-chain modification, for example, acylation, alkenylation, esterification, hydroxylation and oxidation.

Genomic and ecological approaches can be integrated through detailed investigations of natural variation displayed by the model species. An example is the molecular evolutionary analysis of variation in Met glucosinolate chain-elongation among Arabidopsis ecotypes. Recent analysis has shown that glucosinolate chain-length variation, and resistance to feeding by the generalist herbivore Spodoptera exigua, are due to complex allelic and copy number variation of MAM genes at the GS-Elong locus (Kroymann et al., 2003). Comparative genomic studies have also implicated the MAM genes in natural genetic variation for glucosinolate chain elongation in the Brassica crops (Li and Quiros, 2002; Gao et al., 2006) and in Arabidopsis lyrata (Clauss et al., 2006; Heidel et al., 2006). In A. lyrata, genetic variation at GS-Elong determines biochemical variation in glucosinolate side chains both within and among populations, showing widespread functional polymorphism at GS-Elong. Although Met-derived glucosinolates are the dominant constitutive glucosinolates expressed in the leaves of Arabidopsis and Brassica species, we have little information about the possible ecological or adaptive role of aliphatic glucosinolates derived from other amino acids.

Transgenic methods can be employed to explore phenotypes beyond the natural range of variability (for example, glucosinolates derived from non-methionine amino acids). Transgenic over-expression of CYP79s, such as those from the cyanogenic pathway of non-glucosinolate-producing species, is one means of generating novel glucosinolates. Introduction of CYP79D2 from the biosynthetic pathway of the cyanogenic glucosides of cassava (Manihot esculenta) (Mikkelsen and Halkier, 2003) led to the accumulation of glucosinolates derived from the BCAAs valine and isoleucine; namely, 1-methyl ethyl (isopropyl) and 1-methylpropyl, which are normally absent or present in trace amounts in Arabidopsis (Brown et al., 2003). When these transgenic plants were assayed with pathogenic fungi and bacteria, it was found that the CYP79D2 transgenic line provided enhanced resistance to some pathogens, such as Erwinia carotovora (Brader et al., 2006). However, the transgenic CYP79D2 plants still produced predominantly endogenous Met-derived glucosinolates (Mikkelsen and Halkier, 2003).

Another approach to studying the effects of non-Met glucosinolates on herbivore defense is to examine additional species having variation beyond what is available in Arabidopsis. In a recent survey of glucosinolate diversity among the relatives of Arabidopsis, several examples of dominant non-Met glucosinolates were found (Windsor et al., 2005). In particular, the genus Boechera was found to have some lines that predominantly produced methionine glucosinolates (especially 6-methylsulfinyl hexyl) and others that predominantly produced valine-derived glucosinolates (1-methyl ethyl) (see also the Supplementary data file, which shows this variation both within and between populations).

The genus Boechera contains an array of morphologically and ecologically diverse taxa with highest diversity in western North America. The genus has a base chromosome number of x=7 and contains diploid sexuals, as well as diploid, aneuploid and polyploid apomictic lines (Kantama et al., 2007). Boechera stricta is one of the most morphologically and molecularly well-defined Boechera species; it is predominantly diploid, sexual and highly self-fertilizing, and most accessions form a monophyletic group (Dobeš et al., 2004; Schranz et al., 2005), making it a good system for ecological genomic studies. To this end, we have also created a number of genomic resources, including extensive comparative analyses with Arabidopsis. The B. stricta line SAD12 was the focus of a partial genomic sequencing project in which we analyzed over 39 000 paired end-sequences (Windsor et al., 2006). We found that genic and inter-genic regions are very similar to A. thaliana, and conserved microsynteny can be used for rapid identification and cloning of Boechera genes (Windsor et al., 2006). These results were used to generate 196 microsatellite and candidate gene molecular markers and to create an F2 genetic map of B. stricta based on 192 individuals (Schranz et al., 2007b). The genetic map also established the detailed syntenic relationships to Arabidopsis and to the n=8 species, such as A. lyrata and Capsella rubella (Schranz et al., 2006b).

The parents used for our genetic cross came from two diverse habitats and differ in their glucosinolate profiles. The maternal SAD12 locality in Colorado is a sagebrush grassland in a river valley occurring at an elevation of 2530 m. This population produces predominately Met-derived 6-methylsulfinyl hexyl glucosinolate (6MSOH). The paternal Lost Trail Meadow (LTM) population from Idaho grows in a subalpine meadow occurring at an elevation of 2390 m. Val-derived, 1-methyl ethyl glucosinolate (1ME) is the predominant glucosinolate produced by LTM. Hence, Boechera provides an excellent opportunity to evaluate the effects that differences in glucosinolate amino-acid precursors have on insect herbivory and whether these differences are adaptive. In this study, we perform QTL analysis on ecologically relevant quantitative variation in glucosinolates between Boechera populations and assess the effects of this variation on feeding by the generalist lepidopteran herbivore, Trichoplusia ni (Noctuidae). These efforts are an important step towards the cloning of an ecologically variable trait that will allow for future direct testing of ecological hypotheses.

Materials and methods

Plant materials and genetic map construction

As described earlier (Schranz et al., 2007b), a genetic cross was made between two highly inbred lines of Boechera stricta (Graham) Al-Shehbaz. The maternal line ‘SAD12’ was collected in Gunnison County, CO, USA and the paternal line ‘LTM’ was collected in Lemhi County, ID, USA. Details about the plant populations, locations and the methods for genetic crossing were reported earlier (Schranz et al., 2005).

A total of 192 F2 lines were grown, with seeds placed on moist filter paper in sealed Petri dishes and cold-treated at 4 °C for 3 weeks in the dark. The Petri dishes were then transferred to a growth chamber until seed germination. The germinated seedlings were transferred to 96-well flats. Seedlings were grown for 4 weeks and then transplanted to pots (11 × 11 × 13 cm). The plants were grown in a controlled growth room at 21–23 °C under long-day conditions (16 h light and 8 h dark).

When plants were 8 weeks old, we sampled young leaves from each line directly into liquid nitrogen to minimize breakdown of glucosinolates. The two most recent fully expanded leaves (0.1 g) per line were first harvested to determine constitutive glucosinolate composition (see below). The collected tissue samples were freeze-dried and the dry weight of each sample was measured. Tissue was then harvested for DNA isolations from each of the F2 lines used for genetic map construction (Schranz et al., 2007b).

The F2 lines were then allowed to grow for an additional 4 weeks, and were subsequently vernalized by transfer to 4 °C under long-day conditions (16 h light and 8 h dark) for 4 weeks. Post-vernalization, the plants were transferred back to 21–23 °C. All lines flowered in approximately 3–5 weeks and were allowed to self-fertilize. The resulting F3 seeds were harvested from each line.

To investigate the genetic variation for insect resistance among lines of B. stricta, we used a randomized complete blocks design with six replicates for each line. The 192 F3 lines were divided equally among six blocks (each block was constituted by four 48-pot flats). In October 2006, ten F3 seeds per line were placed into Petri dishes and subjected to 4 °C for cold stratification for 4 weeks. Once germinated, six seedlings per line (192 × 6=1152 plants in total) were individually grown in 5 × 5 cm pots filled with standard soil (Fafard 4p mix; MA, USA). These pots were arranged randomly in flats (48 pots per flat). Plantlets were grown in the Duke University greenhouse and placed under controlled growth conditions. After ca. 4 weeks, plants were moved to a growth chamber and analyzed for insect herbivory (below).
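The layout described above (192 lines, six blocks, four 48-pot flats per block) can be sketched as follows. This is only an illustration of a randomized complete block arrangement: the line identifiers, the seed and the function name `rcbd_layout` are hypothetical, and the original randomization procedure is not specified in the text.

```python
import random


def rcbd_layout(lines, n_blocks=6, pots_per_flat=48, seed=0):
    """Return {block: [flat, ...]}, where each flat is a list of line IDs.

    Every line appears exactly once per block (one replicate per block),
    and pot positions are shuffled independently within each block.
    The seed is an arbitrary choice made here for reproducibility.
    """
    rng = random.Random(seed)
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(lines)
        rng.shuffle(order)
        # Cut the shuffled order into consecutive flats of equal size.
        flats = [order[i:i + pots_per_flat]
                 for i in range(0, len(order), pots_per_flat)]
        layout[block] = flats
    return layout


# Hypothetical identifiers for the 192 F3 lines.
lines = [f"F3_{i:03d}" for i in range(1, 193)]
layout = rcbd_layout(lines)
total_plants = sum(len(flat) for flats in layout.values() for flat in flats)
```

With these settings the sketch yields four flats per block and 1152 pots in total, matching the counts given in the text.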

Insect feeding, herbivory assessment and statistical analysis

An insect feeding trial was conducted using the generalist lepidopteran, Trichoplusia ni (cabbage looper, Noctuidae). Second-instar larvae were ordered from Benzon Research Inc. (Carlisle, PA, USA) and were available for experimentation 24 h before the beginning of the feeding trial.

Without any earlier starvation period, a single second-instar T. ni larva was placed on each plant. To avoid larvae roaming among plants and to guarantee an individual insect treatment for each plant during the whole experiment period (48 h), each plant-larva pair was encapsulated using a cone-shaped pipe made of plastic (5 cm diameter × 14 cm high). Preliminary observations determined that the use of these plastic pipes did not affect feeding behavior or survivorship of T. ni larvae even over longer periods (72 h). After 48 h of feeding, larvae were removed from the plants and herbivory levels for each plant recorded. We recorded both the proportion of leaves with some sign of insect herbivory and a visual estimation of the proportion of tissue removed per leaf (ranging from 1–100; see (Kliebenstein et al., 2002) for a similar procedure). The percentage of leaf area damaged was calculated as the product of the proportion of leaves harmed and the average proportion of tissue removed per leaf. In addition, all plants were investigated for the presence or absence of insect larvae at the end of the experiment; any plants lacking a larva were excluded from the analysis (8 plants out of 1152). As plant size may influence both the type and the concentration of glucosinolates in Brassicaceous species (Petersen et al., 2002; Brown et al., 2003), rosette diameter was recorded 24 h before the experiment as an estimate of plant size.
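The damage score described above is a simple product, which a short sketch makes concrete. The function name and the example values are hypothetical, and the sketch assumes that the per-leaf estimates are averaged over the damaged leaves only; the text does not state whether undamaged leaves entered that average.

```python
def percent_leaf_area_damaged(n_leaves, removed_pct_per_damaged_leaf):
    """Percentage of leaf area damaged for one plant.

    n_leaves: total number of leaves on the plant.
    removed_pct_per_damaged_leaf: visual estimates (1-100) of tissue
        removed, one value per leaf showing herbivory (assumption:
        undamaged leaves are not included in this list).
    """
    if n_leaves <= 0:
        raise ValueError("n_leaves must be positive")
    n_damaged = len(removed_pct_per_damaged_leaf)
    if n_damaged == 0:
        return 0.0
    proportion_harmed = n_damaged / n_leaves
    mean_removed = sum(removed_pct_per_damaged_leaf) / n_damaged
    return proportion_harmed * mean_removed


# Hypothetical plant: 10 leaves, 4 of them damaged.
example_damage = percent_leaf_area_damaged(10, [10, 20, 30, 20])
```

In this hypothetical case, 40% of leaves are harmed and the mean removal is 20%, so the plant scores about 8% leaf area damaged.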

To obtain least square means for T. ni herbivory for each line, we fitted a generalized linear mixed model using the GLIMMIX procedure in SAS 9.1.2 (SAS Institute, 2005). In this model, we considered the response variable (that is, percent plant damage per plant) to have a binomial distribution and logit link function. ‘Line’ was included in the model as a fixed-effect factor, and flat-nested-within-block was treated as a random-effect factor. Plant size (that is, plant diameter, see above) was a covariate included in the model to control for developmental differences among plants (see Kliebenstein et al., 2002). Finally, least square model-adjusted means for herbivory were obtained using the ILINK option of the LSMEANS statement in the GLIMMIX procedure (SAS Institute, 2005).
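One way to write out the model described above is given below. The notation is an assumption introduced here for illustration and does not come from the SAS specification: p_{ijk} is the expected proportion of damage for a plant of line i in flat k of block j, α_i is the fixed line effect, x_{ijk} is rosette diameter, f_{k(j)} is the random flat-within-block effect, and η̂_i is the least square mean of line i on the logit scale.

```latex
\begin{aligned}
\operatorname{logit}(p_{ijk}) &= \log\frac{p_{ijk}}{1 - p_{ijk}}
  = \mu + \alpha_i + \beta\, x_{ijk} + f_{k(j)}, \\
f_{k(j)} &\sim \mathcal{N}\!\left(0, \sigma_f^{2}\right), \\
\hat{p}_i &= \frac{1}{1 + \exp(-\hat{\eta}_i)} .
\end{aligned}
```

The last line is the inverse-link back-transformation, which is what the ILINK option applies to report the least square means on the proportion scale rather than the logit scale.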

Glucosinolate analysis

Glucosinolate extraction and purification on parental and F2 leaf and seed material was carried out according to Kroymann et al., 2001. High-pressure liquid chromatography with a diode-array detector was performed as described (Kliebenstein et al., 2001a; Reichelt et al., 2002) using an Agilent 1100 high-pressure liquid chromatography, with 96-cell extraction protocols and a 96-cell autoloader. Identification of desulfo-glucosinolates was carried out by comparison of retention times and UV absorption spectra at A229 nm with those of known standards and of peaks that were confirmed by electrospray mass spectrometry (LC/MS) or LC/MS/MS as reported by Windsor et al., 2005.

Glucosinolate concentrations were inferred from high-pressure liquid chromatography peak areas and were normalized by dividing by the dry weight of the sample. The glucosinolate concentrations were then log-transformed before analysis. Ratios among compound classes were calculated to reflect enzyme function at key steps in glucosinolate biosynthesis (as described by Clauss et al., 2006).
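The normalization and ratio steps above can be sketched as follows. The compound groupings follow the names used in the Results; the sample values, the function names and the choice of base-10 logarithms are assumptions, since the text does not state the base of the log transformation.

```python
import math

# Compound classes as named in the Results section.
MET_DERIVED = ("4MSOB", "6MSOH", "7MSOH", "6MTH", "7MTH")
BCAA_DERIVED = ("1ME", "1MP", "2MP")


def normalize_by_dry_weight(amounts, dry_weight_g):
    """Divide each peak-area-derived amount by the sample dry weight."""
    if dry_weight_g <= 0:
        raise ValueError("dry weight must be positive")
    return {name: value / dry_weight_g for name, value in amounts.items()}


def log_bcaa_met_ratio(concentrations):
    """Log10 ratio of total BCAA-derived to total Met-derived compounds."""
    met = sum(concentrations.get(name, 0.0) for name in MET_DERIVED)
    bcaa = sum(concentrations.get(name, 0.0) for name in BCAA_DERIVED)
    if met <= 0 or bcaa <= 0:
        raise ValueError("both class totals must be positive to take a log")
    return math.log10(bcaa / met)


# Hypothetical sample: arbitrary amounts and a 0.01 g dry weight.
sample = {"6MSOH": 2.0, "6MTH": 0.5, "1ME": 20.0, "1MP": 5.0}
conc = normalize_by_dry_weight(sample, dry_weight_g=0.01)
ratio = log_bcaa_met_ratio(conc)
```

Because the dry weight cancels in the ratio, the log ratio depends only on the relative class totals, which is one reason ratios are a convenient normalization.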

QTL analysis

Quantitative trait locus mapping of the log-transformed F2 glucosinolate profiles and the least square model-adjusted means for herbivory based on the six replicated F3 lines was done using QTL Cartographer for Windows (Wang et al., 2001–2006), employing multiple interval mapping (MIM) (Kao et al., 1999; Zeng et al., 1999) and composite interval mapping (CIM) (Zeng, 1993, 1994; Jiang and Zeng, 1995). CIM tests were performed at 1-cM steps, with a 10-cM window size. Genome-wide threshold values were determined at the 5% level by running 1000 permutations (Churchill and Doerge, 1994). The QTL detected in the CIM analysis was used as the initial model for MIM mapping. These tests in MIM were performed in 1-cM steps, with Bayesian information criterion model selection. One- and two-LOD (logarithm of odds) support limits for each QTL were calculated around the QTL positions. The results of QTL analysis, including the 1- and 2-LOD confidence intervals, were drawn using MapChart 2.2 (Voorrips, 2002).
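The support-limit step can be illustrated with a minimal sketch. It is not the QTL Cartographer implementation: the function name is hypothetical, the LOD profile is invented for the example, and the interval is taken at the scanned map positions without interpolation between them.

```python
def lod_support_interval(positions, lods, drop=1.0):
    """Return (left, peak, right) map positions of a LOD support interval.

    The interval is the contiguous stretch around the maximum LOD in
    which the score stays within `drop` LOD units of the peak.
    """
    if len(positions) != len(lods) or not lods:
        raise ValueError("positions and lods must be non-empty and equal length")
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]


# Hypothetical LOD profile scanned at 1-unit steps along a linkage group.
positions = [28, 29, 30, 31, 32, 33, 34]
lods = [2.0, 5.5, 6.4, 7.2, 6.5, 5.4, 3.0]
one_lod = lod_support_interval(positions, lods, drop=1.0)
two_lod = lod_support_interval(positions, lods, drop=2.0)
```

For this invented profile the 1-LOD interval runs from position 30 to 32 and the 2-LOD interval from 29 to 33, both centered on the peak at 31.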

Results

Glucosinolate variation and QTL analysis

The leaves of the parental genotypes contained predominately aliphatic glucosinolates derived from either the amino acid methionine (4-methylsulfinylbutyl (4MSOB); 6-methylsulfinylhexyl (6MSOH); 7-methylsulfinylheptyl (7MSOH); 6-methylthiohexyl (6MTH); 7-methylthioheptyl (7MTH)) or from the BCAA (1-methylethyl (1ME) from valine; 1-methylpropyl (1MP) from isoleucine; 2-methylpropyl (2MP) from leucine). Both progenitor genotypes also displayed trace quantities of the tryptophan-derived indolic glucosinolate, 4-methoxy-indol-3-yl-methyl (4MO-I3M). Although the total quantities of glucosinolates were similar among genotypes, they differed in their relative quantities of particular compounds. The most prominent glucosinolate for the SAD12 parent was the methionine-derived, 6MSOH (also known as glucohesperin), whereas the LTM parent had higher concentrations of BCAA-derived glucosinolates 1ME (also known as glucoputranjivin) and 1MP (also known as glucocochlearin). We have also observed a similar pattern of glucosinolate variation both within and between populations of B. stricta (Supplementary data).

Quantitative trait locus analyses were performed using the weight-adjusted and log-transformed glucosinolate amounts, as well as by analyzing the log ratio of total BCAA-derived to total Met-derived glucosinolates. Earlier quantitative work on glucosinolates has shown that ratios are an effective method for normalizing data (Clauss et al., 2006; Heidel et al., 2006). The predominant glucosinolate compounds, and the log ratio of BCAA-Met glucosinolates, had only one significant QTL, located on the top of BstLG7 (Figure 1). The log ratio of BCAA-Met glucosinolates had the greatest level of support using MIM analysis, with a LOD score of 65.1 centered at 31.8 cM and a very narrow 1-LOD confidence interval from 30.6 to 32.5 cM, encompassing the interval from markers Cyp83A to BstES007 and BstES0008 (Figure 1). CIM mapping gave very similar results with a LOD of 66.6. We have named this locus the branched-chain methionine allocation (BCMA) locus. The BCMA QTL has an R²=0.804, with the additive component accounting for 69.1% of the variance and the dominance component accounting for 11.3% of the variance.
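For readers less familiar with the additive and dominance terms, the sketch below shows the standard way these effects are defined from the three genotype-class means in an F2 (half the difference between homozygotes, and the heterozygote's deviation from their midpoint). The genotype means are hypothetical and the function name is an assumption; only the two reported variance percentages come from the text.

```python
def additive_dominance(mean_ss, mean_sl, mean_ll):
    """Additive (a) and dominance (d) effects from F2 genotype means.

    a: half the difference between the two homozygote means.
    d: deviation of the heterozygote mean from the homozygote midpoint.
    """
    midpoint = (mean_ss + mean_ll) / 2
    a = (mean_ll - mean_ss) / 2
    d = mean_sl - midpoint
    return a, d


# Hypothetical log-ratio means for the SS, SL and LL genotype classes.
a, d = additive_dominance(mean_ss=-1.0, mean_sl=0.5, mean_ll=1.0)

# Consistency check on the reported leaf values: the additive and
# dominance percentages should sum to the reported variance explained.
reported_total_pct = 69.1 + 11.3
```

The reported percentages sum to 80.4%, which agrees with the stated proportion of variance explained of 0.804.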

Figure 1

Positions of single QTLs detected for glucosinolate profile (BCMA) and levels of herbivory (PD), both located on linkage group BstLG7 of Boechera stricta. Marker names and cM positions for the top 65.0 cM of BstLG7 from the published F2 genetic map (Schranz et al., 2007b) are shown to the left. QTL analyses were performed using the log ratio of total BCAA-derived to total Met-derived glucosinolates for both leaves and seeds of F2 individuals, giving one significant QTL named the Branched Chain-Methionine Allocation or BCMA locus. BCMA is shown in black to the right of the linkage group, with the 2-LOD confidence interval represented by bars, the 1-LOD confidence interval by a box and the maximal LOD score (of 119.4) by the horizontal line. Plant damage (PD) was calculated from 48-h feeding trials conducted with second-instar larvae of the generalist lepidopteran, Trichoplusia ni. Analysis gave a single significant PD QTL, shown in white to the right of the linkage group, with the 2-LOD confidence interval represented by bars, the 1-LOD confidence interval by a box and the maximal LOD score (of 7.2) by the horizontal line.

The seeds of the F2 population had a similar spectrum of glucosinolates to that of the leaves. However, there were some major differences. In particular, seeds contained higher quantities of the thiol glucosinolates, 6MTH and 7MTH, than leaves, which contained more of the sulfinyl glucosinolates. MIM analysis of the seed profiles showed a LOD score of 119.4 centered at 31.8 cM, with R²=0.934 for the BCMA QTL. The additive component accounted for 73.7% of the variance and the dominance component for 19.7% of the variance. CIM mapping gave similar results, with a LOD score of 103.5 centered at 31.0 cM, and R²=0.923.

A summary of the variation in glucosinolate composition and how it varies with the genotype at BCMA (S alleles derived from the SAD12 parent and L alleles from the LTM parent) in the F2 population is presented in Figure 2. The figure shows that total leaf-glucosinolate content remains nearly constant, but that genotypes carrying SAD12-derived alleles have greater amounts of Met-derived glucosinolates, whereas genotypes carrying LTM-derived alleles have greater amounts of BCAA-derived glucosinolates (Figure 2a). In seeds, there is a more uniform level of Met-derived glucosinolates, with variable BCAA-derived glucosinolates (Figure 2b). The segregation of the log ratio of the BCAA/Met glucosinolates in leaves and seeds used for QTL analysis by genotype is shown in Figure 2c.

Figure 2

Summary of glucosinolate profiles and levels of herbivory given by genotype at the BCMA locus of F2 individuals (SS=two alleles from maternal parent SAD12, LL=two alleles from paternal parent LTM, SL=heterozygote). Log values of glucosinolates, adjusted per gram dry weight, from leaves (a) and seeds (b), including methionine-derived (4-methylsulfinylbutyl (4MSOB); 6-methylsulfinylhexyl (6MSOH); 7-methylsulfinylheptyl (7MSOH); 6-methylthiohexyl (6MTH); 7-methylthioheptyl (7MTH)), BCAA-derived (1-methylethyl (1ME) from valine; 1-methylpropyl (1MP) from isoleucine; 2-methylpropyl (2MP) from leucine) and tryptophan-derived (4-methoxy-indol-3-yl-methyl (4MO-I3M)). Also shown are the sums for Met-derived, BCAA-derived and Total glucosinolates. (c) The log ratio of total BCAA-derived to Met-derived glucosinolates by genotype used for QTL analysis. (d) Levels of plant damage (PD) by genotype used for QTL analysis.

Insect feeding variation and QTL analysis

The analysis of plant damage (PD) by T. ni larvae showed significant differences among the 192 F3 families (GLMM result: F(191, 928)=24.83, P<0.0001), suggesting extensive variation among the lines in resistance to insect feeding on B. stricta leaves. PD was also significantly affected by variation in plant size (F=4.19, P=0.041; estimate ±1 s.e.: −0.0028±0.0013) and Flat (Z=0.047, P=0.014). Least square mean values of PD for each genotype were used to QTL map insect feeding. We similarly detected a correlation between generalist insect feeding damage and glucosinolate profiles in an analysis of four B. stricta populations (Supplementary data).

Quantitative trait locus analysis using PD detected only one significant QTL by both MIM and CIM, located on the top of BstLG7 with a maximal LOD score of 7.2. The PD QTL is in the same position as BCMA; it is centered at 30.6 cM and has a 1-LOD confidence interval from 27.2 to 31.8 cM (Figure 1). The PD QTL has R²=0.179, with the additive component accounting for 15.9% of the variance and the dominance component accounting for 2.1% of the variance. A summary of the variation in PD and how it varies with BCMA genotype is presented in Figure 2d.

Comparative genomics of BCMA region

The genetic map used in this study was earlier utilized to establish a comparative genomics framework between B. stricta, A. thaliana and n=8 chromosome relatives, such as A. lyrata and Capsella (Schranz et al., 2007b). The BCMA genomic region is located within genomic block T (genomic blocks defined in Schranz et al., 2006b). The centromere of BstLG7 likely resides between blocks S and T based on comparison with A. lyrata (Schranz et al., 2006b). However, the 2-LOD confidence interval for BCMA lies within block T (Figure 1) and there are normal levels of recombination between markers and nearly identical levels of recombination compared with the same interval of A. thaliana (which does not have a centromere near this genomic region). This supports the conclusion that the QTL does not reside within a low or non-recombining pericentromeric genomic region.

The 1-LOD confidence interval for BCMA spans only 1.9 cM. The genomic region shows high levels of synteny to the A. thaliana genomic region corresponding to the interval on chromosome 4 approximately from AT4G13410 to AT4G14250 (or nucleotide positions 7 790 000–8 215 000). This interval in Arabidopsis encompasses only 100 annotated genes and 420 kb of sequence. Given that the Boechera genome is slightly larger (1.7 times the size of Arabidopsis (Schranz et al., 2006a)), this would still only equate to a genomic interval of around 700 kb. In addition, the peak LOD scores for BCMA and PD suggest that the QTL may be restricted to the even smaller 1.1 cM window between markers Cyp83A (AT4G13770) and BstES0008 (AT4G14200), a region containing 50 genes and 200 kb in A. thaliana.
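The interval-size estimate above is simple arithmetic, reproduced here as a sketch. It assumes, as the text implicitly does, that physical distance scales uniformly with the 1.7-fold genome size difference; the variable names are hypothetical, and the per-map-unit figure is a rough derived value rather than a result reported in the paper.

```python
# Coordinates of the syntenic A. thaliana interval given in the text.
ARABIDOPSIS_START_BP = 7_790_000
ARABIDOPSIS_END_BP = 8_215_000

# Boechera genome size relative to Arabidopsis, as stated in the text.
GENOME_SIZE_RATIO = 1.7

# 1-LOD interval limits for BCMA on BstLG7, in map units (centimorgans).
INTERVAL_LEFT = 30.6
INTERVAL_RIGHT = 32.5

arabidopsis_span_kb = (ARABIDOPSIS_END_BP - ARABIDOPSIS_START_BP) / 1000
# Assumption: uniform scaling of physical distance with genome size.
boechera_span_kb = arabidopsis_span_kb * GENOME_SIZE_RATIO
interval_width = INTERVAL_RIGHT - INTERVAL_LEFT
# Rough physical distance per map unit implied by the two estimates.
kb_per_map_unit = boechera_span_kb / interval_width
```

The coordinates give a span of 425 kb in Arabidopsis (quoted as about 420 kb in the text) and roughly 720 kb after scaling, consistent with the stated estimate of around 700 kb.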

The genomic region on the top of BstLG7 near BCMA displays significant levels of gametic segregation distortion (Schranz et al., 2007b), with an under-representation of maternal SAD12 homozygous genotypes (17.9% of F2 progeny). So, although we did not find evidence of epistasis between BCMA and other genomic regions, segregation distortion within the genome could hamper our ability to detect such interacting loci. Ongoing backcrossing experiments of the BCMA region should help to resolve this possibility.

Discussion

The development of additional model species within the Brassicaceae, which display novel physiological and ecological traits, allows comparative genomic analysis of phenotypes that are not available in A. thaliana or Brassica crops (Schranz et al., 2007a). The diverse glucosinolate profiles that influence herbivory in Arabidopsis and other Brassicaceae arise largely from biosynthetic modifications in type and length of side chains, and in the steps that determine the amino-acid precursors of the glucosinolate skeleton. Species in the genera Arabidopsis and Brassica mostly produce chain-elongated glucosinolates derived from methionine in their leaves, and contain almost no glucosinolates derived from BCAA (Kliebenstein et al., 2001b; Windsor et al., 2005). In this report, we further the development of Boechera stricta as a system to study ecologically relevant phenotypic and genetic variation. In B. stricta, we have found a major glucosinolate polymorphism that controls the allocation to BCAA- vs methionine-derived glucosinolates. By QTL analysis, we have identified a single QTL with large effects on glucosinolate composition and small effects on insect resistance, which we have named the BCMA (Branched Chain Methionine Allocation) locus, influencing both the amino-acid precursors of glucosinolate production and defense against a generalist herbivore. The future cloning of the locus will allow us to test evolutionary hypotheses, including the potential role of balancing selection and local adaptation.

Variation and QTL analysis of glucosinolates and insect feeding

Boechera shows a major glucosinolate polymorphism controlling the allocation to BCAA- vs methionine-derived glucosinolates. The major methionine-derived glucosinolates are the six-carbon (6C) elongated 6MSOH and 6MTH in leaves and seeds, respectively. The distribution of carbon chain lengths of Met-derived glucosinolates is determined by the allelic state of the GS-Elong locus. In A. thaliana, this locus contains a tandem array of two or three genes (MAM1, MAM2 and/or MAM-L) and forms either 3C or 4C glucosinolates (Kroymann et al., 2003). A. lyrata displays greater variation in longer-chain Met glucosinolates because of variation at the MAM locus (Heidel et al., 2006). It has been found that longer-chain glucosinolates tend to be more toxic than those with shorter chains (Borek et al., 1998). Hence, the predominantly 6C glucosinolates may provide a higher level of defense against generalist herbivorous insects. Boechera contains a tandem array of three MAM genes (MAMa, MAMb and MAMc) (Benderoth et al., 2006), which evidently control the chain-elongation process. In our cross, we do not find variation or QTL for Met-derived chain-length differences. In other B. stricta accessions that we have surveyed to date, we also find little to no variation in methionine chain length (Mitchell-Olds, unpublished). Hence, unlike in Arabidopsis species (Kroymann et al., 2003; Clauss et al., 2006), variation at the MAM locus does not seem to be a major component of the natural variation of glucosinolates in B. stricta. However, we cannot rule out that some yet untested B. stricta populations will show significant differences.

In B. stricta, the major BCAA-derived glucosinolates are 1ME derived from valine and 1MP derived from isoleucine. These glucosinolates are uncommon among members of the Brassicaceae (Windsor et al., 2005), but are found among other groups in the Brassicales. For example, within the genus Capparis (Capparaceae), C. flexuosa has high levels of the Ile-derived 1MP and C. spinosa (capers) has high levels of Val-derived 1ME (from which capers derive their characteristic flavor) (Mikkelsen and Halkier, 2003). In transgenic Arabidopsis plants (expressing the cassava CYP79D2 gene) that produced 1MP and 1ME glucosinolates, the authors found a strong correlation between total quantity of BCAA-glucosinolates and the ratio of 1MP:1ME (Mikkelsen and Halkier, 2003). The authors hypothesized that this might be because of more rapid depletion of Val-pools. By contrast, in our B. stricta F2 lines carrying one or two copies of the LTM allele at BCMA, we found a constant ratio of 1MP:1ME. This suggests that the flux through the pathway is either not rate limiting for Val or Ile precursors, or that regulation or enzyme kinetics are more finely controlled than in the Arabidopsis transgenic lines. The short-chained 1MP and 1ME have a different bioactivity than the larger side-chained and differently structured 6MSOH, as shown by our feeding studies. An earlier study had shown that shorter chain-length glucosinolates can be less toxic (Borek et al., 1998); however, the variable structure might also cause the differences in bioactivity. Potentially, the 1MP and 1ME glucosinolates have beneficial effects as biofumigants because of their high volatility, or alternatively, they may be more effective than Met-derived glucosinolates in combating bacterial or fungal pathogens (Brader et al., 2006).

We observed higher concentrations and altered ratios of glucosinolates in seeds than in leaves. Most notably, seeds have much higher levels of the reduced 6MTH glucosinolate and higher levels of the indolic glucosinolate, 4MO-I3M. The larger concentration and greater amount of reduced Met-derived and non-aliphatic glucosinolates is similar to the difference between leaves and seeds observed in the Columbia ecotype of A. thaliana (Brown et al., 2003). The conversion of reduced thiol- to the oxygenated sulfinyl-glucosinolates in Arabidopsis leaves has recently been found to be controlled by a flavin-monooxygenase (FMO) as the S-oxygenating enzyme, FMOGS−OX1 (Hansen et al., 2007). Expression of this locus and/or other FMO genes may be restricted to leaf tissues and absent in developing seeds. Also, the ratio of 1MP:1ME was slightly higher in seeds than in leaves when the lines had at least one LTM allele at BCMA, potentially because of different substrate pools and/or gene expression between the two tissues. The differences in glucosinolates in seeds vs leaves could be of ecological importance. There are a number of insect species within the host range of B. stricta that feed on its siliques and/or seeds (Mitchell-Olds, personal observation). The potential role of the specific seed glucosinolates in the defense of these tissues will be an important future research objective.

To test whether the difference in allocation between Met- and BCAA-derived glucosinolates had any potential ecological significance because of altered defense, we performed a feeding assay with a generalist herbivorous insect. QTL analysis showed a single major additive QTL that co-segregated with the BCMA glucosinolate locus. From this result, we conclude that the long-chained Met-derived glucosinolates provide a higher level of defense against a generalist herbivore than the shorter-chain BCAA glucosinolates. We also observed a similar pattern in glucosinolate profiles and effects on generalist insect feeding by analysis of individuals from four populations of B. stricta (Supplementary data), suggesting that the single major QTL identified in our cross is segregating within and between populations.

Potential genetic control of BCMA

There are several potential mechanisms for the differences in glucosinolate profiles controlled by the BCMA locus: a polymorphism in glucosinolate biosynthetic genes, a change in precursor amino acid substrate availability, a change in a transporter between cellular compartments or a regulatory change influencing biosynthesis. The analysis of glucosinolate biosynthetic mutants in Arabidopsis suggests that components of the core glucosinolate biosynthetic pathway are partly modular (Kliebenstein et al., 2001b; Windsor et al., 2005). In other words, the components are non-redundant and catalyze equivalent reactions while displaying varying substrate specificities. In A. thaliana, the conversion of Met-derived amino acids to aldoximes is catalyzed by CYP79F1 and CYP79F2 (Hansen et al., 2001; Reintanz et al., 2001; Chen et al., 2003), with the resultant Met-aldoximes being metabolized mostly by CYP83A1 (Hemm et al., 2003; Naur et al., 2003). Even the final reaction in glucosinolate biosynthesis, the conversion of desulfo-glucosinolates to the core glucosinolate structure, is catalyzed by a family of sulfotransferases that display a level of substrate specificity (Piotrowski et al., 2004). Thus, alterations involving CYP79, CYP83 or AtST5 homologs, or the occurrence of a gene of unknown function, could explain the difference between Met- and BCAA-glucosinolate phenotypes.

Variation in the availability of glucosinolate precursor substrates could also underlie the BCMA QTL. A number of biochemical changes could lead to differences in available substrate pools, such as the redox state of the cells or available levels of sulfur needed for both Met and glucosinolate production. An interesting trade-off between levels of Met and branched-chain amino acids was recently established, when it was shown that the catabolism of Met leads to increased Ile production (Rebeille et al., 2006). Furthermore, there are several strong connections and similarities between the synthesis of branched-chain amino acids and the chain elongation reactions of Met leading to glucosinolate production (recently reviewed by Binder et al., 2007). For example, the isopropylmalate synthase (IPMS) genes involved in the elongation of Val to Leu are evolutionarily and biochemically related to the MAM genes involved in Met-chain elongation (Field et al., 2004; de Kraker et al., 2007). Similarly, the branched-chain aminotransferase 3 (BCAT3) gene has been shown to be directly involved in both BCAA and glucosinolate formation (Knill et al., 2008). Hence, an alteration in the specificity or function of any of a number of genes affecting precursor molecules could lead to the trade-off in Met- vs BCAA-derived glucosinolates controlled by the BCMA locus in Boechera.

A further possibility is that there are regulatory changes affecting glucosinolate and/or BCAA biosynthetic genes. Once again, recent studies have found that transcription factors affecting Met-glucosinolate formation often also regulate genes involved in BCAA formation. For example, gene expression analyses of mutants and transgenic lines for the Myb transcription factors (Myb28/HAG1, Myb29/HAG3 and Myb76/HAG2) have shown co-regulation of methionine chain elongation and glucosinolate biosynthetic genes (for example, MAM, Cyp79F, Cyp83A1, BCAT4) and BCAA biosynthetic genes (IPMS, BCAT3 and several additional genes involved in Leu production, such as AtLeuC1) (Hirai et al., 2007; Gigolashvili et al., 2007a, 2007b).

Comparative genomics and potential candidate genes

In our F2 mapping population, we delimited the BCMA QTL to a small genomic region with a maximal LOD score of 119.4 and a 1.0 LOD confidence interval just 1.9 cM wide; the corresponding region in A. thaliana contains 100 genes. We used eight tightly linked markers to dissect this interval. Analysis of chromosome rearrangements among four crucifer species found no evidence for a centromere within the QTL interval (Lysak et al., 2006; Schranz et al., 2006b), providing further support for normal recombination in our target region. Markers that are separated by 1 cM in B. stricta contain on average <30 genes in the corresponding region of A. thaliana, suggesting a gene density and recombination rate similar to A. thaliana.
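As an illustration of how a 1.0 LOD confidence interval of this kind can be read off a LOD profile, the following is a minimal sketch. It is not the software or procedure used in this study: the function name, the marker positions and the LOD values are hypothetical, and the interval is taken at marker positions without interpolation between markers.

```python
def lod_support_interval(positions_cm, lod_scores, drop=1.0):
    """Return (left, right) map positions bounding the contiguous region
    around the LOD peak in which LOD >= peak LOD - drop.

    Illustrative helper only; positions are in centimorgans (cM).
    """
    if len(positions_cm) != len(lod_scores) or len(lod_scores) == 0:
        raise ValueError("positions and LOD scores must be non-empty and equal in length")

    # Index of the maximal LOD score (the QTL peak).
    peak = max(range(len(lod_scores)), key=lambda i: lod_scores[i])
    threshold = lod_scores[peak] - drop

    # Walk outwards from the peak while the LOD stays above the threshold.
    left = peak
    while left > 0 and lod_scores[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lod_scores) - 1 and lod_scores[right + 1] >= threshold:
        right += 1

    return positions_cm[left], positions_cm[right]


# Hypothetical LOD profile (illustrative values, not data from this study).
positions = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
lods = [3.0, 9.5, 10.0, 9.2, 8.9, 5.0, 9.8]

left_cm, right_cm = lod_support_interval(positions, lods)
print(f"1.0 LOD interval: {left_cm}-{right_cm} cM")
```

Only the contiguous region around the peak is reported, so a second position that exceeds the threshold elsewhere on the map (3.0 cM in the toy data) is not included in the interval.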

Based on syntenic relationships between Boechera and Arabidopsis (Schranz et al., 2007b), the only predicted glucosinolate biosynthetic gene in the BCMA region is Cyp83A1 (Bak and Feyereisen, 2001; Naur et al., 2003). The Boechera homolog of this gene was used as a molecular marker, confirming its position within the QTL confidence interval. It is plausible that allelic differences in Cyp83A1 might lead to differences in substrate specificity, as Cyp83A1 shows high affinities for other non-Met-derived aldoximes in Arabidopsis (Mikkelsen and Halkier, 2003; Brader et al., 2006). The knockout phenotype of Cyp83A1 in Arabidopsis is a pronounced reduction of Met-derived glucosinolates and an increased production of indolic glucosinolates derived from tryptophan through the activity of the Cyp79Bx and Cyp83B1 genes (Hemm et al., 2003; Naur et al., 2003; Grubb and Abel, 2006). As we observe an increased production of BCAA-derived glucosinolates, and not increased Trp-derived glucosinolates, it seems unlikely that the BCMA phenotype is caused by a non-functional Cyp83A1 allele from the LTM parent. An alternative is that genes associated with substrate availability might influence our glucosinolate polymorphism. At least one BCAA biosynthetic gene (AtLeuC1; AT4G13430) is near the QTL peak. Alternatively, a gene such as a methionine-tRNA ligase (AT4G13780) could be involved in altering the available pools of Met, or selenium-binding proteins (AT4G14030) might regulate sulfur availability (Van Hoewyk et al., 2008). Ultimately, the identification of the biochemical mechanism of this insect resistance polymorphism is conditional on positional cloning of BCMA.

Future directions

Despite recent progress in identifying genes controlling complex trait variation (Koornneef et al., 2004; Alonso-Blanco et al., 2005), we have little understanding of the frequencies and effects of QTL alleles or the evolutionary processes that maintain this variation (Mitchell-Olds and Schmitt, 2006; Mitchell-Olds et al., 2007). Similar to the variation found within our mapping population, we have also found large differences in the ratio of methionine- to branched-chain amino acid-derived glucosinolates within and between populations of Boechera (Supplementary data). The future cloning of the BCMA locus will allow us to test evolutionary hypotheses, including the potential role of balancing selection and local adaptation.