Nucleotide diversity of vernalization and flowering-time-related genes in a germplasm collection of meadow fescue (Festuca pratensis Huds. syn. Lolium pratense (Huds.) Darbysh.)

In plant species, control of flowering time is an important factor for adaptation to local natural environments. The Vrn1, CO, FT1 and CK2α genes are key components in the flowering-specific signaling pathway of grass species. Meadow fescue is an agronomically important forage grass species, which is naturally distributed across Europe and Western Asia. In this study, meadow fescue flowering-time-related genes were resequenced to assess nucleotide diversity in European and Western Asian subpopulations. Identified sequence polymorphisms were then converted into PCR-based molecular genetic markers, and a meadow fescue germplasm collection was genotyped to investigate global allelic variation. Lower nucleotide diversities were observed for the Vrn1 and CO orthologs, while relatively higher values were observed for the FT1 and casein kinase II α-subunit (CK2α) orthologs. The nucleotide diversity for FT1 orthologs in the Western Asian subpopulation was significantly higher than that of the European subpopulation. Similarly, significant differences in nucleotide diversity for the remaining genes were observed between several combinations of subpopulations. The global allele distribution pattern was consistent with the observed levels of nucleotide diversity. These results suggested that the degree of purifying selection acting on the genes differs according to geographical location. As previously shown for model plant species, functional specificities of flowering-time-related genes may also vary according to environmental conditions.


Introduction
Control of flowering time is essential for adaptation of plant species to natural environments (Shimizu et al. 2011). Variation for flowering time is largely governed by genetic factors, and genes involved in this trait have been identified from model plant species through a combination of molecular biological and genetic approaches, such as mutagenesis, transgenic modification and gene expression profiling (Andres and Coupland 2012). Among turf and forage grass species of the Pooideae tribe Poeae, flowering-time-related traits have been extensively studied in perennial ryegrass (Lolium perenne L.). Orthologs for the Vrn1, CO, FT1 and CK2α genes have been cloned and characterized from this species (Jensen et al. 2005; Shinozuka et al. 2005; Armstead et al. 2008; Studer et al. 2012). Colocation on genetic linkage maps between the gene ortholoci and flowering time QTLs has been demonstrated, suggesting conservation of key genes in the flowering-specific pathway. DNA sequence polymorphisms in the perennial ryegrass Vrn1 and FT1 gene orthologs were determined in order to identify "diagnostic" polymorphisms, diversity for which was demonstrated to be significantly correlated with flowering time variation (Asp et al. 2011; Skøt et al. 2011). Due to a significant correlation between flowering time-related characters and vegetative biomass-related traits, knowledge of the flowering time-related gene alleles may be exploited in order to improve the biomass yield of turf and forage grass cultivars (Yamada et al. 2004).
Meadow fescue (Festuca pratensis Huds. syn. Lolium pratense (Huds.) Darbysh.) is an outbreeding diploid (2n = 2x = 14) member of the Poeae tribe, which is widely sown as a pasture crop in northern temperate regions (Ergon et al. 2006). This species is closely related to perennial ryegrass (Hand et al. 2010, 2012a), such that construction of a genetic linkage map containing functionally associated markers demonstrated genome-wide collinearity between the two taxa (Alm et al. 2003). Genetic factors controlling agronomically important traits, such as drought and frost tolerance, flowering time and seed production-related characters, have been investigated through the use of biparental mapping populations (Ergon et al. 2006; Alm et al. 2011). A QTL identification study suggested the presence of multiple loci related to vernalization requirement on chromosome 4F, to which the Vrn1 ortholog was assigned (Ergon et al. 2006).
Both meadow fescue and perennial ryegrass are widely sown in Europe and Western Asia, and the geographical distribution structures of organelle genome-specific haplotypes of the two species have been investigated (Balfourier et al. 2000; Fjellheim et al. 2006). While no clear association between the distribution structure of meadow fescue chloroplast DNA haplotypes and human migration routes was observed, such a relationship was inferred for perennial ryegrass (Balfourier et al. 2000). In contrast, the previous phylogeographic studies suggested that the distribution structure of meadow fescue genotypes shows a weaker relationship with agricultural practices in Europe and Western Asia and that this species is more naturally distributed than perennial ryegrass (Fjellheim et al. 2006). Recently, worldwide germplasm collections of both meadow fescue and tall fescue (Festuca arundinacea Schreb. syn. Lolium arundinaceum) curated by the United States Department of Agriculture-Agricultural Research Service (USDA-ARS) were genotyped through resequencing of the chloroplast genome-located matK gene and the nuclear ribosomal DNA internal transcribed spacer (nrDNA ITS), permitting precise species-specific identification within the complex germplasm collections (Hand et al. 2012b). This study identified 189 distinct meadow fescue genotypes originating from 27 countries within the two collections (Hand 2012c), providing a geographically diverse core resource for studies of relationships between genetic and ecoclimatic variation.
In this study, nucleotide diversity of meadow fescue flowering-time-related genes has been assessed. Regions of the meadow fescue Vrn1, CO, FT1 and CK2α orthologs (FpVrn1, FpCO, FpFT1 and FpCK2α) were resequenced, and nucleotide diversity within three sample groups was evaluated. Population-specific single nucleotide polymorphisms (SNPs) were subsequently converted into cleaved amplified polymorphic sequence (CAPS) markers, and the meadow fescue germplasm collection was extensively genotyped in order to investigate global allele distribution. As flowering-time-related genes are key components of natural environment adaptation, it was hypothesized that such genes have been subjected to natural selection pressures due to ecoclimatic factors and that the degree of selection strength may vary according to geographical location. The results of this analysis are presented and discussed.

Plant materials and DNA extraction
Meadow fescue seed was obtained from the USDA-ARS collection (Hand et al. 2012b). Single seeds from each accession were germinated and grown under controlled conditions. Meadow fescue genotypes were selected and classified into six groups on the basis of geographical origin (Hand et al. 2012b): Northern Europe (NE), Russia and former Soviet Union (R), Central Europe (CE), Southern Europe (SE), Western Asia (WA) and the rest of the world (RW). A subset of 36 genotypes was then selected to represent each group, with an emphasis on origin from Western Eurasia (Table 1). The NE group and a combination of the Central and Southern European groups (CSE) each contributed 11 individuals. Members of the NE group were collected from three countries located between latitudes 55 and 65°N, while the CSE combined group contained members of accessions from four countries located between latitudes 40 and 50°N. The WA group consisted of 10 individuals that originated from four countries located between latitudes 30 and 40°N. A single individual from Russia (56.9°N) represented the R group, while a single genotype from Japan (43.1°N) and two individuals from Australia represented the RW group. DNA was extracted from a single young leaf of each plant using the DNeasy 96 Plant Kit (QIAGEN, Hilden, Germany). Simple sequence repeat (SSR) genotyping data from the previous study were used for assessment of genetic diversity of the selected genotypes (Hand et al. 2012b). The DARwin5 program was used for the construction of a genetic dissimilarity matrix and radial unweighted neighbor-joining tree (Perrier et al. 2003).

Primer design and PCR amplification
PCR primers were designed using the Sequencher™ software version 4.7 for Windows (Genecodes, Ann Arbor, MI) and the Oligo Calc program (Kibbe 2007). Primers for the FpCO and FpFT1 genes were designed on the basis of the previously sequenced meadow fescue CO and FT1 orthologs, respectively (NCBI GenBank ID: AJ833018 and FN993915; Armstead et al. 2005; Skøt et al. 2011). As sequence information for the FpVrn1 and FpCK2α genes was not directly available, DNA sequences from related species were obtained (JN969602, JN969603, FJ793194, AB213316, AK354232 and XM_003559017) and aligned using the Sequencher™ software. Primers capable of cross-specific amplification were then designed within the conserved regions. PCR amplification was performed using Immolase™ DNA polymerase (BIOLINE, London, UK), following the manufacturer's instructions. The PCR amplicons were visualized on a 2.0% (w/v) agarose gel containing 0.5× SYBR® Safe DNA gel stain (Life Technologies, Carlsbad, CA).

Sanger PCR-direct sequencing
The PCR products were treated with exonuclease I (0.5 U) and shrimp alkaline phosphatase (0.5 U) at 37°C for 40 min, and the enzymes were then heat deactivated at 85°C for 20 min. Direct Sanger sequencing analysis was performed using BigDye™ terminator chemistry (Life Technologies), following the manufacturer's instructions, and the fluorochrome-labelled fragments were size-separated on the ABI 3730xl Prism sequencer (Life Technologies).

DNA sequence data analysis
Sequence reads were aligned and manually corrected using Sequencher (Genecodes), and nucleotide polymorphisms were identified. An unphased FASTA file of the Sanger sequence reads was prepared and imported into the DnaSP v5 program (Librado and Rozas 2009). Haplotype reconstruction was performed with the PHASE algorithm (Stephens et al. 2001; Stephens and Donnelly 2003), and nucleotide diversity (π) was subsequently calculated. Significantly different combinations of nucleotide diversity were identified with the Tukey-Kramer method. A studentized range (q) value of 4.46 was used for the Tukey-Kramer method at the significance level of α = 0.01.
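As an illustration of the quantity estimated here, nucleotide diversity (π) can be understood as the average number of pairwise differences per site among phased haplotypes. The following is a minimal sketch only and is not the DnaSP implementation: the function name and the toy haplotypes are hypothetical, the alignment is assumed to be gap-free, and no correction for missing data is applied.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise differences per site (pi) for aligned, phased haplotypes.

    Assumes equal-length, gap-free sequences (an illustrative simplification).
    """
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")

    pairs = list(combinations(haplotypes, 2))
    # Count mismatching positions for every pair of haplotypes.
    total_diffs = sum(
        sum(1 for a, b in zip(h1, h2) if a != b) for h1, h2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy data (not from the study): four haplotypes, 10 sites.
toy_haplotypes = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGAACGTAT"]
pi = nucleotide_diversity(toy_haplotypes)
print(round(pi, 5))
```

For the toy data, the six haplotype pairs differ at 7 positions in total, so π = 7 / (6 × 10).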

CAPS assay
PCR amplicons that included polymorphic nucleotide sites were generated with locus-specific primers. The PCR products (5 µL) were digested with restriction enzyme (0.5 U), following the manufacturer's instructions. The DNA fragments were visualized on a 2.0% (w/v) agarose gel with 0.5× SYBR® Safe DNA gel stain.
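The principle behind a CAPS marker is that one SNP allele creates or destroys a restriction site, so that digestion yields different fragment patterns. The sketch below shows one way such candidate sites could be screened in silico; it is an assumption-laden illustration and not the procedure used in this study. The recognition sequences are those of the four enzymes used for the markers reported here (HaeIII, HpaII, MluCI and MslI); the function names and the toy allele sequences are hypothetical.

```python
import re

# IUPAC codes needed for the recognition sequences below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}

# Recognition sequences; all four are palindromic, so scanning one strand suffices.
ENZYMES = {
    "HaeIII": "GGCC",
    "HpaII": "CCGG",
    "MluCI": "AATT",
    "MslI": "CAYNNNNRTG",
}


def count_sites(seq, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    pattern = "(?=" + "".join(IUPAC[base] for base in site) + ")"
    return len(re.findall(pattern, seq.upper()))


def caps_candidates(allele_a_seq, allele_b_seq):
    """Return enzymes whose site count differs between two allele sequences."""
    hits = {}
    for name, site in ENZYMES.items():
        n_a = count_sites(allele_a_seq, site)
        n_b = count_sites(allele_b_seq, site)
        if n_a != n_b:
            hits[name] = (n_a, n_b)
    return hits


# Hypothetical toy alleles differing by a single C/T SNP (not study data).
allele_a = "TTAGGCCATTCA"
allele_b = "TTAGGTCATTCA"
print(caps_candidates(allele_a, allele_b))
```

In the toy example the C allele carries one HaeIII site (GGCC) that the T allele lacks, so only HaeIII is reported as a candidate.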

SSR-based genetic diversity within the sample set
A neighbor-joining (NJ) tree was constructed for the 36 individuals that represent broad geographical origins, based on genotyping results from use of 20 genomic DNA-derived SSR markers (Fig. 1). The NJ analysis excluded two individuals from each of the CSE (595025 and 595031) and WA (229500 and 383647) groups from the dendrogram, but all individuals from Russia, Japan and Australia were included.
Table footnote: Genotypes from Russia, Japan and Australia were assembled into the NE, CSE and WA groups, respectively, according to latitude of origin. n.d., data not available.

PCR amplification and sequencing of flowering time-related genes
The FpVrn1, FpCO, FpFT1 and FpCK2α gene-related sequences were amplified using PCR with a total of eight primer pairs (two pairs of primers for each of the four genes; Table 2; Fig. 2). Amplicon sizes varied from c. 200 to 600 bp depending on primer combinations. The amplicons were sequenced, and their identity was confirmed on the basis of high sequence similarity to the previously sequenced Vrn1, CO, FT1 and CK2α genes from meadow fescue and related species (Table 2). Putative exons, introns, 3′-untranslated regions (UTRs) and 5′-UTRs were included in the sequenced amplicons.

DNA polymorphism identification and nucleotide diversity assessment
The Sanger sequence reads were aligned for SNP identification. A total of 21 and 18 SNP sites were identified in the FpFT1- and FpCK2α-related sequences, respectively (Table 3). All SNPs found in regions 1 and 2 of FpFT1 and region 2 of FpCK2α were located in either 5′-UTRs or introns. In the FpCK2α-region 1, 10 exonic SNPs were identified, of which 7 were putatively nonsynonymous. Relatively few polymorphic sites were identified in the FpVrn1- and FpCO-related sequences (Table 3). A single nonsynonymous SNP was found in the FpVrn1-related sequences, and the other SNPs were located in either the 5′-UTR or an intron. One synonymous SNP and three nonsynonymous SNPs were identified in the FpCO-related sequences. No SNP-containing sites were identified in either FpCO-related sequences of the NE group or region 1 of FpVrn1-related sequences from the CSE combined group. The highest SNP prevalence (10 across 147 bp) was found in region 1 of the FpCK2α-related sequence. Additional SNPs were identified when sequencing data from the Russia-, Japan- and Australia-derived individuals were combined, corresponding to one and two SNPs in the FpFT1- and FpCO-related sequences, respectively (Table 3). No additional SNPs were, however, identified in the FpVrn1- and FpCK2α-related sequences.
Nucleotide diversity was calculated for each gene sequence and each geographical subpopulation. Diversity values were then compared between subpopulations to investigate the association between flowering-time diversity and geographical origin. The highest level of nucleotide diversity was observed in the FpCK2α-region 1 sequence, more than 10 times higher than that of the remainder of this gene (Table 3). However, no significant differences between groups were found in the sequence of this region. The nucleotide diversity in five sequenced regions for the WA group was significantly higher than that of the NE and CSE groups. The exception was in FpVrn1-region 2, for which the NE group significantly exceeded the WA group. Levels of nucleotide diversity were largely unchanged when data from genotypes originating from more distant localities (Australia, Japan and Russia) were included.
Table 2. PCR primer sequences for amplification from the flowering-time-related genes and similarity to the corresponding DNA sequence.

Allele variation in flowering-specific genes
In order to screen a larger number of accessions in a high-throughput fashion, sequence polymorphisms were converted into CAPS assays. A total of five polymorphic sites capable of conversion into CAPS markers were identified across FpVrn1-region 2, FpCO-region 2 and FpFT1-region 2 (Table 4), and the resulting genotypic assays were performed across all 152 meadow fescue individuals (Fig. 3; Tables S1 and S2). The minor (lower frequency) alleles of the FpVrn1/HaeIII and FpFT1/HaeIII markers, which were identified only in the NE group during resequencing, were found in the Russian and each of the European groups. The minor alleles of the FpCO/HpaII and FpFT1/MslI markers were found in the CSE and WA groups used in the resequencing assay. Although one individual with the minor FpCO/HpaII allele was found in the CE group, no genotypes from the NE and R groups contained these alleles. The minor FpFT1/MslI allele was present in additional individuals of the SE and WA groups, but not in the R and other European groups. The minor allele of the FpVrn1/MluCI marker was found in all groups, and the frequencies were relatively high in the NE and CE groups as compared to the SE and WA groups.
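The group-wise minor allele frequencies summarized for the CAPS markers can be derived from per-individual genotype calls by simple allele counting. The following is a minimal sketch under stated assumptions: genotypes are coded as AA, AB or BB for a diploid individual (A for the major and B for the minor allele, following the coding used in the Supporting Information tables), any other value is treated as a missing call, and the function name and toy calls are hypothetical.

```python
from collections import defaultdict


def minor_allele_frequency(calls):
    """Frequency of the minor (B) allele per group.

    calls: iterable of (group, genotype) with genotype in {"AA", "AB", "BB"};
    any other genotype string is skipped as a missing or failed assay.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [B alleles, total alleles]
    for group, genotype in calls:
        if genotype not in ("AA", "AB", "BB"):
            continue
        counts[group][0] += genotype.count("B")
        counts[group][1] += 2  # diploid: two alleles per individual
    return {group: b / n for group, (b, n) in counts.items()}


# Hypothetical toy calls for one marker (not data from the study).
toy_calls = [
    ("NE", "AA"), ("NE", "AB"), ("NE", "BB"),
    ("WA", "AA"), ("WA", "AA"), ("WA", "n.d."),
]
freqs = minor_allele_frequency(toy_calls)
print(freqs)
```

Counting alleles rather than individuals keeps heterozygotes from being over- or under-weighted, which matters for an outbreeding species in which heterozygous genotypes are common.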

Linkage disequilibrium in meadow fescue
Natural selection exerts effects not only on mutations that contribute to trait-specific variation, but also on noncausal nucleotide changes that are physically linked to the causal mutation site (Futuyma 1998; Frankham 2012). The extent of the sequence region affected by this "indirect" selection is determined by the decay of linkage disequilibrium (LD), and adjacent nucleotides within an area of high LD are expected to be subjected to a similar selective pressure. For grasses of the Poeae tribe, the extent of LD has been directly studied in perennial ryegrass. Due to an obligate outbreeding reproductive habit, LD in perennial ryegrass exhibits rapid decay, typically over distances of 500-2000 bp, comparable to the size of a single gene locus (Ponting et al. 2007; Fiil et al. 2011). In the absence of direct empirical data, it seems likely that LD decay would occur over equivalent physical distances in meadow fescue, due to similarities of outbreeding reproductive habit and genome size (Smarda et al. 2008). For this reason, identification and characterization of gene-specific sequence polymorphisms provide a possible means for tracing causal variation in response to ecoclimatic factors for species such as meadow fescue.
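LD was not estimated directly in this study, but the concept can be made concrete with the commonly used r² statistic between two biallelic sites. The sketch below is illustrative only: it assumes phased haplotypes and strictly biallelic sites, and the function name and toy haplotypes are hypothetical.

```python
def r_squared(haplotypes, i, j):
    """r^2 between biallelic sites i and j across phased haplotypes.

    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    The allele carried by the first haplotype is used as the reference allele.
    """
    n = len(haplotypes)
    ref_i = haplotypes[0][i]
    ref_j = haplotypes[0][j]
    p_a = sum(h[i] == ref_i for h in haplotypes) / n
    p_b = sum(h[j] == ref_j for h in haplotypes) / n
    p_ab = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Hypothetical two-site haplotypes (not study data).
complete_ld = ["AC", "AC", "GT", "GT"]   # alleles always co-occur
no_ld = ["AC", "AT", "GC", "GT"]         # all four combinations equally common
print(r_squared(complete_ld, 0, 1), r_squared(no_ld, 0, 1))
```

Plotting r² for all SNP pairs against their physical distance is the usual way to visualize LD decay over distances such as those quoted above for perennial ryegrass.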
In the present study, nucleotide diversities between the two regions of each of the FpVrn1, FpCO and FpFT1 genes were not substantially different, consistent with a common selection history. In contrast, a large difference in nucleotide diversity was demonstrated between the two regions of the FpCK2α-1 gene, suggesting that different selective pressures may have arisen between the regions. A more extensive resequencing activity over a larger sample of loci may be required to determine the average extent of LD within meadow fescue genes, and whether the flowering-time-related genes in the present study provide a representative group.

Response of flowering time genes to ecoclimatic variation
Recent studies have demonstrated that the gene products involved in the flowering-specific signaling pathway may alter in terms of functionality depending on environmental conditions, especially latitudinal variations of seasonal day length (Tsuji et al. 2011). Previous trait-dissection studies in perennial ryegrass suggested that functionality of the Vrn1 ortholog is predominant at middle latitudes (35.5°N), while that of the FT1 (Hd3a) ortholog is more important at high latitudes (52.4°N; Yamada et al. 2004; Armstead et al. 2008; Shinozuka et al. 2012). In the present study, nucleotide diversity of meadow fescue flowering-time-related genes was determined and interpreted, based on the following hypotheses. Firstly, flowering-time-related genes are key determinants of adaptation to natural environments, through influence on the probability and timing of reproduction (Shimizu et al. 2011); and secondly, the functionality of meadow fescue flowering-time-related genes differs depending on latitudinal variation. Based on these assumptions, flowering-time-related genes may be subjected to selective sweeps leading to homogenization of genetic variants, due to a requirement for synchronization of flowering time, and differences in the degree of selective pressure may hence be detected as variations of nucleotide diversity. In this study, nucleotide diversity among Europe- and West Asia-derived genotypes was investigated. As meadow fescue exhibits a relatively natural distribution within these areas (Fjellheim et al. 2006), correlations between gene-specific diversity and ecological parameters may be anticipated for the relevant genotypes. Understanding of such correlations may support the identification of favorable alleles at target genes for use in breeding activities. For example, desirable flowering-time-related gene alleles may be effectively sourced from populations with high levels of nucleotide diversity, and such alleles can be used to increase yield of vegetative biomass.

SSR-based genetic diversity within meadow fescue germplasm
Genomic DNA-derived SSR genotyping data revealed the extent of genetic distance between the germplasm sources used for the resequencing activity. The majority of these markers are derived from noncoding regions, which are expected to be affected to a lesser extent than genic regions by selective pressures, hence permitting a less-biased assessment of affinity. Although some individuals from nearby locations were clustered together, individuals from each group were widely distributed across the phenetic tree. These data suggest that the candidates for genotypic analysis would be relatively unbiased in terms of background genetic structure, permitting assessment of gene-specific divergence.

Nucleotide diversity of flowering-time-related genes
Limited nucleotide diversity in the FpVrn1 and FpCO gene-related sequences was identified across the three groups, suggesting that strong selective pressures may have been exerted on these loci. Nonetheless, some significant differences across geographical combinations were identified. As the number of SNPs identified was small, further analysis is essential in order to establish a convincing correlation with geographical provenance. In contrast, relatively high SNP numbers were found in the FpFT1 and FpCK2α-1 gene-related regions. The significant difference in nucleotide diversity observed in FpFT1-related sequences suggested that different selective pressures may have arisen between lower (30-40°N) and higher (40-65°N) latitudes. As QTL detection studies in perennial ryegrass have shown, the importance of the FT1 ortholog may be greater in higher latitude regions. Similarly, in FpCK2α-1 gene region 2, the nucleotide diversity of the WA group was significantly higher than those of the other groups. The nucleotide diversity in region 1 of the same gene was not, however, significantly different at the level α = 0.01, possibly due to rapid decay of LD across this gene unit. A previous study (Fiil et al. 2011) described a higher degree of nucleotide diversity in perennial ryegrass flowering-time-related genes, including the Vrn1 (π = 0.01992) and CO (π = 0.00119) orthologs, than in the present study. This difference may, however, be due to the influence of differing population structures, and different target regions of the orthologous genes selected for resequencing.
Figure 3. Geographical distribution of regions contributing to each group, and graphical representation of cleaved amplified polymorphic sequence (CAPS) allele frequency within each group. Countries classified into the Northern Europe (NE), Russia and former Soviet Union (R), Central Europe (CE), Southern Europe (SE) and Western Asia (WA) groups are shown in dark blue, green, light blue, yellow and orange, respectively. The CAPS allele frequencies are shown in pie-chart format, gray and red colors representing frequencies of the major and minor alleles, respectively. The number in each pie-chart shows the frequency of the minor allele.
Due to the outbreeding reproductive habit, background genetic diversity of meadow fescue is relatively high within populations and hence relatively low between populations (Fjellheim and Rognli 2005). The genomic DNA-derived SSR genotyping data revealed a large level of genetic diversity within populations, supporting this scenario. Conversely, the nucleotide diversities of flowering-time-related genes were very limited, and significant differences were observed between populations. Such a difference may be due to distinct mechanisms of allelic diversification between gene-associated and noncoding DNA sequences. Further studies are required to explore this issue.

Global allele distribution of flowering-time genes
The CAPS marker assay permitted simple and cost-effective genotyping for the larger sample set. Although a relatively small number of samples were used in the resequencing analysis, the allele frequencies were largely similar to those obtained from the expanded set, suggesting that alleles at low frequency (c. 5%) may be effectively identified from a relatively small number of genotypes. In the resequencing assay, the minor alleles at the FpVrn1/HaeIII, FpCO/HpaII, FpFT1/HaeIII and FpFT1/MslI sites exhibited group specificity, and a similar distribution pattern was also observed for the CAPS genotyping assay (Table 4; Tables S1 and S2; Fig. 3). The relative frequencies by group of the minor alleles for FpVrn1- and FpCO-specific CAPS markers were consistent with observed levels of nucleotide diversity. A relatively high frequency of the FpFT1/MslI minor allele was found in the Western Asia-derived genotypes, for which the highest nucleotide diversity of the FpFT1 gene was observed. The frequency of the FpFT1/HaeIII minor allele was, however, higher in the NE and R groups. The distribution pattern of the minor alleles may hence represent different and distinct evolutionary histories of the FpFT1 gene between the Europe- and Western Asia-derived individuals. All minor alleles were identified in the SE group. In particular, higher frequencies of the minor alleles were found in genotypes from the former Yugoslavia (n = 22) and in Italy (n = 5; Table S2), suggesting that genetic diversity of the flowering-time-related genes may be enhanced within these localities.
The RW group consisted of genotypes derived from Australian, Canadian and Japanese origins. This study suggested that the genetic background of these genotypes was not substantially different from that of Europe and West Asia-origin genotypes, despite large geographical separation. This is probably because meadow fescue has been introduced into these nations from Europe within the relatively recent past (Casler and van Santen 2001;Hand 2012c).

Conclusions and future directions
In this report, the first study of nucleotide diversity within meadow fescue flowering-time-related genes has been provided. Through resequencing and CAPS genotyping activities, the following insights were obtained: (1) nucleotide diversity in the flowering-time-related genes may be limited, especially in the FpVrn1 and FpCO genes; (2) nucleotide diversity in the genes may be more extensive in South European populations; (3) LD in meadow fescue may decay rapidly, similar to the situation in other outbreeding plant species; and (4) the degree of nucleotide diversity in the flowering-time-related genes may differ depending on latitudinal position.
Although some data in support of the original hypotheses were obtained, several problems remain. Firstly, the nature of the environmental variation that affects nucleotide diversity is unclear. In outbreeding plant species, synchronization of flowering time is essential for mating success (Shimizu et al. 2011). In the previously described perennial ryegrass-based trait-dissection study, the maximum LOD scores for both seed set and heading date QTLs were observed to co-locate with the FT1 ortholog-derived marker (Armstead et al. 2008), suggesting a close relationship between the two processes. However, flowering-time-related genes have also been suggested to contribute to low temperature stress tolerance, as vernalization requirement genes are believed to maintain high expression of cold tolerance-related genes (Prasil et al. 2004). In meadow fescue, such an association between the diversity of the FpVrn1 gene and variation for frost tolerance has been reported (Alm et al. 2011). It is therefore possible that low temperature conditions, which also vary with latitude, rather than differences of seasonal day length were responsible for the nucleotide diversity differences observed in this study.
Secondly, although the relationship between flowering time and mating success has been considered at the intrapopulation level (Armstead et al. 2008; Shimizu et al. 2011), the relationship at the interpopulation level is still unclear. The meadow fescue genotypes analyzed in this study were collected from widespread areas, and so the data obtained in the current study represent estimates of nucleotide diversity between, rather than within, natural populations. The effects of flowering time on mating success at the interpopulation level require further analysis, as do the evolutionary histories of the specific meadow fescue populations. In addition, although the present study focused on latitude, altitude is another major ecological factor which contributes to variation in flowering time (Orlandi et al. 2005). The relationship between altitude and nucleotide diversity still remains to be clarified.
Thirdly, as a large proportion of the polymorphisms identified in the present study were located in UTR and intron regions, it is possible that the polymorphic sites have been affected by indirect rather than direct selection pressures and that decay of LD over distances comparable to the length of a gene could decouple the causal and noncausal variants and lead to loss of diagnostic information. Although closely linked SNPs can be utilized for the purpose of crop breeding, identification of polymorphisms directly related to traits is essential for definitive understanding of plant biology and evolution. Further analysis is required for identification of such causal sequence polymorphisms.
Finally, geographical origin has been used in this study as a proxy for inferred phenotype. More detailed assessment of representative genotypes for heading date, vernalization requirement and cold tolerance would be required to test the various ecological hypotheses that have been proposed. Nonetheless, the comprehensive survey of gene-specific sequence variation presented here provides a solid foundation for such a study.

Supporting Information
Additional Supporting Information may be found in the online version of this article: Table S1. Summary information for cleaved amplified polymorphic sequence (CAPS) genotyping of the meadow fescue genotypes. A and B stand for major and minor alleles identified in resequencing analysis, respectively. Table S2. Allele frequency of the cleaved amplified polymorphic sequence (CAPS) marker in populations. A and B stand for major and minor alleles identified in the resequencing analysis, respectively.