Assessment of genetic diversity in Nordic timothy (Phleum pratense L.)

Timothy (Phleum pratense L.), a cool-season hexaploid perennial, is the most important forage grass species in Nordic countries. Earlier analyses of genetic diversity in a collection of 96 genebank accessions of timothy with SSR markers demonstrated high levels of diversity but could not resolve population structure. Therefore, we examined a subset of 51 accessions with REMAP markers, which are based on retrotransposons, and compared the diversity results with those obtained with SSR markers. Using four primer combinations, 533 REMAP markers were analyzed, compared with the 464 polymorphic alleles previously detected at 13 SSR loci. The average marker index, which describes the information obtained per experiment (per primer combination or locus), was over six times higher with REMAPs. Most of the variation was found within accessions, with a somewhat smaller proportion for REMAPs (89 %) than for SSRs (93 %). SSRs revealed differences in the level of diversity slightly better than REMAPs, but neither marker type revealed any clear clustering of accessions based on countries, vegetation zones, or cultivar types. In our study, reliable evaluation of SSR allele dosages was not possible, so each allele had to be handled as a dominant marker. SSR and REMAP markers, which report on different mechanisms of generating genetic diversity and on different genomic regions, together indicate a lack of population structure. This likely reflects the outcrossing and hexaploid nature of timothy rather than failures of either marker system.

Keywords: Genetic diversity, Genetic structure, Phleum pratense L., REMAP, Retrotransposon marker, SSR, Microsatellite, Timothy

Background
Timothy (Phleum pratense L.), a cool-season perennial, is the most important forage grass species in Nordic countries. Genetic diversity has been previously assessed [1] in a collection of 96 timothy accessions, of which 88 were of Nordic origin. Simple sequence repeat (SSR) markers revealed Nordic timothy accessions to be very polymorphic, having significant differences in the levels of diversity between countries, vegetation zones, and different cultivar types. However, most of the variation (94 %) existed within accessions, and no clear clustering of accessions based on any grouping was observed. This lack of resolution may either reflect the outcrossing and hexaploid nature of timothy or that SSR markers are not suitable for resolving population structure in timothy.
A wide range of DNA markers is available for diversity studies, each with its own advantages and disadvantages. SSRs are amplified from single loci, but are multiallelic and highly polymorphic. Although they are inherited codominantly, separation of different genotypes may not be possible in a polyploid species such as timothy. Therefore, each allele has to be treated as a dominant marker [1]; consequently, the markers amplified from the same SSR locus are not independent of each other, and information is lost. In the REMAP (retrotransposon-microsatellite amplified polymorphism) marker assay [2,3], diversity is generated by the integration of retrotransposons, which move in the genome by a copy-and-paste mechanism but are fixed in position upon insertion [4]. Retrotransposons are ubiquitous and abundant in plant genomes, where they are dispersed on all chromosomes [5]. REMAP markers are amplified using one primer designed to a conserved retrotransposon region and another anchored to a simple sequence repeat. Products from multiple loci are produced in one PCR reaction, each with only two allele alternatives, a dominant one (amplification) and a recessive one (non-amplification). Because the mechanisms that activate retrotransposons [6] and thereby generate insertional polymorphisms are entirely different from the one generating SSR allelic variation (polymerase slippage) [7], the two marker systems assay different components of genetic diversity. For potato [8], alfalfa [9], and grapevine [10], retrotransposon and SSR markers in combination were shown to be highly discriminatory and effective.
In the previous study [1], a collection of 96 timothy accessions was analyzed using 13 SSRs, thus describing diversity at only 13 loci. On the other hand, these 13 SSR loci harbored as many as 499 alleles. In the present study, we used REMAP markers to study diversity in a subset of 51 accessions and compared the results with those obtained with SSR markers. We wanted to determine whether another type of marker, which reports from many more locations in the genome and assesses different genomic regions where diversity is generated by a different mechanism, could describe diversity more efficiently and also reveal population structure, particularly for a polyploid species. In particular, the autonomous nature of retrotransposon diversity generation and display, which is independent of the syntenic organization of polyploids, appeared suited to polyploid species such as timothy. We expected that the retrotransposon markers would thereby be more likely to detect genetic structure in timothy, should it exist.

Plant material
In the previous study [1], SSR markers were analyzed in a collection of 96 timothy accessions. Fifty-one of these were selected for the present study to be screened with REMAP markers as well (Table 1). Fifteen to twenty randomly selected individuals per accession were investigated, 945 individuals in total. The number of individuals analyzed from each accession was not exactly the same in the two studies because 20 individuals had to be omitted owing to poor amplification in the REMAP analysis.
The 51 accessions were mostly wild (30, locations in Fig. 1 in [1]); seven each were classified as landraces, cultivars, and of unknown cultivar types. Accessions were derived from all Scandinavian countries (Denmark, 8; Finland, 10; Iceland, 2; Norway, 10; Sweden, 13). In addition, eight gene bank accessions (so-called exotics) originating from non-Scandinavian countries were included in the study.

Marker analyses
DNA was extracted using the method of Tinker et al. [11] with some modifications as described in Tanhuanpää and Manninen [1]. Using the iPBS (inter-primer binding site) method, retrotransposon segments were isolated from the timothy genome and sequenced, and long terminal repeats (LTRs) were identified [12]. LTR primers were designed to match conserved motifs at or near their termini, according to the methods of Kalendar et al. [13]. For REMAP marker amplification, four different retrotransposon primers (TIM1-4) for grasses were used. These were combined with 19 microsatellite-based primers (ISSR + number) that contain repeat units composed of two or three bases; the 3′ ends of the primers were anchored by a single nucleotide. Because analyzing markers by gel electrophoresis is very laborious, the retrotransposon primers were labelled with a fluorescent dye, FAM (5-carboxyfluorescein), HEX (hexachloro-6-carboxyfluorescein), or TET (6-carboxytetrachlorofluorescein), to enable resolution and visualization of amplification products with a MegaBACE™ 500 Sequencer (GE Healthcare, Buckinghamshire, UK).
Fifty-nine REMAP primer combinations were first tested in a small set of individuals for their functionality and efficiency in producing polymorphic bands. The four best primer combinations were chosen for the final analyses (TIM1 with ISSR1, ISSR15, and ISSR20, and TIM2 with ISSR5). These primers, together with their sequences and properties, are shown in Table 2. The REMAP markers were amplified in a reaction volume of 10 μl, using 0.25 U of FIREPol® DNA polymerase I (Solis BioDyne OÜ, Tartu, Estonia), buffer B with 2.5 mM MgCl2 as supplied by the enzyme manufacturer, 200 μmol/L each dNTP, 10 ng of DNA, and 500 nmol/L each primer. The PCR program was run on a PTC-220 DNA Engine Dyad™ Peltier Thermal Cycler (MJ Research, Waltham, MA, USA) and consisted of an initial denaturation step of 2 min at 94 °C; 32 cycles of 30 s at 94 °C, 30 s at 60 °C, and 2 min at 72 °C; and a final extension step of 10 min at 72 °C. After PCR, the amplified products with different labels were combined for MegaBACE runs. SSRs were developed for timothy [14], and analyses were run as described previously [1].

Data analyses
Each REMAP fragment represents a separate locus, and the presence or absence of the fragment was scored in binary code (1/0). Likewise, each SSR allele was treated as a separate locus and scored in binary code, even though SSRs are codominant markers, because we found the evaluation of allele dosages very unreliable in hexaploid timothy. Diversity indices for markers, including polymorphic information content (PIC), gene diversity, and major allele frequency, were calculated with the program PowerMarker v3.0 [15]. A marker index (MI) for each REMAP primer combination and each SSR locus was determined by multiplying the number of polymorphic markers generated (the effective multiplex ratio, EMR) by the average PIC value [16]. The MI illustrates the amount of information obtained per experiment (per primer combination or locus). Genetic diversity in each accession was described with five diversity indices: 1) the number of all markers observed (A_A), corrected to a sample size of n = 15 with 1000 resamplings without replacement; 2) the mean number of all markers observed in each individual (A_I); 3) the mean number of pairwise differences (PWD; Euclidean distances) between individuals, calculated with the program ARLEQUIN version 2.000 [17]; 4) Shannon's diversity index I [18]; and 5) the percentage of polymorphic loci. The last two were calculated using the program GenAlex 6.4 [19,20]. Correlations between diversity indices based on REMAP and SSR markers, and differences in the level of diversity between different groups (countries, vegetation zones, and cultivar types), were also examined. GenAlex [19,20] was used to perform analysis of molecular variance (AMOVA) [22], which partitions total genetic variation into within- and among-accession variance components. The significance of the results was tested by permuting the data 999 times.
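The sample-size correction applied to the number of markers per accession can be illustrated with a short sketch. It assumes that the correction is the mean marker count over random subsamples of n individuals drawn without replacement; the function name, the input layout (one list of 0/1 scores per individual), and the fixed random seed are illustrative assumptions and do not reproduce the software actually used.

```python
import random


def rarefied_marker_count(scores, n=15, resamplings=1000, seed=0):
    """Mean number of markers observed in subsamples of n individuals.

    scores: one list of 0/1 marker scores per individual of an accession.
    Each resampling draws n individuals without replacement and counts
    the markers present in at least one of them.
    """
    if len(scores) < n:
        raise ValueError("accession has fewer individuals than n")
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    total = 0
    for _ in range(resamplings):
        subsample = rng.sample(scores, n)
        # zip(*subsample) iterates over markers (columns)
        total += sum(1 for marker in zip(*subsample) if any(marker))
    return total / resamplings


if __name__ == "__main__":
    # Toy accession: 20 individuals, 3 markers; marker 2 is rare, marker 3 absent.
    accession = [[1, 0, 0] for _ in range(19)] + [[1, 1, 0]]
    print(rarefied_marker_count(accession))
```

With exactly n individuals the corrected value equals the observed marker count; with more individuals, rare markers contribute only in proportion to their chance of being sampled.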
Principal coordinate analyses (PCA) based on Nei's genetic distances [23] between accessions, and a Mantel test [24], which was used to compare Nei's distances based on REMAP or SSR data, were carried out with the software GenAlex.
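The logic of a Mantel test on two distance matrices can be sketched as follows. This is a minimal permutation version, assuming symmetric matrices of precomputed distances between the same accessions in the same order; the function names, the default of 999 permutations, and the one-sided p-value are assumptions made for illustration and are not taken from GenAlex.

```python
import random
from math import sqrt


def _upper(mat, order=None):
    """Upper-triangle values of a square matrix, optionally with rows and
    columns relabelled by the same permutation `order`."""
    n = len(mat)
    idx = order if order is not None else list(range(n))
    return [mat[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either matrix is constant


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p-value."""
    rng = random.Random(seed)
    x = _upper(d1)
    r_obs = _pearson(x, _upper(d2))
    n = len(d2)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # permute accessions, not individual cells
        if _pearson(x, _upper(d2, order)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)
```

Permuting whole rows and columns together, rather than individual cells, preserves the dependence among distances that share an accession, which is the reason a Mantel test is used instead of an ordinary correlation test.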

Diversity at marker loci
Four REMAP primer combinations were used for studying the diversity of the 51 accessions. Because not all fragments could be read as marker peaks, selections were made on the basis of the size and shape of the peaks. The numbers of scored polymorphic markers produced by the different primer combinations were as follows: TIM2 + ISSR5, 91; TIM1 + ISSR20, 84; TIM1 + ISSR1, 209; TIM1 + ISSR15, 149. A total of 533 REMAP markers were analyzed, ranging in size from 80 to 650 bp. A total of 464 polymorphic alleles at the 13 SSR loci were amplified from the 51 accessions, the number varying from 13 to 71 per locus [1]. The average diversity indices of REMAP markers were higher than those of SSR markers (Table 3), leading to a six-fold higher MI for REMAPs.
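The marker index compared here is the product of the number of polymorphic markers and their average PIC. A minimal sketch of that calculation for binary (1/0) scores follows. It assumes a Botstein-type PIC in which band presence and absence are treated as two alleles with frequencies f and 1 - f; this simplification and the function names are illustrative assumptions and may differ from the PowerMarker computation.

```python
def pic_binary(freq_present):
    """PIC for a marker scored 1/0, treating presence and absence as two
    alleles with frequencies f and 1 - f (an assumption of this sketch)."""
    p, q = freq_present, 1.0 - freq_present
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


def marker_index(score_matrix):
    """score_matrix: one list of 0/1 scores per individual.

    Returns (EMR, mean PIC, MI), where EMR is the number of polymorphic
    markers and MI = EMR * mean PIC.
    """
    n_individuals = len(score_matrix)
    pics = []
    for marker in zip(*score_matrix):  # iterate over markers (columns)
        f = sum(marker) / n_individuals
        if 0.0 < f < 1.0:  # polymorphic: both states were observed
            pics.append(pic_binary(f))
    emr = len(pics)
    mean_pic = sum(pics) / emr if emr else 0.0
    return emr, mean_pic, emr * mean_pic


if __name__ == "__main__":
    # Toy data: 4 individuals x 3 markers; the second marker is monomorphic.
    toy = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 1, 0]]
    print(marker_index(toy))
```

Because the MI scales with the number of polymorphic markers per experiment, a multilocus assay can reach a high MI even when the PIC of each individual marker is modest.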

Genetic diversity within accessions
The observed number of REMAP markers per accession varied from 195 (PL204480) to 352 (NGB1672) (Table 4), and the number of SSR alleles from 95 (NGB10785) to 194 (NGB1111). There was only one private REMAP marker (in accession PL325461), but 43 private SSR alleles were found [1]. Diversity indices of accessions studied with REMAP or SSR markers, respectively, varied as follows: A_I from 47.5 (PL204480) to 84.8 (NGB1672) and from 28.4 (NGB10831) to 35 [...]
When studying levels of diversity between countries, vegetation zones, or different cultivar types, we found no significant differences in A_A and PWD based on REMAP markers (Table 5). On the other hand, statistically significant (P < 0.05) differences in A_A and PWD between different vegetation zones and in A_A between different cultivar types were found with SSR markers (Table 5). In the previous study with 96 accessions analyzed with SSR markers, we found significant differences (P < 0.05) in levels of diversity in all groups [1]. When the total number of markers was studied on an individual rather than accession level (A_I), significant differences for each grouping and with both marker types were discovered (Table 5). However, these differences explained only a minor fraction of variation between individuals (1 to 5 %).

Genetic divergence between accessions and groups
AMOVA was performed in order to divide the total genetic variation into three components: variation within accessions, among accessions, and among countries. Most of the variation in the studied material was found within accessions: 89 % when based on REMAP markers, 93 % when based on SSR markers, and 91 % when based on both marker types (Table 6).
No genetic divergence was observed between vegetation zones or cultivar types using SSR markers, REMAP markers, or both (AMOVA, P > 0.05), which might be due to the small numbers of members in the different classes. However, the same result was obtained with SSR markers when 96 accessions were studied [1]. In the PCA as well, no clustering of accessions based on countries, vegetation zones, or cultivar types was seen (Fig. 1). The first two axes together explained 44.1 %, 45.8 %, and 41.1 % of the variation when REMAPs, SSRs, or both marker types, respectively, were used in the analysis.

Discussion
Previously, SSR markers revealed timothy to be very diverse on both the individual and the accession level when studied in a collection of 96 accessions. Because it was impossible to resolve any population or geographical structure with SSRs [1], we have here applied a very different kind of neutral marker, REMAPs, which are based on displaying retrotransposon insertions. Both REMAPs and SSRs were highly polymorphic. Variation was observed mostly within accessions, but with a slightly smaller proportion for REMAPs (89 % vs. 93 %). This difference may be due to the biology of how SSR and retrotransposon polymorphisms are generated. SSRs are generated by replication slippage [7], a process expected to be independent of the environment. In contrast, retrotransposons are known to be activated by both biotic and abiotic stresses [6], which may well be greater in some populations than in others. Population-level stress would thereby lower the proportion of polymorphism on the individual level and increase it on the population or geographic level.
Diversity indices in accessions were lower for SSR than for REMAP markers. This is likely because SSR markers (i.e., alleles) are not independent of each other; there is a theoretical maximum number of markers that can exist in one individual. If all SSR loci amplified from all three genomes of Phleum, the maximum number of markers would be 78 (13 loci, 6 alleles in each). However, there is evidence that timothy is an allopolyploid [25]. Allopolyploidy is consistent with our earlier results [1], with some SSR loci found only in one genome whereas others were present in all three. Therefore, the real maximum number of SSR alleles in any one individual lies somewhere between 26 and 78. In the present study, the observed maximum was 45.
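The bounds of 26 and 78 follow from simple counting, which the sketch below makes explicit. It assumes a hexaploid with three subgenomes, each contributing two allele copies per locus; the function name and its parameters are illustrative only.

```python
def ssr_allele_bounds(n_loci, ploidy=6, n_subgenomes=3):
    """Bounds on the number of distinct SSR alleles in one individual.

    Lower bound: every locus amplifies from a single subgenome only.
    Upper bound: every locus amplifies from all subgenomes, with all
    allele copies different.
    """
    copies_per_subgenome = ploidy // n_subgenomes  # 2 for an allohexaploid
    lower = n_loci * copies_per_subgenome
    upper = n_loci * ploidy
    return lower, upper


if __name__ == "__main__":
    print(ssr_allele_bounds(13))  # 13 SSR loci, as in this study
```

The observed maximum of 45 alleles in one individual falls within this range, as expected if only some loci amplify from all three genomes.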
Polyploids represent about 50 % of flowering plants [26]. In polyploids, the lack of independence between SSR loci is a particular problem, but given a very high number of loci developed from the genome sequences of major crops such as cotton or wheat, chromosome-specific markers can be recovered [27]. For agricultural species without reference genomes, such as timothy, or for many wild species [28], selection of markers with diploid inheritance can reduce the usable loci to very low numbers.
In contrast to SSRs, no limit exists for the maximum number of REMAPs in one individual because retrotransposon insertions are independent of each other. Moreover, different retrotransposon families, such as those in the hexaploid wheat genomes [29], show different evolutionary histories, enabling discrimination between homeologues. Retrotransposon markers have been deployed effectively even for the highly polyploid sugarcane [30]. Although codominant REMAPs also exist, codominance does not restrict the possibility of co-existence of markers in one individual. The maximum number of REMAPs observed in one individual in the present study was 121. Correlations between diversity indices based on REMAP or SSR markers were mostly low or moderate because the two marker systems report from different genomic regions where polymorphisms are generated by different processes. On the other hand, even though SSRs could be treated as codominant markers, it has been suggested that large similarities between diversity indices with dominant markers but somewhat lower similarities between dominant markers and SSRs are due to insufficient numbers of analyzed SSR loci [31].
When the markers were used for measuring distances, PWD between individuals correlated weakly between the two marker types (r = 0.26), whereas genetic distances between accessions correlated more strongly (r = 0.67). PWD is based on the Euclidean distances between individuals, whereas distances between accessions are based on marker frequencies. The same sort of result (poor or nonexistent individual-by-individual correlations but moderate correlation between accessions) was obtained when amplified fragment length polymorphisms (AFLPs), which are comparable to REMAPs in being a multilocus and dominant marker type, were compared with SSRs [32]. In potato, a low correlation between SSR and REMAP markers (r = 0.17) was found in Mantel's matrix correspondence test [8].
Comparing the two marker types, REMAP markers were more cost-efficient. The PCRs of the four REMAP primer combinations were made separately, and products from two different combinations with different fluorescent labels were combined for MegaBACE runs. As a consequence, for the whole diversity analysis (945 samples), 40 PCRs on 96-well microtitre plates were made and analyzed in 20 MegaBACE runs. A total of 533 polymorphic markers was produced. On the other hand, the 13 SSR loci were multiplexed into 5 PCR reactions, each analyzed in its own MegaBACE run, requiring in total 50 PCR plates and 50 MegaBACE runs. In addition, some planning and optimization was required in order to multiplex the PCR reactions for the various SSR loci. A total of 464 SSR markers (i.e., alleles) was amplified. Accordingly, more REMAP markers (i.e., loci) were produced with less labor, money, and time. The MI was over six-fold higher with REMAPs, which is not due only to the need to interpret SSR alleles as separate markers but is also typical for markers with a high effective multiplex ratio, and has also been detected when AFLPs have been compared with SSRs [16].
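The plate and run counts above can be reproduced with a few lines of arithmetic. The sketch assumes that every 96-well plate is filled before a new one is started and that each PCR plate enters exactly one sequencer run, alone or pooled with plates carrying other labels; the function and parameter names are illustrative.

```python
from math import ceil


def workload(n_samples, n_reactions, pooled_per_run, wells=96):
    """Return (PCR plates, sequencer runs) for one marker system.

    n_reactions: separate PCRs per sample (primer combinations or multiplexes).
    pooled_per_run: number of PCR products combined into one sequencer run.
    """
    plates_per_reaction = ceil(n_samples / wells)
    pcr_plates = n_reactions * plates_per_reaction
    runs = ceil(pcr_plates / pooled_per_run)
    return pcr_plates, runs


if __name__ == "__main__":
    remap = workload(945, n_reactions=4, pooled_per_run=2)
    ssr = workload(945, n_reactions=5, pooled_per_run=1)
    print(remap, 533 / remap[1])  # markers per run for REMAP
    print(ssr, 464 / ssr[1])      # markers (alleles) per run for SSR
```

Under these assumptions REMAP yields roughly 27 markers per sequencer run and SSR roughly 9, in line with the conclusion that REMAP required less labor per marker.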
Knowledge of genetic variation and relationships between individuals and accessions is essential when conserving and using genetic resources. Evaluation of genetic diversity requires analysis of multiple markers as efficiently as possible. When choosing a suitable marker type, several aspects have to be taken into account: not only expected heterozygosity and marker index, but also technical difficulty, ease of genotyping, cost, and availability. Technically, there were no differences between REMAPs and SSRs, and we encountered analysis difficulties with both marker types. All SSR peaks contained some degree of stutter, which complicated the identification of alleles. On the other hand, interpretation of REMAP markers was very slow because several markers were amplified in one PCR reaction and there was wide variation in peak heights. Not all peaks could be analyzed, and selections had to be made according to peak height and frequency. Difficulties in scoring hindered the use of automated analysis programs for both marker types. Regarding availability, there are universal retrotransposon primers that can be used in any species, and primers specific for Gramineae also have a vast range of use. In contrast, SSRs have not been developed for every species, and their transferability from one species to another depends on the genetic distance between the taxa [33]. These general conclusions regarding the utility of SSR and retrotransposon markers alone and in combination are consistent with those for three diverse dicot species, distant from the monocot timothy [8-10].

Conclusions
When diversity is examined in a polyploid species, where the codominant nature of SSRs is of no use, dominant REMAP markers, as analyzed by size on a sequencer, were more cost-efficient. REMAPs also described diversity from a larger segment of the genome compared to the same number of SSR alleles. On the other hand, SSRs detected differences in the level of diversity in different groups better than REMAPs. Furthermore, private SSR alleles were found, making SSRs better for accession identification. Private alleles, however, can be developed from retrotransposon markers using the RBIP (retrotransposon-based insertion polymorphism) and ISBP (insertion site-based polymorphism) methods, which are locus-specific [34]. Genetic distances between accessions were similar with REMAP or SSR markers, but neither marker type could reveal any clear divergence between vegetation zones, cultivar types, or countries in the polyploid, very polymorphic, and heterozygous timothy species.
SSR and REMAP polymorphisms derive from very different mechanisms. Variations in SSR numbers at individual loci derive from polymerase slippage during replication. In contrast, retrotransposon insertions, which can be stress-driven, generate the priming sites for retrotransposon-based marker methods. Given the vastly different numbers of microsatellite and retrotransposon loci queried by the marker systems used, which report from very different genomic regions, the fact that they together show a lack of structure likely reflects the outcrossing and hexaploid nature of timothy rather than failures of either marker system. Both retrotransposons and SSRs, however, are neutral markers; patterns of variation in the gene space of timothy, such as those accessible through single nucleotide polymorphism (SNP) genotyping, remain to be explored. SNP genotyping would make it possible to evaluate allele dosages, thereby increasing the information embodied in each locus.
SNP markers have been used in sugarcane, a complex autopolyploid species, to estimate ploidy level and also the dosage of SNPs [35]. The availability of SNP markers has increased with the invention of the genotyping-by-sequencing (GBS) strategy [36], and a recent study presents its use to evaluate allele frequencies in populations of an outbreeding species, perennial ryegrass [37]. Such techniques could be applied to timothy as well to study the structure of accessions. On the other hand, the importance of using both molecular and phenotypic markers for assessing diversity, especially when evaluating adaptive potential, has been emphasized in a study where timothy accessions were characterized with SSRs and chloroplast DNA sequences as well as by morphological and phenological traits [38].