Phylogeny and differentiation of the St genome in Elymus L. sensu lato (Triticeae; Poaceae) based on one nuclear DNA and two chloroplast genes

Hybridization and polyploidization can be major mechanisms for plant evolution and speciation. Thus, the process of polyploidization and the evolutionary history of polyploids are of widespread interest. The species in Elymus L. sensu lato are allopolyploids that share a common St genome from Pseudoroegneria in different combinations with H, Y, P, and W genomes. How the St genome evolved in Elymus s. l. during the hybridization and polyploidization events, however, remains unclear. We used nuclear and chloroplast DNA-based phylogenetic analyses to shed some light on this process. The maximum likelihood (ML) tree based on nuclear ribosomal internal transcribed spacer region (nrITS) data showed that Pseudoroegneria, Hordeum and Agropyron species served as the St, H and P genome diploid ancestors, respectively, for the Elymus s. l. polyploids. The ML tree for the chloroplast genes (matK and the intergenic region of trnH-psbA) suggests that Pseudoroegneria served as the maternal donor of the St genome for Elymus s. l. Furthermore, it suggests that Pseudoroegneria species from Central Asia and Europe are more ancient than those from North America. Molecular evolution in the St genome appeared to be non-random following the polyploidy event, with a departure from the equilibrium neutral model due to a genetic bottleneck caused by recent polyploidization. Our results suggest that the ancient common maternal ancestral genome in Elymus s. l. is the St genome from Pseudoroegneria. The evolutionary differentiation of the St genome in Elymus s. l. after the rise of this group may have multiple causes, including hybridization and polyploidization. Our results also suggest that E. tangutorum should be treated as C. dahurica var. tangutorum, and that E. breviaristatus should be transferred into Campeiostachys. We hypothesize that the Elymus s. l. species originated in Central Asia and Europe, then spread to North America.
Further study of intraspecific variation may help us evaluate our phylogenetic results in greater detail and with more certainty.


Background
Hybridization and polyploidization are major mechanisms in plant evolution and speciation [1,2]. Polyploidization by itself has many consequences for genome evolution, particularly for gene expression and gene organization [3][4][5]. These changes may result in full fertility and stabilization of the hybrid condition and assist in establishing the phenotype in nature, which allows polyploids to adapt to new ecological niches or to be competitively superior to the parental donor [2,6,7].
Evolution under polyploidization alone and/or hybridization and polyploidization together can give rise to a complex of lineages whose phylogenetic relationships are unclear. For such groups, molecular genetic analysis is often necessary to elucidate the genome evolution patterns and the phylogenetic relationships among taxa [8].
The wheat tribe Triticeae (Poaceae) includes many different auto- and allopolyploid taxa, and its systematics, genetics and speciation have received considerable study [9][10][11]. One example of a polyploid complex within the tribe is the genus Elymus L. sensu lato as delimited by Löve [12]; it is an important perennial genus with approximately 150 species worldwide. It includes the traditional species of Elymus L., Roegneria C. Koch, Hystrix Moench, Sitanion Raf., and Kengyilia C. Yen et J. L. Yang.
Since Elymus L. was first described as a genus by Linnaeus [13], its circumscription and taxonomy have changed repeatedly and remain uncertain because of the huge morphological variation within and between species, the polyploid origin of the genus and the frequent spontaneous hybridization between species [12,14,15,16]. Löve [12] suggested that the taxonomic treatment of Triticeae species should be based on genomic constitution, recognizing StH as the genomes of Elymus. Dewey [9] accepted Löve's opinion but noted that the Y genome was represented in many Asiatic species, and recommended that the genomic constitutions of Elymus should be StH, StY or StYH. Roegneria has been recognized as a part of Elymus based on morphological characters (tufted plants; similar spikelets, one spikelet per node; lemma lanceolate-oblong, rounded abaxially, 5-veined, with veins connivent at the apex); they also have a limited genomic relationship [10,17,18]. Although Roegneria shares one or more characteristics with Agropyron, Elymus, and Kengyilia, none of them has these characteristics in the same combination. Therefore, Baum et al. [19,20] concluded that the genus Roegneria should be treated as strictly separate from Agropyron, Elymus, and Kengyilia. The genus Hystrix was established by Moench with Hy. patula as the type, based on the morphological character of lacking glumes or possessing subulate or linear-setiform ones [21]. Dewey [9] and Löve [12] proposed placing Hystrix in Elymus because Hy. patula contains the StH genome. However, it was suggested that species of Hystrix containing NsXm genomes, such as Hystrix coreana, Hy. duthiei ssp. duthiei and Hy. duthiei ssp. longearistata, should be transferred into Leymus Hochst [22,23]. The genus Sitanion Rafinesque was erected in 1819, with Sitanion hystrix as the type species. However, Sitanion hystrix and its varieties were treated as Elymus hystrix on the basis of cytogenetic studies [9,24,25]. The genus Kengyilia C. Yen et J. L. Yang was described with Kengyilia gobicola C. Yen et J. L. Yang as the type species, which contains StYP genomes [26]. Based on the principle that taxonomic treatment should reflect phylogenetic history, Yen et al. [27] suggested that the genus Elymus s. l. should be split into Elymus sensu stricto (StH genome), Roegneria C. Koch  [28]. This change has been supported by a few taxonomists [8,18,27,28,29,30,31,32]. Also, some systematists have treated Elymus s. l. species as different genera, based on differences in morphology and the regional distribution of those polyploid species [27,30,32].
Several diploid species with the St genome in Pseudoroegneria occur from Ciscaucasia to the Middle East and Northern China, and on to western North America [12]. However, the evolutionary pathway of the St genome from diploid Pseudoroegneria to Elymus s. l. via hybridization and polyploidization is still unclear.
Gaining a better understanding of the evolutionary history of polyploids is important to the study of plant evolution [1]. Molecular phylogenetic analyses have aided in this process [1,36,37]. Nuclear internal transcribed spacer (nrITS) DNA sequences have been used to study phylogenetic and genomic relationships at lower taxonomic levels [38][39][40][41]. Chloroplast DNA (cpDNA) sequences, including coding and non-coding regions such as the rbcL gene, the matK gene, the trnL intron and the trnL-trnF and trnH-psbA intergenic spacers, are also a valuable source of markers for identifying the maternal donors of polyploids, with the additional capacity to reveal phylogenetic relationships among related species [38,[42][43][44][45]. In Elymus s. l., both nuclear and chloroplast genes have been used to identify genome donors, to demonstrate hybridization events or introgression, to examine duplicate gene evolution, and to reveal the evolutionary history and origin of its species [38,2,3,.
In the present study, we analyzed 6 accessions of 4 Pseudoroegneria species with the St genome, 35 accessions of 12 other diploid species with the P, W, V, H, I, E, Xp and Ns monogenomes, and 28 Elymus s. l. allopolyploids using one nuclear region, the internal transcribed spacer (nrDNA ITS), and two chloroplast regions (matK and the intergenic region of trnH-psbA). The objectives of this study are: (1) to elucidate the phylogenetic relationships of some Elymus s. l. polyploid species; (2) to examine the genetic differentiation of the St genome in Pseudoroegneria; (3) to investigate the genetic differentiation of the St genome in polyploid Elymus s. l. species relative to each other and to Pseudoroegneria; (4) to compare the nucleotide diversity of the St-genome sequences of nrITS, matK, and trnH-psbA between Elymus s. l. and its putative diploid donors and among Elymus s. l. species.

Phylogeny analysis
nrITS analysis
With the assumed nucleotide frequencies A: 0.21490, C: 0.26170, G: 0.27980, T: 0.24360, the nrITS data yielded a single phylogenetic tree (−Lnlikelihood = 3004.4870), with the proportion of invariable sites = none and gamma shape parameter = 0.5849. Likelihood settings from the best-fit model (GTR + G) were selected by the Akaike information criterion (AIC) in Modeltest 3.7. The ML tree with bootstrap support (BS) above the branches is illustrated in Fig. 1.

matK analysis
The ML analysis of the matK sequence data yielded a single phylogenetic tree (−Lnlikelihood = 1787.3855), with the assumed nucleotide frequencies A: 0.36600; C: 0.15890; G: 0.17570; T: 0.29940, the proportion of invariable sites = none, gamma shape parameter = 0.8381. Likelihood settings from the best-fit model (TVM + G) were selected by AIC in Modeltest 3.7. We found that all matK sequences from Elymus s. l. species corresponded to the St-type.
The ML tree for the matK data, with BS above the branches, is illustrated in Fig. 2. All the Elymus s. l. species and some diploid species of the Triticeae formed Clade I. The other diploid species were placed outside Clade I. Within Clade I, the St-genome sequences of the following formed one subclade: all Pseudoroegneria species, the E e genome

trnH-psbA analysis
Likelihood settings from best-fit model (K81uf + G) were selected by AIC in Modeltest 3.7 (−Ln likelihood = 1174.7281). The assumed nucleotide frequencies A: 0.35970; C: 0.17790; G: 0.18010; T: 0.28230, the proportion of invariable sites = none, gamma shape parameter = 0.1481. The ML tree with BS above branches was illustrated in Fig. 3. We obtained two different St-type trnH-psbA sequences from Elymus s. l. species.

MJ-network analysis
As no recombination was detected using the GARD recombination-detection method within the HyPhy package, the nrITS, matK, and trnH-psbA sequences obtained in this study were used to generate MJ networks. Each circular network node represents a single sequence haplotype, with node size proportional to the number of isolates with that haplotype. Median vectors (mv) represent hypothesized haplotypes, not sampled in this study, that are required to connect the observed haplotypes. Seventy-six, forty-eight, and thirty-three haplotypes were derived from 98 nrITS sequences (Fig. 4), 102 matK sequences (Fig. 5), and 95 trnH-psbA sequences (Fig. 6), respectively. The median-joining (MJ) networks gave phylogenetic reconstructions consistent with the ML trees. To keep the two analyses comparable, we named the clusters after the corresponding groups in the ML trees. In the nrITS MJ network analysis, five clusters (Cluster N-A to Cluster N-E) were recognized, representing three distinct types of haplotypes (St-, P-, and H-type) of Elymus s. l. In the matK MJ network analysis, all the species with the St genome clustered together with the St diploid species in Cluster N-I. The trnH-psbA MJ network analysis recognized two different St-types of haplotypes of Elymus s. l. species, grouped in Cluster N-One and N-Two.

Nucleotide diversity analysis in St genome
Two measures of nucleotide diversity, π and θw, were calculated separately for each set of sequence data for the St genome of the diploid species (Pseudoroegneria), the tetraploid StH and StY species, and the hexaploid StYW, StYH, StYP and StStH species. Tajima's test and Fu and Li's test were conducted on each data set, grouped by genome composition (Table 1). The St-type nrITS sequence of the StYP species is missing from our data, so we cannot report nucleotide diversity for that category. The trnH-psbA sequences obtained from the StStH species (Elymus repens) were identical, so their nucleotide diversity was zero. Tajima's D and Fu and Li's D estimates for the trnH-psbA sequences from the St genome species, and Tajima's D estimate for the trnH-psbA sequences from the StYW genome species, were positive, indicating a departure from the equilibrium neutral model at this locus, with a deficit of rare sequence variants (that is, an excess of intermediate-frequency variants) in the St genome diploid species and the StYW genome hexaploid species based on the trnH-psbA sequences.
In the figures, the numbers after species names represent different accessions of the same species (Table S1).

Discussion
Phylogenetic relationships among the polyploids in Elymus s. l.
Elymus s. l. consists of allopolyploids that are widely distributed and includes a number of endemic species. Analyses of nrITS, matK and trnH-psbA sequences collected from a wide range of Elymus s. l. species and related genera can shed light on their phylogenetic relationships, ancestral donors and the polyploidization events in the speciation processes on the basis of orthologous comparison.
The  [52]. Subtle morphological differences have often formed the basis for taxon recognition within the complex, resulting in different taxonomic treatments of the Elymus dahurica complex. The species complex possesses three haplomes, St, Y, and H, with 2n = 6x = 42 chromosomes and has an Asiatic distribution, ranging from Iran to Japan and from southern Siberia to central China [12]. Molecular diversity of the 5S rDNA units [53], storage proteins [54], and other considerations [55] in the Elymus dahurica complex supported the genomic constitution of St, Y, and H haplomes. The ML tree and MJ network based on the nrITS data from this study, combined with unpublished GISH (genomic in situ hybridization) results, confirm the genomic constitution of St, Y, and H haplomes in E. tangutorum and E. breviaristatus. Morphologically, E. tangutorum and E. breviaristatus are similar to the species in Campeiostachys in that the palea and lemma are of equal length [51]. Despite subtle morphological differences among these species, we strongly support a taxonomic treatment based on both genomic constitution and morphology. Thus, E. tangutorum should be treated as C. dahurica var. tangutorum and E. breviaristatus should be transferred into Campeiostachys.
It has recently been found that incomplete concerted evolution of nrDNA is widespread among angiosperms [56]. The frequency of heterogeneity among rDNA sequences is higher in allopolyploids than in diploid and autopolyploid species [57]. The main cause of this heterogeneity is slowed concerted evolution due to hybridization and polyploidy. Concerted evolution in an allopolyploid may lead to a novel combination of nrITS sequences, representing a mixture of the two original parental nrITS sequences, within a single individual. It is also possible that unidirectional concerted evolution subsequently occurs, leading to the loss of one copy and fixation of the new nrITS type. Furthermore, both types of parental nrITS sequences could be maintained, especially in young hybrid-derived taxa that have had little opportunity for concerted evolution [57][58][59]. In the present ML analysis, Anthosachne scabra (StYW, PI533213, Australia) was placed at an abnormal branch position with St-type nrITS sequences obtained from Pse. libanotica ... the Roegneria species in Clade C. At the same position, a GGT/AT insert in the nrITS sequence was detected for the Elymus, Pseudoroegneria and Anthosachne scabra (StYW, PI533213, Australia) species in Clade D. A CCAC insert at positions 417-420 was detected for all species mentioned, and these two clades were very close to each other. Thus, we hypothesized that the nrITS type obtained from this group might be an intermediate type, representing a mixture of the two ancestral nrITS sequences (St- and St-Y type). This situation may be due to inter-genome recombination following hybridization, either before or after the chromosome doubling event. Furthermore, Pseudoroegneria from Central Asia might have acted as an ancestor in the hybrid history of Roegneria (StY, Central Asia), resulting in recombinant sequences.
Our findings are broadly consistent with previous reports on the evolution of nrITS sequences in allopolyploids, in which the sequences represent some combination of ancestral input [60,61].

The differentiation of St genome in Elymus s. l.
Prior research has demonstrated the evolutionary differentiation of the St genome in different diploid species. Morphologically, Pse. stipifolia has a rough rachis densely covered by prickles; Pse. spicata has slender awns and unequal glumes; Pse. strigosa has long awns with equal glumes; and Pse. tauri and Pse. libanotica have no awns and unequal glumes [62]. Molecular data also show differentiation in Pseudoroegneria. Sun et al. [63] reported a 39 bp MITE stowaway element insertion in the nuclear RNA polymerase II (RPB2) gene for Pse. spicata and Pse. stipifolia; Pse. tauri and Pse. libanotica lack this insertion. The Pseudoroegneria diploid species are widely distributed, extending from Ciscaucasia to the Middle East and Central Asia, and on to western North America [12]. In our study, Pse. libanotica (Middle East), Pse. strigosa ssp. aegilopoides (PI595164, Central Asia; PI531752, Middle East), Pse. stipifolia (Central Asia), and Pse. spicata (North America) were used in the phylogenetic analysis based on the nrITS, matK and trnH-psbA data. All Elymus s. l. species grouped with the Pseudoroegneria species in the ML tree and MJ network using the matK data. Although in the ML tree and MJ network based on the trnH-psbA data, Pse. stipifolia from Central Asia and Pse. strigosa ssp. aegilopoides (PI531752) from the Middle East were placed close to six Elymus s. l. tetraploids and sixteen Elymus s. l. hexaploids, Pseudoroegneria ...
Table note: n is the number of sites (excluding sites with gaps/missing data), s is the number of segregating sites, π is the average pairwise diversity, and θw is the diversity based on the number of segregating sites.
In this study, based on the matK data, all the Elymus s. l. species were grouped with the Pseudoroegneria species (with sub-clades) in the ML tree and MJ network. In contrast, the ML tree and MJ network based on the trnH-psbA data placed Pse. stipifolia from Central Asia and Pse. strigosa ssp. aegilopoides (PI531752) from the Middle East close to three tetraploids (E. wawawaiensis, E. virginicus and E. sibiricus) and nine hexaploids (C. breviaristata, C. kamoji, C. nutans, An. australasica, An. scabra, K. gobicola, K. hirsuta, K. kokonorica and K. melanthera). Pse. libanotica (Middle East), Pse. strigosa ssp. aegilopoides (PI595164, Central Asia) and Pse. spicata (North America) were grouped with the remaining Elymus s. l. species. Similar results were obtained from the ML tree based on the nrITS sequence data. The evolution of Elymus s. l. species appears to parallel that of the Pseudoroegneria species, originating in Central Asia and Europe, then spreading to North America via recurrent hybridization and polyploidization events. In addition, Elymus s. l. species were split into different St-groups. For instance, two accessions of the hexaploid C. breviaristata were placed in separate St-genome clades in the ML trees based on the nrITS and trnH-psbA sequence data. The same situation was detected for the tetraploid E. canadensis in the ML trees based on the matK and trnH-psbA sequence data. Such patterns indicate that, after the polyploidization events, the St genome differentiated within Elymus s. l. at both the genus and species levels, as shown by the nrDNA ITS and the chloroplast matK and trnH-psbA data. We also found that the non-coding cpDNA sequences (trnH-psbA) provided more phylogenetic information than the coding cpDNA sequences (matK), revealing the differentiation of the St genome in Elymus s. l. species more clearly.
Evolutionary dynamics of duplicate genes can provide a better understanding of the processes of polyploidization and subsequent rapid diversification [1,4]. In this study, nrITS and matK nucleotide sequence diversity in the St genome of the tetraploid StH and StY species was higher than in the St genome of diploid Pseudoroegneria. Tajima's D and Fu and Li's D estimates for trnH-psbA in the St genome of diploid Pseudoroegneria were positive. This result indicates a departure from the equilibrium neutral model at this locus, with a deficit of rare sequence variants (an excess of intermediate-frequency variants) in the diploid Pseudoroegneria species. This finding is compatible with a genetic bottleneck created by recent polyploidization during the radiation of Pseudoroegneria species. The values of Tajima's D and Fu and Li's D statistics for the nrITS, matK and trnH-psbA sequences of the StH and StY genomes were all negative, indicating that the observed number of rare variants exceeds the number expected under an equilibrium neutral model. These estimates indicate that the excess of rare variants in the tetraploid StH and StY species might have been created by independent hybridization events or by introgression of the St genome during polyploidization.
Our phylogenetic results support the possibility that StY tetraploid species were the direct ancestors of the StYW, StYP and StYH hexaploid species during the allohexaploid speciation process (see next discussion section). We compared the nucleotide sequence diversity of nrITS, matK and trnH-psbA between the St genome of the StY tetraploid species and that of the StYW, StYP and StYH hexaploid species, respectively. Consistent with the narrow distribution of the StYW and StYP species and the rarity of StYH species compared with StY species, nucleotide sequence diversity in the St genome of the tetraploid StY species was higher than in the St genome of the hexaploid species (StYH and StYW for the nrITS and matK sequences, and StYP for the matK and trnH-psbA sequences). In addition, the values of Tajima ...
Cytogenetic studies have concluded that Pseudoroegneria, Hordeum, Australopyrum, and Agropyron species served as the St, H, W, and P genome diploid donors, respectively, during the polyploid speciation of Elymus s. l. species [9,17,35]. In the ML tree based on the nrITS data, three types of nrITS sequences (St-, H- and P-type) were obtained from the polyploid Elymus s. l. species (except An. scabra PI533213) in the present study. This result indicates that the nrITS sequences in different Elymus s. l. species are very similar to those of their diploid ancestors, confirming that Elymus s. l. is closely related to Pseudoroegneria, Hordeum and Agropyron. Combined with the prior cytogenetic results, we conclude that Pseudoroegneria, Hordeum and Agropyron species served as the St, H and P genome diploid donors during the allopolyploid speciation of Elymus s. l. species. Our conclusion is partly consistent with prior studies of single-copy nuclear genes (Acc1 and Pgk1) [8].
Those studies also proposed that Australopyrum species served as the W genome diploid donors during the polyploid speciation of Anthosachne species. We did not obtain W-type nrITS sequences in this study. In a future study, W-type nrITS sequences from Anthosachne might be obtained by screening a larger number of positive clones carrying the nrITS insert, to test whether Australopyrum contributed to the evolution of Elymus s. l. species.
Phylogenetic analysis of our nrITS data revealed that each homoeologous sequence grouped with those from the corresponding diploid progenitors. Similarly, the homoeologous loci of nrITS from the sampled StYH genome Campeiostachys species (C. kamoji and C. nutans), StYP genome Kengyilia species (K. melanthera) and StYW genome Anthosachne species (An. scabra and An. australasica) were recovered, with each homoeologous locus also grouping with the StY genome Roegneria species (R. anthosachnoides, R. grandis and R. stricta) and the StH genome Elymus sensu stricto species (E. canadensis, E. caninus, E. elymoides, E. hystrix, E. mutabilis, E. sibiricus, E. virginicus and E. wawawaiensis). These results strongly support the suggestion that the StYH, StYP and StYW genome species had an allohexaploid origin with an StY species as one of the hybridizing ancestors. Combined with the previous cytogenetic evidence, the relatively large population size of the StY genome Roegneria species and the failure to discover the diploid Y-genome donor, we conclude that the StY genome species might have served as a direct donor of the StYH, StYP and StYW genome species during allohexaploid speciation. These results also suggest multiple, independent origins of some polyploid species. This conclusion is compatible with the hypothesis of Yen et al. [27] and the results of Fan et al. [8].

Conclusion
In this study, the nrITS sequence analysis in different Elymus s. l. species showed a clear linkage between the nrITS sequences of polyploid Elymus s. l. species and those of their diploid ancestors. Combined with the previous cytogenetic results, our data support the premise that Pseudoroegneria, Hordeum and Agropyron species served as the St, H and P genome diploid donors during the polyploid speciation of Elymus s. l. species. Analyses of phylogenetic relationships based on nrITS data also showed that it is reasonable to treat E. tangutorum as C. dahurica var. tangutorum and to transfer E. breviaristatus into Campeiostachys, in spite of subtle morphological differences among these species. We strongly support a taxonomy based on both genomic constitution and morphology. Sequence diversity patterns of the two chloroplast genes suggested that Pseudoroegneria (the St genome donor) served as the maternal donor during the polyploidization events that gave rise to Elymus s. l. Those patterns also suggested that Pseudoroegneria species from Central Asia and Europe are more ancient than those from North America. Elymus s. l. species appear to have originated in Central Asia and Europe, then spread to North America after recurrent hybridization and polyploidization events. Furthermore, differentiation of the St genome existed at both the genus and species levels, based on the nrDNA ITS and the chloroplast matK and trnH-psbA sequences. The molecular diversity of the two chloroplast genes and one nuclear DNA sequence in the St genome reflects the evolution of the St genome in Elymus s. l. Molecular evolution in the St genome may have entered a period of non-random evolution following the polyploidization event and introgression of the St genome, departing from the equilibrium neutral model due to a genetic bottleneck caused by recent polyploidization.

Taxon sampling
Twenty-eight Elymus s. l. species were included in this study and were analyzed together with sixteen diploid taxa representing nine basic genomes in the tribe Triticeae (see Additional file 1: Table S1). Bromus inermis Leyss was used as the outgroup. The seed materials with PI numbers were kindly provided by the American National Plant Germplasm System (Pullman, Washington, USA). We collected the seed materials with Pr, ZY, and Y numbers. The plants and voucher specimens were deposited at the Herbarium of the Triticeae Research Institute, Sichuan Agricultural University, China (SAUTI).

DNA extraction, amplification and sequencing
The CTAB (cetyltrimethylammonium bromide) procedure [64] was used to isolate total DNA. The nuclear nrITS sequence and the chloroplast matK and trnH-psbA spacer sequences were amplified with the primers listed in Table 2. PCR amplification of the cpDNA was carried out in a 50 μL reaction mixture containing 10× ExTaq polymerase buffer, 2 mM MgCl2, 200 μM of dNTP, 1 μM of each primer, 1.5 U ExTaq and about 30 ng of template DNA. Amplifications were performed on a Mastercycler (Pro S, Eppendorf, Germany) using the protocols described in Table 3. The PCR products were visualized on 1.0 % agarose gels, purified with an ENZA™ gel extraction kit (Omega Bio-Tech, Georgia, USA) and then cloned into the pMD19-T vector (TaKaRa, Dalian, China) according to the manufacturer's instructions. Three random clones per diploid were chosen for sequencing. As there are at least three to five accessions for each allopolyploid in this study, only one random clone was picked and sequenced for each allopolyploid accession. All clones were sequenced in both directions at the Beijing Genomics Institute (BGI, Beijing, China).

Phylogenetic analysis
Multiple sequence alignments were made using ClustalX [65], with additional manual adjustment. Phylogenetic analyses were performed using maximum likelihood (ML). Maximum likelihood analyses of the nrITS, matK and trnH-psbA data were performed in PAUP*4.0b10 (Swofford D L, Sinauer Associates, http://www.sinauer.com). The evolutionary model used for the phylogenetic analyses was determined using ModelTest v3.0 with the Akaike information criterion (AIC) [66]. The optimal models were GTR + G for the nrITS data, TVM + G for the matK data, and K81uf + G for the trnH-psbA data. Maximum likelihood heuristic searches were performed with 100 random addition sequence replications and the Tree Bisection-Reconnection (TBR) branch swapping algorithm. To assess the robustness of clades, bootstrap support (BS) values were calculated with 1000 replications [67].
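The AIC-based model choice described above can be illustrated with a minimal sketch. It is not the ModelTest implementation: the class and function names are hypothetical, the parameter counts cover only the substitution model (branch lengths are assumed identical across candidates and are ignored), and the −lnL values for JC and HKY+G are invented for illustration. Only the GTR+G value is taken from the nrITS result reported in this paper.

```python
from dataclasses import dataclass


@dataclass
class ModelFit:
    name: str
    neg_log_likelihood: float  # -lnL as reported by the ML program
    n_params: int              # free substitution-model parameters (assumed counts)


def aic(fit: ModelFit) -> float:
    """AIC = 2k - 2 lnL, written here in terms of -lnL."""
    return 2 * fit.n_params + 2 * fit.neg_log_likelihood


def best_by_aic(fits):
    """Return the candidate with the smallest AIC."""
    return min(fits, key=aic)


# Hypothetical candidate set: only the GTR+G score comes from the text.
candidates = [
    ModelFit("JC", 3100.0, 0),         # invented -lnL
    ModelFit("HKY+G", 3020.0, 5),      # invented -lnL
    ModelFit("GTR+G", 3004.4870, 9),   # nrITS value reported in the Results
]
best = best_by_aic(candidates)
print(best.name, round(aic(best), 3))
```

A richer model is preferred only when its gain in likelihood outweighs the penalty of two AIC units per extra parameter, which is why the comparison is made on AIC rather than on −lnL alone.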

Network analysis
Taking into consideration the potential for reticulation in the evolution of polyploids, a phylogenetic network reconstruction method was used to study the relationship between ancestral and derived haplotypes in this study. Because we used known gene genealogies in our simulation studies, the median-joining (MJ) network method was applied [68]. The MJ network method has already been used successfully to study specific progenitor-descendant relationships of polyploid Triticeae species [69,70,11]. The MJ network analysis was carried out with the Network 4.6.1.3 program (Fluxus Technology Ltd, Clare, Suffolk, UK). Because the program infers median-joining networks from non-recombining DNA [71], the GARD recombination detection method within the HyPhy package [72] was used to test for recombination.
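The input to a median-joining analysis is a set of distinct haplotypes with their frequencies, and node size in the network is proportional to those frequencies. The sketch below shows only that preparatory step (collapsing aligned sequences into haplotypes and counting pairwise differences); it does not implement the median-joining algorithm itself, and the accession names and sequences are toy values invented for illustration.

```python
from collections import Counter


def collapse_haplotypes(aligned):
    """Map each distinct aligned sequence to the number of accessions carrying it."""
    return Counter(aligned.values())


def hamming(a, b):
    """Number of alignment positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    return sum(x != y for x, y in zip(a, b))


# Toy alignment (hypothetical accessions, not data from this study).
toy = {
    "acc1": "ACGTAC",
    "acc2": "ACGTAC",
    "acc3": "ACGTTC",
    "acc4": "ACCTTC",
}
haplotypes = collapse_haplotypes(toy)
for seq, count in haplotypes.items():
    print(seq, count)
```

In this toy case four accessions collapse into three haplotypes, one of which would be drawn as a node twice the size of the others.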

Nucleotide diversity estimate
To assess the gene divergence and genetic relationships in the St genome between the polyploids and their diploid progenitor, nucleotide diversity was estimated by Tajima's π [73] and Watterson's θ [74,75]. Tajima's π quantifies the mean percentage of nucleotide differences among all pairwise comparisons for a set of sequences, while Watterson's θ is simply an index of the number of segregating (polymorphic) sites. Tests of neutrality, including Tajima's D and Fu and Li's D statistics, were performed as described by Tajima [73] and Fu and Li [76]. The significance of D-values was estimated from the simulated distribution of random samples (1000 steps) using a coalescence algorithm assuming neutrality and population equilibrium [77]. These parameters were calculated with DnaSP 4.10.9 [78].
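The statistics named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the DnaSP implementation: the function names are hypothetical, columns containing a gap or missing-data character are simply dropped, all sequences are assumed to be aligned to the same length, and the toy sequences are invented. Tajima's D follows the standard formula from Tajima's 1989 paper and is returned as None when it is undefined (fewer than four sequences or no segregating sites).

```python
from itertools import combinations
from math import sqrt


def usable_columns(seqs):
    """Alignment columns that contain no gap or missing-data character."""
    return [col for col in zip(*seqs) if not set(col) & {"-", "N", "?"}]


def diversity(seqs):
    """Return (n_sites, S, pi_per_site, theta_w_per_site, tajima_d)."""
    n = len(seqs)
    cols = usable_columns(seqs)
    n_sites = len(cols)
    if n < 2 or n_sites == 0:
        raise ValueError("need at least two sequences and one usable site")
    S = sum(1 for col in cols if len(set(col)) > 1)  # segregating sites
    n_pairs = n * (n - 1) / 2
    # k: mean number of pairwise differences over all sequence pairs
    k = sum(sum(a != b for a, b in combinations(col, 2)) for col in cols) / n_pairs
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    pi, theta_w = k / n_sites, S / a1 / n_sites
    if n < 4 or S == 0:
        return n_sites, S, pi, theta_w, None  # D is undefined here
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1, e2 = c1 / a1, c2 / (a1 ** 2 + a2)
    d = (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return n_sites, S, pi, theta_w, d


# Toy data: every variant is a singleton vs. every variant is shared by half the sample.
rare = ["AAAAAA", "CAAAAA", "ACAAAA", "AACAAA"]
balanced = ["AAAAAA", "AAAAAA", "CCCAAA", "CCCAAA"]
d_rare = diversity(rare)[4]
d_balanced = diversity(balanced)[4]
print(d_rare < 0, d_balanced > 0)
```

The two toy samples have the same number of segregating sites but opposite signs of D: singletons pull π below θw (negative D), whereas intermediate-frequency variants push π above θw (positive D).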