Genetic variability and structure of an important wild steppe grass Psathyrostachys juncea (Triticeae: Poaceae) germplasm collection from north and central Asia

View article
Plant Biology

Introduction

Psathyrostachys Nevski, a perennial genus of Triticeae native to the steppe and desert regions of Russia and China, includes approximately ten species, and all species share the Ns genome, which is distantly related with the A, B and D genomes of wheat but has no genetic relatedness with the St, P, E, W, R, I and H genomes in Triticeae (Hsiao, Wang & Dewey, 1986; Wang, Xu & Song, 2005). There are four species, P. huashanica, P. kronenburgii, P. lanuginose and P. juncea, distributed in China, of which P. huashanica and P. juncea are of primary concern. Russian wildrye (P. juncea (Fisch.) Nevski), an outcrossing diploid species (2n = 2x = 14, NsNs), is native to the steppe and desert regions of Russia, Mongolia, China and central Asia. P. juncea could be used for both autumn and winter grazing with high feeding value. It has been also introduced and widely utilized in the North American Prairie regions for rangeland rehabilitation and improvement (Wang, Xu & Song, 2005). In addition, as a relative species of wheat, P. juncea, whose longevity to be 25 years or longer by artificial cultivation, is considered to be exceptionally cold and drought tolerant and highly adaptive to loam, clay and saline-alkali soils. Furthermore, P. juncea possesses excellent tolerance to barley yellow dwarf virus, thus it has the greatest potential for improving cereal crops by wide cross breeding (Wang, Xu & Song, 2005; Jefferson & Muri, 2007; Sharma et al., 1989). To date, most of the available reports on P. juncea mainly involves genetic improvement of wheat disease resistance, tissue culture technique and genetic diversity analysis. However, studies on wild populations over large spatial scales are still insufficient. Population study may also possess wide genetic backgrounds and retain abundant genes for good quality and resistance to abiotic stress and disease (Yu & Zhang, 2010). Also, no previous studies have focused on the interaction between genetic divergence and eco-geographic factors on this species. Furthermore, our study could provide useful information for the conservation and utilization of P. huashanica, an endangered species endemic to China that was reported to be closely related to the Ns genome of P. juncea (Yu & Zhang, 2010).

To conserve and exploit the genetic resources of plants scientifically, a detailed acquaintance of the distribution and amount of genetic variability within the organism is demanded, mainly focused on the investigation of the population genetic diversity and hierarchical structure (Feng, Jiang & Fan, 2015; Costa et al., 2016). A heavily researched subject in evolutionary biology and molecular ecology is the investigation of the genetic basis of local adaptation in non-model species in natural plant populations (Di Pierro et al., 2017; Yang et al., 2017). Adaptive divergence driven by seasonal climate change and habitat heterogeneity can result in local adaptation (Yang et al., 2017; Hedrick, 2006). In addition to environmental differences, geographic isolation, phylogeographic patterns, gene flow and population dynamics also lead to selective pressures resulting in spatially structured genetic variation. Species will undergo adaptive evolution in phenotypes and phenology as a result of adaptive changes of genes in the genome (Feng, Jiang & Fan, 2015; Rellstab et al., 2015). However, a lack of genomic information makes it difficult for most non-model species to define these candidate genes accountable for local adaptation in their genomes while non-negligible (Yong et al., 2017). This problem could be solved by applying molecular markers that require no prior information and have high density genomic coverage (Yong et al., 2017). Among frequently-used markers, AFLPs were used in this study. It provided a rapid and inexpensive way to obtain allele frequency parameters for abundant samples of organisms with hundreds of loci generated per primer combination (Costa et al., 2016).

Hitchhiking effects play an important role in the detection of candidate genes (Schlötterer, 2003). Loci that possessing obviously higher or lower genetic variation (e.g., Fst) than expected under neutrality are considered to be under selection, and the proportion of outliers is usually less than 5% (Schlötterer, 2003; Paris et al., 2010). To date, the most frequently employed analytical approaches for genome scanning are Dfdist and BayeScan software (Beaumont & Nichols, 1996; Wright, 1949; Beaumont & Balding, 2004). The alternative means of detecting and illuminating genetic distinctiveness, gene flow and dispersal are the population re-allocation test for individuals based on a large number of polymorphic loci (Albaladejo et al., 2010). Individuals could be assigned to the most possible source populations with the software AFLPOP (Duchesne & Bernatchez, 2002). Those approaches provide opportunity to access distribution of genetic variation in species, as a result, genetic diversity could be managed appropriately.

In this study, a total of eleven wild P. juncea populations from three regions at north and central Asia were employed to investigate their extent of genetic diversity and structure hierarchies and lay a foundation for collecting, protecting, and utilizing of excellent germplasm resources. On the other hand, AFLP markers were applied to scan the genome of P. juncea and identify the adaptive loci. Assignment tests were performed to investigate the seed-mediated dispersal of P. juncea at a large landscape across north and central Asia. Moreover, an environmental parameter data set corresponding to the sampling site was obtained to provide new insights regarding the correlation of genetic variation, genetic diversity and habitat heterogeneity.

Materials and Methods

Plant materials, DNA extraction and AFLPs

Eleven wild populations of P. juncea were used in this study, which contained a total of 81 individual plants. These populations were all collected from NPGS (National Plant Germplasm System of the United States) genbank from Kazakhstan (KZ), Mongolia (MGL) and Xinjiang (XJ) in China (Fig. 1). With the principle that 50 or more randomly sampled individuals were demanded to collect seeds for each population (Brown & Marshall, 1995). The sample number of each population ranged from 4 to 10, and the corresponding eco-geographical conditions are shown in Table S1. Seeds were planted in containers in a phytotron at Sichuan Agriculture University (25 °C, 300 μmols·m2·s−1; 16-h photoperiod). Genomic DNA of a single plant was isolated from approximately 50 mg fresh young leaves using the Plant DNA Extraction Kit (Tiangen, Beijing, China) following the manufacturer’s instructions. Then, the DNA concentration was quantified using the NanoDrop 2000 Spectrophotometers (Thermo Fisher, Waltham, MA, USA). The DNA samples diluted to 100 ng·μL−1 were stored at −20 °C. According to Sun et al. (2017) with minor modifications, AFLP fingerprinting was performed using six primers (Table S2), and the fragments measuring 60–500 bp with a peak height above or equal to 100 reflective fluorescent units (RFUs) were scored using Genemarker 2.2 (SoftGenetics) as “1” (presence of fragment) or “0” (absence of fragment).

Geographical distribution of eleven P. juncea populations at a large geographical scale and clustering pattern of them in UPGMA analysis and STRUCTURE analysis by populations.

Figure 1: Geographical distribution of eleven P. juncea populations at a large geographical scale and clustering pattern of them in UPGMA analysis and STRUCTURE analysis by populations.

Data analysis

Genetic diversity

According to the band frequency less than 95%, number of polymorphic bands (NPB) and the percentage of polymorphic bands (PPB) were calculated. In addition, polymorphic information content (PIC) and Shannon’s diversity index (I) were calculated as indicators for estimating the discriminatory power of each primer combination (Sun et al., 2017). The PIC of each primer pair was defined as follows, supposing that p and q represent the frequencies of the marker bands that were present and absent, respectively: PIC = 1 − p2q2 (Roldán-Ruiz et al., 2000). NPB, PPB, PIC and I were all calculated using Excel 2013.

At the population’s level, the descriptive statistics, including number of polymorphic loci (NPL), percentage of polymorphic loci (PPL), Nei’s gene diversity (Hj), and Shannon diversity index (Ho), were calculated by AFLP-SURV v1.0 (Zhang et al., 2018). The observed number of alleles per locus (Na) and the effective number of alleles per locus (Ne) were computed by POPGENE v1.31 (Fu et al., 2016), assuming populations under Hardy–Weinberg equilibrium (HWE).

Outlier detection

The detection of outliers is based on the hypothesis that extremely high Fst between populations are seen in positively selected loci (the positive outliers) compared to neutral loci and reduced Fst in loci under balancing selection (the negative outliers) (Yu et al., 2013). In this instance, two complementary measures described as follows were applied to detect candidate outliers that were potentially influenced by selection. Dfdist (Beaumont & Nichols, 1996) in Arlequin35, which is based on the summary statistics under Wright’s infinite hierarchical island model at migration–drift equilibrium (Wright, 1949). Based on the rejection of most common alleles (allele frequency >99%), empirical multilocus Fst was estimated, and simulations were performed. With the infinite island model and the null distribution of Fst, which was based on 50,000 simulations and was obtained according to the methods of Beaumont & Balding (2004), coalescent simulations were performed to generate data sets. According to the recommendations of Wang et al. (2012) with minor modifications, any loci occurring outside of 0.005–0.995 were considered candidate outliers.

Another method to detect signatures of natural selection was BayeScan 2.0 (http://www.cmpg.unibe.ch/software/bayescan/) following the Bayesian likelihood method by reversible-jump MCMC, which uses population-specific plus locus-specific components of Fst coefficients with the assumption that allele frequencies follow a Dirichlet distribution (Balding & Nichols, 1995). In Bayesian statistics, model choice decisions can be performed using the so-called “Bayes factors” (BF). Given a choice between two models M1 and M2 (say neutral and selection), based on a data set N, the Bayes factor BF for model M2 is given by BF = P(N|M2)/P(N|M1). In this study, a threshold of PO > 3 (substantial) was applied as a norm of selection, corresponding to a posterior probability of 0.76 following Jeffreys’ instructions (Foll & Gaggiotti, 2008). Moreover, a burn-in of 50,000 iterations followed by 50,000 iterations was used for estimation using a thinning interval of 10.

Population structure

Nei’s genetic distance (GD) matrix among eleven populations, with 10,000 bootstrap values, was obtained via AFLP-SURV v1.0 (Sun et al., 2017), and the UPGMA dendrogram was constructed via the CONSENSE module in PHYLIP v3.695 (Christensen et al., 2011). The principal coordinate analysis (PCoA) of 81 genotypes was performed by NTSYS v2.2 and R package Scatterplot3d (Ligges & Mächler, 2002) for the depicture of their hierarchical structure to evaluate the genetic homogeneity of the natural populations. In addition, based on the frequencies computation of the dominant band, the re-allocation test was used to estimate allocation success rates and solve the problem of group membership (Duchesne & Bernatchez, 2002). The two allocation process parameters, minimal log likelihood difference (MLD) and choice of zero replacement value, were set as 2 and (1/(sample size + 1)), respectively. Furthermore, the p-value of MLD per individual was calculated, and individuals were assigned to none of the source populations when their p-value was inferior to the appointed threshold (<0.001). For the sake of illustration of population genetic structure, cluster analysis based on a Bayesian model in STRUCTURE v2.3.4 was performed (Falush, Stephens & Pritchard, 2007). The associated allele frequency and admixture model were set with 200,000 burn-in and 1200,000 Monte Carlo Markov chain (MCMC). We commanded that the range of clustering numbers (K) from 1 to 11, and 10 runs were performed for each permutation. Furthermore, the STRUCTURE HARVESTER (Ortego et al., 2015) was applied to estimate the “optimum K”, which was usually reached when L(K) or ΔK plateaued.

Genetic differentiation

The hierarchical analysis of molecular variance (AMOVA) was performed to detect the partitioning of genetic variance using neutral and positive selection loci at the individuals, populations and regions levels using GenAlEx 6.51 software (Peakall & Smouse, 2012), in which the magnitude of genetic differentiation (Fst) and Shannon differentiation coefficient (Gst) were calculated. The gene flow (Nm) at different levels was conducted according to Nm = (1 − Fst)/4 Fst. The coefficient of variance (θB, analogous to Fst) was also calculated with the inbreeding rate (f, analogous to Fis) using a Bayesian-based approach with HICKORY v1.1 (Fuchs et al., 2016), in which different models (full model, f = 0, θ = 0 and free model) were performed, and the fittest one was selected via parameters The Deviance Information Criterion (DIC). Dbar and pD follow the principle that the model with a better fit to the data (smaller Dbar) may be preferred when the DIC difference is primarily a result of differences in model dimension (pD, Fuchs et al., 2016). The analyses were proceeded taking the follow instructs for all four models: burn-in = 5,000, number of iterations = 100,000 and thinnin = 20. Estimates were made using all studied populations.

Mantel test between eco-environmental and genetic data

To study the impact of environmental conditions on genetic diversity, climatic variables were derived from DIVA-GIS (Rao, 2009) based on geographical coordinates of the studied populations, and the correlation between environmental parameters and genetic diversity indexes were estimated by Mantel test using NTSYS-PC v2.02 (Exeter Software, New York, USA). What’s more, correlation between pairwise GD and geographical distance, which was obtained via Geographic Distance Matrix generator software (Ersts, 2010), was calculated to detect whether the IBD pattern exists.

Results

AFLP polymorphism

A total of 604 unambiguous bands were generated from six AFLP primer pairs, of which 488 (80.79%) exhibited polymorphic patterns and were subsequently employed to analyze the entire set of 81 genotypes (Table 1). The number of bands of each primer pair ranged from 68 to 120 with a mean value of 100.67 and an average of 81.33 (80.79%) polymorphic bands per primer combinations. The PIC per primer pair varied from 0.2442 to 0.3115 with an average of 0.2800. The I value of each primer pair changed from 0.4275 to 0.5031 with a mean value of 0.4672.

Table 1:
Marker parameters calculated for each AFLP primer pair used with P. juncea populations.
Primer code TNB NPB PPB (%) PIC Shannon index (I)
E32M58 91 74 81.32 0.2702 0.4879
E33M64 68 57 83.82 0.2907 0.4712
E40M49 98 71 72.45 0.2442 0.4352
E50M56 120 105 87.50 0.2974 0.4784
E59M62 115 84 73.04 0.2660 0.4275
E84M64 112 97 86.61 0.3115 0.5031
Min 68 57 72.45 0.2442 0.4275
Max 120 105 87.50 0.3115 0.5031
Average 100.67 81.33 80.79 0.2800 0.4672
Total 604 488 80.79 0.2805 0.4669
DOI: 10.7717/peerj.9033/table-1

Note:

TNB, total number of bands; NPB, number of polymorphic bands; PPB, proportion of polymorphic bands; PIC, polymorphism information content.

Outlier identity

We performed two outlier detection methods, BayeScan and Dfdist, with the same data set for comprehensive analysis. We successfully detected a total of 19 (out of 604) AFLP loci in eleven P. juncea populations that possess an atypical variation pattern (outliers), which might be affected by selection. Among the 604 loci, only one (1.66‰) possessed significantly higher values of Log(PO) > 0.5 (posterior probabilities higher than 0.99) in BayeScan (Fig. 2A), which exhibited more divergence than the majority of loci, were considered outlier loci under divergent selection. In Dfdist, 14 loci (2.15%) had significantly higher Fst values that deviated from 99% confidence intervals (Fig. 2B), suggesting a possibility influenced by directional selection. The other five loci showed lower Fst values than expected under neutrality, which may correspond to balancing selection. The outlier that the two methods shared was locus 345.

Outliers detected in: (A) BayeScan and (B) Dfdist.

Figure 2: Outliers detected in: (A) BayeScan and (B) Dfdist.

Population genetic structure

Pairwise population GD was measured using Nei–Li’s coefficients. The exhibited dissimilarities varied from 0.073 to 0.193 with an average of 0.120 (Table S3), suggesting a small degree of genetic variability between studied P. juncea populations. The UPGMA dendrogram divided eleven P. juncea populations into three main clusters according to the mean GD value (Fig. 1) in which Cluster I, exhibiting a bootstrap support rate of 80%, comprised two populations from XJ (PJ03 and PJ08), all three populations from MGL (PJ26, PJ27 and PJ28) and all four populations from KZ (PJ19, PJ20, PJ22 and PJ23). Both Cluster II and Cluster III consisted of only one population from XJ (PJ05 and PJ04, respectively), revealing a high bootstrap value of 90% and 100%.

To obtain a more accurate evaluation of population structure, we applied an Admixture Model analysis using STRUCTURE software, which was based on a Bayesian clustering method. Three main genetic memberships based on population (Fig. 1) were deduced from the pooled data because the LnP(D) was large and the calculation of ΔK was maximized when K = 3 (Fig. S1), indicating that all populations grouped into three clusters. As a result, populations mainly in blue include PJ26, PJ27, PJ28, and populations mostly in green covered PJ19, PJ20, PJ22, PJ23, while PJ03, PJ04, PJ05 and PJ08 were chiefly colored in red, which was highly consistent with their geographical origins. However, the UPGMA clustering analysis indicated that populations or individuals in diverse clusters did not agree well with their geographical distribution. The principle of identification is that more than 0.80 of the membership coefficients were considered pure, and less than 0.80 were considered to be an admixture. The results showed that there were three admixed populations (PJ08, PJ22 and PJ23) out of a total of eleven. The pure probability of the MGL group was 1, whereas the XJ and KZ groups had mixed probabilities of 0.25 and 0.5, respectively.

Principal coordinate analysis (PCoA) was performed to estimate the relationships among individuals in terms of their spatial position relative to three coordinate axes. As shown in Fig. 3, the plot of the first, second and third principal components accounted for 10.65%, 7.52% and 6.51% of the genetic variation, respectively, giving a cumulative variation of 24.68%. For all specific individuals, the combination of the three axes could distinguish among geographic groups (XJ-Group, KZ-Group and MGL-Group corresponding to Clade II, Clade I and Clade III, respectively) to a large extent. Populations in diverse groups were, however, intermixed and could not be separated accurately.

Principle coordinate plot of 81 P. juncea individuals.

Figure 3: Principle coordinate plot of 81 P. juncea individuals.

Genetic diversity, gene flow and Mantel test

Calculation of genetic diversity based on the six AFLP primer combinations confirmed that genetic diversity at the species level based on all loci was high with the values of Ne, Hj and Ho (1.5315, 0.3110 and 0.4669, respectively) assuming HW equilibrium (Table 2), in which population PJ28 held a comparatively higher measurement (Ne = 1.4935, Hj = 0.3828, Ho = 0.4053), whereas population PJ23 exhibited the lowest level (Ne = 1.3307, Hj = 0.3716, Ho = 0.2693). In addition, the XJ geo-group exhibited a slightly higher level (Ne = 1.5411, Hj = 0.3553, Ho = 0.4589) than the other two groups.

Table 2:
Genetic diversity of P. juncea distributed among populations in different sampling regions.
Population NPL PPL (%) Na Ne Hj Ho
PJ03 384 63.58 1.6358 1.4492 0.3791 ± 0.0056 0.3662 ± 0.0122
PJ04 378 62.58 1.6258 1.4286 0.3656 ± 0.0056 0.3552 ± 0.0121
PJ05 359 59.44 1.5944 1.4095 0.3597 ± 0.0057 0.3372 ± 0.0122
PJ08 355 58.77 1.5877 1.4187 0.3810 ± 0.0055 0.3393 ± 0.0124
PJ19 349 57.78 1.5778 1.4027 0.3489 ± 0.0059 0.3300 ± 0.0123
PJ20 351 58.11 1.5811 1.3825 0.3141 ± 0.0064 0.3196 ± 0.0121
PJ22 343 56.79 1.5679 1.4054 0.4161 ± 0.0050 0.3296 ± 0.0123
PJ23 280 46.36 1.4636 1.3307 0.3716 ± 0.0055 0.2693 ± 0.0123
PJ26 378 62.58 1.6258 1.4315 0.3613 ± 0.0055 0.3555 ± 0.0121
PJ27 353 58.44 1.5844 1.4084 0.3459 ± 0.0060 0.3343 ± 0.0123
PJ28 424 70.20 1.7020 1.4935 0.3828 ± 0.0053 0.4053 ± 0.0117
Mean1 359.45 59.51 1.5951 1.4146 0.3660 ± 0.0056 0.3401 ± 0.0122
XJ-Group 501 82.95 1.7997 1.5411 0.3553 ± 0.0056 0.4589 ± 0.0100
KZ-Group 512 84.77 1.8377 1.5316 0.3531 ± 0.0055 0.4564 ± 0.0098
MGL-Group 487 80.63 1.7848 1.5344 0.3550 ± 0.0057 0.4474 ± 0.0106
Mean2 500 82.78 1.8074 1.5357 0.3545 ± 0.0056 0.4542 ± 0.0059
Total 488 80.79 1.9288 1.5315 0.3110 ± 0.1656 0.4669 ± 0.2183
DOI: 10.7717/peerj.9033/table-2

Note:

NPL, Number of polymorphic loci; PPL, percentage of polymorphic loci; Na, observed number of alleles per locus; Ne, effective number of alleles per locus; Hj, Nei’s gene diversity index; Ho, Shannon information index; Mean1, mean value of eleven populations; Mean2, mean value of three geographical groups; XJ-Group, Xinjiang Group; KZ-Group, Kazakhstan Group; MGL-Group, Mongolia Group.

The AMOVA partitioning inferred by neutral loci presented in Table 3 indicated a moderate share of the among-population component (Fst = 0.1106, P < 0.001), while the genetic differentiation coefficient (Gst) suggested that differentiation existed among populations was 25.33% (Table 4). In addition, the genetic differentiation among three regions was low (Fct = 0.0568), whereas the vast majority of variation (87.62%) occurred within populations (Table 3). In contrast, genetic variation was also detected by positive outliers, which showed a higher genetic differentiation (Fst = 0.5099, P < 0.001, Table 3) at the species level and regions level (Fct = 0.3593) than neutral loci.

Table 3:
Analysis of molecular variance (AMOVA) for AFLP data of P. juncea populations.
Group Source of variation df PMV (%) SS MS Est. Var. F-statistic P-value
All populations
Neutral loci Among pops 10 11.06 1,470.372 147.037 9.575 Fst = 0.1106 <0.001
Within pops 70 88.94 5,392.344 77.033 77.033
Total 80 6,862.716 86.608
Positive outliers Among pops 10 50.99 119.136 11.914 1.440 Fst = 0.5099 <0.001
Within pops 70 49.01 96.889 1.384 1.384
Total 80 216.025 2.824
Three regions
Neutral loci Among regions 2 5.68 516.788 258.394 4.998 Fct = 0.0568 <0.001
Among pops within regions 8 6.69 953.584 119.198 5.885 Fsc = 0.0710 <0.001
Within populations 70 87.62 5,392.344 77.033 77.033 Fst = 0.1238 <0.001
Total 80 6,862.716 87.917
Positive outliers Among regions 2 35.93 72.912 36.456 1.120 Fct = 0.3593 <0.001
Among pops within regions 8 19.67 46.224 5.778 0.613 Fsc = 0.3070 <0.001
Within populations 70 44.40 96.889 1.384 1.384 Fst = 0.5560 <0.001
Total 80 216.025 3.117
DOI: 10.7717/peerj.9033/table-3

Note:

df, degree of freedom; PMV, percentages of molecular variance; SS, square deviation; MS, mean square deviation; Est. Var., exist variance; Fst, coefficient of genetic differentiation.

Table 4:
List of Mantel test, genetic differentiation and gene flow for P. juncea groups in different regions.
Variable All populations XJ KZ MGL XJ–KZ XJ–MGL KZ–MGL
Mantel test (r) 0.6024*** −0.1196 0.2712 0.9646 0.3822* 0.4786* 0.4165
Fst 0.1106 0.0472 0.1137 0.0176 0.0538 0.0537 0.0560
Gst 0.2533 0.1721 0.2552 0.1318 0.2594 0.1949 0.2578
Nm 1.4736 2.4052 1.4594 3.2942 1.4278 2.0657 1.4392
DOI: 10.7717/peerj.9033/table-4

Note:

P < 0.05.
P < 0.001.

The r value represent correlations between genetic and geographical distance.

Nei’s GD and pairwise Fst values among the three geographic P. juncea groups are displayed in Table S4. The largest Fst and Nei’s GD value existed between MGL and KZ, while the minimum of Fst and Nei’s GD was detected between MGL and XJ. To further explore the relationship between genetic differentiation and geographic isolation, the Mantel test was conducted at different geographic scales, and AMOVA analysis and gene flow were simultaneously calculated. The result revealed a significant IBD effect (P < 0.05) in the XJ–KZ and XJ–MGL regions (r = 0.3822, P = 0.0249 and r = 0.4786, P = 0.0196, respectively), especially a highly significant relevance at the species level (r = 0.6024, P < 0.001, Table 4).

In addition, high gene flow (Nm), low Fst or Gst were also revealed among populations or regions, which could be the reason of totally high genetic variability. The estimation of Fst by the Bayesian analysis was also implemented (Table 5). As the DIC difference is primarily a result of pD, the model with a better fit to the data (smaller Dbar) was preferred from the f-free mode (DIC = 17587.100, Dbar = 14808.900), which suggested a certain degree of inbreeding in natural populations (f = 0.496, estimate of Fis), and the mean θB (analogous to Fst) in the f-free model was 0.128 (SD = 0.010) among these wild populations, agreeing well with the value of Fst obtained when it was assumed that populations were in HW equilibrium (θB = 0.128 vs. Fst = 0.1106).

Table 5:
Wright’s F statistics calculated for populations of P. juncea.
Models ThetaB f DIC Dbar pD
Mean SD 2.50% 97.50% Mean SD 2.50% 97.50%
Full model 0.085 0.005 0.075 0.096 0.145 0.070 0.026 0.300 17,209.600 14,853.200 2,356.450
f = 0 model 0.077 0.004 0.070 0.084 0.000 17,243.300 14,816.500 2,426.830
Theta = 0 model 0.000 0.341 0.050 0.246 0.442 19,383.500 18,861.600 521.903
f-free model 0.128 0.010 0.111 0.149 0.496 0.288 0.023 0.972 17,587.100 14,808.900 2,778.250
DOI: 10.7717/peerj.9033/table-5

Note:

ThetaBB) is analogous to Wright’s Fst, and f is analogous to Fis. DIC is deviance information criterion.

Individual-based population allocation

The re-allocation of sample genotypes (individual plants) was analyzed with the AFLPOP procedure after filtering of loci according to logical criteria, and a clear pattern is shown in Fig. 4. The result indicated that there were 47 individuals (out of 81, 58.02%) reallocated to their population of sampling origin under the minimal log-likelihood difference (MLD) set at 2. Here an MLD of two indicates that a genotype is allocated to a population if this population is 102 times more likely than any other population; otherwise, the genotype is not allocated, and the procedure is said to have failed. This finding suggested inferior genetic homogeneity within populations along with significant heterogeneity among populations. The proportion of individual genotypes that allocated correctly to their source population ranged from 0.00% (PJ23) to 100.00% (PJ20) with a mean value of 51.63% (Table S5). It is noteworthy that 19 individuals were not allocated to any population origin.

Re-allocation of individuals of P. juncea using AFLPOP.

Figure 4: Re-allocation of individuals of P. juncea using AFLPOP.

Correlations between genetic diversity and climate parameters

In this study, 23 bioclimatic parameters were derived from DIVA-GIS (http://www.diva-gis.org/) (Supplemental Materials, Table S6) based on geographical coordinate of tested populations, and those factors that are significantly correlated with genetic diversity are listed. Most of the climate indexes showed faint but significant positive correlations with GD at the species level, while there were generally high and significant negative coefficients of association in the XJ geo-group (Supplemental Materials, Table S7). In addition, for the MGL geo-group, the result revealed a majority of high but non-significant patterns between climatic parameters and GD. For the KZ geo-group, however, there seems no discernible correlations.

On the other hand, for genetic diversity indexes (Hj, Ho, Ne and NPL), the results presented diverse outcomes at different levels. The population Nei’s genetic diversity (Hj) weakly and positively correlated with the mean temperature of the driest quarter (MTD, Table S7) (r = 0.4139, P < 0.05) at the total natural population level. For the XJ geo-group, the main factor affecting Hj was the mean minimum temperature from May to August (MTmin) (r = −0.9811, P < 0.05). Analogously, most of the environmental parameters highly and positively correlated with Hj for the MGL group, though not significantly. However, there was no correlation between climatic variables and Hj for the KZ geo-group. The Shannon index (Ho), the number of available alleles (Ne) and the number of polymorphic loci (NPL) all demonstrated a significantly positive-low correlation with precipitation of the driest quarter (PD) and precipitation of the coldest quarter (PC) at the species level, while only MTmin showed a significantly negatively impact on NPL for the XJ geo-group (r = −0.3320, P < 0.05). For the KZ and MGL geo-groups, there was no significant association between environmental variables and Ho, Ne and NPL as listed in Table S7. In short, the significant correlation occurs only in the species and XJ geo-group levels, while the high correlation (though non-significant) was pervasive in the KZ and MGL geo-groups.

Discussion

AFLP polymorphisms

The polymorphism rate among the genotypes under exploratory conditions is considered as a key factor in measuring the discriminatory and diversity analysis efficiency of DNA markers (Roldán-Ruiz et al., 2000). Notably, AFLP fingerprinting plays an important role in informatively investigating the genetic divergence, population structure and phylogenetic relationships (Costa et al., 2016). All the AFLP primer combinations utilized in this study provided abundant and definite information and produced high-quality DNA profiles with an average of 80.79% polymorphic bands per primer combination (Table 1), lower than that of previous study by random amplified polymorphic DNA (RAPD) in P. huashanica (95.08%, Wang, Xu & Song, 2005) and inter-simple sequence repeat (ISSR) in P. juncea (85.70%, Liu, 2009). The PIC value is one of the most extensively used indicators for evaluating the discriminative power of markers in most diversity studies (Monfared et al., 2018; Wu et al., 2019). The average PIC value of 0.28 in our study was higher than that obtained in the previous diversity study using SSR markers in P. juncea (PIC = 0.2124, Zhang et al., 2017 and PIC = 0.2337, Zhang et al., 2019), demonstrated good marker discriminatory ability owing to the PIC value ranging from 0 to 0.5 for AFLP markers (Zhang et al., 2018; Laurentin & Karlovsky, 2007). The average Shannon index value (I) of 0.4672 in our study was higher than that estimated by ISSR in related P. huashanica (I = 0.391). According to the average values of PIC and I, primer E84M64 could be considered as an optimum primer combination for P. juncea due to the highest PIC (PIC = 0.3115) and I values (I = 0.5031, Table 1).

Explanation of genetic relationships

To better protect germplasm resources and develop synthetic varieties, investigating genetic relationships and population differentiation of P. juncea populations is of great importance. The GD-based UPGMA dendrogram divided eleven studied populations into three clusters (cluster I, II and III, Fig. 1). Almost all the populations could be clustered into cluster I, except population PJ04 and PJ05 (both originated from XJ). PJ03, PJ04 and PJ05 all had completely consistent genetic memberships in STRUCTURE analyses (Fig. 1), which is why the smaller Fst values between PJ04 and PJ03 (Fst = 0.043) or PJ05 (Fst = 0.032) were observed (Table S3). In addition, heterogeneity of genotypes identified by structure pattern based on individuals also accounted for the dispersion of geographical populations. Comparing to UPGMA clustering, PCoA analysis is more dominant in distinguishing the geographical groups, except two individuals from XJ, were claded with MGL, and another two individuals from KZ were claded with MGL and XJ, respectively, which probably because of the gene flow among three groups confirmed by the Nm values between them (XJ vs. KZ, 1.4278; XJ vs. MGL, 2.0657; KZ vs. MGL, 1.4392, Table 4).

Population genetic diversity

Classically, a high degree of gene flow, which could neutralize interspecific differentiation and intraspecific genetic drift, is extremely common in cross-pollinated plants thereby contributing to low genetic diversity among individuals or populations (Sui et al., 2009). Intraspecific genetic drift could promote the genetic differentiation between populations and alleviate the diversity level when Nm was inferior to one (Hutchison & Templeton, 1999; Govindaraju, 2010). In our study, the low inter-population variance components and genetic diversity were observed with total Fst = 0.1106, Gst = 0.2533 (Table 4), Hj = 0.3110 and Ho = 0.4669 (Table 2), which was less than those parameters of typically outcrossing species by meta-analysis based on dominant markers (Fst = 0.27, Hpop = 0.27) (Nybom, 2004), and was also inferior to previous study of P. huashanica using RAPD (Gst = 0.3263, He = 0.3198) (Wang, Xu & Song, 2005).

Given the reduced genetic diversity as mentioned above, inbreeding effect had to be brought back into consideration, which was reconfirmed by Wright’s F statistics ‘f’ (inbreeding coefficient) under the f-free model (f = 0.496, Fst = 0.128, Table 5). As extreme weather occurs frequently resulted from the interference of human activities such as overgrazing and grassland reclamation, natural habitat fragmentation of plant species in the Central Asian steppe is intensifying. In return, the progress of inbreeding depression is accelerated in spite of a certain degree of gene flow, which cannot completely compensate for the decrease in diversity caused by a higher inbreeding rate (Wesche & Treiber, 2012; Sui et al., 2009; Liu et al., 2013). Additionally, with the rapid climate changing and habitat destruction, long-lived perennial plants were apt to remain low rates of seedling recruitment remains within the population (Ellis, Weekley & Menges, 2007; Chesser & Brewer, 2011). The possible factors such as high grazing pressure, low germinability, soil drought and seed removal by predators were considered to reduce or prevent seedling recruitment, which ultimately resulted in decrease of populations’ size and their genetic diversity (Endels et al., 2007).

Geographical barriers and local adaptation lead to spatial genetic structure

In this study, a highly significantly positive correlation between genetic difference and geographical distance was observed in the Mantel test (r = 0.6024, P < 0.001, Table 4), showing an IBD model over all sampling locations that was comparable with the results of Liu (2009). Usually, more opportunity for allelic exchange could be obtained by individuals of neighboring populations because the gene flow will theoretically be obstructed by a longer geographical distance (Wright, 1946; Zhang et al., 2018). However, it is interesting to observe that significant IBD were observed at XJ vs. KZ or XJ vs. MGL (r = 0.3822 and 0.4786, respectively, P < 0.05) although there was a farther geographical distance between KZ and MGL. This may be caused by genetic discontinuity because of the geographical barriers, such as the Tianshan Mountains and Altai Mountains.

The genetic structure of species is affected by the interaction of multiple factors, such as the transmission model of seeds and pollen, population demographic history, geological events, geographical or ecological barriers and divergent selection of environmental factors (Yang et al., 2017; Ohsawa & Ide, 2008). While a strong gene flow (Nm > 1) may weaken the segregation of species, the hierarchical structure of P. juncea revealed by AFLPs here could scarcely be ascribed to gene drift or inbreeding (Sui et al., 2009; Clark, Wentworth & O’Malley, 2000; Slatkin, 1993). In this study, the distribution of variation among regions, among populations within regions, and within populations inferred by AMOVA-derived F statistics were 0.3593, 0.3070, and 0.5560, respectively (P < 0.001 at all hierarchies, Table 3) using positive outliers, clearly higher than those values based on neutral loci (0.0568, 0.0710 and 0.1238, respectively, P < 0.001), revealing that local adaptations could have occurred at different geographical levels. Because of a varied topography, isolation by environment (max temperature of warmest month, mean temperature of driest quarter, mean maximum temperature from May to August, and so on) would cause the monodirectional variation of individuals in neutral evolutionary process but also promote the directional variation of population structure and the diversity of genotypes (Hou & Lou, 2011). This could explain why XJ populations had the highest Hj and Ho (Hj = 0.3553 and Ho = 0.4589, Table 2) values, and populations from XJ had entirely different genetic memberships in the UPGMA and STRUCTURE analyses.

Conclusions

AFLP markers were applied to study the adaptive genetic differentiation of eleven P. juncea populations in the current investigation. The IBD pattern accompanying with environmental heterogeneity and geographical barrier induced moderate genetic differentiation among populations (Fst) and regions (Fct) of P. juncea. Furthermore, the relatively low genetic diversity of P. juncea might be explained by habitat fragmentation or the low seedling recruitment. In general, additional populations should be collected from diverse eco-sites so as to extend genetic and phenotypic diversity of founder populations used to form synthetic varieties with a broad genetic base in a breeding program.

Supplemental Information

Geographic information of P. juncea populations.

DOI: 10.7717/peerj.9033/supp-1

Primer sequence information.

DOI: 10.7717/peerj.9033/supp-2

Nei’s genetic distance (above the diagonal) among 11 P. juncea populations based on Nei–Li diversity index.

DOI: 10.7717/peerj.9033/supp-3

Nei’s genetic distance (above the diagonal) among XJ, KZ and MGL groups of P. juncea.

DOI: 10.7717/peerj.9033/supp-4

Number of individuals correctly allocated to their source populations.

N, total number of individuals per population; CA, percentage of correct allocation; MA, percentage of mistake allocation

DOI: 10.7717/peerj.9033/supp-5

23 bioclimatic parameters were derived from DIVA–GIS.

DOI: 10.7717/peerj.9033/supp-6

Correlation analysis between genetic diversity and climatic factors by Mantel test.

AMT, Annual mean temperature; MTW, Max temperature of warmest month; MTC, Mean temperature of coldest quarter; MTD, Mean temperature of driest quarter; PD, Precipitation of driest quarter; PC, Precipitation of coldest quarter; MTmax, Mean maximum temperature from May to August; MTmin, Mean minimum temperature from May to August; GD, Genetic distance; Climate (all), comprehensive climate, namely the combination of all climate factors. = P < 0.05, ∗∗ = P < 0.01.

DOI: 10.7717/peerj.9033/supp-7

Raw data scanned from AFLP.

DOI: 10.7717/peerj.9033/supp-8

Optimal cluster (K) obtained from STRUCTURE HARVESTER.

DOI: 10.7717/peerj.9033/supp-9
5 Citations   Views   Downloads