Genetic diversity and population structure of the endangered basal angiosperm Brasenia schreberi (Cabombaceae) in China

Brasenia schreberi J.F. Gmelin (Cabombaceae), an aquatic herb that occurs in fragmented locations in China, is rare and endangered. Understanding its genetic diversity and structure is crucial for its conservation and management. In this study, 12 microsatellite markers were used to estimate the genetic diversity and variation in 21 populations of B. schreberi in China. A total of 61 alleles were found; assessment of allelic richness (Ar = 1.92) and observed and expected heterozygosity (HO = 0.200, HE = 0.256) suggest lower genetic diversity compared to some endangered species, and higher variation was observed within populations (58.68%) rather than among populations (41.32%). No significant correlation between geographical and genetic distance among populations was detected (Mantel test, r = 0.0694; P = 0.7985), which may have likely resulted from barriers to gene flow (Nm = 0.361) that were produced by habitat fragmentation. However, Bayesian and neighbor-joining cluster analyses suggest a population genetic structure consisting of two clusters (I and II) or four subclusters (I-1, 2 and II-1, 2). The genetic structure and distribution of B. schreberi in China may have involved glacial refugia that underwent range expansions, introgression, and habitat fragmentation. The findings of the present study emphasize the importance for both in situ and ex situ conservation efforts.


INTRODUCTION
Brasenia schreberi J.F. Gmelin, a basal angiosperm that belongs to family Cabombaceae (Nymphaeales), is a perennial aquatic plant. Similar to most aquatic plants, B. schreberi can reproduce sexually by outcrossing or asexually through rhizomes and winter buds (Bertin, 1993;Griffin, Mavraganis & Eckert, 2000). It has a wide yet and sporadic geographical distribution in temperate and tropical regions of Asia, Australia, Africa, India, and North and South America (Kim et al., 2012).
Prior to the early 20th century, B. schreberi was widely distributed in China and grew in unpolluted aquatic environments such as freshwater ponds, lakes, swamps, and even agricultural fields. However, in recent decades, its population has significantly decreased due to the loss of natural habitats and deterioration in water quality resulting from excessive human activities, particularly involving leaves harvested for food and increasing the use of fertilizers and pesticides. Previous field investigations experienced difficulty in finding natural populations in regions within China where B. schreberi was previously known to grow in abundance (Gao, Zhang & Chen, 2007). Similar situations have been reported in other countries such as Korea and Japan (Kim et al., 2012). It is currently listed as a critically endangered species in China, belonging to the first category of key protected wild plants (Yu, 1999). Therefore, effective conservation management to preserve the remaining populations of B. schreberi is imperative.
Demonstration of genetic diversity and structure in rare plant species is often crucial for formulation of conservation and management strategies because it provides valuable insights into the vital aspects of demography, reproduction, and ecology (Zaya et al., 2017). Previous studies have employed various molecular markers to assess the genetic diversity of B. schreberi, including inter-simple sequence repeat markers (Zhang & Gao, 2008), randomly amplified polymorphic DNA, amplified fragments length polymorphisms (Kim, Na & Choi, 2008), and nuclear ribosomal spacer and chloroplast DNA sequences (Kim et al., 2012). Despite the significant decrease in the size of B. schreberi populations, these studies reveal significant genetic diversity, highlight factors that negatively influence genetic diversity, and propose potential conservation measures. Zhang & Gao (2008) investigated the population diversity in semi-natural populations of B. schreberi in Zhejiang and Suzhou provinces, whereas Kim et al. (2012) focused on natural populations from South Korea. However, no research investigations on the level and pattern of diversity and genetic structure in the wild populations of B. schreberi in China have been conducted to date.
Compared to other molecular markers, simple sequence repeat (SSR) markers have numerous merits, including co-dominance, high reproducibility, a relatively high level of polymorphisms, and are plentiful in the genome (Liao et al., 2013). SSR markers have been successfully applied to estimate the genetic diversity of other aquatic plants such as Sparganium emersum (Pollux et al., 2007), Zostera marina (Reusch, Stam & Olsen, 2000;Talbot et al., 2016), Sagittaria natans (Yue et al., 2011), Zizania latifolia (Chen et al., , 2017, Nymphoides peltata (Liao et al., 2013), Isoetes hypsophila (Li et al., 2013), Ruppia cirrhosa (Martínez-Garrido, González-Wangüemert & Serrão, 2014), Nuphar submersa (Shiga et al., 2017), and Ottelia acuminata (Zhai, Yin & Yang, 2018). Here, we used 12 microsatellite loci to detect and estimate the genetic variation of B. schreberi in China. In this study, a total of 21 populations, representing nearly the entire known natural distribution zones in China, were sampled. We aimed to determine (1) how the extent of genetic diversity is apportioned within and among populations of B. schreberi, and (2) the genetic structure and its association with geographical distribution. The findings of the present study may be utilized in conservation efforts on this species.

Sample collection
During June-September 2016, fresh leaves of 376 individuals from 21 locations across almost the entire geographical distribution of B. schreberi in China were collected (Fig. 1) and rapidly dried in silica gel until further analyses. We sampled a range of 14-20 individuals per site (Table 1). Because B. schreberi could reproduce asexually through its stolons, we selected leaves from plants within populations that were separated by at least five metres to reduce the collection of clonal individuals. Voucher specimens from each population were deposited in the herbarium of Wuhan Botanical Garden, Chinese Academy of Sciences.

DNA extraction, amplification, and sequencing
Total genomic DNA was extracted from silica-dried leaves using the modified cetyltrimethylammonium bromide method described by Doyle & Doyle (1987). SSRs were genotyped using 12 polymorphic SSR loci from previous study (Liu et al., 2016) using polymerase chain reaction (PCR). The total reaction volume was 25 mL, which consisted of 0.25 mM of each dNTP, five mL of 10 Â Taq buffer (10 mM Tris-HCl, (pH 8.3), 1.5 mM MgCl 2 , and 50 mM KCl), one mM of each forward primer labeled with a fluorescent chemical and an unlabeled reverse primer, one U of Taq polymerase (TransGen Biotech Co., Beijing, China), and 10-30 ng of genomic DNA as template. PCR amplification was performed with a PTC-100 thermocycler (Bio-Rad, Hercules, CA, USA) using the following program profile: 95 C for 3 min; followed by 30 cycles of 95 C for 30 s, annealing at 55-60 C for 30 s (depending on primer), and 72 C for 40 s; and a final extension at 72 C for 10 min. The PCR products were separated via electrophoresis on 1.0% (w/v) agarose gels, stained with ethidium bromide, and observed under UV light. Then, the multiplex amplified PCR products were sequenced on an ABI prism 3730xl and sized using an internal DNA standard (Rox-500; Life Technologies, Shanghai, China). The SSR fragments were visualized with GENEMAPPER v.4.0 (Applied Biosystems, Foster City, CA, USA).
For each population, the average number of alleles (N A ), N E , the number of private allele (A P ), H O , H E , I and, the pairwise F ST between each pair of populations were estimated using GENALEX ver. 6.0 (Peakall & Smouse, 2006). Allelic richness (Ar) and inbreeding coefficients (F IS ) were calculated using the diveRsity (Keenan et al., 2013) packages and confidence intervals were estimated with 10,000 bootstrap replicates. Gene flow (Nm) among populations was calculated using the expression Nm = (1 -F ST )/4F ST (Slatkin, 1993). Analysis of molecular variation was used to evaluate the relative level of genetic variations among groups (F CT ), among populations within groups (F SC ), and among individuals within populations (F ST ); these values, and the significance of each value, were tested using Arlequin ver. 3.5 (Excoffier, Smouse & Quattro, 1992;Excoffier, Laval & Schneider, 2005).
The neighbor-joining (NJ) tree was constructed using the software TreeFit (Kalinowski, 2009) based on Nei's genetic distance (D A ; Nei, 1987) to reveal the genetic relationships among the populations used in this study. The correlation between genetic distance and geographic distance was estimated using a Mantel test based on a matrix of genetic distance using D A and a pairwise matrix of geographic distance. Isolation by Distance Web Service ver. 3.23 (Jensen, Bohonak & Kelley, 2005) was used to test for significance with 10,000 permutations.
The BOTTLENECK program (Piry, Luikart & Cornuet, 1999) was used to estimate the possible influence of recent demographic changes on genetic diversity and identify genetic bottlenecks among populations. Wilcoxon signed-rank tests were conducted using three different models with 10,000 replicates: the infinite allele mutation model (IAM), the stepwise mutation model (SMM), the two-phased model of mutation (TPM) which with 70% of mutations were assumed to occur under the SMM, and 30% were assumed to occur under the IAM. The mode shift of each population was also estimated using BOTTLENECK using default settings.
The population structure was tested using STRUCTURE ver. 2.3.1 (Pritchard, Stephens & Donnelly, 2000) based on a Bayesian clustering method. The approach was used to cluster genetically similar individuals and assess the most likely clustering state under the Hardy-Weinberg principle. The optimum number of genetic clusters (K) was tested from K = 2 to 20 clusters based on assuming admixture, correlated allele frequencies.
The analysis was performed for 10 iterations, with a burn-in of 80,000 replications, followed by 800,000 Markov Chain Monte Carlo replications. The most likely number of clusters was verified based on the ÁK method (Evanno, Regnaut & Goudet, 2005). To generate a consensus K, independent runs of all data were normalized with CLUMPP ver. 1.1.2 (Jakobsson & Rosenberg, 2007) using the Greedy algorithm with 100,000 repeats. DISTRUCT ver. 1.1 (Rosenberg, 2004) was used to visualize the population structure.

Microsatellite variation and genetic diversity within populations
Null alleles were observed in a few loci (BS02, BS06, BS09, BS18, and BS19). A total of 61 alleles were detected across 12 SSR loci in 376 individuals from 21 populations ( Table 2). The number of alleles generated by each marker ranged from two at locus BS18 to 13 at locus BS05, with an average of 5,167 alleles at each locus. Ne for each locus ranged from 1.075 to 5.710. Ho ranged from 0.003 to 0.955, and He ranged from 0.07 to 0.826. High F ST values (F ST = 0.152-0.710, mean = 0.393) and PIC values (PIC = 0.483-0.854, mean = 0.591) were detected at all loci. The average F IS across all loci was 0.341. Nm per locus varied between 0.102 in locus BS09 and 1.393 in locus BS18 (Table 2).  The average number of alleles per population ranged from 1.5 ± 0.151 (QYSYH) to 2.667 ± 0.284 (YTLHS). Six private alleles, distinctive to a specific population, were observed in population YTLHS, three were recorded in population LBMH, and two were recorded in population GDSYD, whereas other populations, including SZDSZ, SZLTS, YNTC, MSGH, CLHL, and TWYL, had a single private allele each (Table 3). Ar ranged from 1.49 to 2.58. H O and H E ranged from 0.093 ± 0.083 to 0.354 ± 0.137 and from 0.139 ± 0.056 to 0.398 ± 0.051, respectively. I ranged from 0.220 ± 0.081 to 0.660 ± 0.091. F IS ranged from -0.349 to 0.666.
The Wilcoxon signed-rank tests revealed significant bottlenecks in 11 populations based on the IAM, SMM, and TPM assumptions, and normal models of distribution (L-shaped) for allelic frequencies were exhibited in other populations (Table 3).

Genetic diversity and differentiation among populations
The ÁK values were calculated to assess the optimal number of genetic clusters (K) in the population structure. Analyses using STRUCTURE gave the highest value for two distinct genetic clusters (K = 2) and a second possibility was K = 4; both of these displayed biological significance (Fig. 1). We analyzed the bar-plot outputs using the best K-value, where K = 4 further indicated deep genetic structure. At K = 2, cluster I included nine populations from Hunan Province (CLHL, GDSYD, MSGH), Zhejiang Province (HZXH, HZTJH), Jiangsu Province (SZDSZ, SZLTS), Jiangxi Province (YTLHS), and Taiwan (TWYL). At K = 4, each cluster defined above were further split into two subclusters. The rest of the populations were from cluster II, which were also separated into two subclusters (I-1, 2 and II-1, 2). Further population structure analysis indicated differentiation between two clusters (I and II) or among four subclusters (I-1, 2 and II-1, 2).
Under the D A distances (Nei, 1987), the NJ dendrogram (Fig. 2) indicated the genetic relationships among the studied populations (Table S2). Two main groups were also depicted, corroborating the STRUCTURE results, and each group had two subgroups. For example, the subgroup that included four populations (QYSYH, QYBSZ, SCHSY, and NDXFS) coincided with the results obtained from STRUCTURE; moreover, these populations were located in the same geographic region. These findings were suggestive of genetic differentiation among populations from southwest China, central China, and southeast China.
Analysis of molecular variation indicated substantial genetic variations among groups, populations, and individuals (Table 4). Of the total genetic diversity, the main components of genetic variation were from individuals within populations (58.68%), whereas 41.32% was attributable to variations among populations. When K = 2/4, differentiation was attributed to individuals within populations and among groups at 55.60%/56.19% and 10.71%/18.23%, respectively; the rest of the observed differentiation (33.69%/25.58%) was observed among populations.
No significant isolation by distance was detected by the Mantel test (r = 0.0694; P = 0.7985), suggesting that genetic differentiation in populations might not be the result of isolation by geographic distance (Fig. 3).  Two genetic clusters were revealed in B. schreberi in China by both Bayesian clustering and NJ cluster analyses, but with low bootstrap supporting values in NJ dendrogram, which were probably as a result of low genetic variation existed among populations. Within each genetic cluster, two subclusters were also detected, and the major genetic subclusters generally coincided with geographical regions. Several factors such as long-term survival in glacial refugia with some population expansion or long-distance dispersal, introgression, and population fragmentation may have contributed to the population genetics structure of B. schreberi in China. Analysis of molecular variation indicated substantial among-population differentiation (F ST = 0.413, P < 0.001) of B. schreberi in China. Furthermore, Mantel testing did not detect any significant isolation by distance (r = 0.0694, P = 0.7985), thus suggesting that geographic distance might not be the main factor causing genetic differentiation among B. schreberi populations. The significant genetic differentiation found in B. schreberi populations can probably be explained by its long evolutionary history in glacial refugia. The current populations of B. schreberi in China are mostly located in biodiversity hotspots and centers of endemism for plants in China; for example, populations YNGLG and YNTC, LBMH and MSGH, GDSYD, CLHL, YTLHS, XGLHT, and NXQZS are distributed in the eastern Himalayan region, the Hengduan Mountain region, and the Nanling region, respectively, three of the world's biodiversity hotspots (Myers et al., 2000).
Populations CQSZ and LCFBS were located in the "Eastern Sichuan-Western Hubei" region, an endemic area for Chinese flora. These regions were not directly affected by repeated Pleistocene glaciations; thus, these regions may have served as major glacial refugia for plant species in China (Wang & Ge, 2006;Qiu, Fu & Comes, 2011). The great diversity in topography, climate, and ecological conditions of these regions may have provided opportunities for population differentiations of B. schreberi. This pattern of extensive population differentiation has also been revealed in various plant species that are distributed in similar geographic regions as B. schreberi (Qiu et al., 2009;Lei et al., 2012;Shi et al., 2014;Sun et al., 2014). Lei et al. (2012) investigated the phylogeography of the Chinese beech, Fagus engleriana, in subtropical China using sequences of two chloroplast intergenic spacers, which indicated that most genetic variations occur among populations (G ST = 0.831, N ST = 0.855), suggesting that long-term isolation of F. engleriana populations among multiple refugia during the Pleistocene climatic changes might be the main driving factor for its population divergence. Shi et al. (2014) studied the phylogeography of Castanopsis eyrei, a widely distributed tree of subtropical evergreen broad-leaved forests in China. In their study, both nuclear microsatellite and cpDNA data indicated high levels of population differentiation (SSR: F ST = 0.443; cpDNA: G ST = 0.709, and N ST = 0.729). They observed that the patterns of genetic differentiation between the extant populations of Castanopsis eyrei may have been affected by topographical differences between the mountainous western and lowland eastern refugia. STRUCTURE and NJ tree analyses indicated that B. schreberi populations were comprised of two distinct genetic clusters. One of the clusters included a total of 12 populations, which were further divided into two subclusters. Another cluster included the other sampled populations, which were also separated into two subclusters. These two major genetic clusters were not always grouped by geographical region; for example, the populations of NXQZS, XGLHT, QYBSZ, QYSYH, SCHSY, and NDXFS of cluster II are located in Southeast China (Nanling regions), whereas the rest of the populations in this cluster are located in Southwest and Central China. Specifically, populations NXQZS and XGLHT from the Southeast China region were subclustered with the populations from the Southwest and Central China regions (e.g., populations YNTC, YNGLG, LBMH, CQSZ, LCFBS, and ZYFS). Population TWYL from Taiwan Island was included in cluster I despite being geographically separated from the other populations in the Nanling region (e.g., the populations of MSGH, GDSYD, CLHL, and YTLHS). However, most of the subclusters coincided with specific geographical regions. For example, populations QYBSZ, QYSYH, SCHSY, and NDXFS from Zhejiang and Fujian provinces, with their short geographic distances, formed a subcluster. Similar geographical groups were found in populations HZTJH, HZXH, SZLTS, and SZDSZ from Zhejiang and Jiangsu provinces, and populations MSGH, GDSYD, CLHL, and YTLHS from Hunan, Guangdong, and Jiangxi provinces. The admixture pattern of populations of the two main genetic clusters of B. schreberi revealed in this study may be the result of long-distance dispersals. However, considering the highly fragmented modern populations of B. schreberi in China, we would not propose contemporary long-distance migrations of this species; instead, we suggest that the long-distance dispersal events are likely to represent historical migration events such as the Pleistocene interglacial/postglacial expansions from different refugia.
During the Pleistocene, the climate in subtropical China was characterized by significant cold-warm alternations, which have significantly affected plant population dynamics (Qi et al., 2012). Uplifting of mountains in the subtropical regions of China began in the Pleistocene (Ming, 1987;Tang et al., 1994;Luo et al., 2010). For example, the uplifting of the Yunnan-Guizhou Plateau began in the early Pleistocene and had a fast and dramatic uplift in the middle Pleistocene (Tang et al., 1994); furthermore, the Wuyi mountains were uplifted in the late Pleistocene (Luo et al., 2010). During periods of weak tectonic movements and/or more favorable climate (e.g., the Pleistocene interglacial periods), there is evidence indicating that numerous plant species underwent range expansions in subtropical China such as Sagittaria trifolia (Chen et al., 2008), Cercidiphyllum japonicum (Qi et al., 2012), Pteroceltis tatarinowii (Li et al., 2012), Tetracentron sinense (Sun et al., 2014), Quercus glauca , and Sargentodoxa cuneata (Tian et al., 2015). As a clonal aquatic herb with both sexual and asexual reproductive modes, B. schreberi should have been abundant enough to disperse in subtropical China during these periods. The distinct TWYL population on the island of Taiwan may have emerged after long-distance dispersal from the adjacent continent because land bridges between Taiwan and the Asian mainland formed during the Pleistocene, in line with changes in sea levels (Huang & Lin, 2011). However, the subtropical regions of China were subsequently fragmented by the intense topographical changes involving mountains, which in turn may have brought about genetic drift and/or extinction of isolated populations.
Range expansions may result in low levels of genetic diversity in newly established populations through a founder effect or a bottleneck effect (Excoffier, Foll & Petit, 2009). Our investigation of genetic diversity and bottlenecks in B. schreberi populations in China revealed low levels of genetic diversity within populations, and half of the studied populations might have experienced significant bottlenecks; this suggests that some of the B. schreberi populations in China might have been established by range expansion events. However, multiple range expansions from different refugee populations could trigger introgression or admixture and generate hybrid zones among populations in contact (Excoffier, Foll & Petit, 2009). Populations undergoing introgression and hybridization are mainly characterized by more admixed genotypes. STRUCTURE analysis of the B. schreberi populations showed that the genetic patterns of hybridization and introgression potentially involve subcluster I-2 (e.g., populations SZDSZ, HZTJH, SZLTS, and HZXH) and subcluster II-2 (e.g., populations SCHSY, QYBSZ, QYSYH, and NDXFS). A high number of individuals from these two subclusters and from populations TWYL and CLHL contained admixed genotypes, with their Q values supporting membership in both clusters. The populations from the subcluster I-2 and subcluster II-1 locations and the populations TWYL and CLHL might be the result of northward or eastward expansions of the southern, western, or central China populations. Similar dispersals such as the northward expansion from the Nanling Mountains along the Luoxiao Mountains to Dabie Mountain and the north-eastward expansion from the Nanling Mountains to the mountains of eastern Zhejiang and Fujian provinces along the coastline are evidenced by the extensive distribution of woody deciduous tree, Sargentodoxa cuneata (Tian et al., 2015).

CONCLUSION AND IMPLICATIONS FOR CONSERVATION
Preservation of genetic variation is a principal target of conservation biologists for endangered species because it is critically important to maintain their evolutionary potential to survive in an ever-changing environment (Shao et al., 2015;Hu et al., 2010). To our knowledge, this is the first study that includes almost all known natural populations of B. schreberi in China, and therefore, the results generated in this study may facilitate the development of an effective conservation program for this species in China and other areas around the world. Compared to other endemic aquatic plants, this species shows low genetic diversity and significant genetic differentiation. The long-term evolutionary history in glacial refugia, the introgression following the range expansions, and the habitat fragmentation might have played a vital part in the genetic differentiation of this species. In China, B. schreberi comprises four genetically distinct subgroups, which should be treated as separate units in conservation programs. For an ex situ conservation procedure, sampling more regions where the distinct genetic subgroups are located is suggested to cover more genetic diversity. Recent anthropogenic intervention through polluted water environments or deforestation might be the principal factor that is responsible for the current endangered state of B. schreberi. Its habitat has been damaged by human activities, including the widespread use of herbicides in agriculture (Dong, 2010) and overexploitation for profit. Essentially, in our field investigation and in previous reports (Dong, 2010), water pollution has worsened in habitats near residential areas. In addition, intense habitat fragmentation has also occurred in the field. Hence, protection of the remaining habitats of B. schreberi should be a priority of conservation programs.
Although great efforts have been made to reveal the population genetic structure of B. schreberi in China, samples from other regions around the world are limited. In addition, our DNA sequence-based analyses of the phylogeographic structure of B. schreberi revealed low polymorphism in the nuclear internal transcribed spacer (ITS) region and 20 chloroplast non-coding regions in six individuals sampled from different populations in China (Fig. S1). Thus, the evolutionary history of B. schreberi in China remains unclear. Additional samples from other places around the world and high throughput next-generation sequencing techniques (e.g., restriction-site-associated DNA sequencing or genotyping-by-sequencing) can be used to construct a more extensive phylogeographic history of B. schreberi in China, which could lead to more robust conservation strategies.