Analysis of genetic population structure and diversity in Mallotus oblongifolius using ISSR and SRAP markers

Background Mallotus oblongifolius, an evergreen shrub endemic to Hainan Island, China, is important both medicinally and economically. Due to its special medicinal significance and the continuing rise of market demand, its populations in the wild have been subject to long-term illegal and unrestrained collection. Hence, an evaluation of genetic variability is essential for the conservation and genetic reserve development of this species. Methods Sequence-related amplified polymorphism (SRAP) and inter-simple sequence repeat (ISSR) markers were employed to assess the genetic diversity and genetic structure of 20 natural populations of M. oblongifolius growing in different eco-geographical regions of Hainan Island, China. Results We revealed a considerable genetic diversity (h = 0.336, I = 0.5057, SRAP markers; h = 0.3068, I = 0.4657, ISSR markers) and weak genetic differentiation (Gst = 0.2764 for SRAP, Gst = 0.2709 for ISSR) with the same gene flow (Nm = 1.3092 for SRAP, Nm = 1.346 for ISSR) among the M. oblongifolius populations. The Mantel Test showed that the distribution of genetic variation among populations could not be explained by the pronounced geographical distances (r = 0.01255, p = 0.5538). All results of the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), Neighbor-joining (NJ), Principal Coordinate Analysis (PCoA) and Bayesian analyses supported a habitat-specific genetic clustering model for M. oblongifolius, indicating a local adaptive divergence for the studied populations. Discussion We suggested that the habitat fragmentation and specificity for M. oblongifolius populations weakened the natural gene flow and promoted an adaptation to special habitats, which was the main reason for local adaptive divergence among M. oblongifolius.


INTRODUCTION
Mallotus oblongifolius is a tropical plant belonging to Euphorbiaceae. This plant, native to regions of Hainan Island in China, is economically important and medicinally relevant.
Mallotus oblongifolius is an evergreen shrub or small tree (2-10 m), mostly dioecious and rarely monoecious (Lin & Zhou, 1992). Its leaves, which are commonly known as "Zhe Gu Cha," have been used in a popular herbal tea to promote digestion. They are also used as a folk medicine for the treatment of cholecystitis. Various other biological properties of M. oblongifolius leaf extracts have been recently reported, and extracts have been shown to have anti-atherosclerotic (Yueli et al., 2011), analgesic and antioxidant properties. Due to its special medicinal significance, M. oblongifolius is known as "glossy ganoderma" in China. During the field investigations of this species, we observed considerable morphological variation. Such variation included leaf color, ranging from purple-red, red/pale red to pale green/green/dark green; size and width of leaves. Thus, to ensure that this plant is being used sustainably in a way that preserves phenotypic diversity, its germplasm must be collected, conserved and assessed at the phenotypic, genetic and molecular levels.
In previous studies of M. oblongifolius, researchers mainly focused on effective chemical composition analysis (Wei et al., 2004), product processing and pharmacology values. As a result, little information exists about M. oblongifolius diversity, germplasms and breeding. Furthermore, due to its special medicinal significance, wild populations have long been subjected to illegal and unrestrained collection as a result of the continuing rise of market demand, and the number of M. oblongifolius individuals has dropped dramatically. In our previous field investigations, healthy populations were hard to find. Additionally, some M. oblongifolius habitats have been exploited as tourist areas or have been reclaimed for agriculture, resulting in the disappearance of wild populations. Thus, the abundance and genetic variability of this important species may soon vanish if proper action is not taken. An evaluation of genetic variability is essential for the conservation and development of genetic reserves for this species. To ensure their success, these conservation strategies (in situ and ex situ) must be based on scientific practices (Bruni et al., 2013;Gentili et al., 2015). The analysis done here represents a preliminary step for selecting breeding material and establishing conservation strategies.
A molecular marker is a genetic marker based on nucleotide sequence variation in genetic material between individuals, and is a direct reflection of genetic polymorphism at the DNA level. It is a powerful tool for estimating the characteristics of genetic diversity and distinguishing among individuals from different sources (EL-Bakatoushi & Ahmed, 2018;Nair et al., 2018;Kumar & Agrawal, 2017;Zheng et al., 2012). Sequence-related amplified polymorphism (SRAP) and inter-simple sequence repeats (ISSRs) markers are dominant markers based on PCR and have been widely used in different genetic diversity studies (Xing et al., 2016;Gaafara et al., 2017;Sun et al., 2018;Khalifa et al., 2016). Sequence-related amplified polymorphism is a PCR-based molecular marker developed by Li & Quiros (2001), which uses a unique dual primer design to amplify specific regions of open reading frames (ORFs). It has been used in genetic diversity analysis and genetic map construction studies for a wide range of plants. Inter-simple sequence repeat is a similarly powerful tool for genetic mapping, gene tagging, germplasm resource identification, genetic distance, genetic diversity and molecular marker-assisted breeding research. Inter-simple sequence repeat markers consist of repeating units of one to six base pairs, targeting regions between two simple sequence repeat (SSR) sequences (Zietkiewicz, Rafalski & Labuda, 1994).
In this study, we used SRAP and ISSR markers to assess the genetic diversity and genetic structure of 20 natural populations of M. oblongifolius on Hainan Island, China. The targets of this study were: (i) to assess the level of genetic diversity and genetic differentiation across M. oblongifolius populations; (ii) to characterize the genetic structure within the populations and (iii) to assess the correlation between the two markers. These results could benefit M. oblongifolius germplasm collection, conservation and future breeding.

Plant materials and DNA extraction
In total, 371 adult individuals of M. oblongifolius were collected from 20 naturally occurring populations on Hainan Island, China from January 2016 to May 2017 ( Fig. 1; Table 1), covering almost the entire range of species distribution on Hainan Island. For the convenience of the study, their names have been assigned arbitrarily, with populations BW, DPC, GMS, GPS, GX, SJ, SSC, TX, XP and ZCED along stream banks; populations DSL, JF, PJL, QXL and TGL along the mountainside and populations BSLLX, LCNC, LTCC, NPNC and ZJC at the foot of the mountain. The longitude and latitude of each  Table 1 for explanation of population abbreviations). Sampled populations from different habitats are marked with different colors: populations along stream banks (dark-green circles), populations along the mountainside (blue triangle), populations found at the foot of the mountain (green square).
Full-size  DOI: 10.7717/peerj.7173/ fig-1 population were recorded using a global positioning system (GPS; UniStrong, Beijing, China). Eight to 22 individuals from each of the 20 populations were used for SRAP and ISSR analyses. Individuals were randomly sampled from known geographic areas of each population and were spaced at least five meters apart to minimize sampling of progeny. A modified cetyltrimethylammonium bromide method (Li et al., 2010) was used to extract genomic DNA from fresh and healthy leaves of each M. oblongifolius population. The value of absorbance of the extracted DNA was measured using an ultraviolet spectrophotometer, and its integrity was detected with agarose gel electrophoresis. DNA samples were diluted to a final concentration of 20 ng/mL for PCR analysis.

SRAP-PCR analysis
Among 100 pairs of SRAP primer combinations, 14 pairs with good repeatability and high polymorphism were selected to amplify the genomic DNA of the 371 accessions from the 20 naturally occurring populations of M. oblongifolius by SRAP-PCR (Table S1). According to Yan et al. (2018), the optimum SRAP-PCR reaction system (20 mL) includes 2.0 mL 10 Â PCR buffer, 3 U Taq DNA polymerase, 150 mmol/L dNTPs, 0.4 mmol/L each of SRAP forward and reverse primers and 40 ng genomic DNA. PCR reactions were carried out in a Heal Force T960 Thermocycler (Biometra, Shanghai, China) under the following conditions: initial denaturation at 94 C for 5 min followed by five cycles of denaturation at 94 C for 1 min, annealing at 35 C for 1 min and extension at 72 C for 1 min. For the next 35 cycles, denaturation at 94 C for 1 min, annealing at 48 C for 1 min and extension at 72 C for 1 min; a final extension step at 72 C for 10 min. The amplification products were resolved on 2% agarose gels.

ISSR-PCR analysis
Among 100 ISSR primers, 10 primers with good repeatability and high polymorphism were selected to amplify the genomic DNA of the 371 accessions from the 20 naturally occurring populations of M. oblongifolius by ISSR-PCR (Table S2). The optimum ISSR-PCR reaction system (20 mL) includes 2.0 mL 10 Â PCR buffer, 1.5 U Taq DNA polymerase, 300 mmol/L dNTPs, 0.6 mmol/L ISSR primers and 80 ng genomic DNA. A Heal Force T960 Thermocycler (Biometra, Shanghai, China) was used for ISSR-PCR. PCR cycling conditions were: initial denaturation at 94 C for 4 min followed by 40 cycles of denaturation at 94 C for 40 s, annealing at 50 C for 45 s (for annealing at primer-specific temperatures, see Table S2) and extension at 72 C for 2 min; a final extension step at 72 C for 8 min. The amplification products were resolved on 2% agarose gels.

Data analysis
Bands were scored manually. According to the weight of the DNA ladder (100 bp), the same weight bands were marked as a line. The bands that were clearly visible and repeatable on the electrophoresis map were marked as "1" and the absence of a band at the same site was marked as "0". A binary data matrix was compiled for each primer set. Both the total number of bands amplified by each primer and the number of polymorphic bands were calculated. Most of the following analyses were carried out based on adjoined matrix data. The genetic diversity of different M. oblongifolius populations was evaluated by calculating the number of alleles (na), effective number of alleles (ne), Nei's gene diversity (h), Shannon's information index (I) and percentage of polymorphism bands (PPB). These parameters (na, ne, h, I, PPB) were calculated using POPGENE (version 1.32) software (Yeh et al., 1999). The total genetic diversity (Ht), the mean within-population genetic diversity (Hs), genetic differentiation coefficients among different populations (Gst) (Nei, 1974) and gene flow number (Nm, Nm ¼ 0.5(1 -Gst)/Gst) were calculated according to the gene frequency matrix using POPGENE (version 1.32) software. Genetic similarity (GS) coefficients and genetic distance (GD) among the 20 M. oblongifolius populations were calculated using Jaccard's similarity coefficients (Sneath & Sokal, 1973). The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and Neighbor-joining (NJ) trees based on the GD were used for cluster analysis on Molecular Evolutionary Genetics Analysis (MEGA) version 6.06 (Tamura et al., 2013). Principal Coordinate Analysis (PCoA) based on the GS matrix was performed using NTSYS-pc version 2.1 (Rohlf, 2000) to detect the genetic relationships among populations. The MXComp program of NTSYS-pc version 2.1 was used to test ISSR and SRAP data by Mantel (Mantel, 1967) statistics. Correlations between the two GD matrices were compared and a significance test was conducted. A Mantel test was performed to estimate a correlation between the matrices of Nei's (1978) GDs and of geographical distances using NTSYS-pc version 2.1. Geographic distances among populations were measured based on the latitude and longitude coordinates of the populations.
The population structure of M. oblongifolius was studied further. A calculation based on a Bayesian model was performed on the combined SRAP and ISSR data in STRUCTURE program version 2.3.4 (Pritchard, Stephens & Donnelly, 2000) to detect the number of population clusters (K). The most likely number of clusters (K) was estimated using STRUCTURE HARVESTER (Earl & Vonholdt, 2012), which determines the optimal K depending on the probability of the data for a given K and the DK (Evanno, Regnaut & Goudet, 2005).

Polymorphism analysis of SRAP and ISSR amplified products
The number of plants sampled per population of M. oblongifolius is shown in Table 1. Fourteen SRAP primers and 10 ISSR primers were selected to study genetic diversity and genetic differentiation of M. oblongifolius. One hundred and ninety-seven total bands were produced by 14 SRAP primers, of which 164 bands (83.24%) were polymorphic (see Table S1). Nine to 20 bands and three to 19 polymorphic bands were yielded for each primer pair. The average number of bands and polymorphic bands for each primer pair was 14.1 and 11.7, respectively. For the ISSR markers, the 10 primers generated 141 bands, and each primer yielded 10-17 bands (see Table S2). There was an average of 14.1 bands per primer set, and 12.4 polymorphic loci were obtained (87.94%). These results show that both SRAP and ISSR markers can effectively reveal the polymorphism among materials. Mantel's test showed that at the population level, the analysis results of SRAP and ISSR were highly and significantly correlated (r ¼ 0.7804, p ¼ 0.01) (Fig. 2). Thus, SRAP and ISSR are highly consistent and reliable tools for analysis of both the genetic diversity and the population structure of M. obongifolius. However, ISSR exceeded SRAP in the ability to detect genetic polymorphism with higher resolving power. The PPB ranged   1.65142 and the effective number of alleles per locus (ne) ranged from 1.2714 to 1.5012, with an average of 1.38099. At the species level, na, ne, h, I and PPB were calculated to be 2%, 1.5186%, 0.3068%, 0.4657% and 87.94%, respectively. The ISSR markers revealed the highest genetic diversity in population XP (p ¼ 80.85%, h ¼ 0.2722, I ¼ 0.4087), followed by populations TX and SJ, and the lowest genetic diversity was found in population PJL. According to the SRAP markers, population TX had the highest genetic diversity (p ¼ 79.19%, h ¼ 0.2725, I ¼ 0.4074), followed by populations ZJC and SJ, and the lowest genetic diversity was found in population PJL. The genetic diversity demonstrated by the SRAP marker was similar to that revealed by the ISSR marker. At the species level, h, I and P, meaning the genetic diversity, were significantly higher than at the population level. Each index exhibited a small change among populations, and there is a similar level of genetic diversity within each population. Based on SRAP and ISSR data, the total genetic diversity (Ht) and mean within-population genetic diversity (Hs) of M. oblongifolius were calculated, as well as the genetic differentiation coefficients among different populations (Gst) (see Table 4). Nearly 27% of the total genetic variation in the 20 populations of M. oblongifolius occurred among populations (Gst ¼ 0.2764 for SRAP, Gst ¼ 0.2709 for ISSR), rather than within. Both SRAP

Cluster and population structure analysis
Both the GD and the GS among populations were calculated with POPGENE (version 1.32) using SRAP and ISSR data (see Tables S3 and S4 Genetic relationships among the populations were constructed by UPGMA cluster analysis based on a GD matrix using the SRAP and ISSR markers. The results show that the two markers completely divide the 20 populations of M. oblongifolius and that the clustering results had certain similarities but were not identical. The clustering of M. oblongifolius is not significantly related to its geographical origin. For example, the population GMS and population GPS with the closest geographic sources are clustered in large branches, but they are not clustered together in small branches. However, the dendrogram shows a spatial pattern of GD among the populations aligning with their habitat status. A dendrogram (Fig. 3A) that grouped the 20 populations into two main groups (Group A1 and Group B1) using the SRAP data. Group A1 consisted of populations BSLLX, BW, DPC, GMS, GPS, GX, LCNC, LTCC, NPNC, SJ, SSC, TX, XP, ZCED and ZJC, which were distributed along stream banks and the dark damp regions at the mountain bases. Three sub-groups (A1-1, A1-2 and A1-3) were identified within Group A1. Populations DSL, JF, PJL, QXL and TGL were distributed along the mountainside and tightly grouped into Group B1. The cluster analysis based on the ISSR data generated a unique dendrogram that divided the 20 populations into two main groups (Group A2 and Group B2) (Fig. 3B). Group A2 consisted of populations BW, DPC, GMS, GPS, GX, SJ, SSC, TX, XP and ZCED, whereas the other populations fell into Group B2. Two sub-groups (A2-1 and A2-2) were identified within Group A2 and two sub-groups (B2-1 and B2-2) were identified within Group B2. This was similar to the results using the SRAP markers,

Note:
Ht: The total genetic diversity; Hs: The mean within-population genetic diversity; Gst: The genetic differentiation coefficients among different populations; Nm: Gene flow number, Nm ¼ 0.5(1 -Gst)/Gst. except for populations BSLLX, LCNC, LTCC, NPNC and ZJC. Figure 3A shows that populations BSLLX, LCNC, LTCC, NPNC and ZJC were grouped into Group A1 under SRAP results, but then were grouped into Group B2 using ISSR (Fig. 3B). This result reveals that the habitat of populations BSLLX, LCNC, LTCC, NPNC and ZJC existed between stream banks and mountainsides, thus belonging to an intermediate habitat.
According to the Mantel test, the analysis results of SRAP and ISSR were significantly correlated. Therefore, POPGENE (version 1.32) was used to calculate the GD and the GS among populations using combined SRAP and ISSR data (see Table S5). The relationships among the 20 populations of M. oblongifolius were revealed by UPGMA and NJ-based GD using combined SRAP and ISSR data (Figs. 4 and 5). Both UPGMA and NJ trees divided the 20 populations of M. oblongifolius into two major groups. The dendrogram consistent with that was constructed using the SRAP data. In order to study the population structure of M. oblongifolius further, we performed a calculation based on a Bayesian model on the combined SRAP and ISSR data in STRUCTURE program version 2.3.4. Results showed a spatial pattern of GDs among the populations in accordance with their habitat status, which was similar to the result of UPGMA and NJ analyses. The results showed that when K ¼ 2 (Fig. 6), all the individuals of the population could be distributed to corresponding groups in a high proportion, and these groups were divided according to habitat type.
Principal Coordinate Analysis was used to construct 3D eigenvectors using the DCENTER module of the NTSYS-pc program to add complementary information to the cluster analysis (Fig. 7). The PCoA for 20 populations of M. oblongifolius revealed that these populations were divided into four groups. The results of PCoA were the same from the other cluster analyses as shown above.
Geographic distances of the 10 studied populations ranged from 0.456 to 276.232 km, with a mean of 81.63 km (Table S6). A Mantel test was performed to estimate a correlation between the matrices of Nei's (1978) GDs and of geographical distances using NTSYS-pc version 2.1 (Fig. 8). The results showed a low correlation between geographical and GDs (r ¼ 0.01255, p ¼ 0.5538).

DISCUSSION Genetic diversity of M. obongifolius
From the perspective of modern genetics, species with higher genetic diversity will have a wider natural distribution and stronger environmental adaptability, survivability and evolutionary potential. Genetic variation is a necessary condition for a species to adapt to environmental changes (Liu et al., 2015). Therefore, evaluation of genetic diversity is an essential component in germplasm characterization and collection. In the present study based on SRAP markers, ne, h and I were calculated to be 1.5702, 0.336 and 0.5057 at the species level, respectively. These results were higher than those obtained using ISSR markers (ne ¼ 1.5186, h ¼ 0.3068 and I ¼ 0.4657). However, SRAP markers produced a lower PPB (83.24%) than ISSR markers (87.94%). Sequence-related amplified polymorphism markers amplify the ORF of plant genome, whereas ISSR markers amplify the repetitive regions. Thus, differences exist in the polymorphism of the two amplification targets (Song et al., 2010). Our results show that both SRAP and ISSR markers can effectively reveal the polymorphism among materials. Mantel's test showed that the analysis results of SRAP and ISSR were highly and significantly correlated at the population level (r ¼ 0.7804, When compared with other species belonging to the Euphorbiaceae family such as Jatropha curcas L. (85.19% for ISSR) (Mavuso et al., 2016) and Ricinus communis L. (68.1% for ISSR) (Gajera et al., 2010), our study reveals a relatively high level of PPB (83.24% for SRAP, 87.94% for ISSR) in M. oblongifolius. Furthermore, the genetic diversity of M. oblongifolius (h ¼ 0.336, I ¼ 0.5057, SRAP markers and h ¼ 0.3068, I ¼ 0.4657, ISSR markers) is higher than the average genetic diversity of multiple plant populations detected based on dominant markers such as RAPD, RFLP and ISSR (h ¼ 0.22) in the review of Nybom (2004). Genetic diversity also differed somewhat within the 20 populations in this study. Both SRAP and ISSR markers indicated that population TX had the highest genetic diversity and PJL had the lowest. This variation may be due to human activity, random genetic drift and/or inbreeding variation. During our field investigation, we found that population PJL, with the lowest genetic diversity, was a highly damaged population potentially beyond recovery.

Genetic structure of M. oblongifolius populations
Genetic differentiation is an important index of genetic structure. Wright (1951) has previously shown that Gst can be used to determine the degree of genetic differentiation between populations with isoenzymes. In the present study, Gst ¼ 0.2764 for SRAP and Gst ¼ 0.2709 for ISSR were revealed for the 20 populations of M. oblongifolius, which were equivalent to or less than the mean values in the Nybom (2004) review of 31 RAPD and six ISSR studies (mean Gst ¼ 0.27 for RAPD, Gst ¼ 0.34 for ISSR), indicating weak genetic differentiation among the populations of M. oblongifolius. Hamrick & Godt (1990) believed that the mating system was the biggest factor affecting plant genetic diversity and genetic differentiation. Among the genetic variation in selfing species, 51% exist across populations (Gst ¼ 0.510), which amounts to five times more than that of wind-borne plants (Gst ¼ 0.099). Nearly 27% of the total genetic variation in the 20 populations of M. oblongifolius occurred among populations (Gst ¼ 0.2764 for SRAP, Gst ¼ 0.2709 for ISSR) with weak genetic differentiation among populations. By reviewing Gst values of plant species (based on RAPD data), Nybom (2004) reported an average Gst of 0.22 for 31 outbreeding species and 0.59 for six inbreeding species, and Bussell (1999) reported an average Gst of 0.19 for 29 outbreeding species and 0.63 for six inbreeding species. We supposed that the mating system of M. oblongifolius was the outbreeding or the mixing of mating taxa with dominated outbreeding, which may have an influence on extant among-population genetic structure of M. oblongifolius.
Gene flow is the movement of genes within and between populations. The intensity of gene flow has an important impact on population differentiation (Grant, 1991). It is generally believed that when Nm > 1, gene flow can prevent genetic differentiation between populations caused by genetic drift. When Nm < 1, genetic drift is the main reason for genetic structure differentiation among populations. The gene flows (Nm) among the M. oblongifolius populations were 1.3092 (SRAP) and 1.346 (ISSR), which indicates that the gene exchange was high between populations. Such an exchange could prevent genetic differentiation caused by genetic drift between populations, and may play important roles in the genetic structure of M. oblongifolius populations.
If gene flow by mating system and seed dispersal are the main causes of population variation, the closer the geographical distance between two populations is, the smaller the genetic differentiation should be. However, the result of the Mantel test suggested that the distribution of genetic diversity among populations might not be explained by the pronounced geographical distances (r ¼ 0.01255, p ¼ 0.5538). The populations in this study were clustered according to the degree of similarity of their habitats and had nothing to do with their geographical location, and the populations with similar habitats first clustered together. The pattern of genetic subdivision can be clearly demonstrated in the UPGMA and NJ cluster analysis (see Figs. 4 and 5), in which the populations of M. oblongifolius were divided into two groups according to their habitat type (stream banks, darker damper mountain bases and mountainsides). The results of PCoA and Bayesian analysis also support this habitat-specific genetic clustering model (see Figs. 6 and 7).
Local adaptive divergence will occur when the gene flow is weakened, and populations in different habitats experience divergent patterns of selection (Schluter, 2000). Due to their special medicinal significance, wild populations of M. oblongifolius have long been subjected to illegal and unrestrained collection, and the number and size of populations have dropped dramatically. This has led to habitat fragmentation and specificity, which weaken the natural gene flow between the populations and promote the adaptation to special habitats to maintain high genetic diversity.
Our evaluation of genetic diversity is an essential step toward the germplasm characterization of M. oblongifolius, which will in turn enable better collection and management practices to preserve its diversity.