Phylogeography and Population Variation in Prunus discoidea (Prunus subg. Cerasus) in China

Prunus discoidea is a unique cherry blossom germplasm resource native to China. It is widely distributed across the provinces of Anhui, Zhejiang, Jiangxi, Jiangsu, and Henan, with significant variation. We employed phylogeographic analysis to reveal the evolutionary history of P. discoidea to better understand its genetic diversity and structure. This study provides more accurate molecular insights for the effective conservation and utilization of this germplasm resource. We conducted a phylogeographic analysis of 348 individual plants from 13 natural populations using three fragments (rpoB, rps16, and trnD–E) of chloroplast DNA (cpDNA) and one fragment (ITS) of ribosomal DNA. The results revealed that P. discoidea demonstrates a significant level of genetic diversity (Hd = 0.782; Rd = 0.478). Gene flow among populations was limited, and the variation within populations was the main source of genetic diversity in P. discoidea (among populations: 34.26%, within populations: 65.74%). Regarding genetic differences among populations, Nst (0.401) showed greater differences than Gst (0.308; p < 0.05), demonstrating that there was a significant geographical structure of lineage. One lineage was the central region of Anhui and the western region of Hubei. The other lineage was the Jiangsu region and the Zhejiang region. P. discoidea diverged from Prunus campanulata approximately 1.5 million years ago, during the Pleistocene epoch. This study provides a scientific theoretical basis for the conservation and utilization of germplasm resources of P. discoidea.


Introduction
Prunus discoidea is a member of Prunus subg. Cerasus in the Rosaceae family. It is an excellent germplasm resource endemic to China [1]. The branches of P. discoidea are graceful and spreading, with pink flowers that bloom early in the season. It is widely distributed in Anhui, Zhejiang, Jiangxi, Jiangsu, and Henan provinces [2], growing in valley forests or streamside thickets at elevations of 200-1100 m [3], and it shows significant variation across this range. Currently, research on P. discoidea mainly focuses on its resources, community structure, species diversity, and morphological characteristics. For example, Nan et al. showed that the ecological niches of the P. discoidea communities in four areas (Huang Mountain, Baiyun Mountain, Tianmu Mountain, and Lu Mountain) had a high degree of overlap [4]. Yan et al. indicated that P. discoidea primarily inhabited the central and eastern regions of China [3]. Fu et al. indicated that P. discoidea, Prunus cantabrigiensis, and Prunus helenae were grouped into one clade [5]. Shang et al. indicated that there was a high level of genetic differentiation among P. discoidea populations, with gene flow being obstructed. Additionally, four populations were divided into two clades: one consisting of Huang Mountain and Lu Mountain, and the other comprising Tianmu Mountain and Baiyun Mountain [6]. However, research on the phylogeography of P. discoidea is still lacking, and its biogeography and trait evolution are not yet clear.
Phylogeography, often referred to as molecular phylogeography, is a field of study concerned with the principles and processes governing the geographic distributions of genealogical lineages, especially those within and among closely related species [7]. The concept was first introduced by Avise in 1987 [8]. It enables a more precise depiction of genealogical geographic structures, variances in geographic distributions, and the spatial and temporal dynamics underlying speciation, utilizing advanced bioinformatic tools [9]. As DNA sequence data continue to expand and resources become increasingly abundant, research in plant phylogeography has progressed from initially examining changes in gene frequencies to using microsatellite marker technology and single-nucleotide polymorphism data to cluster populations. Subsequently, phylogeographic research has employed a combination of microsatellite genetic markers or a small number of cpDNA and mtDNA sequences [10]. The study of the phylogeography of Prunus subg. Cerasus in China has only recently commenced, constrained by sampling time, costs, and molecular marker technologies. Currently, a combination of chloroplast fragments and nuclear gene fragments is often used to leverage their respective advantages for comprehensive analysis [11]. In Prunus subg. Cerasus, chloroplast sequence fragments are mainly represented by matK [12] fragments and non-coding intergenic spacer regions such as trnD-trnE [13] and trnL-trnF [14]. Within the nuclear genome, the ITS sequences are considered core markers for identifying plants within Prunus subg. Cerasus.
In this study, based on a comprehensive survey of wild populations of P. discoidea and systematic sampling, we conducted a phylogeographic analysis of P. discoidea. Chloroplast DNA sequences and nuclear ribosomal internal transcribed spacer (ITS) sequences were used to analyze the genetic diversity and genetic structure of P. discoidea, and an integrative method was employed to trace its evolutionary history. These findings provide a theoretical foundation for future strategies related to the conservation and utilization of P. discoidea resources.

Population Genetic Diversity
Three cpDNA fragments, namely, rps16, rpoB, and trnD-E, had a combined length of 1886 bp, with fragment lengths of 887 bp, 440 bp, and 632 bp, respectively. Based on the concatenated sequences, nine variable sites were detected. Three mutation sites were detected for each of trnD-E, rpoB, and rps16. Seventeen chloroplast haplotypes (H1-H17) were recovered from the 13 populations (Table 1). At the species level, the overall population haplotype diversity (Hd) was 0.782, and the nucleotide diversity (Pi) was 0.00104. At the population level, the haplotype diversity (Hd) ranged from 0.000 to 0.891, and the nucleotide diversity (Pi) ranged from 0.00000 to 0.00113. At the regional level, the haplotype diversity of the two geographical regions ranged from 0.703 to 0.807, and the nucleotide diversity ranged from 0.00087 to 0.0013 (Table 2).
The sequence length of the nrDNA fragment measured was 692 bp, and 13 variable sites were detected. Six ribotypes (R1-R6) were recovered from the 13 populations (Table 3). At the species level, the overall ribosomal diversity (Rd) of P. discoidea was 0.478, and the nucleotide diversity (Pi) was 0.00451. At the population level, the ribosomal diversity (Rd) ranged from 0.000 to 0.756, and the nucleotide diversity (Pi) ranged from 0.00000 to 0.00571. At the regional level, the ribosomal diversity of the two geographical regions ranged from 0.348 to 0.530, and the nucleotide diversity ranged from 0.00336 to 0.00489 (Table 4).
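The diversity indices reported above can be illustrated with a minimal sketch. It assumes the standard definitions (Nei's haplotype diversity and the mean proportion of pairwise site differences); the function names and the toy data are hypothetical and are not taken from the study, which used DnaSP for these calculations.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: Hd = n / (n - 1) * (1 - sum(p_i ** 2))."""
    n = len(haplotypes)
    if n < 2:
        return 0.0
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    n = len(sequences)
    if n < 2:
        return 0.0
    length = len(sequences[0])
    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(sequences, 2):
        total_diffs += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_diffs / (n_pairs * length)


# Toy data (hypothetical, for illustration only).
toy_haplotypes = ["H1", "H1", "H2", "H3"]
toy_sequences = ["ACGT", "ACGT", "ACGA", "TCGA"]

hd = haplotype_diversity(toy_haplotypes)
pi = nucleotide_diversity(toy_sequences)
fixed = haplotype_diversity(["H2"] * 10)  # a population with a single haplotype
```

Under these definitions, a population carrying a single haplotype has a diversity of zero, which is how a value of 0.000 at the population level can be read.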

Population Genetic Structure
The population differentiation index (Fst) of P. discoidea at the cpDNA level was 0.34264, indicating a significant level of differentiation. Among populations, genetic variation accounted for 36.27%, while within populations it was 63.73%. Genetic variation within populations therefore exceeded the variation among populations. As shown by the AMOVA results, genetic variation among regional groupings was 12.47%, genetic variation among populations within regional groupings was 25.57%, and genetic variation within populations in regional groupings was 61.96%. The genetic variation within populations in regional groupings was greater than the genetic variation among populations. The genetic differentiation parameters of P. discoidea (Nst = 0.40104, Gst = 0.30809, p < 0.05) indicated population substructure. Genetic differentiation was detected in both geographic regions: eastern China (Nst = 0.34538, Gst = 0.22369, p < 0.05) and central China (Nst = 0.3413, Gst = 0.32690, p < 0.05). Phylogeographic structure was detected in both regions (Table 5).
The population differentiation index (Fst) of P. discoidea at the ITS level was 0.57621, which suggested a high degree of differentiation among populations of P. discoidea. The genetic variation among populations accounted for 51.26%, while the genetic variation within populations was 48.74%, meaning that the genetic variation among populations was slightly higher than the genetic variation within populations. Genetic variation among regional groupings was 4.57%, the genetic variation among populations within regional groupings was 25.57%, and genetic variation within populations in regional groupings was 61.96%, indicating that the genetic variation among populations within each geographic group was slightly higher than the genetic variation within populations (Table 5).

Phylogeographic Structure
ZJS (Xianning, Hubei) and YZH (Lu'an, Anhui) had the most haplotypes, with eight haplotypes each, while BYS (Lishui, Zhejiang) and LS (Jiujiang, Jiangxi) had the fewest, with only one haplotype each. Regarding the frequency of haplotypes, H2 had the highest distribution frequency, totaling 133 individuals, and the lowest frequency was observed for H16 and H17, each with a distribution of three individuals (Figure 1A). Based on the TCS haplotype network diagram, H2 was located in the central position of the network diagram, with a wide distribution range and a higher proportion of individuals in the population. This suggested that it was an ancient haplotype. Haplotypes H3 and H4 were located in sub-core positions and had a broader distribution, and hence they were classified as sub-ancient haplotypes (Figure 1B). Prunus padus, Prunus salicina, Prunus mume, Prunus mahaleb, and Prunus cerasoides were used as outgroups. The haplotype phylogenetic trees of P. discoidea based on the maximum likelihood method and the Bayesian method showed a consistent topological structure. The 17 haplotypes formed a highly supported monophyletic group. The phylogenetic tree diverged into two distinct lineages, which were consistent with the clustering results of the haplotype TCS network. H1 and H12-H15 diverged into one clade, corresponding to the eastern lineage, while H6-H11 and H16-H17 diverged into another clade, corresponding to the central lineage (Figure 2A,B).
YZH had the most ribotypes, with four; ZJG (Jingmen, Hubei) and DMS (Lin'an, Zhejiang) each had three ribotypes; and ZJS, LS, YTS (Lianyungang, Jiangsu), LKY (Nanyang, Henan), and THC (Huanggang, Hubei) had the fewest, with only one ribotype each (Figure 1C). R1 was located in the center of the network diagram, contained the most individuals, and was found in all populations except for Yuntai Mountain, suggesting that it was an ancient ribotype. The remaining ribotypes had all further mutated, establishing interconnections among different regions (Figure 1D). P. padus, P. salicina, and Prunus pseudocerasus were used as outgroups. The ribotype phylogenetic trees based on the maximum likelihood method and the Bayesian method showed a consistent topological structure. The six ribotypes identified formed two distinct groups: one eastern lineage and one central lineage. R2 and R3 were ribotypes unique to the eastern region, and R4-R6 were ribotypes unique to the central region (Figure 3A,B).

The historical dynamics of P. discoidea were analyzed mainly by neutrality tests and mismatch analysis. Tajima's D and Fu's Fs neutrality tests were conducted at the species level and in each geographical group. Based on the results of cpDNA molecular markers, Tajima's D value for the ZJG population was negative and Fu's Fs value was positive, with p > 0.05, indicating non-significance. Tajima's D values for ZJS and YZH were positive, while Fu's Fs values were negative, with p > 0.05, indicating non-significance. Tajima's D and Fu's Fs values for LS and BYS were both zero. Tajima's D and Fu's Fs values of the other eight populations were all positive, with p > 0.05, indicating non-significance (Table S1). At the species level, Tajima's D value was positive and Fu's Fs value was negative (p > 0.05). In both the eastern and central geographical groups, Tajima's D values were positive, Fu's Fs values were negative, and p > 0.05, indicating non-significance (Table S2). Taken alone, the neutrality tests therefore gave no significant evidence that P. discoidea experienced population expansion events or bottleneck effects. However, the mismatch analysis of P. discoidea populations showed a unimodal curve (Figure S1), with an SSD value of 0.01938 and an Hrag value of 0.05561, with p > 0.05, consistent with the hypothesis of population expansion (Table S5). This indicated that P. discoidea populations experienced a recent population expansion event.
Based on the results of ITS molecular markers, Tajima's D and Fu's Fs values for the seven populations of BYS, DMS, HS (Huangshan, Anhui), SMS (Ningbo, Zhejiang), ZJG, LCS (Yixing, Jiangsu), and YZH were all positive, with p > 0.05, indicating non-significance. Tajima's D value and Fu's Fs value in the BMQ (Jiande, Zhejiang) population were negative, with p > 0.05, which was not significant. Tajima's D and Fu's Fs values for the five populations of ZJS, LS, YTS, LKY, and THC were zero, with p > 0.05, indicating non-significance (Table S1). Additionally, Tajima's D and Fu's Fs values at the species level and in the eastern and central geographical groups were positive, with p > 0.05 (Table S2). The mismatch distribution curve of P. discoidea was multimodal, with p > 0.05, indicating non-significance. The mismatch analyses of both the populations and the geographical groups showed bimodal curves (Tables S3 and S4, Figure S2). In summary, these results suggested that the populations of P. discoidea did not experience rapid expansion or contraction events recently.

Genetic Diversity and Population Genetic Structure
Genetic diversity is defined as the variety of genetic materials and genetic information within all biological individuals. This includes gene mutations among different populations of the same species as well as genetic differences within the same population. This is of great importance for the maintenance and propagation of species, adaptation to the environment, and resistance to adverse environmental conditions and disasters. Liu et al. used nSSR and cpDNA to analyze the genetic diversity of Chengbutong tea and showed that the genetic diversity (Hd) was 0.732 [18]. Li et al. used cpDNA non-coding sequencing to study the genetic diversity of Salix psammophila and showed that the genetic diversity (Hd) was 0.737 [19]. Li et al. studied the phylogenetic relationship of the genus Disanthus distributed disjunctively in China and Japan based on cpDNA sequences and showed that the genetic diversity (Hd) among populations was 0.725 [20]. The genetic diversity (Hd) within the population of P. discoidea was 0.782, the variation in haplotype diversity (Hd) among the different populations ranged from 0.000 to 0.891, and the variation in nucleotide diversity (Pi) ranged from 0.00000 to 0.00114. These results showed that P. discoidea populations had a relatively high level of genetic diversity. The research results were similar to those of previous studies on Prunus serrulata [21], P. dielsiana [22], and Prunus conradinae [23]. The ITS genetic diversity (Rd) within P. discoidea was 0.478, and the nucleotide diversity (Pi) was 0.00451. The variation in ribosomal diversity (Rd) ranged from 0.175 to 0.756, and the variation in nucleotide diversity (Pi) ranged from 0.00051 to 0.00571. The findings showed that the genetic diversity was lower compared to Prunus tomentosa [24], P. pseudocerasus [25], and Prunus avium [26]. However, both markers indicated that P. discoidea exhibited a high level of genetic diversity, which is presumed to be related to the growth environment and the distribution of its habitats. P. discoidea is distributed in central and eastern China, located on the third step of China's geographical terrain, and is concentrated in the middle and lower reaches of the Yangtze River. The region is characterized by flat terrain with no mountainous barriers. At the same time, the warm and humid climate conditions maintained its genetic diversity, and the activities of birds and humans made hybridization and self-pollination within or among neighboring populations possible.
As shown in Table 5, based on three cpDNA fragments, the genetic variation among populations was lower than that within populations. Based on ITS fragments, the genetic variation among populations was higher than that within populations. The genetic differentiation coefficients for both molecular markers reached significant levels, and gene flow among populations of P. discoidea was relatively weak. Therefore, this study suggests that the variation in the population of P. discoidea mainly arose from the variation within populations. The study by Shang et al. on the analysis of population diversity in P. discoidea using SSR markers was consistent with the findings presented here [6]. Based on three cpDNA fragments, the results for P. discoidea populations and two geographic groups separately showed that the genetic differentiation coefficients reached significant levels. This finding was consistent with the genetic differentiation parameters of nrDNA markers, indicating the presence of phylogeographic structure within the geographic groups and populations of P. discoidea. The research results were similar to those of previous studies on P. serrulata [21], P. dielsiana [22], and P. conradinae [23]. This result was consistent with the habit of P. discoidea, which in its natural state tends to occur as scattered individuals. P. discoidea is typically distributed on cliffs and in river valleys, with long-distance seed dispersal primarily relying on bird and human activities.

Geographical Structure
Based on the geographical distribution of haplotypes and ribotypes, there were at least two genetic lineages within P. discoidea. One lineage included Anhui and areas to the west of Hubei. This division was based on the unique haplotypes H6-H11 and H16-H17, which were primarily distributed in THC, YZH, and ZJS. The other lineage included the regions of Jiangsu and Zhejiang, based on the unique haplotypes H1 and H12-H13, which were primarily distributed in BMQ, HS, YTS, and SMS. The reason for this might have been that P. discoidea was mainly distributed in central and southeast China. Furthermore, the sampling points were situated on the third step of China's geographical terrain, with flat terrain and the presence of only two natural barriers, namely, Huang Mountain and Lu Mountain. This formed two distinct lineages in the eastern and central regions. The research results were consistent with those of previous studies on P. serrulata [21].

Historical Dynamics of P. discoidea Group
Based on neutrality tests for P. discoidea populations and geographic groups, the results indicated that neither of the two molecular markers pointed to population expansion or contraction events. However, the mismatch distribution analysis based on cpDNA molecular markers for P. discoidea population and geographic groups showed a unimodal curve. Both the SSD value of 0.01938 (p = 0.18000 > 0.05) and the Hrag value of 0.05561 (p = 0.33000 > 0.05) are consistent with the hypothesis of the population expansion model. This suggested that P. discoidea had experienced a population expansion event. The results contradicted those of the neutrality tests. However, according to Table S1, the population size parameter of P. discoidea was θ0 = 0.00176 before the expansion and θ1 = 5.33203 after the expansion. The change in effective population size (θ1 − θ0 = 5.33203 − 0.00176 = 5.33027) was large, so it was considered that P. discoidea had recently experienced a population expansion event. This result was consistent with a phylogeographic study of Xanthopappus subacaulis in the northeastern Qinghai-Tibet Plateau conducted by Zhang Yang et al. [27].
Based on the phylogenetic tree, P. discoidea diverged from P. campanulata approximately 1.5 million years ago, during the Pleistocene epoch. In phylogeographic studies, glacial refugia are usually detected through the high diversity of haplotypes and major lineages within species populations [28,29]. We inferred that P. discoidea had two main glacial refugia, namely, Dabie Mountain around Yanzihe Canyon in Anhui and Qingliang Peak in Zhejiang. Following the glacial period, P. discoidea spread from these refugia, with one diffusion route roughly spanning Anhui-Henan-Hubei-Jiangxi or running directly from Anhui to Jiangxi. Another diffusion route ran through Zhejiang-Anhui-Jiangsu or directly from Zhejiang to Jiangsu, resulting in the formation of two lineages in the central and eastern regions and the current distribution pattern of P. discoidea. The findings of this research were largely consistent with those of previous phylogeographic studies of P. dielsiana [30] and P. serrulata [21].

Plant Materials
The distribution data of P. discoidea are primarily based on the Chinese Virtual Herbarium (CVH: https://www.cvh.ac.cn/ (accessed on 10 October 2020)) and published academic papers. For distribution points with accurate specimen records but lacking latitude and longitude data, LocaSpaceViewer (http://www.locaspace.cn/ (accessed on 20 November 2020)) was used to ascertain the coordinates, thereby enhancing the precision of the specimen information. DIVA-GIS was used to filter the obtained data, deleting duplicate records and those with collection points that were too close to each other. From 2020 to 2022, a total of 348 samples from 13 populations of P. discoidea were collected through two consecutive years of field investigation and sample collection (Tables 6 and 7, Figure 5). Within each population, 10 to 35 individuals were randomly selected, each at least 30 m apart. For each individual, 5 to 10 mature, healthy, and intact small leaves were collected. The samples were then rapidly dried in silica gel and stored at −20 °C until use.
According to previous research, leaves from Prunus subg. Cerasus are known to contain significant amounts of polysaccharides and polyphenols, making DNA extraction quite challenging. Therefore, the DNA of P. discoidea was extracted using a polysaccharide polyphenol reagent kit (Tiangen Biotechnology Co., Ltd., Shanghai, China) [21] combined with a modified CTAB method [31]. The concentration and purity of the extracted DNA were assessed via 1% agarose gel electrophoresis. Samples that did not meet the quality criteria were excluded, and the qualified DNA samples were stored at −80 °C for preservation. These samples were shipped to Shenggong Bioengineering (Shanghai) Co., Ltd. for sequencing, and the haplotypes were obtained for phylogeographic analysis.
By reviewing the relevant literature and accessing the NCBI website (https://www.ncbi.nlm.nih.gov/ (accessed on 20 November 2022)), universal primers for different cpDNA and nrDNA sequences of Prunus subg. Cerasus were selected. Three pairs of cpDNA universal primers, namely rps16 (F: GTGGTAGAAAGCAACGTGCGACTT; R: TCGGGATCGAACATCAATTGCAAC) [13], rpoB (F: AAGTGCATTGTTGGAACTGG; R: CCCAGCATCACAATTCC) [32], and trnD-E (F: ACCAATTGAACTACAATCCC; R: AGGACATCTTCAAGGAG) [33], and one primer pair for the nrDNA ITS fragment (F: TCCTCCGCTTATTGATATGC; R: GGAAGGAGAAGTCGTAACAAGG) [34] were used to determine the genetic diversity and population structure of P. discoidea. A 25 µL PCR amplification reaction system was constructed using 1.0 µL of DNA template, 1.0 µL (10 µmol L⁻¹) each of upstream and downstream primers, 12.5 µL of 2× PCR Master Mix, and 9.5 µL of ddH2O [35]. The PCR amplification protocol was as follows: initial denaturation at 94 °C for 5 min, followed by 30-35 cycles of denaturation at 94 °C for 1 min, annealing at 52-58 °C for 1 min, and extension at 72 °C for 1 min, and a final extension at 72 °C for 10 min (Table 8).
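As a quick check on the reaction system described above, the following sketch sums the per-reaction volumes and scales the shared reagents for a batch of reactions with one primer pair. The helper name `scale_master_mix` and the 10% pipetting overage are assumptions made for illustration; they are not part of the protocol in the text.

```python
# Per-reaction volumes in microlitres, as stated in the text.
REACTION_UL = {
    "DNA template": 1.0,
    "forward primer (10 umol/L)": 1.0,
    "reverse primer (10 umol/L)": 1.0,
    "2x PCR Master Mix": 12.5,
    "ddH2O": 9.5,
}


def scale_master_mix(n_reactions, overage=0.10):
    """Hypothetical helper: volumes of shared reagents for n reactions.

    The DNA template is excluded because it differs for every sample.
    The default 10% overage is an assumption, not a value from the study.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in REACTION_UL.items()
        if name != "DNA template"
    }


total_ul = sum(REACTION_UL.values())  # should equal the stated 25 uL system
mix_for_24 = scale_master_mix(24)
```

The listed components add up to the stated 25 µL, so the recipe is internally consistent.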

Data Analysis
The sequencing data provided by the company were imported into SeqMan for peak comparison, sequence assembly, and correction. Subsequently, after assembling the sequences of the three fragments, they were imported into PhyloSuite version 1.2.3 for sequence alignment and correction [36]. DnaSP version 6.1 was used to calculate conventional indices for P. discoidea [37], such as haplotype diversity (Hd), nucleotide diversity (Pi), gene flow (Nm), and differentiation coefficients (Nst and Gst), among others [38]. The relative size of Nst and Gst was used to determine whether there was a genealogical geographic structure among populations. When Nst was greater than Gst and p was less than 0.05, haplotypes with similar phylogenetic relationships were distributed within the same population, indicating that there was an obvious lineage structure among the populations. When Nst was equal to Gst, the phylogenetic relationships among haplotypes across populations were similar. When Nst was less than Gst, it indicated that haplotypes with similar phylogenetic relationships existed in different populations, and there was no lineage structure [39]. Arlequin version 3.5 was used for analysis of molecular variance (AMOVA), and 1000 non-parametric permutations were used for significance testing to estimate the distribution of variance within and among populations and the genetic differentiation index (Fst) among populations of P. discoidea and to further infer the main factors influencing the genetic differentiation in P. discoidea [40]. The value of Fst ranges from 0 to 1. An Fst between 0 and 0.05 indicates low genetic differentiation, an Fst between 0.05 and 0.25 indicates a moderate degree of genetic differentiation, and an Fst greater than 0.25 represents a significant level of genetic differentiation [41].
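The two decision rules described above can be written down as a short sketch. The function names and return labels are hypothetical, and two details are assumptions not stated in the text: boundary values (exactly 0.05 or 0.25) are assigned to the lower class, and the case of Nst greater than Gst without statistical significance is reported as "not significant".

```python
def classify_fst(fst):
    """Classify Fst using the thresholds given in the text.

    Assumption: boundary values fall into the lower class.
    The text describes the top class as a 'significant level'.
    """
    if not 0.0 <= fst <= 1.0:
        raise ValueError("Fst is expected to lie between 0 and 1")
    if fst <= 0.05:
        return "low"
    if fst <= 0.25:
        return "moderate"
    return "high"


def interpret_nst_gst(nst, gst, significant):
    """Interpret Nst against Gst; `significant` means p < 0.05."""
    if nst > gst:
        if significant:
            return "phylogeographic structure"
        return "not significant"  # assumption: the text does not cover this case
    if nst == gst:
        return "haplotypes similarly related across populations"
    return "no phylogeographic structure"


# Values reported in this study for the cpDNA and ITS data sets.
cpdna_fst_class = classify_fst(0.34264)
its_fst_class = classify_fst(0.57621)
species_level = interpret_nst_gst(0.40104, 0.30809, significant=True)
```

Applied to the values reported here, both Fst estimates fall in the top class, and the species-level comparison of Nst and Gst satisfies the condition for phylogeographic structure.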
PopArt version 1.7 was used to construct the haplotype network diagram among populations to explore the relationships among various haplotypes. Additionally, the geographic distribution of haplotypes was mapped using ArcGIS version 10.8 [42]. PhyloSuite version 1.2.3 was used to construct haplotype phylogenetic trees of P. discoidea using the maximum likelihood and Bayesian methods. For the trnD-E, rps16, and rpoB sequences, the best nucleotide substitution model for the maximum likelihood method was GTR2 + ML + R2, while the best nucleotide substitution model for the Bayesian method was HKY + F + G4. For the ITS sequence, the nucleotide substitution model for the maximum likelihood method was TIM-R3, and for the Bayesian method, it was JC-I-G.
BEAST version 1.8.4 [43] was used to estimate the divergence times of P. discoidea by using the two-step method combining horizontal phylogenetic trees and haplotype trees, combined with the calibration points obtained from fossils, literature, and tree-building calculations. The molecular clock model chosen was the uncorrelated lognormal relaxed clock model. The results were visualized and edited using ITOL version 5 and FigTree version 1.4.4 [44][45][46]. DnaSP version 6 was used to conduct pairwise mismatch distribution analysis and neutrality tests on P. discoidea to examine whether it had experienced population expansion or bottleneck effects [47]. If Fu's Fs value and Tajima's D value in the neutrality tests were both negative, it indicated that the P. discoidea population had recently undergone a rapid expansion event. Conversely, positive values suggested that the population had experienced a contraction event or bottleneck effect. Mismatch analysis focused on the distribution of nucleotide differences among different haplotypes. By observing the fit between the expected value curve and the observed value curve, as well as whether the overall curve was unimodal or multimodal, historical events recently experienced by the population were inferred [16].
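To make the two quantities behind these tests concrete, the sketch below computes an observed mismatch distribution and Tajima's D from a set of aligned sequences. It assumes the standard Tajima (1989) formula; the function names and the toy alignment are hypothetical, returning zero for an alignment without variable sites is a convention adopted here, and the analyses in this study were run in DnaSP rather than with this code.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pairwise_differences(sequences):
    """Number of differing sites for every pair of aligned sequences."""
    return [
        sum(1 for x, y in zip(a, b) if x != y)
        for a, b in combinations(sequences, 2)
    ]


def mismatch_distribution(sequences):
    """Observed frequency of pairs that differ at 0, 1, 2, ... sites."""
    diffs = pairwise_differences(sequences)
    counts = Counter(diffs)
    return [counts.get(i, 0) / len(diffs) for i in range(max(diffs) + 1)]


def segregating_sites(sequences):
    """Count alignment columns that contain more than one state."""
    return sum(1 for column in zip(*sequences) if len(set(column)) > 1)


def tajimas_d(sequences):
    """Tajima's D for aligned sequences (defined for n >= 4)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("Tajima's D needs at least four sequences")
    s = segregating_sites(sequences)
    if s == 0:
        return 0.0  # convention assumed here for monomorphic samples
    diffs = pairwise_differences(sequences)
    k = sum(diffs) / len(diffs)  # mean number of pairwise differences
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignment (hypothetical, for illustration only).
toy_sequences = ["ACGT", "ACGT", "ACGA", "TCGA"]
mismatch = mismatch_distribution(toy_sequences)
d_value = tajimas_d(toy_sequences)
```

A single peak in the observed mismatch frequencies corresponds to the unimodal pattern discussed in the text, and the sign of D is what the interpretation rule above relies on; significance still has to be assessed separately.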

Conclusions
This study evaluated the population variation, genetic diversity, phylogenetic structure, and dynamic history of P. discoidea by using three maternally inherited cpDNA fragments and biparentally inherited nuclear ITS sequences. We found high genetic diversity and the existence of phylogeographic structure in P. discoidea. One lineage occupied the central region of Anhui and the western region of Hubei. The other lineage occupied the Jiangsu and Zhejiang regions. The findings offer a theoretical foundation for the protection and utilization of germplasm resources of P. discoidea. Because the information for some samples was ambiguous, accurate geographic information could not be obtained, resulting in a relatively small overall sample from Jiangxi and Anhui, so population genetic variation and differentiation were not fully verified. With the decrease in sequencing costs and the continuous advancement in sequencing techniques, we expect to use whole-genome resequencing and other methods to conduct in-depth studies on the population variation and historical dynamics of P. discoidea. This will be aimed at providing more accurate data support for the conservation and utilization of P. discoidea germplasm resources.
et al. indicated that there was a high level of genetic differentiation among P. discoidea populations, with gene flow being obstructed. Additionally, four populations were divided into two clades: one consisting of Huang Mountain and Lu Mountain, and the other comprising Tianmu Mountain and
Plants 2024, 13, 2535

Figure 1. (A) Geographical distribution map of the haplotypes of P. discoidea based on cpDNA sequences. (B) Network diagram of the TCS haplotypes of P. discoidea based on cpDNA sequences. (C) Geographical distribution map of the ribotypes of P. discoidea based on nrDNA sequences. (D) Network diagram of TCS ribotypes of P. discoidea based on nrDNA sequences.

Figure 2. (A) ML tree of P. discoidea haplotypes based on cpDNA sequences. (B) BI tree of P. discoidea haplotypes based on cpDNA sequences.
Tajima's D and Fu's Fs neutrality tests were conducted at the species level and in each geographical group. Based on the cpDNA markers, Tajima's D for the ZJG population was negative and Fu's Fs was positive, and neither was significant (p > 0.05). Tajima's D values for ZJS and YZH were positive, while Fu's Fs values were negative, again without significance (p > 0.05). Tajima's D and Fu's Fs values for LS and BYS were both zero. Tajima's D and Fu's Fs values of the other eight populations were all positive, with p > 0.05, indicating non-significance (Table


Figure 4. Phylogenetic tree of Rosaceae based on chloroplast DNA (rps16) and four fossil dates.

Figure 5. Sampling points and population distribution points of P. discoidea. Note: Green represents the sampling points, and red represents the population distribution points.

Note: Periods represent the base of the Hap1 site.

Table 2. Genetic diversity index and geographical information of P. discoidea based on cpDNA sequences.

Table 4. Genetic diversity index and geographical information of P. discoidea based on nrDNA sequences.

Table 5. Analyses of molecular variance (AMOVAs) based on cpDNA and nrDNA data for populations of P. discoidea.

Table 6. Information on P. discoidea sampling points.

Table 8. Primer sequence and annealing temperature.