Comparative and Phylogenetic Analyses of the Complete Chloroplast Genomes of Four Ottelia Species

Abstract: The genus Ottelia comprises approximately 21 submerged macrophyte species primarily found in tropical Africa and Southeast Asia. The classification of several Ottelia species as vulnerable under the criteria “A2c” in the China Species Red List emphasizes the urgency of establishing a credible taxonomy. The ambiguity in taxonomy and evolutionary history persists, primarily because a robust species-level phylogeny is lacking. Rapid progress in high-throughput sequencing technology has facilitated the retrieval of complete chloroplast (cp) genome sequences, offering a molecular foundation for phylogenetic analyses. In this study, the complete cp genomes of five samples of four Ottelia species were sequenced. All five Ottelia samples exhibited a circular, quadripartite-structured molecule, with lengths ranging from 156,823 to 162,442 bp. A total of 75–88 simple sequence repeats (SSRs) were observed in the cp genomes of the five Ottelia samples, which could be used for species identification. A preliminary phylogenetic analysis revealed that O. fengshanensis, O. acuminata, and O. guanyangensis clustered with strong support (100 BS). O. acuminata var. jingxiensis was resolved as sister to O. acuminata var. lunanensis (100 BS), and both were further found to be sister to O. balansae. The widely distributed O. alismoides was resolved as the sister taxon to all the Chinese endemic taxa, with robust support. Together, our examination of the complete cp genomes of the five Ottelia samples provides valuable insights for reconstructing their phylogeny. Furthermore, it illuminates the evolutionary dynamics of the cp genome within the genus Ottelia.


Introduction
The genus Ottelia Pers. is widely distributed, encompassing around 21 submerged macrophyte species primarily found in tropical Africa and Southeast Asia, as documented in the most comprehensive taxonomic revision [1]. Southern and Southwestern China are crucial regions for the genus, as they host nearly all Asian Ottelia species, with the exception of the Indonesian endemic O. mesenterium and a possible cryptic species located in Thailand [2], contributing to the foundational understanding of the genus in this region [1,3]. In China, the record of Ottelia includes seven species and five varieties [4]. Apart from the widely distributed O. alismoides, the other Chinese taxa have restricted ranges, mainly confined to karst rivers or lakes in the South and Southwest of China [5]. For instance, O. acuminata var. crispa is found exclusively in Luguhu Lake in Yunnan Province, while O. acuminata var. songmingensis is documented solely in a stream in Songming county [5]. Unfortunately, numerous wild populations have declined or vanished over the past three decades, attributed to factors such as the degradation of habitat, disturbances from anthropogenic activities, and the introduction of herbivorous fish [6-8]. Several Ottelia species are now classified as vulnerable (VU) under the criteria "A2c" in the China Species Red List [9].
Ottelia, distinguished by its intricate and advanced flowering characteristics within the family Hydrocharitaceae, is an attractive ornamental aquatic plant for tourists with its year-round flowering. Additionally, its leaves possess a certain edible value, adding to its appeal [10]. Furthermore, it flowers on the surface of lakes, so it is known as a "touchstone of water quality", and its presence is viewed as an indicator of decreasing water pollution in an area [11]. The genus, characterized by continuous morphological variations and unique traits, is considered a pivotal group for studying the development of the family Hydrocharitaceae and even all monocotyledons [1,12]. However, given the threat of local or global extinction to the majority of Ottelia species [13,14], establishing a credible taxonomy has become imperative for the preservation and management of extant Ottelia species.
Since Persoon's naming of the genus Ottelia in 1805, its classification has been a subject of controversy, particularly at the species and variety levels. Dandy's landmark revision in 1934, based on spathe characteristics, acknowledged approximately 40 species across three subgenera [15]. Li et al. (1981) studied the Asian members using inflorescence and flower characters, categorizing them into two subgenera [3]. Cook et al. globally revised the genus without accepting subgenera, recognizing around 21 species and two varieties [1]. He et al. incorporated morphology, cytology, and isozymes and accepted only four species and one variety in China, initially suggesting the synonymy of O. emersa with O. cordata [12]. In addition, the taxonomic delimitation of O. acuminata has varied among researchers. For example, Li et al. [3] recognized O. acuminata var. crispa, whereas Dandy et al. and Wang et al. treated it as a separate species, O. crispa [15]. Similarly, O. acuminata var. lunanensis was acknowledged as a variety by Li et al. [3] and Wang et al. [16], while Cook et al. [1] consolidated it as a synonym of O. acuminata var. acuminata. These discrepancies highlight the differences in taxonomic interpretations, emphasizing the need for a re-evaluation of the taxonomic delimitation of O. acuminata and its phenotypic varieties.
Chloroplasts play a pivotal role in photosynthesis and various metabolic reactions in plants [17]. Like mitochondria, chloroplasts are semi-autonomous organelles that are typically maternally inherited and possess a relatively independent genetic system. Moreover, chloroplasts have played significant roles as dominant drivers of evolutionary processes [18]. Compared to Sanger sequencing, next-generation sequencing methods offer significant promise in this field because they provide ample data for statistically rigorous comparisons across cp genomes [19]. The benefits of the cp genome, such as its structural conservation, low molecular weight, simplicity, genetic stability, and moderate evolution rate, have contributed to improving our understanding of the evolutionary relationships within Ottelia [4,20,21]. Li et al. (2018) [22] conducted a phylogenetic reconstruction of Ottelia in China using one nuclear and three cp regions and described O. guanyangensis as a new species; they also proposed elevating O. acuminata var. songmingensis to distinct species status rather than treating it as a variety of O. acuminata. Recent DNA sequence-based studies have identified cryptic species within O. alismoides [2] and O. ulvifolia [23], underscoring the persisting ambiguity in the taxonomy of Ottelia. However, only around 10 cp genomes have been released in the NCBI database, and this information is not enough to resolve the relationships among several groups of the genus Ottelia.
In the present study, we conducted a comprehensive analysis of the cp genomes of five newly sequenced samples from four Ottelia species, encompassing the structure, GC content, codon usage preference, simple sequence repeats (SSRs), variations within cp genomes, and expansion and contraction of inverted repeats (IRs). Subsequently, we employed whole cp genome sequences, including those newly sequenced and previously published, to reconstruct phylogenetic relationships. The specific objectives of this study were as follows: (i) to characterize the Ottelia cp genome structure and provide an in-depth understanding of the organization and arrangement of the Ottelia cp genome; (ii) to identify highly divergent regions within the Ottelia cp genome as potential DNA barcodes; and (iii) to investigate the evolutionary relationships within Ottelia species. This study will contribute to a broader understanding of the evolutionary relationships within various Ottelia species.

Chloroplast Genome Features of Ottelia
The initial dataset comprising five Ottelia individuals underwent stringent filtration, removing adapters and low-quality reads, resulting in a robust dataset of 3-5 Gb per sample. Assembly and splicing procedures yielded complete circular, quadripartite cp genomes (Figure 1), each including a large single-copy (LSC) region, a small single-copy (SSC) region, and two IR regions. The structural composition of the complete cp genomes in the five studied Ottelia samples conforms to the typical configuration observed in most plants, characterized by a single circular quadripartite molecule [24,25].
Horticulturae 2024, 10, x FOR PEER REVIEW

Figure 1. A chloroplast genome gene map for five Ottelia samples. Genes are transcribed clockwise on the outer circle and counterclockwise on the inner circle. Functional groups of genes are color-coded for ease of identification. Additionally, variations in shading on the inner circle denote the GC content (darker gray) and AT content (lighter gray) of the chloroplast genome.
The complete cp genomes varied in length across species, with O. acuminata var. jingxiensis having the smallest size at 156,823 bp and O. alismoides having the largest at 162,442 bp among the five individuals. The size of the Ottelia cp genomes studied here is comparable with that of previously reported Ottelia species [10,20,26], with the exception of O. alismoides, which was 4380 to 5619 bp longer than the other Ottelia species. The discrepancy in the cp genome size can be attributed to the significant divergence in the IR region. Specifically, the length of the IR region was 25,039 bp in O. acuminata var. jingxiensis, whereas it was 30,136 bp in O. alismoides. A length divergence was also observed in the LSC region, spanning from 85,568 bp (O. alismoides) to 87,378 bp (O. acuminata var. jingxiensis), while the SSC region ranged from 16,602 bp (O. alismoides) to 19,367 bp (O. acuminata var. jingxiensis).
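The reported region lengths can be checked against the reported totals, because a quadripartite plastome is the sum of the LSC, the SSC, and two IR copies. The following is a minimal sketch of that arithmetic using only the values quoted above; the dictionary layout and the function name `plastome_length` are illustrative and not part of the original analysis.

```python
# Region lengths (bp) quoted in the text for the smallest and largest plastomes.
regions = {
    "O. acuminata var. jingxiensis": {"LSC": 87_378, "SSC": 19_367, "IR": 25_039},
    "O. alismoides": {"LSC": 85_568, "SSC": 16_602, "IR": 30_136},
}


def plastome_length(r):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return r["LSC"] + r["SSC"] + 2 * r["IR"]


totals = {name: plastome_length(r) for name, r in regions.items()}
small = regions["O. acuminata var. jingxiensis"]
large = regions["O. alismoides"]

# Contribution of each compartment to the size difference between the two genomes.
ir_change = 2 * (large["IR"] - small["IR"])
single_copy_change = (large["LSC"] + large["SSC"]) - (small["LSC"] + small["SSC"])
net_change = totals["O. alismoides"] - totals["O. acuminata var. jingxiensis"]

for name, total in totals.items():
    print(f"{name}: {total:,} bp")
print(f"IR contribution: {ir_change:+,} bp")
print(f"LSC + SSC contribution: {single_copy_change:+,} bp")
print(f"Net difference: {net_change:+,} bp")
```

With these values, the two IR copies of O. alismoides add 10,194 bp, the single-copy regions are together 4,575 bp shorter, and the net difference is 5,619 bp, which matches the upper bound of the size difference reported above.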
The GC content in these genomes plays a pivotal role in sequence stability. A higher GC content correlates with an increased DNA density, resulting in a more conserved and less flexible sequence [27]. The overall GC content across the five samples remained nearly consistent, ranging from 36.41% to 36.68%. However, the GC content was unevenly distributed within the cp genomes (Table 1). Specifically, the GC content peaked in the IR regions, ranging from 41.10% to 43.22%, and was lowest in the SSC regions, ranging from 29.54% to 30.12%. It has been reported that IR regions have a high GC content due to the presence of rRNAs [28]. This aspect holds significant importance in safeguarding the fundamental content of cp genomes and maintaining the stability of their structures. Here, we also observed that the GC content of rRNA genes (55.24-55.26%) was much higher than that of protein-coding genes (PCGs; 37.58-38.37%), contributing to the high GC content in the IR regions. Moreover, the gene count and intron presence remained highly conserved (Table 1), with a uniform suite of rRNA and tRNA genes across all taxa. The majority of genomes included 85 protein-coding genes, with the exception of O. alismoides, which had 88 protein-coding genes. This increase was due to the presence of the lhbA, rpl22, rps3, and ycf15 genes, while the psbZ gene was absent (Supplementary Table S1). Notably, 16 genes related to photosynthesis and self-replication were present in duplicate within the Ottelia cp genomes, underscoring the key functional aspects of these cp genes [29].
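Region-wise GC content of the kind summarized in Table 1 can be computed directly from a sequence and a set of region coordinates. The sketch below is illustrative only: the function names, the toy sequence, and the region boundaries are assumptions made for the example and are not the genomes or coordinates used in this study.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def regional_gc(genome: str, bounds: dict) -> dict:
    """GC content per region; bounds maps a region name to a 0-based,
    half-open (start, end) interval on the genome string."""
    return {name: gc_content(genome[start:end]) for name, (start, end) in bounds.items()}


# Toy sequence (hypothetical): an AT-only stretch followed by a GC-rich stretch.
toy_genome = "ATATATATAT" + "GCGCATGCGC"
toy_bounds = {"AT-rich region": (0, 10), "GC-rich region": (10, 20)}

overall = gc_content(toy_genome)
per_region = regional_gc(toy_genome, toy_bounds)
print(f"overall: {overall:.2f}%")
for name, value in per_region.items():
    print(f"{name}: {value:.2f}%")
```

For a real plastome, the bounds would be the LSC, SSC, and IR coordinates taken from the genome annotation.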

SSRs and Repeats
The cp genome is uniparentally inherited, and there is significant variation in the simple sequence repeats (SSRs) among individuals of the same species [30]. Thus, these SSRs are useful molecular markers for studying development and identifying species [31]. Furthermore, SSRs are commonly used as genetic markers in studies related to community genetics and evolution [32]. The SSRs and repetitive elements within the cp genomes of the five Ottelia samples were analyzed using MISA. We observed a total of 75-88 SSRs in the cp genomes of the five Ottelia samples (Figure 2A). Of these, mononucleotide SSRs predominated, followed by dinucleotide repeats, a pattern previously reported in other angiosperms [28,29]. The prevalent A/T motifs in the mononucleotide SSRs enriched the A and T bases within the cp genomes (Figure 2B). Previous studies [33,34] have confirmed that these were the most commonly utilized bases among all types of SSRs. Because A/T base pairs contain less nitrogen than G/C base pairs, an increase in A/T bases may result in base mutations requiring less energy [35]. Base bias may impact not only the SSR types and codon bias but also the stability of the four regions of the cp genome [36]. Using a comparative analysis, we found that the count of A/T repeats varied among species, ranging from 29 in O. guanyangensis and O. acuminata to 37 in O. acuminata var. jingxiensis (Supplementary Table S2). Most SSRs were shared by the five samples, with the exception that the tetranucleotide repeat TATG and the pentanucleotide repeat CTATT only existed in O. alismoides. Our analysis also examined the distribution of SSRs across the LSC/SSC/IR regions (Figure 2C). The number of SSRs within the LSC region varied from 51 to 64 across the five Ottelia samples, notably surpassing the counts observed in both the SSC region (ranging from 13 to 21) and the IR regions (ranging from 4 to 14). The IR region consistently exhibited the lowest SSR count, emphasizing its high level of conservation. Notably, O. alismoides had more SSRs in the IR region than the other species did, offering a potential reference point for the identification of O. alismoides. Currently, there is a limited number of published studies on SSR markers in the Ottelia cp genome. Identifying SSRs in the Ottelia cp genome may serve as a basis for creating molecular markers and studying genetic diversity in the genus Ottelia.
Repeats larger than 30 bp, termed long repeat sequences, contribute to cp genome rearrangement [37]. Our study centered on interspersed repeated sequences, which include four types of long repeat sequences: complement repeats (Cs), forward repeats (Fs), palindromic repeats (Ps), and reverse repeats (Rs). Among these, Ps were the most prevalent in most species, followed by Fs and Rs (Figure 2D). Cs were consistently rare across all five samples, in line with a previous report [38]. Repeats were predominantly concentrated in the LSC region, with far fewer in the SSC and IR regions. This pattern was observed across all five Ottelia samples (Figure 2E) and is consistent with previous studies [39,40]. Most of these repeats spanned between 21 and 30 bp, while repeats between 41 bp and 60 bp in length were absent (Figure 2F). One repeat surpassing 60 bp was shared by the five Ottelia samples.

Codon Usage Analysis
While the universal codon table is shared across all organisms, indicating a common ancestry, biological evolution has led to the development of various biases. Different species, and even different genes within the same species, preferentially use particular synonymous codons for the same amino acid, a phenomenon known as codon bias. Relative synonymous codon usage (RSCU), a commonly employed metric for evaluating codon bias, helps mitigate the influence of amino acid composition on codon counts [41]. The protein-coding genes within the complete cp genome of Ottelia are encoded by 61 codons, representing 20 amino acids. RSCU values exceeding 1 indicate that a codon is used more often than expected under equal synonymous usage, with values surpassing 1.3 indicating high-frequency codons, as described in a study on the cp genome of the medicinal plant Lonicera japonica [42]. An analysis of the Ottelia cp genome (Table 2) revealed 23 high-frequency codons, predominantly ending in either A or T. It has been reported that, in the chloroplast genomes of angiosperms, most codons show a preference for A/T at the third codon position [39,40], which is consistent with our results. The main reason for this situation may be related to the abundance of A or T in the IR region [43]. In contrast, codons ending in C or G, such as CTC (Leu), CGG (Arg), CTG (Leu), AGC (Ser), TAC (Tyr), and GGC (Gly), exhibit relatively low RSCU values. This finding aligns with previous related reports [39,40].
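RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below shows that calculation under stated assumptions: it uses the standard genetic code, excludes stop codons, and runs on a made-up coding sequence. It is an illustration of the metric, not the CodonW implementation used in this study, and the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE_CODONS = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]


def rscu(cds_list):
    """RSCU per sense codon: observed count divided by the mean count of
    all synonymous codons for the same amino acid."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE and CODON_TABLE[codon] != "*":
                counts[codon] += 1

    by_amino_acid = {}
    for codon in SENSE_CODONS:
        by_amino_acid.setdefault(CODON_TABLE[codon], []).append(codon)

    values = {}
    for codons in by_amino_acid.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input; RSCU undefined
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy coding sequence (hypothetical): four alanine codons, three GCT and one GCA.
toy_values = rscu(["GCTGCTGCTGCA"])
high_frequency = sorted(c for c, v in toy_values.items() if v > 1.3)
print(len(SENSE_CODONS), toy_values["GCT"], toy_values["GCA"], high_frequency)
```

In the toy input, alanine has four synonymous codons and four observations, so the expected count per codon is 1; GCT therefore has an RSCU of 3.0 and is the only codon above the 1.3 threshold mentioned in the text.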

IR Expansion and Contraction
IR expansion and contraction are the principal factors driving the variation in cp genome size, exerting a significant influence on species evolution [44]. Thus, a comparative analysis was performed on the IR junction regions within the five cp genomes of the Ottelia samples to investigate potential expansion or contraction events. Except for O. alismoides, the lengths of the IR regions remained relatively consistent across the other four Ottelia samples (Figure 3). Previous studies have shown that the diverse lengths observed in cp genomes stem from the expansion and contraction events within IR regions [32], a finding that aligns with our own results.
The IRa/SSC (JSA) junction was consistently found in the ycf1 gene across all five samples. The length of the ycf1 gene ranged from 5489 to 5501 bp, and it spanned from 10 to 2953 bp within the IRa region. The distance between the ndhF gene and the IRb/SSC (JSB) boundary in the SSC region ranged from 70 bp to 424 bp. Kim et al. [45] proposed that the ndhF gene strongly correlated with the IR/SSC junction stability. In the present study, the ndhF gene lies very near the SSC/IR junction, as is typical of most angiosperm plastomes. With the exception of O. alismoides, whose IRb/LSC (JLB) junction was located in the rpl16 gene, all studied Ottelia species exhibited the rps19 gene positioned in the LSC region at a distance of 35 bp to 38 bp from the IRb/LSC (JLB) junction. The IRa/LSC (JLA) junction of O. alismoides, O. acuminata, and O. guanyangensis was precisely located in the trnH gene within the LSC region. In O. acuminata var. jingxiensis, the trnH gene spanned the IRa/LSC (JLA) boundary, extending 1 bp into the IRa region, whereas in O. fengshanensis 1 bp of the trnH gene was located in the IRa region near the IRa/LSC junction.

Variations within the cp Genome Synteny Analysis
Comparing the variations in cp genome sequences among different taxa enables the identification of informative DNA fragments and contributes to the development of species identification techniques and the exploration of population diversity [46]. The genetic variations among the five Ottelia samples were assessed using mVISTA. The entire cp genomes were compared with the reference sequence of O. balansae (Figure 4). Overall, the protein-coding regions showed high conservation across the different species, while intergenic spacers such as rps16, trnQ-UUG-psbK, ndhC-trnM-CAU, atpB-rbcL, rpl16, petN-psbM, and trnN-GUU-ndhF exhibited significant variability. These hotspot regions can be employed for DNA barcoding and phylogenetic analyses within the genus Ottelia. Previous research has indicated that LSC regions exhibit a lower variability than IR and SSC regions. Additionally, non-coding regions are more susceptible to mutations than coding regions [47]. Consistent with this characteristic, our study specifically identified highly variable regions within the intergenic spacers instead of within the coding regions.
Among the different species analyzed, O. alismoides exhibited the greatest divergence compared to the other five species, suggesting that O. alismoides is phylogenetically distinct from the other species. O. fengshanensis, O. guanyangensis, and O. acuminata displayed similar peaks at the positions trnQ-UUG-psbK, ndhC-trnM-CAU, and atpB-rbcL. Additionally, only minimal variation was observed between O. fengshanensis and O. guanyangensis, indicating a close relationship between these two species, which is in accordance with a previous study [23]. The loci rbcL, matK, and trnH-psbA, together with the nuclear ribosomal internal transcribed spacers, are suggested loci for DNA barcoding in plants [48]. However, in our study of Ottelia, these plastid markers had extreme variations across the five samples. We identified two variable non-coding regions, petN-psbM and trnN-GUU-ndhF, which might be regarded as potential molecular markers for Ottelia. Further work is still necessary to determine whether these highly variable regions could be used in Ottelia phylogenetic analyses or serve as candidate DNA barcodes.

Phylogenetic Analysis
A preliminary phylogenetic tree was constructed using the protein-coding regions of 12 Ottelia samples, with Hydrocharis dubia and H. chevalieri designated as outgroups, employing the maximum likelihood method (Figure 5). The 12 Ottelia samples were fully supported (100 BS) as monophyletic. Overall, O. ulvifolia, which is widespread in Africa [1], was found to be sister to all remaining species. O. ovalifolia from Oceania was found to be closely related to the clade consisting of Asian species. Among the five cp genomes sequenced in this study, O. fengshanensis, O. acuminata, and O. guanyangensis formed a cluster with strong support (100 BS). O. acuminata var. jingxiensis grouped with O. acuminata var. lunanensis (100 BS), and both were then shown to be closely related to O. balansae. Additionally, the widespread species O. alismoides was resolved as sister to all Chinese endemic taxa.
Phylogenomic analyses leveraging plastomes have proven successful in unraveling intricate relationships within taxonomically complex plant groups [24,25]. In this study, our phylogenomic analysis not only reaffirmed the monophyly of Ottelia but also offered a comprehensive understanding of the relationships within the genus. This finding aligns with conclusions drawn in previous reports [2,23]. At the intrageneric level, the clusters strictly corresponded to the geographical distribution. The Asian species, including those from China, those from Indonesia, and the widely distributed O. alismoides, formed the initial cluster. This clustering pattern corresponds to the findings of a previous study [2] using nrITS and plastid genes. Before our investigation, O. acuminata and O. balansae were thought to be monophyletic sisters [5]. However, our study revealed that O. balansae was nested within the varieties of O. acuminata.
Our findings suggest that the plastid phylogenomic tree aligns with the reported morphological variations in Ottelia, as documented by Cook et al. [1] and He [12]. That is, several morphological features are shared by O. balansae, O. fengshanensis, O. acuminata, and O. guanyangensis: for example, a wholly submerged leaf, which is triplinerved with obvious crossveins; three white petals with a yellow base, obcordate or obovate; narrowly elliptic seeds; and hexagonal-cylindric fruit. This also suggests that O. balansae might be a variety of O. acuminata. Therefore, the existing classification of Ottelia warrants further examination using nuclear genomic methods, such as single-copy homologous gene analysis and restriction site-associated DNA sequencing. Another branch, O. alismoides, can be distinguished by its colorful petals (e.g., white, pink, blue, or purple) and longer petioles (up to 50 cm in length). These results provide new insights for resolving the taxonomic intricacies within the genus Ottelia.

Material Collection and DNA Extraction
Fresh leaves from five Ottelia samples representing four distinct species, namely, O. fengshanensis, O. guanyangensis, O. acuminata var. jingxiensis, O. acuminata, and O. alismoides

Sequencing and Assembly of Chloroplast Genome
For library construction, 300 ng of DNA served as the input material. Sequencing libraries were generated using a VAHTS Universal Plus DNA Library Prep Kit for Illumina (Vazyme, Nanjing, China), with unique index codes assigned to each sample. The input DNA was fragmented (300-500 bp), with end polishing, phosphorylation at the 5′ end, and the addition of A tails at the 3′ end achieved simultaneously. Adapters were then ligated to the ends of the fragmented DNA. The adapter-ligated products were purified or size-selected using an AMPure XP system (Beckman Coulter Inc., Brea, CA, USA) and amplified by PCR, and the library size distribution was assessed with an Agilent 2100 Bioanalyzer. Library quantification was performed using real-time PCR. The index-coded samples were clustered on a cBot Cluster Generation System (Illumina Inc., San Diego, CA, USA). After clustering, the libraries were sequenced on an Illumina NovaSeq 6000 platform, yielding paired-end reads of 150 bp.
The raw paired-end reads were evaluated for quality using FastQC v0.11.7. Following quality evaluation, the data were assembled with the de novo assemblers Fast-Plast or GetOrganelle to obtain the optimal contigs. During the assembly process, seed sequences sourced from GenBank were utilized. The cp sequence of Photinia beckii (MN577889) was used as the reference for O. fengshanensis, O. guanyangensis, O. acuminata var. jingxiensis, and O. acuminata. The cp sequence of Diospyros dumetorum (MF179487) was used as the reference for O. alismoides.

Chloroplast Genome Structural Analysis
The cp genome sequence of O. balansae, obtained from GenBank, and the five newly sequenced cp genomes of the Ottelia samples were used to analyze repeat sequences as well as simple sequence repeats (SSRs). The SSRs in the cp genome sequences of the five Ottelia samples were identified using the Perl script MISA, with the minimum number of repeat units set as follows: mononucleotides (10), dinucleotides (6), trinucleotides (5), tetranucleotides (5), pentanucleotides (5), and hexanucleotides (5). The REPuter program was employed to classify repeat sequences. The criteria for identifying repeat sequences in the cp genome included a Hamming distance of 3, a minimum size of 30 bp, and a sequence identity ≥ 90%. Relative synonymous codon usage (RSCU) and amino acid frequency within the protein-coding gene region were assessed using CodonW version 1.4.4 (http://codonw.sourceforge.net//culong.html (accessed on 5 January 2024)).
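The SSR thresholds above translate into a simple search for perfect tandem repeats. The following is a minimal sketch that applies the same minimum repeat counts with regular expressions; it is an illustrative simplification and not MISA itself (it ignores compound SSRs and ambiguous bases), and the function names and the toy sequence are assumptions made for the example.

```python
import re

# Minimum number of repeat units per motif length, as listed in the text.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit
    (so 'AA' or 'ATAT' are not reported as di- or tetranucleotide motifs)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq: str, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs.
    Coordinates are 0-based and half-open."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        pos = 0
        while pos < len(seq):
            m = pattern.match(seq, pos)
            if m and is_primitive(m.group(1)):
                hits.append((m.start(), m.end(), m.group(1), (m.end() - m.start()) // unit_len))
                pos = m.end()  # continue after the recorded repeat
            else:
                pos += 1
    return sorted(hits)


# Toy sequence (hypothetical): an A homopolymer, an AT repeat, and a TATG repeat.
toy = "GC" + "A" * 12 + "CG" + "AT" * 6 + "CC" + "TATG" * 5 + "C"
ssrs = find_ssrs(toy)
for start, end, motif, n in ssrs:
    print(f"{start}-{end}\t({motif}){n}")
```

Counting the returned tuples by motif length or by region coordinates would give per-type and per-region tallies of the kind summarized in Figure 2A-C.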

Variations in the Chloroplast Genomes
The mVISTA comparative genomics server was used to generate a sequence variation map, with the annotation of the O. balansae cp genome as the reference. Variation in the inverted repeat (IR) sequences, encompassing expansion and contraction events, was assessed using the IRscope online program.

Phylogenomic Reconstruction
A phylogenomic analysis was conducted based on the five newly sequenced cp genomes of the Ottelia samples alongside those of ten Ottelia species retrieved from GenBank. Sophora velutina and S. tomentosa were designated as outgroups for the phylogenetic analysis. Multiple sequence alignment was performed with MAFFT (Multiple Alignment using Fast Fourier Transform) [49]. Gblocks v0.91b was then employed to retain the conserved blocks of the alignment and remove poorly aligned regions.
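The idea of trimming an alignment down to reliably aligned positions can be sketched in Python. This is a deliberately simplified, per-column filter and not the Gblocks algorithm, which evaluates contiguous blocks and their flanking positions. The function names and the `min_identity` threshold are assumptions for illustration only.

```python
from collections import Counter


def conserved_columns(alignment, min_identity=0.5, gap_chars="-?"):
    """Return indices of gap-free columns whose most common residue
    is present in at least `min_identity` of the sequences.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    rows = [seq.upper() for seq in alignment.values()]
    if len({len(r) for r in rows}) != 1:
        raise ValueError("alignment must be non-empty with equal-length rows")
    keep = []
    for col in range(len(rows[0])):
        column = [r[col] for r in rows]
        if any(c in gap_chars for c in column):
            continue  # drop any column containing a gap
        top_count = Counter(column).most_common(1)[0][1]
        if top_count / len(rows) >= min_identity:
            keep.append(col)
    return keep


def filter_alignment(alignment, **kwargs):
    """Return a new alignment containing only the retained columns."""
    keep = conserved_columns(alignment, **kwargs)
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


if __name__ == "__main__":
    toy = {"a": "ACG-TA", "b": "ACGTTA", "c": "ACCTTT"}
    print(filter_alignment(toy, min_identity=0.6))
```

A permissive threshold keeps variable but gap-free columns, which carry the phylogenetic signal; setting `min_identity=1.0` would keep only invariant sites and is shown in the test purely to make the behaviour explicit.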
The maximum likelihood (ML) method with 1000 bootstrap replicates was applied in MEGA 11.0 [50] to infer the phylogenetic relationships. An ML analysis was conducted using RAxML v8.2.12 [51], with a burn-in of 25% discarded and samples collected every 1000 generations. Stationarity was deemed to be achieved when the average standard deviation of split frequencies fell below 0.01. The phylogenetic tree was visualized using FigTree v1.4.4.
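Bootstrap support, as reported for the ML tree, rests on resampling alignment columns with replacement and re-inferring a tree from each pseudo-replicate. The Python sketch below shows only the resampling step under that general definition. The function name `bootstrap_replicates` is hypothetical, and tree inference itself is left to programs such as MEGA or RAxML.

```python
import random


def bootstrap_replicates(alignment, n_replicates=1000, seed=None):
    """Yield pseudo-replicate alignments built by sampling columns
    with replacement, each the same length as the original.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    rng = random.Random(seed)
    for _ in range(n_replicates):
        # The same sampled column indices are applied to every sequence,
        # so each resampled site keeps its original pattern across taxa.
        cols = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(alignment[name][c] for c in cols)
               for name in names}


if __name__ == "__main__":
    toy = {"x": "ACGT", "y": "ACGA"}
    for replicate in bootstrap_replicates(toy, n_replicates=3, seed=1):
        print(replicate)
```

The support value for a clade is then the percentage of replicate trees in which that clade appears, so a value of 100 means the clade was recovered in every replicate.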

Conclusions
In our study, five complete cp genomes of Ottelia (O. fengshanensis, O. acuminata var. jingxiensis, O. guanyangensis, O. acuminata, and O. alismoides) were sequenced, assembled, and comparatively analyzed. The cp genome structure and GC content of the five Ottelia samples were comparable, indicating substantial conservation. We observed 75–88 SSRs in the cp genomes of the five Ottelia samples. Several regions, such as rps16, trnQ-UUG-psbK, petN-psbM, ndhC-trnM-CAU, atpB-rbcL, rpl16, and trnN-GUU-ndhF, exhibited significant variability among the five sequenced samples. A preliminary phylogenetic tree was constructed using the protein-coding regions of 12 Ottelia samples. Overall, the clusters strictly corresponded to the geographical distribution. This study contributes to the collection of cp genome sequences, which can enhance species identification and facilitate phylogenetic analyses in future Ottelia investigations.

Supplementary Materials:
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/horticulturae10060603/s1, Table S1: List of predicted genes in the Ottelia chloroplast genome; Table S2: Details of SSRs in the Ottelia chloroplast genome; Table S3: GenBank accession numbers of species used in this study.

Figure 1. A chloroplast genome gene map for five Ottelia samples. Genes are transcribed clockwise on the outer circle and counterclockwise on the inner circle. Functional groups of genes are color-coded for ease of identification. Additionally, variations in shading on the inner circle denote the GC content (darker gray) and AT content (lighter gray) of the chloroplast genome.

Figure 2. Analysis of simple sequence repeats (SSRs) and large repeats in the chloroplast genomes of five Ottelia samples: (A) number of SSRs; (B) frequency of common motifs; (C) types of SSRs; (D) four types of repeats; (E) frequency of repeats in the LSC, IR, and SSC regions; (F) frequency of repeats by length.

Figure 3. Comparison of the junctions between the LSC, SSC, and IR regions across chloroplast genomes of five Ottelia samples.

Figure 4. The configuration of six chloroplast genomes visualized using mVISTA, with Ottelia balansae as the reference. The vertical axis shows identity percentages ranging from 50% to 100%, and the horizontal scale represents positions in the chloroplast genome. Various colors differentiate between exons, introns, untranslated regions (UTRs), and conserved non-coding sequences (CNSs).

Figure 5. Preliminary phylogenetic tree of five Ottelia samples and their related species based on protein-coding genes of chloroplast genomes. Chloroplast genome sequences of the related species were downloaded from GenBank. Different colors indicate the regions in which the species are distributed.

Table 1. Chloroplast genome features of five Ottelia samples.

Table 2. The RSCU of chloroplast genes in five Ottelia samples.
Note: RSCU values greater than 1 are highlighted in bold.