Comparative analysis of the complete chloroplast genomes of thirteen Bougainvillea cultivars from South China with implications for their genome structures and phylogenetic relationships

Bougainvillea spp., belonging to the Nyctaginaceae family, have high economic and horticultural value in South China. Owing to the high similarity in leaf appearance and to hybridization among Bougainvillea species, especially Bougainvillea × buttiana, their phylogenetic relationships are complicated and controversial. In this study, we sequenced, assembled and analyzed thirteen complete chloroplast genomes of Bougainvillea cultivars from South China, including ten B. × buttiana cultivars and three other Bougainvillea cultivars, and identified their phylogenetic relationships within the Bougainvillea genus and other species of the Nyctaginaceae family for the first time. These 13 chloroplast genomes had typical quadripartite structures, comprising a large single-copy (LSC) region (85,169–85,695 bp), a small single-copy (SSC) region (18,050–21,789 bp), and a pair of inverted-repeat (IR) regions (25,377–25,426 bp). These genomes each contained 112 different genes, including 79 protein-coding genes, 29 tRNAs and 4 rRNAs. The gene content, codon usage, simple sequence repeats (SSRs), and long repeats were essentially conserved among these 13 genomes. Single-nucleotide polymorphisms (SNPs) and insertions/deletions (indels) were detected among these 13 genomes. Four divergent regions, namely, trnH-GUG_psbA, trnS-GCU_trnG-UCC-exon1, trnS-GGA_rps4, and ccsA_ndhD, were identified from the comparative analysis of 16 Bougainvillea cultivar genomes. Among the 46 chloroplast genomes of the Nyctaginaceae family, nine genes, namely, rps12, rbcL, ndhF, rpoB, rpoC2, ndhI, psbT, ycf2, and ycf3, were found to be under positive selection at the amino acid site level.
Phylogenetic analyses of the Bougainvillea genus and other species of the Nyctaginaceae family based on complete chloroplast genomes and protein-coding genes revealed that the Bougainvillea genus was sister to the Belemia genus with strong support and that 35 Bougainvillea individuals were divided into 4 strongly supported clades, namely, Clades Ⅰ, Ⅱ, Ⅲ and Ⅳ. Clade Ⅰ comprised 6 individuals, including 2 cultivars, namely, B. × buttiana ‘Gautama’s Red’ and B. spectabilis ‘Flame’. Clade Ⅱ contained only Bougainvillea spinosa. Clade Ⅲ comprised 7 individuals of wild species. Clade Ⅳ comprised 21 individuals, including 11 cultivars, namely, B. × buttiana ‘Mahara’, B. × buttiana ‘California Gold’, B. × buttiana ‘Double Salmon’, B. × buttiana ‘Double Yellow’, B. × buttiana ‘Los Banos Beauty’, B. × buttiana ‘Big Chitra’, B. × buttiana ‘San Diego Red’, B. × buttiana ‘Barbara Karst’, B. glabra ‘White Stripe’, B. spectabilis ‘Splendens’ and B. × buttiana ‘Miss Manila’ sp. 1. In conclusion, this study not only provided valuable genome resources but also helped to identify Bougainvillea cultivars and understand the chloroplast genome evolution of the Nyctaginaceae family.


PMID: 37894819
We revised the section "Phylogenetic relationships in the Bougainvillea genus" as follows: The  including B. campanulata, B. berberidifolia, B. infesta, B. modesta OM44398, B. modesta OM044396, B. stipitata, and B. stipitata var. grisebachiana. In Clade Ⅳ, in
The authors clearly failed to address all four major issues. For major #2, the significance of including the thornless mutant was not well explained or elaborated.
Response: Thank you for your suggestion. The thornless mutant is a variety derived from Bougainvillea × buttiana 'Miss Manila'. The mother line of Bougainvillea × buttiana 'Miss Manila' has so far been sold out commercially. However, the bud mutant has been kept alive and cultivated by grafting. Because the author He-Fa Wang has not yet obtained a certificate of a registered commercial name for the mutant, we refer to the mutant in the manuscript as Bougainvillea × buttiana 'Miss Manila' sp. 1.
For major #4, the breeding techniques do not affect the presentation of SNPs and indels, which are significant in cultivar authentication. It is understandable that plenty of SNPs were discovered in the analysis. Yet, the authors failed to list those valuable for cultivar authentication in a reader-friendly manner.
Response: Thank you for your suggestion. We did not present the SNPs and indels in the way described for cultivars of Hyacinthus orientalis. For three comparisons, there were more than 700 SNPs and 100 indels, which is too many to fit in the text. However, we listed these SNPs and indels in detail in the supplementary file S7 Table, including the base information of the SNPs, mutant type, gene name, gene start, gene end, indel sequence, indel start, indel end, alignment strand and so on.
For major #1 and #3, the authors should obtain more suggestions from experienced taxonomists and phylogeneticists, respectively. A number of minor issues were also not well addressed. The authors should read more articles to find out which format, "the xxx genus/family" or "the genus/family xxx", is correct in English.
Response: Thank you for your suggestion. Both forms have appeared in many articles. We selected the form "the XXX genus/family" in the text.
The authors also failed to add authorities after genus names and species epithets, which are commonly included in scientific research articles.
Response: The names of the 12 Bougainvillea cultivars were adopted from the scientific book as follows.
The presentation of GPS coordinates was in the wrong order.
Response: We checked the GPS coordinates and revised them into the correct order. We reported the GPS details for the sample collection site, the DNA sample storage and the storage of the remaining leaf materials.

Fresh leaves of twelve
Suggestions on figure and table presentation were sadly ignored.
Response: We revised Figures 1 to 9 as follows.
Liu Y, Ruan L, Zhou H, Yu M. Cultivar classification of Bougainvillea. China Forestry Press, Beijing, China, 2020.
Sun L. Molecular identification of cultivars and transcriptome analysis of bracts in Bougainvillea. Ph.D. Chinese Academy of Forestry, Beijing, China, 2019, pp. 38-39.
cultivation factory of Zhangzhou (117°49′9″E, 24°31′33″N) in Xiamen Qianrihong Horticulture Co., Ltd, Fujian Province, China. Fresh leaves were quickly frozen on dry ice, sent to the laboratory of the Environmental Horticulture Research Institute (113°20′39″E, 23°8′51″N) at the Guangdong Academy of Agricultural Sciences, Guangzhou, China, and stored at −80 ℃ until use. Genomic chloroplast DNA was extracted from each sample using the modified sucrose gradient centrifugation method [13]. Then, the DNA quality and quantity were checked through agarose gel electrophoresis and the NanoDrop microspectrometer method, respectively. Each qualified DNA sample was sheared to fragments of approximately 350 bp. Short-insert (350 bp) paired-end libraries were constructed, and sequencing was performed on an Illumina NovaSeq 6000 platform with a paired read length of 150 bp (Biozeron, Shanghai, China). The raw data //www.bioinformatics.babraham.ac.uk/projects/fastqc/), and adaptors and low-quality reads were subsequently deleted by Trimmomatic v. 0.39 [14] with default parameters. The remaining materials, including the leaves and DNA, were deposited in the laboratory of the Environmental Horticulture Research Institute (store sheet code: B2023, 113°20′39″E, 23°8′51″N), Guangdong Academy of Agricultural Sciences, Guangzhou, China, as vouchers (S1 Table

Fig 2. Chloroplast genome map of B. × buttiana 'Mahara' (the outermost three rings) and CGView comparison of 13 complete chloroplast genomes of Bougainvillea cultivars (the inner rings with different colors). Genes shown on the outside of the outermost first ring are transcribed counter-clockwise, and those on the inside are transcribed clockwise. The second ring with the darker gray color corresponds to the GC content, whereas the third ring with the lighter gray color corresponds to the AT content of the B. × buttiana 'Mahara' chloroplast genome generated by OGDRAW. The gray arrowheads indicate the directions of the genes. LSC, large single-copy region; IR, inverted repeat; SSC, small single-copy region. The innermost first black ring indicates the chloroplast genome size of B. × buttiana 'Mahara'. The innermost second and third rings indicate deviations in the GC content and GC skew, respectively, in the chloroplast genome of B. × buttiana 'Mahara': GC skew + indicates G > C, and GC skew − indicates G < C. CGView comparison of the 13 complete chloroplast genomes of Bougainvillea cultivars displayed from the innermost 4th colored ring to the outer 16th ring: B. × buttiana 'Mahara', B. × buttiana 'Gautama's Red', B. × buttiana 'California Gold', B. × buttiana 'Double Salmon', B. × buttiana 'Double Yellow', B. × buttiana 'Big Chitra', B. × buttiana 'Los Banos Beauty', B. glabra 'White Stripe', B. spectabilis 'Flame', B. spectabilis 'Splendens', B. × buttiana 'Barbara Karst', B. × buttiana 'San Diego Red', and B. × buttiana 'Miss Manila' sp. 1, respectively. Similar and highly divergent regions of the chloroplast genomes are represented by continuous and interrupted track lines, respectively.
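The GC content and GC skew tracks described in the Fig 2 caption can be illustrated with a small calculation. The sketch below is not the CGView implementation; it assumes the common definition GC skew = (G − C) / (G + C), treats the sequence as linear rather than circular, and reports raw window values rather than deviations from the genome average. The function names, window size, step and toy sequence are all hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq: str) -> float:
    """GC skew (G - C) / (G + C): positive when G > C, negative when G < C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq: str, window: int, step: int):
    """Yield (start, gc_content, gc_skew) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, gc_content(chunk), gc_skew(chunk)


if __name__ == "__main__":
    toy = "GGGCATATCCCGATGC"  # toy sequence, not real chloroplast data
    for start, gc, skew in sliding_windows(toy, window=8, step=4):
        print(start, round(gc, 3), round(skew, 3))
```

A positive value from `gc_skew` corresponds to the "GC skew +" label in the caption, and a negative value to "GC skew −".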

Fig 3. Distribution of SSRs in the 13 newly sequenced Bougainvillea chloroplast genomes. (A) Numbers of different SSR types detected in the 13 chloroplast genomes. (B) Frequencies of SSRs in the LSC, IR and SSC regions. (C) Frequencies of identified SSR motifs in different repeat class types.
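To make the SSR types counted in Fig 3 concrete, the following is a minimal sketch of how perfect mono-, di- and trinucleotide repeats could be located in a sequence with regular expressions. It is not the tool used in the study; the minimum repeat counts in `MIN_REPEATS`, the restriction to motifs of 1 to 3 bp, and the function names are assumptions chosen for illustration only.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumption, not from the paper)
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def is_primitive(motif: str) -> bool:
    """True if motif is not itself a repetition of a shorter unit (e.g. 'AA' is not)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) further copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. 'AA' x 5, which is already reported as 'A' x 10
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    toy = "CC" + "A" * 10 + "GG" + "AT" * 5 + "C"  # toy sequence, not real data
    print(find_ssrs(toy))
```

Because matches are non-overlapping within each motif length, a repeat that begins inside a skipped non-primitive match can be missed; a dedicated SSR tool should be preferred for real analyses.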

Fig 7. Complete chloroplast genome comparison of the 16 Bougainvillea chloroplast genomes using B. × buttiana 'Mahara' as a reference. The gray arrows and thick black lines above the alignment indicate gene orientation. Purple bars represent exons, sky-blue bars represent untranslated regions (UTRs), red bars represent conserved non-coding sequences (CNS), gray bars represent mRNAs, and white regions represent sequence differences among the analyzed chloroplast genomes. The y-axis represents the identity percentage ranging from 50% to 100%. The 13 Bougainvillea chloroplast genomes sequenced in this study are shown in bold.

Fig 9. Phylogenetic relationships of Nyctaginaceae species based on chloroplast genome sequences reconstructed using maximum likelihood (ML) and Bayesian inference (BI) methods. (A) ML tree. (B) BI tree. The 13 newly sequenced Bougainvillea chloroplast genomes identified in this study are shown in bold.
One bud mutant bearing simple or no thorns, derived from B. × buttiana 'Miss Manila' and designated B. × buttiana 'Miss Manila' sp. 1 (Fig 1, S1 Table), was collected from the

For Table 4, '462 L 0.969' and '462 |L|0.969' were two different forms of presentation. We used the form '462 L 0.969' rather than '462 |L|0.969' because the output of the CodeML program uses the form '462 L 0.969'.
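The '462 L 0.969' form can be read by a few lines of code. The sketch below assumes that the three whitespace-separated fields are the codon site position, the amino acid at that site, and the posterior probability of positive selection, which is how such CodeML site lists are commonly interpreted; the source does not spell this out. The `SelectedSite` type, the function names and the 0.95 threshold are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class SelectedSite(NamedTuple):
    position: int      # codon site position (assumed meaning of the first field)
    amino_acid: str    # one-letter amino acid code (assumed meaning of the second field)
    posterior: float   # posterior probability (assumed meaning of the third field)


def parse_site(entry: str) -> SelectedSite:
    """Parse an entry such as '462 L 0.969' into its three fields."""
    pos, aa, prob = entry.split()
    # Trailing asterisks, if present, are treated as significance marks and dropped
    return SelectedSite(int(pos), aa, float(prob.rstrip("*")))


def filter_sites(entries: Iterable[str], threshold: float = 0.95) -> List[SelectedSite]:
    """Keep sites whose posterior probability meets the hypothetical threshold."""
    return [s for s in map(parse_site, entries) if s.posterior >= threshold]


if __name__ == "__main__":
    print(parse_site("462 L 0.969"))
```

Splitting on whitespace is what makes the '462 L 0.969' form convenient; the '462 |L|0.969' form would need an extra step to strip the bar characters.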