The complete chloroplast genome sequence of Berchemia racemosa Siebold & Zucc. (Rhamnaceae), a rare plant species in Korea

Abstract
Berchemia racemosa Siebold & Zucc., 1845 is a rare species distributed in restricted areas in the western Korean peninsula. In this study, the complete chloroplast genome (plastome) of B. racemosa was sequenced and assembled by Illumina paired-end sequencing. The plastome of B. racemosa was 161,187 bp in length and was quadripartite in structure, including a large single-copy (LSC) region of 89,503 bp, a small single-copy (SSC) region of 18,214 bp, and two inverted repeats of 26,735 bp. The GC content was 37.2%. The plastome of B. racemosa contains 130 genes, including eight ribosomal RNA (rRNA) genes, 37 transfer RNA (tRNA) genes, and 85 protein-coding genes. Phylogenetic analysis using complete genome sequences showed that B. racemosa is most closely related to Berchemia flavescens.


Introduction
Berchemia racemosa Siebold & Zucc., 1845 is a deciduous vine in the Rhamnaceae family that is native to Korea, Japan, and Taiwan. In Korea, B. racemosa is a rare plant distributed only in a limited region in Jeollabuk-do Province, where its natural habitat is designated as a nature reserve. Here, we analyzed the complete chloroplast genome (plastome) sequence of B. racemosa to provide information on its genetic diversity and on the phylogenetic relationships between B. racemosa and other species in the Rhamnaceae family.

Sampling and genome sequencing
Plant materials of B. racemosa were sampled from Gunsan, Jeollabuk-do Province, Korea (126°41′23.10″E, 35°58′41.90″N). Each specimen was photographed with a digital camera to record information about the sampling site in its natural habitat (Figure 1).

Assembly and annotation of chloroplast genome
The complete plastome was assembled using NOVOPlasty v.4.3.1 (Dierckxsens et al. 2017). Gene annotation was performed using GeSeq v.1.59 (Tillich et al. 2017) with the Chloe v.0.1.0, BLATN, and BLATX options. BLAST was used to further identify the positions of the inverted repeat (IR) regions by searching against the published plastome database. A circular map of the complete plastome was generated with CPGView (http://www.1kmpg.cn/cpgview/).
The genomic DNA of B. racemosa was deposited in the Jeollabuk-do Forest Environment Research Institute (contact person, Joon Moh Park; E-mail, joonmoh@korea.kr) under the voucher number JFERI-DNA0020-2. All biological samples and research in this study were approved by the Ethics Committee of Jeollabuk-do Forest Environment Research Institute.
To clarify the phylogenetic position of B. racemosa, the complete plastome sequences of 47 Rosales species and two outgroups (Castanea mollissima and Cucurbita moschata) were downloaded from GenBank and aligned using MAFFT v7.3 (Katoh and Standley 2013). Spurious matches and poorly aligned regions were removed from the multiple sequence alignment with trimAl v.1.2 (Capella-Gutiérrez et al. 2009). The resulting alignment of 127,121 bp was analyzed by the maximum-likelihood (ML) method using IQ-Tree (Nguyen et al. 2015) with TVM+F+R4 as the best-fit substitution model, 1000 ultrafast bootstrap replicates, and SH-aLRT branch support.
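The alignment, trimming, and tree-inference steps described above can be sketched as a sequence of command lines. This is a minimal illustration, not the authors' exact invocation: the file names, the MAFFT `--auto` strategy, the trimAl `-automated1` mode, and the 1000 SH-aLRT replicates are assumptions, since the text reports only the tools, the TVM+F+R4 model, and the 1000 ultrafast bootstrap replicates. The `-bb` spelling follows IQ-TREE 1.x; IQ-TREE 2 documents the same option as `-B`.

```python
import shlex


def build_phylogeny_commands(genomes_fasta, prefix="rosales"):
    """Return (argv, stdout_path) pairs for alignment, trimming, and ML inference.

    File names and the --auto / -automated1 / -alrt settings are illustrative
    assumptions; only the model and bootstrap count come from the text.
    """
    aligned = f"{prefix}.aln.fasta"
    trimmed = f"{prefix}.trim.fasta"
    return [
        # MAFFT writes the alignment to standard output, so it is redirected.
        (["mafft", "--auto", genomes_fasta], aligned),
        # trimAl removes poorly aligned columns from the alignment.
        (["trimal", "-in", aligned, "-out", trimmed, "-automated1"], None),
        # ML tree with the reported model and 1000 ultrafast bootstrap replicates.
        (["iqtree", "-s", trimmed, "-m", "TVM+F+R4",
          "-bb", "1000", "-alrt", "1000"], None),
    ]


def render(commands):
    """Format each (argv, stdout_path) pair as a shell command line."""
    lines = []
    for argv, stdout_path in commands:
        line = shlex.join(argv)
        if stdout_path is not None:
            line += " > " + shlex.quote(stdout_path)
        lines.append(line)
    return lines


if __name__ == "__main__":
    for line in render(build_phylogeny_commands("rosales_plastomes.fasta")):
        print(line)
```

Building the commands as argument lists keeps quoting explicit and lets the same structure be passed to `subprocess.run` once the tools are installed.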

Chloroplast genome features
The complete plastome of B. racemosa (GenBank accession number ON749761) is 161,187 bp long with a GC content of 37.2%. It is composed of a pair of IR regions of 26,735 bp each, a large single-copy (LSC) region of 89,503 bp, and a small single-copy (SSC) region of 18,214 bp (Figure 2). The gene structure of the plastome is nearly identical to that of other Rhamnaceae species (Ma et al. 2017). A total of 130 genes were annotated in the plastome, comprising eight ribosomal RNA (rRNA) genes, 37 transfer RNA (tRNA) genes, and 85 protein-coding genes (PCGs). The rps12 gene is trans-spliced: its 5′ exon is located in the LSC region, while its 3′ exon is duplicated in the IR regions (Figure 3(A)). Thirteen genes, including rps16, atpF, rpoC1, pafI, clpP1, petB, petD, rps16, rpl2, ndhB, ndhA, ndhB, and rpl2, contain one or two introns (Figure 3(B)).
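The reported figures can be checked for internal consistency with a few lines of arithmetic: a quadripartite plastome has length LSC + SSC + 2 × IR, and the three gene classes should sum to the annotated total. The sketch below uses only the numbers given in the text; the `gc_percent` helper is a hypothetical illustration of how a GC content such as 37.2% is computed from a sequence and is demonstrated on a toy string rather than on the actual plastome.

```python
# Region lengths (bp) and gene counts as reported in the text.
REGIONS_BP = {"LSC": 89_503, "SSC": 18_214, "IR": 26_735}
REPORTED_TOTAL_BP = 161_187
GENE_COUNTS = {"rRNA": 8, "tRNA": 37, "protein-coding": 85}
REPORTED_GENES = 130


def quadripartite_length(regions):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return regions["LSC"] + regions["SSC"] + 2 * regions["IR"]


def gc_percent(sequence):
    """Percentage of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    print(quadripartite_length(REGIONS_BP) == REPORTED_TOTAL_BP)
    print(sum(GENE_COUNTS.values()) == REPORTED_GENES)
    print(gc_percent("ATGC"))  # toy sequence, not plastome data
```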
Whole plastome sequences of seven Rhamnaceae species were aligned to compare variation in gene structure within the family. The plastomes of these species showed high similarity in the number, length, and arrangement of genes. In particular, the plastome sequence of B. racemosa showed the highest sequence similarity, 98.7%, with that of Berchemia flavescens (Zhu et al. 2019), except for the inverted orientation of a 13,092-bp fragment from 115,721 to 128,812 bp.
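The size of the inverted fragment follows from its 1-based, inclusive coordinates: 128,812 − 115,721 + 1 = 13,092 bp. The sketch below shows that calculation together with a hypothetical `invert_segment` helper that replaces a segment with its reverse complement, which is one way to bring such a fragment into the same orientation before comparing two plastomes. The helper names are illustrative and the example runs on a toy sequence, not on the published genomes.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def inclusive_length(start, end):
    """Length of a segment given 1-based inclusive start and end positions."""
    return end - start + 1


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only, case preserved)."""
    return seq.translate(_COMPLEMENT)[::-1]


def invert_segment(genome, start, end):
    """Return genome with the 1-based inclusive [start, end] segment inverted."""
    i = start - 1  # convert to 0-based slice start
    return genome[:i] + reverse_complement(genome[i:end]) + genome[end:]


if __name__ == "__main__":
    print(inclusive_length(115_721, 128_812))
    toy = "AAACGTTT"
    flipped = invert_segment(toy, 3, 5)
    print(flipped)
    print(invert_segment(flipped, 3, 5) == toy)  # inverting twice restores the input
```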

Phylogenetic analysis
Complete plastome sequences of 47 species from seven families in the order Rosales were used to explore the phylogenetic position of B. racemosa. The reconstructed phylogenetic tree showed that B. racemosa is most closely related to Berchemia flavescens (Zhu et al. 2019) with high support (BS = 100); both belong to the genus Berchemia in the Rhamnaceae family (Figure 4). The tree also indicated that Rhamnaceae are monophyletic and sister to the Elaeagnaceae family. In addition, it confirmed that all seven families formed monophyletic groups.

Discussion and conclusions
B. racemosa is a rare plant distributed only in a limited region in Jeollabuk-do Province, South Korea. In this study, we reported the complete plastome of B. racemosa together with its genome features. Phylogenetic analysis based on the complete plastome strongly supported an earlier finding that Rhamnaceae are monophyletic and sister to the Elaeagnaceae family (Cheon et al. 2018). In addition, B. racemosa is most closely related to Berchemia flavescens (BS = 100). This study provides insight into the phylogenetic and evolutionary position of B. racemosa within the Rhamnaceae family and the order Rosales.

Disclosure statement
The authors report there are no competing interests to declare.

Funding
This research was supported by the "Research Base Construction Fund Support Program" funded by Jeonbuk National University in 2022.

Data availability statement
The genome sequence data that support the findings of this study are openly available in NCBI (https://www.ncbi.nlm.nih.gov) under the accession no. ON749761. The associated BioProject, SRA, and BioSample numbers are PRJNA835661, SRR21615884, and SAMN30910605, respectively.