The first complete chloroplast genome in Engelhardia sensu stricto, Engelhardia hainanensis Chen: genome characterization and its phylogenetic relationships within the family Juglandaceae

Abstract Trees of Engelhardia are important components of subtropical and tropical forests in south-eastern Asia and have great ecological and economic value. However, phylogenetic relationships within Engelhardioideae (Juglandaceae) remain obscure. In this study, we report the first complete chloroplast genome sequence of Engelhardia sensu stricto, that of Engelhardia hainanensis Chen, a rare species endemic to southern China. The complete chloroplast genome is 161,574 bp in length and has a typical quadripartite structure that includes a large single-copy region of 91,158 bp and a small single-copy region of 18,790 bp separated by a pair of inverted repeats; its overall GC content is 35.8%. A total of 128 genes were identified, including 83 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Furthermore, a phylogenetic tree of Juglandaceae was constructed based on complete chloroplast genome sequences. The tree strongly supports the three-subfamily classification system of Juglandaceae, and E. hainanensis was resolved as sister to two Alfaropsis species. This study provides valuable genomic information for species identification and phylogenetic study in Juglandaceae.


Introduction
Engelhardia Leschenault ex Blume is a group of deciduous, semievergreen, or evergreen trees of Juglandaceae distributed in south-eastern Asia from eastern Pakistan to New Guinea (Lu et al. 1999). Members of this group are important components of subtropical and tropical forests; their leaves are brewed as tea, and their bark is used by local communities to poison fish. Despite this ecological and economic value, the subgeneric classification and species number of Engelhardia remain open to question (Meng et al. 2022). It has been proposed that Engelhardia be divided into Engelhardia sensu stricto (s.s.), which contains ca. five species, and the monotypic genus Alfaropsis Iljinsk., which contains only Alfaropsis roxburghiana (Wall.) Iljinsk. (Iljinskaya 1993). A distinct morphological difference between Engelhardia s.s. and Alfaropsis is the prophyllum, which is conspicuous and envelops the fruit in the former but is absent in the latter. Engelhardia hainanensis Chen (Chen 1981) is a rare species that has the longest winged fruit bracts in the genus (ca. 8 cm; Figure 1), and it was thought to be endemic to Hainan Island, China. However, we discovered it in Guangxi Zhuang Autonomous Region during our recent field investigation. To better understand the phylogenetic relationships within Engelhardia and Juglandaceae, and to provide more genomic data for accurate identification of the toxic Engelhardia species, we sequenced and analyzed the first complete chloroplast genome of Engelhardia s.s., that of E. hainanensis.

Sample collection and preservation
Fresh leaves of E. hainanensis were collected in Lingyun County, Guangxi, China (24°31′52.86″N, 106°27′43.67″E). Leaves were dried with silica gel in the field. Once completely dried, the leaf tissues were stored in a −20 °C freezer until further use. A voucher specimen (collector and collection number: Xian-Yun Mu (xymu85@bjfu.edu.cn), MU4973) was deposited at the herbarium of Beijing Forestry University (www.bjfu.edu.cn).

DNA extraction, sequencing, and read cleaning
Genomic DNA was extracted from the silica-dried leaf tissue described above using the DNAsecure Plant Kit (Tiangen Biotech Co. Ltd., Beijing, China) and sequenced on the Illumina HiSeq X Ten platform. In total, 4.1 Gb of 150-bp clean reads were generated for chloroplast genome assembly.
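For orientation, the reported yield can be converted into an approximate read count. The short Python sketch below is only a back-of-the-envelope calculation: it assumes that "4.1 Gb" means 4.1 × 10^9 bases and that every clean read is exactly 150 bp, and the function name is hypothetical rather than part of the pipeline used in this study.

```python
def estimated_read_count(total_bases: float, read_length: int) -> int:
    """Approximate number of reads for a given total yield and read length."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return round(total_bases / read_length)


# Assumptions (not stated in the text): 1 Gb = 1e9 bases; all reads are 150 bp.
reads = estimated_read_count(4.1e9, 150)
print(f"approximately {reads:,} clean reads")
```

Because these reads come from total genomic DNA, only a fraction of them are expected to derive from the plastome, so this figure should not be read as plastome sequencing depth.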

Assembly, annotation, and visualization
The complete chloroplast genome was assembled with GetOrganelle 1.7.5 and annotated with CPGAVAS2, and the annotation was further verified in Geneious Prime 2022 (Kearse et al. 2012) with A. roxburghiana (Ling and Zhang 2020) as a reference. The complete plastome of E. hainanensis has been deposited in GenBank under accession number OM302449. A circular map showing the features of its plastome was generated using the CPGView online web server. The map contains six tracks. From the inner circle, the first track depicts the dispersed repeats, connected by red (forward direction) and green (reverse direction) arcs. The second track shows the long tandem repeats as short blue bars. The third track displays the short tandem repeats, or microsatellite sequences, as short bars with different colors. The fourth track depicts the sizes of the inverted repeat (IRa and IRb), small single-copy (SSC), and large single-copy (LSC) regions. The fifth track plots the distribution of GC content along the plastome. The sixth track displays the genes belonging to different functional groups as differently colored boxes. The outer and inner genes are transcribed in the clockwise and counterclockwise directions, respectively.
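Two of the quantities shown on the map can be illustrated with simple arithmetic. The Python sketch below first derives the inverted repeat length implied by the total, LSC, and SSC sizes reported in the abstract (the IR length itself is not stated in the text, so the result is a derived value, not a reported one), and then computes windowed GC content in the spirit of the fifth track. The function names, the window parameters, and the toy sequence are hypothetical and do not reproduce CPGView's own procedure.

```python
from typing import List


def implied_ir_length(total: int, lsc: int, ssc: int) -> int:
    """Length of each inverted repeat implied by total = LSC + SSC + 2 * IR."""
    remainder = total - lsc - ssc
    if remainder <= 0 or remainder % 2 != 0:
        raise ValueError("region sizes are inconsistent with a quadripartite structure")
    return remainder // 2


def gc_windows(seq: str, window: int, step: int) -> List[float]:
    """GC percentage in consecutive windows of a linear sequence."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(100.0 * (chunk.count("G") + chunk.count("C")) / window)
    return values


# Sizes taken from the abstract; the IR length is derived here, not reported.
ir = implied_ir_length(total=161_574, lsc=91_158, ssc=18_790)
print(f"implied length of each IR: {ir} bp")

# Toy sequence (hypothetical), used only to show the windowed GC calculation.
print(gc_windows("GGCCAATTGCAT", window=4, step=4))
```

The same kind of tally applies to the gene counts: 83 protein-coding, 37 tRNA, and 8 rRNA genes sum to the 128 genes reported.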

Phylogenetic reconstruction
To determine the phylogenetic position of E. hainanensis in Juglandaceae, 40 species of Juglandaceae and three species from the families Betulaceae, Fagaceae, and Myricaceae were selected based on previous studies (e.g., Zhang et al. 2019; Mu et al. 2020; Song et al. 2020). The corresponding plastome sequences were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/). All sequences were aligned using MAFFT (Katoh et al. 2019). A maximum likelihood tree with bootstrap support values (1000 replicates) was constructed under the GTR + G model in RAxML (Kozlov et al. 2019). The final tree was edited using the iTOL version 5.0 online web server (https://itol.embl.de/) (Letunic and Bork 2016).
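The bootstrap support values mentioned above rest on resampling alignment columns with replacement. The Python sketch below illustrates only that resampling step on a toy alignment; it is not RAxML's implementation, it performs no tree inference, and the taxon names, sequences, and random seed are hypothetical.

```python
import random
from typing import Dict


def bootstrap_alignment(aln: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement (one bootstrap pseudo-replicate)."""
    lengths = {len(seq) for seq in aln.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices with replacement; the same indices are applied
    # to every taxon so that each sampled column stays intact.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in aln.items()}


# Toy alignment (hypothetical taxa and sequences).
toy = {"taxonA": "ACGTACGT", "taxonB": "ACGTTCGT", "taxonC": "ACCTACGA"}
rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "pseudo-replicate alignments generated")
```

In a full analysis, a tree would be inferred from each pseudo-replicate, and the support for a branch would be the percentage of replicate trees that contain it.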

Conclusions
We report the first complete chloroplast genome sequence of Engelhardia s.s., that of E. hainanensis. The assembled circular plastome is 161,574 bp in length. The phylogenetic analysis strongly supports the three-subfamily classification system of Juglandaceae, and E. hainanensis was resolved as sister to two Alfaropsis species. The plastome sequence of E. hainanensis presented here provides valuable genomic information for further species identification and phylogenetic study in Juglandaceae.

Ethical approval
This study includes no endangered plant samples, and the sampling site is not located in any protected area. Plant materials were collected in accordance with local regulations and with the permission of local authorities.

Author contributions
XYM conceived and designed the work. XYM and YHQ performed field collection and identification of plant material. XYM and YMW assembled and annotated the plastome sequence. YMW prepared genome sequences, performed phylogenetic analysis, and visualized figures. XYM and YMW wrote the manuscript. YMW, XYM and YHQ reviewed drafts of the paper. All authors discussed the results, commented on the manuscript, and approved the current version of the manuscript.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the Natural Science Foundation of China [grant no. 32070235].

Data availability statement
The genome sequence data that support the findings of this study are openly available in NCBI GenBank (https://www.ncbi.nlm.nih.gov/) under accession no. OM302449. The associated BioProject, SRA, and BioSample numbers are PRJNA797548, SRP355546, and SAMN25010399, respectively.