The complete chloroplast genome of Chinese medicine (Psoralea corylifolia): Molecular structures, barcoding and phylogenetic analysis
Introduction
Leguminosae is the third-largest family of angiosperms, comprising 751 genera and 19,000 species (Christenhusz and Byng, 2016; Stevens, 2008). The subfamily Papilionoideae of Leguminosae comprises several tribes, including Fabeae, Galegeae, Indigofereae, Loteae, Millettieae, Phaseoleae and Psoraleeae (Wojciechowski et al., 2000). Among these tribes, Psoraleeae shows elevated nucleotide substitution rates relative to other tribes, suggestive of rapid evolution or diversification (Egan and Crandall, 2008). The genus Psoralea of Psoraleeae contains about 120 species, mainly distributed in southern Africa, North and South America and Australia; only one species is found in China (Committee, 1995). P. corylifolia (Chinese name Buguzhi) is an annual plant with pale-purple flowers. Owing to its minute brown glands, the plant has a distinctive and pleasant fragrance (Miller and Miranda, 1998). The seed of P. corylifolia is traditionally used to treat menopause, depression and kidney deficiency (Wang et al., 2011; Zhao et al., 2005). Previous pharmacological studies indicate that it has antioxidant, antimicrobial, anti-inflammatory and chemoprotective properties. Thus, P. corylifolia has been widely used in traditional Chinese medicine to treat multiple diseases (Zhang et al., 2016). Given its medicinal value, P. corylifolia deserves further utilization and development. With the rapid growth of next-generation sequencing (NGS) technologies, genomic resources have become available at reasonable cost and turnaround time (Mardis, 2008). However, despite the importance of P. corylifolia, studies of its genetic characteristics are scarce. Therefore, the genetic diversity and phylogenetic status of this species need to be analyzed by molecular techniques.
The chloroplast genome typically has a conserved circular double-stranded structure of 120 to 160 kb, consisting of two inverted repeats (IRs) separated by the large and small single-copy (LSC and SSC) regions (Bock, 2007; Jansen and Ruhlman, 2012). However, previous studies have revealed notable structural variation in the chloroplast genome within the subfamily Papilionoideae (Wang et al., 2017). Most species in Papilionoideae possess a large (50-kb) inversion in their chloroplast genomes (Doyle et al., 1996). Moreover, the loss of IRs has been detected in many species of Papilionoideae, such as Glycyrrhiza glabra, Lens culinaris and Vicia faba (Sabir et al., 2014). Because of its maternal inheritance and comparatively independent evolution, genes from the chloroplast genome have been used for molecular identification and phylogenetic analysis (Zhou et al., 2017). Although previous studies have recommended some loci as plant barcodes, including rbcL, matK, trnH-psbA and trnL-trnF (Ferri et al., 2015; Hollingsworth et al., 2011), the complete chloroplast genome may be more appropriate as a super-barcode for species identification (Xia et al., 2016).
Despite the importance of this group of plants, the classification and phylogenetic relationships of Psoraleeae remain poorly known, and no chloroplast genome has been reported for any member of this tribe. Therefore, in this research, we sought to obtain the first complete chloroplast genome sequence of P. corylifolia and compare it with the published chloroplast genomes of the related genus Glycine, including G. max (Saski et al., 2005), G. soja (Gao and Gao, 2017b) and G. gracilis (Gao and Gao, 2017a). Our aim is to assemble and characterize the complete chloroplast genome of P. corylifolia and to provide phylogenetic and genetic information for future studies of Psoraleeae and legume plastomes.
Chloroplast genome sequencing and assembly
Dried ripe fruit powder of P. corylifolia was obtained from the National Institutes for Food and Drug Control (batch number 121,056–200,904), and the voucher was deposited in the Tianjin State Key Laboratory of Modern Chinese Medicine. Total genomic DNA was extracted using the Extract Genomic DNA Kit following the manufacturer's protocol. The genomic DNA was then sequenced on an Illumina HiSeq platform, generating 150 bp paired-end reads with an insert size of 800 bp. The
Chloroplast genome assembly
By using the Illumina sequencing platform, we obtained a total of 37,052,568 reads with an average read length of 150 bp. The reads were re-mapped to the chloroplast genome, and the coverage of the chloroplast genome was 8108×. The size of the whole genome was 153,114 bp.
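The coverage figure follows from simple arithmetic: mean depth is the number of mapped bases divided by the genome length. The sketch below is illustrative only. The function names are hypothetical, and it assumes every mapped read contributes its full 150 bp. Inverting the formula gives the number of reads implied by the reported depth; that number is an estimate under this assumption, not a figure reported in the study.

```python
GENOME_LENGTH_BP = 153_114   # plastome size reported above
TOTAL_READS = 37_052_568     # total reads reported above
READ_LENGTH_BP = 150         # average read length reported above
REPORTED_DEPTH = 8108        # mean coverage reported above


def mean_depth(mapped_reads: float, read_length: int, genome_length: int) -> float:
    """Mean depth = mapped bases / genome length."""
    return mapped_reads * read_length / genome_length


def reads_for_depth(depth: float, read_length: int, genome_length: int) -> float:
    """Invert mean_depth: full-length reads needed to reach a given depth."""
    return depth * genome_length / read_length


# Reads implied by the reported depth (assumes full-length mapped reads).
implied_reads = reads_for_depth(REPORTED_DEPTH, READ_LENGTH_BP, GENOME_LENGTH_BP)
# Share of the whole data set those reads would represent.
implied_fraction = implied_reads / TOTAL_READS
# Upper bound on depth if every read had mapped to the plastome.
max_depth = mean_depth(TOTAL_READS, READ_LENGTH_BP, GENOME_LENGTH_BP)

print(f"Implied mapped reads: {implied_reads:,.0f}")
print(f"Implied share of all reads: {implied_fraction:.1%}")
print(f"Depth if all reads mapped: {max_depth:,.0f}x")
```

Because the library was prepared from total genomic DNA, only part of the reads is expected to originate from the chloroplast, which is consistent with the reported depth being lower than the all-reads upper bound.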
Organization and gene content
The complete plastid genome of P. corylifolia displayed a typical quadripartite structure (Fig. 1), comprising a pair of inverted repeats (each 25,557 bp in length) separated by the SSC and LSC regions (17,885 and
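As a consistency check on the quadripartite layout, the region lengths can be related to the total genome size. The sketch below uses only figures given above (total 153,114 bp, each IR 25,557 bp, SSC 17,885 bp) and derives the length left for the remaining single-copy region, assuming the four regions tile the circular genome without gaps or overlap. The derived value is an arithmetic consequence of that assumption rather than a figure quoted from the study, and the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quadripartite:
    """Region lengths (bp) of a circular plastome with LSC, SSC and two IR copies."""
    total: int  # whole-genome length
    ir: int     # length of ONE inverted-repeat copy
    ssc: int    # small single-copy region

    @property
    def lsc(self) -> int:
        # Assumption: LSC + SSC + 2 * IR == total (no gaps, no overlap).
        return self.total - self.ssc - 2 * self.ir

    def fractions(self) -> dict:
        """Share of the genome occupied by each region type."""
        return {
            "LSC": self.lsc / self.total,
            "SSC": self.ssc / self.total,
            "IR (both copies)": 2 * self.ir / self.total,
        }


genome = Quadripartite(total=153_114, ir=25_557, ssc=17_885)

print(f"Implied LSC length: {genome.lsc:,} bp")
for region, share in genome.fractions().items():
    print(f"{region}: {share:.1%}")
```

A negative or implausibly small derived length would signal an inconsistency among the quoted region sizes, so the same check is a quick sanity test for any annotated plastome.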
Discussion
In our study, four highly variable protein-coding genes (ycf1, matK, accD, ndhF) were selected as potential markers. Previous evidence indicates that the ycf1 gene is one of the core plastid DNA barcodes of land plants (Dong et al., 2015). Meanwhile, the phylogenetic relationships of Papilionoideae based on the matK gene have been reported (de Queiroz et al., 2015). Moreover, not only the ycf1 and matK genes but also the other two genes showed good performance in distinguishing Phaseoleae and
Conclusions
In China, P. corylifolia is an important traditional Chinese medicine. In this study, we presented the complete chloroplast genome of P. corylifolia, sequenced on an Illumina platform; this is the first chloroplast genome assembled for the tribe Psoraleeae. The P. corylifolia chloroplast genome (153,114 bp) was fully characterized and compared with the previously reported chloroplast genomes of related species. The chloroplast genome of P. corylifolia contained 111 unique genes,
Funding
This study was supported by National Natural Science Foundation of China (Grant no. 81673826).
Acknowledgements
We are thankful to the National Natural Science Foundation of China (Grant no. 81673826) for its financial support to our study. We also acknowledge Ms. Liran Sun for her previous work.
References (57)
- et al. A multilocus phylogenetic analysis reveals the monophyly of a recircumscribed papilionoid legume tribe Diocleae with well-supported generic relationships. Mol. Phylogenet. Evol. (2015)
- et al. The distribution and phylogenetic significance of a 50-kb chloroplast DNA inversion in the flowering plant family Leguminosae. Mol. Phylogenet. Evol. (1996)
- et al. Forensic botany II, DNA barcode for land plants: which markers after the international agreement? Forensic Sci Int Genet (2015)
- et al. Synonymous codon usage and gene function are strongly related in Oryza sativa. Biosystems (2005)
- et al. Chloroplast genome analyses and genomic resource development for epilithic sister genera Oresitrophe and Mukdenia (Saxifragaceae), using genome skimming data. BMC Genomics (2018)
- et al. Plastomes of Mimosoideae: structural and size variation, sequence divergence, and phylogenetic implication. Tree Genet. Genomes (2017)
- et al. Fingerprint analysis of P. corylifolia L. by HPLC and LC–MS. Journal of Chromatography B Analytical Technologies in the Biomedical & Life Sciences (2005)
- et al. IRscope: an online program to visualize the junction sites of chloroplast genomes. Bioinformatics (2018)
- Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. (1999)
- Structure, function, and inheritance of plastid genomes
- The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol. Biol. Evol.
- The number of known plants species in the world and its annual increase. Phytotaxa
- Flora of China
- NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res.
- Complete chloroplast genome of Sedum sarmentosum and chloroplast genome evolution in Saxifragales. PLoS One
- ycf1, the most promising plastid DNA barcode of land plants. Sci. Rep.
- Divergence and diversification in North American Psoraleeae (Fabaceae) due to climate change. BMC Biol.
- VISTA: computational tools for comparative genomics. Nucleic Acids Res.
- The complete chloroplast genome sequence of semi-wild soybean, G. gracilis (Fabales: Fabaceae). Conserv. Genet. Resour.
- The complete chloroplast genome sequence of wild soybean, G. soja. Conserv. Genet. Resour.
- Rapid evolutionary change of common bean (Phaseolus vulgaris L.) plastome, and the genomic diversification of legume chloroplasts. BMC Genomics
- Choosing and using a plant DNA barcode. PLoS One
- Phylogenetic systematics of the tribe Millettieae (Leguminosae) based on chloroplast trnK/matK sequences and its implications for evolutionary patterns in Papilionoideae. Am. J. Bot.
- Plastid genomes of seed plants
- MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol.
- Complete chloroplast genome sequence of Adenophora remotiflora (Campanulaceae). Mitochondrial DNA A DNA Mapp Seq Anal
- MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol.
- REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res.
- 1
These authors contributed equally to this work.