Elsevier

Plant Gene

Volume 21, March 2020, 100216
The complete chloroplast genome of Chinese medicine (Psoralea corylifolia): Molecular structures, barcoding and phylogenetic analysis

https://doi.org/10.1016/j.plgene.2019.100216

Highlights

  • The first complete chloroplast genome in the tribe Psoraleeae.

  • Phylogenetic trees of Papilionoideae reconstructed with 75 protein-coding genes of 37 species using maximum likelihood (ML) and Bayesian inference (BI) methods.

  • Four chloroplast DNA regions (ycf1, matK, accD, ndhF) can serve as potential barcodes.

Abstract

Psoralea corylifolia is a traditional Chinese medicine widely used in China. In this study, we sequenced the complete chloroplast genome of P. corylifolia, which is 153,114 bp in size and includes a pair of inverted repeat regions of 25,557 bp each, separated by a small single-copy region of 17,885 bp and a large single-copy region of 84,115 bp. Approximately 98 simple sequence repeats, 14 forward, 2 reverse, 2 complement, 32 palindromic and 49 tandem repeats were identified in the P. corylifolia chloroplast genome. The chloroplast genomes of P. corylifolia and three Glycine species are conserved in gene order and content, but show high diversity within intergenic spacers. In a phylogenomic analysis of 75 conserved protein-coding genes, P. corylifolia and the three Glycine species fall into the same clade within Papilionoideae. Moreover, four chloroplast DNA regions (ycf1, matK, accD, ndhF) can serve as barcodes. Overall, our findings contribute to a better understanding of the genomic features and evolutionary status of P. corylifolia.

Introduction

Leguminosae is the third-largest family of angiosperms, comprising 751 genera and 19,000 species (Christenhusz and Byng, 2016; Stevens, 2008). The subfamily Papilionoideae of Leguminosae comprises several tribes, including Fabeae, Galegeae, Indigofereae, Loteae, Millettieae, Phaseoleae and Psoraleeae (Wojciechowski et al., 2000). Among these tribes, nucleotide substitution rates in Psoraleeae have been found to be elevated relative to other tribes, suggesting rapid evolution or diversification (Egan and Crandall, 2008). The genus Psoralea of Psoraleeae contains about 120 species, mainly distributed in southern Africa, North and South America and Australia; only one species is found in China (Committee, 1995). P. corylifolia (Chinese name Buguzhi) is an annual plant with pale-purple flowers. Owing to the presence of minute brown glands, the plant has a distinctive and pleasant fragrance (Miller and Miranda, 1998). The seed of P. corylifolia is traditionally used for the treatment of menopause, depression and kidney deficiency (Wang et al., 2011; Zhao et al., 2005). Previous pharmacological studies indicate antioxidant, antimicrobial, anti-inflammatory and chemoprotective properties. Thus, P. corylifolia has been widely used in traditional Chinese medicine to treat multiple diseases (Zhang et al., 2016). Given its medicinal value, P. corylifolia deserves further utilization and development. With the rapid growth of next-generation sequencing (NGS) technologies, genomic resources have become increasingly available at reasonable time and cost (Mardis, 2008). However, despite the importance of P. corylifolia, studies of its genetic characteristics are scarce. Therefore, the genetic diversity and phylogenetic status of this species need to be analyzed with molecular techniques.

The chloroplast genome typically has a conserved circular, double-stranded structure of 120 to 160 kb, consisting of two inverted repeats (IRs) separated by the large and small single-copy (LSC and SSC) regions (Bock, 2007; Jansen and Ruhlman, 2012). However, previous studies have revealed notable structural variation in the chloroplast genome within the subfamily Papilionoideae (Wang et al., 2017). Most species in Papilionoideae possess a large (50-kb) inversion in their chloroplast genomes (Doyle et al., 1996). Moreover, the loss of IRs has been detected in many species of Papilionoideae, such as Glycyrrhiza glabra, Lens culinaris and Vicia faba (Sabir et al., 2014). Genes from the chloroplast genome have been applied to molecular identification and phylogenetic analysis, owing to the genome's maternal inheritance pattern and comparatively independent evolution (Zhou et al., 2017). Although previous studies have recommended several loci as plant barcodes, including rbcL, matK, trnH-psbA and trnL-trnF (Ferri et al., 2015; Hollingsworth et al., 2011), the complete chloroplast genome might be more appropriate as a super-barcode for species identification (Xia et al., 2016).
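The quadripartite layout described above implies a simple length identity: total size = LSC + SSC + 2 × IR. The following minimal Python sketch checks that identity against the region lengths reported in the abstract. The function name `quadripartite_length` is hypothetical and used only for illustration; it is not part of the authors' pipeline.

```python
# Region lengths (bp) as reported in the abstract for P. corylifolia.
LSC = 84_115   # large single-copy region
SSC = 17_885   # small single-copy region
IR = 25_557    # one inverted repeat copy; the genome carries two

def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a plastome made of LSC, SSC and two IR copies.

    Hypothetical helper for illustration only.
    """
    return lsc + ssc + 2 * ir

total = quadripartite_length(LSC, SSC, IR)
ir_fraction = 2 * IR / total  # share of the genome occupied by both IR copies

print(f"total length: {total} bp")
print(f"IR share of genome: {ir_fraction:.1%}")
```

The computed total matches the reported genome size of 153,114 bp, so the four region lengths are internally consistent.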

For such an important group of plants, the classification and phylogenetic relationships of Psoraleeae remain poorly known, and no chloroplast genome has been reported for any member of this tribe. Therefore, in this study, we sought to determine the first complete chloroplast genome sequence of P. corylifolia and compare it with the published chloroplast genomes of the related genus Glycine, including G. max (Saski et al., 2005), G. soja (Gao and Gao, 2017b) and G. gracilis (Gao and Gao, 2017a). Our aim was to assemble and characterize the complete chloroplast genome of P. corylifolia and to provide phylogenetic and genetic information for future studies of Psoraleeae and legume plastomes.

Section snippets

Chloroplast genome sequencing and assembly

Dried ripe fruit powder of P. corylifolia was obtained from the National Institutes for Food and Drug Control (batch number 121,056–200,904), and the voucher was deposited in the Tianjin State Key Laboratory of Modern Chinese Medicine. Total genomic DNA was extracted using the Extract Genomic DNA Kit following the manufacturer's protocol. The genomic DNA was then sequenced on an Illumina HiSeq platform, generating 150 bp paired-end reads with an insert size of 800 bp. The

Chloroplast genome assembly

By using the Illumina sequencing platform, we obtained a total of 37,052,568 reads with an average read length of 150 bp. The reads were re-mapped to the chloroplast genome, and the coverage of the chloroplast genome was 8108×. The size of the whole genome was 153,114 bp.
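As a rough cross-check of these figures, mean depth can be approximated as mapped bases divided by genome length. The sketch below inverts that relation to estimate how many reads would need to map to the chloroplast genome to give 8108× coverage. It assumes every mapped read contributes its full 150 bp and that the reported total counts individual reads rather than pairs; neither assumption is stated in the text, and the function name `implied_mapped_reads` is hypothetical.

```python
# Figures reported in the text.
TOTAL_READS = 37_052_568   # all sequenced reads
READ_LENGTH = 150          # average read length (bp)
GENOME_SIZE = 153_114      # assembled chloroplast genome (bp)
MEAN_DEPTH = 8108          # reported coverage (x)

def implied_mapped_reads(depth: float, genome_size: int, read_length: int) -> float:
    """Reads needed to reach `depth`, assuming depth = reads * read_length / genome_size.

    Simplifying assumption: each mapped read aligns over its full length.
    """
    return depth * genome_size / read_length

mapped = implied_mapped_reads(MEAN_DEPTH, GENOME_SIZE, READ_LENGTH)
fraction = mapped / TOTAL_READS  # share of all reads that would be plastid-derived

print(f"implied mapped reads: {mapped:,.0f}")
print(f"share of total reads: {fraction:.1%}")
```

Under these assumptions, about 8.3 million reads, roughly 22% of the total, would map to the chloroplast genome. This is an estimate derived from the reported numbers, not a value given by the authors.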

Organization and gene content

The complete plastid genome of P. corylifolia displayed a typical quadripartite structure (Fig. 1), which comprised a pair of inverted repeats (each 25,557 bp in length) separated by the SSC and the LSC region (17,885 and

Discussion

In our study, four highly variable protein-coding genes (ycf1, matK, accD, ndhF) were selected as potential markers. Previous evidence supports the ycf1 gene as one of the core plastid DNA barcodes of land plants (Dong et al., 2015). Meanwhile, the phylogenetic relationships of Papilionoideae based on the matK gene have been reported (de Queiroz et al., 2015). Moreover, not only the ycf1 and matK genes but also the other two genes showed good performance in distinguishing Phaseoleae and

Conclusions

In China, P. corylifolia is an important traditional Chinese medicine. In this study, we presented the complete chloroplast genome of P. corylifolia, sequenced on an Illumina platform; this is the first chloroplast genome assembled for the tribe Psoraleeae. The P. corylifolia chloroplast genome (153,114 bp) was fully characterized and compared with previously reported chloroplast genomes of related species. The chloroplast genome of P. corylifolia contained 111 unique genes,

Funding

This study was supported by the National Natural Science Foundation of China (Grant no. 81673826).

Acknowledgements

We thank the National Natural Science Foundation of China (Grant no. 81673826) for its financial support of our study. We also acknowledge Ms. Liran Sun for her previous work.

References (57)

  • C.C. Chang et al. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol. Biol. Evol. (2006)
  • M.J.M. Christenhusz et al. The number of known plants species in the world and its annual increase. Phytotaxa (2016)
  • F.O.C.E. Committee. Flora of China (1995)
  • N. Dierckxsens et al. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. (2017)
  • W. Dong et al. Complete chloroplast genome of Sedum sarmentosum and chloroplast genome evolution in Saxifragales. PLoS One (2013)
  • W. Dong et al. ycf1, the most promising plastid DNA barcode of land plants. Sci. Rep. (2015)
  • A.N. Egan et al. Divergence and diversification in North American Psoraleeae (Fabaceae) due to climate change. BMC Biol. (2008)
  • K.A. Frazer et al. VISTA: computational tools for comparative genomics. Nucleic Acids Res. (2004)
  • C.W. Gao et al. The complete chloroplast genome sequence of semi-wild soybean, G. gracilis (Fabales: Fabaceae). Conserv. Genet. Resour. (2017)
  • C.W. Gao et al. The complete chloroplast genome sequence of wild soybean, G soja. Conserv. Genet. Resour. (2017)
  • X. Guo et al. Rapid evolutionary change of common bean (Phaseolus vulgaris L) plastome, and the genomic diversification of legume chloroplasts. BMC Genomics (2007)
  • P.M. Hollingsworth et al. Choosing and using a plant DNA barcode. PLoS One (2011)
  • J.M. Hu et al. Phylogenetic systematics of the tribe Millettieae (Leguminosae) based on chloroplast trnK/matK sequences and its implications for evolutionary patterns in Papilionoideae. Am. J. Bot. (2000)
  • R.K. Jansen et al. Plastid genomes of seed plants
  • K. Katoh et al. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. (2013)
  • K.A. Kim et al. Complete chloroplast genome sequence of Adenophora remotiflora (Campanulaceae). Mitochondrial DNA A DNA Mapp Seq Anal (2016)
  • S. Kumar et al. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. (2018)
  • S. Kurtz et al. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. (2001)

1 These authors contributed equally to this work.