A Novel Method for Identification of Lagerstroemia and Sargassum Taxa Using Single-Nucleotide Polymorphic Characters from the Large Single-Copy Region of Complete Chloroplast Genomes
DOI: 10.12677/BR.2022.112026
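Authors: Suo Zhili (索志立)*: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing; Gu Cuihua (顾翠花)*: School of Landscape Architecture, Zhejiang A&F University / Key Laboratory of the National Forestry and Grassland Administration on Germplasm Innovation and Utilization for Southern Garden Plants / Zhejiang Provincial Key Laboratory of Germplasm Innovation and Utilization for Garden Plants, Hangzhou, Zhejiang; Zuo Yunjuan (左云娟): Southeast Asia Biodiversity Research Institute, Chinese Academy of Sciences / Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, Yunnan; Yang Zhirong (杨志荣): Herbarium, Institute of Botany, Chinese Academy of Sciences, Beijing; Sun Zhongmin (孙忠民): Laboratory of Marine Organism Taxonomy and Phylogeny, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, Shandong; Yang Qiangfa (杨强发): Hunan Ziwei Investment Group Co., Ltd., Shaoyang, Hunan; Jin Xiaobai (靳晓白): Beijing Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing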
Keywords: Lagerstroemia; Sargassum; Complete Chloroplast Genome; Large Single-Copy Region; Single-Nucleotide Polymorphic Character; Plant Identification
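Abstract (Chinese version): Because plant species and genera resemble one another morphologically to a greater or lesser degree, the number of morphological characters useful for plant taxonomy is limited. Short DNA fragments carry little information and can distinguish only a limited number of plant species. Accurately identifying and digitally managing the world's plants remains an extremely challenging problem. This paper reports a new method for accurately identifying Lagerstroemia plants at the level of single nucleotides. Single-nucleotide polymorphic sites in the large single-copy region of the chloroplast genome are used as the key molecular characters for compiling molecular taxonomic keys. To keep the method simple and the identification accurate, molecular characters from three kinds of genomic DNA region are avoided: gap regions, poly-N regions and simple sequence repeat regions. This study is of value for the taxonomic revision, digital management, conservation and use of Lagerstroemia. Nine species/varieties of Sargassum (Sargassaceae), a genus of marine macroalgae, were also successfully identified in the same way, showing that the method is suitable for identifying algae (lower plants) as well as higher plants.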
Abstract: Plant species and genera resemble one another morphologically to some extent, so the number of distinguishable morphological traits useful for plant taxonomy is limited. With the progress of global tree-of-life projects, four universal DNA barcodes (rbcL, matK, trnH-psbA and ITS) have been recommended for global plant classification, three of which (rbcL, matK and trnH-psbA) come from the chloroplast (cp) genome. However, short DNA fragments provide little information, and only a relatively limited number of plant species can be resolved satisfactorily. Assembling and handling a complete cp genome of approximately 150 kb requires highly skilled experts and involves a large workload. It is still difficult to accurately identify and digitally manage the world's plants. To provide a more convenient, simple and accurate method for plant identification and classification, we used, for the first time, single-nucleotide polymorphic characters from the large single-copy (LSC) region of the cp genome to compile molecular taxonomic keys to Lagerstroemia species. To keep the method simple and the identification accurate, we avoided molecular traits from three categories of genomic DNA region: gap regions, poly-N regions and simple sequence repeat regions. Given the huge number of plant species in the world, genetic variation, such as that resulting from gene transfer and loss, may make a target DNA region unavailable and so make comparative analysis impossible in some plant DNA barcoding studies, whereas the LSC region is present in every cp genome in the plant kingdom. Comparative analysis of all plant species worldwide can therefore be conducted on the basis of sequences from the LSC region. The cp genome can provide rich information for plant identification.
Our new methodology is valuable for improving plant taxonomic revision, upgrading digital management platforms and accelerating phylogenetic and phytogeographic insights into global plants. Nine Sargassum species/varieties (Sargassaceae), which are marine plants, were also successfully identified in the same way, indicating that our method is suitable for the identification and classification not only of higher plants but also of algae (lower plants).
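Cite this article: Suo, Z.L., Gu, C.H., Zuo, Y.J., Yang, Z.R., Sun, Z.M., Yang, Q.F. and Jin, X.B. (2022) A Novel Method for Identification of Lagerstroemia and Sargassum Taxa Using Single Nucleotide Polymorphic Characters from the Large Single-Copy Region of Complete Chloroplast Genomes. Botanical Research, 11(2), 218-228. https://doi.org/10.12677/BR.2022.112026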

1. Introduction

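Over the past 200 years and more, large-scale field surveys and specimen collection have been carried out worldwide; plants have been classified by their morphological characters, their phylogenetic relationships inferred, and their biogeography studied [1] - [9]. Floras and monographs have been published [10] [11]. In recent years, the Chinese Academy of Sciences established the Sino-Africa Joint Research Center to help African countries publish and republish their floras [12]. Floras record plant characters comprehensively and provide a scientific basis for the conservation and use of plant resources.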

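Because plant species and genera resemble one another morphologically to a greater or lesser degree, the number of morphological characters useful for plant taxonomy is limited [3] [13] [14] [15], which makes species identification very challenging. The Herbarium of the Institute of Botany, Chinese Academy of Sciences, is one example: of the 2.95 million specimens it holds, about 600,000 remain unidentified, mostly because the specimens lack diagnostic characters or because no specialist in the relevant plant group is available to identify them. These doubtful specimens are mostly filed at the end of the family to which they belong. Goodwin et al. (2015) [3] assessed 4500 specimens of the African genus Aframomum (Zingiberaceae) from 40 herbaria in 21 countries and found that at least 58% of them carried a wrong name; specimens of Dipterocarpus (Dipterocarpaceae) and Ipomoea (Convolvulaceae) showed a similar pattern [3]. On average, more than 50% of tropical plant specimens are wrongly named [3]. Identifying plant diversity resources and using them sustainably has become a major worldwide need [16].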

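Over the past 20 years, the use of molecular characters to assist plant identification has become an active research area. Chloroplast genome sequences are widely accepted as an information-rich and valuable data source for plant identification and phylogenetic studies, because the structure, gene content and gene order of the genome are highly conserved [17] - [23]. With the progress of global tree-of-life projects, four universal DNA barcodes (rbcL, matK, trnH-psbA and ITS) have been recommended, three of which (rbcL, matK and trnH-psbA) come from the chloroplast genome [2] [24] [25]. Because short DNA fragments carry little information and can distinguish only a limited number of plant species, some researchers have proposed using the complete chloroplast genome sequence as a super DNA barcode for species identification [2] [20] [21]. However, a complete chloroplast genome sequence is about 150,000 nucleotides long, and handling it entails a very large workload. To provide a simpler and more accurate method for plant classification and identification, this study uses, for the first time, single-nucleotide polymorphic sites from the large single-copy region of the chloroplast genome to compile molecular taxonomic keys, providing technical support for the conservation and use of Lagerstroemia and Sargassum resources.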

2. Materials and Methods

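Complete chloroplast genomes were sequenced for 31 individuals representing 16 species of Lagerstroemia (Lythraceae) [2] [20] [26] [27] [28] (Table 1) and aligned with MAFFT v7.055b [29] (http://mafft.cbrc.jp/alignment/software). The large single-copy (LSC) region was extracted from the alignment, and its variable sites were analysed with MEGA 7.0 [30]. Single-nucleotide polymorphic sites (SNPs) were selected from these variable sites, and a molecular taxonomic key to the Lagerstroemia species was compiled following the method described by Suo et al. (2012, 2015, 2016) [15] [31] [32]. The SNPs used in the key then served as the molecular data for generating a phylogenetic tree of the 16 Lagerstroemia species in MEGA 7.0 [30]. Chloroplast genome sequences of nine species and varieties of Sargassum (Sargassaceae) were used to verify that the method can also identify algae.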
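The SNP-selection step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MEGA workflow: the function name `find_snp_columns`, the toy sequences and the sample labels are all hypothetical, and the alignment is assumed to be already trimmed to the LSC region. Positions are counted from 1 at the 5' end of the alignment, and any column containing a gap or an ambiguous base is skipped.

```python
from typing import Dict, List, Tuple


def find_snp_columns(alignment: Dict[str, str]) -> List[Tuple[int, Dict[str, str]]]:
    """Return (1-based position, {sample: base}) for each gap-free variable column."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    snps = []
    for i in range(lengths.pop()):
        column = {name: seq[i].upper() for name, seq in alignment.items()}
        bases = set(column.values())
        # Skip columns that contain gaps or ambiguity codes.
        if not bases <= set("ACGT"):
            continue
        # A column with more than one base is a single-nucleotide polymorphism.
        if len(bases) > 1:
            snps.append((i + 1, column))
    return snps


# Toy alignment (hypothetical data, not from the study).
toy_alignment = {
    "sp_A": "ACGTACGT",
    "sp_B": "ACGTACGA",
    "sp_C": "AC-TGCGA",
}
snp_positions = [position for position, _ in find_snp_columns(toy_alignment)]
print(snp_positions)
```

In the toy alignment, column 3 is ignored because one sequence has a gap there, so only columns 5 and 8 are reported.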

Table 1. Chloroplast genome sequence data of Lagerstroemia used in the present study

Figure 1. Molecular taxonomic key to the sixteen species of Lagerstroemia based on single-nucleotide polymorphic characters from the large single-copy (LSC) region of the chloroplast genome. The number following each nucleotide character gives the position of the corresponding SNP, counted from the 5' end of the LSC sequence alignment, as described previously by Suo et al. (2012, 2015, 2016) [15] [31] [32]; the clade numbers correspond to those in Figure 2

3. Results

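The alignment matrix of the complete chloroplast genome sequences of the 16 Lagerstroemia species was 153,826 bp long. A total of 103 SNPs, located between positions 128 and 73,199 of the LSC region, were used as molecular characters to compile the molecular taxonomic key (Figure 1), which successfully identified the Lagerstroemia species. The same 103 SNPs were used to generate a phylogenetic tree of the 16 species (Figure 2). The 16 species fall into four major clades: I, II, III and IV. Each clade can be identified with 3 to 15 SNPs, and each species with 3 to 14 SNPs (Figure 1). Species within the same clade are closely related (Figure 2). Clade IV shows that Lagerstroemia anhuiensis and L. glabra are very closely related: they share five unique SNPs, namely cpDNA_LSC_G4172G4173G8310A8448C72550.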
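A signature such as cpDNA_LSC_G4172G4173G8310A8448C72550 lists, for each diagnostic site, the shared base followed by its position in the LSC alignment. A minimal sketch of how such signatures could be derived from an SNP table is given below. The function names, sample labels, positions and bases are hypothetical toy values chosen for illustration; they are not the study's data, and the authors' own procedure follows Suo et al. [15] [31] [32].

```python
from typing import Dict, Iterable, List, Tuple


def diagnostic_snps(snps: Dict[int, Dict[str, str]],
                    group: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (base, position) pairs at which every group member shares one base
    that no sample outside the group carries."""
    members = set(group)
    result = []
    for position in sorted(snps):
        column = snps[position]
        inside = {base for sample, base in column.items() if sample in members}
        outside = {base for sample, base in column.items() if sample not in members}
        if len(inside) == 1 and not (inside & outside):
            result.append((next(iter(inside)), position))
    return result


def signature(characters: List[Tuple[str, int]], prefix: str = "cpDNA_LSC_") -> str:
    """Concatenate base+position pairs in the notation used in the text."""
    return prefix + "".join(f"{base}{position}" for base, position in characters)


# Toy SNP table: position -> {sample: base} (hypothetical values).
toy_snps = {
    10: {"sp1": "G", "sp2": "G", "sp3": "A", "sp4": "A"},
    25: {"sp1": "C", "sp2": "T", "sp3": "T", "sp4": "T"},
    40: {"sp1": "A", "sp2": "A", "sp3": "A", "sp4": "C"},
}
pair_signature = signature(diagnostic_snps(toy_snps, ["sp1", "sp2"]))
print(pair_signature)
```

Only position 10 qualifies for the pair sp1/sp2 in this toy table: at position 25 the two samples differ, and at position 40 their shared base also occurs outside the pair.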

Figure 2. Phylogenetic tree of the 16 Lagerstroemia species based on 103 SNPs, constructed with the neighbour-joining method under the Tamura 3-parameter model. The numbers near the branches are bootstrap support values (%) from 1000 replicates
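The Tamura 3-parameter distance named in the Figure 2 caption corrects the observed proportions of transitions (P) and transversions (Q) for G+C content. The sketch below implements the standard pairwise formula, d = -h ln(1 - P/h - Q) - (1/2)(1 - h) ln(1 - 2Q), with h = θ1 + θ2 - 2θ1θ2, where θ1 and θ2 are the G+C proportions of the two sequences. It is an illustrative reimplementation, not MEGA's code; the function name and toy sequences are hypothetical, and sites with gaps or ambiguous bases are simply dropped pairwise, which is an assumption about gap handling.

```python
import math

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def tamura3_distance(seq1: str, seq2: str) -> float:
    """Pairwise Tamura 3-parameter distance between two aligned sequences."""
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / n             # transitions
    q = sum(1 for a, b in pairs if a != b and (a, b) not in TRANSITIONS) / n  # transversions
    gc1 = sum(1 for a, _ in pairs if a in "GC") / n
    gc2 = sum(1 for _, b in pairs if b in "GC") / n
    h = gc1 + gc2 - 2 * gc1 * gc2
    if h <= 0 or 1 - p / h - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("distance is undefined for these sequences")
    return -h * math.log(1 - p / h - q) - 0.5 * (1 - h) * math.log(1 - 2 * q)


# Toy sequences (hypothetical): one transversion in eight sites.
d_same = tamura3_distance("ACGTACGT", "ACGTACGT")
d_diff = tamura3_distance("ACGTACGT", "ACGTACGA")
print(d_same, round(d_diff, 4))
```

A matrix of such pairwise distances is the input that a neighbour-joining routine would cluster; the corrected distance is always at least as large as the raw proportion of differing sites.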

4. Discussion

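In the chloroplast genomes of Lagerstroemia, the large single-copy (LSC) region is about 3.2 times as long as the inverted repeat (IR) region and nearly 5 times as long as the small single-copy (SSC) region. The LSC region contains about 2.4 times as many variable sites as the SSC region and nearly 7 times as many as the IR region [20] [26] [27] [28]. The 103 SNPs from the LSC region are sufficient to distinguish the 16 Lagerstroemia species [27] [28] [33] [34] [35]. Sequencing the chloroplast genomes of the remaining Lagerstroemia species will require global collaboration.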

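Some researchers hold that the presence or absence of gaps, length variation in poly-N structures such as (A)n, (C)n, (G)n and (T)n, and length variation in simple sequence repeats can be used as molecular characters. These three kinds of "molecular character" can indeed distinguish species. However, when several gaps cluster at neighbouring positions, the short sequences between them sometimes cannot be matched to a unique position, even after manual adjustment. When a poly-N structure or a simple sequence repeat is very long, sequencing errors are likely. This study therefore avoids information from gap, poly-N and simple sequence repeat regions; these three kinds of "molecular character" are also inconvenient for writing a molecular taxonomic key. Figure 2 shows the phylogenetic relationships revealed by the SNPs of the LSC region.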
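The exclusion of gap, poly-N and simple sequence repeat regions can be sketched as a masking step applied before SNPs are accepted. The sketch below is one possible way to do this, not the authors' procedure: the function names, the toy sequences and, in particular, the thresholds (a homopolymer of at least 5 bases, a motif of 2 to 6 bases repeated at least 4 times) are assumptions, since the text gives no numeric cut-offs.

```python
import re
from typing import Dict, Iterable, List, Set


def masked_positions(aligned_seq: str,
                     min_homopolymer: int = 5,
                     min_ssr_units: int = 4) -> Set[int]:
    """Return 1-based positions lying in a gap, a poly-N run or a simple sequence repeat.

    The default thresholds are illustrative assumptions.
    """
    seq = aligned_seq.upper()
    masked = {i + 1 for i, base in enumerate(seq) if base == "-"}
    # Poly-N: one base repeated at least min_homopolymer times.
    homopolymer = re.compile(rf"([ACGT])\1{{{min_homopolymer - 1},}}")
    # SSR: a 2-6 bp motif repeated at least min_ssr_units times.
    ssr = re.compile(rf"([ACGT]{{2,6}}?)\1{{{min_ssr_units - 1},}}")
    for pattern in (homopolymer, ssr):
        for match in pattern.finditer(seq):
            masked.update(range(match.start() + 1, match.end() + 1))
    return masked


def filter_snps(alignment: Dict[str, str], positions: Iterable[int]) -> List[int]:
    """Keep only SNP positions that are unmasked in every aligned sequence."""
    masked: Set[int] = set()
    for seq in alignment.values():
        masked |= masked_positions(seq)
    return [position for position in positions if position not in masked]


# Toy alignment (hypothetical): an (A)6 run, a gap and a (TA)n repeat.
toy_alignment = {
    "sp_A": "ACGTAAAAAAGC-TATATATATGCA",
    "sp_B": "ACGTAAAAAAGCATATATATATGCT",
}
kept = filter_snps(toy_alignment, [3, 7, 13, 16, 25])
print(kept)
```

A limitation of this simple sketch is that a repeat interrupted by alignment gaps in one sequence is only detected through the other sequences; stripping gaps and mapping coordinates back would be a more thorough alternative.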

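Gene regions of the ubiquitin-proteasome system are another option for developing molecular identification tools for plants, and they also give high accuracy [17] [31] [32]. However, obtaining long sequences of ubiquitin-proteasome system genes awaits breakthroughs in experimental techniques. By comparison, chloroplast genome sequences are more cost-effective and are currently easier to obtain.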

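Chloroplast DNA is also used to study the classification and phylogenetic relationships of algal species [36] [37] [38] [39]. Species of Sargassum (Sargassaceae) are important components of the "submarine forest". Using 14 chloroplast genome sequences from nine species/varieties of Sargassum (Table 2), we obtained an LSC alignment 74,409 bp long. The segment from position 5336 to position 10,027 is 4692 bp long (6.3% of the full LSC length); within it we detected 221 taxonomically informative SNPs, which successfully identified the nine Sargassum species/varieties (Figures 3 and 4). The results show that our new method for plant classification and identification applies not only to higher plants but also to algae.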
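The segment length and percentage quoted above follow from inclusive position counting, which a short check makes explicit. The variable names are illustrative; the SNP density on the last line is a derived figure that the text does not state.

```python
# Values taken from the paragraph above.
start, end = 5336, 10027      # first and last alignment positions of the segment
lsc_length = 74409            # length of the Sargassum LSC alignment (bp)
snp_count = 221               # taxonomically informative SNPs in the segment

segment_length = end - start + 1           # inclusive count of positions
fraction_of_lsc = segment_length / lsc_length
snp_density = snp_count / segment_length   # SNPs per base pair (derived)

print(segment_length)              # 4692
print(f"{fraction_of_lsc:.1%}")    # 6.3%
print(f"{snp_density:.1%}")
```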

Table 2. Chloroplast genome sequence data of nine Sargassum species/varieties used in the present study

Figure 3. Molecular taxonomic key to the nine species/varieties of Sargassum based on single-nucleotide polymorphic characters from the large single-copy (LSC) region of the chloroplast genome. The number following each nucleotide character gives the position of the corresponding SNP, counted from the 5' end of the LSC sequence alignment, as described previously by Suo et al. (2012, 2015, 2016); the clade numbers correspond to those in Figure 4

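The number of plant species worldwide is enormous. Genetic variation caused by gene transfer and loss can make the sequence of a target DNA region hard to obtain, so that comparative analysis is sometimes impossible in DNA barcoding and phylogenetic studies [2]. The chloroplast genome (large single-copy region), however, is always present in the plant kingdom. Comparative analysis of all plants worldwide should therefore be achievable on the basis of LSC sequences. The chloroplast genome can provide rich information for plant classification and identification. Our new method gives unambiguous results, offers high resolution, is simple to operate and minimizes the volume of data that must be stored for diversity management. It is of great value for plant identification, for upgrading digital plant management platforms and for conserving plant diversity [3] [9].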

Figure 4. Phylogenetic tree of the nine Sargassum species/varieties based on complete chloroplast genome sequences, constructed with the neighbour-joining method under the Tamura 3-parameter model. The numbers near the branches are bootstrap support values (%) from 1000 replicates

Acknowledgements

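During sample collection and discussion, we received generous guidance and help from Hou Boxin (侯伯鑫), Zhang Shouzhou (张寿洲), Yu Junjie (余俊杰), Xu Chao (徐超), Chen Jin (陈进), Shi Jipu (施济普), Ning Zulin (宁祖林), Xu Bingqiang (徐炳强), Zhang Huijin (张会金), He Kaihong (何开红), Wang Shaoqing (王少青) and others.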

Funding

National Natural Science Foundation of China (No. 31770744) and the National Forest Germplasm Resources Sharing Service Platform project (2005DKA21003).

NOTES

*Corresponding authors.

References

[1] The Angiosperm Phylogeny Group (2016) An Update of the Angiosperm Phylogeny Group Classification for the Orders and Families of Flowering Plants: APG IV. Botanical Journal of the Linnean Society, 181, 1-20.
https://doi.org/10.1111/boj.12385
[2] Zuo, Y.J., Wen, J., and Zhou, S.L. (2017) Intercontinental and Intracontinental Biogeography of the Eastern Asian-Eastern North American Disjunct Panax (the Ginseng Genus, Araliaceae), Emphasizing Its Diversification Processes in Eastern. Molecular Phylogenetics and Evolution, 117, 60-74.
https://doi.org/10.1016/j.ympev.2017.06.016
[3] Goodwin, Z.A., Harris, D.J., Filer, D., et al. (2015) Widespread Mistaken Identity in Tropical Plant Collections. Current Biology, 25, R1066-R1067.
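[4] Suo, Z.L. (2005) A Review of the Systematic Position of Austrobaileyaceae. Chinese Bulletin of Botany, 22(B08), 146-156. (In Chinese)
[5] Suo, Z.L. (2005) A Review of the Systematic Position of Trimeniaceae. Bulletin of Botanical Research, 25(1), 26-29. (In Chinese)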
https://doi.org/10.7525/j.issn.1673-5102.2005.01.012
[6] Suo, Z.L. (2005) A Review of the Systematic Position of Ceratophyllaceae. Acta Botanica Boreali-Occidentalia Sinica, 25(5), 1058-1063. (In Chinese)
https://doi.org/10.3321/j.issn:1000-4025.2005.05.040
[7] Suo, Z.L. (2006) A Review of the Systematic Position of Cabombaceae. Chinese Bulletin of Botany, 23(1), 87-97. (In Chinese)
https://doi.org/10.3969/j.issn.1674-3466.2006.01.012
[8] Suo, Z.L. (2007) A Review of the Systematic Position of Nelumbonaceae. Guihaia, 27(1), 31-39. (In Chinese)
[9] Thomson, S.A., Pyle, R.L., Ahyong, S.T., et al. (2018) Taxonomy Based on Science Is Necessary for Global Conservation. PLoS Biology, 16, e2005075.
[10] Wu, Z.Y., Hong, D.Y. and Raven, P.H. (2013) Flora of China, Volume 24. Science Press, Beijing and Missouri Botanical Garden Press, St. Louis.
[11] Muñoz-Rodríguez, P., Carruthers, T., Wood, J.R.I., et al. (2019) A Taxonomic Monograph of Ipomoea Integrated across Phylogenetic Scales. Nature Plants, 5, 1136-1144.
https://doi.org/10.1038/s41477-019-0535-4
[12] Wei, N., Mutie, F.M., Mwachala, G., et al. (2021) Euphorbia mbuinzauensis, a New Succulent Species in Kenya from the Synadenium Group in Euphorbia Sect. Monadenium (Euphorbiaceae). PhytoKeys, 183, 21-35.
https://doi.org/10.3897/phytokeys.183.70285
[13] Huang, J.M., Hou, B.X. and Suo, Z.L. (2013) Investigation of Lagerstroemia Cultivars in Shaoyang City III. Journal of Agriculture, 3(5), 34-41. (In Chinese)
https://doi.org/10.3969/j.issn.1007-7774.2013.05.010
[14] Suo, Z.L., Zhao, X.Q., Zhao, J.P., et al. (2006) ‘Sisters on Spring Outing’ (Paeonia suffruticosa ‘Zi Mei You Chun’) (Paeoniaceae): A Unique Chinese Tree Peony Cultivar Possessing Side Flowers and Bicolored Floral Discs. HortScience, 43, 532-534.
https://doi.org/10.21273/HORTSCI.43.2.532
[15] Suo, Z.L., Zhang, C.H., Zheng, Y.Q., et al. (2012) Revealing Genetic Diversity of Tree Peonies at Micro-Evolution Level with Hyper-Variable Chloroplast Markers and Floral Traits. Plant Cell Reports, 31, 2199-2213.
https://doi.org/10.1007/s00299-012-1330-0
[16] Hong, D.Y. (2016) Biodiversity Pursuits Need a Scientific and Operative Species Concept. Biodiversity Science, 24(9), 979-999. (In Chinese)
https://doi.org/10.17520/biods.2016203
[17] Li, W.Q., Yang, Y., Xie, X.M., et al. (2018) Diospyros oleifera and D. deyangensis Are Revealed as the Closest Relatives to D. kaki by E3 Ubiquitin-Protein Ligase UPL3 DNA Sequences. Hans Journal of Agricultural Sciences, 8, 657-673.
https://doi.org/10.12677/HJAS.2018.86100
[18] Dong, W.P., Xu, C., Li, D.L., et al. (2016) Comparative Analysis of the Complete Chloroplast Genome Sequences in Psammophytic Haloxylon Species (Amaranthaceae). PeerJ, 4, e2699.
https://doi.org/10.7717/peerj.2699
[19] Dong, W.P., Xu, C., Li, W.Q., et al. (2017) Phylogenetic Resolution in Juglans Based on Complete Chloroplast Genomes and Nuclear DNA Sequences. Frontiers in Plant Science, 8, Article No. 1148.
https://doi.org/10.3389/fpls.2017.01148
[20] Dong, W.P., Xu, C., Liu, Y.L., et al. (2021) Chloroplast Phylogenomics and Divergence Times of Lagerstroemia (Lythraceae). BMC Genomics, 22, Article No. 434.
https://doi.org/10.1186/s12864-021-07769-x
[21] Dong, W.P., Lu, Y.Z., Xu, C., et al. (2021) Chloroplast Phylogenomic Insights into the Evolution of Distylium (Hamamelidaceae). BMC Genomics, 22, Article No. 293.
https://doi.org/10.1186/s12864-021-07590-6
[22] Dong, W.P., Sun, J.H., Liu, Y.L., et al. (2021) Phylogenomic Relationships and Species Identification of the Olive Genus Olea (Oleaceae). Journal of Systematics and Evolution.
https://doi.org/10.1111/jse.12802
[23] Dong, W.P., Liu, Y.L., Li, E.Z., et al. (2022) Phylogenomics and Biogeography of Catalpa (Bignoniaceae) Reveal Incomplete Lineage Sorting and Three Dispersal Events. Molecular Phylogenetics and Evolution, 166, Article ID: 107330.
https://doi.org/10.1016/j.ympev.2021.107330
[24] Coissac, E., Hollingsworth, P.M., Lavergne, S., et al. (2016) From Barcodes to Genomes: Extending the Concept of DNA Barcoding. Molecular Ecology, 25, 1423-1428.
https://doi.org/10.1111/mec.13549
[25] Hollingsworth, P.M., Li, D.Z., van der Bank, M., et al. (2016) Telling Plant Species Apart with DNA: From Barcodes to Genomes. Philosophical Transactions of the Royal Society, London. Series B, Biological Science, 371, Article ID: 20150338.
https://doi.org/10.1098/rstb.2015.0338
[26] Xu, C., Dong, W.P., Li, W.Q., et al. (2017) Comparative Analysis of Six Lagerstroemia Complete Chloroplast Genomes. Frontiers in Plant Science, 8, Article No. 15.
https://doi.org/10.3389/fpls.2017.00015
[27] Gu, C., Tembrock, L.R., Johnson, N.G., et al. (2016) The Complete Plastid Genome of Lagerstroemia fauriei and Loss of rpl2 Intron from Lagerstroemia (Lythraceae). PLoS ONE, 11, e0150752.
https://doi.org/10.1371/journal.pone.0150752
[28] Gu, C., Ma, L., Wu, Z.Q., et al. (2019) Comparative Analyses of Chloroplast Genomes from 22 Lythraceae Species: Inferences for Phylogenetic Relationships and Genome Evolution within Myrtales. BMC Plant Biology, 19, Article No. 281.
https://doi.org/10.1186/s12870-019-1870-3
[29] Katoh, K. and Standley, D.M. (2013) MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Molecular Biology and Evolution, 30, 772-780.
https://doi.org/10.1093/molbev/mst010
[30] Kumar, S., Stecher, G. and Tamura, K. (2016) MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Molecular Biology and Evolution, 33, 1870-1874.
https://doi.org/10.1093/molbev/msw054
[31] Suo, Z.L., Chen, L.N., Pei, D., et al. (2015) A New Nuclear DNA Marker from Ubiquitin Ligase Gene Region for Genetic Diversity Detection of Walnut Germplasm Resources. Biotechnology Reports, 5, 40-45.
https://doi.org/10.1016/j.btre.2014.11.003
[32] Suo, Z.L., Li, W.Y., Jin, X.B., et al. (2016) A New Nuclear DNA Marker Revealing Both Microsatellite Variations and Single Nucleotide Polymorphic Loci: A Case Study on Classification of Cultivars in Lagerstroemia indica L. Journal of Microbial & Biochemical Technology, 8, 266-271.
[33] Furtado, C. and Srisuko, M. (1969) A Revision of Lagerstroemia L. (Lythraceae). Garden Bulletin, 24, 185-334.
[34] Pham, T.T., Tagane, S., Chhang, P., et al. (2017) Lagerstroemia ruffordii (Lythraceae), a New Species from Vietnam and Cambodia. Acta Phytotaxonomica et Geobotanica, 68, 175-180.
[35] Qin, H.N. and Graham, S. (2007) Lagerstroemia. In: Wu, Z.Y., Raven, P.H. and Hong, D.Y., Eds., Flora of China, Science Press, Beijing and Missouri Botanical Garden Press, St. Louis, 277-281.
[36] Huang, C.H., Sun, Z.M., Gao, D.H., et al. (2017) Molecular Analysis of Sargassum from the Northern China Seas. Phytotaxa, 319, 71-83.
https://doi.org/10.11646/phytotaxa.319.1.3
[37] Zhong, B.J., Xi, Z.X., Goremykin, V.V., et al. (2013) Streptophyte Algae and the Origin of Land Plants Revisited Using Heterogeneous Models with Three New Algal Chloroplast Genomes. Molecular Biology and Evolution, 31, 177-183.
https://doi.org/10.1093/molbev/mst200
[38] Paudel, Y.P., Hu, Z.X., Khatiwada, J.R., et al. (2021) Chloroplast Genome Analysis of Chrysotila dentata. Gene, 804, Article ID: 145871.
https://doi.org/10.1016/j.gene.2021.145871
[39] Dobrogojski, J., Adamiec, M. and Luciński, R. (2020) The Chloroplast Genome: A Review. Acta Physiologiae Plantarum, 42, Article No. 98.
https://doi.org/10.1007/s11738-020-03089-x