Abstract
Main conclusion
A 541 Mb draft genome of Pterocarpus santalinus is presented, and evidence of a whole-genome duplication in the Eocene period, together with an expansion of drought-responsive gene families, is documented.
Abstract
Pterocarpus santalinus Linn. f., popularly known as Red Sanders, is a deciduous tree endemic to the southern parts of the Eastern Ghats in India. The heartwood is highly valued in the international market for its deep red colour, fragrance and wavy-grained texture. In the present study, a high-quality draft genome of P. santalinus was assembled using short and long reads generated on the Illumina and Oxford Nanopore sequencing platforms, respectively. The haploid genome size was estimated at 541 Mb, and the hybrid assembly showed 99.60% genome completeness. A consensus set of 51,713 genes was predicted, of which 31,437 were annotated. The whole-genome duplication event in the species was dated at 30–39 mya with 95% confidence, suggesting an early genome duplication event during the Eocene period. Concurrently, phylogenomic assessment of seven Papilionoideae members, including P. santalinus, grouped the species according to their tribal classification and established the divergence of the tribe Dalbergieae from the tribe Trifolieae at ~54.20 mya. A significant expansion of water deprivation/drought-responsive gene families documented in the study probably explains the occurrence of the species in dry rocky patches. Additionally, re-sequencing of six diverse genotypes predicted one variant every 27 bases. This report presents the first draft genome in the genus Pterocarpus, and the genomic information generated is expected to accelerate population divergence studies in the species in relation to its endemic nature, support trait-based breeding programmes and aid in the development of diagnostic tools for timber forensics.
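To give a sense of scale for the re-sequencing result, the reported density of one variant every 27 bases can be combined with the 541 Mb haploid genome size estimate. The sketch below is a back-of-envelope calculation only: it assumes variants are spread uniformly over the whole genome, which the abstract does not state, and the function and variable names are illustrative rather than taken from the study.

```python
# Back-of-envelope estimate; both input figures come from the abstract.
GENOME_SIZE_BP = 541_000_000   # estimated haploid genome size (541 Mb)
BASES_PER_VARIANT = 27         # "one variant every 27 bases"


def expected_variant_count(genome_size_bp: int, bases_per_variant: int) -> int:
    """Rough variant count.

    Assumption (not from the source): the reported density applies
    uniformly across the whole haploid genome.
    """
    return round(genome_size_bp / bases_per_variant)


approx_variants = expected_variant_count(GENOME_SIZE_BP, BASES_PER_VARIANT)
print(f"~{approx_variants / 1e6:.1f} million variants")
```

Under that assumption the density corresponds to roughly 20 million variants; the number actually called in the study may differ, since it depends on the assembled length and on the regions that could be genotyped.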
Data availability
Pterocarpus santalinus genome assembly sequences have been deposited at the NCBI GenBank under BioProject PRJNA913770 and BioSample Accession SAMN32305990.
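The deposited records can be looked up by accession. As a minimal sketch, the snippet below builds query URLs for the NCBI Entrez E-utilities `esearch` endpoint for the two accessions given above. It only constructs the URLs and makes no network request; the helper name `esearch_url` is hypothetical, and this lookup route is an assumption, not a retrieval procedure described by the authors.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db: str, term: str) -> str:
    """Build an esearch query URL (hypothetical helper, no request is sent)."""
    query = urlencode({"db": db, "term": term, "retmode": "json"})
    return f"{EUTILS_ESEARCH}?{query}"


# Accessions as stated in the data availability statement.
bioproject_url = esearch_url("bioproject", "PRJNA913770")
biosample_url = esearch_url("biosample", "SAMN32305990")

print(bioproject_url)
print(biosample_url)
```

Opening either URL in a browser, or fetching it with any HTTP client, should return the matching Entrez record identifiers, which can then be used to locate the assembly.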
Abbreviations
- mya: Million years ago
- SNP: Single-nucleotide polymorphism
- WGD: Whole-genome duplication
Acknowledgements
The authors acknowledge the funding support provided by the National Biodiversity Authority, Government of India. The research fellowships provided to SS by the National Biodiversity Authority, Government of India, and to ME by the Department of Biotechnology, Government of India, are gratefully acknowledged. The authors are grateful to the Andhra Pradesh Forest Department for providing support to access the germplasm used for the genome sequencing and re-sequencing studies.
Funding
This study was funded by the National Biodiversity Authority, Government of India (Grant No. Tech./Gen1/22/149/17/18-19/290). The research fellowships provided to SS by the National Biodiversity Authority, Government of India, and to ME by the Department of Biotechnology, Government of India, are gratefully acknowledged.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Communicated by Dorothea Bartels.
About this article
Cite this article
Ghosh Dasgupta, M., Senthilkumar, S., Muthulakshmi, E. et al. The draft genome reveals early duplication event in Pterocarpus santalinus: an endemic timber species. Planta 258, 27 (2023). https://doi.org/10.1007/s00425-023-04190-4
DOI: https://doi.org/10.1007/s00425-023-04190-4