De novo assembly of white poplar genome and genetic diversity of white poplar population in Irtysh River basin in China

  • Cover Article
Science China Life Sciences

Abstract

The white poplar (Populus alba) is widely distributed in Central Asia and Europe, and natural populations occur in the Irtysh River basin in China. It can also be cultivated and grows well in northern China. In this study, we sequenced the genome of P. alba using single-molecule real-time technology. The de novo assembly had a genome size of 415.99 Mb with a contig N50 of 1.18 Mb. A total of 32,963 protein-coding genes were identified, and 45.16% of the genome was annotated as repetitive elements. Genome evolution analysis revealed that the divergence between P. alba and Populus trichocarpa (black cottonwood) occurred ~5.0 Mya (3.0, 7.1). Fourfold synonymous third-codon transversion (4DTV) and synonymous substitution rate (ks) distributions supported the occurrence of the salicoid whole-genome duplication (WGD) event (~65 Mya). Twelve natural populations of P. alba in the Irtysh River basin in China were sequenced to explore genetic diversity. The average pooled heterozygosity of the P. alba populations was 0.170±0.014, lower than that in Italy (0.271±0.051) and Hungary (0.264±0.054). Tajima’s D values showed a negative distribution, which might signify an excess of low-frequency polymorphisms and a bottleneck followed by expansion in the P. alba populations examined.
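
To illustrate why negative Tajima's D values point to an excess of low-frequency polymorphisms, the following minimal sketch computes the classic form of the statistic from per-site minor-allele counts. It is not the pooled-sequencing estimator used in the study. The function name `tajimas_d`, its parameters, and the toy inputs are assumptions made for illustration only.

```python
from math import sqrt


def tajimas_d(n, minor_counts):
    """Classic Tajima's D for n sampled sequences.

    minor_counts holds, for each segregating site, the number of
    sequences carrying the minor allele (1 <= k <= n - 1).
    Illustrative sketch only: no correction for pooled sequencing.
    """
    s = len(minor_counts)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero below that)")
    if s == 0:
        raise ValueError("need at least one segregating site")
    if any(k < 1 or k > n - 1 for k in minor_counts):
        raise ValueError("each count must satisfy 1 <= k <= n - 1")

    # pi: mean number of pairwise differences, summed over sites.
    pairs = n * (n - 1) / 2
    pi = sum(k * (n - k) for k in minor_counts) / pairs

    # Constants from the standard definition of the statistic.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # D compares pi with Watterson's estimate S / a1.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): 10 sequences, 5 segregating sites.
rare = tajimas_d(10, [1, 1, 1, 1, 1])      # every variant is a singleton
balanced = tajimas_d(10, [5, 5, 5, 5, 5])  # every variant at 50% frequency
print(f"singletons only: D = {rare:.3f}")
print(f"intermediate frequencies: D = {balanced:.3f}")
```

With only singletons, the pairwise diversity falls below the expectation derived from the number of segregating sites, so D is negative. With variants at intermediate frequency, the opposite holds and D is positive.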

Acknowledgements

We thank Dr. Jian Wang for assisting with the population sampling from the Irtysh River basin. This work was supported by the National Science Fund for Distinguished Young Scholars (31425006) and the Chinese Academy of Forestry (CAFYBB2018ZX001).

Author information

Corresponding author

Correspondence to Qing-Yin Zeng.

About this article

Cite this article

Liu, YJ., Wang, XR. & Zeng, QY. De novo assembly of white poplar genome and genetic diversity of white poplar population in Irtysh River basin in China. Sci. China Life Sci. 62, 609–618 (2019). https://doi.org/10.1007/s11427-018-9455-2

  • DOI: https://doi.org/10.1007/s11427-018-9455-2
