Multiplexed shotgun sequencing reveals congruent three-genome phylogenetic signals for four botanical sections of the flax genus Linum

https://doi.org/10.1016/j.ympev.2016.05.010

Abstract

Genome-wide detection of phylogenetic signals by next generation sequencing (NGS) has recently emerged as a promising genomic approach for phylogenetic analysis of non-model organisms. Here we explored the use of a multiplexed shotgun sequencing method to assess the phylogenetic relationships of 18 Linum samples representing 16 species within four botanical sections of the flax genus Linum. Whole-genome DNA from the 18 Linum samples was fragmented, tagged, and sequenced on an Illumina MiSeq. The sequencing reads acquired for each sample were then separated into chloroplast, mitochondrial, and nuclear reads. SNP calling on the genome-specific sequence data sets revealed 6143 chloroplast, 2673 mitochondrial, and 19,562 nuclear SNPs. Phylogenetic analyses based on the three-genome SNP data sets, with and without missing observations, showed congruent three-genome phylogenetic signals for the four botanical sections of the Linum genus. Specifically, two major lineages were confirmed, separating the Linum–Dasylinum sections from the Linastrum–Syllinum sections. The Linum section displayed three major branches representing two major evolutionary stages leading to cultivated flax. Cultivated flax and its immediate progenitor formed their own branch, genetically closer to L. decumbens and L. grandiflorum, which have a chromosome count of eight, and distant from six other species with a chromosome count of nine. Five species of the Linastrum and Syllinum sections were genetically more distant from cultivated flax, but they appeared to be more closely related to each other, even with variable chromosome counts. These findings not only provide the first evidence of congruent three-genome phylogenetic pathways within the Linum genus, but also demonstrate the utility of multiplexed shotgun sequencing for acquiring three-genome phylogenetic signals from non-model organisms.

Introduction

The last two decades have seen increased efforts in studies of phylogenetic relationships of plant species using DNA sequences from different genomic compartments (Chaw et al., 2000, Guo and Ge, 2005, Rieseberg and Soltis, 1991, Tsutsui et al., 2009, Wang et al., 2000, Yang et al., 2012). These studies frequently showed incongruent phylogenies and rarely reported evidence for congruent phylogenetic signals extracted from organelle and nuclear genomes (Soltis and Kuzoff, 1995, Pelser et al., 2010, Wang et al., 2011, Yu et al., 2013). These research findings are largely expected (Korpelainen, 2004, Wolfe et al., 1987), as inheritance pathways of these three genomes in most plant species differ; the chloroplast, mitochondrial, and nuclear genomes are paternally, maternally, and biparentally inherited, respectively (Hansen et al., 2007, Korpelainen, 2004). Also, genes from different genomes may carry distinct phylogenies because they respond differently to processes such as lineage sorting, gene duplication/deletion, lateral gene transfer, and hybrid speciation (Doyle, 1997, Maddison, 1997, Wendel and Doyle, 1998). However, previous studies relied mainly on a few genes from each genome, and it is possible that the limited resolution of genome sampling was among the important factors clouding the phylogenetic analyses (Rokas et al., 2003, Tsutsui et al., 2009).

Next generation sequencing (NGS) has recently emerged as a promising genomic approach for phylogenetic analysis of non-model organisms with a genome-wide detection of phylogenetic signals (Soltis et al., 2013). It is now technically possible to obtain enormous amounts of sequence data from any genome of any species in a short time at low cost (Davey et al., 2011). Phylogenetic reconstruction with increased resolution can be performed using genome-wide single nucleotide polymorphism (SNP) data (Ekblom and Galindo, 2011, Mamanova et al., 2010, Straub et al., 2012). However, the application of NGS to phylogenetic analysis has been relatively slow, as such analyses mainly deal with non-model organisms, require sequencing of more samples per species, have no consensus NGS protocols specific for particular research questions, and are in transition between whole-genome sequencing and genome-reduction approaches (McCormack et al., 2013). Currently, four major NGS approaches with great potential for phylogenetic analysis are being explored: amplicon sequencing, restriction-digest, target enrichment, and transcriptome analysis (Escudero et al., 2014, McCormack et al., 2013, Ruhsam et al., 2015, Sveinsson et al., 2013). Among those approaches, restriction-digest based methods such as restriction-site associated DNA sequencing (RADseq; Emerson et al., 2010, Hohenlohe et al., 2011) or genotyping-by-sequencing (GBS; Elshire et al., 2011) are highly promoted, while whole-genome sequencing is the least applied (Andolfatto et al., 2011, Philippe et al., 2009, Rokas et al., 2003, Ruhsam et al., 2015). Thus, little is known about the effectiveness and feasibility of these NGS approaches in a phylogenetic analysis (Escudero et al., 2014, Sveinsson et al., 2013, Zimmer and Wen, 2015).

The flax genus Linum is the largest genus in the flowering plant family Linaceae, with over 180 species spread over six continents (Winkler, 1931), and a relatively old genus thought to have originated about 46 million years ago (MYA) (McDill and Simpson, 2011, McDill et al., 2009). It is divided into six botanical sections: Linum, Dasylinum, Syllinum, Linastrum, Cathartolinum and Cliococca (Winkler, 1931). There are two major lineages: a large, predominantly blue-flowered clade containing the sections Dasylinum and Linum and a yellow-flowered clade consisting of the sections Cathartolinum, Linopsis and Syllinum (McDill and Simpson, 2011, McDill et al., 2009). Cultivated flax (L. usitatissimum L.) is an important source of high-quality fibers (Mohanty et al., 2000) and seed oil (Green, 1986). Cultivated flax has been further divided, based on important traits, into several varietal groups, including L. usitatissimum L. landrace, L. usitatissimum L. fiber and L. usitatissimum L. oil (Diederichsen and Fu, 2006, Dillman et al., 1953). In recent years, the medicinal applications of Linum, such as treatments for cardiovascular diseases and cancer, have renewed interest in systematic relations among Linum species (McDill et al., 2009, Rickard-Bon and Thompson, 2003). The phylogenetic relationships and evolutionary history of diverse species in the Linum genus have been investigated using cytogenetic, genetic, and genomic approaches (Allaby et al., 2005, Chennaveeraiah and Joshi, 1983, Fu and Allaby, 2010, Fu et al., 2002, Gill and Yermanos, 1967, Melnikova et al., 2014, Sveinsson et al., 2013, Uysal et al., 2010). Overall, the revealed evolutionary relationships of assayed Linum species are largely consistent across these studies, with few incongruent phylogenetic relationships reported (Allaby et al., 2005, Fu and Peterson, 2012, McDill et al., 2009, Sveinsson et al., 2013, Uysal et al., 2010). Thus, Linum may be a good genus for assessing the congruency of phylogenetic signals among different genomes.

The objectives of this study were to (1) investigate the utility of a multiplexed shotgun sequencing approach to retrieve chloroplast, mitochondrial, and nuclear SNP data for phylogenetic inference and (2) compare three-genome phylogenetic relationships of 18 Linum samples representing 16 species within four botanical sections of the Linum genus. It is our hope that this study will allow us to assess the congruency of the three-genome phylogenetic signals among those four taxonomic sections and to assess the effectiveness of the multiplexed shotgun sequencing approach in the acquisition of multiple-genome phylogenetic signals at the lower taxonomic level of plant species.

Materials and methods

Our exploration of the multiplexed shotgun sequencing approach was empirical and focused on flax species. The effort included four components: (1) flax species selection, (2) multiplexed shotgun sequencing trial, (3) bioinformatics pipeline development and application, and (4) phylogenetic analysis.
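As a rough illustration of component (3), the sketch below shows one way reads could be binned by genomic compartment and an SNP matrix reduced to sites without missing observations before a simple pairwise distance is computed. It is not the authors' pipeline: the function names, the `"N"` code for a missing call, the rule that reads hitting more than one compartment are set aside as ambiguous, and the use of a p-distance are all assumptions made for this example.

```python
from collections import Counter

MISSING = "N"  # assumed code for a missing genotype call


def partition_reads(read_hits):
    """Count reads per compartment.

    `read_hits` maps a read ID to the set of compartments
    ("chloroplast", "mitochondrial", "nuclear") the read aligned to.
    Reads hitting exactly one compartment are assigned to it; reads
    hitting several or none are counted as "ambiguous" (an assumption).
    """
    counts = Counter()
    for hits in read_hits.values():
        if len(hits) == 1:
            counts[next(iter(hits))] += 1
        else:
            counts["ambiguous"] += 1
    return counts


def complete_sites(snp_matrix):
    """Keep only SNP sites that have a call in every sample.

    `snp_matrix` maps a sample name to a string of calls, one
    character per SNP site, all strings of equal length.
    """
    samples = list(snp_matrix)
    n_sites = len(snp_matrix[samples[0]])
    keep = [
        i for i in range(n_sites)
        if all(snp_matrix[s][i] != MISSING for s in samples)
    ]
    return {s: "".join(snp_matrix[s][i] for i in keep) for s in samples}


def p_distance(a, b):
    """Proportion of differing sites, skipping sites missing in either sample."""
    pairs = [(x, y) for x, y in zip(a, b) if MISSING not in (x, y)]
    if not pairs:
        return float("nan")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy data, invented for illustration only.
read_hits = {
    "r1": {"chloroplast"},
    "r2": {"nuclear"},
    "r3": {"chloroplast", "mitochondrial"},
    "r4": set(),
}
snps = {"A": "ACGN", "B": "ATGT", "C": "ACNT"}

read_counts = partition_reads(read_hits)
snps_complete = complete_sites(snps)
dist_ab = p_distance(snps["A"], snps["B"])
```

Comparing distances computed from `snps` (missing calls skipped pairwise) with those from `snps_complete` (only fully observed sites) mirrors, in miniature, an analysis run with and without missing observations.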

Multiplexed shotgun sequencing

In this study, we explored a multiplexed shotgun sequencing protocol for acquiring phylogenetic signals from organelle and nuclear genomes of non-model organisms. As outlined in Fig. 1, the protocol consists of four major parts: library preparation, sequencing on MiSeq, SNP calling, and phylogenetic analysis. Each part has many steps, as detailed in the Materials and methods. Considerable effort was devoted to developing the 3GenomeSNP pipeline to separate genome-specific reads, assemble

Discussion

Here we explored the use of multiplexed shotgun sequencing to retrieve phylogenetic signals from organelle and nuclear genomes of the Linum genus and revealed the first evidence for congruent three-genome phylogenetic relationships among four botanical sections of the genus. This finding is significant for phylogenetic inference, as such evidence has rarely been reported at the lower taxonomic level of plant species. More importantly, our application has demonstrated the effectiveness and

Concluding remarks

We explored a multiplexed shotgun sequencing protocol to retrieve phylogenetic signals from plant organelle and nuclear genomes. Application to the flax genus Linum revealed the first evidence for congruent three-genome phylogenetic relationships present among four botanical sections of the genus. This finding is significant for phylogenetic inference, as such evidence has been rarely reported from organelle and nuclear genomes at the lower taxonomic level of plant species. More research with

Authors’ contributions

YBF conceived the research, designed and developed the pipeline, generated and analyzed the NGS data, and wrote the manuscript; YD developed and tested the pipeline, analyzed the NGS data, and wrote the manuscript; M-HY analyzed the NGS data and wrote the manuscript.

Conflict of interest

The authors declare no conflict of interest.

Acknowledgments

We would like to thank Mr. Greg Peterson and Ms. Carolee Horbach (Agriculture and Agri-Food Canada) for their technical assistance in flax shotgun sequencing development; Dr. Marcus Hecker and Mr. Jonathon Doering at the Toxicology Centre, University of Saskatchewan, for the use of their Illumina MiSeq facility for this project; and Dr. Frank You for access to the published flax sequence scaffolds. This research was financially supported by an Agriculture and Agri-Food Canada A-Base research

References (76)

  • S.M. Chaw et al.

    Seed plant phylogeny inferred from all three plant genomes: monophyly of extant gymnosperms and origin of Gnetales from conifers

    Proc. Natl. Acad. Sci. USA

    (2000)
  • M.S. Chennaveeraiah et al.

    Karyotypes in cultivated and wild species of Linum

    Cytologia

    (1983)
  • R. Chikhi et al.

    Space-efficient and exact de Bruijn graph representation based on a Bloom filter

    Algor. Mol. Biol.

    (2013)
  • D. Darriba et al.

    JModelTest 2: more models, new heuristics and parallel computing

    Nat. Methods

    (2012)
  • J.W. Davey et al.

    Genome-wide genetic marker discovery and genotyping using next-generation sequencing

    Nat. Rev. Genet.

    (2011)
  • A. Diederichsen et al.

    Phenotypic and molecular (RAPD) differentiation of four infraspecific groups of cultivated flax (Linum usitatissimum L. subsp. usitatissimum)

    Genet. Resour. Crop Evol.

    (2006)
  • Dillman, A.C., 1953. Classification of flax varieties, 1946. USDA Technical Bulletin No. 1054. United States Department...
  • J.J. Doyle

    Trees within trees: genes and species, molecules and morphology

    Syst. Biol.

    (1997)
  • R. Ekblom et al.

    Applications of next generation sequencing in molecular ecology of non-model organisms

    Heredity

    (2011)
  • R.J. Elshire et al.

    A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species

    PLoS ONE

    (2011)
  • K.J. Emerson et al.

    Resolving postglacial phylogeography using high throughput sequencing

    Proc. Natl. Acad. Sci. USA

    (2010)
  • Y.B. Fu

    Genetic diversity analysis of highly incomplete SNP genotype data with imputations: an empirical assessment

    Gen. Genom. Genet.

    (2014)
  • Y.B. Fu et al.

    Phylogenetic network of Linum species as revealed by non-coding chloroplast DNA sequences

    Genet. Resour. Crop Evol.

    (2010)
  • Y.B. Fu et al.

    RAPD analysis of genetic relationships of seven flax species in the genus Linum L.

    Genet. Resour. Crop Evol.

    (2002)
  • Y.B. Fu et al.

    PaSNPg: a GBS-Based pipeline for protein-associated SNP discovery and genotyping in non-model species

    J. Proteom. Bioinform.

    (2015)
  • Y.B. Fu et al.

    Developing genomic resources in two Linum species via 454 pyrosequencing and genomic reduction

    Mol. Ecol. Resour.

    (2012)
  • K.S. Gill et al.

    Cytogenetic studies on the genus Linum

    Crop Sci.

    (1967)
  • A.G. Green

    Genetic control of polyunsaturated fatty acid biosynthesis in flax (Linum usitatissimum) seed oil

    Theor. Appl. Genet.

    (1986)
  • Y.L. Guo et al.

    Molecular phylogeny of Oryzeae (Poaceae) based on DNA sequences from chloroplast, mitochondrial, and nuclear genomes

    Am. J. Bot.

    (2005)
  • A.K. Hansen et al.

    Paternal, maternal, and biparental inheritance of the chloroplast genome in Passiflora (Passifloraceae): implications for phylogenetic studies

    Am. J. Bot.

    (2007)
  • P.A. Hohenlohe et al.

    RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow trout and westslope cutthroat trout

    Mol. Ecol. Resour.

    (2011)
  • J.P. Huelsenbeck et al.

    Bayesian inference of phylogeny and its impact on evolutionary biology

    Science

    (2001)
  • D.H. Huson et al.

    Application of phylogenetic networks in evolutionary studies

    Mol. Biol. Evol.

    (2006)
  • Illumina, 2012. Nextera® XT DNA Sample Preparation Guide. Part # 15031942 Rev. C, 2012....
  • Illumina, 2013. Preparing Libraries for Sequencing on the MiSeq®. Part # 15039740 Rev. D, 2013....
  • A. Jhala et al.

    Potential hybridization of flax with weedy and wild relatives: an avenue for movement of engineered genes?

    Crop Sci.

    (2008)
  • H. Korpelainen

    The evolutionary processes of mitochondrial and chloroplast genomes differ from those of nuclear genomes

    Naturwissenschaften

    (2004)
  • B. Langmead et al.

    Fast gapped-read alignment with Bowtie 2

    Nat. Methods

    (2012)