Genome-wide SNPs resolve phylogenetic relationships in the North American spruce budworm (Choristoneura fumiferana) species complex

https://doi.org/10.1016/j.ympev.2017.04.001

Highlights

  • Genotyping-by-sequencing has resolved relationships in the spruce budworm complex.

  • Multiple analyses agreed on an unexpectedly basal placement for Choristoneura pinus.

  • Relationships remain ambiguous for a clade of western species.

  • Speciation has likely been driven by a combination of ecological factors.

Abstract

High throughput sequencing technologies have revolutionized the potential to reconcile incongruence between gene and species trees, and numerous approaches have been developed to take advantage of these advances. Genotyping-by-sequencing is becoming a regular tool for gathering phylogenetic data, yet comprehensive evaluations of phylogenetic methods using these data are sparse. Here we use multiple phylogenetic and population genetic methods for genotyping-by-sequencing data to assess species relationships in a group of forest insect pests, the spruce budworm (Choristoneura fumiferana) species complex. With few exceptions, all methods agree on the same relationships, most notably placing C. pinus as basal to the remainder of the group, rather than C. fumiferana as previously suggested. We found strong support for the monophyly of C. pinus, C. fumiferana, and C. retiniana, but more ambiguous relationships and signatures of introgression in a clade of western lineages, including C. carnana, C. lambertiana, C. occidentalis occidentalis, C. occidentalis biennis, and C. orae. This represents the most taxonomically comprehensive genomic treatment of the spruce budworm species group, which is further supported by the broad agreement among multiple methodologies.

Introduction

Incongruence between gene and species trees has been a long-recognized concept in molecular systematics and phylogenetics (e.g. Fitch, 1970, Avise et al., 1983, Maddison, 1997). Conceptually, this incongruence can be accommodated by considering the modal gene phylogeny (the most frequently occurring gene phylogeny) as the species phylogeny (Pamilo and Nei, 1988, Maddison, 1997, Sperling, 2003). Ten to twenty years ago, the main limitation of this conceptual solution was methodological: the prohibitive cost of gathering sufficient data to estimate a modal gene phylogeny, especially for many taxa. This particular limitation has been substantially decreased by the immense growth of high-throughput next-generation sequencing (NGS) (Metzker, 2009). However, systematists now face another suite of methodological challenges regarding how to appropriately use these genomic data to construct phylogenies. Genotyping-by-sequencing (GBS: Elshire et al., 2011, Poland et al., 2012) has garnered particular interest for phylogenetic reconstruction (Miller et al., 2007, Baird et al., 2008, Davey et al., 2011), particularly in relatively shallow phylogenetic contexts (see Rubin et al., 2012) (note: although we focus on GBS here, the same methodological considerations apply to other restriction-site-associated DNA sequencing (RAD-seq) techniques; for simplicity in referring to methodological considerations we use “GBS” exclusively). Flexible requirements for a priori genomic resources and cost-effectiveness at the scales of loci and individuals (Peterson et al., 2012) make GBS ideal for systematics in the current era, where the distinction between traditional phylogenetics and population genetics is fading (Edwards, 2009). Although a growing number of studies have illustrated the utility of GBS for reconstructing phylogenies (e.g. Rubin et al., 2012, Wagner et al., 2012, Eaton and Ree, 2013, Jones et al., 2013, Nadeau et al., 2013, Cruaud et al., 2014, Hipp et al., 2014, Gohli et al., 2015, DaCosta and Sorenson, 2016, Díaz-Arce et al., 2016, Stervander et al., 2016, Rivers et al., 2016), critical and comprehensive evaluations of the phylogenetic methods used for these datasets are still relatively rare (e.g. DaCosta and Sorenson, 2016).
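
The idea of a modal gene phylogeny can be illustrated with a minimal sketch. The Python below is not the procedure used in this study; it simply tallies rooted topologies, written as nested tuples, and reports the most frequent one. The function names and the four toy gene trees are hypothetical and chosen only for illustration.

```python
from collections import Counter


def canonical(tree):
    """Return an order-independent form of a rooted topology.

    A tree is either a tip label (str) or a tuple of subtrees. Sorting the
    children at every node makes ((A, B), C) and (C, (B, A)) compare equal.
    """
    if isinstance(tree, str):
        return tree
    return tuple(sorted((canonical(child) for child in tree), key=repr))


def modal_topology(gene_trees):
    """Return the most frequent topology and its relative frequency."""
    counts = Counter(canonical(tree) for tree in gene_trees)
    topology, count = counts.most_common(1)[0]
    return topology, count / len(gene_trees)


# Hypothetical toy gene trees (not data from this study).
gene_trees = [
    ("pinus", ("fumiferana", "occidentalis")),
    (("occidentalis", "fumiferana"), "pinus"),
    ("fumiferana", ("pinus", "occidentalis")),
    ("pinus", ("occidentalis", "fumiferana")),
]

topology, frequency = modal_topology(gene_trees)
print(topology, frequency)
```

In this toy example three of the four gene trees share one topology, so that topology would be taken as the estimate of the species phylogeny. With short GBS loci, individual gene trees are often poorly resolved, which is one reason such a direct tally is rarely practical on real data.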

Most empirical studies using GBS for phylogenomics have relied on concatenation (or supermatrix) methods, under the assumption that the volume of phylogenetic signal in thousands of loci outweighs any potential gene tree/species tree discordance (e.g. Nadeau et al., 2013, Wagner et al., 2012, Cruaud et al., 2014). Concatenation methods simplify model selection for genomic data and have been shown to be robust for estimating species trees from GBS data (Rivers et al., 2016). However, many studies have also shown that such treatment of phylogenomic data can be misleading for both determination of species relationships and evaluation of tree support (e.g. strong bootstrap support for incorrect relationships) (Kubatko and Degnan, 2007, Weisrock et al., 2012, McVay and Carstens, 2013, Wielstra et al., 2014, Giarla and Esselstyn, 2015). Alternatively, coalescent methods (such as BEAST: Drummond and Rambaut, 2007), which may be more appropriate when genealogical discordance exists, present computational challenges for GBS data, and may become unfeasible with large datasets (e.g. Zimmerman et al., 2014). GBS loci are also relatively short (generally <100 bp) and have few variable sites per locus, which can lead to unresolved relationships when analyzed individually to construct “gene” trees (following “traditional” methods to determine species trees: Knowles and Kubatko, 2010). Likewise, emerging methods that use single nucleotide polymorphisms (SNPs) rather than sequence-based data (e.g. SNAPP: Bryant et al., 2012) can also be computationally demanding with large datasets. Given the methodological uncertainty surrounding GBS data and phylogenomics, and apparent impracticality of some approaches, critical evaluation of the leading methods is vital to establishing best practices for its application in phylogenetic contexts.
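
For readers unfamiliar with concatenation, the following Python sketch shows the basic supermatrix construction under simple assumptions: each locus is already aligned, and samples missing from a locus are padded with a missing-data character. The function name, sample labels, and sequences are hypothetical, and this is not the pipeline used in this study.

```python
def concatenate_loci(loci, missing="N"):
    """Join per-locus alignments into one supermatrix.

    loci: list of dicts mapping sample name -> aligned sequence; all
    sequences within a locus must have the same length.
    Samples absent from a locus are padded with the `missing` character.
    """
    samples = sorted({name for locus in loci for name in locus})
    parts = {name: [] for name in samples}
    for locus in loci:
        lengths = {len(seq) for seq in locus.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a locus must have equal length")
        length = lengths.pop()
        for name in samples:
            parts[name].append(locus.get(name, missing * length))
    return {name: "".join(chunks) for name, chunks in parts.items()}


# Hypothetical toy loci (not data from this study).
loci = [
    {"sample_A": "ACGT", "sample_B": "ACGA", "sample_C": "ACTT"},
    {"sample_A": "GGC", "sample_C": "GAC"},
]

matrix = concatenate_loci(loci)
for name, sequence in matrix.items():
    print(name, sequence)
```

Every locus contributes columns to a single alignment that is then analyzed as if it shared one genealogy, which is the simplification that the discordance concerns above are directed at.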

Here, we assessed phylogenetic relationships among lineages of the spruce budworm (Choristoneura fumiferana (Clemens, 1865)) species complex in North America using GBS data. Species in this group (most notably C. fumiferana, considered North America’s most destructive insect defoliator of living conifers: Volney and Fleming, 2007) exhibit wide population oscillations, with outbreak densities causing high tree mortality (Gray and MacKinnon, 2006) and serious economic losses. The complex is composed of eight or nine species (Lumley and Sperling, 2011a, Gilligan et al., 2014, Brunet et al., 2016) that differ primarily on the basis of larval host preferences (Stehr, 1967, Harvey, 1985), female sex-pheromone chemistry (Sanders et al., 1977), diapause characteristics (Harvey, 1967) and geography (Stehr, 1967). However, phylogenetic relationships in the complex remain uncertain, especially the placement of the eastern pine feeding species, Choristoneura pinus Freeman 1953 (Fig. 1). While most marker systems place C. fumiferana as the sister taxon to the rest of the complex, a recent analysis employing a genome-wide set of SNPs has cast doubt on this hypothesis by placing C. pinus basal to the group (Bird, 2013). However, these findings were confounded by the omission of several vital taxa within the group, including other pine feeders (e.g. C. lambertiana (Busck 1915)), and limited by the sole use of concatenation methods for phylogeny reconstruction.

In this study, we include all consistently recognized species in the complex and assess phylogenetic relationships of the group using multiple methods and datasets. First, we implement traditional phylogenetic methods with a concatenated (supermatrix) dataset. Given the common use of GBS datasets for population genomics, we compare the results of concatenated phylogenetic analyses to individual-based Bayesian clustering and ordinations using SNPs. Finally, we implement several approaches to assess gene tree/species tree incongruence, with and without a coalescent framework using both SNPs and sequence data. We find strong support for a single set of species-level relationships, and use this phylogeny to interpret mechanisms for speciation in the complex. This study represents the most taxonomically comprehensive phylogenomic evaluation of the spruce budworm species complex to date.

Sample collection

A total of 127 specimens were selected to maximize sampling of geographic range, host associations, and taxonomic diversity across the spruce budworm species complex. Larvae were collected by hand and reared to adult stage on host clippings from the plants they were collected on. Adults were collected using ultraviolet and mercury vapor light traps and sheets, as well as pheromone traps baited with species-specific pheromone lures (for pheromone compositions, see Silk and Eveleigh, 2012). We

Genotyping-by-sequencing

Unambiguous barcodes were found in a total of 285 million reads (average of 2.2 million reads per individual) and 233 million reads were retained after initial quality control (average per individual: 1.8 million). Eighty-four million reads were mapped to the C. fumiferana reference genome (average per individual: 0.6 million), and 72,570 SNPs were obtained from the populations portion of Stacks. When outgroup specimens were omitted (for population genetic analyses), this number was reduced to

Methodological considerations for GBS-based phylogenies

The use of GBS datasets in phylogenetics is increasingly common, yet critical evaluations of alternative phylogenetic methods have accumulated more slowly (e.g. DaCosta and Sorenson, 2016, Gohli et al., 2015, Stervander et al., 2015, Razkin et al., 2016, Rivers et al., 2016, Stervander et al., 2016). Despite healthy discussion concerning the merits of various phylogenetic methods (most often concatenation vs. coalescent methods: Weisrock et al., 2012, Wielstra et al., 2014, Giarla and

Conclusion

We present a comprehensive comparison of several phylogenetic methods using GBS data and highlight the importance of testing multiple methods and parameters for creating phylogenies from these genomic datasets. In the process, we have also resolved basal relationships in the spruce budworm species complex in North America. With few exceptions, all methods agreed on the same topology, placing C. pinus as basal to the remainder of the group. Choristoneura fumiferana, C. pinus, and C. retiniana

Data accessibility

Datafiles and alignments available from the Dryad data repository http://dx.doi.org/10.5061/dryad.00715.

Acknowledgements

We thank Alberta Environment and Sustainable Resource Development, G. Anweiler, S. Brunet, Canadian Forest Service, J. De Benedictis, J. Dombroskie, J. Doucette, A. Hundsdoerfer, J.F. Landry, L. Nolan, B. Proshek, A. Roe, D. Rubinoff, Saskatchewan Environment, and C. Whitehouse for assistance in specimen collection and two anonymous reviewers for comments. Funding was provided by an Alberta Innovates Bio Solutions grant (# VCS-11-034) and Natural Sciences and Engineering Research Council

References (111)

  • H.M. Bird

    Phylogenomics of the Choristoneura fumiferana species complex (Lepidoptera: Tortricidae)

    (2013)
  • Blackburn, G.S., Brunet, B.M.T., Muirhead, K., Cusson, M., Béliveau, C., Levesque, R.C., Lumley, L.M. Sperling, F.A.H.,...
  • Brunet, B.M.T., Blackburn, G.S., Muirhead, K., Lumley, L.M., Boyle, B., Levesque, R.C., Cusson, M., Sperling, F.A.H.,...
  • D. Bryant et al.

    Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis

    Mol. Biol. Evol.

    (2012)
  • Bouckaert, R., Heled, J., 2014. DensiTree 2: seeing trees through the forest. bioRxiv....
  • R. Bouckaert et al.

    BEAST 2: a software platform for Bayesian evolutionary analysis

    PLOS Comput. Biol.

    (2014)
  • P.J. Castrovillo

    Interspecific and intraspecific genetic comparisons of North American spruce budworms (Choristoneura spp.)

    (1982)
  • M. Cariou et al.

    Is RAD-seq suitable for phylogenetic inference? An in silico assessment and optimization

    Ecol. Evol.

    (2013)
  • J. Catchen et al.

    Stacks: building and genotyping loci de novo from short-read sequences

    G3: Genes, Genomes, Genet.

    (2011)
  • J. Catchen et al.

    Stacks: an analysis tool set for population genomics

    Mol. Ecol.

    (2013)
  • K.M.A. Chan et al.

    Leaky prezygotic isolation and porous genomes: rapid introgression of maternally inherited DNA

    Evolution

    (2005)
  • A. Cruaud et al.

    Empirical assessment of RAD sequencing for interspecific phylogeny

    Mol. Biol. Evol.

    (2014)
  • P. Danecek et al.

    The variant call format and VCFtools

    Bioinformatics

    (2011)
  • P.T. Dang

    Morphological study of male genitalia with phylogenetic inference of Choristoneura Lederer (Lepidoptera: Tortricidae)

    Can. Entomol.

    (1992)
  • J.W. Davey et al.

    Genome-wide genetic marker discovery and genotyping using next-generation sequencing

    Nat. Rev. Genet.

    (2011)
  • J.J. Dombroskie et al.

    Phylogeny of the tribe Archipini (Lepidoptera: Tortricidae: Tortricinae) and evolutionary correlates of novel secondary structures

    Zootaxa

    (2013)
  • A.J. Drummond et al.

    BEAST: Bayesian evolutionary analysis by sampling trees

    BMC Evol. Biol.

    (2007)
  • J.R. Dupuis et al.

    Multi-locus species delimitation in closely related animals and fungi: one marker is not enough

    Mol. Ecol.

    (2012)
  • D.A.R. Eaton

    PyRAD: assembly of de novo RADseq loci for phylogenetic analyses

    Bioinformatics

    (2014)
  • D.A.R. Eaton et al.

    Inferring phylogeny and introgression using RADseq data: an example from flowering plants (Pedicularis: Orobanchaceae)

    Syst. Biol.

    (2013)
  • S.V. Edwards

    Is a new and general theory of molecular systematics emerging?

    Evolution

    (2009)
  • R.J. Elshire et al.

    A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species

    PLoS ONE

    (2011)
  • G. Evanno et al.

    Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study

    Mol. Ecol.

    (2005)
  • D. Falush et al.

    Inference of population structure: extensions to linked loci and correlated allele frequencies

    Genetics

    (2003)
  • J. Felsenstein

    Confidence limits on phylogenies: an approach using the bootstrap

    Evolution

    (1985)
  • W.M. Fitch

    Distinguishing homologous from analogous proteins

    Syst. Zool.

    (1970)
  • W.M. Fitch

    Toward defining the course of evolution: minimum change for a specific tree topology

    Syst. Zool.

    (1971)
  • T.N. Freeman

    On coniferophagous species of Choristoneura (Lepidoptera: Tortricidae) in North America, I. Some new forms of Choristoneura allied to C. fumiferana

    Can. Entomol.

    (1967)
  • D.J. Funk et al.

    Species-level paraphyly and polyphyly: frequency, causes, and consequences, with insight from animal mitochondrial DNA

    Annu. Rev. Ecol. Evol. Syst.

    (2003)
  • T.C. Giarla et al.

    The challenges of resolving a rapid, recent radiation: empirical and simulated phylogenomics of Philippine shrews

    Syst. Biol.

    (2015)
  • Gilligan, T.M., Baixeras, J., Brown, J.W. Tuck, K.R., 2014. T@RTS: Online World Catalogue of the Tortricidae v3.0...
  • J. Gohli et al.

    The evolutionary history of Afrocanarian blue tits inferred from genomewide SNPs

    Mol. Ecol.

    (2015)
  • S. Guindon et al.

    New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0

    Syst. Biol.

    (2010)
  • D.R. Gray et al.

    Outbreak patterns of the spruce budworm and their impacts in Canada

    For. Chron.

    (2006)
  • G.T. Harvey

    On coniferophagous Choristoneura (Lepidoptera: Tortricidae) in North America. V. Second diapause as a species character

    Can. Entomol.

    (1967)
  • G.T. Harvey

    The taxonomy of the coniferophagous Choristoneura (Lepidoptera: Tortricidae): a review

  • G.T. Harvey

    Genetic relationships among Choristoneura species (Lepidoptera: Tortricidae) in North America as revealed by isozyme studies

    Can. Entomol.

    (1996)
  • G.T. Harvey

    Interspecific crosses and fertile hybrids among the coniferophagous Choristoneura (Lepidoptera: Tortricidae)

    Can. Entomol.

    (1997)
  • M. Hasegawa et al.

    Dating of the human-ape splitting by a molecular clock of mitochondrial DNA

    J. Mol. Evol.

    (1985)
  • A.L. Hipp et al.

    A framework phylogeny of the American oak clade based on sequenced RAD data

    PLoS ONE

    (2014)