Deep metazoan phylogeny: When different genes tell different stories
Highlights
► Deep metazoan phylogeny was tested using non-overlapping multi-gene matrices.
► Different partitions produce conflicting phylogenies.
► Level of saturation and LBA artifacts depend on gene sampling strategy.
► Ctenophora-basal and the sponge paraphyly correlate with higher saturation.
► Genes involved in translation support the Coelenterata and monophyly of Porifera.
Introduction
The historical sequence of early animal diversification events has been the subject of debate for approximately a century. Morphological character analyses leave a degree of uncertainty concerning the evolutionary relationships among the five major metazoan lineages: Porifera, Placozoa, Ctenophora, Cnidaria, and Bilateria (Collins et al., 2005). In the last few years, this debate has been fueled by a plethora of conflicting phylogenetic hypotheses generated using molecular data (Dunn et al., 2008, Erwin et al., 2011, Philippe et al., 2009, Pick et al., 2010, Schierwater et al., 2009, Sperling et al., 2009). The persisting controversy includes questions concerning the earliest diverging animal lineage (Porifera vs. Placozoa vs. Ctenophora), the validity of the Eumetazoa (Bilateria + Cnidaria + Ctenophora) and Coelenterata (Cnidaria + Ctenophora) clades, and relationships among the main lineages of Porifera (sponges; reviewed in Wörheide et al. (2012)). These questions are fundamental for understanding the evolution of both animal body plans and genomes (Philippe et al., 2009).
Rokas et al. (2003a) showed that the evolutionary relationships between major metazoan lineages cannot be resolved using single genes or a small number of protein-coding sequences. Because of high stochastic error, analyses of individual genes resulted in conflicting phylogenies. These authors also observed that at least 8000 randomly selected characters (>20 genes) are required to overcome the effect of these discrepancies (Rokas et al., 2003b). However, the authors’ subsequent attempt at resolving the deep metazoan relationships using a large dataset containing 50 genes from 17 metazoan taxa (including six non-bilaterian species) was not successful (Rokas et al., 2005). By contrast, the analysis of the identical set of genes robustly resolved the higher-level phylogeny of Fungi, a group of approximately the same age as the Metazoa (Yuan et al., 2005). Based on this contrast, the authors concluded that, because of the rapidity of the metazoan radiation, the true phylogenetic signal preserved on the deep internal branches was too weak to reliably infer their branching order (Rokas and Carroll, 2006). This conclusion did not discourage further attempts at resolving this difficult phylogenetic question using the traditional sequence-based approach. The main strategy of the subsequent studies was to increase the amount of data, in terms of both gene and taxon sampling. In 2008, a novel hypothesis of early metazoan evolution was proposed by Dunn et al. (2008) based on the analysis of 150 nuclear genes (21,152 amino acid [aa] characters) from 71 metazoan taxa (of which only nine were non-bilaterian species). According to this hypothesis, ctenophores represent the earliest diverging branch of the Metazoa.
This evolutionary scenario did not gain any support from the analysis of another large alignment that contained 128 genes (30,257 aa) and a larger number of non-bilaterian metazoan species (22; Philippe et al., 2009). This study revived the Coelenterata and Eumetazoa hypotheses (Hyman, 1940) and placed the Placozoa as the sister-group of the Eumetazoa. Another scenario for early metazoan evolution was proposed by Schierwater et al. (2009) based on the analysis of a dataset that included not only nuclear protein-coding genes but also mitochondrial genes and morphological characters (a “total evidence” dataset). This study recovered a monophyletic “Diploblasta” (i.e., non-bilaterian metazoans), with a “basal” Placozoa, as the sister-group of the Bilateria.
Recently published metazoan phylogenies differ in their taxon and gene sampling and in their application of phylogenetic methods and thresholds, including the use of different models of amino acid substitution. Any of these factors may be a source of the observed incongruity among the proposed deep metazoan phylogenies (Dunn et al., 2008, Philippe et al., 2009, Schierwater et al., 2009). Comparative analyses of the three above-described multi-gene alignments showed that the observed conflict can be partially attributed to contamination, alignment errors, and reliance on simplified evolutionary models (Philippe et al., 2011), or to long branch attraction artifacts caused by insufficient ingroup taxon sampling (Pick et al., 2010). Correcting the alignment errors in the datasets of Dunn et al. (2008) and Schierwater et al. (2009) and applying the evolutionary model that best fit these data altered both the tree topology and basal node support, but failed to resolve the incongruences among the three phylogenies.
The objective of the present study is to further assess the causes of inconsistency between deep (non-bilaterian) metazoan phylogenies obtained using phylogenomic (large multi-gene) datasets, with the main emphasis on the effect of gene sampling. We approached this question with multiple comparative analyses of a novel phylogenomic dataset comprising two multi-gene sub-matrices that have identical taxon sampling, comparable lengths, and comparable percentages of missing data, but different gene contents. We also increased the taxon sampling by adding new data from non-bilaterian lineages, including seven Porifera species, one Ctenophora species, and a novel placozoan strain.
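As an illustration of what "comparable lengths and missing data percentage" means in practice, the following minimal Python sketch computes these quantities for two concatenated matrices and checks that they share the same taxa. It is not the authors' pipeline: the function names `matrix_stats` and `same_taxa`, the choice of `-`, `?`, and `X` as missing-data symbols, and the toy sequences are all assumptions made for this example.

```python
def matrix_stats(alignment):
    """Return (n_taxa, n_sites, percent_missing) for an aligned matrix.

    `alignment` maps a taxon label to its aligned amino acid string.
    Treating '-', '?' and 'X' as missing is an assumption of this sketch.
    """
    missing_symbols = set("-?X")
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    n_taxa = len(alignment)
    # Count missing cells over the whole taxon-by-site matrix.
    missing = sum(
        ch in missing_symbols
        for seq in alignment.values()
        for ch in seq.upper()
    )
    return n_taxa, n_sites, 100.0 * missing / (n_taxa * n_sites)


def same_taxa(a, b):
    """True if both matrices contain exactly the same taxon labels."""
    return set(a) == set(b)


# Toy matrices with invented sequences; the labels are placeholders only.
matrix_a = {"Sycon": "MKV-LA??", "Beroe": "MKVQLAGT", "Tethya": "M-VQLA-T"}
matrix_b = {"Sycon": "MRT-LSGA", "Beroe": "MRTALS-A", "Tethya": "MRTALSGA"}

for name, matrix in (("A", matrix_a), ("B", matrix_b)):
    n_taxa, n_sites, pct_missing = matrix_stats(matrix)
    print(f"matrix {name}: {n_taxa} taxa, {n_sites} sites, "
          f"{pct_missing:.1f}% missing")
print("identical taxon sampling:", same_taxa(matrix_a, matrix_b))
```

Holding taxon sampling, length, and missing data roughly constant in this way is what allows differences between the resulting trees to be attributed to gene content rather than to those other factors.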
Data acquisition
New data were generated for nine species of non-bilaterian metazoans, including one ctenophore, Beroe sp., an unidentified placozoan species (Placozoan strain H4), and seven sponges: Asbestopluma hypogea, Ephydatia muelleri, Pachydictyum globosum, Tethya wilhelma (all from class Demospongiae), Crateromorpha meyeri (class Hexactinellida), Corticium candelabrum (class Homoscleromorpha) (Expressed Sequence Tag [EST] libraries), and Sycon ciliatum (class Calcarea; EST and genomic data). The data
Different gene matrices tell different stories
The ProtTest analyses indicated that LG + Γ + I was the evolutionary model that best fit the majority of the single-gene alignments in a Maximum Likelihood (ML) framework. However, a further statistical comparison (cross-validation test; Stone, 1974) extended to more complex evolutionary models rejected the LG in favor of GTR (scores of 383 and 61 in favor of GTR for the ribosomal and non-ribosomal matrices, respectively), which, in turn, was outperformed by both the Bayesian CAT (with a score
Why do different genes tell different stories?
The multiple conflicting metazoan phylogenies presented here and in previous publications (Dunn et al., 2008, Erwin et al., 2011, Philippe et al., 2009, Pick et al., 2010, Schierwater et al., 2009, Sperling et al., 2009, Srivastava et al., 2010) have one feature in common: they have long terminal and short internal branches. Frequently, such a topology is a sign of ancient rapid radiations, which are closely spaced diversification events that occurred deep in time (Rokas et al., 2003a; Rokas et
Conclusions
This study shows an extreme sensitivity of the higher-level metazoan phylogeny to the gene composition of the phylogenomic matrices. The gene sampling strategy determines the level of saturation and LBA biases in the resulting phylogenies. According to our results, a careful a priori (i.e., post-sequencing and before analyses) selection of genes that evolve slowly across all metazoan lineages helps to decrease systematic errors and recover the phylogenetic signal from the noise. Using this
Author contributions
G.W. conceived the research and obtained the funding; T.N. and G.W. designed the research; T.N. and F.S. analyzed the data; M.A., Mn.A., M.E., J.H., B.S., W.M., M.W. and G.W. provided data; M.M., M.N., and J.V. provided samples; M.M. contributed to manuscript revision; and T.N. and G.W. wrote the paper.
Acknowledgments
We thank S. Leys, B. Bergum, Ch. Arnold, M. Krüß, and E. Gaidos for providing samples; M. Kube and his team (MPI for Molecular Genetics, Berlin, Germany) for library construction; I. Ebersberger and his team (Center for Integrative Bioinformatics, Vienna, Austria) for data processing; and K. Nosenko for the artwork. This work was financially supported by the German Research Foundation (DFG Priority Program SPP1174 “Deep Metazoan Phylogeny,” Projects Wo896/6 and WI 2216/2-2). M.A. and Mn.A.
References (84)
- et al. Evaluating phylogenetic informativeness and data-type usage for new protein-coding genes across vertebrata. Molecular Phylogenetics and Evolution (2011)
- Current advances in the phylogenetic reconstruction of metazoan evolution. A new paradigm for the Cambrian Explosion? Molecular Phylogenetics and Evolution (2002)
- et al. Phylogenomics: the beginning of incongruence? Trends in Genetics (2006)
- et al. Evolutionary systems biology: links between gene evolution and function. Current Opinion in Biotechnology (2006)
- et al. FASconCAT: convenient handling of data matrices. Molecular Phylogenetics and Evolution (2010)
- et al. Nearly complete rRNA genes from 371 Animalia: updated structure-based alignment and detailed phylogenetic analysis. Molecular Phylogenetics and Evolution (2012)
- et al. Phylogenomics revives traditional views on deep animal relationships. Current Biology (2009)
- et al. A molecular phylogenetic framework for the phylum Ctenophora using 18S rRNA genes. Molecular Phylogenetics and Evolution (2001)
- et al. Toward resolving the eukaryotic tree: the phylogenetic positions of jakobids and cercozoans. Current Biology (2007)
- et al. Rare genomic changes as a tool for phylogenetics. Trends in Ecology & Evolution (2000)
- Testing the phylogenetic stability of early tetrapods. Journal of Theoretical Biology
- Deep phylogeny and evolution of sponges (Phylum Porifera)
- ProtTest: selection of best-fit models of protein evolution. Bioinformatics
- A review of long-branch attraction. Cladistics
- Calculating the evolutionary rates of different genes: a fast, accurate estimator with applications to maximum likelihood phylogenetic analysis. Systematic Biology
- On the phylogenetic position of Myzostomida: can 77 genes get it wrong? BMC Evolutionary Biology
- Archaea sister group of bacteria? Indications from tree reconstruction artifacts in ancient phylogenies. Molecular Biology and Evolution
- TrimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics
- Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science
- The functional genomic distribution of protein divergence in two animal phyla: coevolution, genomic conflict, and constraint. Genome Research
- Phylogenetic context and basal metazoan model systems. Integrative and Comparative Biology
- From phylogenetics to phylogenomics: the evolutionary relationships of insect endosymbiotic gamma-proteobacteria as a test case. Systematic Biology
- The suitability of molecular and morphological evidence in reconstructing plant phylogeny
- Broad phylogenomic sampling improves resolution of the animal tree of life. Nature
- A consistent phylogenetic backbone for the fungi. Molecular Biology and Evolution
- MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research
- The Cambrian conundrum: early divergence and later ecological success in the early history of animals. Science
- A likelihood approach to character weighting and what it tells us about parsimony and compatibility. Biological Journal of the Linnean Society
- Parsimony in systematics: biological and statistical issues. Annual Review of Ecology and Systematics
- PATRISTIC: a program for calculating patristic distances and graphically comparing the components of genetic change. BMC Evolutionary Biology
- Hidden likelihood support in genomic data: can forty-five wrongs make a right? Systematic Biology
- Evaluating the phylogenetic utility of genes – a search for genes informative about deep divergences among vertebrates. Systematic Biology
- Generelle Morphologie der Organismen
- Taxonomic sampling, phylogenetic accuracy, and investigator bias. Systematic Biology
- Outgroup misplacement and phylogenetic inaccuracy under a molecular clock – a simulation study. Systematic Biology
- The rates of evolution in some ribosomal components. Journal of Molecular Evolution
- Dense taxonomic EST sampling and its applications for molecular systematics of the Coleoptera (beetles). Molecular Biology and Evolution
- The Invertebrates: Protozoa through Ctenophora
- The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature
- Annotation pattern of ESTs from Spodoptera frugiperda Sf9 cells and analysis of the ribosomal protein genes reveal insect-specific features and unexpectedly low codon usage bias. Bioinformatics
- A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Molecular Biology and Evolution
1 Current address: Swire Institute of Marine Science, School of Biological Sciences, The University of Hong Kong, Hong Kong