Trends in Ecology & Evolution
Review
Gene tree discordance, phylogenetic inference and the multispecies coalescent
The problem of gene tree discordance
Until recently, the state of the art for molecular phylogenetic studies typically involved (i) sequencing a gene in individual representatives of a collection of species; (ii) inferring a ‘gene tree’ (see Glossary) for the sequences; and (iii) declaring the gene tree to be the estimate of the tree of species relationships. With the increasing abundance of molecular data and the recognition that evolutionary trees from different genes often have conflicting branching patterns [1, 2, 3, 4, 5, 6, 7]
The multispecies coalescent
Coalescent theory [1, 2, 17], which models genealogies within populations, can be used to investigate probabilities that gene trees have branching patterns (topologies) that differ from a species tree topology. The basic model, which we call the ‘multispecies coalescent,’ generalizes the Wright-Fisher model of genetic drift [18, 19, 20], applying it to multiple populations connected by an evolutionary tree.
The coalescent for a single population traces the ancestries of a subset of individual copies
Conceptual basis for discordance
Given enough time measured in coalescent time units (Box 2), lineages within a population coalesce with high probability. After ∼5Ne generations along species tree branches, where Ne is the effective number of chromosomes, lineages are likely to have coalesced within each population, and monophyly of lineages (and, therefore, congruence between gene trees and the species tree) is probable [3, 25, 29, 41, 42]. With shorter branches, multiple gene lineages tend to persist into deeper portions of
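The ∼5Ne rule of thumb can be checked with a short calculation. The following is a minimal sketch, assuming time is rescaled so that one coalescent unit equals Ne generations (with Ne counted in chromosomes, as in the text) and that each pair of lineages coalesces at rate 1 per unit. Box 2 is not reproduced here, so this scaling is an assumption, and the function names are illustrative rather than taken from the article.

```python
import math


def prob_pair_coalesced(t_coal):
    """Probability that two lineages have coalesced within t_coal coalescent units.

    Assumes a single panmictic population of constant size, so the waiting
    time for a pair to coalesce is exponential with rate 1.
    """
    if t_coal < 0:
        raise ValueError("time must be non-negative")
    return 1.0 - math.exp(-t_coal)


def expected_tmrca(n_lineages):
    """Expected time (coalescent units) for n lineages to reach one ancestor.

    While k lineages remain, the coalescence rate is C(k, 2), so the expected
    waiting time in that state is 1 / C(k, 2).
    """
    if n_lineages < 2:
        raise ValueError("need at least two lineages")
    return sum(1.0 / math.comb(k, 2) for k in range(2, n_lineages + 1))


if __name__ == "__main__":
    # A branch of 5 coalescent units (about 5Ne generations under the assumed scaling).
    print(prob_pair_coalesced(5.0))
    # The expected time to a common ancestor stays below 2 units for any sample size.
    print(expected_tmrca(10))
```

Under these assumptions, a pair of lineages has coalesced within 5 units with probability 1 - e^(-5), about 0.993, and the expected time for any number of lineages to find a common ancestor is below 2 units, which is why branches of this length make monophyly probable.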
Gene tree probabilities
Probability calculations for properties of gene trees given a species tree are important for understanding the magnitude of genealogical discordance, for predicting the behavior of phylogenetic algorithms and for assessing the fit of the multispecies coalescent. Such computations rely on the concept of coalescent histories, which for a given gene tree and species tree topology represent the sequences of species tree branches on which gene tree coalescences can occur (online Supplementary Box S1
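The simplest instance of such a calculation is the three-species case, sketched below. It assumes a species tree ((A,B),C) with one sampled lineage per species and an internal branch of length T in coalescent units. Under the multispecies coalescent, the lineages from A and B fail to coalesce on that branch with probability e^(-T), after which the three possible topologies are equally likely. The function name and labels are illustrative and not taken from the article.

```python
import math


def three_taxon_gene_tree_probs(t_internal):
    """Gene tree topology probabilities for species tree ((A,B),C).

    t_internal is the internal branch length in coalescent units.
    Returns a dict mapping each rooted topology to its probability.
    """
    if t_internal < 0:
        raise ValueError("branch length must be non-negative")
    p_no_coal = math.exp(-t_internal)  # A and B lineages both reach the root
    return {
        "((A,B),C)": 1.0 - (2.0 / 3.0) * p_no_coal,  # matches the species tree
        "((A,C),B)": p_no_coal / 3.0,                # discordant
        "((B,C),A)": p_no_coal / 3.0,                # discordant
    }


if __name__ == "__main__":
    for t in (0.1, 1.0, 5.0):
        print(t, three_taxon_gene_tree_probs(t))
```

With three species the matching topology is always at least as probable as either discordant one, and the two discordant topologies are equally probable, so this small case has no anomalous gene trees.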
Species tree inference
Discordant gene trees contain information about features of the species tree, such as its topology, divergence times and population sizes. Conflicting gene trees therefore provide a basis for inferring species trees using procedures that do not simply equate the estimated species tree with a single estimated gene tree. A desirable property for methods that estimate species trees is statistical consistency: an estimator should converge on the true species tree as more individuals, longer DNA
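To make the idea of consistency concrete, the sketch below simulates a deliberately simple estimator that the article does not necessarily endorse: taking the most frequent gene tree topology across independent loci as the species tree estimate. It assumes a three-species tree ((A,B),C), one lineage per species, gene trees known without error, and an internal branch length given in coalescent units. All names are hypothetical.

```python
import math
import random
from collections import Counter

TOPOLOGIES = ["((A,B),C)", "((A,C),B)", "((B,C),A)"]


def sample_topology(t_internal, rng):
    """Draw one gene tree topology for species tree ((A,B),C)."""
    # A and B coalesce on the internal branch with probability 1 - exp(-T).
    if rng.random() < 1.0 - math.exp(-t_internal):
        return "((A,B),C)"
    # Otherwise all three lineages reach the root population, where the
    # first pair to coalesce is equally likely to be any of the three.
    return rng.choice(TOPOLOGIES)


def most_common_topology(t_internal, n_loci, seed=0):
    """Estimate the species tree as the most frequent simulated gene tree."""
    rng = random.Random(seed)
    counts = Counter(sample_topology(t_internal, rng) for _ in range(n_loci))
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # With more loci, the vote should settle on the species tree topology.
    for n_loci in (10, 100, 1000, 10000):
        print(n_loci, most_common_topology(0.2, n_loci, seed=1))
```

For three species this vote converges on the true topology as loci are added, because the matching gene tree is the most probable one. For larger trees, the anomalous gene trees defined in the Glossary are exactly the cases in which the same vote can be misled.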
Conclusions
Conflicts between gene trees estimated at different loci have sometimes been seen as obstacles for inferring phylogenies. However, we suggest that gene tree conflict provides an opportunity to obtain information regarding the processes that have shaped organismal genomes. Researchers have used conflicting gene genealogies to infer ancestral population parameters such as population size and divergence times [30, 72], and to examine species divergence processes [11, 36]. It is only recently, however,
Acknowledgements
We thank M. DeGiorgio, S. Edwards, M. Slatkin and two anonymous reviewers for comments. This work was supported by grants from the National Science Foundation (DEB-0716904), the Burroughs Wellcome Foundation and the Alfred P. Sloan Foundation.
Glossary
- Ancestral polymorphism: the existence of more than one allele at a locus in an ancestral population; through incomplete lineage sorting, polymorphisms can persist through species divergences, resulting in misleading similarities of DNA sequences that do not necessarily reflect population relationships.
- Anomalous gene tree (AGT): a gene tree topology that is more probable than the gene tree topology that matches the species tree topology.
- Anomaly zone: for a given species tree topology, the set of
References (80)
- et al. Phylogenetic relationships of mitochondrial DNA under various demographic models of speciation
- Gene trees and species trees are not the same. Trends Ecol. Evol. (2001)
- DNA archives and our nearest relative: the trichotomy problem revisited. Mol. Phylogenet. Evol. (2000)
- et al. Genomic divergences between human and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. Am. J. Hum. Genet. (2001)
- The probability of topological concordance of gene trees and species trees. Theor. Popul. Biol. (2002)
- et al. Detecting hybrid speciation in the presence of incomplete lineage sorting using gene tree incongruence: a model. Theor. Popul. Biol. (2009)
- et al. Deciphering ancient rapid radiations. Trends Ecol. Evol. (2007)
- et al. The molecular phylogenetics of tuco-tucos (genus Ctenomys, Rodentia: Octodontidae) suggests an early burst of speciation. Mol. Phylogenet. Evol. (1998)
- The evolution of supertrees. Trends Ecol. Evol. (2004)
- Maximum likelihood estimation of population divergence times and population phylogenies under the infinite sites model. Theor. Popul. Biol. (1998)