Topology and inference for Yule trees with multiple states

Journal of Mathematical Biology

Abstract

We introduce two models for random trees with multiple states motivated by studies of trait dependence in the evolution of species. Our discrete time model, the multiple state ERM tree, is a generalization of Markov propagation models on a random tree generated by a binary search or ‘equal rates Markov’ mechanism. Our continuous time model, the multiple state Yule tree, is a generalization of the tree generated by a pure birth or Yule process to the tree generated by multi-type branching processes. We study state dependent topological properties of these two random tree models. We derive asymptotic results that allow one to infer model parameters from data on states at the leaves and at branch-points that are one step away from the leaves.

Acknowledgments

We thank the referees for constructive comments and suggestions which improved the paper’s exposition. This research was supported by NSERC (Natural Sciences and Engineering Research Council of Canada) Discovery Grant # 346197-2010.

Author information

Corresponding author

Correspondence to Lea Popovic.

Appendix

Proof of Lemma 12

In the event that \(\varvec{Z}(T)=0\) there is nothing to prove, so we consider \(\varvec{W}\) on the event \(\varvec{Z}(T)\ne 0\Leftrightarrow \varvec{W}(0)\ne 0\) (and \(\varvec{W}(T)\ne 0\) as well).

For any \(n\ge 1\) and any times \(0\le t_0\le t_1\le \cdots \le t_n\le T\), we denote the joint distribution of \(\varvec{W}\) at times \(t_1,\ldots ,t_n\), conditional on \(\varvec{Z}(t_0)=\varvec{z}_0\), by

$$\begin{aligned} P_{t_0;t_1,\ldots ,t_n}(\varvec{z}_0;\varvec{w}_1,\ldots ,\varvec{w}_n)={\mathbb {P}}\left[ \varvec{W}({t_j})=\varvec{w}_j,\,1\le j\le n\,\big |\,\varvec{Z}(t_0)=\varvec{z}_0\right] . \end{aligned}$$
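To make the object \(P_{t_0;t_1,\ldots ,t_n}\) concrete, the following is a minimal Monte Carlo sketch for the simplest special case: a single state (\(k=1\)), one founder at \(t_0=0\), and constant birth and death rates. The rates `lam` and `mu`, the function names, and the restriction to one state are all illustrative assumptions and are not taken from the paper. The closed-form extinction probability used for comparison is the classical one for the linear birth-death process with `lam != mu`.

```python
import math
import random

def reconstructed_counts(t, T, lam, mu, checkpoints, rng):
    """Simulate one subtree started by a single individual at time t.

    Returns (survives, counts), where counts[k] is the number of individuals
    of this subtree alive at checkpoints[k] that have a descendant alive at T.
    """
    t_event = t + rng.expovariate(lam + mu)
    if t_event >= T:
        # The founder itself is still alive at time T.
        return True, [1 if s >= t else 0 for s in checkpoints]
    if rng.random() < mu / (lam + mu):
        # The founder dies before T without splitting.
        return False, [0] * len(checkpoints)
    # The founder splits into two independent subtrees at t_event.
    ok_a, a = reconstructed_counts(t_event, T, lam, mu, checkpoints, rng)
    ok_b, b = reconstructed_counts(t_event, T, lam, mu, checkpoints, rng)
    survives = ok_a or ok_b
    counts = []
    for k, s in enumerate(checkpoints):
        if s < t:
            counts.append(0)            # subtree does not exist yet
        elif s < t_event:
            counts.append(1 if survives else 0)  # only the founder is alive
        else:
            counts.append(a[k] + b[k])  # contributions of the two subtrees
    return survives, counts

def extinction_probability(s, lam, mu):
    """P[line of one individual is extinct within time s], assuming lam != mu."""
    e = math.exp((lam - mu) * s)
    return mu * (e - 1.0) / (lam * e - mu)

def estimate_joint(n_runs, T, lam, mu, checkpoints, seed=0):
    """Empirical joint law of (W(t_1), ..., W(t_n)) given one founder at time 0."""
    rng = random.Random(seed)
    tally = {}
    for _ in range(n_runs):
        _, counts = reconstructed_counts(0.0, T, lam, mu, checkpoints, rng)
        key = tuple(counts)
        tally[key] = tally.get(key, 0) + 1
    return {key: hits / n_runs for key, hits in tally.items()}

joint = estimate_joint(20000, 2.0, 1.0, 0.5, [0.0, 1.0, 2.0], seed=1)
survival = sum(p for key, p in joint.items() if key[0] == 1)
print(survival, 1.0 - extinction_probability(2.0, 1.0, 0.5))
```

In this sketch the event \(\varvec{W}(0)\ne 0\) coincides with survival to time \(T\), so the two printed numbers should agree up to Monte Carlo error. Every simulated path of counts is nondecreasing in time, which reflects that individuals with a surviving lineage are never lost.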

We first show, by induction, that \(\forall n\ge 1\)

$$\begin{aligned}&P_{t_0;t_1,\ldots ,t_n}(\varvec{z}_0;\varvec{w}_1,\ldots ,\varvec{w}_n)\nonumber \\&\quad = P_{t_0;t_1,\ldots ,t_{n-1}}(\varvec{z}_0;\varvec{w}_1,\ldots ,\varvec{w}_{n-1})\frac{ P_{t_0;t_{n-1},t_n}(\varvec{z}_0;\varvec{w}_{n-1},\varvec{w}_n) }{ P_{t_0;t_{n-1}}(\varvec{z}_0;\varvec{w}_{n-1}) }. \end{aligned}$$
(10)

This is evident for \(n=2\). Assume the equation is true \(\forall i\le n-1\) with \(n>2\). Notice that

$$\begin{aligned}&P_{t_0;t_1,\ldots ,t_n}(\varvec{z}_0;\varvec{w}_1,\ldots ,\varvec{w}_n)\nonumber \\&\quad =\sum _{\varvec{z}_1\ge \varvec{w}_1}{\mathbb {P}}[\varvec{Z}({t_1})=\varvec{z}_1|\varvec{Z}({t_0})=\varvec{z}_0]P_{t_1;t_1,\ldots ,t_n}(\varvec{z}_1;\varvec{w}_1,\ldots ,\varvec{w}_n). \end{aligned}$$
(11)

The branching property of the birth-death process \(\varvec{Z}\) guarantees independence of its subtrees originating from non-overlapping subsets of individuals present at any time \(t_1\). Since all individuals surviving at time T must be descendants of the process \(\varvec{W}\), we have

$$\begin{aligned} P_{t_1;t_1,\ldots ,t_n}(\varvec{z}_1;\varvec{w}_1,\ldots ,\varvec{w}_n)= & {} {\mathbb {P}}\left[ \varvec{W}({t_j})=\varvec{w}_j,\,1\le j\le n\,\big |\,\varvec{Z}({t_1})=\varvec{z}_1\right] \nonumber \\= & {} C_{\varvec{z}_1,\varvec{w}_1}{\mathbb {P}}\left[ \varvec{W}({t_j})=\varvec{w}_j,\,1\le j\le n\,\big |\,\varvec{Z}({t_1})=\varvec{w}_1\right] p_{\varvec{z}_1-\varvec{w}_1}^{\varvec{0}}(t_1,T)\nonumber \\= & {} C_{\varvec{z}_1,\varvec{w}_1}P_{t_1;t_1,\ldots ,t_n}(\varvec{w}_1;\varvec{w}_1,\ldots ,\varvec{w}_n)p_{\varvec{z}_1-\varvec{w}_1}^{\varvec{0}}(t_1,T)\qquad \end{aligned}$$
(12)

where \(C_{\varvec{z}_1,\varvec{w}_1}\) denotes the combinatorial number of distinct ways of choosing \(\varvec{w}_1\) out of \(\varvec{z}_1\) individuals, and \(p_{\varvec{z}}^{\varvec{0}}(t,T)={\mathbb {P}}[\varvec{Z}(T)=0|\varvec{Z}(t)=\varvec{z}]\) is the extinction probability by time T of the process \(\varvec{Z}\) started at time t with \(\varvec{Z}(t)=\varvec{z}\).
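As an illustration of the two ingredients just defined, the sketch below evaluates the case \(n=1\) of Eq. (12). It assumes that \(C_{\varvec{z},\varvec{w}}\) is the product over states of binomial coefficients, and that, by the branching property, the extinction probability factorizes over individuals. The per-state extinction probabilities `q` are hypothetical inputs, not values from the paper, and the function names are illustrative.

```python
from itertools import product
from math import comb, prod

def choose_count(z, w):
    """C_{z,w}: ways to pick w[i] individuals of state i out of z[i], for every i."""
    return prod(comb(zi, wi) for zi, wi in zip(z, w))

def survivor_count_probability(z, w, q):
    """P[W(t1) = w | Z(t1) = z] when the line of a state-i individual
    is extinct by time T with probability q[i], independently of the others."""
    if any(wi > zi for zi, wi in zip(z, w)):
        return 0.0
    # All w chosen individuals have a surviving lineage ...
    all_survive = prod((1.0 - qi) ** wi for qi, wi in zip(q, w))
    # ... and the lines of the remaining z - w individuals die out by T.
    rest_extinct = prod(qi ** (zi - wi) for qi, zi, wi in zip(q, z, w))
    return choose_count(z, w) * all_survive * rest_extinct

z, q = (3, 2), (0.4, 0.7)   # hypothetical counts and extinction probabilities
total = sum(survivor_count_probability(z, w, q)
            for w in product(range(z[0] + 1), range(z[1] + 1)))
print(choose_count(z, (1, 2)), total)
```

Under these assumptions \(\varvec{W}(t_1)\) given \(\varvec{Z}(t_1)=\varvec{z}\) is an independent binomial thinning of each coordinate of \(\varvec{z}\), so the probabilities over all \(\varvec{w}\le \varvec{z}\) sum to one up to rounding.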

Given \(\varvec{Z}({t_1})=\varvec{w}_1\), the process \((\varvec{Z}(t))_{t\ge t_1}\) is the sum of birth-death processes defined by subtrees \(\{{\mathcal {T}}^{(i)}\}, i=1,\ldots ,|\varvec{w}_1|\), each originated by one of the \(|\varvec{w}_1|\) individuals at time \(t_1\). We may assume that each \({\mathcal {T}}^{(i)}\) is started by an individual of state \(\tau ^{(i)}\), where \(\tau ^{(1)},\ldots ,\tau ^{(|\varvec{w}_1|)}\) is some ordering of the \(|\varvec{w}_1|\) surviving originator states. The probability for the surviving lineages is

$$\begin{aligned}&P_{t_1;t_1,\ldots ,t_n}(\varvec{w}_1;\varvec{w}_1,\ldots ,\varvec{w}_n)\\&\quad ={\mathbb {P}}\left[ \varvec{W}({t_j})=\varvec{w}_j,\,1\le j\le n\,\big |\,\varvec{Z}({t_1})=\varvec{w}_1\right] \\&\quad ={\mathbb {P}}\left[ \varvec{W}({t_j})({\mathcal {T}}^{(i)})\ne 0\, \forall i,\;\, \sum _{i=1}^{|\varvec{w}_1|} \varvec{W}({t_j})({\mathcal {T}}^{(i)})=\varvec{w}_j, \,\forall 2\le j\le n\right] \end{aligned}$$

where \(\varvec{W}(t)({\mathcal {T}}^{(i)})\) denotes the number of individuals of \({\mathcal {T}}^{(i)}\) at time t that have a surviving lineage at time T. Since the subtrees \({\mathcal {T}}^{(i)}\) are independent,

$$\begin{aligned}&P_{t_1;t_1,\ldots ,t_n}\left( \varvec{w}_1;\varvec{w}_1,\ldots ,\varvec{w}_n\right) \nonumber \\&\quad = \displaystyle \sum _{\begin{array}{c} \forall 2\le j\le n,\; \left( \varvec{w}_j^{\left( i\right) }\right) _{1\le i\le |\varvec{w}_1|}:\\ \varvec{w}_j^{\left( i\right) }>0,\; \sum _{i=1}^{|\varvec{w}_1|}\varvec{w}_j^{\left( i\right) } = \varvec{w}_j \end{array}} \prod _{i=1}^{|\varvec{w}_1|}P_{t_1;t_2,\ldots ,t_n}\left( \varvec{e}_{\tau ^{\left( i\right) }};\varvec{w}^{\left( i\right) }_2,\ldots ,\varvec{w}^{\left( i\right) }_n\right) , \end{aligned}$$
(13)

where \(\varvec{e}_i\) denotes the unit k-dimensional vector whose i-th coordinate is 1 and all other coordinates are 0, and the summation is over all possible decompositions of \(\varvec{w}_j\) into vectors \((\varvec{w}_j^{(i)})_{i=1,\ldots ,|\varvec{w}_1|}\) with all nonzero coordinate values, for each \(j=2,\dots , n\). By the inductive hypothesis (10) for \(n-1\), the probabilities in the product on the right side are equal to

$$\begin{aligned}&P_{t_1;t_2,\ldots ,t_n}\left( \varvec{e}_{\tau ^{\left( i\right) }};\varvec{w}^{\left( i\right) }_2,\ldots ,\varvec{w}^{\left( i\right) }_n\right) \\&\quad =\displaystyle P_{t_1;t_2,\ldots ,t_{n-1}}\left( \varvec{e}_{\tau ^{\left( i\right) }};\varvec{w}^{\left( i\right) }_2,\ldots ,\varvec{w}^{\left( i\right) }_{n-1}\right) \frac{ P_{t_1;t_{n-1},t_n}\left( \varvec{e}_{\tau ^{\left( i\right) }};\varvec{w}^{\left( i\right) }_{n-1},\varvec{w}^{\left( i\right) }_n\right) }{ P_{t_1;t_{n-1}}\left( \varvec{e}_{\tau ^{\left( i\right) }};\varvec{w}^{\left( i\right) }_{n-1}\right) }\\&\quad = P_{t_1;t_2,\ldots ,t_{n-1}}\left( \varvec{e}_{\tau ^{\left( i\right) }};\varvec{w}^{\left( i\right) }_2,\ldots ,\varvec{w}^{\left( i\right) }_{n-1}\right) {\mathbb {P}}\left[ \varvec{W}\left( {t_n}\right) =\varvec{w}_n^{\left( i\right) }|\varvec{W}\left( {t_{n-1}}\right) =\varvec{Z}\left( {t_{n-1}}\right) =\varvec{w}_{n-1}^{\left( i\right) }\right] \end{aligned}$$

where the last equality follows from (11) and (12) since

$$\begin{aligned}&P_{t_1;t_{n-1},t_n}\left( \varvec{e}_{\tau ^{\left( i\right) }};\varvec{w}^{\left( i\right) }_{n-1},\varvec{w}^{\left( i\right) }_n\right) \\&\quad =\,{\mathbb {P}}\left[ \varvec{W}\left( {t_{n-1}}\right) =\varvec{w}_{n-1}^{\left( i\right) }, \varvec{W}\left( {t_n}\right) =\varvec{w}_n^{\left( i\right) }|\varvec{Z}\left( {t_1}\right) =\varvec{e}_{\tau ^{\left( i\right) }}\right] \\&\quad =\sum _{\varvec{z}_{n-1}\ge \varvec{w}_{n-1}^{\left( i\right) }} {\mathbb {P}}\left[ \varvec{Z}\left( {t_{n-1}}\right) =\varvec{z}_{n-1}|\varvec{Z}\left( {t_1}\right) =\varvec{e}_{\tau ^{\left( i\right) }}\right] C_{\varvec{z}_{n-1},\varvec{w}_{n-1}^{\left( i\right) }} p^{\varvec{0}}_{\varvec{z}_{n-1}-\varvec{w}_{n-1}^{\left( i\right) }}\left( t_{n-1},T\right) \\&\qquad \times {\mathbb {P}}\left[ \varvec{W}\left( {t_n}\right) =\varvec{w}_n^{\left( i\right) }|\varvec{W}\left( {t_{n-1}}\right) =\varvec{Z}\left( {t_{n-1}}\right) =\varvec{w}_{n-1}^{\left( i\right) }\right] \\&\quad =\,{\mathbb {P}}\left[ \varvec{W}\left( {t_n}\right) =\varvec{w}_n^{\left( i\right) }|\varvec{W}\left( {t_{n-1}}\right) =\varvec{Z}\left( {t_{n-1}}\right) =\varvec{w}_{n-1}^{\left( i\right) }\right] P_{t_1;t_{n-1}}\left( \varvec{e}_{\tau ^{\left( i\right) }};\varvec{w}^{\left( i\right) }_{n-1}\right) . \end{aligned}$$

As the factor \(P_{t_1;t_2,\ldots ,t_{n-1}}(\varvec{e}_{\tau ^{(i)}};\varvec{w}^{(i)}_2,\ldots ,\varvec{w}^{(i)}_{n-1})\) does not depend on \((\varvec{w}_n^{(i)})_{i=1,\ldots ,|\varvec{w}_1|}\), the sum in (13) may be split into outer sums, over \(2\le j\le n-1\), and an inner sum, over \(j=n\), that is equal to

$$\begin{aligned} \sum _{\begin{array}{c} \left( \varvec{w}_n^{\left( i\right) }\right) _{1\le i\le |\varvec{w}_1|}:\;\varvec{w}_n^{\left( i\right) }>0,\\ \sum _{i=1}^{|\varvec{w}_1|}\varvec{w}_n^{\left( i\right) } = \varvec{w}_n \end{array}} \prod _{i=1}^{|\varvec{w}_1|}{\mathbb {P}}\left[ \varvec{W}\left( {t_n}\right) =\varvec{w}_n^{\left( i\right) }|\varvec{W}\left( {t_{n-1}}\right) =\varvec{Z}\left( {t_{n-1}}\right) =\varvec{w}_{n-1}^{\left( i\right) }\right] . \end{aligned}$$

By the same argument using splitting over independent subtrees, but this time splitting the individuals at time \(t_{n-1}\) into subsets of sizes \((\varvec{w}_{n-1}^{(i)})_{i=1,\ldots ,|\varvec{w}_1|}\), we can show that this sum contributes to the outer sums a factor of

$$\begin{aligned} {\mathbb {P}}\left[ \varvec{W}\left( {t_n}\right) =\varvec{w}_n|\varvec{W}\left( {t_{n-1}}\right) =\varvec{Z}\left( {t_{n-1}}\right) =\varvec{w}_{n-1}\right] = \frac{ P_{t_0;t_{n-1},t_n}\left( \varvec{z}_0;\varvec{w}_{n-1},\varvec{w}_n\right) }{ P_{t_0;t_{n-1}}\left( \varvec{z}_0;\varvec{w}_{n-1}\right) }, \end{aligned}$$

where the last equality follows again from Eqs. (11) and (12), and combining with the outer sums in (13) implies

$$\begin{aligned}&P_{t_1;t_1,\ldots ,t_n}\left( \varvec{w}_1;\varvec{w}_1,\ldots ,\varvec{w}_n\right) \\&\quad = P_{t_1;t_1,\ldots ,t_{n-1}}\left( \varvec{w}_1;\varvec{w}_1,\ldots ,\varvec{w}_{n-1}\right) \frac{ P_{t_0;t_{n-1},t_n}\left( \varvec{z}_0;\varvec{w}_{n-1},\varvec{w}_n\right) }{ P_{t_0;t_{n-1}}\left( \varvec{z}_0;\varvec{w}_{n-1}\right) }, \end{aligned}$$

as required. Using Eqs. (11) and (12) once again, this becomes Eq. (10) for step n. Equation (10) may be written in terms of conditional probabilities as

$$\begin{aligned}&{\mathbb {P}}\left[ \varvec{W}\left( {t_n}\right) =\varvec{w}_n\,\big |\,\varvec{W}\left( {t_j}\right) =\varvec{w}_j,\,1\le j\le n-1,\,\varvec{Z}\left( {t_0}\right) =\varvec{z}_0\right] \\&\quad ={\mathbb {P}}\left[ \varvec{W}\left( {t_n}\right) =\varvec{w}_n\,\big |\,\varvec{W}\left( {t_{n-1}}\right) =\varvec{w}_{n-1},\,\varvec{Z}\left( {t_0}\right) =\varvec{z}_0\right] \end{aligned}$$

which implies the Markov property for \((\varvec{W}(t))_{t\ge 0}\). \(\square \)
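The passage from Eq. (10) to the conditional form can be spelled out as follows. This is a short worked step, assuming \(P_{t_0;t_1,\ldots ,t_{n-1}}(\varvec{z}_0;\varvec{w}_1,\ldots ,\varvec{w}_{n-1})>0\) so that the conditional probabilities are defined (an assumption the text leaves implicit). By the definition of conditional probability,

$$\begin{aligned} \frac{P_{t_0;t_1,\ldots ,t_n}(\varvec{z}_0;\varvec{w}_1,\ldots ,\varvec{w}_n)}{P_{t_0;t_1,\ldots ,t_{n-1}}(\varvec{z}_0;\varvec{w}_1,\ldots ,\varvec{w}_{n-1})}&={\mathbb {P}}\left[ \varvec{W}({t_n})=\varvec{w}_n\,\big |\,\varvec{W}({t_j})=\varvec{w}_j,\,1\le j\le n-1,\,\varvec{Z}({t_0})=\varvec{z}_0\right] ,\\ \frac{P_{t_0;t_{n-1},t_n}(\varvec{z}_0;\varvec{w}_{n-1},\varvec{w}_n)}{P_{t_0;t_{n-1}}(\varvec{z}_0;\varvec{w}_{n-1})}&={\mathbb {P}}\left[ \varvec{W}({t_n})=\varvec{w}_n\,\big |\,\varvec{W}({t_{n-1}})=\varvec{w}_{n-1},\,\varvec{Z}({t_0})=\varvec{z}_0\right] . \end{aligned}$$

Dividing both sides of Eq. (10) by \(P_{t_0;t_1,\ldots ,t_{n-1}}(\varvec{z}_0;\varvec{w}_1,\ldots ,\varvec{w}_{n-1})\) shows that the two left-hand sides coincide, which is exactly the stated equality of conditional probabilities.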

About this article

Cite this article

Popovic, L., Rivas, M. Topology and inference for Yule trees with multiple states. J. Math. Biol. 73, 1251–1291 (2016). https://doi.org/10.1007/s00285-016-0992-6

  • DOI: https://doi.org/10.1007/s00285-016-0992-6
