Critical case stochastic phylogenetic tree model via the Laplace transform

Birth-and-death models are now a common mathematical tool to describe branching patterns observed in real-world phylogenetic trees. Liggett and Schinazi (2009) is one such example. The authors propose a simple birth-and-death model that is compatible with phylogenetic trees of both influenza and HIV, depending on the birth rate parameter. An interesting special case of this model is the critical case where the birth rate equals the death rate. This is a non-trivial situation and to study its asymptotic behaviour we employed the Laplace transform. With this we correct the proof of Liggett and Schinazi (2009) in the critical case.


Introduction
Different viral types have phylogenetic trees exhibiting different branching properties, with influenza and HIV being two extreme examples. In the influenza tree a single type dominates for a long time with other types dying out quickly until suddenly a new type completely takes over and the old type dies out. The HIV phylogeny is the complete opposite, with a large number of co-existing types.
In [6] a stochastic model is described and depending on the choice of parameters it can exhibit both types of dynamics. We briefly describe the model after [6]. We only keep track of the number of different viral types at each time point t. Let N (t) denote the number of distinct viral types at time t. In the nomenclature of phylogenetics N (t) counts the number of different species alive at time t. At each time point the birth rate is λN (t) and the death rate is N (t). If there is only one type alive then it cannot die. Clearly N (t) is a Markov chain with discrete state space and continuous time. Each virus type is described by a fitness value that is randomly chosen at its birth. If a death event occurs the type with smallest fitness dies. This means that only the fitness ranks matter and so the exact distribution of a virus' fitness will not play a role.
The main result of [6] is the asymptotic behaviour of the dominating type, whether it is expected to remain the same for long stretches of time or change often. This is summarized in Theorem 1 [6], Theorem 1.1 (Theorem 1, [6]). Take α ∈ (0, 1). If λ ≤ 1 then lim t→∞ P(maximal types at times αt and t are the same) = α, while if λ > 1 then this limit is 0.
The proof of this theorem is based on considering successive visits to the state 1, in particular denote τ 1 , τ 2 , . . . to be the (random) times between visits of the chain to N (t) = 1 and T n := τ 1 + . . . + τ n . In [6] the latter random variable is represented as where X i are independent mean 1 exponential random variables and H i are the (independent) hitting times of the state 1 conditional on starting in state 2. Descriptively H i is the ith return from state 2 to the state of one virus type alive, since from 1 the chain has to jump to two types. The Markovian nature of the process ensures that the H i s are independent and identically distributed for distinct is. In the proof of Theorem 1 it is stated that the cumulative distribution function of H i , F (t), satisfies, solved uniquely for, This gives the asymptotic behaviour (Lemma 3 [6]), from which the result of Theorem 1 is derived when λ = 1. However Eq. (1.1) does not take into account that this model differs from a classical birth-death model where 0 is the absorbing state. We illustrate this in Fig. 1. We can easily re-numerate the state values, but the intensity values will differ between the two models. Correcting for this difference in intensity values one will still get the same asymptotics as in Eq. (1.3) and hence the same result as in Theorem 1 but with a more complicated proof. Below we present a correct proof of Lemma 3 [6] in the case of λ = 1, based on Lemmas 2.1 and 2.2. From this the statement of Theorem 1 of [6] follows.

Auxiliary lemmas
Then P 1 (t) solves the renewal equation Proof. In panel A of Fig. 1 we can see a representation of the studied Markov chain on the state space S = 1, 2, 3, . . .. Due to the H i s being independent for different is we can study the distribution of H 1 and treat 1 as an absorbing state. Let P n (t) denote the probability of being in state n at time t, when one starts in state 2 at time 0. The system of differential equations describing the probabilities is, t) = −2nP n (t) + (n + 1)P n+1 (t) + (n − 1)P n−1 (t), n ≥ 3 Taking first derivatives we get the partial differential equation, with initial conditions Following §2.1 [2] and using the substitution Evaluating the derivative of both sides with respect to s at 0 we find that the function P 1 (t) must satisfy the following integral equation (we can recognize it as a renewal equation), (1 + (t − y)) 3 P 1 (y)dy.
Lemma 2.2. F (t) ≡ P 1 (t), the solution of the renewal equation (2.1), has the following properties, Proof. The proof is based on Tauberian theory and we refer the reader to [3,4] for details on this. Another approach would be to study the asymptotic behaviour of F ′ (t) by renewal theory results (see e.g. [1,5]). The Laplace transform of a density We will use the following theorem from [3], Theorem 2 §XIII.5, Theorem 2.3 (Theorem 2 §XIII.5, [3]). If L is slowly varying at infinity and 0 ≤ ρ < ∞, then each of the relations implies the other.
F (t) ≡ P 1 (t) defined as the solution to the renewal equation (2.1) after differentiating will satisfy We calculate the Laplace transforms of F ′ (t), tF ′ (t) and t 2 F ′ (t), (2.14) We are interested in the behaviour of the transforms as s → 0, and for this we will use the well known property (verifiable by the de L'Hôspital rule) of the exponential integral, to arrive at, (2.17) As both the constant function (s 0 ) and log(1/s) are slowly varying functions for s → 0 the Tauberian theorem allows us to conclude that, 3. Proof of Lemma 3 [6] We will now use Lemmas 2.1 and 2.2 to prove Lemma 3 from [6]. Define as there, m n := ρn 0 tF ′ (t)dt and s n := ρn 0 t 2 F ′ (t)dt, with ρ n := n log(n). By Lemma 2.2 we know that, m n ∼ log(ρ n ) ∼ log(n) and s n ∼ ρ n .
We now need to check how n(1 − F (ρ n )) behaves asymptotically. We do not know what F (ρ n ) is but using the Tauberian theorem and Eq. (2.17) from the proof of Lemma 2.2 we get that, Therefore using integration by parts and Eq. (2.19), and so we arrive at The rest of the proof is a direct repeat of the one in [6] and so we get (as in [6]) that T n /(n log(n)) tends to 1 in probability implying Theorem 1 [6] for λ = 1.
If we applied the same chain of reasoning to the model of panel B in Fig. 1 starting off with the system of differential equations, the analogue of Eq. (2.4) would be a homogeneous partial differential equation,