Coalescent models derived from birth–death processes

A coalescent model of a sample of size n is derived from a birth–death process that originates at a random time in the past from a single founder individual. Over time, the descendants of the founder evolve into a population of large (infinite) size from which a sample of size n is taken. The parameters and time of the birth–death process are scaled in N 0 , the size of the present-day population, while letting N 0 → ∞ , similarly to how the standard Kingman coalescent process arises from the Wright– Fisher model. The model is named the Limit Birth–Death (LBD) coalescent model . Simulations from the LBD coalescent model with sample size n are computationally slow compared to standard coalescent models. Therefore, we suggest different approximations to the LBD coalescent model assuming the population size is a deterministic function of time rather than a stochastic process. Furthermore, we introduce a hybrid LBD coalescent model, that combines the exactness of the LBD coalescent model model with the speed of the approximations.


Introduction
A population that originates from a single individual appears in multiple biological contexts.For example, a mutant allele might have arisen only once in the past, giving rise to a population of mutant alleles.Also, a family of homologous genes evolves from a single gene copy by gene duplication and loss (Ohta, 1989;Hurles, 2004).In addition, a population of cancer cells composing a tumor typically derive from a single precursor cell (Nowell, 1976).Furthermore, virus infections might be caused by a single particle that infects the host (Zwart et al., 2009;Zwart and Elena, 2015).Finally, species that belong to a monophyletic group share a single common ancestor in the past.
A general birth-death process appears to be an adequate framework to model such scenarios (Kendall, 1948;Kimmel and Axelrod, 2015;MacPherson et al., 2021).Since Yule's seminal paper (Yule, 1924), birth-death processes have been widely applied in biology, for example, in the context of epidemic diseases (Andersson and Britton, 2000), gene families sizes (Demuth et al., 2006), protein domain evolution (Karev et al., 2002), macroevolution (Nee, 2006), and ecology and evolution (Crawford and Suchard, 2012).In the context of evolution, typically a (small) sample is taken from a (large) population of unknown size.A main interest is often to describe the ancestry of the sampled individuals in terms of a coalescent process (Wakeley, 2008).
The birth-death process with Bernoulli sampling provides one solution to the problem of describing the sample ancestry in terms of a coalescent process (Gernhard, 2008a,b;Stadler, 2009;Lambert and Stadler, 2013;Ignatieva et al., 2020).This process is initiated at a random time in the past with a single individual that evolves forward in time according to a standard birth-death process.At the present time, a random number of individuals are chosen from the population (of unknown size), with a given probability.If the sampling probability is one, then the entire population is sampled.The coalescent process describing the ancestry of the sampled individuals under this framework is called the conditioned reconstructed process with Bernoulli sampling (CRPB) (Gernhard, 2008b).Explicit mathematical expressions can be obtained for CRPB quantities of interest, and efficient tools for Bayesian inference have been developed (Drummond et al., 2012;Drummond and Bouckaert, 2015).
However, despite its mathematical simplicity, the CRPB has its shortcomings.In many real situations, the number of sampled individuals is controlled by the researcher, but the CRPB assumes the sample size is a random variable (which might result in zero, one, . . .individuals).In the CRPB, this implies that the sample size correlates positively with the population size, and hence also with the age of the founder individual (not to be confused with the most recent common ancestor (MRCA) of the sample).In other words, the age of the founder individual depends on the sample size (Gernhard, 2008b;Wiuf, 2018).If the sample size is large, then the founder is expected to be older than if the sample size is small, simply because the number of (randomly) sampled individuals reflects the size of the entire population.
To overcome this limitation, we propose a new approach.Our starting point is a birth-death process with a single origin at a random time in the past.At the present time, the population is assumed to be large, say, with N 0 individuals.By scaling the parameters and time in N 0 and letting N 0 → ∞, we derive a coalescent process, named the LBD coalescent model (LBD = Limit Birth-Death), of a sample of n individuals.This approach is akin to the standard (i.e., large N 0 ) coalescent processes in population genetics.Conveniently, this process does not have a parameter for the (unknown) current population size, so it implies one parameter less than the original birth-death process.Furthermore, the sample size does not correlate positively with the population size as in the CRPB.
We describe this new LBD coalescent model of a sample of size n, and provide means to simulate it.A similar approach has been applied in population genetics to study a population, and a sample, of rare alleles (Wiuf, 2000(Wiuf, , 2018)), see also Thompson (1975).However, this strategy can be computationally slow, so we also propose different approximations inspired by the coalescent with deterministic (time-dependent) population size (Griffiths and Tavaré, 1994), approximating the stochastic population size over time with a deterministic function.One of these approximations is the coalescent model with exponential growth (Slatkin and Hudson, 1991), that has been investigated previously in connection with birth-death processes (Stadler et al., 2015;Volz and Frost, 2014).We demonstrate theoretically and by simulation that these deterministic population-size approximations behave well near the present time, but become gradually worse as the population size becomes small and stochastic fluctuations cannot be ignored.To leverage the advantages of the LBD coalescent model and the deterministic population-size approximations, we propose a hybrid approach and further an ad hoc approach that maximize the trade-off between accuracy and computational speed.Surprisingly, the ad hoc approach appears to be more efficient than the hybrid approach, both in terms of computational time and accuracy.
Simulation details are in Appendix A. Mathematical derivations are described in the Supplementary Material.

The standard birth-death process
Parameters.The starting point for the LBD coalescent model is a standard birth-death process (Kendall, 1948), see Fig. 1.This process has two parameters, a birth rate ν, and a death rate µ.

Define
where ν > 0, and µ > 0 (Wiuf, 2018).We assume ν ̸ = µ for convenience, and note that ν = µ results in different mathematical expressions, but is otherwise not special.Hence, (γ , δ) ∈ R 2 + .Furthermore, (γ , δ) remains the same if the birth rate and the death rate are interchanged.Note that γ is a dimensionless constant, thus a characteristic of the process, while δ depends on the chosen time scale.If the time scale is changed, γ remains invariant.The parameter γ might be interpreted as a growth rate independent of the chosen unit of time.
A population of finite size N 0 .We assume the birth-death process is initiated at a random time (assuming an improper uniform prior) in the past, and evolves until the present time, when there are N 0 individuals.A reverse (coalescent) process going backwards in time, describing the ancestry of all N 0 individuals is the following, see Gernhard (2008b) and Fig. 2. Let T 1 , T 2 , . . ., T N 0 be the coalescent times, such that T 1 is the age of the population, and T k , k = 2, . . ., N 0 , the time while there at least k ancestors.Clearly, The density of the increment where γ s = 1 + (γ − 1)e −δs .Hence T ℓ = s + t.If k = N 0 , then s = 0 and γ 0 = γ .For ℓ = 1 and k = N 0 , we find the density of T 1 , the age of the process, For details, see Gernhard (2008b) (using a different parameterization) or (Wiuf, 2018) (using the same parameterization as here).
We name the process T 1 , T 2 , . . ., T N 0 with distribution given by (1)-(2) the birth-death coalescent process with population size N 0 .

The LBD coalescent model with sample size n
In many biological scenarios, the population size is much bigger than the sample size, N 0 ≫ n.Here, we consider the situation where N 0 is large and unknown, hence it is sensible to consider the limiting model as N 0 → ∞.This approach is common in population genetics (Wakeley, 2008).Conveniently, N 0 is a parameter that is difficult to estimate reliably from empirical data and taking N 0 → ∞ reduces the number of parameters by one.
The population process described by (1)-(2) has three parameters N 0 , δ, γ .Consider the scenario: where, N 0 δ → ∆ (resp.N 0 γ → Γ ) means we consider models such that the product of N 0 and δ (resp.γ ) converges to ∆ (resp.Γ ), while N 0 → ∞, implying δ → 0 (resp.γ → 0) at the same time.Compared to a standard coalescent model, N 0 plays the role of the effective population size.where T k is the time while there are at least k ancestors to the present day population and T 1 is the age of the population, assuming N 0 individuals at the present time.To the left are shown the coalescent times of the sample, T2 , T3 , where Tk is the time while there are at least k ancestors to the sample.The sample's ancestry is shown in red.There are m k−1 ancestors of the population the first time there are k − 1 ancestors of the sample, such that Tk = T m k−1 +1 , k = 2, . . ., n.For example, for k = 3, m k−1 = m 2 = 4 (there are four lineages in total the first time there are only two red lineages, backwards in time) and We derive the coalescent times of the whole population under the constraint (3), and use Saunders et al. (1984, Lemma 3) to derive the coalescent times of the sample in terms of those of the entire population.We name the coalescent process of the sample the LBD coalescent model (with sample size n), as mentioned earlier.
The limit of (1) under the constraint (3) is and for k = ∞ (a population of infinite size), (5) (Wiuf, 2018).In the LBD coalescent model, time is scaled in units of N 0 (as N 0 → ∞) as is standard in large N 0 coalescent approximations.Note that (5) depends on Γ , while (4) does not.We emphasize that Γ is dimensionless and therefore invariant under rescaling of time, while ∆ is not.We therefore assume ∆ = 1 in simulations.
Consider a sample of size n taken from the population and denote the coalescent times of the sample by Tn , . . . ,T2 , see Fig. 2. That is, Tk , k = 2, . . . ,n, is the time while there are at least ancestors of the sample.Let the (stochastic) number of ancestors of the entire population be m k , k = 1, . . ., n − 1, the first time there are k ancestors of the sample.Hence, Tk = T m k−1 +1 .When there are at least k ancestors of the sample, there are at least m k−1 + 1 ancestors of the population, see Fig. 2.
According to Saunders et al. (1984, Lemma 3), the distribution with x = k − 1, . . ., m k − 1, and for k = n, we have m n = ∞ and . (7) The lineages that coalesce are chosen uniformly among the ancestral lineages, similarly to other neutral coalescent processes (Kingman, 1982).Thus, to draw a genealogy of a sample of size n from the LBD coalescent model, we first draw the numbers 6)-( 7), and then draw the times Tk = T m k−1 +1 , k = 2, . . . ,n from ( 4)-( 5).Simulation details are provided in Appendix A.
The limit expressions (4)-( 5) are in good agreement with those of the birth-death coalescent process with population size N 0 (1)-( 2), even for relatively small N 0 , such as N 0 = 100.This is illustrated in Figure S1 in the Supplementary Material, where the time of the origin, T 1 , and the time of the MRCA of the population, T 2 , respectively, are compared for different values of N 0 , including the limit N 0 → ∞.The approximation accuracy depends on Γ , but not ∆, as ∆ just scales time.
We end this section by pointing out related work in the literature.In Stadler (2009), an expression is derived from ( 1)-( 2) for the density of the coalescent time of a sample of size n = 2; with and without taking the limit as N 0 → ∞.In Harris et al. (2019), continuous-time Galton-Watson processes are discussed.The coalescent times of a sample of fixed size, say n, taken from a population of unknown size, are derived, assuming the time of the origin T 1 is a fixed parameter, unlike in our case where we put a prior distribution on T 1 .Some general results are derived in the critical case ν = µ (which is not covered in this paper), and in the scaling limit as T 1 → ∞, assuming the mean number of offspring is 1 The mean number of offspring converges to one with increasing T 1 .In Grosjean and Huillet (2018), the transition structure and coalescence times of a sample of fixed size are discussed for general Bienaymé-Galton-Watson processes, assuming the origin is a fixed parameter.A different scaling limit is considered in Ignatieva et al. (2020) in the context of the CRPB with vanishing sampling probability, ψ → 0. As a consequence, the current population size increases towards infinity in the limit.Exact expressions are obtained for distributions of interest.This approach, though mathematically very appealing, has the same problem as the CRPB, namely, that the age of the founder individual depends on the sample size.Finally, discrete-time Galton-Watson processes are analysed in Burden and Soewongsono (2019) in the diffusion limit as ν → 1 and T 1 → ∞ (in a particular way), while the number of founder individuals becomes infinitely large.

Approximations to the LBD coalescent model
In the LBD coalescent model we have to simulate the number of ancestors of the entire population at each coalescent event of the sample, which is computationally intensive.To simplify simulations and speed up computations, we next propose three different approximations to the LBD coalescent model that assume a coalescent process with time-varying deterministic population size (Griffiths and Tavaré, 1994).This avoids the hazzle of simulating the number of ancestors of the entire population at each coalescent event of the sample.
We will argue heuristically for the approximations.Let N(t) denote the stochastic population size at time t in the past, and let N(0) = N 0 .For the moment, we assume time is unscaled.Assume there are 2 ≤ k ≤ n ancestors of a sample of present size n.If a coalescent event happens at time t in the past (which is a birth event in the forward direction of time), then the probability that the coalescent event is between two of the k ancestors, is (Saunders et al., 1984) (there are N(t) + 1 individuals in the population just prior to the coalescent event).By conditioning on the population size process, N = (N(t), t ≥ 0), the probability that a coalescent event has not happened between time Tk+1 = s and time t > s, is assuming the population size is large.Here B s,t denotes the number of birth events between time s and t, s i denotes the time of the ith birth event, and the number of birth events within a small time interval is of order νN(z) in probability.
By taking the expectation over the population size process, we obtain where E N 0 denotes the expectation over all possible outcomes of the population size process, assuming N(0) = N 0 .Later, we also condition on the age of the origin, in which case the expectation in ( 8) is over all outcomes of the population size process with a given age.By Jensen's inequality, where ≳ means approximately greater than.In (9), we have (approximate) equality if the population sizes are large and the stochastic fluctuations comparably small (Sjodin et al., 2005;Krone and Kai, 2003).In fact, we later show this is the case for small (population scaled) times, see (18), whereas it fails for large times.Also by Jensen's inequality, exp with the same amendments about equality as before.The right hand side in (10) might in some cases be closer to the probability P N 0 ( Tk > t| Tk+1 = s) than the left hand side.In fact, as the function x ↦ → e −1/x is convex for 0 < x < 1 2 , the right hand side of (10) will be larger than P N 0 ( Tk > t| Tk+1 = s), whenever the exponent of the exponential is smaller than 1 2 .We note that the factor 2ν is consistent with Volz et al. (2009).
If the population size is large and stochastic effects comparably small, then equality is expected above (Sjodin et al., 2005;Krone and Kai, 2003).In that case, by scaling time in units of N 0 , assuming N 0 δ → ∆ and N 0 γ → Γ , as in (3), the right hand sides of ( 9) and ( 10) give rise to time-varying population size coalescent processes (Griffiths and Tavaré, 1994).In a timevarying population size coalescent process, the random variable Tk , conditional on Tk+1 = u has density and distribution function, respectively, given by k = 2, . . ., n, where Tn+1 = 0 for convenience, λ(u) is the coalescent intensity at time u ≥ 0, and Λ(v) = ∫ v 0 λ(u)du, the accumulated intensity (Griffiths and Tavaré, 1994).Taking λ(u) = u, we retrieve the standard Kingman coalescent.See Appendix A on how to simulate from a time-varying population size coalescent process (11).
In our case, the coalescent intensities are derived from the exponents of the right hand sides of ( 9)-( 10); , respectively, (b) 2ν under the assumption of large N 0 and by rescaling of time to units (where T 1 is in units of N 0 ).Then, the coalescent intensities for the two cases in ( 13) are the large N 0 limits of The limits are derived in the Supplementary Material, Sections 2 and 3, and discussed below.Before providing the details, we note that Approximation 1: The M * coalescent model, based on (14)(a).To take some of the stochasticity in the population size process into account, we further condition on the age of the process.The coalescent intensity at time u given the age T 1 is the large N 0 limit of (14)(a), ) , for 0 ≤ u ≤ T 1 , see the Supplementary Material (Sections 2 and 3) for details.
As λ * (u, T 1 ) → ∞ as u → T 1 , then it is guaranteed that a MRCA of the sample is found eventually (Griffiths and Tavaré, 1994).This accumulated intensity Λ * (u, T 1 ) = ∫ u 0 λ * (s, T 1 )ds cannot be given in explicit form, hence we apply numerical approximations to compute it.
Approximation 2: The M 2 coalescent model, based on (13)(b).For the second approximation, we also condition on the age of the process.The average relative population size at time u given the age T 1 is ) , for 0 ≤ u ≤ T 1 , see the Supplementary Material (Sections 2 and 3) for details of the derivation.The relative population size at time T 1 is h(T 1 , T 1 ) = 0, which is the limit of 1/N 0 as N 0 → ∞.The coalescent intensity at time u given the age T 1 is the large N 0 limit of (13)(b), where In this case, the accumulated intensity can be found explicitly, ) .
As h(T 1 , T 1 ) = 0, then λ 2 (u, T 1 ) → ∞ as u → T 1 , and it is guaranteed that a MRCA of the sample is found before time T 1 (Griffiths and Tavaré, 1994).
Approximation 3: The M K coalescent model.We suggest one further heuristic approximation, that allows the coalescent intensity to be increased or decreased, relatively to M 2 and M * .It provides one way to repair the shortcomings of the M 2 and M * coalescent models (to be discussed later).
The coalescent intensity, λ 2 (u, T 1 ), for M 2 might alternatively be written as This suggests a suit of approximations, with intensities given by For small u, the intensities of M K and M 2 , respectively, M K and M * , agree, λ K (u, T 1 ) λ 2 (u, T 1 ) ≈ 1 and λ K (u, T 1 ) λ * (u, T 1 ) ≈ 1, for u ≈ 0, while for large u, we find Thus, by choosing K , one might speed up or slow down the coalescent intensity by a constant factor for large u, compared to M 2 and M * , respectively.In any case, λ K (u, T 1 ) ≥ λ 2 (u, T 1 ) for K ≤ 2. See Fig. 3 for an illustration.
For the accumulated intensity of M K , we find ) , with

The hybrid LBD coalescent model
The M * coalescent model is computationally inefficient, while M 2 might be unsatisfactory compared to the LBD coalescent model (to be discussed later).Therefore, it might be interesting to explore a hybrid approach between the LBD coalescent model and the M 2 coalescent model.To do this, for a given sample size n, we first simulate the coalescent times according to the M 2 coalescent model until there are k 0 ancestors of the sample at time Tk 0 +1 , for a fixed k 0 ≤ n.Then we continue with the LBD coalescent model from time Tk 0 +1 .For this, we need to know the number of ancestors A 0 of the whole population in the M 2 coalescent model at time Tk 0 +1 .
The number A 0 is estimated from the equation 2 noting that Λ 2 (•, T 1 ) provides a correspondence between the time in the M 2 coalescent model and the time in the standard Kingman coalescent model with constant population size.Such relationship is generally used to simulate times from a timevarying population size coalescent model (Griffiths and Tavaré, 1994).See the paragraph The M * and M K coalescent models, K > 0, in the Appendix A for details.The value 2/A 0 in (15) is the expectation of U A 0 +1 , the time until there are A 0 ancestors to the whole population in the standard Kingman coalescent model.If A 0 is large (enough), the time U A 0 +1 has negligible standard deviation compared to its expectation, making 2/A 0 an accurate estimate of U A 0 +1 , hence also of the corresponding time T A 0 +1 (= Tk 0 +1 ) in the LBD coalescent model, see the Supplementary Material (Section 5) for details of the calculation.From time Tk 0 +1 onwards the coalescent times are sampled according to the LBD coalescent model, letting m k 0 = A 0 initially.
In the LBD coalescent model, the number of ancestors m k of the entire population the first time there are k ancestors of the sample is time-consuming to simulate.Furthermore, the expectation of m n−1 is infinite, while it is nk/(n − k − 1), for k = 1, . . ., n−2, see the Supplementary Material (Section 1) for a demonstration.This suggests taking k 0 to be small.However, one cannot choose k 0 too small as this increases the probability that A 0 in ( 15) is smaller then k 0 , which leads to an impossibility.
A similar ''hybrid'' approach is taken in Bhaskara et al. ( 2014) in a study of the accuracy of the standard Kingman coalescent model for small population sizes.

Simulations
Implementation.We implemented functions in R (https://cran.rproject.org/)to simulate from the LBD coalescent model and the different approximations, including the hybrid LBD coalescent model.The R code is available at https://github.com/crespofabian8012.
We investigated the LBD coalescent model and the various approximations for two different sample sizes (n = 10,100) with fixed ∆ = 1, and varying values of Γ .For all choices of parameters, 10 4 simulations of ancestry were obtained (except for assessment of the running time).The effect of ∆ is a time scale transformation, hence there is no point in varying ∆.For the hybrid approach we took k 0 to be 50% or 25% of n, that is, The effect of Γ in the LBD coalescent model.Table 1 shows the effect of Γ on the genealogy relating a sample of individuals.Both the relationship between the age and the time of the MRCA as well as the relationship between branch lengths depend on Γ .Eventually, for large Γ , the tree becomes star shaped, and the time gap between the MRCA and the origin shrinks to zero.Fig. 4 shows single realizations of genealogies.

Computational time.
For the LBD coalescent model and the hybrid approach, the main determinant of computational speed is the simulation of the number of ancestors m k .As the distribution of this quantity is independent of Γ , the overall computational time is independent of Γ .The same is the case for the approximations M K , K > 0, while for M * computational time might depend on Γ through the concrete numerical procedure applied, though this does not seem to be an issue with our implementation.Shown is the coalescent intensity λ * (u, T 1 ) (black), λ 0.5 (u, T 1 ) (green), λ 1 (u, T 1 ) (red), and λ 2 (u, T 1 ) (blue) for Γ = 0.1 (top row) and Γ = 10 (bottom row), and two random draws of T 1 for each Γ (and ∆ = 1), according to (5).The coalescent intensities are not necessarily increasing functions over time, in particular, this might not be the case for small Γ and large T 1 (which is unlikely for small Γ ).This is even so for the intensities of M * and M 2 .The intensity of M K decreases with increasing K .The reduction (or increase) in computational time compared to the LBD coalescent model is shown in Table 2.In particular for the hybrid LBD coalescent model with n = 10, it happens that k 0 > A 0 , which causes an impossibility (the number of ancestors of the sample cannot be larger than the number of ancestors of the population).Since, we need to exclude these cases, this causes a bias towards genealogies with large A 0 .
The coalescent intensities and the relative population size.Examples of coalescent intensities for M * , M 2 and M K are given in Fig. 3.It appears the coalescent intensities are not necessarily monotonically increasing, and might first decrease, then increase.For M 2 , this implies the opposite pattern occurs for the relative population size h(u, T 1 ).However, this seemingly paradoxial result disappears when averaging over the distribution of T 1 , given in (5), see the Supplementary Material (Section 4) for an evaluation of the integral.The function H(u) is strictly decreasing, and does not depend on Γ in contrast to h(u, T 1 ).One might construct yet another time-varying population size coalescent process from ( 16) with intensity coinciding with the intensity of the coalescent with exponential growth (Slatkin and Hudson, 1991), (where the factor 2∆ Γ in front is a time scale adjustment, similar to Volz et al. (2009)).This model performs worse than the M 2 coalescent model (see Figure S2 in the Supplementary Material) and is not pursued further here.The model has the additional Ratios of expected LBD coalescent times for a sample of size n = 10.The second column shows the ratio of the expected time of the MRCA of the sample, E( T2 ), to the expected age of the population, E(T 1 ).The remaining columns show the expected time while there are k ancestors E( Tk − Tk+1 ), k = 2, . . . ,n, eddivided by E( T2 ), with Tn+1 = 0 for convenience.drawback that the age T 1 is not directly build into it.In contrast, a simulated coalescent history of a sample from one of the coalescent models described in this paper contains the coalescent times of the sample in addition to the age of the origin.
Accuracy of the M * and M 2 coalescent models.Fig. 5 shows the ratio of the median coalescent times for M * and M 2 coalescent models, compared to the LBD coalescent model.Both approximations are quite accurate near the present, but neither approximation is able to reproduce the distributions of the LBD coalescent times near the origin, with M * underestimating and M 2 overestimating the corresponding values.These findings are in agreement with the bounds ( 9) and ( 10), and might be attributed to the fact that large N 0 coalescent approximations are more accurate when the population size is (in fact) large.Also, when the population size is small, stochastic fluctuations are not negligible and increase the variance of the coalescent times (Sjodin et al., 2005;Krone and Kai, 2003).As the LBD coalescent model eventually ends with a single individual, large N 0 approximations are expected to fail eventually for both M * and M 2 .We note that the approximations seem to be more accurate for large values of Γ and n. and Γ = 10 (blue, green) with T 1 drawn randomly from the distribution (5).
Time is scaled in units of T 1 for comparison, hence it runs from 0 to 1.The coefficient of variation always approaches 1/ √ 2 ≈ 0.70 as the scaled time approaches one, but the speed by which it does so depends on Γ as well as on the concrete realization of T 1 .
To illustrate these observations further, we found the coefficient of variation as N 0 → ∞, (see the Supplementary Material, Section 3, for the derivation).
For u ≈ T 1 , CV(u, T 1 ) ≈ 1/ √ 2 ≈ 0.7, hence not only is the population size small, it is also far from being constant, see Fig. 6.
Accuracy of the M K coalescent model.Fig. 7 shows the performance of the M 0.8 coalescent model, where K = 0.8 minimizes the squared deviation between median coalescent times of the LBD coalescent model and those of the M K coalescent model (see Figures S3 and S4 in the Supplementary Material), for the three values Γ = 0.001, 0.1, 10.The same value of K was found for n = 10 and n = 100 (with the chosen precision).The median coalescent times are generally in very good agreement with the LBD coalescent model (note the y-scale in Fig. 7).
In addition, we calculated the correlation coefficients for pairs of coalescent times for the M 0.8 coalescent model and the LBD coalescent model, and found that these generally are in good agreement with each other, see Figures S5-S7 in the Supplementary Material for n = 100.This agreement is best near the present time and gradually worsen deeper into the genealogy.
We chose the same K for all n and Γ .Simulations indicate the best K depends to some degree on n and Γ , though the dependence seems to be weak for the values investigated.This is also seen in Figures S5-S7 in the Supplementary Material .
The hybrid LBD coalescent model.The M K coalescent model, for any K ≥ 0, is generally much faster to simulate from than the LBD coalescent model (Table 2).This makes the hybrid LBD coalescent model speed-efficient as well, though the speed reduction depends on the chosen k 0 .
Fig. 8 shows that the median coalescent times of the hybrid LBD coalescent model with K = 2 fits well to the median coalescent times of the LBD coalescent model (only n = 100 is shown).For the correlation coefficients, we find similar agreement to that of the M 0.8 coalescent model, see Figures S5-S7 in the Supplementary Material for n = 100.However, surprisingly the correlation coefficients appear to be (slightly) less accurate than those of M 0.8 .One possible explanation for this might be that the distribution of m k 0 (the number of ancestors of the population when the hybrid approach changes from M 2 to the LBD coalescent model) differs from that of the pure LBD coalescent model.For the hybrid LBD coalescent model, the simulated mean (variance) of m k 0 is 13.42 (59.66) for n = 10 and k 0 =

Discussion
We have proposed a coalescent framework based on a birthdeath process under the constraint of a large (infinite) presentday population size, that we named the LBD coalescent model.To simulate from the this model with sample size n, one needs to simulate the number of ancestors of the entire population at each coalescent event of the sample.This appears difficult to do efficiently.To overcome this, we suggest efficient approximations in terms of accuracy and computational time.However, as the population size is stochastic and eventually decreases towards one, these approximations eventually fail when time is near the age of the population.To repair this deficiency, we proposed alternative models and a hybrid model approach, that combines the LBD coalescent framework with the M 2 coalescent model.We showed using simulation that both the M K (for K = 0.8) coalescent model and the hybrid LBD coalescent model are efficient and accurate, while the M 2 and the M * coalescent models are less accurate.
When using a time-varying population size approach, the stochastic nature of the birth-death process is partly ignored.In the present case, this problem is two-fold.On one hand, the population size varies stochastically.On the other hand, the population size decreases eventually to one.Both of these problems run counter to obtaining accurate time-varying population size coalescent models, as also addressed previously (Stadler et al., 2015;Krone and Kai, 2003;Griffiths and Tavaré, 1994).
Our approach shows one important difference to the CRPB in that the coalescent rates of the sample are quadratic in the number of lineages instead of linear.This difference is caused by the stochastic nature of the sample size in the CRPB (Gernhard, 2008b;Wiuf, 2018), where each sampled individual contributes a geometric random number to the entire population size (i.e., ∑ n i=1 X i , where X i are iid geometric random variables); hence the entire population size scales linearly in the number of sampled individuals and the coalescent rates become linear in the sample size.In contrast, the LBD coalescent model assumes the In Stadler et al. (2015), a birth-death process is compared to a coalescent model with exponentially growing population size (for a sample of size n = 2), determined by the expectation of the population size of the birth-death process running forward in time.This approach is different from our approach, though it results in the same coalescent intensity as in (17).It assumes a fixed age and a stochastically varying population size, including a stochastic current population size.One might see our approach as an attempt to control this variation by conditioning on the age of the origin (while keeping it stochastic) and fixing the present day population size, while letting it be large.
One drawback of the coalescent models M K , K > 0, is that the best K depends to some extent on n and Γ .This implies one might have to readjust K , for example, if Γ is updated stochastically in a Bayesian inference scheme.In contrast, time-varying population size coalescent approximations, such as M K , K > 0, facilitates using much of the inference machinery developed over the past twenty (or so) years for coalescent processes, including importance sampling (Griffiths and Tavaré, 1994), ABC (Cornuet et al., 2014) and more advanced Bayesian approaches (Drummond et al., 2012;Drummond and Bouckaert, 2015).Moreover, heterochronous sampling, which for example is relevant for modelling virus data, is relatively simple to handle in time-varying population size coalescent models (Drummond et al., 2005).It is considerably harder to handle heterochronous sampling in the LBD coalescent model, where the entire population size at any time in the past is stochastic and finite (Wiuf, 2018).Finiteness implies that individuals included, say at time u > 0 in the past, might coincide with ancestors of samples included prior to time u, or that more individuals should be included than there are individuals in the population at that time.These aspects naturally put constraints on sampling-based inference procedures for the LBD coalescent model.
It is noteworthy, that one cannot infer the parameter ∆ and the mutation rate, say θ , separately (assuming data are DNA sequences), but only jointly as a compound parameter θ /∆.It follows by rescaling time by ∆, hence the mutation rate in the new unit is θ/∆.In contrast, the parameter Γ , which is a scaled growth rate independent of the time unit, might be inferred.This is the case both for the LBD coalescent model and its approximations.
The LBD coalescent model is a relatively simple population process.However, our work demonstrates the difficulties in providing fast and reliable approximations in terms of time-varying population size coalescent models.

Fig. 1 .
Fig. 1.A standard birth-death process.Left: Outcome of a birth-death process.The population size varies stochastically over time with current population size N 0 = 4. Solid lines: lineages surviving until the present time.Dashed lines: lineages dying out before present time.Right: The same figure as before but with lineages going extinct removed.One lineage is 'ironed out' for illustrative purposes.Compared to standard coalescent processes, the tip above the MRCA of the population represents the time from the MRCA back to the origin.

Fig. 2 .
Fig. 2. Illustration of sample and population coalescent times.The entire population is of size N 0 = 6 (red and black dots) with sample size n = 3 (red dots).To the right are shown the coalescent times of the population T 1 , . . ., T 6 ,

Fig. 4 .
Fig. 4. Examples of genealogies and lineages-through-time plots.One example for each of Γ = 0.001, 0.1, 10, 1000, and n = 10.Top panel: Genealogies.Bottom panel: The number of lineages in the genealogy (normalized to one) against normalized time.As Γ increases the genealogy becomes more and more star-shaped. .

Fig. 5 .
Fig. 5. Ratios of median coalescent times for the M * and M 2 coalescent models.Shown is the ratios of the median coalescent times obtained for M * and M 2 over that for the LBD coalescent model for two sample sizes, n = 10,100, and three values of Γ = 0.001, 0.1, 10.Red: M 2 , Blue: M * .

Fig. 7 .
Fig. 7. Ratios of median coalescent times for M 0.8 coalescent model.Shown is the ratios of the median coalescent times obtained for M 0.8 over that for the LBD coalescent model for two sample sizes, n = 10,100, and three values of Γ = 0.001, 0.1, 10.Red: M 2 , n = 10, Green: M 2 , n = 100.Red: n = 10, Blue: n = 100.currentpopulation size is fixed, though large (N 0 → ∞).Hence the stochasticity in the current population size is controlled.InStadler et al. (2015), a birth-death process is compared to a coalescent model with exponentially growing population size

Fig. 8 .
Fig. 8. Ratios of median coalescent times for hybrid LBD coalescent model.Shown are the ratios of the median coalescent times obtained for the hybrid LBD coalescent model with k 0 = 25, 50 over that for the LBD coalescent model for n = 100, and three values of Γ = 0.001, 0.1, 10.Top panel: k 0 = 25.Bottom panel: k 0 = 50.

Table 2
Average running times.Shown is average running times for the LBD coalescent model in milliseconds, and the average running times in % of the LBD mean running time for the other coalescent models, for Γ = 10.For the hybrid LBD coalescent model (H), it is stated how often the condition k 0 > A 0 (Failed) occurred.These cases are excluded.Similar running times are obtained for other values of Γ .n: sample size, sim: number of simulations.The simulations were executed on a MacBook Pro Intel Core I5, 2.3 GHz, 8 GB RAM 2133 MHz LPDDR3, using a single core.
5, and 102.86  (255.62)forn = 100 and k 0 = 50, compared to the theoretically determined mean (variance) of m k 0 in the LBD coalescent model, 12.5 (56.25) for n = 10, and 102.04 (219.05) for n = 100 (see Supplementary Material, Section 1, for a derivation of the theoretical mean and variance of m k ).