Fast Bayesian parameter estimation for stochastic logistic growth models

The transition density of a stochastic, logistic population growth model with multiplicative intrinsic noise is analytically intractable. Inferring model parameter values by fitting such stochastic differential equation (SDE) models to data therefore requires relatively slow numerical simulation. Where such simulation is prohibitively slow, an alternative is to use model approximations which do have an analytically tractable transition density, enabling fast inference. We introduce two such approximations, with either multiplicative or additive intrinsic noise, each derived from the linear noise approximation of the logistic growth SDE. After Bayesian inference we find that our fast LNA models, using Kalman filter recursion for computation of marginal likelihoods, give similar posterior distributions to slow arbitrarily exact models. We also demonstrate that simulations from our LNA models better describe the characteristics of the stochastic logistic growth models than a related approach. Finally, we demonstrate that our LNA model with additive intrinsic noise and measurement error best describes an example set of longitudinal observations of microbial population size taken from a typical, genome-wide screening experiment.


Introduction
Stochastic models simultaneously describe dynamics and noise or heterogeneity in real systems (Chen et al., 2010).For example, stochastic models are increasingly recognised as necessary tools for understanding the behaviour of complex biological systems (Wilkinson, 2012(Wilkinson, , 2009) ) and are also used to capture uncertainty in financial market behaviour (Kijima, 2013;Koller, 2012).Many such models are written as continuous stochastic differential equations (SDEs) which often do not have analytical solutions and are slow to evaluate numerically compared to their deterministic counterparts.Simulation speed is often a particularly critical issue when inferring model parameter values by comparing simulated output with observed data (Hurn et al., 2007).
For SDE models where no explicit expression for the transition density is available, it is possible to infer parameter values by simulating a latent process using a data augmentation approach (Golightly and Wilkinson, 2005).However, this method is computationally intensive and not practical for all applications.When fast inference for SDEs is important, for example for real-time analysis as part of decision support systems or for big data inference problems where we simultaneously fit models to many thousands of datasets (e.g.Heydari et al. (2012)), we need an alternative approach.Here we demonstrate one such approach: developing an analytically tractable approximation to the original SDE, by making linear noise approximations (LNAs).We apply this approach to a SDE describing logistic population growth for the first time.
The logistic model of population growth, an ordinary differential equation (ODE) describing the self-limiting growth of a population of size X t at time t, was developed by Verhulst (1845) The ODE has the following analytic solution: where Q = K P − 1 e rt 0 and P = X t 0 .The model describes a population growing from an initial size P with an intrinsic growth rate r, undergoing approximately exponential growth which slows as the availability of some critical resource (e.g.nutrients or space) becomes limiting (Turner Jr. et al., 1976).Ultimately, population density saturates at the carrying capacity (maximum achievable population density) K, once the critical resource is exhausted.Where further flexibility is required, generalized forms of the logistic growth process (Tsoularis and Wallace, 2002;Peleg et al., 2007) may be used instead.
To account for uncertainty about processes affecting population growth which are not explicitly described by the deterministic logistic model, we can include a term describing intrinsic noise and consider a SDE version of the model (Li et al., 2011;Román-Román and Torres-Ruiz, 2012).Here we extend the ODE in (1) by adding a term representing multiplicative intrinsic noise (3) to give a model which we refer to as the stochastic logistic growth model (SLGM).
where X t 0 = P and is independent of W t , t ≥ t 0 , Alternative stochastic formulations of the logistic ODE can be generated (Campillo et al., 2013).The Kolmogorov forward equation has not been solved for (3) (or for any similar formulation of a logistic SDE) and so no explicit expression for the transition density is available.
Román-Román and Torres-Ruiz (2012) introduce a diffusion process approximating the SLGM (which we label RRTR) with a transition density that can be derived explicitly.The Bayesian approach can be applied in a natural way to carry out parameter inference for state space models with tractable transition densities (West and Harrison, 1997).A state space model describes the probabilistic dependence between an observation process variable X t and state process S t .The transition density is used to describe the state process S t and a measurement error structure is chosen to describe the relationship between X t and S t .
The Kalman filter (Kalman, 1960) is typically used to infer the hidden state process of interest S t and is an optimal estimator, minimising the mean square error of estimated parameters when all noise in the system can be assumed to be Gaussian.We use the Kalman filter to reduce computational time in a parameter inference algorithm by recursively computing the marginal likelihood (West and Harrison, 1997).
The RRTR can be fit to data within an acceptable time frame by assuming multiplicative measurement error to give a linear Gaussian structure, allowing us to use a Kalman filter for inference.
We introduce two new first order linear noise approximations (LNAs) (Wallace, 2010;Komorowski et al., 2009) of (3), one with multiplicative and one with additive intrinsic noise, which we label LNAM and LNAA respectively.The LNA reduces a SDE to a linear SDE with additive noise, which can be solved to give an explicit expression for the transition density.We derive transition densities for the two approximate models and construct a Kalman filter by choosing measurement noise to be either multiplicative or additive to construct a linear Gaussian structure.Exact simulations from the SGLM are compared with each of the three approximate models.We compare the utility of each of the approximate models during parameter inference by comparing simulations with both synthetic and real datasets.

The Román-Román and Torres-Ruiz (2012) diffusion process
Román-Román and Torres-Ruiz (2012) present a logistic growth diffusion process (RRTR) which has a transition density that can be written explicitly, allowing inference of model parameter values from discrete sampling trajectories.The RRTR is derived from the following ODE: The solution to (4) is given in (2) (it has the same solution as ( 1)).
Román-Román and Torres-Ruiz ( 2012) see ( 4) as a generalisation of the Malthusian growth model with a deterministic, time-dependent fertility h(t) = Qr e rt +Q , and replace this with Qr e rt +Q + σW t to obtain the following approximation to the SLGM: where Q = K P − 1 e rt 0 , P = X t 0 and is independent of W t , t ≥ t 0 .The process described in (5) is a particular case of the lognomal process with exogenous factors, therefore an exact transition density is available (Gutiérrez et al., 2006).The transition density for Y t , where Y t = log(X t ), can be written:

Linear noise approximation with multiplicative noise
We now take a different approach to approximating the SLGM (3), which will turn out to be closer to the exact solution of the SLGM than the RRTR (5).Starting from the original model (3), we apply Itô's lemma with the transformation f (t, X t ) ≡ Y t = log X t to obtain the following Itô drift-diffusion process: The log transformation from multiplicative to additive noise, gives a constant diffusion term, so that the LNA will give a good approximation to (3).We now separate the process Y t into a deterministic part V t and a stochastic part Z t so that Y t = V t + Z t and consequently dY t = dV t + dZ t .We choose V t to be the solution of the deterministic part After redefining our notation as follows: a = r − σ 2 2 and b = r K , we solve (7) for V t : This process is a particular case of the time-varying Ornstein-Uhlenbeck process, which can be solved explicitly.The transition density for Y t (derivation in Appendix A) is then: The LNA of the SLGM with multiplicative intrinsic noise (LNAM) can then be written as where P = X t 0 and is independent of W t , t ≥ t 0 .Note that the RRTR given in (5) can be similarly derived using a zero-order noise approximation (e Zt ≈ 1) instead of the LNA.

Linear noise approximation with additive noise
As in Section 3, we start from the SLGM, given in (3).Without first log transforming the process, the LNA will lead to a worse approximation to the diffusion term of the SLGM, but we will see in the coming sections that there are nevertheless advantages.We separate the process X t into a deterministic part V t and a stochastic part Z t so that dX t = dV t + dZ t and consequently X t = V t + Z t .We choose V t to be the solution of the deterministic part After redefining our previous notation as follows: a = r and b = r K , we solve dV t to give: V t = aP e aT bP (e aT − 1) + a .
Differentiating w.r.t.t, we obtain: We now solve dZ t , where dZ t = dX t − dV t .Expressions for both dX t and dV t are known: a 2 P e aT (a − bP ) (bP (e aT − 1) + a) 2 dt. Simplifying: a 2 P e aT (a − bP ) (bP (e aT − 1) + a) 2 dt + σX t dW t .We then substitute in X t = V t + Z t and rearrange to give We now apply the LNA, by setting second-order terms −bZ 2 dt = 0 and σZ t dW t = 0 to obtain This process is a particular case of the Ornstein-Uhlenbeck process, which can be solved.The transition density for X t (derivation in Appendix B) is then where The LNA of the SLGM, with additive intrinsic noise (LNAA) can then be written as where P = X t 0 and is independent of W t , t ≥ t 0 .

Simulation and Bayesian inference for logistic SDE and approximations
To test which of the three approximate models best represent the SLGM, we first compare simulated forward trajectories from the RRTR, LNAM and LNAA with simulated forward trajectories from the SLGM (Figure 1).We use the Euler-Maruyama method (Carletti, 2006) with very fine discretisation to give arbitrarily exact simulated trajectories from each SDE.
The LNAA and LNAM trajectories are visually indistinguishable from the SLGM (Figure 1  A,C & D).On the other hand, population sizes simulated with the RRTR display large deviations from the mean as the population approaches stationary phase (Figure 1A & B). Figure 1E further highlights the increases in variation as the population approaches stationary phase for simulated trajectories of the RRTR, in contrast to the SLGM and LNA models.

Bayesian parameter inference with approximate models
To compare the quality of parameter inference using each of these approximations we simulated synthetic time-course data from the SLGM and combined this with either lognormal or normal measurement error.Carrying out Bayesian inference with broad priors (see ( 13) and ( 14)) we compared the parameters recovered using each approximation with those used to generate the synthetic dataset.The synthetic time-course datasets consist of 27 time points generated using the Euler-Maruyama method with very fine intervals (Kloeden and Platen, 1992).
We formulate our inference problem as a dynamic linear state space model.To allow fast parameter inference we take advantage of a linear Gaussian structure and construct a Kalman filter recursion for marginal likelihood computation (Appendix D).We therefore assume lognormal (multiplicative) error for the RRTR and LNAM, and for the LNAA we assume normal (additive) measurement error.Dependent variable y t i and independent variable {t i , i = 1, ..., N} are data input to the model (where t i is the time at point i and N is the number of time points).X t is the state process, describing the population size.See Table C.1 for prior hyper-parameter values.
For the RRTR and LNAM, µ t i and Ξ t i are given by ( 6) and ( 10) for the RRTR and LNAM respectively.Priors are as follows: For the LNAA, See Table 1 for parameter values.A) The stochastic logistic growth model (SLGM).B) The Román-Román and Torres-Ruiz (2012) (RRTR) approximation.C) The linear noise approximation with multiplicative intrinsic noise (LNAM).D) The linear noise approximation with additive intrinsic noise (LNAA).E) Standard deviations of simulated trajectories over time for the SLGM (black), RRTR (red), LNAM (green) and LNAA (blue).
µ t i and Ξ t i are given by ( 12).Priors are as in (13).Our prior for log σ −2 is truncated below 1 to avoid unnecessary exploration of extremely low probability regions, which could be caused when there are problems identifying ν, for example when log ν −2 takes large values, and to ensure that intrinsic noise does not dominate the process.The truncation limit was chosen by visual inspection.
To see how the inference from our approximate models compares with slower "exact" models, we consider Euler-Maruyama approximations (Kloeden and Platen, 1992) of ( 3) and of the log transformed process, using fine intervals.Given these approximations we can construct a state space model for an "exact" SLGM with lognormal measurement error (SLGM+L) and similarly for the SLGM with normal measurement error (SLGM+N), priors are as in (13).
Model fitting is carried out using standard MCMC techniques (the Gibbs sampler) (Gamerman and Lopes, 2006).Posterior means are used to obtain point estimates and standard deviations for describing variation of inferred parameters.The Heidelberger and Welch convergence diagnostic (Heidelberger and Welch, 1981) and effective sample size diagnostic (Plummer et al., 2006) are used to determine whether convergence has been achieved for all parameters.
Computational times for convergence of our MCMC schemes (code is available at https:// github.com/jhncl/LNA.git)can be compared using estimates for the minimum effective sample size per second (ESS min /sec).The average ESS min /sec of our approximate model (coded in C) is ∼100 and "exact" model ∼1 (coded in JAGS (Plummer, 2010) with 15 imputed states between time points, chosen to maximise ESS min /sec).We find that our C code is typically twice as fast as the simple MCMC scheme used by JAGS, indicating that our inference is ∼50× faster than an "exact" approach.A more efficient "exact" approach could speed up further, say by another factor of 5, but our approximate approach will at least be an order of magnitude faster.We use a burn-in of 600,000 and a thinning of 4,000 to obtain a final posterior sample size of 1,000 for MCMC convergence of all our models.
To compare the approximate models ability to recover parameters from the SLGM with simulated lognormal measurement error, we simulate data and carry out Bayesian inference.Figure 2 shows that all three approximate models can capture the synthetic time-course well, but that the RRTR model is the least representative with the largest amount of drift occurring at the saturation stage, a property not found in the SLGM or the two new LNA models.Comparing forwards trajectories with measurement error (Figure 2), the "exact" model is visually similar to all our approximate models, but least similar to the RRTR.Further, Table 1 demonstrates that parameter posterior means are close to the true values and that standard deviations are small for all models and each parameter set.By comparing posterior means and standard deviations to the true values, Table 1 shows that all our models are able to recover the three different parameter sets considered.
To compare the approximations to the SLGM with simulated normal measurement error, we simulate data and carry out Bayesian inference.Figure 3 shows that of our approximate models, only the LNAA model can appropriately represent the simulated time-course as both our models with lognormal measurement error, the RRTR and LNAM do not closely bound the data.Comparing forwards trajectories with measurement error (Figure 3), the "exact" model is most visually similar to the LNAA, which shares the same measurement error structure.Further, Table 1 demonstrates that only our models with normal measurement error have posterior means close to the true values and that standard deviations are larger in the models with lognormal measurement error.Observing the posterior means for K for each parameter set (Table 1), we can see that the RRTR has the largest standard deviations and that, of the approximate models, its posterior means are furthest from both the true values and the "exact" model posterior means.Comparing LNA models to the "exact" models with matching measurement error, we can see in   1).See ( 13) or ( 14) for model and Table C.1 for prior hyper-parameter values.See Table 1 for

Application to observed yeast data
We now consider which diffusion equation model can best represent observed microbial population growth curves taken from a Quantitative Fitness Analysis (QFA) experiment (Addinall et al., 2011;Banks et al., 2012), see Figure 4.The data consists of scaled cell density estimates over time for budding yeast Saccharomyces cerevisiae.Independent replicate cultures are inoculated on plates and photographed over a period of 5 days.The images captured are then converted into estimates of integrated optical density (IOD, which we assume are proportional to cell population size), by the software package Colonyzer (Lawless et al., 2010).The dataset chosen for our model fitting is a representative set of 10 time-courses, each with 27 time points.
As in Figure 3, we see that the LNAA model is the only approximation that can appropriately represent the time-course and that both the RRTR and LNAM fail to bound the data as tightly as the LNAA (Figure 4).Our two "exact" models are visually similar to our approximate models with the same measurement error, with the SLGM+N most similar to the LNAA and the SLGM+L to the RRTR and LNAM.This is as expected due to matching measurement error structures.Table 1 summarises parameter estimates for the observed yeast data using each model.The variation in the the LNAA model parameter posteriors is much smaller than the RRTR and LNAM, indicating a more appropriate model fit.Comparing the LNA models and "exact" models with matching measurement error, we can see in Table 1 that they share similar posterior means and standard deviations for all parameters and in particular, they are very similar for both K and r, which are important phenotypes for calculating fitness (Addinall et al., 2011).1).See ( 13) or ( 14) for model and Table C.1 for prior hyper-parameter values.See Table 1

for parameter posterior means. A) SLGM+N (pink). B) RRTR model with lognormal error (red). C) LNAM model with lognormal error (green). D) LNAA model with normal error (blue).
In Table 2, to compare quality of parameter inference for 10 observed yeast time-courses with each approximate model.Mean square error (MSE) for 1000 posterior sample forward simulations are calculated for each yeast time course and summed to give a Total MSE for each model.It is clear that the RRTR is the worst overall representation of the 10 yeast time courses, with the highest total MSE and a much larger total MSE than the "exact" SLGM+L.It is interesting to see there is a very similar total MSE for the SLGM+L and LNAM, and similarly for the SLGM+N and LNAA, demonstrating that our approximations perform well.13) or ( 14) and Table C.1 for prior hyper-parameter values.See Table 1 for parameter posterior means.A) SLGM+N (pink).B) SLGM+L (orange).A) RRTR model with lognormal error (red).B) LNAM model with lognormal error (green).C) LNAA model with normal error (blue).

Conclusion
We have presented two new diffusion processes for modelling logistic growth data where fast inference is required: the linear noise approximation (LNA) of the stochastic logistic growth model (SLGM) with multiplicative noise and the LNA of the SLGM with additive intrinsic noise.Both the LNAM and LNAA are derived from the linear noise approximation of the stochastic logistic growth model (SLGM).The new diffusion processes approximate the SLGM more closely than an alternative approximation (RRTR) proposed by Román-Román and Torres-Ruiz (2012).The RRTR lacks a mean reverting property that is found in the SLGM, LNAM and LNAA, resulting in increasing variance during the stationary phase of population growth (see Figure 1).
We compared the ability of each of the three approximate models and the SLGM to recover parameter values from simulated datasets using standard MCMC techniques.When modelling stochastic logistic growth with lognormal measurement error we find that our approximate models are able to represent data simulated from the original process and that the RRTR is least representative, with large variation over the stationary phase (see Figure 2).When modelling stochastic logistic growth with normal measurement error we find that only our models with normal measurement error can appropriately bound data simulated from the original process (see Figure 3).We also compared parameter posterior distribution summaries with parameter values used to generate simulated data after inference using both approximate and "exact" models (see Table 1).We find that, when using the RRTR model, posterior distributions for the carrying capacity parameter K are less precise than for the LNAM and LNAA approximations.We also note that it is not possible to model additive measurement error while maintaining a linear Gaussian structure (which allows fast inference with the Kalman filter) when carrying out inference with the RRTR.We conclude that when measurement error is additive, the LNAA model is the most appropriate approximate model.
To test model performance during inference with real population data, we fitted our approximate models and the "exact" SLGM to microbial population growth curves generated by quantitative fitness analysis (QFA) (see Figure 4).We found that the LNAA model was the most appropriate for modelling experimental data.It seems likely that this is because a normal error structure best describes this particular dataset, placing the LNAM and RRTR models at a disadvantage.We demonstrate that arbitrarily exact methods and our fast approximations perform similarly during inference for 10 diverse, experimentally observed, microbial population growth curves (see Table 2) which show that, in practise, our fast approximations are as good as "exact" methods.We conclude that our LNA models are preferable to the RRTR for modelling QFA data.
It is interesting to note that, although the LNAA is not a better approximation of the original SGLM process than the LNAM, it is still quite reasonable.Figure 1A and 1D shows that the SLGM and LNAA processes are visually similar.Figure 1E demonstrates that forward trajectories of the LNAA also share similar levels of variation over time with the SLGM and LNAM.
Fast inference with the LNAA gives us the potential to develop large hierarchical Bayesian models which simultaneously describe thousands of independent time-courses from QFA with a diffusion equation, allowing us to infer the existence of genetic interactions on a genome-wide scale using realistic computational resources.
Here, we have concentrated on a biological model of population growth.However, we expect that the approach we have demonstrated: generating linear noise approximations of stochastic processes to allow fast Bayesian inference with Kalman filtering for marginal likelihood computation, will be useful in a wide range of other applications where simulation is prohibitively slow.

A LNAM Solution
First we look to solve dZ t , given in equation ( 9).We define f (t) = − baP e aT bP (e aT −1)+a to obtain the following, U t , has the following solution, As U t = φ(t) −1 Z t , Z t then has the following solution, Finally, the distribution at time t is Z t |Z 0 ∼ N(M t , E t ), where M t = φ(t)Z 0 and E t = φ(t) Further, M t = a bP (e aT −1)+a Z 0 and E t = σ 2 a bP (e aT −1)+a 2a (bP (e aT − 1) + a) 2 .
Taking our solutions for V t (8) and Z t , we can now write our solution for the LNA to the log of the logistic growth process (7). As Note: aP e aT bP (e aT −1)+a has the same functional form as the solution to the deterministic part of the logistic growth process (3) and is equivalent when σ = 0 (such that a = r − σ 2 2 = r).
Further, as Y t is normally distributed, we know X t = e Yt will be log normally distributed and From our solution to the log process we can obtain the following transition density where

B LNAA Solution
U t has the following solution,  From our solution to the process we can obtain the following transition density where x t i−1 =v t i−1 + z t i−1 , µ t i =v t i−1 + aP e aT i bP (e aT i − 1) + a − aP e aT i−1 bP (e aT i−1 − 1) + a + e a(t i −t i−1 ) bP (e aT i−1 − 1) + a bP (e aT i − 1) + a 2 Z t i−1 and Ξ t = 1 2 σ 2 aP 2 e 2aT i 1 bP (e aT i − 1) + a 4 × [b 2 P 2 (e 2aT i − e 2aT i−1 ) + 4bP (a − bP )(e aT i − e aT i−1 ) + 2a(t i − t i−1 )(a − bP ) 2 ].
C Prior Hyper-parameters for Bayesian State Space Models

D Kalman Filter
To find π(y t 1:N ) for the LNAA with normal measurement error we can use the following Kalman Filter algorithm.First we assume the following: and initialize with m 0 = P and C 0 = 0. Now suppose that, The transition density distribution, see ( 12) is as follows: where H α,t i = H α (t i , t i−1 ) =V t − V t−1 e a(t i −t i−1 ) bP (e aT i−1 − 1) + a bP (e aT i − 1) + a 2 and H β,t i =H β (t i , t i−1 ) = e a(t i −t i−1 ) bP (e aT i−1 − 1) + a bP (e aT i − 1) + a 2 .
The measurement error distribution is as follows: where F = 0 1 and U = σ 2 ν .
Figure E.1: Trace, auto-correlation and density plots for the LNAA model parameter posteriors (sample size = 1000, thinning interval = 4000), see Figure 3D.Posterior density (black), prior density (dashed blue) and true parameter values (red) are shown in the right hand column.
Writing down an expression for dZ t , where dZ t = dY t − dV t .Vt − e Yt dt + σdW t .We substitute in Y t = V t + Z t to give dZ t = b e Vt − e Vt+Zt dt + σdW t .Zt ≈ 1 + Z t and then simplify to give dZ t = −be Vt Z t dt + σdW t .Finally, substitute e Vt = t dt + σdW t .
Table 1 that they share similar posterior means and only slightly larger standard deviations.Example posterior diagnostics given in Appendix E, demonstrate that posteriors are distributed tightly around true values for our LNAA and data from the SLGM with Normal measurement error.

Table 1 :
Bayesian state space model parameter posterior means, standard deviations and true values for Figure2, 3 and 4. True values for the simulated data used for Figure1, and 3 are also given.

Table 2 :
Total mean squared error (MSE) for 10 observed yeast growth time courses, each with 1000 forward simulated time-courses with measurement error.Parameter values are taken from posterior samples.Standard Deviations give the variation between the sub-total MSEs for each yeast time course fit (n=10).
t)Z t dt + σdW t .S = s − t 0 and T = t − t 0 .Apply the chain rule to U t , Now substitute in dZ t = f (t)Z t dt + σdW t and simplify to giveU t = e First we look to solve dZ t , given in (11).We define f (t) = a − 2bV t and g(t) = aV t − bV 2 t − a 2 P e aT (a−bP )(bP (e aT −1)+a) 2 to obtain the following,dZ t = (g(t) + f (t)Z t )dt + σV t dW t .= s − t 0 and T = t − t 0 .Apply the chain rule to U t , Now substitute in dZ t = (g(t) + f (t)Z t ) dt + σV t dW t and simplify to give, f (s)ds σV t dW t .Apply the following notation φ(t) = e