Skip to main content
Advertisement
  • Loading metrics

The impact of temporal sampling resolution on parameter inference for biological transport models

Abstract

Imaging data has become an essential tool to explore key biological questions at various scales, for example the motile behaviour of bacteria or the transport of mRNA, and it has the potential to transform our understanding of important transport mechanisms. Often these imaging studies require us to compare biological species or mutants, and to do this we need to quantitatively characterise their behaviour. Mathematical models offer a quantitative description of a system that enables us to perform this comparison, but to relate mechanistic mathematical models to imaging data, we need to estimate their parameters. In this work we study how collecting data at different temporal resolutions impacts our ability to infer parameters of biological transport models by performing exact inference for simple velocity jump process models in a Bayesian framework. The question of how best to choose the frequency with which data is collected is prominent in a host of studies because the majority of imaging technologies place constraints on the frequency with which images can be taken, and the discrete nature of observations can introduce errors into parameter estimates. In this work, we mitigate such errors by formulating the velocity jump process model within a hidden states framework. This allows us to obtain estimates of the reorientation rate and noise amplitude for noisy observations of a simple velocity jump process. We demonstrate the sensitivity of these estimates to temporal variations in the sampling resolution and extent of measurement noise. We use our methodology to provide experimental guidelines for researchers aiming to characterise motile behaviour that can be described by a velocity jump process. In particular, we consider how experimental constraints resulting in a trade-off between temporal sampling resolution and observation noise may affect parameter estimates. Finally, we demonstrate the robustness of our methodology to model misspecification, and then apply our inference framework to a dataset that was generated with the aim of understanding the localization of RNA-protein complexes.

Author summary

We consider how the temporal resolution of imaging studies affects our ability to carry out accurate parameter estimation for a stochastic biological transport model. This model provides a mechanistic description of motile behaviour and is often used to interrogate transport processes, such as the motion of bacteria. Parameter inference is necessary to characterise different types of transport and to make predictions about biological behaviour under different conditions. Typically, observations of the transport process, at the level of individual trajectories, are made at discrete times. This can lead to errors in parameter estimation because we do not have complete trajectory information. We present a framework for Bayesian inference for these models of biological transport processes. Using this framework, we study the effects of collecting data more or less frequently, and with varying measurement noise, on what we can learn about the biological system via parameter estimation.

Introduction

Biological transport processes occur on a wide range of spatial and temporal scales, and a common mechanism for transport involves two phases: fast active transport, and a quasi-stationary reorientation phase. This pattern of movements has been observed at a range of scales from the intracellular transport of cellular components such as mRNA particles moving on a microtubule network [1], to the run-and-tumble motion of bacteria such as Escherichia coli [24], and the flights of birds between nesting sites [5]. To capture appropriately these two phases of motion, a class of models known as velocity jump process (VJP) models [2, 58] (also known as correlated random walks [913] or Levy Walks [1417]) have been developed.

Estimating the parameters of these models can give us mechanistic information relating to the underlying biological process, such as the rate of reorientation. Being able to obtain accurate estimates, with appropriate uncertainty, for these parameters allows us to compare different biological species or mutants, and gain an understanding of the underlying mechanistic behaviour. Importantly, parameterising models and quantifying the uncertainty in parameter estimates, as can be achieved via Bayesian inference, enables us to use models to make quantitive predictions of behaviour in new conditions with quantifiable uncertainty. By performing experiments to test model predictions, we can evaluate the areas in which a given model fails to describe experimental data, and so iteratively refine our understanding of a given system or phenomenon.

In this work, we consider the effects that experimental design can have on the information we can obtain from a data set, in terms of using that data to estimate parameters of a mechanistic model. In particular, for time series data describing a biological transport process, we vary the time between successive measurements. We demonstrate a framework for estimating the parameters of a VJP model for data of this form in the presence of noise, and examine how the posterior estimates of the model parameters change for more coarsely sampled and noisier datasets. Our framework formulates the VJP model as a process with hidden states, as in a hidden Markov model (HMM), which allows us to use particle Markov Chain Monte Carlo (pMCMC) methods to perform exact Bayesian inference. We use our framework to suggest sensible experimental design choices in the context of microscopy studies, where there may be a trade-off between how frequently it is possible to image, and the noise resulting from more or less frequent observations of the process. We present a comparison between the pMCMC framework described in this work and approximate Bayesian computation (ABC) for parameter inference in this context. We demonstrate robustness to model mispecification in a situation where we attempt inference on synthetic data generated with a different model to the assumed model. Finally, we apply our method to an imaging dataset of tracks of RNA-protein complexes allowing us to estimate motility parameters for a mechanistic model.

Velocity jump process models

VJP models have been developed to describe biological transport processes where there is persistent or biased random motion [2, 5, 18]. Correlated, persistent motion is observed experimentally in a variety of contexts [19, 20]. VJP models are most appropriate for motion consisting of multiple phases, such as a fast directed phase and a stationary or reorientation phase, and these models describe how an object moves in one direction before reorienting and moving in a new direction. This type of “run and reorientate” motion is exhibited, for example, by Escherichia coli [3, 4], fibroblasts [20], and RNA-protein complexes [1].

We present here a mathematical description of a VJP model. Suppose we have a running time distribution, with probability density function (pdf) fτ, and a waiting time distribution, with pdf fμ. Random variables drawn from these distributions dictate the lengths of time spent in the fast active transport, or running, phase of the VJP and the reorientation phase, respectively. After the reorientation phase, a velocity for the new run is chosen according to a transition kernel, fv. This transition kernel could incorporate a distribution of speeds as well as directions, or could rely on a fixed speed and specify only the directionality of the run. In the fixed speed case, where fΦ is a reorientation kernel descibing the angle change, and c is the constant running speed. For biological processes with a distinct separation of timescales between the running and reorientation phases, it is often possible to assume that reorientations between successive runs occur instantaneously and therefore neglect the reorientation phase in a model of the process [2, 5]. Repeated simulation from this model can be used to generate trajectories of individual particles (see Fig 1, for example).

thumbnail
Fig 1. Four example VJP trajectories, simulated with a uniform reorientation kernel, running speed c = 50μms−1, exponential running time distribution, fτ(t) = λe−λt, with reorientation frequency λ = 0.2 s−1 and no waiting time between runs, for a duration of Tfinal = 64s.

These trajectories start from the origin, and are orientated initially parallel to the positive x-axis.

https://doi.org/10.1371/journal.pcbi.1006235.g001

In the simplest case, where we assume that the running time distribution is memoryless, that is, exponentially distributed, with fτ(t) = λ exp(−λt), the long time behaviour of the mean squared displacement scales linearly with time, t [2]. Further moments of the motion of individuals displaying such VJP behaviour have been characterised by making certain closure assumptions [7]. For the case of a more general running time distribution, with finite mean and variance, in the large time limit the probability of the particle being at position x at time t follows a diffusion equation [5].

In practice, VJP models are often parameterised by obtaining measurements of the effective diffusion coefficient or the mean squared displacement, and using these data to estimate the parameters of a specific running distribution [2, 5]. Rosser et al. [4] parameterised a HMM via maximum likelihood estimation, whilst Nicosia et al. [21] fitted a hidden state random walk model to animal movement data using an expectation-maximization algorithm. These frequentist approaches can provide useful point estimates of the VJP parameters, and quantify the error in these estimates. However, Bayesian approaches can propagate uncertainty from both process noise and measurement noise, which can be crucial when dealing with noisy biological data (see Fig 1). In addition, they provide the added benefit that we can interpret the results of a Bayesian analysis as probabilistic statements about the model parameters. This quantification of uncertainty enables the generation of predictions of further biological behaviour using the model. In addition, we can consider the effects of noisy data measurements upon the accuracy of the inferred parameters distributions, something which has previously been difficult to deal with in practice, or has been neglected.

For simplicity, in this work we will assume that there is a separation of timescales between the lengths of the running and reorientation phases, such that reorientations can be considered instantaneous. In addition, we assume that the running time distribution, fτ, is exponentially distributed, that particles run at a fixed speed, c, and the reorientation kernel is a uniform distribution on [−π, π). That is fτ(t) = λ exp(−λt) and . We follow the trajectory of a single, motile individual, and take, as experimental measurements, the change in angle between successive observed positions (calculated relative to the previous observed position, see Fig 2), subject to measurement noise drawn from a wrapped Normal distribution, N(0, σ2), where σ is the magnitude of the noise. Our Bayesian inference approach will target estimation of the reorientation rate, λ, and the magnitude of the noise, σ.

thumbnail
Fig 2. The motile individual is observed at discrete times t = kΔt for k ∈ {0, 1, …, T} (marked by crosses).

In each time interval between observations, we measure the angle change between the observed direction of travel between successive observations. We assign a hidden state to each time interval according to whether a reorientation event occurred in that time interval or not. In this example, which excludes measurement noise, there is a reorientation, through angle ϕ, between observations at times t = kΔt and t = (k + 1)Δt; however, the measured angle change between these observations is θ1. Conversely, there is no reorientation event between observations at t = (k + 1)Δt and t = (k + 2)Δt and an observed angle change of θ2.

https://doi.org/10.1371/journal.pcbi.1006235.g002

Inference for velocity jump process models via particle Markov Chain Monte Carlo

Inference for partially observed Markov processes can be performed using particle Markov Chain Monte Carlo (pMCMC), as developed by Andrieu and Roberts [22] and Andrieu et al. [23]. pMCMC provides a Bayesian framework for parameter estimation by allowing samples to be drawn from the posterior distribution of the model parameters, given observed data, without needing to evaluate the likelihood function directly. For partially observed Markov process models, the model structure makes directly evaluating the likelihood difficult or expensive. (We note that VJP models can be viewed in this form by introducing hidden states, as explained in Section ‘Methods’.) Instead of evaluating the likelihood directly, we can use (unbiased) estimates of the likelihood within an MCMC algorithm. Estimating the likelihood of the observed data given certain parameters can be achieved with a particle filter (also known as a sequential Monte Carlo scheme) [24] for a fixed, finite, number of particles. The results of Andrieu et al. [23] demonstrate that, even when a finite number of particles are used in the filter to estimate the likelihood, the MCMC algorithm will still target the correct posterior distribution. These methods have been applied in the context of modelling epidemics [25] and biochemical reaction networks [26, 27]. However, pMCMC methods for parameter inference have not previously been applied to spatial agent-based models, such as the VJP model considered here. The details of the pMCMC algorithm used is this work are given in Section ‘Methods’.

Methods

We will demonstrate how to exploit pMCMC methods to obtain posterior parameter estimates for the VJP outlined in the Introduction by formulating the model within an appropriate framework that incorporates hidden states. This formulation additionally allows us to incorporate a model for measurement noise in our observations, as well as explicitly accounting for the temporal discretisation of the data. The hidden states (also known as latent variables) in our model will describe whether or not a reorientation event occurred between observations of the VJP. This is not a variable that we can observe directly, since we only measure the observed angle change. For example, in Fig 2 there was a reorientation event between observations at t = kΔt and t = (k + 1)Δt, and an observed angle change of θ1. On the other hand, there was no reorientation event between observations at t = (k + 1)Δt and t = (k + 2)Δt but still a non-zero observed angle change of θ2. Note that the true reorientation angle for the reorientation event that takes place between observations at t = kΔt and t = (k + 1)Δt is ϕ.

We assume that we observe the system at times {kΔt: k ∈ {0, 1, …, T}}, up until a final time Tfinal = TΔt, by measuring the position of the individual of interest and recording the observed angle change. We define the hidden variable, Xk, as follows: (1) The observed state is the observed angle change, θk, obtained as the difference between the observed direction of travel during [(k − 1)Δt, kΔt) and [kΔt, (k + 1)Δt), k ≥ 1, as illustrated in Fig 2. For example, if we directly observe as data the positions {(xk, yk): k ∈ {0, 1, …, T}}, then we can calculate the observed angle change as We illustrate the sequence of hidden and observed states in Fig 3, with dependence between these states given by the transition and emission probabilities, denoted βi and pij, respectively. The hidden variable, Xk, evolves according to the VJP model. Here, since we have an exponential distribution for the running time, P(reorientation event in [(k − 1)Δt, kΔt)) = 1 − exp(−λΔt).

thumbnail
Fig 3. Hidden and observed states in a partially observed Markov process model.

We observe an angle change, θk, which is dependent on a hidden state Xk defined in Eq (1). This dependency is shown by the arrows between states. Here, we require dependencies on the previous hidden state, Xk−1, since observation times will not coincide with reorientation events in general. Additionally, we introduce an extra layer of state dependencies to capture the measurement noise in this noisy biological system. We suppose that θk is the true observed angle change and our noisy observed version of this is zk, with dependencies shown by the arrows.

https://doi.org/10.1371/journal.pcbi.1006235.g003

We assume that reorientation events are rare relative to the sampling rate such that λΔt ≪ 1. This assumption allows us to neglect multiple reorientations in a single time interval, enabling us to greatly simplify the problem, so that it is tractable via the binary hidden variables Xk. We highlight that the model structure as shown in Fig 3 is an approximation. In reality, observed data will not be a function of only the current and previous hidden states, Xk and Xk−1, but may depend on other previous states also. The assumption that λΔt ≪ 1 enables us to neglect dependence on earlier states.

Model without measurement noise

We initially make progress by simplifying the problem and its exposition via the assumption of zero measurement error. To relate the unobserved hidden state to the observed angle change, which we can measure, we derive probability distributions for the angle change given the hidden state. Note that there is additional dependence not only on the current hidden state, but also on the previous hidden states, as shown in Fig 3. We can obtain expressions for the emission probabilities by considering the path that is taken under different sequences of hidden states (Fig 4); we will outline simplifying assumptions as they are made.

thumbnail
Fig 4. Example paths for each pattern of hidden states.

The particle of interest moves from left to right, and is observed at times (k − 1)Δt, kΔt, and (k + 1)Δt. In a), there are no reorientation events, so the particle continues along a straight trajectory. In b), there is a single reorientation in the time interval [kΔt, (k + 1)Δt), where the particle turns through an angle ϕ, but the observed angle change is θ1 due to the discrete nature of the observations. In c), there is a single reorientation in the time interval [(k − 1)Δt, kΔt). The particle turns through an angle ϕ and continues on its new trajectory. We observe an angle change θ1 for the time interval [(k − 1)Δt, kΔt), and observe an angle change θ2 for the next time interval [kΔt, (k + 1)Δt) even though there was no reorientation during this time interval. Similarly, in d), there was a reorientation of true angle change ϕ during [(k − 1)Δt, kΔt) followed by another reorientation of true angle change ϕ′ in the time interval [kΔt, (k + 1)Δt). In this case, we observe angle change θ1 for the time interval [(k − 1)Δt, kΔt), and observe an angle change of for the time interval [kΔt, (k + 1)Δt).

https://doi.org/10.1371/journal.pcbi.1006235.g004

Suppose we observe the system at times (k − 1)Δt, kΔt, and (k + 1)Δt for some k, as shown in Fig 4. The sequence of hidden states i, j, corresponds to whether there were reorientation events in the time intervals [(k − 1)Δt, kΔt) and [kΔt, (k + 1)Δt), respectively. Let pij(θ) be the pdf of observing a reorientation angle of θ given the sequence of hidden states i, j. We assume angle change θ was observed over the time interval [kΔt, (k + 1)Δt) corresponding to the hidden state j. In the case where no reorientation occurs in either of the time intervals (corresponding to the situation in Fig 4a)), then, assuming no noise, we would observe zero angle change. That is, we have p00(θ) = δ0(θ). If a reorientation occurs in the time interval [kΔt, (k + 1)Δt), but not in the preceding time interval, [(k − 1)Δt, kΔt), (as shown in Fig 4b)), then the observed reorientation angle is θ1, as labelled in Fig 4b). This gives . The marginal distribution of Θ1 is derived in the Section ‘Derivation of emission probabilities’, and depends on the running time distribution and the reorientation kernel.

If, immediately after a reorientation, we have no reorientation in the following time interval, then we may still observe a nonzero angle change since our discrete observation times do not in general coincide with the reorientation events. In this case, we have a reorientation event during [(k − 1)Δt, kΔt), and no reorientation during [kΔt, (k + 1)Δt). Such a situation with a pattern of hidden states 1,0 is shown in Fig 4c), and the observed angle change during the time interval [kΔt, (k + 1)Δt) corresponds to θ2 in the diagram. We note that, by geometric arguments, we have (2) where ϕ is the true angle change. Therefore, given a pattern of hidden states 1, 0 the angles θ1 and θ2 are not independent, and we have . Note that throughout we will assume that we can observe the previous angle change directly, which is equivalent to assuming that the pattern of hidden states is in fact 0, 1, 0.

The remaining cases involve reorientation events in successive time intervals. Suppose we have the case where we have two successive reorientation events, giving a pattern of hidden states 1, 1, which is shown graphically in Fig 4d). This case is similar to the case with hidden states 1, 0 (see Fig 4c)), in that we observe a contribution to the reorientation angle, θ2, from the correction for the previous reorientation event, and also a contribution from the new reorientation event, . As can be seen in Fig 4d), these contributions sum to give an observed angle change for the time interval [kΔt, (k + 1)Δt) of . The probability density for sums of random variables can be expressed as a convolution [28]. Hence, , where * is the convolution operator. Any further cases involve multiple reorientation events within a single time interval which occurs rarely, with probability on the order of . We can safely neglect these provided λΔt ≪ 1.

To summarise therefore, the emission probabilities (that is the probability of observing a certain angle change given the sequence of hidden states) in the case without noise can be given as follows:

Model with measurement noise

To account for noise, we introduce another layer of states in our diagram of state dependencies, as shown in Fig 3. Let zk be the noisy observed angle change at time kΔt, which is a noisy observation of θk, such that for a noise model K we have zkK(⋅, θk). Alternatively, we can write the noisy observed angle as (3) where ϵkK(⋅, 0). Under the assumption of a wrapped normal noise model, ϵkN(0, σ2), where σ is the magnitude of the measurement noise. Since Eq (3) represents zk as a sum of random variables, it becomes clear that we can obtain the noisy emission probabilities, qij, corresponding to a pattern of hidden states i, j via the following convolution: To extend our previous results to the noisy case, we therefore need to compute these convolutions, a task that must be carried out numerically.

Derivation of emission probabilities

In previous work [29], marginal distributions for the observed angle change, Θ1, were obtained and we summarise the arguments here. Fig 5 shows the true angle change, Φ, and the observed angle change, Θ1, based on discrete time observations of the VJP for the case of hidden states 0, 1.

thumbnail
Fig 5. True and observed angle changes (Φ and Θ1, respectively) based on discrete time observations of a VJP without measurement error.

Cartesian co-ordinates X and Y are shown in addition to polar co-ordinates R and Θ1. The particle is moving along the trajectory from left to right as shown by the arrows.

https://doi.org/10.1371/journal.pcbi.1006235.g005

By changing co-ordinates from the displacement, L, (which is given by the running time distribution fτ) and true angle change, Φ, to Cartesian co-ordinates X and Y, and then to polar co-ordinates, R and Θ1, we can obtain a joint distribution for R and Θ1 of where C = {rcΔt, θ ∈ [−π, π)} is the set of permissible values for r and θ, Δt is the discretisation in time and c is the running speed. Integrating over r allows us to obtain the marginal distribution for θ1 via In the case where we assume a uniform reorientation kernel, , then this integral becomes We note that the marginal distribution for θ1 does not depend on the speed, c, or the time discretisation, Δt. This is intuitive because is the pdf of a certain observed angle change, θ1, given there was a reorientation in that interval, irrespective of the length of that interval.

Derivation of the joint distribution of Θ1 and Θ2.

By considering the displacement in the X and Y directions during a time step, we have Dividing these expressions, we can relate θ1 to l and ϕ by (4) and rearranging for l, we have Differentiating with respect to θ1, we obtain which can be simplified to give To transform from coordinates (L, Φ) to coordinates (Θ1, Φ), we can use the Jacobian JL, Φ, where (5) Therefore the joint distribution of θ1 and ϕ is where C = {(θ1, ϕ):ϕ ∈ [−π, π), θ ∈ (min(0, ϕ), max(0, ϕ))}. Under the assumption that ϕ is uniform on [−π, π), as described in the Section ‘Velocity jump process models’, such that , we have (6) Changing variables again to (θ1, θ2), via ϕ = θ1 + θ2, we have (7) where C = {(θ1, θ2): θ1 + θ2 ∈ [−π, π)}. From this joint distribution, we can then find the conditional distribution of θ2 given θ1, which is required to give .

Particle MCMC algorithm

The pMCMC algorithm used in this work is given in Algorithm 1 for an observed dataset y = {yk | k = 1, 2, …, T}. We use a Metropolis-Hastings MCMC algorithm [30], proposing new parameters using a proposal distribution q(⋅|θ). In step 6 of Algorithm 1, we need to evaluate the likelihood, p(y|θ), in calculating the acceptance probability for the proposed move. We replace the likelihood with an unbiased estimate of the likelihood, , obtained from a particle filter.

Algorithm 1 Particle MCMC

1: Initialise parameters, .

2: Run a particle filter (see Algorithm 2) to compute an estimate of the marginalised likelihood , where is the observed data.

3: for do

4:  Draw parameters from a proposal distribution .

5:  Run a particle filter to compute an estimate of the marginalised likelihood .

6:  Accept the proposed move with probability where

 If the move is accepted, set , otherwise set .

7: end for

Algorithm 2 Bootstrap particle filter

1: Sample a collection of particles from an initial density .

2: for do

3:  Compute the weights for each particle, , via

4:  Find the normalised weights

5: end for

6: Resample times with replacement from the collection of particles with probabilities given by the normalised weights .

7: for do

8:  for do

9:   Evolve the current collection of particles according to the forward model, by drawing

10:   Compute the weights for each particle, , via

11:   Find the normalised weights

12:  end for

13:  Resample times with replacement from the collection of particles with probabilities given by the normalised weights .

14: end for

15: Obtain an estimate of the marginal likelihood using the unnormalised weights, via (16)

16: return

Details of the bootstrap particle filter algorithm [24] used within the pMCMC algorithm are given in Algorithm 2. A particle filter represents the state of the system via a population of weighted particles [31]. We obtain an estimate of the likelihood by successively updating the hidden state of the system (represented via the particles) and comparing this hidden state with the observed data at each observed time point, to give new weights for the particles according to how well they match the observed data. To prevent a degenerate situation where the state of the system is represented solely by a single particle, it is necessary to resample from the population of particles according to their weights.

Implementation of Markov Chain Monte Carlo methods

We use a Metropolis-Hastings algorithm to run the MCMC algorithm with a bootstrap particle filter [24] using 400 particles to provide an estimate of the likelihood. As a proposal distribution, we use the kernel K(., θ) ∼ N(θ, Σ), where We run the Markov chain for N = 50,000 steps with thinning of m = 2 which gives a minimum effective sample size of neff = 743 for the slowest chain to converge, corresponding to the smallest value of Δt. We demonstrate the convergence of our Markov chains in S4 Fig. We choose the number of particles in the filter by considering the variance in estimating the log likelihood using the particle filter at the true values of the parameters used to generate the synthetic data, and balancing this with the time needed to run the particle filter to obtain a single estimate. The variance and computational cost are shown in S5 Fig. To further tune the number of particles for optimal efficiency of the pMCMC algorithm, the recommendations in Sherlock et al. [32] and Pitt et al. [33] could be employed. The prior is uniform on the log of the parameters over the intervals [−1.70, 1.30] and [−5, 1] for λ and σ, respectively.

Approximate Bayesian computation

An alternative method commonly used for parameter estimation for models with intractable likelihoods is approximate Bayesian computation, or ABC [34, 35]. Suppose we can simulate data from a generative model, xg(x|θ). An example would be simulating a path from our VJP model. To generate samples from the posterior distribution, we repeatedly generate parameters from the prior, θπ(θ), and use these parameters in our model to simulate synthetic data, xg(x|θ). We compare the simulated data with the true observed data and, if it is similar enough, we accept the sampled parameter as a sample from the posterior. In cases where the data are discrete, we can consider whether simulated data, x, are equal to observed data, y, and accept parameters correspondingly, which gives exact samples from the posterior. Often, though, our models provide continuous data and the acceptance rate from an exact comparison is prohibitively small. Instead we can take a distance function and consider when the distance between x and y is within a chosen tolerance ϵ, that is d(x, y) < ϵ. This introduces an approximation which becomes exact only in the limit ϵ → 0.

In practice, datasets can be high dimensional, resulting in very low acceptance rates for simulated data. By using summary statistics of data instead of the full datasets, we can reduce the dimensionality and obtain higher acceptance rates. Using (insufficient) summary statistics introduces another approximation into the method, meaning we compare d(s(x), s(y)) < ϵ, where s(x) gives the summary statistics for dataset x. We present the ABC method in Algorithm 3, and will show a comparison between application of pMCMC and ABC for parameter estimation of a VJP model based on time series data.

Algorithm 3 ABC rejection sampling

1: Sample parameter value from the prior .

2: Generate synthetic dataset from the model .

3: Accept sample if , for a distance function, , summary statistics, , and tolerance .

In practice, the choice of summary statistics can have a strong effect on the efficiency of the ABC algorithm. Here, we perform ABC with three different choices of summary statistic. First, we consider using the full data, a vector of the noisy observed angle changes, zi, i = 1, …, T. Second, we use a simple threshold-based summary statistic which produces a count of the number of observed angle changes with magnitude above a certain threshold, h. That is . This provides an intuitive one-dimensional summary statistic. Finally, we use a transition matrix as a summary statistic of the data, an approach described by Jones et al. [13]. A transition matrix allows us to summarise time series data via a two-dimensional binning of the observed angle changes, based on the current observed angle change and the previous observed angle change. We bin the observed angle change time series data into an n by n matrix, where n = 5. This summary statistic provides a much lower dimensional summary of the data for small values of Δt, although for large values of Δt we may end up with higher dimensional (sparse) data compared to the full time series data, as in this case there will be fewer than n2 observations. The distance function, d(s(x), s(y)), used also plays a role in the parameter estimation possible via ABC, and has been investigated in other work [13, 3638].

In using Algorithm 3, we generate N = 1,000,000 synthetic datasets based on parameters sampled from the prior, calculate the Euclidean distance between these synthetic datasets and our observed data, and select the parameters corresponding to the 0.1% of the datasets closest to the observed data. As for the pMCMC approach, we take a duration of time series Tfinal = 64s, and a prior uniform on the log of the parameters on the intervals [−1.70, 1.30] and [−5, 1] for λ and σ, respectively.

Results

We are able to investigate the effects of experimental design such as the discretisation, Δt, of a time series on the estimated biophysical parameters, using the framework for inferring the parameters of a VJP described in the Section ‘Methods’. We demonstrate the effects of restrictions imposed by experimental constraints on a trade-off between discretisation in time versus measurement noise. Our results can be used to provide guidance for experimental design choices. As we vary experimental design hyperparameters, such as Δt, we will produce a separate posterior distribution for each of the model parameters, λ and σ, for each value of the hyperparameter. This approach allows us to illustrate directly the effects of the experimental design hyperparameter on the inferred posterior distributions.

Increasing Δt results in an abrupt breakdown in the posterior

We first assume that the magnitude of the noise on the observed angle, σ, is fixed. We vary the sampling frequency used to collect the data (corresponding to using different values of Δt). We generate observed (in silico) data by simulating one single trajectory directly from the VJP model. We discretise this trajectory to generate observed angle changes with a temporal resolution of Δt, and add independent Gaussian noise with zero mean and variance σ2 to these observations, to represent measurement noise. Using datasets generated from the same simulated path discretised at different temporal resolutions, Δt, as shown in Fig 6, we infer the posterior distribution for the reorientation rate, λ, of the VJP model and the magnitude of the measurement noise, σ, via pMCMC.

thumbnail
Fig 6. The data used in the pMCMC inference discretised at differing resolutions.

The true trajectory is shown without discretisation in a). The same path is shown in b) with circle markers to show the observed positions with Δt = 4s and filled triangular markers to show observed positions with Δt = 8s. The corresponding observed angle changes are shown in c) for different values of Δt. Observations corresponding to multiple reorientations in a single time interval are highlighted in red. Parameters used in generating these data were a reorientation rate of λ = 0.2 s−1, a run speed of c = 50μms−1 and a total duration of observation of Tfinal = 64 s. A circular uniform distribution was used for the reorientation kernel. Observations are shown without measurement noise.

https://doi.org/10.1371/journal.pcbi.1006235.g006

The results of using pMCMC to infer model parameters are shown in Fig 7 for Δt = 1/8, 1/4, 1/2, 1, 2, 4, 8 seconds, with noise of unknown magnitude σ, where σ = 0.04 rad. The estimated posterior distributions are very similar for small Δt, but we observe an abrupt breakdown in the quality of the posterior distributions obtained as Δt is increased. This breakdown in posterior quality arises for values of Δt ≥ 2s, as multiple reorientation events in a time interval start to become more common (see also S2 Fig). Provided that Δt and λ are such that the probability of multiple reorientations in a time interval is small, we obtain accurate estimates of the joint posterior distribution for the model parameters.

thumbnail
Fig 7. Results of parameter estimation via pMCMC for the reorientation rate, λ, and measurement noise, σ.

Data collected with different values of Δt and fixed measurement noise, σ = 0.04 rad, were used in a) and b). Data generated with a fixed value of Δt = 1 s and different values of the measurement noise, σ, were used in c) and d). The marginal posterior distributions for λ are given in a) and c), while the marginal posterior distributions for σ are shown in b) and d). The red dashed lines indicate the true values of parameters used in simulation of the datasets. A reorientation rate of λ = 0.2 s−1 was used throughout.

https://doi.org/10.1371/journal.pcbi.1006235.g007

Increasing σ increases the variance of the posterior for λ

To investigate sensitivity of the posterior to measurement noise, we now fix the discretisation, Δt, and vary the measurement noise amplitude, σ, used to create each dataset. Applying the same analysis as in the previous subsection, for a fixed value of Δt = 1s, we obtain posterior distributions for the reorientation rate, λ, and measurement noise, σ, as shown in Fig 7. We find that increasing the noise amplitude, σ, increases the variance in the posterior distribution obtained for the reorientation rate, λ. In addition, we are able to accurately estimate the value of σ used to generate the datasets.

We note that the presence of noise in the datasets results in a bias towards smaller estimates of the reorientation rate, λ. This can be explained intuitively by reasoning that some reorientations leading to very small observed angle changes are mistaken for noise as σ increases.

More data provides better estimates

Let Tfinal be the total duration of our observations of the system. For a fixed value of the time discretisation, Δt, varying Tfinal is equivalent to gathering a bigger dataset. We fix Δt = 1s and allow Tfinal to vary so that we can consider the effects of collecting more data. In practice, collecting more data experimentally may come at a cost. Quantifying the benefits of collecting more data with an in silico model can aid decisions about whether it is worthwhile to collect a larger dataset. Posteriors for the reorientation rate, λ, are shown in Fig 8.

thumbnail
Fig 8. Results of parameter estimation via pMCMC for the reorientation rate, λ, with Δt = 1s and σ = 0rad, whilst varying Tfinal to give different sized datasets.

https://doi.org/10.1371/journal.pcbi.1006235.g008

It is clear that larger datasets result in less bias and less variance for estimates of the reorientation rate, λ. However, we note that running the particle filter within the pMCMC algorithm becomes much more computationally intensive as the size of the dataset increases, scaling as as the size of the dataset increases.

Experimental constraints

We investigate the implications of these results for experimental design by considering an imaging experiment to observe the position of a particle of interest (for example, a bacterium) at regular time points. The behaviour of our system of interest happens over a certain timescale inherent to the biological process, so we fix the total duration for the imaging experiment, Tfinal, based on this timescale. We consider how best to choose the time between successive observations, Δt, given the restriction of a fixed photon budget. That is, we assume that the biological sample can only be exposed to a fixed number of photons before phototoxicity or photobleaching significantly reduce the quality of, or destroy, any further potential data. We assume that the sample is only exposed to photons during imaging. The results of Zhao et al. [39] suggest that the signal to noise ratio (SNR) is proportional to the square root of the time between successive frames, Δt, giving a relationship of the form (8) where κ is a constant that depends on the imaging set up, but not other experimental design choices. Assuming that for our model the noise is of magnitude σ, we find an inverse square relationship between σ and Δt, such that (9) for a new constant K which is κ times the average angle change.

To investigate how to choose Δt given a fixed photon budget, we set a value of the proportionality constant and vary σ and Δt according to this relationship. We take proportionality constants K = 0.08 and K = 0.8 in Fig 9. A larger value of the proportionality constant, K, corresponds to worse imaging conditions, in that for a fixed value of Δt, the noise in the images obtained will be greater. Therefore we expect our inference method to perform worse for a larger value of K. We then ask the question, for a given value of K, how should we choose Δt to improve the parameter estimation process? Our results in Fig 9 suggest that the value of Δt should be taken as small as possible, even if this increases the noise present in the data.

thumbnail
Fig 9. Results of parameter estimation via pMCMC for the reorientation rate, λ, and measurement noise, σ, using data generated with a fixed value of Δt and different values of the measurement noise, σ.

We have varied both σ and Δt with an inverse square root relationship between these, as in Eq (9). The posterior distributions for λ are given in a) and c), while the posterior distributions for σ are shown in b) and d). The proportionality constant is K = 0.08 for a) and b), and K = 0.8 for c) and d).

https://doi.org/10.1371/journal.pcbi.1006235.g009

Comparison to approximate Bayesian computation

A common approach for mathematical and computational models where the likelihood is intractable is to apply ABC for parameter inference [34, 35]. Although ABC produces samples from an approximate posterior, rather than the exact posterior distribution, it is intuitively simple to understand and implement. ABC methods have been applied to parameter estimation for biased, persistent random walk models very similar to our VJP model [12, 13, 40, 41]. For these methods, the choice of which summary statistics to use has notable effects on the approximate posterior distributions obtained via ABC. We consider three different summary statistic choices: the full time series of observed angle changes, a threshold-based summary statistic, and a transition matrix summary statistic.

We compare the quality of resulting posterior distributions obtainable with ABC to those from pMCMC. Our results, shown in Fig 10, indicate that the choice of summary statistic has a substantial effect on the inferred posterior distribution. When the dimensionality of the data observed is high, ABC performs poorly at approximating the posterior for the parameters of our VJP, which we were able to sample exactly using pMCMC. For the full time series summary statistic, the dimensionality is high (up to 511 dimensions for Δt = 0.125s), meaning that the approximation we obtain to the posterior is very poor (see Fig 10a) and 10b)). Since reorientation events are rare, particularly for small Δt, a greater fraction of the distance between simulated datasets and observed data is accounted for by the observation noise. ABC excludes part of the parameter space considered in the prior, but does poorly at identifying the reorientation rate, λ, since reorientations are rare events.

thumbnail
Fig 10. Parameter estimation via ABC rejection sampling for the reorientation rate, λ, in a), c), and e) and for the measurement noise, σ, in b), d), and f) using different summary statistics and data collected with noise of magnitude σ = 0.04rad, for different values of Δt.

N = 1,000,000 datasets were generated and parameter samples corresponding to the closest 0.1% of the datasets were retained to give the approximate posterior. As summary statistics, we take the full time series of observed angle changes in a) and b). In c) and d), we use a simple threshold summary statistic counting the number of observed angle changes with magnitude above a threshold, here taken as h = 0.1 rad. In d) and e), we use a transition matrix summary statistic using n = 5 bins to discretise the observed angle change.

https://doi.org/10.1371/journal.pcbi.1006235.g010

The threshold summary statistic is effective at identifying the number of reorientation events and provides a reasonable approximation of the marginal posterior for λ. However, it offers very little information about the observation noise, σ, except in relation to the threshold chosen. As a result the approximate marginal posteriors for σ are poor (see Fig 10c) and 10d)). Another summary statistic more informative about σ could be used in addition here to improve results.

When using the transition matrix summary statistic, the approximate posteriors for small Δt provide better estimates of the true reorientation rate and are much closer to the true posterior, as shown in Fig 10e) and 10f). The transition matrix is able to capture the distribution of the angle changes and dependence on recent history, whilst also reducing the dimensionality of the data. However, the transition matrix is unable to distinguish angle changes smaller than the resolution of the discretisation, 2π/n, and so offers limited information about the measurement noise, σ. This issue could be mitigated by increasing the number of bins used, n, to give a finer discretisation of the observed angle changes (which would require sufficient data), or by targeting the noise with an additional summary statistic.

We still notice a deterioration in the quality of the approximate posterior distributions as Δt increases when using ABC for each choice of summary statistic even though there are no explicit assumptions made about multiple reorientations within a time interval, as was the case when using pMCMC. This suggests that the lack of information content about the parameters may be a property of the data, rather than the inference method. For large values of Δt, the data contains limited information about the model parameters, particularly the reorientation rate, λ.

To obtain potentially improved results for inference with ABC for this type of data (time series observations of angle changes), we could consider a more systematic choice of summary statistics [4244] or a more efficient version of the ABC algorithm, such as ABC-SMC [45, 46], which applies sequential Monte Carlo methods to generate a sequence of approximations to the posterior distribution, using the previous posterior distribution to propose parameters for the next approximation. Other improvements could be possible by applying a regression adjustment, via linear regression [35] or using nonlinear regression techniques such as with a neural network [47].

Model misspecification

In general, in applying a model to a real world dataset, any model that we choose will be an approximation of the true data generating process. Before applying our inference methods to real world data, where we will fit to a model that is but an approximation to the true biological process, it is important to check the robustness of our method to fit a misspecified model. We investigate this robustness computationally by considering a misspecified model which should be no longer misspecified in an appropriate limit. It has previously been shown by Frazier et al. [48] that inference with ABC on a misspecified model can concentrate posterior mass around different pseudo-true parameter values depending on the version of ABC used.

Here, we demonstrate the robustness of our inference framework, using the pMCMC algorithm within a hidden states framework, to give relevant estimates for a misspecified model. We generate synthetic datasets using a wrapped normal reorientation kernel with dispersion parameter γ, but choose to estimate the model parameters with a model different to the data generating process by assuming a uniform reorientation kernel. As previously, we attempt to infer posterior distributions for the reorientation rate, λ, and the measurement noise, σ. The effect of misspecifying the model will be that some of the transmission probabilities used in the particle filter estimate of the likelihood of model parameters given the data will be wrong (see S3 Fig). This will give rise to some bias in our estimates of the likelihood within the particle filter (as demonstrated by Fig 11), and hence to some bias in our final approximation of the posterior. We will consider the effect of varying the dispersion parameter, γ, in the reorientation kernel used to generate the synthetic dataset, on the resulting posterior distribution for the model parameters. In the limit γ → ∞, the wrapped normal distribution converges to the uniform distribution on (−π, π], meaning that the model is no longer misspecified. The misspecification of the model is most pronounced for smaller values of γ.

thumbnail
Fig 11. Estimates of the log likelihood via a particle filter for data generated with the true model and under a misspecified model.

In a) and b), we show estimates of the log likelihood for different values of λ and σ based on a dataset simulated with a uniform reorientation kernel, and assuming this same reorientation kernel in estimating the log likelihood via the particle filter. The peak of the log likelihood coincides with the true values of the parameters shown by the red dashed line. In c) and d), we show estimates of the log likelihood for different values of λ and σ based on a dataset simulated with a wrapped normal reorientation kernel with dispersion γ = 0.1. However, we assume a uniform reorientation kernel in the particle filter. This model misspecification results in a slight shift in the peak of the log likelihood compared to the true parameter values used shown by the red dashed line. The log likelihood estimates will be biased in the case of the misspecified model.

https://doi.org/10.1371/journal.pcbi.1006235.g011

We find, for a fixed value of the measurement noise, σ, that a reasonable approximation to the true posterior can be obtained for values of the dispersion, γ, larger than the measurement noise, σ. The approximate posterior distributions obtained are shown in Fig 12. When the dispersion, γ, is a similar magnitude to the measurement noise, which here is σ = 0.04rad, then reorientation events are frequently missed resulting in low estimates of the reorientation rate. For larger values of the dispersion parameter, we are able to robustly sample from the posterior distribution for λ and σ, despite notable differences in the misspecified reorientation kernel compared to the assumed uniform reorientation kernel. In this case, when a reorientation occurs, the observed angle changes are sufficiently different to the background measurement noise, that often it is interpreted as a reorientation event by the particle filter, despite error in the emission probabilities due to the model misspecification.

thumbnail
Fig 12. The posterior distribution obtained under model misspecification converges to the distribution obtained with the true model as the misspecified model approaches the true model.

We assume a wrapped normal reorientation kernel with dispersion parameter γ and vary this dispersion parameter. For large γ, this tends in distribution towards a uniform reorientation kernel, as shown in a). The posterior distributions obtained from performing inference with data generated for different values of γ are shown in b), with the corresponding marginal distributions shown in c) and d). In b), the true parameter values used are shown by the dark blue circle. In c) and d), the true parameter values are given by the red dashed line. Parameters used to generate the synthetic datasets are λ = 0.2 s−1, σ = 0.04 rad, Δt = 0.25 s, c = 50μms−1, and the details of the pMCMC are as in Fig 7.

https://doi.org/10.1371/journal.pcbi.1006235.g012

Application to RNA transport dataset

To demonstrate the effectiveness of our inference framework, we apply it here to a dataset of tracks obtained from imaging the transport of RNA-protein (RNP) complexes in a Drosophila oocyte. These complexes move on microtubules via molecular motors [1]. Occasionally, the complexes fall off a microtubule, and reattach on a different microtubule moving in a different direction. We neglect the diffusive stationary phase between falling off and reattaching on microtubules, meaning that we assume a running phase is followed immediately by another running phase. We apply our modelling framework, assuming an exponentially distributed running time distribution, fτ(t) = λ exp(−λt), with constant running speed, c, and a uniform reorientation kernel, .

We take 10 tracks of separate complexes moving in Drosophila nurse cells obtained from an in vivo imaging dataset of movements of staufen protein available in Zimyanin et al. [49]. Each track is short since the dataset is imaged in a single plane in the z direction, meaning that complexes move out of the frame of view frequently. The tracks used are shown in Fig 13a). We infer a subposterior distribution for each individual track and combine these together to produce a single posterior distribution combining data from different tracks. To combine the subposteriors, we use the consensus Monte Carlo algorithm of Scott et al. [50] developed for running MCMC on large datasets, via the implementation in the R [51] package parallelMCMCcombine [52]. This produces an approximate posterior distribution for the turning rate, λ, and measurement noise, σ, as shown in Fig 13b). We find 95% credible intervals for the turning rate λ of [1.06, 1.58] s−1 and for the measurement noise σ of [0.27, 0.47] rad, respectively. Based on a VJP model corresponding to Brownian motion, the relation λ = c2/(2D) holds for a constant running speed c and diffusion constant D. The running speed in active transport for staufen RNPs is of the order 0.5μms−1 [49], and based on the size of the staufen protein, we can estimate its diffusion coefficient as 1μm2s−1 [53], which gives a similar order of magnitude for λ as the estimate obtained here. Evidently the dataset used here is very noisy, but nonetheless we are able to obtain estimates for the turning rate in a model of RNA transport.

thumbnail
Fig 13. Posterior distribution for model parameters λ and σ for RNA transport in a Drosophila oocyte.

The results are based on 10 tracks labelled manually and sampled at a resolution of Δt = 0.478 s, with subposteriors obtained for each track and combined to give a single posterior via Consensus Monte Carlo. The tracks are shown drawn on the first frame in a), although not all tracks start from the same time point. The marginal posterior distributions for the reorientation rate, λ, and the measurement noise, σ, are shown in b).

https://doi.org/10.1371/journal.pcbi.1006235.g013

Discussion

In this work, we have considered parameter estimation of a VJP model for biological transport and how insights from this parameter inference process can inform experimental design. We generated estimates of the posterior distribution for parameters of a VJP model based on noisy datasets collected at varying temporal resolutions. To perform this parameter inference, we used pMCMC and derived the appropriate emission probabilities. We observed an abrupt breakdown in the quality of the posteriors obtained when decreasing the temporal resolution. This transition corresponds to a breakdown in modelling assumptions underlying the derivation of the emission probabilities. For example, as Δt increases we see multiple reorientations in a time interval. These assumptions are necessary in the development of our inference framework which relies on hidden states consisting of binary variables indicating whether or not a reorientation occurred in a time interval.

Increasing the magnitude of the noise in the data slowly decreased the quality of the median posterior estimates and increased the posterior variance. In general, better estimates were obtained when more data was available, either by decreasing Δt, or by increasing the total duration of the experimental observations, Tfinal. These results are intuitive, but the real benefit here is in quantifying the effects of choices of these experimental hyperparameters to enable researchers to make decisions about experimental design.

We also compared parameter estimation with pMCMC to that with ABC, and results suggest that different methods are appropriate in different situations, dependent on the data available. In particular, pMCMC performs well provided the assumptions made in deriving the emission probabilities hold, which is true for small Δt. For larger values of Δt, neither ABC nor pMCMC provide accurate samples from the full posterior distribution. The information about the model parameters present in the data for large Δt is limited. In terms of computational cost comparisons between ABC and pMCMC, an ABC rejection sampler is highly parallelisable over independent parameter samples [54] whereas pMCMC is less amenable to simple parallelisation. Making use of this parallelisation, our implementation of ABC rejection sampling was more computationally efficient than pMCMC. We show a quantitative comparison between the methods in S6 Fig. We note that the computational cost is strongly problem and implementation dependent.

Through comparison between inference with pMCMC and ABC, our results highlight a weakness of inference with ABC, in that it performs poorly for high dimensional data, and also a weakness of our application of pMCMC, which is that it relies upon assumptions about the number of reorientation events in a time step; when this fails our posterior estimates are no longer accurate. We note, additionally, that pMCMC allows us to sample from the exact posterior for parameters from a model (given that model is appropriate to describe the data), whereas via ABC we obtain approximate posteriors for a fixed tolerance, ϵ, which only become exact in the limit as ϵ → 0.

In Section ‘Methods’, we described a framework for inference using pMCMC, and made certain assumptions about the VJP model, such as a separation in timescales between runs and reorientations, and a memoryless exponential distribution for the running time. Although many of these are standard assumptions, it would be possible to perform the same analysis in a more general model. For instance, if a running time distribution was chosen that does not satisfy the memoryless property, we could introduce an extra hidden variable, s, for the time since the last reorientation. In addition, in this work we have described parameter estimation via a hidden states formulation of the VJP model using dependence on hidden states from two time intervals. This can be extended to hidden states from three time intervals, which can allow rare consecutive reorientation events to be handled more accurately, albeit at greater computational cost.

Given a fixed photon budget and a trade-off between temporal sampling frequency and measurement noise, our results in the Section ‘Experimental constraints’ indicate that a small value of Δt should be used for the discretisation in time i.e. that motile individuals should be imaged as frequently as possible. In practice, there may be disadvantages to this choice of Δt. Computationally, conducting parameter inference via pMCMC will be significantly more expensive, although the computational run time may still be small in comparison to the duration of an experimental protocol. Datasets with higher noise present may also be much harder to interpret. Here, we have assumed that the noise present in the data is applied to the observed angle change. In reality there is noise on each pixel, which may contribute to an uncertainty in identifying the observed position of the object of interest.

Additionally, we considered using our particle MCMC inference framework to fit data from a misspecified model; our results indicate that the framework is robust to moderate amounts of model misspecification. This allowed us to be confident in applying our framework to a real dataset of RNA tracks to consider the motility of RNA-protein complexes moving on the cytoskeleton, demonstrating the ability of our framework to obtain parameter estimates in challenging conditions with very noisy data.

Supporting information

S1 Appendix. Comparison with simulations.

Description of simulations to verify the analytic form for the emission probabilities.

https://doi.org/10.1371/journal.pcbi.1006235.s001

(PDF)

S1 Fig. Comparison with simulations.

Comparison between simulated results and theoretical predictions of the observed angle change for hidden states of the form 0, 1 in a), 1, 0 in b), and 1, 1 in c). The simulated results are shown by the histogram and the theoretical prediction for the observed angle change distribution is shown as the red dashed line. For both b) and c), we have conditioned on an observed angle change in the previous time interval of 0.1 rad, and for c) we also conditioned on an observed angle change prior to that of −1.0 rad. To generate these results, we used N = 107 simulated trajectories with running speed c = 50μms−1, uniform reorientation kernel, reorientation rate λ = 0.2 s−1 and time discretization Δt = 1 s.

https://doi.org/10.1371/journal.pcbi.1006235.s002

(TIF)

S2 Fig. Probability of hidden state sequences.

The probability of sequences of hidden states as Δt varies with reorientation rate λ = 0.2 s−1. For large values of Δt, the assumptions of the model start to break down as multiple reorientations appear within a single time interval.

https://doi.org/10.1371/journal.pcbi.1006235.s003

(TIF)

S3 Fig. Emission probabilities for misspecified model.

Comparison between assumed distribution of observed angle changes and simulated distributions when using a misspecified model for the reorientation kernel. We simulate N = 107 angle changes using a wrapped normal reorientation kernel with dispersion parameter γ given a certain sequence of hidden states (0, 1 in a), d), g); 1, 0 in b), e), h); 1, 1 in c), f), i)) and show a grey histogram of the simulated observed angle changes. To demonstrate the misspecification in the emission probabilities, we plot the assumed theoretical distribution of the observed angle changes as a red dashed line, based on assuming that the reorientation kernel is a uniform distribution. The model is more misspecified for a smaller value of the dispersion parameter γ. As in S1 Fig, we have conditioned on an observed angle change in the previous time interval of 0.1 rad for b), c), e), f), h), and i). For c), f), and i), we also conditioned on an observed angle change prior to that of −1.0 rad. We used a dispersion parameter, γ, in the reorientation kernel of γ = 0.4 for a), b), and c), γ = 1 for d), e), and f), and γ = 6.4 for g), h), and i).

https://doi.org/10.1371/journal.pcbi.1006235.s004

(TIF)

S4 Fig. Traceplots and autocorrelation functions for MCMC chains.

Traceplots and autocorrelation functions for MCMC chains to analyse convergence are shown for data generated with parameters λ = 0.2 s−1, Δt = 0.25 s, σ = 0.04 rad, as for the first posterior shown in Fig 7a). Similar results are seen in sampling for other posterior distributions shown.

https://doi.org/10.1371/journal.pcbi.1006235.s005

(TIF)

S5 Fig. Analysis of number of particles in the particle filter.

The number of particles used in the particle filter affects the variance of estimates of the log-likelihood. However, a higher computational cost is required to use more particles. In a), we show how the distribution of the log-likelihood estimates varies (provided the filter does not become degenerate) as we change the number of particles. We use 1000 runs of the particle filter with the specified number of particles and estimate the log-likelihood at the true value of the parameters (λ = 0.2 s and σ = 0.08 rad) used to generate a synthetic dataset. In b), we illustrate the variance in the (nondegenerate) log-likelihood estimates, and the mean time to obtain a single estimate. A moderate increase in the compute time to run the particle filter offers substantial decrease in the variance of the log-likelihood estimates. To strike a reasonable balance, we use 400 particles to generate the results presented in this work.

https://doi.org/10.1371/journal.pcbi.1006235.s006

(TIF)

S6 Fig. The computational cost of pMCMC and ABC per effective sample size.

The computational cost of parameter estimation with pMCMC and ABC per effective sample size depends on Δt. We quantify here a comparison between the computational cost of the pMCMC and ABC methods for parameter estimation when varying Δt. Datasets were generated with σ = 0.04 rad as in Fig 7a). The cost to produce a sample via pMCMC increases as Δt decreases, as shown by the grey squares. The cost for ABC remains approximately constant with Δt as we simulate data from the model a fixed number of times, N. The acceptance rate, α, in ABC affects the sample size we produce for a fixed number of simulations, N. For the results in Fig 10, an acceptance rate of α = 0.1% was used. This gives a cost per sample lower than for pMCMC, shown by the blue circles and triangles. For a smaller acceptance rate, α = 0.01% (shown by red circles and triangles), the cost per sample is much higher and the plots of the posterior are unchanged compared to those for α = 0.1%. The results for pMCMC are given by squares, ABC with transition matrix summary statistics are shown as circles and ABC with time series summary statistics are shown as triangles. We note that the computational cost depends strongly on the problem considered and the implementation used.

https://doi.org/10.1371/journal.pcbi.1006235.s007

(TIF)

S1 Code. Code implementing Bayesian inference via pMCMC for a VJP model.

Example code to implement pMCMC for a VJP model, as described in Section “Methods”, is available in a .zip folder or at https://github.com/shug3502/pmmc_inference_for_vjps.

https://doi.org/10.1371/journal.pcbi.1006235.s008

(ZIP)

Acknowledgments

The authors are grateful to Richard Parton and Ilan Davis for helpful discussions of the application to RNA transport.

References

  1. 1. Parton R M, Davidson A, Davis I, and Weil T T. Subcellular mRNA localisation at a glance. Journal of Cell Science, 127(10):2127–2133, 2014. pmid:24833669
  2. 2. Othmer H G, Dunbar S R, and Alt W. Models of dispersal in biological systems. Journal of Mathematical Biology, 26(3):263–298, 1988. pmid:3411255
  3. 3. Berg H C and Brown D A. Chemotaxis in Escherichia coli analysed by three-dimensional tracking. Nature, 239(5374):500–504, 1972. pmid:4563019
  4. 4. Rosser G, Fletcher A G, Wilkinson D A, de Beyer J A, Yates C A, Armitage J P, Maini P K, and Baker R E. Novel methods for analysing bacterial tracks reveal persistence in Rhodobacter sphaeroides. PLOS Computational Biology, 9(10):1:18, 2013. pmid:24204227
  5. 5. Taylor-King J P, van Loon E, Rosser G, and Chapman S J. From birds to bacteria: generalised velocity jump processes with resting states. Bulletin of Mathematical Biology, 77(7):1213–1236, 2015. pmid:26060098
  6. 6. Othmer H G and Hillen T. The diffusion limit of transport equations derived from velocity-jump processes. SIAM Journal on Applied Mathematics, 61(3):751–775, 2000.
  7. 7. Codling E A and Hill N A. Calculating spatial statistics for velocity jump processes with experimentally observed reorientation parameters. Journal of Mathematical Biology, 51(5):527–556, 2005. pmid:15868200
  8. 8. Codling E A, Plank M J, and Benhamou S. Random walk models in biology. Journal of the Royal Society Interface, 5(25):813–834, 2008.
  9. 9. Kareiva P M and Shigesada N. Analyzing insect movement as a correlated random walk. Oecologia, 56(2):234–238, 1983. pmid:28310199
  10. 10. Bovet P and Benhamou S. Spatial analysis of animals’ movements using a correlated random walk model. Journal of Theoretical Biology, 131(4):419–433, 1988.
  11. 11. Jonsen I D, Flemming J M, and Myers R A. Robust state-space modeling of animal movement data. Ecology, 86(11):2874–2880, 2005.
  12. 12. Liepe J, Taylor H, Barnes C, Huvet M, Bugeon L, Thorne T, Lamb J, Dallman M, and Stumpf M P H. Calibrating spatio-temporal models of leukocyte dynamics against in vivo live-imaging data using approximate Bayesian computation. Integrative Biology, 4(3):335–345, 2012. pmid:22327539
  13. 13. Jones P, Sim A, Taylor H, Bugeon L, Dallman M, Pereira B, Stumpf M P H, and Liepe J. Inference of random walk models to describe leukocyte migration. Physical Biology, 12(6):66001–66012, 2015.
  14. 14. Viswanathan G M, Afanasyev V, Buldyrev S V, Murphy E J, Prince P A, and Stanley H E. Lévy flight search patterns of wandering albatrosses. Nature, 381(6581):413–415, 1996.
  15. 15. Viswanathan G M, Afanasyev V, Buldyrev S V, Havlin S, Da Luz M G, Raposo E P, and Stanley H E. Lévy flights in random searches. Physica A, 282(1):1–12, 2000.
  16. 16. Edwards A M, Phillips R A, Watkins N W, Freeman M P, Murphy E J, Afanasyev V, Buldyrev S V, da Luz M G, Raposo E P, and Stanley H E. Revisiting Lévy flight search patterns of wandering albatrosses, bumblebees and deer. Nature, 449(7165):1044–1048, 2007. pmid:17960243
  17. 17. Benhamou S. How many animals really do the Lévy walk? Ecology, 88(8):1962–1969, 2007. pmid:17824427
  18. 18. Painter K J. Modelling cell migration strategies in the extracellular matrix. Journal of Mathematical Biology, 58(4):511–543, 2009. pmid:18787826
  19. 19. Mandeville J T, Ghosh R N, and Maxfield F R. Intracellular calcium levels correlate with speed and persistent forward motion in migrating neutrophils. Biophysical Journal, 68(4):1207–1217, 1995. pmid:7787012
  20. 20. Pankov R, Endo Y, Even S-Ram, Araki M, Clark K, Cukierman E, Matsumoto K, and Yamada K M. A Rac switch regulates random versus directionally persistent cell migration. Journal of Cell Biology, 170(5):793–802, 2005. pmid:16129786
  21. 21. Nicosia A, Duchesne T, Rivest L P, and Fortin D. A general hidden state random walk model for animal movement. Computational Statistics and Data Analysis, 105:76–95, 2017.
  22. 22. Andrieu C and Roberts G O. The pseudo-marginal approach for efficient Monte Carlo computations. The Annals of Statistics, 37(2):697–725, 2009.
  23. 23. Andrieu C, Doucet A, and Holenstein R. Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B, 72(3):269–342, 2010.
  24. 24. Gordon N J, Salmond D J, and Smith A F. Novel approach to nonlinear non-Gaussian Bayesian state estimation. In IEE Proceedings F, volume 140, pages 107–113. IET, 1993.
  25. 25. Rasmussen D A, Ratmann O, and Koelle K. Inference for nonlinear epidemiological models using genealogies and time series. PLOS Computational Biology, 7(8):e1002136, 2011. pmid:21901082
  26. 26. Golightly A and Wilkinson D J. Bayesian parameter inference for stochastic biochemical network models using particle Markov chain Monte Carlo. Interface Focus, 1(6):807, 2011. pmid:23226583
  27. 27. Golightly A, Henderson D A, and Sherlock C. Delayed acceptance particle MCMC for exact inference in stochastic kinetic models. Statistics and Computing, 25(5):1039–1055, 2015.
  28. 28. Grimmett G and Stirzaker D. Probability and Random Processes. Oxford University Press, 2001.
  29. 29. Rosser G. Mathematical modelling and analysis of aspects of planktonic bacterial motility. PhD thesis, University of Oxford, 2012.
  30. 30. Metropolis N, Rosenbluth A W, Rosenbluth M N, Teller A H, and Teller E. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087–1092, 1953.
  31. 31. Doucet A, Godsill S, and Andrieu C. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10(3):197–208, 2000.
  32. 32. Sherlock C, Thiery A H, Roberts G O, and Rosenthal J S. On the efficiency of pseudo-marginal random walk Metropolis algorithms. The Annals of Statistics, 43(1):238–275, 2015.
  33. 33. Pitt M K, dos Santos Silva R, Giordani P, and Kohn R. On some properties of Markov chain Monte Carlo simulation methods based on the particle filter. Journal of Econometrics, 171(2):134–151, 2012.
  34. 34. Pritchard J K, Seielstad M T, Perez-Lezaun A, and Feldman M W. Population growth of human Y chromosomes: a study of Y chromosome microsatellites. Molecular Biology and Evolution, 16(12):1791–1798, 1999. pmid:10605120
  35. 35. Beaumont M A, Zhang W, and Balding D J. Approximate Bayesian computation in population genetics. Genetics, 162(4):2025–2035, 2002. pmid:12524368
  36. 36. Prangle D. Adapting the ABC distance function. arXiv preprint arXiv:1507.00874, 2015.
  37. 37. Harrison J U and Baker R E. An automatic adaptive method to combine summary statistics in approximate Bayesian computation. arXiv preprint arXiv:1703.02341, 2017.
  38. 38. E Bernton, Jacob P E, Gerber M, and Robert C P. Inference in generative models using the wasserstein distance. arXiv preprint arXiv:1701.05146, 2017.
  39. 39. Zhao Q, Young I T, and De Jong J G. Photon budget analysis for fluorescence lifetime imaging microscopy. Journal of Biomedical Optics, 16(8):086007–086007, 2011. pmid:21895319
  40. 40. Taylor H B, Liepe J, Barthen C, Bugeon L, Huvet M, Kirk P W, Brown S B, Lamb J R, Stumpf M P H, and Dallman M. P38 and JNK have opposing effects on persistence of in vivo leukocyte migration in zebrafish. Immunology and Cell Biology, 91(1):60–69, 2013. pmid:23165607
  41. 41. Weavers H, Liepe J, Sim A, Wood W, Martin P, and Stumpf M P H. Systems analysis of the dynamic inflammatory response to tissue damage reveals spatiotemporal properties of the wound attractant gradient. Current Biology, 26(15):1975–1989, 2016. pmid:27426513
  42. 42. Fearnhead P and Prangle D. Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society: Series B, 74(3):419–474, 2012.
  43. 43. Barnes C, Filippi S, Stumpf M P H, and Thorne T. Considerate approaches to constructing summary statistics for ABC model selection. Statistics and Computing, 22(6):1181–1197, 2012.
  44. 44. Blum M G, Nunes M A, Prangle D, and Sisson S A. A comparative review of dimension reduction methods in approximate Bayesian computation. Statistical Science, 28(2):189–208, 2013.
  45. 45. Sisson S A, Fan Y, and Tanaka M M. Sequential Monte Carlo without likelihoods. Proceedings of the National Academy of Sciences, 104(6):1760–1765, 2007.
  46. 46. Toni T, Welch D, Strelkowa N, Ipsen A, and Stumpf M P H. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. Journal of the Royal Society Interface, 6(31):187–202, 2009.
  47. 47. Blum M G and François O. Non-linear regression models for approximate Bayesian computation. Statistics and Computing, 20(1):63–73, 2010.
  48. 48. Frazier D T, Robert C P, and Rousseau J. Model misspecification in abc: Consequences and diagnostics. arXiv preprint arXiv:1708.01974, 2017.
  49. 49. Zimyanin V L, Belaya K, Pecreaux J, Gilchrist M J, Clark A, Davis I, and St Johnston D. In vivo imaging of oskar mRNA transport reveals the mechanism of posterior localization. Cell, 134(5):843–853, 2008. pmid:18775316
  50. 50. Scott S L, Blocker A W, Bonassi F V, Chipman H A, George E I, and McCulloch R E. Bayes and big data: The consensus Monte Carlo algorithm. International Journal of Management Science and Engineering Management, 11(2):78–88, 2016.
  51. 51. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2013. URL http://www.R-project.org/.
  52. 52. Miroshnikov A and Conlon E M. ParallelMCMCcombine: an R package for Bayesian methods for big data and analytics. PLOS One, 9(9):e108425, 2014. pmid:25259608
  53. 53. Kumar M, Mommer S M and Sourjik V. Mobility of cytoplasmic, membrane, and DNA-binding proteins in Escherichia coli. Biophysical Journal, 98(4):552–559, 2010. pmid:20159151
  54. 54. Owen J, Wilkinson D J, and Gillespie C S. Scalable inference for Markov processes with intractable likelihoods. Statistics and Computing, 25(1):145–156, 2015.