Statistical modelling of cell movement

Collective cell movement affects vital biological processes in the human organism such as wound healing, immune response, and cancer metastasis. A better understanding of the mechanisms driving cell movement is then essential for the advancement of medical treatments. In this paper, we demonstrate how the unscented Kalman filter, a technique used extensively in engineering in the context of filtering, can be applied to estimate random or directed cell movement. Our proposed model, formulated using stochastic differential equations, is fitted on data describing the movement of Dictyostelium cells, an amoeba routinely used as a proxy for eukaryotic cell movement.

In this paper, we propose a model that describes the movement of any individual cell being driven by an external resource gradient using stochastic differential equations (SDEs) of the following form: Equations 1 and 2) describe the evolution in time of the x and y coordinates of a cell in 2D space. dB X t and dB X t are Brownian motion terms, which represent in this model the intrinsic randomness in a cell's movement. The coordinate in the y direction has a drift term that is described by three parameters: , the amplitude of the resource gradient; , the steepness of the gradient; and , which indicates how fast the gradient changes over time. The strength of the random component in the cell movement equations is indicated by the diffusion coefficient .
Inference in nonlinear dynamical systems, such as the one we propose in Equations 1 and 2, poses numerous challenges due to the stochastic nature of the data, intractable likelihoods, and unidentifiable parameters. Recent developments have tackled this problem using likelihood-free methods such as sequential Monte Carlo approximate Bayesian computation (SMC ABC), or computational methods (particle Markov Chain Monte Carlo [MCMC]; Golightly & Wilkinson, 2011); however, these can become too computationally expensive as the number of time points or parameters increases. More specifically, in high-dimensional spaces (either in terms of data or parameters), approximate Bayesian computation (ABC) methods become inefficient due to high rejection rates, which means that a large number of model simulations with proposed parameter values will be required for posterior inference (Beaumont, Zhang, & Balding, 2002). Furthermore, whilst summary statistics can potentially reduce the dimensionality of the problem, selecting appropriate measures is a problem-specific task, meaning that the algorithm needs to be adapted to each application (Aeschbacher, Beaumont, & Futschik, 2012). Lastly, particle MCMC, although an effective and easy-to-generalise approach, is very computationally intensive because a particle filter is run at each iteration of the algorithm, and furthermore, thousands of iterations are normally required (Andrieu, Doucet, & Holenstein, 2010). A great deal of research is currently focused on easing the computational burden by parallelising particle MCMC algorithms (Mingas, Bottolo, & Bouganis, 2017).
State space models (SSMs) are widely used representations of physical processes described as first-order differential equations in terms of inputs, outputs, and parameters. Applications of SSMs typically involve noisy observations in the form of time series representing a latent process, governed by unknown parameters. The problem of interest (estimation in SSMs), also known as "filtering", refers to inferring the latent process and the unknown parameters by making use of the observations in a recursive or "online" form, as observations become available. In simple cases, where the problem can be formulated as a linear Gaussian SSM (LG-SSM), the Kalman filter has been shown to provide an optimal solution to the filtering problem, in a sense that the estimates produced minimise the root mean squared error (MSE) of the parameters. One of the earliest applications of the Kalman filter was in navigation systems and object tracking, such as airplanes, missiles, and spaceships, using noisy radar data. Perhaps the most prominent contribution to the field of aerospace was the use of the Kalman filter for trajectory estimation in the Apollo mission in the 1960s (Grewal & Andrews, 2010). Further applications of the Kalman filter are found in time series forecasting, image and signal processing, sensor data fusion, robotics (Choset & Nagatani, 2001), economics (Bahmani-Oskooee & Brown, 2004), etc.
However, most often, the applications will not meet the linearity or Gaussian assumptions, meaning that the classical Kalman filter cannot be applied, and this gave rise to a different class FIGURE 1 Three frames from the high-resolution microscopy video recording Dictyostelium cells movement, corresponding to times (min:sec): 00:00, 04:44, 09:28. The data used in this study consist of two cell paths in the form of 200 spatial locations (x and y coordinates) extracted at 4-s intervals. The two chosen cell paths correspond to the red (strong directed movement) and dark blue (weak directed movement) lines in the three time frames of nonlinear Kalman filters that can be applied to nonlinear system representations. One of these methods is the unscented Kalman filter (UKF), an online Bayesian filtering method for nonlinear systems introduced by Julier and Uhlmann (1997) as an alternative to the widely used extended Kalman filter. The former provides improved performance in terms of stability and accuracy of estimates without any linearisation. Furthermore, it can easily be scaled up to higher dimensions. Intuitively, the classical Kalman filter starts from the initial distribution of the state vector, drawn from a multivariate normal distribution, which is then iterated through a prediction and updating step for each measurement available using the transition and observation models. Recent applications of the UKF include modelling soft tissue mechanics of the heart (Xi et al., 2011), chemical kinetics (Baker, Poskar, & Junker, 2011), modelling beetle flying techniques (Mohamad, 2015), neural network training (van der Merwe & Wan, 2001), and nonlinear dynamical system identification (Sitz, Schwarz, Kurths, & Voss, 2002;Voss, Timmer, & Kurths, 2004) .
In this paper, we present our approach to fitting the SDE model in Equations 1 and 2 to cell movement data using the UKF, a nonlinear Bayesian filter. We provide some insights into the particularities of this model using simulated data and discuss the results of this analysis from a real data set, describing the movement of Dictyostelium cells (see Figure 1).

The Kalman filter in general
We introduce the UKF by referring to a general SSM, composed of a state equation as follows: and a measurement equation as follows: where x t represents the vector of the hidden states, y t are the measurements, t is the process noise at time t, with covariance matrix Q t , t is the observation noise at time t, with covariance matrix R t , and the functions f and g represent the transition and the observation models, respectively. Whilst non-Gaussian or correlated noise sources can be accommodated using nonlinear Kalman filters, in simpler applications, t and t usually represent independent and identically distributed additive Gaussian noise. Furthermore, if the observation and transition models are linear, Equations 3 and 4 can be simplified to give rise to an LG-SSM or a linear dynamic model, as follows: In Equations 5 and 6, A t and B t represent the transition and measurement matrices. The importance of the LG-SSM model comes from the fact that it supports exact inference via the Kalman filter: The Gaussian prior representing the initial state of the system, (x 1 ) =  ( 1|0 , 1|0 ) is propagated to the following states that will also be Gaussian, (x t |y 1∶t ) =  ( t|t , t|t ) =  ( t , t ) (for notational simplicity). In general, the filtering problem just refers to estimating the latent states x t given the noisy observations y t , that is, calculating the density p(x t |y 1:t ). This can be performed online or, as the data stream in, by sequentially applying Bayes rule in the form of an update step, as follows: and a prediction step, as follows: The advantage of the Kalman filter comes from the fact that the probability distribution of the predictor step p(x t |y 1:t−1 ) and the probability distribution of the updating step p(x t |y t , y 1:t−1 ) can be obtained in closed form using the properties of Gaussian distribution and linearity assumptions as follows: and so, wherēt and̄t are the prediction mean and covariance matrix at time t.
For the update step, we make use of the known result for Bayes rule applied to linear Gaussian systems (Murphy, 2012, p. 119), which means that p(x t |y t , y 1:t−1 ) from (7) has a closed-form solution and is also a Gaussian distribution, as follows: where t and t are the update mean and covariance matrix at time t: In Equations 14 and 15, represents the Kalman gain matrix and r t = y t −ŷ t = y t − B t t|t−1 represents the residual or the difference between the predicted and the actual observations. For step-by-step derivations of the Kalman filter equations, see p. 643 in Murphy (2012).
Therefore, the algorithm essentially updates the mean and covariance of the Gaussian distribution of the state vector at each iteration, meaning that the Kalman filter equations are entirely dependent on the predicted means and covariances of x t and y t , given in (12).
As such, Julier and Uhlmann (1997) pointed out that in a nonlinear scenario (i.e., when f or g is nonlinear), the Kalman filter challenge essentially amounts to calculating the first two moments of a random variable that has to undergo a nonlinear transformation.

Unscented transformation
In the UKF algorithm, the approximation of the Gaussian distribution is made using the unscented transform, which consists of a set of deterministically chosen sigma points that are passed through the nonlinear function and weighted to obtain the mean and covariance of the Gaussian (Sitz et al., 2002). The unscented transform was developed based on the intuition that it is easier to approximate a Gaussian distribution with a fixed number of parameters than to calculate an approximation to a nonlinear transformation (Julier & Uhlmann, 2004). However, because no random sampling is involved, this is not to be confused with a Monte Carlo method, which also means that a smaller number of sampled points are required (see Figure 2 for an illustration of these methods for a 2D system) . More specifically, in the context of UKF, the location and weights for the sigma points are chosen to match the first two moments (mean and covariance) of the prior distribution. The points are then transformed using the nonlinear function, and the statistics of interest (mean and covariance) are calculated (Julier & Uhlmann, 2004).
is a nonlinear function, and we want to estimate p(y), the unscented transformation achieves this by first calculating a set of 2d + 1 sigma points, as follows: where i denotes the ith column of the matrix , is a scaling parameter, and d is the dimension of the vector x.
To obtain the transformed sigma points, we propagate them through the nonlinear function  i = ( i ), and then, we obtain the mean and covariance of p(y) by weighting the transformed sigma points, as follows: where the weights are defined as follows: . (20)

Nonaugmented UKF
The unscented transformation is used twice for each iteration of the UKF algorithm. First, the algorithm approximates the predictive density (x t |y 1∶t−1 ) =  (x t | t , t ) by using the previous , which is transformed using the nonlinear function f and then used to calculate the true mean and covariance (displayed as a black circle and a black ellipse). Note that with only five sigma points, shown in the right panel (which is orders of magnitude less than a typical Monte Carlo sampler, shown by the blue points in the left panel), the UT provides good estimates for the mean and covariance of the transformed variable y. The unscented transformation steps shown on the right-hand side are as follows: A set of five sigma points (red circles at the top) is chosen to represent the 2D system, which is then passed individually through the nonlinear function to obtain the set of transformed sigma points  = f() (red circles at the bottom). Calculating the weighted sample mean and covariance of the transformed sigma points (plotted as a green circle and a green ellipse) will provide an estimate for the true mean and covariance of the variable of interest y (obtained using Monte Carlo simulation and displayed as a black circle and a black ellipse; Wan and van der Merwe, 2000) time point belief state  (x t−1 | t−1 , t−1 ) and by passing it through the transition model f, as follows: The second unscented transform is performed to approximate the likelihood p(y t |x t ) by using the prior  (x t |̄t,̄t) and the observation model g, as follows: We then obtain the posterior density p(x t |y 1:t ) using Bayes rule (see Murphy, 2012, chapter 18.5, for derivations) as follows: An extension to the classical UKF proposed by Sitz et al. (2002) indicates that model parameters can be included as dynamical variables in the hidden states vector x t , which means that these can be estimated at every time point alongside other unobserved system states. Thus, we can use the UKF in an augmented form as a unified methodology for system state tracking and parameter estimation. Wu, Hu, Wu, and Hu (2005) presented an augmented version of the UKF and demonstrated the advantages of process and measurement noise augmentation to the state vector in the case of nonadditive or non-Gaussian noise. The reason behind this is that the measurement and process noise added to the state vector capture better odd-moment information through the computation of additional sigma points. Applications and implementations of this method can be found in Hartikainen, Solin, & Särkkä (2011) and Sitz et al. (2002).

Augmented UKF
The augmented UKF can be reformulated from the standard UKF SSM by merging the signal states x t , parameters t , process noise t , and measurement noise t and by constructing an augmented state equation as follows: and an augmented measurement equation, as follows: Note that the augmented UKF algorithm equations can be derived in a similar manner to Equations 21-28, except that in the prediction step, the sigma points are calculated using the augmented state vector x a t and the corresponding augmented covariance matrix, as follows: where t is the covariance matrix for the parameters. Also, the number of sigma points is now 2d a + 1, where d a is the size of the augmented state vector x a t .

SIMULATION RESULTS
In this section, we include results from various simulations we carried out using the augmented and nonaugmented UKF implementations from the MATLAB toolbox developed by Hartikainen et al. (2011). First, we apply the Euler-Maruyama discretisation to bring the system described in Equations 1 and 2 into the standard SSM described in Section 2, as follows: Here, ΔB X t andΔB Y t are just sums of random normal increments between time t − 1 and t. As such, in a more general notation (dropping the superscript for ease), t (0, 1).
In the notation above, Δt is the sampling time step and t is the integration time step required by the discretisation of the Brownian motion path (for more details on the discretisation of SDEs, see Higham, 2001). We fix these time steps at values Δt = 0.1, t = 0.001 for all the simulations presented in this section.
Equations 34 and 35 thus define the transition function f from (3). In this scenario, we assume that the process is observed with a small amount of additive Gaussian noise t ∼  (0, 0.1 2 ); hence, g from (4) We then fit the nonaugmented UKF to a synthetic data set using the following parameters: = 5, = 2, = 0.5, = 0.1. The results summarised in Figure 3 indicate good agreement between the estimated UKF path and the true cell path.
The UKF also provides good estimates for the parameterŝ= 1.88,̂= 0.51,̂= 0.04 with relatively small standard errors 0.17, 0.83, 0.06, except for̂where the estimates indicate a more substantial deviation from the true parameter (bias: 0.44, and standard error is 0.35).

Profile likelihood
A potential source of bias as the one observed in Figure 3 can be investigated by looking at the likelihood, that is, marginal likelihood with respect to the hidden states. In order to do that, we first derive the probability of the observed system at time t conditional on the state of the system at time t − 1 by integrating out the latent variable x t , as follows: Using the Gaussian convolution integral results (Bishop, 2006, chapter 2.3), we simplify (37) to the following: wherēt and̄t are the predicted mean and covariance at time t, and R t is the measurement noise covariance matrix at time t.
The log likelihood is then as follows: where t and̄t are defined in Equations 32 and 24, respectively. We evaluate the marginal log likelihood in (40) by considering a grid of values for each parameter in the model and by fitting the UKF with each parameter combination. The results summarising the profile likelihood for the parameter in Figure 3 can be used to calculate the Cramér-Rao lower bound, which provides an indication of the intrinsic uncertainty specific to the problem. In this case, the minimum standard deviation attainable by an estimator of is 0.14. The standard error obtained from the UKF estimation for is 0.35; this then indicates that the estimated value of the parameter is reasonably close to the true value.

Comparison of the augmented and nonaugmented UKF
It was also of interest to investigate the performance of the augmented UKF (including the error terms in the state vector) compared with the nonaugmented UKF, as the former has been reported in the literature to provide better estimation (Hartikainen et al., 2011;Wu et al., 2005).
A simulation study was carried out to explore the differences in estimation of the augmented versus nonaugmented UKF on the cell movement system parameters in Equations (34) and (35). Both methods of interest were applied to 20 data instantiations, where all the system parameters = 5, = 2, = 0.5, = 0.1, as well as the integration and discretisation parameters t = 0.001, Δt = 0.1, were kept fixed. The difference between the 20 data sets would then be due to the intrinsic stochasticity of the sampled Brownian motion paths ΔB X t , ΔB Y t . The comparison of the two methods is performed by comparing the relative bias (R.Bias) and relative MSE (R.MSE), calculated as follows: Relative MSE: wherêdenotes the UKF estimate for each parameter of interest, is the true parameter value, and N is the number of observations in the simulation.   Figure 4 displays boxplots of the relative bias (left-hand side) and the relative MSE (right-hand side) for the nonaugmented Kalman filter constructed with the estimates from the 20 data set instantiations. Similarly, Figure 5 displays the same measures for the augmented Kalman filter. In terms of relative bias, the augmented version provides an average improvement of 13% for , 9% for , 18% for , and 10% for . In terms of R.MSE, an improvement is also observed from the augmented UKF for each of the parameters in the cell movement system (see Table 1).

APPLICATION ON DICTYOSTELIUM CELLS MOVEMENT
Dictyostelium cells are widely used in experiments as proxies for understanding the mechanisms of human disease because of their similarities to important human cells (leukocytes and cancer cells) in terms of biology and response to chemotaxis (Tweedy et al., 2016).
Data on cell movement is available in the form of a high-resolution microscopy video of a group of Dictyostelium amoebae. These data were produced in a dish of agar containing spatially homogeneous levels of the chemoattractant folate, which can be consumed and depleted by the cells to create a resource gradient. Dictyostelium cells were added to a trough cut in the centre of the dish, and the movement of these cells once they had moved under the agar and out of this trough was filmed under a microscope. The data used in this section consist of two cell paths corresponding to 200 spatial locations as x and y coordinates extracted at 4-s intervals using the ImageJ software (Rasband, 1997).
The choice of cells for this study was motivated by their movement behaviour; One displays a more random movement, due to weaker drift, and one displays a more directed movement, due to stronger drift. These cells can be identified visually from Figure 1 as belonging to either the group of front cells or the group of cells that linger behind. These specific movement patterns arise as a consequence of the fact that the front cells are driven upwards by the availability of the chemoattractant, which they deplete locally, whereas the cells behind show a more random movement because the chemoattractant they experience has been partly depleted by the front group. Figure 6 displays the augmented versus nonaugmented UKF estimation results for a Dictyostelium cell displaying more random (weak drift) movement. We emphasise that the main interest for biological applications is the inference of the parameters, which are included in Figure 6 (bottom). However, because the true parameters for the real data are unknown, we use the tracking of the cell trajectories as a proxy for assessing the accuracy of inference  (see Figure 6, top). In terms of MSE, the augmented UKF provides a marginal improvement over the nonaugmented UKF (see Table 2), which is in agreement with the findings from visually inspecting the cell path estimates. Similarly, the corresponding results for the directed movement (strong drift) of a Dictyostelium cell and the visual estimates in Figure 7 (top) are in close agreement with the MSE calculation in Table 2; the augmented version provides marginally better estimates for the cell's x and y coordinates. As before, we include the comparison of parameter estimates in Figure 7 (bottom), with the mention that a gold standard (i.e., true parameter value) is not available, and we rely on simulations from the system equations in (34) and (35) to assess goodness of fit.

CONCLUSIONS AND FUTURE WORK
In this paper, we demonstrate the application of the UKF, a Bayesian filtering technique that adequately trades off accuracy versus computational efficiency, to a real-world problem potentially relevant to cancer research: the movement of Dictyostelium cells, which has not been tackled at individual cell level before.
First, we have investigated the performance of the noise augmented UKF in comparison with the nonaugmented UKF through a simulation study. Our results indicated a small reduction of the estimation bias for cell movement parameters between, on average, 9% and 18%. Second, we applied both of these methods to real data consisting of the movement of two Dictyostelium cells, one displaying strong drift (more directed) movement and the other one displaying weak drift (more random) movement. Our results indicate that both methods estimate the cell paths well, with the augmented version providing a marginal improvement.
In conclusion, our results indicate that the UKF can be successfully used for parameter inference and tracking cells displaying various movement patterns. Future research will extend this work by applying the UKF to a population of cells. Additionally, we plan to use this framework of parameter and state estimation to fit models describing alternative movement mechanisms, such as the self-induced gradient model described by Tweedy et al. (2016), and employ model selection criteria to choose the best model.