Recovering ‘lost’ information in the presence of noise: application to rodent–predator dynamics

A Hamiltonian approach is introduced for the reconstruction of trajectories and models of complex stochastic dynamics from noisy measurements. The method converges even when entire trajectory components are unobservable and the parameters are unknown. It is applied to reconstruct nonlinear models of rodent–predator oscillations in Finnish Lapland and high-Arctic tundra. The projected character of noisy incomplete measurements is revealed and shown to result in a degeneracy of the likelihood function within certain null-spaces. The performance of the method is compared with that of the conventional Markov chain Monte Carlo (MCMC) technique.


Introduction
Measurements of complex (multidimensional, nonlinear) stochastic dynamical systems are often missing essential information about the dynamical model and system trajectory (hidden variables). For historic time series where measurements cannot be repeated, missing information can be considered 'lost'. Examples include: reconstruction of hidden variables of climate evolution [1] from observable isotope concentrations; fluctuations in predator-prey communities [2]- [6]; and coupled matter radiation systems [7]. How to reconstruct missing information about a stochastic dynamical model, including the full system trajectory, from incomplete noisy measurements is an unresolved scientific conundrum of long standing.
Recently, this problem has received considerable attention. A particle filter approach [8] and the Markov chain Monte Carlo (MCMC) approach [9] were applied successfully to reconstruct the model parameters alone. A simplified version, with all the dynamical variables measured directly, was considered in [10]- [13] using Bayesian and log-likelihood approaches. In the absence of measurement noise, attractor reconstruction [14] in time-delayed variables was used [5] to infer model parameters; in the absence of dynamical noise, shooting and recursive approaches were applied to reconstruct both the parameters and the trajectory [15]. The problem has remained unresolved, however, for the practically important case where hidden dynamical variables, measurement noise and dynamical noise all coexist together. In this case, unknown dynamical model parameters are trading against hidden trajectory components within certain null-spaces leading to a quasi-degenerate likelihood function with a very complex 'landscape' and numerous local minima (cf [15]).
In what follows, we present a solution of this general problem based on a novel Hamiltonian formalism derived from the first principles of stochastic dynamics [16]- [19] within the maximum likelihood estimation (MLE) framework. The key result is the discovery of the Hamiltonian equations underlying the MLE problem, by analogy with the Wentzel-Friedlin theory [17] of large fluctuations. It reveals the complex structure of the likelihood and provides for robust convergence. We apply it to the analysis of an archetypical problem in nonlinear dynamics and statistics: noise-driven rodent-predator oscillations [2,3,5,6] for which the predator population could not in practice be observed. We show that, contrary to earlier beliefs [2,3,5,6], noise-corrupted measurements of the prey dynamics alone are a sufficient basis for computing both the likelihood of the predator population and the ecological model parameters. We compare the performance of the method with that of the conventional [12,13]  MCMC technique. Our formalism promises to be useful, not only for the analysis of population dynamics with hidden variables (see e.g. several thousand time series in the database [20]), but also in many other disciplines and contexts. Figure 1(a) plots N (t) time-series data (points) for a rodent (prey) population density in Finnish Lapland [20]. Such cyclic behavior among small mammalian species has preoccupied population ecologists for decades. Perhaps on account of its stochastic multiannual quasiperiodicity, and difficulties in measuring predator numbers, there is still no agreement on the mechanisms of oscillation, nor of the reasons why some populations are cyclic and others are not (e.g. noncyclic lemming populations in Arctic Canada [2,3,21]). In these settings, the ability to discriminate between various models by inferring both the model and the unobserved population is especially useful. In the example we consider here, the problem is to use a limited number (80) of noise-corrupted measurements to recover both the unmeasured predator population density P(t) and the underlying nonlinear stochastic dynamical model. To solve it, we introduce a Hamiltonian approach. In doing so, we chose from among the various predation models [2,3] one that distinguishes between the specialist and generalist predators that respectively promote and suppress the cyclic behavior. This particular model explains many observed features of rodent-predator dynamics and can be written as [2,3,5]

A problem in ecology
(1) Functions f n, p (σ n, p , t) = (1 − e n, p sin(2π t + φ n, p ) + σ n, p ξ n, p (t)) in (1) represent the modulation of the birth rate by periodic (seasonal) and stochastic forcing with amplitudes e n, p and σ n, p respectively, and φ n, p are unknown phase shifts of periodic forcing. ξ n, p (t) are mutually uncorrelated zero-mean white Gaussian dynamical noise sources; r and s are, respectively, the intrinsic rates of prey and predator growth; k is the habitat's prey-carrying capacity, which determines the saturation level of the prey population in the absence of predators. The effect of generalist predators (e.g. foxes) whose population does not change with N is encapsulated within the term ∝ N 2 with coefficient g (maximum mortality rate). The effect of the specialist predators (small mustelids) believed to maintain the oscillatory prey dynamics is described by the term ∝ N P, with maximum killing rate c. However, the predator population is notoriously difficult to study in the field, so that the corresponding density P(t) is an unmeasured (hidden) variable. The measured rodent density N (t) is related to the actual (unknown) value N (t) via N = N e σ obs η(t) where η(t) is white Gaussian measurement noise of unit intensity. The precise functional forms are known neither for predation nor for the predator response, and some modifications of equations (1) have been considered [5]. By a change of variables x 1 (t) = log(N (t)/k ), x 2 (t) = log(q P/k ) (for known nominal values of the scaling coefficients k and q ) the predator-prey and the measurement equations are transformed into a standard [8]- [15] form with noise entering the equations only additivelẏ Here  (1)).  1 , e 2 , k, g, c, q, h, d, φ n , φ p , σ n , σ p , σ obs }, the measurement matrix B i j = δ i1 δ j1 , and the vector fields are The standard form (2) represents a generalization that can be applied to a great variety of stochastic dynamical model reconstruction problems with hidden variables.

General solution
We now formulate the general inference problem of finding unknown model parameters M and unobserved trajectory components (here x 2 (t)) in probabilistic terms. A key statistical quantity is the likelihood probability density functional (LPDF), which is proportional (within the MLE framework) to the posterior distribution P Y [x(t); M] representing the joint probability density that the system trajectory is x(t) and that the system parameter values are M, conditioned on the observed time series Y = {y(t m ), t m = m h} K m=1 . It can be written as ; M] can then be found using the path-integral approach to fluctuational dynamics [18,19], as Here y = y(t). Also T = Kh where K is the number of data points, h is the sampling time, and M L implying the existence of hidden variables. Despite the hidden dynamical variables not being measured directly, the functional S Y (3) contains the dynamical coupling between the variables via the force fields K i (x 1 , x 2 ). If the measurements provide information sufficient to pin down both the key model parameters of the system and its trajectory, the joint LPDF P Y will be well localized in the vicinity of its maximum, thus revealing the most probable trajectory x * (t) and the parameter values M * that minimize the functional S Y for a given set of measurements Y.
We solve this variational problem by introducing a new paradigm in which S Y is viewed as the mechanical action of an auxiliary Hamiltonian system with coordinate x, momentum p =D −1 (ẋ(t) − K) and a Hamiltonian function H (x, p) The extremum of S Y in the joint space (x(t), M) is then found by solving the coupled variational problems The first condition corresponds to a solution of the boundary value problem (BVP) for the Hamiltonian equationṡ satisfying the boundary conditions p(0) = p(t) = 0. This BVP can be solved very efficiently using, e.g. the results of [22]. An interesting general property of this solution is that the difference between measured data y(t) and x(t) is fedback in the last term of the second equation in (6). This term effectively acts as a 'control force' of an amplitude inversely proportional to the measurement noise intensity driving the x(t) towards its most probable value at a fixed M. We then fix the inferred trajectory x(t) and update the values of the parameters in the set M, using an analytic solution [11] of the second variational problem, ∂ S Y ∂M = 0. This procedure is iterated until the desired convergence is achieved. The outcome is the most probable system trajectory x * (t) and set of model parameters M * providing a minimum of the action functional S Y . It corresponds to a point p * =(x * (t), M * ) in the joint trajectory-parameter space.

Application to the ecological problem
Note that the accuracy of the presentation (3) depends strongly on the sampling rate 1/ h. For the particular ecological problem (1), it was possible to simulate densely-sampled data by introducing a spline interpolation of the experimental points. Convergence was guaranteed by the fact that the data set has nine points per period of oscillation.
Next, we note that in the presence of hidden variables slight changes in the model parameters can lead to significant changes in the system's trajectory, giving rise to a complex shape for S Y . This is the case for the problem in hand because, with a periodic driving force, (1) can undergo a transition to chaos as the system parameters vary [2,5]. To investigate the shape of S Y we define a window of possible values for each model parameter within the set of plausible parameter ranges elucidated in earlier research [2,3,5,6]. For many parameters the windows are very broad. We show some of them as rectangular planes in figure 2. Starting from a random point inside the predefined range we find a point p * corresponding to one of the solutions of the dual variational problem (5). Random restarts are needed if there is (as in our case) more than one solution p * . Projections of the points p * onto the 2D parameter plains and corresponding 3D plots of the LPDF and S Y are shown in figure 2.

7
For example figure 2(a) shows the projection of the LPDF on the plane of parameters (r, r k /k). In this figure, all the parameters except parameters r and r k /k are kept fixed at their mean values. Parameters (r, r k /k) are then chosen from the uniform distribution shown in the background of the figure. After one step of the double optimization procedure described above, the optimal values of the trajectory and parameters r and r k /k are obtained p * = (x * (t), r * , (r k /k) * ) and the corresponding value of the cost function S * Y ( p * ) is calculated. The procedure is repeated N = 20 000 times. Then the projection of the LPDF surface on the (r, r k /k) plane ( figure 2(a)) is found as follows: where ρ i j is the density of points in the bin (i j), S * Y,i j is the mean value of the cost function in this bin, N i j is the subset of N found in the bin (i j), I = 60 and J = 40 are the number of bins in r and r k /k directions, and dr i d(r k/k j ) is the size (area) of the bin. The corresponding projection of the cost function (figure 2(c)) is found as Projections of the LPDF and cost functions on the plane (s, sq/q ) (figures 2(b) and (d)) are found in a similar way. It can be seen in the density plots that the LPDF has a very steep localized ridge whose top forms a 'line' R = { p * } = {(x * (t), M * } in the joint parameter-trajectory space. Fluctuations along R are large and non-Gaussian, while fluctuations transverse to the ridge top are strongly suppressed (figure 2). The mechanical action along R varies slightly, each point on R corresponding approximately to a solution of (5). This difference in scales of fluctuations and the quasi-degenerate forms of S Y and LPDF can be attributed to the projective character of the measurements with M < L: they are incapable of resolving certain tradeoffs between the parameters and hidden trajectory components. The different points along R correspond to quite different time traces of the hidden trajectory component (predator oscillations shown in figure 1(b)) but they all fit very well to the set of prey population observations. Such tradeoffs are intrinsic to any given model and their understanding is crucial for experts applying the technique for data interpretation and model discovery.

Validation of the new dynamical inference approach
Next, we validate the approach by applying it to the inference of population dynamics in a stoat-lemming community [6] and by comparing it to the MCMC method. The corresponding data (figure 3) are similar to the earlier set [5] but with an important difference: the dynamics of both populations, prey N (t) and predator P(t), were recorded. We assume initially, however, that only the prey population was measured, and subsequently compare the inferred predator population with the measured one. The dynamical model [6] for this community is rather elaborate. However, given the close similarity between the two communities it is instructive to perform such a comparison treating (1) as a simplified model. Our results are shown in figure 3: the measured predator trajectory is reproduced, including the correct slope and scale of the oscillations, unlike the earlier simulations [6]. With sufficient care, the MCMC using a Gibbs sampler with the Metropolis-Hastings steps method can show comparable performance. We have found for a number of models, however, that convergence of the Hamiltonian method is faster and more robust with respect to the initial conditions due to the existence of local minima corresponding to nonsmooth time functions P * (t). These local minima slow down the convergence of the MCMC algorithm (see e.g. dashed blue curve in figure 3), but are excluded from the start in our Hamiltonian approach (4)-(6).

Conclusion
In conclusion, our Hamiltonian approach facilitates inference of parameters and hidden (unobserved) trajectory components of a nonlinear dynamical model in the presence of dynamical noise and measurement errors. The application of the method to ecological problems leads to the conclusion that a lack of observational data for predator populations need no longer constitute a fundamental obstacle to the inference of hidden population dynamics and ecological parameters from noisy measurements [2,3,5,6]. The key limitation of the proposed method is that it relies on the Euler approximation in derivation of the cost function. Correspondingly, the accuracy of this approximation depends strongly on the sampling rate, which has to be dense enough. Here, we simulated dense sampling by spline interpolation and convergence was guaranteed by the data set having nine periods of population oscillation with nine points per period. Note, however, that the convergence of the BVP solution does not require dense sampling and in the future the restrictions on the sampling density can be relaxed. In general, the dependence of the uncertainty of inference on the number of data points is nontrivial in character and will be discussed in detail elsewhere. The method has allowed us to reveal a quasi-degeneracy of the log-likelihood along a hyperplane R in the joint parameters-trajectory space. The existence and shape of R reflects tradeoffs between parameters r and k /k (s and q/q ) on one hand, and hidden trajectory components on the other. Importantly, our procedure, unlike MCMC, avoids sampling in the trajectory space. Instead, it relies on a continuous-in-time solution of the deterministic auxiliary problem (6), thereby exploiting recent advances [22] in tackling the BVP. The method will also be applicable quite generally to cases where some state variables could not be recorded, e.g. those mentioned in the introduction [1,23]. It could be particularly useful in the context of physiological measurements relating difficultto-access parameters to noninvasively measured data [24], and to cases where dynamical variables are intrinsically hidden and cannot be measured experimentally, even in principle: e.g. flow parameters in aerospace applications [25] and susceptible and exposed populations in epidemiology [23].