NONLINEAR FUNCTIONAL RESPONSE PARAMETER ESTIMATION IN A STOCHASTIC PREDATOR-PREY MODEL

Parameter estimation for the functional response of predator-prey systems is a critical methodological problem in population ecology. In this paper we consider a stochastic predator-prey system with non-linear Ivlev functional response and propose a method for model parameter estimation based on time series of field data. We tackle the problem of parameter estimation using a Bayesian approach relying on a Markov Chain Monte Carlo algorithm. The efficiency of the method is tested on a set of simulated data. Then, the method is applied to a predator-prey system of importance for Integrated Pest Management and biological control, the pest mite Tetranychus urticae and the predatory mite Phytoseiulus persimilis. The model is estimated on a dataset obtained from a field survey. Finally, the estimated model is used to forecast predator-prey dynamics in similar fields, with slightly different initial conditions.

1. Introduction.Integrated Pest Management (IPM) programs are implemented with the objective of minimizing losses due to plant pests and controlling insect vectors of important plant, animal, and human diseases, reducing the impact of control techniques on environment and human health [9].Biological control techniques are available to IPM specialists and are based on manipulation of inter-specific relationships (use of predators, parasites, diseases and plant resistance to suppress pest populations) [24].Population dynamics models can provide insight into the complexity of interacting populations systems under the effect of control operations [1] and contribute to the evaluation and improvement of IPM tactics and strategies.The following model characteristics and procedures are of critical importance for a possible contribution of trophic interaction models supporting the design of IPM and biological control schemes.(a) Models should mimic the qualitative behavior of predator-prey populations systems relevant to biological control tactics and strategies, for instance, admitting either stable solutions with prey extinction or an equilibrium with limit cycle.Model analysis can be further detailed defyining how populations system dynamics and stability depend on variation in both parameters 76 GIANNI GILIOLI, SARA PASQUALI AND FABRIZIO RUGGERI values and initial conditions expressed in term of predator and prey abundance.(b) The selection of an appropriate model for the functional response is critical in relation to model behavior and the objective of model development in the context of IPM [17].Several models of functional response have been proposed [27].These functions may be prey dependent or may depend on both prey and predator densities, and take into account different behavioral and physiological aspects [2,17].(c) Estimation of ecological parameters used in trophic interaction models is a critical methodological problem in basic as well as in applied population ecology [22].Parameter estimation for the functional response is the most troublesome aspect because of the complexity of behavioral and physiological responses related to the predation process occurring in a heterogeneous environment.Functional response parameter estimation can rely on classical experimental approaches performed in simple arenas or on more complex experimental setups addressing the problem of scaling up the results from the laboratory to the field.These approaches have been criticized for the oversimplification of the environment in which predator behavior occurs [21].In fact, artificial conditions needed to follow the behavior of a single individual may affect the predator performance, increasing the efficacy of resourcecapture process.Approaches relying on field observations appear more promising, particularly those based on the analysis of time series data on prey and predator abundances [19,20].In a natural setup, parameter estimation may exploit the advantages offered by natural environment in terms of within (canopy) and between plant complex environment [21] as well as other environmental aspects influencing the predation process (e.g.semiochemicals).
To explore potentiality of approaches relying on field observations, Gilioli et al. [12] proposed a Bayesian method for functional response parameter estimation starting from time series of field data on predator-prey dynamics.The method focused on the estimation of a behavioral parameter appearing in the Lotka-Volterra functional response when a small number of observations is available.The method by Gilioli et al. [12] is here further developed considering a different stochastic predator-prey model described by two stochastic differential equations.The plant dynamics is not taken into account by the model that only focuses on the interaction between the prey (phytophagous) and the predator (biological control agent).Behavioral, demographic and environmental stochasticities are introduced in the model and represented by noise terms affecting each population as well as their interaction and depend on parameters that have to be estimated.Three main novelties are introduced in the approach here adopted.
(i) The simple and linear Lotka-Volterra functional response model in [12] is substituted with the more general and realistic non-linear Ivlev model [18,27].The Ivlev model was introduced to address the major weakness in the Lotka-Volterra model that considers an unlimited increase in the predator demand as the prey density increases [27].The Ivlev model presents two more interesting aspects.It displays qualitative properties that well interpret the behavior of the predator-prey system of interest for biological control and, among the functional response models having the same properties, e.g. the Holling II functional response, it offers the advantage of considering only one parameter.The Ivlev model with densitydependent prey growth presents three equilibrium points.They are characterized by: extinction of both prey and predator, extinction of the predator, coexistence of prey and predator (under suitable condition for the parameters -see, for example, [4]).The last point is the most interesting equilibrium state and its stability properties depend on the value of the behavioral parameter in the functional response.It can be either a sink or a source and in the latter it generates limit cycles [3,4].It follows that a reliable estimate of the behavioral parameter is critical given its influence on the expected outcome of predator-prey interaction and considering the meaning of different stability regimes in the context of biological control.
(ii) In Gilioli et al. [12] estimation was restricted to a single parameter expressing the rate of effective search of the predator.Here four parameters related to behavioral, demographic and environmental stochasticity are estimated.The method followed in [12] was founded on a Markov Chain Monte Carlo (MCMC) procedure.In recent years Bayesian methods based on MCMC algorithm have been largely used to deal with the problem of parameter estimation, in particular when the number of observed data is small.We recall, among others, the papers by Golightly and Wilkinson [14,15,16], Eraker [8], Elerian et al. [7], where latent data are introduced between real observations.Here we present a MCMC method that combines parameter estimation with generation of latent data based on the diffusion bridge proposed by Durham and Gallant [6].Latent data are new data generated between successive real observations, and drawn from a distribution depending on the latter and the system dynamics.This MCMC method differs from the one in [12] because of a modification in the proposal distribution used to generate latent data.The estimation problem becomes more troublesome since the presence of parameters in the diffusion term slows down the convergence of the MCMC, as pointed out by Roberts and Stramer [26] and Golightly and Wilkinson [15].
(iii) A forecast on different fields has been performed.We have used three datasets on the same predator-prey populations system collected in the same period in three independent experimental fields in the same area.In the same spirit of splitting data into training and test sets, the first set is used to estimate the parameters of the model; the estimated model is then used to simulate the trajectories of prey and predator for the other two predator-prey systems differing only for initial conditions.Forecasting property of the performed numerical experiments have been evaluated comparing simulated trajectories and observed biomass.
The paper is structured as follows.In Section 2 we present the model used to represent the dynamics of the predator-prey system.Section 3 is devoted to the Bayesian method applied to estimate unknown parameters.An application to the acarine predator-prey system Phytoseiulus persimilis -Tetranychus urticae is presented in Section 4. Finally, Section 5 collects concluding remarks.
2. The model.We consider, on the time interval [0, T ], a predator-prey stochastic system described by the following equations with initial condition (x(0), y(0)) = (x 0 , y 0 ) and where x t and y t are the biomass of prey and predator at time t per habitat unit (plant) normalized with respect to the prey carrying capacity per habitat unit; r is the maximum specific growth rate of the prey, b is the maximum specific predation rate, b (b < b for the conversion efficiency) is the maximum specific predator production rate, u is the specific predator loss rate due to natural mortality and q 0 is a measure of the efficiency of the predation process [4].The lumped parameters r, b, b , u (thoroughly described in [4]) are known, as the initial condition (x 0 , y 0 ).The Ivlev [18] nonlinear functional response (1 − e −q0xt ) contains the unknown parameter q 0 to be estimated.Moreover, also the parameters σ, ε and η have to be estimated.The first diffusion term in both equations represents the behavioral stochasticity which affects search and capture processes.It can be obtained supposing that the Ivlev functional response (1 − e −q0xt ) of the deterministic term is subject to random fluctuations so that 1 − e −q0xt =⇒ 1 − e −q0xt + σe −λxt ξ t where ξ t is a Gaussian white noise and λ is a positive parameter.The Ivlev functional response is an increasing function of x t which converges to 1 as the prey biomass increases.The fact that q 0 is unknown only affects the slope of the curve but not its shape.All the possible functional responses, obtained for different values of q 0 , are similar for large values of x t .It is therefore natural to consider an effect of random fluctuations more pronounced for small values of x t and then decreasing for large values of x t ; this is also an expected outcome for the random search process of the predator.The parameter λ allows to set the threshold at which the fluctuations become negligible.Since our aim is to obtain small fluctuations for values of x t near one, it would be better to choose λ ≤ q 0 .For simplicity, we choose λ = 1.
The second diffusion term in both equations can be interpreted as environmental stochasticity deriving from environmental variables fluctuation [25].This type of stochasticity is proportional to the prey and predator population abundance for the first and second equation, respectively.
We denote by θ = (q 0 , σ, ε, η) the vector of parameters.The main goal of the paper is the illustration of an efficient method to estimate θ.A prior distribution will be specified on θ and inference will be performed based upon its joint posterior distribution.
For simplicity, we write system (1) in vectorial form where (3) The coefficients µ and β satisfy the conditions for the existence and uniqueness of a strong solution of a stochastic differential equation (see, for example, [23]).Remark that parameter θ appears both in drift and diffusion coefficients.The presence of θ in the diffusion term makes estimation, through MCMC methods, more difficult.Golightly and Wilkinson [15] and Roberts and Stramer [26] emphasize that, when the diffusion term depends on the parameter θ, the convergence of the MCMC algorithm is slow.In particular, when the diffusion coefficient depends on θ, there is dependence between the quadratic variation and the diffusion coefficient.This fact results in long mixing times of MCMC algorithms.Roberts and Stramer [26] overcome this dependence by transforming the stochastic differential equation in an equivalent equation with constant diffusion coefficient.In this case, as in many nonlinear multivariate diffusions, this transformation cannot be applied.Golightly and Wilkinson [15] propose an MCMC algorithm based on alternative sampling from the parameter posterior distributions and the Brownian motion (instead of the latent data).MCMC algorithms are widely used in Bayesian statistics to get a sample from the posterior distribution.They are based on Monte Carlo simulations with draws at each step depending on the ones at the previous step (Markov property).Theoretical results prove that the posterior distribution is the invariant measure of the Markov chain, whereas tools have been developed to assess if values drawn from the Monte Carlo simulation can be considered as a sample from the posterior distribution, e.g.removing dependence on the initial values ("burn-in"), decorrelating the draws (taking one draw every n) and performing diagnostics tests on the sample valued.A detailed illustration of MCMC algorithms can be found in, e.g., [10].In our case, in each step of the simulation we draw the parameters, one at the time, from their conditional posterior distribution and latent data in the whole time interval.This is a particular MCMC algorithm, called Metropolis within Gibbs algorithm.It is similar to those presented in [12] where the slow convergence did not give practical problems for a set of data comparable with the datasets used in the current work.

3.
A Bayesian algorithm to estimate θ.As well known, in the Bayesian setting estimation of parameter θ is based on its posterior distribution, obtained starting from a prior distribution of θ and a set of field observations.Since posterior distribution of θ cannot be found in closed form, we will have to resort to apply MCMC methods and use a sample from the algorithm to estimate parameters.
3.1.The posterior distribution.We introduce a prior distribution for the unknown vector θ.Denote by X = (X 0 , X 1 , ..., X p ) the vector of observations; then, due to Markovianity of the process X, the posterior distribution of θ is given by where is the conditional distribution of X i given the observation at previous time and the value of the parameters.Substituting µ and β of (3) in (5) and recalling that X i = [x i , y i ], after some straightforward, but cumbersome, calculations, we obtain We suppose, for simplicity, that the joint prior distribution is the product of independent distributions The discussion can be generalized to the case of a generic joint prior distribution π(q 0 , σ, ε, η).
The expression of f (X i |X i−1 , θ) suggests to choose π(σ 2 ) as an inverse gamma distribution, so that, given the observations and the value of the remaining parameters, we obtain an inverse gamma conditional posterior distribution for σ 2 .On the contrary, even if we choose a known form for the prior distributions of q 0 , ε, and η, their conditional posterior distributions, given the observations and the value of the remaining parameters, will not have a known form.
The parameter q 0 is positive since it measures the efficiency of the predation process; therefore we choose a gamma prior for q 0 because it is simple to deal with and sufficiently flexible for the presence of two parameters (other prior distributions, with positive support, can be alternatively considered).We suppose that parameters ε and η are positive as well, so we choose again gamma priors for them for the same reason.More precisely, we take It follows that σ 2 has an inverse gamma conditional distribution with parameters where

The MCMC algorithm.
A fundamental problem in dealing with discrete observations and Euler approximation of (1), is to employ a sufficiently large number of (real and latent) data to ensure that the discretization bias is arbitrarily small (see [7], for a detailed discussion).In many problems of population dynamics only few observations are available due to difficulties and time required for data collection (e.g. a sample point per week).In this case time intervals between observations are too large to guarantee a good approximation of the maximum likelihood.To overcome this problem, many authors suggest to generate additional data, in literature usually called latent data, between two consecutive real observations (see, for example, [28,8,7,15]).In this work we will follow this procedure.In general, the MCMC method based on latent observations is composed of three fundamental steps: 1. draw a value for σ 2 from the prior distribution π σ 2 ; draw a value for q 0 from the prior distribution π (q 0 ); draw a value for ε from the prior distribution π (ε); draw a value for η from the prior distribution π (η); 2. generate latent data between each pair of real observations using their posterior density given the current value of θ; 3. sample a value for θ from its conditional posterior distribution through a Gibbs algorithm: sample a value for σ 2 from its inverse gamma conditional posterior density; sample a value for q 0 from its conditional posterior density π(q 0 | X, σ, ε, η); sample a value for ε from its conditional posterior density π(ε| X, σ, q 0 , η); sample a value for η from its conditional posterior density π(η| X, σ, q 0 , ε).
Applying recursively steps 2 and 3, after an initial burn-in period, we obtain a sequence of θ-values giving an approximation of the posterior densities of θ.

3.2.1.
Step 2 of the MCMC algorithm.In step 2 of the MCMC algorithm, latent data have to be generated.We follow the method proposed by Eraker [8] and then applied by Golightly and Wilkinson [14] and by Gilioli et al. [12], among others.In this method a diffusion bridge between data at two consecutive times t i and t i+1 is considered to generate latent observations.Here, as in [12], we update observations in blocks; more precisely we generate simultaneously latent data between two consecutive real observations, as done also by Durham and Gallant [6].
We generate M equidistant latent data between each pair of real observations.In particular, we introduce the same number of latent data in each interval between two consecutive observations, independently of the different width of the intervals.Another possible choice would be to introduce different numbers of latent data for each interval to obtain equidistant latent data over all the whole interval [0, T ].The choice of the optimal M is generally suggested by an empirical study, as discussed in next section.
Denote by X j = (x(t j ), y(t j )) a real datum at time t j , by X * i = (x * (t i ), y * (t i )) a latent datum at time t i , and by Y = X 0 , X * 1 , ..., X * M , X M +1 , ..., X * n−1 , X n the matrix of all data, where n = pM + p + 1 is the total number of observations.
At the first step we consider a "guess" value for θ drawn from the prior distribution π (θ) and generate latent data by means of a linear interpolation between two consecutive real observations.At this point we generate a new value for θ as shown in Step 3. Then we update latent observations on the whole time interval [0, T ].
Suppose, at a generic step of the algorithm for the estimation of θ, to have updated the observations up to X k .We have to update the M -block X * (k+1,k+M ) = X * k+1 , ..., X * k+M .The conditional density of X * (k+1,k+M ) is given by In general, this is not a density of a standard form, so we apply a Metropolis-Hastings (M-H) algorithm to simulate from f X * (k+1,k+M ) |X k , X k+M +1 ; θ .In the M-H algorithm a proposal density has to be specified.Then we sample a value from this density and accept or reject it on the basis of the "acceptance probability" (see [13], p. 5, for a detailed explanation of the M-H algorithm).
In this case the variables X * j represent population abundance, so they must be non negative.Then we choose a truncation of the bivariate normal distribution used by Golightly and Wilkinson [16] and Durham and Gallant ([6], p. 305), as proposal for the M-H algorithm This distribution is different from the truncated bivariate normal distribution presented in [12].In fact, here mean and variance of the proposal distribution are obtained considering the modified diffusion bridge proposed by Golightly and Wilkinson [16].This approach is based on the diffusion bridge of Durham and Gallant [6], which uses a discretization of the stochastic differential equation representing the dynamics of the system.In [12] the mean of the truncated normal distribution is simply obtained as average of two points: the positions at times t k+M +1 and t j−1 .
Suppose to dispose of all latent data and parameters at step s of the algorithm used to estimate θ.Denote by X * (s) (k+1,k+M ) the value of X * (k+1,k+M ) at the s th iteration.To calculate the value of X * (k+1,k+M ) at the (s + 1) th iteration, we draw a value w from the proposal density and we compute the acceptance probability to decide if the candidate vector has to be accepted or rejected.
The acceptance probability, depending on the conditional density f and on the proposal density g, is given by We generate a uniform random number a.
If a ≤ α(X * (k+1,k+M ) , w|X k , X k+M +1 ; θ) then we accept the candidate vector w and set X

3.2.2.
Step 3 of the MCMC algorithm.To generate a value for σ we use the known conditional inverse gamma distribution of σ 2 .The posterior distributions of q 0 , ε and η have not a standard form, then we again use a M-H algorithm to sample from them.
Let θ (s) be the value of the parameter at the generic iteration s.We have to generate a value for θ at iteration s + 1.We apply a Gibbs algorithm to obtain the parameters σ, q 0 , ε, η.In the Gibbs algorithm we generate the values of the parameters recursively refreshing the conditioning variables.Since σ 2 has a known conditional posterior distribution, we start by generating σ 2 from its inverse gamma conditional posterior distribution.From this value we obtain σ (s+1) .Then we draw q (s+1) 0 from π q 0 |σ (s+1) , ε (s) , η (s) , ε (s+1) from π ε|σ (s+1) , q (s+1) 0 , η (s) , and η (s+1) from π η|σ (s+1) , q (s+1) 0 , ε (s+1) .It does not matter the order chosen here for the update of q 0 , ε, η; any other order can work well.To obtain the values of the last three parameters (q 0 , ε, η) we apply a M-H algorithm since their posterior distributions are not of a known form.It follows that we apply a Metropolis within Gibbs algorithm.To clarify the procedure, we illustrate it for the parameter q 0 .Then analogous procedures can be followed for ε and η.Let ḡ (q 0 |Y, σ, ε, η) denote the proposal density for the M-H algorithm relative to q 0 .Let q * 0 be a value drawn from the proposal distribution.The acceptance probability is .
We generate a uniform random number a.If a ≤ ᾱ(q * 0 |Y, σ, ε, η), then set q (s+1) 0 = q * 0 ; otherwise set q 0 .The proposal distributions we choose for the M-H algorithms relative to q 0 , ε, η are gamma distributions, a quite standard choice in this context.4. Application to a field case.The parameter estimation procedure here proposed is applied to a predator-prey system of primary importance for biological control, the acarine system Tetranychus urticae-Phytoseiulus persimilis.The population dynamics is described by system (1) where parameters q 0 , σ, ε and η are unknown.To test the efficiency of the method here described, it will be applied to parameters estimation in case of simulated data.Then an application to observational data will be performed.Both posterior medians and means can be considered as parameter estimates; we observe that here they have very similar values.4.1.Datasets and model parameters.Datasets.To estimate the parameters q 0 , σ, ε and η, a dataset of predator-prey dynamics has been purposely arranged [11].This dataset derives from an intensive biological control survey of the pest mite Tetranychus urticae (prey) by the artificially introduced biological control agent Phytoseiulus persimilis (predator) on a field annual crop of strawberry.Two main problems were considered in the design of the experiment.Ivlev model depends only on prey abundance; different initial conditions, in terms of predator-prey ratio, have been considered to account for possible role of different predator-prey ratios.Furthermore, since population dynamics are known displaying high variability among experimental replications, availability of averaged dynamics from different replications could regularize sampled trajectories.Accordingly, the experimental design was organized as follows.Three independent population dynamics were established in the same period in three different experimental fields A, B, C. The fields were located in the same area, with the same crop and are in the same agronomic conditions.Time series from field A is used for parameter estimation.Then, the estimated parameters are used in the dynamics (1) to simulate trajectories for fields B and C, and compare them with observed data.In each field four isolated plots were designed and a different predator-prey ratio was tested in each plot.Initial prey condition in each plot was different but the four prey-predator ratios were the same in the three experiments.The four dynamics in each field were considered as replication at different densities of the same process and a single average predatorprey dynamics was calculated.Time evolution of prey and predator abundance was followed during the period in which the system develops on the host plant by means of an intensive sampling protocol.Twelve local predator-prey dynamics (three fields and four replications within field) were sampled.To obtain high precision in the estimated abundance a large number of sample units (single leaf) were collected and observed in laboratory using stereomicroscope to count individuals belonging to each biological stage of the two species.Sampling variables are defined as prey and predator biomass per plant.To convert numbers into biomass the procedure in [4] was applied.

Simulated data.
In this section we simulate data from model (1) with initial conditions: x 0 = 0.1 ; y 0 = 0.007, final time t f in = 90 days, and parameters q 0 = 6, σ = 0.05, ε = 0.09 and η = 0.05.We generate 20 observations, at equidistant times as average of 5 trajectories obtained from system (1).Now, we use these data as field observations for model (1) and try to estimate parameters q 0 , σ, ε, and η through the MCMC algorithm described in Section 3. We consider two different cases.CASE 1).If we have information about the true values of the parameters (based on experts' assessment), we use the information and consider prior distributions with mean near the true value of the parameter and small variance: -parameter q 0 : gamma distribution with mean 6.5 and variance 3; -parameter σ 2 : inverse gamma distribution with mean 4 • 10 −4 and variance 10 −9 ; -parameter ε: gamma prior with mean 0.08 and variance 0.00001; -parameter η: gamma prior with mean 0.04 and variance 0.000001.
The proposal densities at step s + 1 of the MCMC algorithm are as follows: -parameter q 0 : gamma distribution with mean q (s) 0 and variance 0.5; -parameter ε: gamma distribution with mean ε (s) and variance 0.000005; -parameter η: gamma distribution with mean η (s) and variance 0.000005.
As parameter estimate we consider the mean of the parameter posterior distribution obtained performing 100,000 simulations with a burn-in of 50,000 simulations.When introducing only one latent datum between two consecutive observations, the parameter estimates are quite bad, but starting from 5 latent data there is a considerable improvement (Table 1).The differences in the estimates between 20 and 30 latent data are negligible, denoting convergence of the estimates; furthermore, they are also very close to the true values of the parameters.Table 1.Estimates of q 0 , σ, ε and η (mean of the posterior distribution) in the case of 20 data simulated from equation ( 1) and prior distributions in CASE 1. Figure 1 shows the histograms of the posterior distributions obtained for 30 latent data simulated between two consecutive observations.Histograms are obtained from 100,000 simulations with a burn-in of 50,000 simulations.

Latent
CASE 2).If we have an idea, not very precise, about the values of the parameters, we choose prior distributions with mean that can be far from the true value of the parameter and variances larger than those of the previous case.For example, using the following prior distributions: -parameter q 0 : gamma distribution with mean 10 and variance 5; -parameter σ 2 : inverse gamma distribution with mean 2.5 • 10 −4 and variance 1.6 • 10 −9 ; -parameter ε: gamma prior with mean 0.03 and variance 0.002; -parameter η: gamma prior with mean 0.1 and variance 0.01; and using the same proposal densities of the previous case, we obtain parameter estimates in Table 2.
Table 2. Estimates of q 0 , σ, ε and η (mean of the posterior distribution) in the case of 20 data simulated from equation (1)  The estimates show a convergence towards the true values even if more slow due to "far" prior distributions.In the first case only 10 latent data are sufficient to assure a good estimate, while in the second case it is necessary to use 30 latent data to obtain a comparable approximation.
If we have no information about the true values of the parameters we can use non informative priors.In this case a greater number of latent data is necessary to obtain satisfactory estimates.The efficiency of the method proposed is confirmed by the goodness of the fit given by the trajectories of prey and predator reported in Figure 2.These trajectories are obtained considering the mean over 100 simulations of trajectories for each one of 50 different values of the parameters q 0 , σ, ε, η drawn from their estimated posterior distributions obtained for 30 latent data and prior distributions in CASE 1.We recommend to get 10 -30 latent data since estimates seem to be robust.Moreover, to support the goodness of the estimates and the fairly good mixing properties of the Markov chain, the usual convergence tests have been applied [5].

Field data.
In this subsection we consider the three datasets of predator-prey dynamics described in subsection 4.1.
As already said, to reduce variability among different trends in the four predatorprey dynamics of each field, the predator and prey abundance per habitat (single plant) unit, normalized with respect to the prey carrying capacity for spatial unit, have been considered.4.3.1.Parameter estimates.Time series from field A have been used for model parameters estimation.The estimation procedure has been restricted to the first population cycle of the prey under predator control, from the 2nd to the 9th weekly observation in the time series.Restriction in the considered time series is justified as follows.First, the necessity to consider homogeneous biological and ecological processes operating in the system, avoiding phenomena such as very low population density that could modify predator functional response and prey growth (i.e.undercrowding effects), and may require modification of model (1).Secondly, model (1) assumes predator-prey systems are in a relatively constant environment both in terms of environmental forcing variables and plant resource condition.For the latter, in particular during the late phase of population dynamics, it can be hypothesized that the plant withering may switch the system to other dynamical patterns for which model ( 1) is likely to be unfit.
The initial condition for the prey in field A is given by x 0 = 0.088; while the predator is introduced in the system after 9 days.The biomass of introduced predator, at time t = 9, is y 9 = 0.0035.
The proposal densities at step s + 1 of the MCMC algorithm are as follows: -parameter q 0 : gamma distribution with mean q (s) 0 and variance 0.5; -parameter ε: gamma distribution with mean ε (s) and variance 0.00005; -parameter η: gamma distribution with mean η (s) and variance 0.00005.
The estimates for parameters q 0 , σ, ε and η are calculated for different values of latent data and reported in Table 3.As estimate we have chosen the mean of the posterior distribution (which is very similar to the posterior median); the posterior distributions for 30 latent date are represented in Figure 3 (for 5 -20 latent data, histograms of the posterior densities are very similar to those in Figure 3).Figure 4 shows prey and predator trajectories for different values of latent data.The trajectories are obtained considering the mean over 100 simulations of trajectories for each one of 50 different values of the parameters q 0 , σ, ε and η drawn from their estimated posterior distributions.It can be seen from the simulated trajectories that there are not great differences between the trajectories obtained using the different set of estimates in Table 3.It also follows, as pointed out in Section 2, that in this case the problem of long mixing times of MCMC algorithm is not relevant.All the trajectories reported in Figure 4 show a good fit of the observations, for a number of latent data between 5 and 30.
We observe that for q 0 > 0.8, the deterministic system admits a stable coexistence state as equilibrium point.If follows that the trajectories of prey and predator, solutions to stochastic system (1), fluctuate around this deterministic equilibrium.In the figures of the present section this behavior is not evident because we limit our study to the first cycle of the system and at this point the predator takes very low values.
Comparison between field data and simulated trajectories has been conducted using some numerical indexes summarizing important information on potential prey impact on the crop; these indexes are also applied to predator population expressing the capacity for the biological control agent to control the pest.
1. S max : maximum size of the population.2. t max : time required to reach the maximum population size.3. t 0.5 : time required to halve the maximum size of the population.4. t 0.1 : time required to reduce the population to one tenth of its maximum size. 5.The integral of the abundance up to t 0.1 : it represents the population pressure on the resource.6.The residual, that is the square root of the sum of the square differences between observed and simulated data at the observation times.Table 4 reports the values of these indexes for simulated trajectories in Figure 4 for data from field A, in the case of 20 latent data.Apart from the case of the maximum value of the predator abundance, a close agreement between indexes in experimental and simulated data is observed.The residuals are very low for both the prey and the predator strengthening that observed population dynamics is well approximated by the simulated trajectories.
We remark that the time required to generate the posterior distribution for 30 latent data and 100,000 iterations of the algorithm is about 10 minutes with a FORTRAN program on a AMD Athlon MP 2800+, 2.25 GHz processor.Table 3. Estimates of q 0 , σ, ε and η (mean of the posterior distribution), in the case of field observations (field A), for different values of latent data. .Histograms representing the posterior distributions of parameters q 0 , σ, ε and η, for field A, obtained applying the MCMC algorithm with 100,000 simulations, with burn-in of 50,000 iterations, taking one value every 10 data, and 30 latent data between each couple of real data.Table 4. Indexes representing key-aspects of prey and predator population dynamics.2nd column: maximum abundance S max .3rd column: time required to reach S max .4th column: time required to halve S max .5th column: time required to reduce the population to one tenth of S max .6th column: integral of the curve.In the case of field data the integral of the curve is obtained linking the field data with segments.7th column: residual (square root of the sum of the square differences between simulated and observed data).Indexes are relative to 20

4.3.2.
Posterior predictive estimation of trajectories.Fields B and C are distinct, but close, from field A and they all share similar agro-ecological characteristics.Furthermore data in the three fields were collected at the same times and all of them were the average of four dynamics, as pointed out previously.Finally the initial conditions of prey and predator are quite close.For those reasons, we think it is possible to consider data from the three fields as similar and we can apply the same model ( 1) and the posterior distribution from field A to make forecast on the dynamics in fields B and C.
Therefore, we consider data from field A as a training set and we use the parameters posterior distribution based on them, obtained in the previous subsection, to forecast the dynamics of prey and predator in the fields B and C, considered as test sets.In particular, we draw 50 values for θ from its posterior distribution; then, for each θ, we generate 100 trajectories using model (1) with the initial values of biomass in field B (or C) and, finally, we average over all trajectories.The mean trajectory is compared with field data.
In field B the average initial conditions are x 0 = 0.1408 and y 9 = 0.0011.Figure 5 shows the trajectories of prey and predator obtained considering the mean over 100 simulations of trajectories for each one of 50 values of the parameters q 0 , σ, ε and η drawn from the posterior distributions obtained for field A in case of 20 latent data.In field C we start from the initial conditions x 0 = 0.1207 and y 9 = 0.0017.Figure 6 shows the trajectories of prey and predator obtained considering the mean over 100 simulations for each one of 50 values of the parameters q 0 , σ, ε and η drawn from the posterior distributions obtained for field A in case of 20 latent data.
In general we observe that simulated trajectories satisfactorily approximate the experimental data, especially for the predator population.In particular, in both cases we have a good approximation of the time required to reach the maximum predator abundance.There is also a good approximation of the time required for the prey decrease towards zero.
We claim that the Bayesian method here proposed is not only able to give good parameter estimates for a training dataset, but the estimated parameters allow to forecast prey and predator dynamics with a good approximation in test datasets, obtained in similar experimental fields.

4.4.
Estimates from field B. Since the three fields A, B, C have the same agroecological characteristics, we might expect that direct estimates on field B, for example, give similar results than prediction on field B starting from parameters estimated in field A. To verify this conjecture we consider field B data and apply the estimation procedure described in section 3. Results, for different values of latent data, are summarized in Table 5.
As in the previous subsection, we draw 50 values of the parameters from their conditional posterior distributions obtained for 20 latent data and consider 100 trajectories obtained using model ( 1) with these parameters.The mean trajectories obtained are compared, in Figure 7, with the predicted trajectories in Fig. 5. Differences are negligible, in fact, residuals for the prey are 0.3693 in case of prediction  and 0.3257 in case of direct estimate, while for predator residuals are 0.0686 in case of prediction and 0.07 in case of direct estimate.The small differences in residuals, confirm the fact that direct estimate gives results similar to prediction starting from estimates on field A. Now, using posterior densities obtained from direct estimate in field B, we predict trajectories in field A. Figure 8 shows the comparison between trajectories estimated starting from data of field A (represented also in Figure 4 with a continuous line) and trajectories forecast using estimates of field B. The behavior of the trajectories are similar; we observe only a little advance in the time required to reach the maximum predator abundance.5. Concluding remarks.In this work, we follow a Bayesian approach for parameter estimation and forecast in a stochastic predator-prey model with a nonlinear functional response.
As far as parameter estimation is concerned, the major novelty introduced in the paper is the consideration of the non-linear Ivlev's functional response which takes into account predator physiological saturation and admits three equilibrium points with different stability properties, depending on the values of the parameters.To the best of our knowledge, the problem of statistical parameter estimation in a predatorprey model with functional response of Ivlev type, based on time series of field data, has never been considered, not only in a Bayesian framework.Parameter estimation is performed considering the stochastic version of a predator-prey model and the estimation concerns four model parameters: a behavioral parameter appearing in the functional response and three parameters related to environmental stochasticity.The stochastic model presents some advantages with respect to the non-stochastic model.In fact, from the comparison between the trajectories of the stochastic model and those of the deterministic model (not shown in this paper), it can be seen that, even if the non-stochastic model gives a good fit of field data, its residuals are larger than those of the stochastic model: 0.1470 for the prey and 0.0539 for the predator in the deterministic model and 0.0929 for prey and 0.0455 for the predator in the stochastic model (see also Table 4).The need of introducing and estimating environmental stochasticity is based on the ecological meaning of the stochastic terms that could take into account for perturbations in both predator and prey population dynamics.Such perturbations are due to change in environmental driving variables and factors not explicitly considered in the model (e.g., the effect of temperature and rainfall variability that affects all the biodemographic processes).This guarantees a greater flexibility of the model.
Since the number of available data is small, we opt for a Bayesian estimation method based on a MCMC algorithm.This method is similar to the one presented in [12] for the estimation of a single parameter, but with a different proposal distribution for the generation of latent data.The proposed algorithm performs fairly well in estimating the parameters, in terms of both computational time and accuracy (as shown by the usual convergence tests applied to the samples drawn from the MCMC algorithm).The small number of field observations seems to have a limited effect on the estimates, as well.
A remarkable result is given by the posterior predictive estimation which allows us to forecast the dynamics of prey and predator sampled in fields similar, in terms of initial population and agronomic conditions, to those used for the parameter estimation.The estimated curves show a fairly good qualitative behavior when compared with observed data.Although the forecast concerns only two fields, it suggests that for fields with analogous properties the model can be able to predict, with a good approximation, the dynamics of prey and predator.This is also confirmed by the study carried out changing the role of the fields.In fact, similar results are obtained performing the estimation on one of the two fields previously used for prediction and then predicting the trajectories on the field initially used for estimation.
In order to test model suitability to describe population dynamics in different conditions it would be important to validate the model in different combination of population densities and experimental conditions.Nevertheless, this would be problematic due to the high cost for data collections in natural setup.However, collecting data from many experimental dynamics remains the only possibility for the generalization of model forecast capability.
Even considering the limited available dataset, the model was able to capture all the essential qualitative aspects of the predator-prey dynamics as shown by indexes summarizing important population dynamics characteristics (time to reach the maximum abundance and densities at the population maxima, and time and mode of prey decline as effect of predation) providing indication of model forecast capability.
In the present work we consider an Ivlev functional response because, with respect to other functional responses with the same properties, it offers the advantage of considering only one parameter.It is, however, possible to extend the study to other functional responses, for example the Holling II.In this case we will have additional complexity due to the presence of new parameters to be estimated.
The obtained results can be of particular importance for the design of environmentally sound and effective means of reducing or mitigating pests and pest effects through the use of natural enemies acting as predators in agro-ecosystems.In fact, biological control programs could benefit from predator-prey models supporting decision on timing and amount of control agent release.When model forecast capability is generalizable to a wider range of initial conditions then computer simulations can be performed for the design of biological control strategy.In particular, predator-prey model (1) can be considered as constraint in a stochastic control problem where we have to maximize producer utility, expressed in term of increase of crop production and decrease of costs.This analysis could be of great importance in making biological control more rational and effective.

Figure 1 .
Figure1.Histograms obtained applying the MCMC algorithm with 100,000 simulations, with burn-in of 50,000 iterations, taking one value every 10 data, for 20 observations generated from model(1) and 30 latent data between each couple of real data.Prior distributions in CASE 1.

Figure 2 .
Figure 2. Prey and predator biomass as function of time, for 30 latent data.Trajectories are obtained considering the mean over 100 simulations and 50 values of the parameters q 0 , σ, ε and η drawn from their estimated posterior distributions.Asterisks denote simulated data.Prior distributions in CASE 1.
Figure 3. Histograms representing the posterior distributions of parameters q 0 , σ, ε and η, for field A, obtained applying the MCMC algorithm with 100,000 simulations, with burn-in of 50,000 iterations, taking one value every 10 data, and 30 latent data between each couple of real data.

Figure 4 .
Figure 4. Prey and predator biomass as function of time in field A, for different values of latent data.Trajectories are obtained considering the mean over 100 simulations and 50 values of the parameters q 0 , σ, ε and η drawn from their estimated posterior distributions.Asterisks denote field data.

Figure 5 .
Figure 5. Prey and predator biomass as function of time in field B. Trajectories are obtained considering the mean over 100 simulations and 50 values of the parameters q 0 , σ, ε and η drawn from the estimated posterior distributions of field A relative to 20 latent data.Asterisks denote field data.

Figure 6 .
Figure 6.Prey and predator biomass as function of time in field C. Trajectories are obtained considering the mean over 100 simulations and 50 values of the parameters q 0 , σ, ε and η drawn from the estimated posterior distributions of field A relative to 20 latent data.Asterisks denote field data.

Figure 7 .
Figure 7. Prey and predator biomass as function of time in field B. Trajectories are obtained considering the mean over 100 simulations and 50 values of the parameters q 0 , σ, ε and η drawn from the estimated posterior distributions relative to 20 latent data.Asterisks denote field data.Continuous line: direct estimate on field B. Dashed line: predicted trajectories, starting from estimated posterior distribution on field A.

Figure 8 .
Figure 8. Prey and predator biomass as function of time in field A. Trajectories are obtained considering the mean over 100 simulations and 50 values of the parameters q 0 , σ, ε and η drawn from the estimated posterior distributions relative to 20 latent data.Asterisks denote field data.Continuous line: direct estimate on field A. Dashed line: predicted trajectories, starting from estimated posterior distributions on field B.
latent data.

Table 5 .
Estimates of q 0 , σ, ε and η (mean of the posterior distribution), in the case of field observations (field B), for different values of latent data.