Stochastic and deterministic models for agricultural production networks

An approach to modeling the impact of disturbances in an agricultural production network is presented. A stochastic model and its approximate deterministic model for averages over sample paths of the stochastic system are developed. Simulations, sensitivity and generalized sensitivity analyses are given. Finally, it is shown how diseases may be introduced into the network and corresponding simulations are discussed.

1. Introduction.The current production methods for livestock follow the just-intime philosophy of manufacturing industries.Feedstock and animals are grown in different areas.Animals are moved from one farm to another, depending on their age.Unfortunately, shocks propagate rapidly through such systems; an interruption to the feed supply has a much larger impact when farms have minimal surplus supplies in-stock than when they have large reserves.The just-in-time movement of animals between farms serves as another vulnerability.Stopping movement of animals to and from a farm with animals infected by a disease will have effects that quickly spread through the system.Nurseries supplying the farm will have nowhere to send their animals as they grow up.Finishers and slaughterhouses supplied by the farm will have their supply interrupted.
The devastating foot-and-mouth disease (FMD) that hit the United Kingdom (UK) in 2001 lead to the slaughter of millions of animals.The outbreak shook many Western nations as citizens watched a nation with an advanced animal health surveillance and response system fail to get FMD under control, in part because the UK was unable to mount a rapid response in the face of modern agricultural 374BAI, BANKS, DEDIU, GOVAN, LAST, LLOYD, NGUYEN, OLUFSEN, REMPALA, SLENNING marketing systems [15].In an effort to eradicate the disease, the marketing channels were stopped, leaving uninfected producers with no income to maintain their livestock and no means to move them to locations where feed, shelter, and other support were available.As a result, between six million and ten million animals were destroyed in the UK over seven months, with more than one-third of those animals being destroyed for welfare reasons [41].Two years after the outbreak, animal agriculture in the UK was still declining, a chilling postscript to the widespread infrastructure damage FMD had wrought on the nation [37].
More recently, the world has witnessed the apparent failure of widespread national and international plans for using animal destruction to stem the spread of the highly pathogenic H5N1 strain of avian influenza.In a process frighteningly reminiscent of the UK FMD experience, the programs have also allowed domestic markets within and beyond affected countries to suffer.Global consumption of poultry has dropped enough to cause US domestic producers (e.g., Tyson, Pilgrim's Pride, et al.) to absorb decreased demand and decreased prices.This drop has translated to decreases, as well, in non-poultry markets, exacerbating the market effects of a disease not even present in the Western hemisphere [19].It has become painfully apparent that in the large-scale, interdependent, and highly mobile animal agriculture industry of the USA, the unintended consequences and market ripple effects of a disease incursion into our system could be even more severe than what was witnessed in the UK in 2001 and across Europe in 2005-6, and could induce decision-makers to call for even more draconian measures than previously seen.What is needed is a new view of how our emergency response programs might affect modern animal agriculture, a view that allows workers to assess the potential for other prevention strategies and responses.The view should also allow analysts to identify bottlenecks in the food and feed supply chain, and to test potential mitigation tools, procedures, and practices to increase the resilience of animal agriculture to catastrophic animal diseases.
This paper presents initial statistical and mathematical modeling ideas to address the above issues, using the North Carolina swine industry's potential response to FMD as an example.We focused our attention on the North Carolina swine industry because it is the second largest swine industry in the United States, and because it is local to us.Our goal was to develop a model that could be used to investigate how small perturbations to the agricultural supply system would affect its overall performance.A hurricane that throttles inter-farm transportation for a short period, or a disease outbreak that spreads through distribution channels are example causes of the perturbations of interest.In the former case, the just-in-time delivery systems may not provide enough slack to absorb the shock.In the latter case, strategies that involve destruction of all livestock in an infected branch of the system may be overly harsh; a more moderate response may be as effective without the high toll on the infrastructure.
We model a simplified swine production network in North Carolina containing four levels of production nodes: growers/sows (N ode 1), nurseries (N ode 2), finishers (N ode 3), and processing plants/slaughterhouses (N ode 4).At grower or sow farms (N ode 1), the new piglets are born and typically weaned three weeks after birth.The three-week old piglets are then moved to the nursery farms (N ode 2) to mature for another seven weeks.They are then transferred to the finisher farms (N ode 3), where they grow to full market size, which takes about twenty weeks.Once they reach market weight, the matured pigs are moved to the processors (slaughterhouses) (N ode 4).Pork products then continue through wholesalers to consumers.There are also several inputs to the system which we will not consider, such as food, typically corn grown in the Midwest.There are several types of breeding farms where purebred stock are raised; these are typically crossed to produce hybrid strains for pork production.
Our paper is organized as follows.In Section 2, we formulate a nonlinear stochastic model for our agricultural network and show how it can be converted to an equivalent (in a sense made precise below) deterministic differential equation model.This deterministic model readily lends itself to simulations and sensitivity analysis techniques.In Section 3 we present numerical simulations of the production model (without perturbations such as infectious disease), and carry out a sensitivity analysis of the model.Simulations of the model in the presence of an infectious disease are presented in Section 4. Finally, in Section 5 we give our conclusions and remarks for future work.
In addition to the development of models for a typical production network permitting perturbations, a significant contribution in this paper is the demonstration of stochastic, mathematical and computational methodology that is available to domain scientists, statisticians and applied mathematicians working in a concerted team effort on complex problems of the type exemplified here.The coauthors of this paper constituted such a team organized under the auspices and with the support of the Statistical and Applied Mathematical Sciences Institute (SAMSI) as a year-long working group in its recent research program on National Defense and Homeland Security.

2.
Modeling.We consider stochastic models to track an agricultural network.We are interested in how the parameters used in the model affect the overall capacity of a network and in how one discerns the existence and location of any bottlenecks.With deterministic models, one can answer the first question with a sensitivity analysis.Thus, after developing a typical stochastic production model, we also show how to obtain its deterministic approximation.We then demonstrate how to superimpose a simple contagious disease model on the production model that allows simulation of dynamics and spread of FMD through a production chain.
2.1.Basic Model.We consider a simplified swine production network with four aggregated nodes: sows (N ode 1), nurseries (N ode 2), finishers (N ode 3), and slaughterhouses (N ode 4).Our goal is to study the effects of perturbations within the network.This can be done by affecting either the nodes or the transitions between nodes directly or indirectly.For instance, a problem with the breeding farms would result in a reduction of sows available for producing new piglets.This would result in a reduced rate of transition from N ode 1 to N ode 2, since we could not grow as many piglets.We could then track the effect of this through our network.
Although unavoidable in actual production processes, we assume in our example that there are no net losses in the network (i.e., the total number of pigs in the network remains constant) and that the only deaths occur at the slaughterhouses.Thus we assume that the number of processed pigs per day at the slaughterhouses is equal to the number of newborn piglets per day at the growers.We can model reduced birth-rates by reducing the rate at which piglets move to the nurseries.This leads us to deal with a closed network.We note that this approximation is realistic when the network is efficient and operates at or near full capacity (i.e, when the number of animals removed from the chain are immediately replaced by new production/growth, avoiding significant idle times).Our closed network model for the swine production is summarized schematically in Figure 1.Each node with corresponding population number N i , i = 1, . . ., 4, in Figure 1 represents an aggregation of all the production units corresponding to that level in the production network.Given a specific production network, any of the four levels of the chain may be broken into its constituent units (e.g., farms), and analyzed in detail as a separate subnetwork.The directed edges between the nodes represent the movement of the pigs through the network.The rate is determined by the pigs' residence time, the number of pigs at each node, and the capacity constraints at the corresponding nodes.Let L i denote the capacity constraint at node i, i = 1, . . ., 4. Since we have a closed network, it is assumed that there is no capacity constraint at N ode 1, and therefore we take L 1 = ∞.We also define S m to be the maximum exit rate at N ode 4; i.e., the maximum killing capacity at the slaughter house.The residence times at each node, together with the capacity constraints and the slaughterhouse killing capacity, based on very rough estimates of swine production in North Carolina [1], are given in Table 1.The rates of transition of X(t) are nonlinear functions λ i : L → [0, ∞) for i = 1, . . ., 4, and for x ∈ L are given by:

Slaughter Finisher Nursery Sows
where k i , i = 1, . . ., 4, is proportional to the service rate at node i; L i , i = 2, 3, 4, is the buffer size (capacity constraint) at node i and S m is the slaughter capacity at node 4 as discussed above.For any real z, the symbol (z) + is defined as the non-negative part of z, i.e., (z) + = max(z, 0).Then q 1 (x 1 − 1, x 2 + 1, x 3 , x 4 ) is given by The other q i are given similarly.
The simple model ( 1) is formulated under the following assumptions and hypotheses.First it is assumed that the transportation rates q i , i = 1, 2, 3, are proportional to x i (L i+1 − x i+1 ) + , the product of the number of animals available and the available capacity at the next node.If no capacity is available, the rate is taken as zero.The rate at the slaughter house (Node 4) is the maximum S m if a sufficient number of animals is available; otherwise all animals present at the node are slaughtered on that day.Finally, it is assumed that the network is at or near steady-state and maximum efficiency in that the slaughter rate at Node 4 is the same as the input at Node 1 (this is represented schematically in Figure 1 by the arrow from Node 4 to Node 1).This results in the rate dynamics (9) below with the output rate at Node 4 the same as the input rate at Node 1.
We remark that the product nonlinearities x i (L i+1 − x i+1 ) + of (1) where transportation occurs more rapidly the further the node level is from capacity (i.e., the system reacts more rapidly to larger perturbations from capacity) are only one possible form for these terms.One could also reasonably argue for alternative terms of the form x i χ i+1 where χ i+1 is the characteristic function for the set {(L i+1 − x i+1 ) > 0}, so that the transportation rate from a node depends only on the number available at that node so long as capacity at the next node has not been reached.We remark that in this case the sensitivity analyses below are more difficult because of a lack of continuity of the dynamics in the system equations.
Let R i (t) i = 1, . . ., 4, denote the number of times that the ith transition occurs by time t.Then R i is a counting process with intensity λ i (X(t)), and the corresponding stochastic process can be defined by where the Y i are independent unit Poisson processes.That is, sample paths r i (t) of R i (t) are given in terms of sample paths x(t) of X(t) by We write R i in this form to illustrate that λ i is a rate of the corresponding counting process.Let e i , i = 1, . . ., 4, be standard basis vectors of R 4 and define, for i incremented by one modulo 4, the vectors 378BAI, BANKS, DEDIU, GOVAN, LAST, LLOYD, NGUYEN, OLUFSEN, REMPALA, SLENNING which represent the vector of changes in system counts at ith transition.We write the state of the system at time t as where ν is the matrix with rows given by the ν i , and R(t) is the (column) vector with components R i (t).In the chemical literature, the matrix ν T is often referred to as the stoichiometric matrix [29].More specifically, we have The above system typically cannot be solved for a stationary distribution, and an empirical approach based on the so-called Gillespie algorithm [29] can be used to investigate the long-term behavior of the system (see Section 3.2).The approximate large-population behavior of an appropriately scaled system may be also analyzed with macroscopic deterministic rate equations, as we shall explain next (the original theory is due to Kurtz and is discussed in [21] and the references therein).
Let N be the total network or population size.If N is known we may consider the animal units per system size or the units concentration in the stochastic process C N (t) = X(t)/N with sample paths c N (t).For large systems, this approach leads to a deterministic approximation (obtained as solutions to the system rate equation defined below) to the stochastic equation ( 4), in terms of c(t), the large sample size average over sample paths or trajectories c N (t) of C N (t).
We rescale the rate constants k i , L i and S m as follows: According to Equation (1), this rescaling implies that  [30].One can use this fact, along with the rescaling of the constants as given above, to argue that sample paths r i (t) for the counting process (2) defined in terms of the sample paths x(t) or c N (t) = x(t)/N may be approximated for large N in terms of the deterministic variables c(t), the averages over sample paths or trajectories c N (t) of C N (t), by and similarly For a full and rigorous discussion of this "approximation in mean," see Chapters 6.4 and 11 of [21] and Chapter 5 of [3].The averages c(t) satisfy a system of deterministic ordinary differential equations which can be heuristically derived by beginning with Equation (5).Upon dividing both sides of each equation by N and applying the above, we obtain the rate equations, (i.e., the system of integral equations approximating via the SLLN the original stochastic system), as follows: Upon approximating the c N i (t) on the left above by the c i (t) and differentiating the resulting equations, we find that the integral equation system is equivalent to a system of ordinary differential equations for c(t) ∈ R 4 given by with the initial conditions c(0) = c 0 .As we shall see in the next section, solutions of these equations yield quite good approximations to the sample paths of the stochastic system.

3.
Computations and Model Comparison.

Model Parameter Values.
To carry out numerical simulations and to compare the results of the stochastic and deterministic models (equations ( 5) and ( 9), respectively), we must choose reasonable values for all model parameters.We note that our paper focuses on methodological issues and, for confidentiality and proprietary reasons, only limited information on the swine production network was available to us.Thus some of these parameter values may be only rough approximations of those that might be obtained using inverse problem techniques with data from actual production networks [1].Consequently, the subsequent discussions in this paper are in no way an attempt to validate the above models.Nonetheless, we believe that the order-of-magnitude approximate parameter values we are using here are sufficient to allow us to develop and demonstrate effective use of methods and techniques which could be used with actual production network based parameters.
The parameters of the stochastic model, with the exception of the transition rate constants k i , are given in Table 1.From the expressions for the transition rates (1), we see that the residence times, t i , that pigs spend at node i are given by and t 4 = 1/k 4 = 1.As discussed above, the nonlinear form of the transition rates (1) means that the residence time at a given node depends on how far the following node is below its capacity.Consequently, we determine the k i by assuming that the given residence times pertain to the network in its equilibrium state.
Considering the deterministic model equations ( 9), we see that, if there is to be a flow through the system, the equilibrium population sizes N * i = N c * i at each node must be less than the capacities of the nodes.It is then straightforward to see that the equilibrium numbers of individuals at each of the first three nodes are proportional to the t i .This makes intuitive sense, since no loss occurs as individuals move between nodes, and so, at equilibrium, the relative residence times must equal the relative numbers of individuals at the nodes.This argument need not apply to the slaughter node, however, since individuals will spend longer there than the specified one-day residence time if the equilibrium value N * 4 is greater than S m .The flow rate from node four back to node one is the smaller of N * 4 and S m , and so we have that Notice that solving for the equilibrium of the deterministic model does not give us the value of N * 4 : since the network is closed, the total size of the population is equal to its value at the initial time.The values of the k i , for i = 1, 2 and 3, are then given by The parameter values and the initial states for the system (5) are tabulated in Table 2.
To obtain the parameters we use in our deterministic simulations we simply rescale the parameters in Table 2 by the total network size N = X 1 (t 0 ) + X 2 (t 0 ) + X 3 (t 0 ) + X 4 (t 0 ) = 3, 165, 000, using equation ( 6) and c i (t 0 ) = X i (t 0 )/N for i = 1, . . ., 4. The results are given below in Table 3. scaled slaughter capacity 1.390 382BAI, BANKS, DEDIU, GOVAN, LAST, LLOYD, NGUYEN, OLUFSEN, REMPALA, SLENNING 3.2.Stochastic Simulations.The standard method for the stochastic simulation of the discrete state continuous time Markov Chain of the type considered here is based on a standard Monte Carlo algorithm, also known as the Gillespie algorithm [29].This algorithm is described below: 1.For a given state of the system x, compute λ i (x) for i = 1 . . ., M (in our case M = 4).
2. Calculate the summation of the rates λ = M i=1 λ i (x) and simulate the time until the next transition by drawing from an exponential distribution with mean 1/λ. 3. Simulate the transition type R X ∈ {1, . . ., 4} by drawing from the discrete distribution with P (R X = i) = λ i (x)/λ.4. Update the system state x and repeat.
Using the above algorithm implemented in the statistical software R [40], we carried out numerous simulations for the model ( 5) with the initial conditions and values for parameters

Stochastic Simulations
Simulation of the stochastic model ( 5) of the food production chain via the Gillespie algorithm.Parameter values are as given in Table 2; N=3,165,000.
A sample of five realizations is plotted in Figure 2. Note that the realizations exhibit very little visible differences.However, when one carries out the simulations for a smaller system (N = 3, 165 pigs with the parameters in Table 2 scaled accordingly), the variations are readily visible as can be seen in Figure 3.We also remark that these two figures offer graphic depictions of the approximation theory discussed in Section 2.2 where in the case of very large N one can cannot distinguish between the stochastic simulations and the corresponding deterministic simulations for the sample path averages.2 scaled accordingly.
3.3.Deterministic Simulation.We numerically solved the ODE system (9) using the ode15s solver in Matlab.We fixed the parameters q = q * to be the same as in the stochastic simulation above (but scaled as in Table 3) and graph the solution of the rate equations (9) for t ∈ [0, 100] in Figure 4.In Figure 4 (left), we plot the numerical solution of the concentrations in system (9).To facilitate comparison with the MC realizations plotted in Figure 2, we also depict the rescaled quantities N i (t) = N c i (t), which provide approximations to averages over sample paths of X i (t) in Figure 4 (right).As expected, we find that the stochastic and deterministic computations provide similar numerical results, with the realizations fluctuating about the solution of the deterministic system for the averages as predicted by the theory.3.

384BAI, BANKS, DEDIU, GOVAN, LAST, LLOYD, NGUYEN, OLUFSEN, REMPALA, SLENNING
To observe the finer dynamics in the network, we plot the solution of the network on a smaller time scale in Figure 5 (left).
We find that our model solutions  3.  3. approach the steady states rather quickly.All components remain stable thereafter suggesting that the steady states are (at least locally) asymptotically stable (this can be verified with analytical arguments).We believe that this behavior of the model describes the food production network realistically when it is uninterrupted by external events.In Figures 5 (right), 6 (left) and 6 (right), we observe similar behavior in the food production chain for different values of the parameters L 4 and S m .As one can observe in these figures, when the value of S m is sufficiently large, the state N 4 , which is related to the replenishment of the network, will never reach S m , making the slaughter capacity constraint inactive in the production system.Only when S m is smaller than a certain critical value will it be active and in this case we observe accumulation of animals in the slaughter house (e.g., see Figure 6 (right)).These calculations along with numerous others we carried out suggest reasonable stability properties of the production chain in the absence of any interventions such as FMD (we will investigate such disturbances below).
3.4.Sensitivity Analysis.In this section, we perform a sensitivity analysis of the deterministic model (9), investigating how much the solution of the system changes when the rates κ i , the capacities l i , or the initial conditions c 0i , i = 1, . . ., 4 change.This analysis will be used to identify the parameters and the initial conditions to which the system is the most and least sensitive.
A second issue we address here, which is of great interest for inverse or parameter estimation problems in a typical nonlinear regression model, is the sensitivity of the parameter estimates with respect to the data measurements.We carry out this analysis by means of the generalized sensitivity functions (GSF) recently introduced by Thomaseth and Cobelli [44]; these are specifically designed for input-output identification experiments.GSF are based on information theoretical criteria (the Fisher information matrix) and, when used in conjunction with the traditional sensitivity functions, give a more accurate picture of the time distribution of the information content of measured outputs with respect to individual model parameters.
To use the well developed sensitivity analysis for the theory of dynamical systems for our purposes, we begin by writing the system (9) in vector form.We introduce the notation c(t) = (c 1 (t), c 2 (t), c 3 (t), c 4 (t)) T , q = (κ 1 , ..., κ 4 , s m , l 2 , ..., l 4 ), c 0 = (c 1 (0), ..., c 4 (0)) T , and denote by F = (f 1 , f 2 , f 3 , f 4 ) T the vector function whose entries are given by the expressions in the right side of (9).Then F : R 4 ×R 8 → R 4 , and we can write our ODE system in the general vector form To quantify the variation in the state variable c(t) with respect to changes in the parameters q j , j = 1, . . ., 8 and the initial conditions c 0k , k = 1, . . ., 4, we are naturally led to consider the sensitivity matrices We note that since our function F is sufficiently regular, the solutions c i are differentiable with respect to q j and c 0k , and therefore our sensitivity matrices Y and Z are well defined.The physical interpretation of the sensitivity matrices is obvious.Similar to the partial derivatives through which they are defined, they have a local character (in time and parameters).If, for example, the entry y ij = ∂c i /∂q j of the matrix Y takes values very close to zero in a certain time subinterval, then the state variable c i is insensitive to the parameter q j on that particular subinterval.
The same entry y ij can take large values on a different subinterval, indicating that in this time subinterval, the state variable c i is very sensitive to the parameter q j .386BAI, BANKS, DEDIU, GOVAN, LAST, LLOYD, NGUYEN, OLUFSEN, REMPALA, SLENNING From sensitivity analysis theory for dynamical systems [10,20,25,36,42], we also know that Y(t) is a 4 × 8 matrix that satisfies the ODE system Here we have used the notation F c = ∂F/∂c and F q = ∂F/∂q for the 4 × 4 and the 4 × 8 Jacobian matrices of F with respect to c and q, respectively, while 0 and I are the zero and the identity matrices with appropriate dimensions.Note that while equations ( 16), (17) are linear in Y and Z, they must be solved in tandem with equation (13), which is nonlinear.Consequently, the sensitivity analysis involves the solution of a set of nonlinear equations.
We will compute the sensitivity of the system (13) with respect to q and c 0 when the solutions are essentially at steady state.We carry this out by numerically solving the systems ( 16) and (17) for the same values of the parameters q = q * and initial conditions c = c * 0 as used in the stochastic simulations presented above (i.e., those given in Table 3) and by evaluating the solution at the fixed time t = 210 (arbitrarily chosen, but sufficiently large for our system to closely approach its steady state).Due to the nature of our problem, in which the parameters have different units and the state variables vary widely over many orders of magnitude, it is appropriate to consider the relative sensitivities S c i ,q j defined as the limit of the relative change in c i divided by the relative change in q j when the relative change in q j goes to zero; i.e., S c i ,q j = lim A simple analysis of the definition above (assuming that both c i and q j are nonzero) yields that the relative sensitivity S ci,qj can be obtained by normalizing the usual sensitivities ∂c i /∂q j such that We note that the S c i ,q j are dimensionless variables, invariant with respect to changes in units for c i and q j , which we can utilize to compare the degree of sensitivity of the state variables with respect to different parameters.In Table 4, we tabulate the relative sensitivities at time t = 210 of each state variable c i with respect to each parameter q j and each initial condition c 0k .For any fixed parameter/initial condition, we also tabulate the sensitivity of the system, cumulatively defined as the Euclidean norm of the relative sensitivities of the four state variables with respect to that parameter/initial condition.In other words, the sensitivity of the system with respect to q j is given by For the particular choice of the parameters q = q * and for the particular initial condition c 0 = c * 0 , the data displayed in the last column of Table 4 reveal that near Table 4. Relative sensitivities of the state variables/ system with respect to parameters q j and the initial conditions c 0k near steady state (t = 210).Baseline parameter values (q = q * ) and initial conditions c 0 = c * 0 are as in Table 3. the steady-state (t = 210) the system ( 9) is most sensitive (in decreasing order) to l 3 , c 03 and l 2 and least sensitive to s m .One interesting outcome we obtain from the analysis of the sensitivity data presented in Tables 4 and 5 is that the state variable c 1 is more sensitive to the parameter l 3 and the initial condition c 03 as compared to the parameters κ 1 , κ 4 , s m and l 4 and the initial condition c 01 on which this state variable depends directly.Without this sensitivity analysis, we could not infer this behavior simply by looking at the system (9) since the parameters l 3 and c 03 do not appear in the right side of the equation that defines dc 1 /dt.We find similar results for c 2 and c 4 which are more sensitive to the non-direct initial condition c 03 as compared to their direct initial conditions c 02 and c 04 .
As we pointed out previously, the sensitivity functions by definition have a local character both in the time domain and in the parameter domain, which implies that the results displayed in Table 4 characterizes the sensitivity of the system only for q = q * , c 0 = c * 0 and t = 210.However, to obtain a broader picture of the sensitivity map for our system (9) near steady state (t = 210) or for the whole time interval, one can compute the relative sensitivities (19) and (20) with respect to q j and c 0k on a uniform grid in a parameter neighborhood around the central values q * j and c * 0k and analyze the results thus obtained.(Although we carried out such analyses, the results are not presented here.) The sensitivity analysis we have performed so far is usually encountered in simulation studies (direct problems) where we need to quantify the effects of parameter variations on the trajectories of model outputs.Unlike simulations, in identification studies (inverse problems) one typically wants to estimate model parameters from data measurements, and one question of interest is to determine at which time points the measurements are most informative for the estimation of specific parameters.In addition, one may also desire a priori information about the degree of correlation between model parameters which the TSF functions of ( 14) and ( 15) used alone cannot provide.To address these questions, Thomaseth and Cobelli [44] introduced the generalized sensitivity functions (GSF) which provide information on the relevance of measurements of output variables of a system for the identification of certain parameters as well as on parameter correlation.More precisely, the generalized sensitivity functions describe the sensitivity of the parameter estimates with respect to data measurements.We illustrate below the utility of these Table 5. Summary of the sensitivity analysis presented in Table 4.The columns rank the parameters and initial conditions in order of increasing sensitivity for the four state variables, c i , and the system.
functions to provide a better understanding of our network model, by performing a sensitivity analysis for the inverse problem of estimating the parameters κ's of the system (9) through an ordinary least squares procedure.For a single-output model f (t, θ) (e.g., the case where one has longitudinal observations of one component c i of the system (13)) with discrete time measurements where θ 0 is the "true" parameter values (assumed to exist in most statistical inference and information content formulations-see [8,11,43]), the generalized sensitivity functions (GSF) are defined by where • represents element-by-element multiplication (see [44] and the Appendix of [7] for motivation and details).The measurement errors k in ( 21) are assumed to be independently and identically distributed with zero mean and known variance σ 2 (t k ) and the nonlinear model function f (t, θ) is assumed to be differentiable with respect to θ.Moreover, for simplicity, it is assumed that the observations y(t k ), as represented in (21), are used to estimate θ 0 by minimizing the weighted sum of squares (WSS) In our case, the nonlinear model function f is replaced by a vector-valued function which is the solution of the system (9) and the generalized sensitivity functions (GSF) are given by Here ∇ θ c l represents the gradient of the state variable c l with respect to θ, where θ ∈ R P is a vector including all (or just a subset of) the parameters κ's and l's and the initial conditions c 0 's.We note that the generalized sensitivity functions ( 23) are vector-valued functions having the same dimension P as θ and defined only at the discrete time points t k , k = 1, . . ., M .They are cumulative functions, at each time point t k taking into account only the contributions of the measurements up to t k , thus representing the influence of longitudinal measurements in contributing to the parameter estimates.
From (23) it follows that all the components of gs are one at the end of the experiment; i.e., gs(t M ) = 1.If one defines gs(t) = 0 for t < t 1 (gs is zero when no measurement is collected) and interpolates gs continuously between observation times, then each component gs p of gs varies continuously from 0 to 1 during the experiment.As we will see in the example below, this transition is not necessarily monotonic (gs p , p = 1, . . ., P may have oscillations) nor is it restricted to values in [0, 1] (i.e., gs p may take values outside [0, 1]) if large correlations between parameter estimates exist.As discussed in [44], the time subinterval during which this transition has the sharpest increase corresponds to measurements which provide the most information on possible variations in the corresponding true model parameters.
Since the GSF theory is developed in the context of estimation problems, we considered next an estimation problem using simulated "data."The data used for inversion was generated first by numerically solving the system (9) for the parameter values given in Table 3 and then by adding 5% Gaussian white noise to the solution obtained.We consider the problem of using this data to estimate the parameters κ 1 , κ 2 , κ 3 and κ 4 in an ordinary least squares procedure (with the other parameters and initial conditions fixed at the values from Table 3).The true values for the κ's are κ = (1.674,0.322, 4.521, 1) T .For θ 0 = κ, the generalized sensitivity functions (23) along with the traditional sensitivity functions for the system (9) are presented in Figure 7.In both figures we note a very well defined time subinterval, from t = 0 to about t = 60, where both GSF and TSF plots exhibit sharp increases and decreases.After this, the TSFs reach very quickly a steady state and the GSFs are forced to approach one.According to the theory, the interval [0, 60] is the time region where measurements are most informative for estimating the true parameters κ.So at least intuitively, sampling more data points in this region would result in more information about the parameters κ and therefore more accurate estimates for them.By computing the correlation matrix whose elements are given by standard formulas in least squares theory [9,43], one can also observe that strong correlations exist between estimates for κ 3 and κ 4 .In  3. fact, the correlation matrix for these parameters is given by 1.0000 0.4941 0.0004 0.2209 κ 2 0.4941 1.0000 0.1388 0.1497 κ 3 0.0004 0.1388 1.0000 -0.9502 κ 4 0.2209 0.1497 -0.9502 1.0000 which is agrees with the dynamics of the curves shown in Figure 7. Positive correlation between κ 1 and κ 2 is clearly indicated because the corresponding gsf graphs increase together, while the negative correlation between κ 3 and κ 4 is evidenced by the early opposite slope behavior in their corresponding gsf graphs.Ordinary least squares inverse problems carried out with different sets of data points illustrate and support the theory.We first performed the least squares minimization for a data set DS 1 consisting of a total of 15 observations, of which 8 were taken at equidistant points in the interval [0, 60] and 7 taken at equidistant points in the interval [80, 210] (for simplicity, we exclude the transition interval [60, 80] from our analysis here).The estimates for κ 1 , κ 2 , κ 3 and κ 4 along with the R 4 Euclidian norm of the error are displayed in Table 6.If we increase the number of data points in the interval [0, 60] from 8 to 15 and keep the number of data points in [80, 210] the same (see DS 2 entry, same table), we observe a significant decrease in the Euclidian norm of the error from 2.932 to 0.207, which represents a significant increase in the accuracy of the parameter estimates.A similar decrease from 2.224 to 1.859 and then to 0.973 is observed when we solve the least squares problem with data from [0, 60] only (see DS 5 , DS 6 and DS 7 entries).Thus, numerical calculations support the fact that increasing the number of data points in the region [0, 60] yields more accurate estimates for the parameter κ, in agreement with the theoretical expectations from TSF and GSF.
A totally different outcome is obtained when we carry out numerical estimation with an increasing number of data points in the interval [80,210].As one can see by comparing the entries DS 1 and DS 4 in Table 6, only a small gain is obtained ), we obtain very large errors which increase in magnitude as the number of sample points increases.Although puzzling at first view, this phenomenon is not surprising at all from the perspective of the theories presented above and in [6].Indeed, by blindly sampling more data points from the region where the generalized sensitivity functions exhibit the so called "forced-to-one" artifact and the traditional sensitivity curves are flat, we simply introduce redundancy in the sensitivity matrix, thus increasing the condition number of the Fisher information matrix for our problem.
For an illustration and discussion of this phenomenon, see [6].By the Cramér-Rao inequality, the consequence is that the variance of the unbiased estimator (and the corresponding standard errors) will be huge, making our estimates less useful.The same phenomenon (introduction of redundancy in the sensitivity matrix) is responsible for the poorer results which are obtained when we estimate κ using data set DS 3 (the Euclidian norm of the error increased as compared to DS 2 , where we doubled the number of points in [0, 60]).We observe an important drawback of the GSF: while they specify the most informative regions with respect to the estimation of parameters, they do not provide any information about the necessary number of data points to be used in those regions.We remark that in this section we have illustrated a methodology to quantify the sensitivity with respect to parameters and initial conditions of a large complex system.This methodology is a fundamental tool in identifying those parameters in a model which require further, more in-depth investigation as well as the basis of a statistical analysis for quantifying uncertainty in estimators (e.g., see [5,6,9,11,43]) as well as information content based model selection techniques [8].
4. Foot-and-Mouth Disease.Having established a basic model for the movement of animals (pigs) in the agricultural system (the pork industry of North Carolina), we are now ready to model the spread of an infectious disease in the 392BAI, BANKS, DEDIU, GOVAN, LAST, LLOYD, NGUYEN, OLUFSEN, REMPALA, SLENNING food production network.In this section we describe the incorporation of an SIRtype infection into the system and present simulations to illustrate the spread of foot-and-mouth disease throughout the aggregated agricultural network.
We describe the infection by an SIR process [2,12].It is assumed that a population can be partitioned into three groups: susceptible (S), infectious (I) and removed (R).In many settings the removed class represents individuals who have recovered from the infection.Individuals move between these classes as they become infected and recover from infection.Recovery is assumed to confer permanent immunity to infection and the demography of the population (i.e., births and non disease-related deaths) is ignored.In a well-mixed population the epidemiological model can be described by the flowchart of Figure 8 and equations (24).

Infectious
Recovered Here, S, I and R denote the numbers of susceptible, infective, and removed individuals, respectively.The transmission parameter is β: this parameter, when combined with the rate at which individuals meet each other and the probability that an infective would infect a susceptible during any one such meeting, yields the transmission rate.It is assumed that recovery occurs at constant rate γ, so that 1/γ is the average duration of infection.The population size is denoted by N , and we have in this case that N = S + I + R. The behavior of the simple SIR model is governed by the basic reproductive number, R 0 [2,12,31,14].This quantity equals the number of secondary infections caused by the introduction of a single infectious individual into an otherwise completely susceptible population.In terms of model parameters, the basic reproductive number is given by R 0 = β/γ.An outbreak of infection can only ever occur if R 0 is greater than one, otherwise the number of infectives can never increase.
We now combine the agricultural network model and the SIR model to produce a description of the potential spread of an SIR-type infection in our agricultural system.It is most convenient for us to work with the numbers of individuals of each type found at the various nodes of the network, and so we convert the concentrations of animals of the network model (9) into numbers.We write the number of individuals found at node i as N i , and so we have that N i (t) = N c i (t).
We expand the first three nodes of the network, (i.e., those describing the growers/sows, nurseries and finishers), by including an SIR model within each of them.
The infection statuses of the animals at the ith node are tracked by the quantities S i , I i and R i .Since N i is defined to be the total number of animals at this node we have that We make the following set of assumptions about the infection process and its impact upon the agricultural system: (1) Pigs are born healthy, but susceptible.Piglets are introduced into the network at the first node.(2) There is no infection or recovery during transport between the nodes.This is based on the idea that, in most cases, transportation takes no more than a couple of days, which is relatively short compared to the amount of time the pigs spend at each node.(3) There is no infection in the slaughter node.In the absence of human intervention this does not mean that the infectious animals are not processed: the assumption is that the infection is not propagated after the animal's death.(4) Infected pigs in node i recover at rate γ i .These recovery rates may differ between the nodes and, since the individuals found at different nodes will have different ages, these parameters depict age-dependent recovery rates.(5) Recovered pigs have temporary immunity; i.e., pigs do not immediately become susceptible after recovering from FMD.The rates at which recovered pigs become susceptible again are ρ i (i = 1, 2 or 3).While some diseases afford permanent immunity to the recovered animal, immunity acquired after FMD infection wanes in a matter of months.If vaccination of animals were considered, there would be an accumulation of animals in the recovered groups of the appropriate nodes.(6) We assume that the FMD-related death rate is small enough that we can ignore such deaths.Control strategies such as culling infective or susceptible animals could be modeled by including death terms.( 7) Since our network model assumes that the system is closed, we assume that any deaths are replenished by the introduction of piglets into the network.As already mentioned, these introductions occur into the first node.Consequently, as illustrated in the flowchart, Figure 9, the slaughtered animals that leave node four effectively return to the S 1 class.Similarly, if animal culling were being considered, the model would include flows from the appropriate classes back to S 1 .(8) No human intervention.In the model as presented here, we assume that humans do not make any adjustments to operation of the agricultural system in response to the infection: animal movement and processing continues as normal.Of course, one of the main aims of creating a model such as this is to enable the consideration of control strategies.In this study, we do not do this as we wish to first establish the baseline (no-control) behavior of the system.With this set of assumptions we obtain the following system of equations for the deterministic model of the aggregated agricultural network model with an SIR infection.
In many ways, this model resembles a standard multi-group epidemiological model, and so we might hope to find the basic reproductive number of the system using standard multi-group methodology [12,13].It is straightforward to find the next generation matrix, whose entries, t ij , give the average numbers of secondary infections that result in node i from the introduction of one infectious individual into node j when all individuals at node i are susceptible.For within-patch transmission, we have Here, the term 1/(γ i + k i (L i+1 − N i+1 ) + ) is the average duration of infection for infectives in node i, corrected for their departure on account of transportation.Movement of individuals between nodes reduces the average number of within-node secondary infections, with this effect being most noticeable if the node residence time is short compared to the duration of infection.For between-node transmission we have Here, the term k gives the probability that an infectious individual in node i is transported to node i + 1 before it recovers.
Multi-group theory reveals that, for an irreducible system (i.e., one in which infection can travel between any pair of nodes, possibly via intermediate nodes), the basic reproductive number is given by the dominant eigenvalue of this matrix.Our system is not irreducible, however, since infected animals can move only from a node to the following node: infection can spread to succeeding nodes but not preceding ones.Consequently, the standard definition of R 0 for multi-group systems is not helpful to us.From the t ij we can, however, easily calculate the average number of secondary infections caused by the introduction of a single infective into node j when all other individuals in the population are susceptible.These quantities equal t 11 + t 21 + t 31 , t 22 + t 32 , and t 33 , for nodes 1, 2, and 3, respectively.We now turn to numerical simulation of the model.Unfortunately, the existing studies in the literature do not provide us with an appropriate set of parameter values to use.Epidemiological parameters for the spread of FMD between various types of animals, including pigs, have been quantified [17,18,22,23,24,32,33,34,38,39,46] but on spatial scales that are quite different from our aggregated network description.On a large spatial scale, transmission between farms has been described, for instance during the 1967/68 and 2001 outbreaks in the UK, and parameter estimates obtained [22,23,24,32,33,34,46].Large-scale studies, however, take the individual unit of the model to be farms and so do not consider transmission between individual animals.
On a small spatial scale, detailed transmission experiments [17,18,39,38] have examined the spread of infection between small numbers of closely housed animals, either within or between pens.(Given the earlier comment regarding age-dependent epidemiological parameters, it is interesting to note that some of these experiments have been carried out on pigs of different ages.)These experiments, which typically involve placing one or more infected animals in close contact with a number of susceptible animals, demonstrate the high degree of infectiousness of FMD.In many instances every susceptible animal became infected [17].Instances in which all animals become infected provide less informative estimates of R 0 than might be hoped since the statistical methodology employed gives an infinite estimate for R 0 .An alternative statistical approach [18] accounts for the time dependence in the experiment and provides estimates of the transmission parameter β.
The literature provides us with satisfactory estimates for the average durations of infectiousness and immunity [28], but not R 0 (or, equivalently, β).For illustrative purposes, we shall take R 0 to equal 10 at each level of the network in the absence of transportation.All the simulations that follow assume that the network is initially in its demographic steady state.The epidemiological assumptions that we have made imply that this holds for all future times.(This would not be the case if we had deaths at the nursery or finisher nodes.)We choose to introduce infection by infecting a certain number of individuals in the initial node.
Our first simulation (Figure 10) illustrates the movement of a cohort of infectious individuals through the network in the absence of ongoing transmission.We set β 1 = β 2 = β 3 = 0, and introduce infection by placing 50, 000 infected piglets in the grower node.Of the piglets at the first node, 50, 000 are initially taken to be infectious, while the remainder of the population is susceptible.
In this simulation infected animals either recover or get transported to the next node.In the grower node, the number of infected animals simply decreases exponentially, while the number of recovered animals increases and then declines.Within a fairly short time period, the grower population is entirely replaced by susceptible individuals, reflecting the rapid turnover of the grower node.We see the appearance of infection first at the nursery and then at the finisher node.Since the system is at demographic equilibrium, we see no change at the slaughter node.
For our second simulation (Figure 11), we assume that, in the absence of transportation, the infection would have a basic reproductive number equal to 10 at each node.Consequently, we set β i /γ i = 10.As before, we take the initial population to be in demographic equilibrium, but we now introduce just 200 infective piglets into the grower node.The system rapidly approaches an equilibrium in which infectious animals are present at each node.The fraction of animals that are susceptible at equilibrium is much higher at the grower node (24.8%) than at either the nursery (6.9%) or finisher (7.9%).This should be expected, since all animals arriving at the grower are susceptible, while the infection statuses of those entering the nursery or finisher reflect the composition of the preceding node (i.e., grower and nursery nodes, respectively).For example, only 25% of the arrivees at the nursery are susceptible.The susceptible fraction at the finisher is slightly higher than at the nursery due to the replenishment of susceptibles by loss of immunity: this relatively slow process has a higher chance of occurring at the finisher since an animal's average residence time there is longer than at the nursery.
The differing susceptible fractions between the nodes mean that, relative to the size of the population of each node, there is less ongoing transmission at either the nursery or finisher node than at the grower.The equilibrium infective fraction 398BAI, BANKS, DEDIU, GOVAN, LAST, LLOYD, NGUYEN, OLUFSEN, REMPALA, SLENNING decreases as the supply chain is traversed (46.8%, 31.5% and 16.2% at the grower, nursery and finisher, respectively), while the recovered fraction increases (28.4%, 61.5% and 75.9%).
The impact of age-dependent transmission rates and the differing residence times is explored in our third simulation (Figure 12).Again only 200 infectives were introduced, but now we assume that younger animals are less infectious than older animals.The basic reproductive numbers, in the absence of transportation, at the grower, nursery and finisher nodes are taken to equal two, ten and fifteen, respectively.(We exaggerate the age-dependent differences in R 0 for illustrative purposes.)For the panel depicting the grower, note that susceptible numbers are plotted on a different scale (left vertical axis) from the infective and recovered numbers (right vertical axis), and these infective and recovered numbers are not in thousands.Also note that infection goes extinct in the grower node while a positive equilibrium level of infection is achieved at the nursery and finisher nodes.
Transportation has a major impact on the dynamics of the infection: the movement of individuals out of the grower node is sufficiently rapid that the infection cannot persist in this node.The number of infectives falls roughly exponentially, as does the number of recovereds, although the latter only occurs after an initial rise (which largely reflects the recovery of the initial pool of infectives).The dynamics in this node are somewhat reminiscent of those seen in the first simulation (in which there was no transmission), although they play out on a slower timescale since there is some ongoing transmission.
Disease transmission at the nursery and finisher nodes occurs sufficiently fast that the prevalence of infection approaches a positive, endemic, equilibrium at both.Observe that, even though the transmission parameter in the nursery node is the same as it was in the previous simulation, the equilibrium numbers of susceptibles and infectives are higher here than they were in the previous simulation.This reflects the differences between the compositions of the populations entering the nursery node in the two simulations, with more susceptibles arriving in the agedependent situation.
5. Concluding Remarks.In this paper we have demonstrated a methodological approach to the investigation of production networks and their vulnerability to disturbances such as diseases.The stochastic model and the resulting approximate deterministic system we employ were shown to agree well, but are not validated.Rather, we carry out simulations and sensitivity analyses with parameter values that are only loosely based on a swine network.We use the deterministic model to show how to determine the parameters to which the model at these parameter values exhibits the most sensitivity.Finally, we demonstrate how disease can be introduced and the resulting network vulnerabilities analyzed.An interesting next step would involve obtaining experimental data to validate and perhaps improve the model for a specific production network.This would require using inverse problem algorithms with the data to obtain estimates along with measures of associated uncertainty (e.g., standard errors [9,43]) for the underlying transition rates k i .
Other obvious questions for further investigation involve the transmission dynamics of the infection.It is well known, for example, that random effects can have a major impact on the invasion of an infection into a population.The use of constant rates of recovery and loss of immunity for the infection should also be questioned.These assumptions, which are biologically unrealistic for many infections, could be important in cases such as the one presented above where the life span of the animals is comparable in length to the duration of infection and immunity.Finally, the model used here assumes instantaneous transport between nodes.If infection during transport is an important factor (and depending on the disease it may well be) then the structure of the model should be modified to incorporate positive transport times.This could lead to more interesting (mathematically) and more difficult dynamical systems with time delays in place of ( 9) and the corresponding systems in Section 4.
The randomness seen in the stochastic network model originates from the random movement of discrete individuals from node to node.The analysis of Sections 2.2 and 3.2 shows that, due to an averaging effect, these random effects become less important as the system size N increases.Application of the stochastic transportation model to describe a real-world situation should, therefore, account for the size of the groups in which pigs are transported between nodes.If, for example, one thousand pigs were moved at a time, the appropriate notion of an "individual" within the model would be a thousand pigs.Treating each group of a thousand animals as a unit would lead to a marked increase in the magnitude of stochastic fluctuations seen at the population level.Consequently, even though the system size in our simulations is on the order of millions of pigs, it might be that the resulting stochastic fluctuations in a more realistic model of the production system are closer to those shown in Figure 3 than to those of Figure 2. The approach outlined in this paper has rather obvious potential for application to a wide range of problems.These include the investigation of the spread of diseases through spatially or structurally distributed dynamic populations (e.g., avian flu through migrating bird populations, contagious infections through human and animal populations that are highly mobile, highly age-structured, or both).In some of these cases the natural nodal structure would be a continuum, requiring stochastic and deterministic models with a continuum of spatial and structural heterogeneities, leading to partial differential equation systems.Such applications would undoubtedly motivate the development of interesting new stochastic and deterministic mathematical and computational methodologies.
We also note that the approach and methodology presented here are useful for investigation of a wide range of perturbations other than disease (e.g., loss of capacity at a given node such as a factory being shut down for some reason) in supply networks.In particular they would be useful in the assessment of risk of a foodborne pathogen (e.g., salmonella, listeria, etc.) entering the food chain [27] due to contamination (either accidental or deliberate) at some stage of the supply chain.

Figure 9 .
Figure 9. Flow diagram of the aggregated agricultural network model with SIR infection.

Figure 10 .
Figure 10.Simulation I: Recovery and movement of infectious individuals through the network, in the absence of transmission.Notice that susceptible numbers are plotted on different scales (left vertical axis on each panel) from infective and recovered numbers (right vertical axis on each panel).Parameter values are given in the text: each β i is set equal to zero, infection is assumed to last 31 days and immunity lasts 180 days.The initial conditions are N 1 = 315, 000, N 2 = 735, 000, N 3 = 2, 100, 000 and N 4 = 15, 000.Of the piglets at the first node, 50, 000 are initially taken to be infectious, while the remainder of the population is susceptible.

Figure 11 .
Figure 11.Simulation II: Dynamics following the introduction of infection, with R 0 = 10.This value of the basic reproductive number determines the value of the transmission parameters (β i ).Initial conditions and other parameter values are as in the previous figure, except that only 200 of the 315,000 individuals at the grower node are infective at the initial time.

Figure 12 .
Figure 12.Simulation III.Age-dependent transmission of infection.If there were no transportation of animals, the basic reproductive numbers at the grower, nursery and finisher nodes would equal 2, 10 and 15, respectively.These values of R 0 determine the parameters β 1 , β 2 and β 3 .All other parameter values and initial conditions are as in the previous figure.For the panel depicting the grower, note that susceptible numbers are plotted on a different scale (left vertical axis) from the infective and recovered numbers (right vertical axis), and these infective and recovered numbers are not in thousands.Also note that infection goes extinct in the grower node while a positive equilibrium level of infection is achieved at the nursery and finisher nodes.

Table 1 .
Network parameters based on swine production in NC.
[3,21]ochastic and Deterministic Models.We model the evolution of the food production network shown in Figure1as a continuous time discrete state density dependent jump Markov Chain (MC)[3,21]with a discrete state space embedded in an R 4 non-negative integer lattice L. The state of this MC at time t is denoted by X(t) = (X 1 (t), . . ., X 4 (t)), where X i (t) is the number of pigs at node i at time t, i = 1, . . ., 4.

Table 2 .
Aggregated agricultural network model: Parameters for stochastic simulations, together with our chosen initial conditions.All numbers of pigs are given in thousands here.

Table 3 .
Aggregated agricultural network model: Rescaled parameter values and initial conditions for the deterministic model.

Table 6 .
Parameter estimates for the transition rates κ 1 , κ 2 , κ 3 and κ 4 with different data sets.in the accuracy of the parameter estimates is gained by doubling the number of data points in the interval [80, 210].Moreover, when we try to estimate the parameters κ with data from the interval [80, 210] alone (see DS 8 , DS 9 and DS 10 396BAI, BANKS, DEDIU, GOVAN, LAST, LLOYD, NGUYEN, OLUFSEN, REMPALA, SLENNING