A storage model with random release rate for modeling exposure to food contaminants



Introduction
Certain foods may contain varying amounts of chemicals such as methyl mercury (present in seafood), dioxins (in poultry, meat) or mycotoxins (in cereals, dried fruits, etc.), which may cause major health problems when accumulating inside the body in excessive doses. Food safety is now a crucial public health stake in many countries (as an example, it is a thematic top priority of the 7th European Research Framework program, see http://ec.europa.eu/research/fp7/). This topic naturally interfaces with various disciplines, such as biology, nutritional medicine, toxicology and of course applied mathematics, with the aim of developing rigorous methods for quantitative risk assessment. A scientific literature devoted to probabilistic and statistical methods for the study of dietary exposure to food contaminants is progressively carving out a place in applied probability and statistics journals (see [58], [24], [30] or [9] for instance). Static viewpoints for the probabilistic modeling of the quantity X of a given food contaminant ingested over a short period have been considered in recent works, mainly focusing on the tail behavior of X and allowing for computation of the probability that X exceeds a maximum tolerable dose (see [8], [57]). However, as highlighted in [62], such approaches to food risk analysis do not take into account the accumulating and eliminating processes occurring in the body, which naturally requires introducing time as a crucial description parameter of a comprehensive model (see also the discussion in [27]).
This paper aims at proposing a dynamic model of exposure to a given food contaminant, incorporating important features of the phenomenon, in particular so that the model may account for the contaminant pharmacokinetics in man following intakes. The case of methyl mercury food contamination shall serve as a running illustration of the concepts and methods studied in this article: mathematical modeling of the pharmacokinetic behavior in man of methyl mercury (essentially present in seafood) has received increasing attention in the toxicology literature (see [45], [56], [55], [1] or [26]) and dose-response relationships have been extensively investigated for this contaminant, clearly establishing its negative impact on human health (refer to [15], [18]). In our model, the amount of contaminant present in the body evolves through its accumulation after repeated intakes (food consumption) and according to the pharmacokinetics governing its elimination/excretion, so that its temporal evolution is described by a piecewise-deterministic Markov process (PDM process in abbreviated form): the accumulation process is modeled by a marked point process in a standard fashion, while the elimination phenomenon is described by a differential equation with random coefficients, the randomness accounting for the variability of the rate at which the total contaminant body burden decreases between intakes due to metabolic factors. Such a process slightly extends the storage models with general release rules widely used in operations research and engineering for dealing with problems such as water storage in dams, in that the (content-dependent) release rate is here allowed to be random, as strongly advocated by biological modeling, and inter-intake times are not required to be exponentially distributed (the choice of a memoryless distribution being totally inadequate in the dietary context). Having in view the practical use of the proposed exposure model for public health guidance (see [62]), we also discuss its relation to available data in the present paper: as sample paths of the exposure process cannot be observed in general, we set theoretical grounds for practical inference techniques based on intensive computer simulation methods. A thorough statistical analysis of toxicological and intake data based on the concepts and results developed in this article is carried out in a forthcoming companion paper (see [7]).
The outline of the paper is as follows. In section 2, a class of stochastic models with a reasonably simple (markovian) structure for describing the evolution through time of food contaminant exposure is introduced. In the important case when the (random) elimination rate is linear (a feature strongly motivated by previous works on kinetics modeling), theoretical properties of the exposure process are thoroughly investigated in section 3. Turning to the problem of estimating the steady-state or time-dependent features of the model that are relevant from the toxicology viewpoint, and taking account of the fact that the exposure process is not observable in practice, statistical procedures relying on simulation methods are presented and studied in section 4. Finally, empirical studies related to methyl mercury food contamination are carried out in section 5, with the aim of illustrating the relevance of the modeling and the statistical methods studied in this paper. Technical proofs are postponed to the Appendix.
2 Modeling the exposure to a food contaminant

Suppose that an exhaustive list of P types of food, indexed by p = 1, ..., P, involved in the alimentation of a given population and possibly contaminated by a certain chemical, is drawn up. Regarding the chemical of interest, each type of food p ∈ {1, ..., P} is contaminated in random ratio K^(p), with probability distribution F_{K^(p)} on R_+, the set of positive real numbers (we shall denote by R*_+ the set of strictly positive real numbers). Concerning this specific contaminant, a meal may be viewed as a realization of a r.v. Q = (Q^(1), ..., Q^(P)) indicating the quantity of food of each type consumed, normalized by the body weight. For a meal Q drawn from a distribution F_Q on R^P_+, cooked from foods whose toxicity is described by a contamination ratio vector K = (K^(1), ..., K^(P)) drawn from the tensor product of distributions F_K = F_{K^(1)} ⊗ ... ⊗ F_{K^(P)}, the corresponding contaminant intake is U = ⟨K, Q⟩, denoting by ⟨., .⟩ the standard inner product on R^P. Its probability distribution F_U is the image of F_K ⊗ F_Q by the inner product ⟨., .⟩, assuming that the quantities of food consumed are independent from the contamination levels. Here and throughout, we suppose that the contaminant intake distribution F_U has a density f_U with respect to λ, the Lebesgue measure on R_+. By convention, T_0 = 0 is chosen as time origin. The food contamination phenomenon through time for an individual of the population of interest may be classically modeled by a marked point process {(T_n, Q_n, K_n)}_{n≥1} on R_+ × R^P_+ × R^P_+, the T_n's being the successive times at which the individual consumes foods from the list {1, ..., P} and the marks (Q_n, K_n) being respectively the vector of food quantities and the vector of contamination ratios related to the meal had at time T_n, n ≥ 1 (refer to [21] for a recent account of the theory of point processes and its numerous applications). The process {(T_n, Q_n)}_{n≥1} describing dietary behavior is assumed independent from the sequence (K_n)_{n≥1} of chemical contamination ratios. Although the modeling of dietary behaviors could certainly give rise to a huge variety of models, depending on the dependence structure between (T_n, Q_n) and the past values {(T_m, Q_m)}_{m<n} that one stipulates, we make here the simplifying assumption that the marks Q_n, n ≥ 1, form a sequence of independent and identically distributed random variables (in the sequel we shall use the abbreviated form "i.i.d. r.v.'s"), with common distribution F_Q independent from the intake times (T_n)_{n≥1}. This assumption is acceptable for chemicals present in a few types of food only, such as methyl mercury, our running example, but certainly not for all contaminants. For chemicals present in many foods of everyday consumption, such as Ochratoxin A (present in cereals, coffee, etc.), it would be necessary to introduce an additional autoregressive structure in the model for capturing important features of any realistic diet (the consumption of certain foods being typically alternated for reasons related to taste or nutritional aspects). Such a modeling task is beyond the scope of the present paper and is left for further investigation. Finally, we suppose that the inter-intake times ∆T_{n+1} = T_{n+1} − T_n, n ≥ 1, are i.i.d. with common probability distribution G(dt) = g(t)dt and finite expectation m_G = ∫_0^∞ t G(dt) < ∞, the sequence (T_n)_{n≥1} of intake times being thus a pure renewal process.
Contamination sources other than dietary are neglected in the present study, and we denote by X(t) the total body burden in contaminant at time t ≥ 0. Between intakes, we assume that the contamination exposure process X(t) is governed by the differential equation

ẋ(t) = −r(x(t), θ),   (2)

ẋ denoting x's temporal derivative and θ being a random parameter, taking its values in a set Θ ⊂ R^d with d ≥ 1 say, and accounting in the modeling for fluctuations of the elimination rate due to metabolic factors at the intake times (the successive values θ_n, n ∈ N, of θ are thus fixed at times T_0, T_1, ..., see Remark 2 below). The function r(x, θ) is assumed to be strictly positive and continuous on R*_+ × Θ, such that for all θ ∈ Θ, r(0, θ) = 0 (so that when X(t) eventually reaches the level 0, the process stays at this level until the next intake), and to satisfy the integrability condition (3) for all (ǫ, θ) ∈ (0, 1) × Θ. Under these conditions, for any initial value x(0) ≥ 0 and metabolic parameter value θ ∈ Θ, Eq. (2) clearly has a unique solution.
Other approaches may naturally be adopted for describing the elimination phenomenon occurring between intakes. For instance, toxicokinetic models based on stochastic differential equations or decreasing jump processes (as in inventory modeling) could pertinently be considered for this purpose.
Remark 1 (On pharmacokinetics modeling) In toxicology, Eq. (2) is widely used with r(x, θ) = θx for modeling the kinetics in man of certain contaminants following intakes. As shown by many pharmacokinetics studies, there is considerable empirical evidence that it properly describes the way the elimination rate depends on the total body burden of the chemical in numerous cases (see [12], [54] or [29] for further details on linear pharmacokinetics models, also referred to as first-order kinetics models). In this context, the release parameter log 2/θ is known as the biological half-life of the contaminant (the time required for X to decrease by half in the organism in the absence of new contaminant intake). For methyl mercury (MeHg), our running example in this paper, the half-life is known to fluctuate around six weeks (see [56] and the references therein). For such dietary contaminants, whose biological half-life is measured in weeks rather than days, it is naturally essential to take account of the kinetics in man for successful modeling of the exposure phenomenon.
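To fix ideas, the first-order kinetics and the half-life relation log 2/θ can be sketched numerically. This is a minimal illustration only: the six-week MeHg half-life is taken from the remark above, while the time unit (days) is our own choice.

```python
import math

# First-order elimination kinetics: x'(t) = -theta * x(t), hence
# x(t) = x(0) * exp(-theta * t) and the biological half-life is log(2) / theta.
HALF_LIFE_DAYS = 6 * 7                # roughly six weeks (the MeHg running example)
theta = math.log(2) / HALF_LIFE_DAYS  # elimination rate, per day

def body_burden(x0, t):
    """Body burden t days after the last intake, starting from x0."""
    return x0 * math.exp(-theta * t)

print(body_burden(1.0, HALF_LIFE_DAYS))      # one half-life: burden halved
print(body_burden(1.0, 2 * HALF_LIFE_DAYS))  # two half-lives: burden quartered
```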
Remark 2 (On modeling fluctuations in the metabolic rate) In a mathematical model for the evolution of food contaminant exposure through time, incorporating a certain amount of randomness in the elimination process due to possible metabolism changes may certainly help make the modeling more plausible. In this first attempt, mainly motivated by the dietary methyl mercury case, we chose to vary the metabolic parameter at each intake time, on grounds of parsimony: indeed, the data collected generally consist of observed half-lives in a sample of individuals only (see [35,28] for instance) and, to our knowledge, no quantitative study of the frequency of changes in metabolism (regarding chemical elimination) has been carried out yet. However, our marked point process based model could easily be extended by considering competing risks between intake times and times when the metabolism changes. As pointed out by the referee, another possible approach for modeling the fluctuations of θ could consist in using a stochastic differential equation (based on a geometric Brownian motion for instance).
We assume that (θ_n)_{n∈N} is an i.i.d. sequence with common distribution H(dθ). For a given value of the metabolic parameter θ ∈ Θ, the time necessary for the body burden (without further intake) to decrease from x_0 > 0 to x ∈ (0, x_0) is given by τ_θ(x_0, x) = ∫_x^{x_0} dy / r(y, θ). Under these assumptions, we clearly have that H({τ_θ(x_0, x) < ∞}) = 1 for all 0 < x ≤ x_0. The contaminant may thus be entirely eliminated from the body (the amount X then reaching the level 0) with probability one in the sole case when the following condition, denoted (C_1), holds.
In such a case we would also have H({τ_θ(x, 0) < ∞}) = 1 for all x ≥ 0. In this respect, it is noteworthy that, in the linear case mentioned in Remark 1, we have τ_θ(x, 0) = ∞ for all θ > 0 and x > 0, meaning that the process cannot reach the level 0 in finite time (in contrast with pharmacokinetics models based on affine rates r(x) = a + b·x with a > 0 for instance). However, determining whether the chemical may be entirely removed from the body or not is a purely mathematical concern, owing to the limits of detection (LOD) inherent to analytical measurement techniques (see [33]). Hence, between intake times and given the current value of the metabolic parameter θ, the exposure process moves in a deterministic fashion according to (2), and has the same (upward) jumps as the process of cumulative intakes B(t) = Σ_{n≥1} U_n I_{T_n ≤ t}; we denote by N(t) = Σ_{n∈N} I_{T_n ≤ t} the number of intakes until time t and by I_E the indicator function of any event E. The exposure process X is piecewise deterministic with càd-làg trajectories (see a typical sample path in Fig. 1) and satisfies equation (4), X(0) denoting the total body burden in contaminant at initial time T_0 = 0 and with a ∧ b = min(a, b) for all (a, b) ∈ R². For an account of such piecewise-deterministic processes, one may refer to [23] (see also [22]; ergodic results may be found in [17]).
For the continuous-time process thus defined to be markovian, one has to record the current value θ(t) = Σ_{n∈N} θ_n I_{t ∈ [T_n, T_{n+1})} of the metabolic parameter as well as the backward recurrence time A(t) = t − T_{N(t)} (the time elapsed since the last intake). By construction, the process (X(t), θ(t), A(t))_{t≥0} is strongly markovian with infinitesimal generator (5), denoting by ζ(t) = g(t) / ∫_t^∞ g(s) ds the hazard rate of the inter-intake times and provided that φ(., θ, .): (x, t) → φ(x, θ, t) is a bounded function with bounded continuous first derivatives in x and t for all θ ∈ Θ (one may refer to [2] for an account of key notions of the theory of stochastic processes, oriented towards biological applications).
In the above setting, the time origin T_0 = 0 does not necessarily correspond to an intake time. Given the time A(0) = a since the last intake at time t = 0, we let ∆T_1 have the density g_a(t) = g(a + t) / ∫_a^∞ g(s) ds, making the renewal process (∆T_n)_{n∈N} possibly delayed, except naturally in the case when the inter-intake distribution G is exponential. However, the choice of such a memoryless distribution in the dietary context is clearly not pertinent, distributions with increasing hazard rate being much more adequate (see section 5). Here and throughout we denote by P_{x,a} the probability measure on the underlying space such that (X(0), A(0)) = (x, a) and θ(0) ∼ H, and by E_{x,a}[.] the corresponding expectation, for all x ≥ 0 and a in supp(G), the support of the distribution G.

Figure 1: Sample path of the exposure process X, modeling the evolution of the total body burden of a given dietary contaminant through time.
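The piecewise-deterministic dynamics just described are straightforward to simulate. The sketch below uses the linear release rate r(x, θ) = θx of Remark 1; the Gamma, exponential and lognormal distributions standing in for G, F_U and H are illustrative placeholders, not fitted to any data set.

```python
import math
import random

random.seed(0)

def simulate_exposure(x0=0.0, horizon=365.0):
    """Post-intake values (T_n, X(T_n)) of the exposure process up to the horizon (days)."""
    t, x = 0.0, x0
    path = [(t, x)]
    while True:
        # Metabolic rate theta_n, fixed at each intake time (half-life ~6 weeks on average).
        theta = random.lognormvariate(math.log(math.log(2) / 42.0), 0.2)
        dt = random.gammavariate(4.0, 0.25)   # inter-intake time: increasing hazard rate
        if t + dt > horizon:
            break
        u = random.expovariate(10.0)          # contaminant intake U_{n+1} at the meal
        x = x * math.exp(-theta * dt) + u     # deterministic decay, then upward jump
        t += dt
        path.append((t, x))
    return path

path = simulate_exposure()
print(len(path), path[-1])
```

Plotting such a trajectory (with the exponential decay drawn between the jump points) reproduces the saw-tooth shape of the sample path in Fig. 1.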
In the case when one neglects variability in the elimination process (i.e. when H is a Dirac measure) and under the additional assumption that the renewal times are exponentially distributed (making the process X itself markovian, which greatly facilitates the study but is not relevant to our application, as emphasized above), this modeling boils down to a standard storage model with a general release rate (see [14] and [13] for instance). We refer to Chapter XIV in [4] for an account of such processes, widely used in operations research for modeling queuing/storage systems. Basic communication properties of the stochastic process X = (X(t))_{t≥0} may be established in a fashion very similar to that of the latter processes. They are summarized in the next result (whose proof is omitted, since it is a slight modification of the proof of Proposition 1.2 in Chap. XIV of [4]).
Theorem 1 Suppose that G(dx) = g(x)dx has infinite tail (so that arbitrarily long inter-intake times may happen with positive probability). Assume further that either g(x) > 0 on ]0, ǫ] for some ǫ > 0 or else that F_U has infinite tail (i.e. with positive probability, either intake times may be arbitrarily close together or else intakes may be arbitrarily large). Then X reaches any state x > 0 in finite time with positive probability whatever the starting point, i.e. for all x_0 ≥ 0, a ∈ supp(G), we have

P_{x_0,a}({τ_x < ∞}) > 0,   (6)

with τ_x = inf{t ≥ 0 : X(t) = x} the (random) time needed for X to reach the level x. Furthermore, if condition (C_1) is fulfilled, then (6) still holds for x = 0. Besides, either X "goes to infinity" with probability one, i.e. is such that P_{x_0,a}({X(t) → ∞, as t → ∞}) = 1 for all x_0 ≥ 0, or else X reaches any state x > 0 in finite time with probability one whatever the starting point, i.e. for all x_0 ≥ 0, a ∈ supp(G),

P_{x_0,a}({τ_x < ∞}) = 1.   (7)

If (C_1) is satisfied, then (7) also holds for x = 0.
An important task is to find conditions ensuring that the limiting behavior of the exposure process X is represented by a stationary probability measure µ describing the equilibrium state to which the process settles as time goes to infinity. In particular, time averages over long periods, such as the mean time spent by the exposure process X above a possibly critical threshold u > 0, T^{−1} ∫_0^T I_{X(t) ≥ u} dt, for instance, are then asymptotically described by the distribution µ. Computing/estimating steady-state quantities would then be relevant for summarizing the exposure phenomenon in the long run and assessing the long-term toxicological risk. Beyond stochastic stability properties, determining the tail behavior of the steady-state distribution and evaluating the rate at which the exposure process converges to the stationary state are also of critical importance in practice. These questions are thoroughly investigated for linear pharmacokinetics models in the next section.

3 Probabilistic study in the 'linear kinetics' case
We now focus on the ergodicity properties of the exposure process X(t) in the specific case when, for a given metabolic state described by a real parameter θ, the elimination rate is proportional to the total body burden in contaminant, i.e. r(x, θ) = θx. Here we suppose that Θ is a subset of R*_+, ensuring that (3) is satisfied. As mentioned before, the linear case is of crucial importance in toxicology, insofar as it suitably models the pharmacokinetic behavior in man of numerous chemicals (see [29]). We shall show that studying the long-term behavior of X boils down to investigating the properties of the embedded Markov chain X = (X_n)_{n≥1}, whose values correspond to those taken by the exposure process just after intake times: X_n = X(T_n) for all n ≥ 1. By construction, the chain X satisfies the autoregressive equation with random coefficients

X_{n+1} = X_n exp(−θ_n ∆T_{n+1}) + U_{n+1},   (8)

and has transition probability Π(x, dy) = π(x, y)dy, (x, y) ∈ R*_+ × R*_+; here and in what follows, a ∨ b = max(a, b). Ergodicity of such real-valued Markov chains Y, defined through stochastic recurrence equations of the form Y_{n+1} = α_n Y_n + β_n, where {(α_n, β_n)}_{n∈N} is a sequence of i.i.d. pairs of positive r.v.'s, has been extensively studied in the literature, such models being widely used in financial or insurance mathematics (see section 8.4 in [25] for instance). Specialized to our setting, well-known results related to such processes make it possible to show that the embedded chain X is positive recurrent² under the assumption that log(1 ∨ U_1) has finite expectation (a very plausible hypothesis in the dietary context), as stated in the next theorem, and then to specify the tail behavior of the limiting probability distribution. Furthermore, the simple autoregressive form of Eq.
(8) makes Foster-Lyapunov conditions easily verifiable for such Markov chains, refining their stochastic stability analysis (we refer to [44] for an account of these key notions of Markov chain theory).
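As a sanity check, the embedded chain of Eq. (8) is easy to iterate numerically. The distributions below are illustrative placeholders, chosen so that E[log(1 ∨ U_1)] < ∞ clearly holds; they are not fitted to any data set.

```python
import math
import random

random.seed(1)

def embedded_chain(x0, n_steps):
    """Iterate X_{n+1} = X_n * exp(-theta_n * dT_{n+1}) + U_{n+1}."""
    x, traj = x0, [x0]
    for _ in range(n_steps):
        theta = random.lognormvariate(math.log(math.log(2) / 42.0), 0.2)
        dt = random.gammavariate(4.0, 0.25)
        u = random.expovariate(10.0)
        x = x * math.exp(-theta * dt) + u
        traj.append(x)
    return traj

# Positive recurrence in action: the long-run average along one trajectory
# stabilizes, approximating the mean of the stationary distribution.
traj = embedded_chain(0.0, 100_000)
tail = traj[10_000:]
print(sum(tail) / len(tail))
```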
Theorem 2 Under the assumptions of Theorem 1, the chain X is λ-irreducible³. Moreover, suppose that condition (H1) holds. Then X is positive recurrent with stationary probability distribution μ. If one assumes further that f_U is continuous and strictly positive on R_+ and that the moment condition (H2) holds for some γ > 0, then X is geometrically ergodic: μ has finite moment of order γ and there exist constants R < ∞ and r > 1 such that, for all n ≥ 1 and x > 0, the n-th iterate Π^n(x, ·) of the transition kernel converges to μ at the geometric rate r^{−n} (up to the constant factor R), with μ(ψ) = ∫_0^∞ ψ(y) μ(dy) for any μ-integrable function ψ. Suppose finally that condition (H1) and the next one simultaneously hold:

(H3) The r.v. U_1 is regularly varying with index κ > 0, i.e. for all t > 0, P(U_1 > tx)/P(U_1 > x) → t^{−κ} as x → ∞.

Then the stationary law μ has a regularly varying tail with index κ.

² Recall that a Markov chain Y = (Y_n)_{n∈N} with state space (E, ℰ) is positive recurrent if there exists a unique probability distribution µ on E that is invariant for its transition kernel Π (i.e. µ(dy) = ∫_{x∈E} µ(dx) Π(x, dy)), making Y stationary (µ is then referred to as Y's stationary distribution).

³ A Markov chain Y = (Y_n)_{n∈N} with state space (E, ℰ) and transition kernel Π(x, dy) is said to be ψ-irreducible, ψ being a σ-finite measure on E, if, for all A ∈ ℰ weighted by ψ, Y visits the set A in finite time with positive probability whatever its starting point, i.e. ψ(A) > 0 implies P(τ_A < ∞ | Y_0 = y) > 0 for all y ∈ E.
Remark 3 (On the tail behavior assumption for the intake distribution) The relevance of the regular variation assumption for modeling the tail behavior of dietary contaminant intakes related to certain chemicals is strongly supported in [57] and [8]. In these works, various estimation strategies for tail distribution features, such as the Pareto index κ involved in (H3), are also proposed and implemented on several food contamination and consumption data sets. We refer to [25] for an excellent account of such notions arising in extreme value theory and of techniques for modeling extremal events.
As pointed out in [43], stochastic stability analysis based on drift criteria in the continuous-time setting is not as straightforward as in the discrete-time case, generally owing to the complex form of the generator and of candidate test functions. Fortunately, given the explicit relationship between X(t) and the embedded discrete-time chain in our specific case, the ergodicity of the continuous-time model and the tail properties of its limiting distribution may be investigated based on the results established above under mild conditions. Moreover, under a more restrictive moment condition on the inter-intake distribution, denoted (H4) below, a simple test function for which the generator (5) satisfies a geometric drift condition may nevertheless be exhibited, so as to establish geometric ergodicity for the Markov process {(X(t), θ(t), A(t))}_{t≥0}.
Theorem 3 Under the assumptions of Theorem 1 and supposing that (H1) is fulfilled, X(t) has an absolutely continuous limiting probability distribution µ, in the sense that T^{−1} ∫_0^T I_{X(t) ≤ u} dt → µ([0, u]), P_{x_0,a}-a.s., as T → ∞, for all x_0 ≥ 0 and a ∈ supp(G). Furthermore,
• if (H3) holds and the set Θ is bounded, then µ is regularly varying with the same index as F_U;
• if (H2) and (H4) hold, then {(X(t), θ(t), A(t))}_{t≥0} is geometrically recurrent; in particular, µ has finite moment of order γ.

Remark 4 (On the tail behavior of the stationary distribution) When the U_n's are heavy-tailed, and under the assumption that the ∆T_n's are exponentially distributed (making B(t) a time-homogeneous Lévy process), the fact that the stationary distribution µ inherits its tail behavior from F_U has been established in [3] for deterministic release rates. Besides, when assuming G exponential and θ fixed, one may exactly identify the limit distribution µ in some specific cases (see section 8 in [13] or section 2 in Chap. XIV of [4]) using basic level-crossing arguments (X being itself markovian in this case). If F_U is also exponential for instance, µ is a Gamma distribution.
Remark 5 (On practical relevance of steady-state features) Now that it is established that the exposure process settles to an equilibrium regime after a 'certain time', the question of specifying precisely what is meant by 'certain time' naturally arises.
It would be pertinent to describe the long-run behavior of the exposure process and assess the long-term toxicological risk by computing steady-state characteristics only if the time necessary to approximately reach the equilibrium state may be considered a reasonable horizon at the human scale. As an illustration, the amount of time needed to be roughly in steady state is evaluated through simulation in section 5, from a collection of datasets related to dietary MeHg contamination.
In order to exhibit connections between the exposure process X = (X(t))_{t≥0} and possible negative effects of the chemical on human health, it is appropriate to consider simple characteristics of the process X, easily interpretable from an epidemiology viewpoint. In this respect, the mean exposure over a long time period, T^{−1} ∫_0^T X(t) dt, is one of the most relevant features. Its asymptotic behavior is refined in the next result.
Proposition 4 Under the assumptions of Theorem 1 and supposing that (H2) is fulfilled for γ = 1, we have for all (x_0, a) ∈ R_+ × supp(G), P_{x_0,a}-a.s., as T → ∞,

T^{−1} ∫_0^T X(t) dt → m_µ = ∫_0^∞ x µ(dx).   (13)

If (H2) is fulfilled with γ ≥ 2, then there exists a constant 0 < σ² < ∞ such that for all (x_0, a) ∈ R_+ × supp(G) we have the convergence in P_{x_0,a}-distribution

√T (T^{−1} ∫_0^T X(t) dt − m_µ) ⇒ N(0, σ²), as T → ∞.   (14)

Remark 6 (On the limiting variance of the sample mean exposure value) As will be shown in the proof, the asymptotic variance σ² in (14) may be related to the limiting behavior of a certain additive functional of the Markov chain ((X_n, θ_n, ∆T_{n+1}))_{n≥1}.
In [5] (see also [6]), an estimator of the asymptotic variance of such functionals based on pseudo-renewal properties of the underlying chain (namely, on renewal properties of a Nummelin extension of the chain) has been proposed and a detailed study of its asymptotic properties has been carried out.
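The time average in Proposition 4 can be approximated along a single simulated trajectory: between intakes, X(t) decays as x e^{−θ(t−T_n)}, so each inter-intake segment contributes x(1 − e^{−θ∆T})/θ to the integral in closed form, with no numerical quadrature needed. A sketch under the same illustrative placeholder distributions as before:

```python
import math
import random

random.seed(2)

def mean_exposure(horizon=20_000.0):
    """Monte Carlo approximation of T^{-1} * integral of X(t) dt over [0, T]."""
    t, x, integral = 0.0, 0.0, 0.0
    while t < horizon:
        theta = random.lognormvariate(math.log(math.log(2) / 42.0), 0.2)
        dt = min(random.gammavariate(4.0, 0.25), horizon - t)
        # Closed-form integral of x * exp(-theta * s) over a segment of length dt.
        integral += x * (1.0 - math.exp(-theta * dt)) / theta
        x = x * math.exp(-theta * dt) + random.expovariate(10.0)
        t += dt
    return integral / horizon

print(mean_exposure())
```

By (13), this sample average converges to the stationary mean m_µ as the horizon grows.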
Beyond the asymptotic exposure mean or the asymptotic mean time spent by X above a certain threshold, other summary characteristics of the exposure process could be pertinently considered from an epidemiology viewpoint, among which the asymptotic tail conditional expectation E µ [X | X > u], denoting by E µ [.] the expectation with respect to µ, after the fashion of risk evaluation in mathematical finance or insurance (see also [62]).

4 Simulation-based statistical inference
We now consider the statistical issues one faces when attempting to estimate certain features of linear-rate exposure models. The main difficulty lies in the fact that the exposure process X is generally unobservable. Food consumption data (quantities of consumed food and consumption times) related to a single individual over long time periods are scarcely available in practice, and performing measurements at all consumption times so as to record the food contamination levels is hardly feasible. Instead, practitioners have at their disposal massive databases in which information related to the dietary habits of large population samples over short periods of time is gathered. Besides, contamination data concerning certain chemicals and types of food are stored in data warehouses and available for statistical purposes. Finally, experiments for assessing models accounting for the pharmacokinetic behavior in man of various chemicals have been carried out, so that data permitting to fit values or probability distributions for the parameters of these models are available.
Estimation of steady-state or time-dependent features of the law L_X of the process X given the starting point (X(0), A(0)) = (x_0, a) ∈ R_+ × supp(G) could thus be based on the preliminary computation of consistent estimates Ĝ, F̂_U and Ĥ of the unknown distribution functions G, F_U and H. Hence, when no closed-form analytic expression for the quantity of interest is available from (G, F_U, H), ruling out the possibility of computing plug-in estimates, a feasible method could consist in simulating sample paths, starting from (x_0, a), of the approximate process X̂ with law L_X̂ corresponding to the estimated distribution functions (Ĝ, F̂_U, Ĥ), and constructing estimators based on the trajectories thus obtained.
Beyond stochastic modeling of the exposure phenomenon, the main goal of this paper is to provide theoretical grounds for the application of such statistical methods in toxicological risk evaluation. This leads us to investigate the stability of the stochastic model described in section 2 with respect to G, F_U and H (stability analysis may be viewed as the counterpart of sensitivity analysis in a probabilistic framework; refer to [46] for an account of this topic), and to consider the continuity problem of evaluating a measure of closeness between L_X̂ and L_X that makes the mapping L_X → Q(X) continuous, Q being the functional of the trajectory of interest, i.e. a certain mapping defined on the Skorohod space, the set of càd-làg functions x : R_+ → R. Convergence preservation results may then be obtained via the continuous-mapping approach as described in [63], where it is applied to establish stochastic-process limits for queuing systems. For simplicity's sake, we take a = 0 in the following study and do not consider the stability issue related to the approximation of the starting point (X(0), A(0)); straightforward modifications of the argument below permit to deal with the latter problem. For notational convenience, we omit the index (x_0, 0) on the probabilities and expectations considered in the sequel.
Let 0 < T < ∞. Considering the geometry of the (càd-làg) sample paths of the exposure process X (see Fig. 1), we use the M_2 topology on the Skorohod space D_T = D([0, T], R) induced by the Hausdorff distance on the space of completed graphs (the completed graph of x ∈ D_T being obtained by connecting (t, x(t−)) to (t, x(t)) with a line segment at all discontinuity points), allowing trajectories to be close even if their jumps do not exactly match (the J_2 topology would actually be sufficient for our purpose; refer to [36] or [63] for an account of topological concepts for sets of stochastic processes). In order to evaluate how close the approximating and true laws are, we shall establish an upper bound for the L_1-Wasserstein-Kantorovich distance between the distributions L_X̂(T) and L_X(T) of X̂(T) = (X̂(t))_{t∈[0,T]} and X(T) = (X(t))_{t∈[0,T]}, the metric on the space of probability laws on D_T being defined by (refer to [47], [10])

W_1^{(T)}(L′, L) = inf E[m(Z′, Z)],

where the infimum is taken over all pairs (Z′, Z) with marginals L′ and L, and m denotes the M_2 metric on D_T. It is well known that convergence in this metric implies weak convergence (see [10]).
As claimed in the next theorem, the law L_X̂(T) gets closer and closer to L_X(T) as the distribution functions Ĝ, F̂_U and Ĥ tend respectively to G, F_U and H in the Mallows sense.
Recall that the L_1-Mallows distance between two distribution functions F_1 and F_2 on the real line is defined as d_1(F_1, F_2) = ∫_0^1 |F_1^{−1}(u) − F_2^{−1}(u)| du, F^{−1} denoting the generalized inverse of F.
Theorem 5 Let (G, F_U, H) (respectively, (Ĝ^{(n)}, F̂_U^{(n)}, Ĥ^{(n)}) for n ∈ N) be a triplet of distribution functions on R_+ defining a linear exposure process X (respectively, X̂^{(n)}) starting from x_0 ≥ 0 and fulfilling Theorem 1's assumptions and (H2) with γ = 1. Suppose that (Ĝ^{(n)}, F̂_U^{(n)}, Ĥ^{(n)}) tends to (G, F_U, H) in the L_1-Mallows distance as n → ∞. Assume further that G (respectively, Ĝ^{(n)}) has finite variance σ²_G (resp., σ²_{Ĝ^{(n)}}) and that H (resp., Ĥ^{(n)}) has finite mean m_H. If σ²_{Ĝ^{(n)}} remains bounded, then the distance W_1^{(T)}(L_{X̂^{(n)}(T)}, L_{X(T)}) is controlled by the explicit bound (15) and tends to 0 as n → ∞; and for all T > 0 we have the weak convergence X̂^{(n)}(T) ⇒ X(T), as n → ∞.

Remark 7 Before showing how this theoretical result applies to the problem of approximating/estimating general functionals of the exposure process, a few remarks are in order.
• We point out that similar results hold for the Lp-Wasserstein-Kantorovich distance with p ∈ [1, ∞) under suitable moment conditions.
• It may also be convenient to consider the function space D∞ = D([0, ∞), R), in which X has its sample paths and on which the metric m_M2^(∞) may be considered. It is noteworthy that (15) also immediately provides a control of the L1-Wasserstein distance W1^(∞) corresponding to that metric between L_X̂ and L_X.
• In statistical applications, one is led to consider random estimates Ĝ(n), F̂U(n) and Ĥ(n), e.g. computed from observed data. If the convergence assumptions (i.e. the 'L1-consistency' of the distribution estimates) and the boundedness of σ²_Ĝ(n) hold almost surely, then the results of the preceding theorem (and those stated in the next corollary) also hold almost surely.
By showing that good approximations/estimations of the distributions H, G and FU also induce good approximations/estimations of general functionals of the exposure process, the next result establishes the asymptotic validity of simulation-based estimators under general conditions. Roughly speaking, provided that the instrumental distribution estimates at our disposal are accurate enough, we may treat simulated sample paths as if they were genuine exposure trajectories of individuals in the population of interest.
Let (G, FU, H) (respectively, (Ĝ(n), F̂U(n), Ĥ(n)) for n ∈ N) be a triplet of distribution functions on R+ defining a linear exposure process X (respectively, X̂(n)) starting from x0 ≥ 0 and fulfilling the assumptions of Theorem 5. Let 0 < T ≤ ∞.
(i) Let Q be a measurable function mapping D_T into some metric space (S, D), with Disc(Q) as its set of discontinuity points, such that P(X(T) ∈ Disc(Q)) = 0. Then we have the convergence in distribution Q(X̂(n)(T)) ⇒ Q(X(T)) as n → ∞. (ii) For any Lipschitz function φ : (D_T, m_M2^(T)) → R, we have E[φ(X̂(n)(T))] → E[φ(X(T))] as n → ∞. Proof. The first assertion derives from Theorem 5 and the convergence (in distribution) preservation result stated in Theorem 3.4.3 of [63], while the second one is an immediate consequence of the first assertion of Theorem 5 (see also [10]).
We conclude this section by giving several examples, illustrating how the results above apply to certain functionals of the exposure process in practice.Among the time-dependent features of the exposure process, the following quantities are of considerable importance to practitioners in the field of risk assessment of chemicals in food and diet (see [50] and the references therein).
Maximum exposure value. The mapping that assigns to any finite-length trajectory {x(t)}0≤t≤T ∈ D_T its maximum value sup0≤t≤T x(t) is Lipschitz with respect to the Hausdorff distance m_M2^(T) (see Theorem 13.4.1 in [63]). Under the assumptions of Theorem 5, the expected supremum is finite and, given consistent estimates Ĝ(n), F̂U(n) and Ĥ(n) of G, FU and H, one may thus construct a consistent estimate of E[sup0≤t≤T X(t)] by implementing a standard Monte-Carlo procedure for approximating this expectation. By straightforward uniform integrability arguments, it may be seen that convergence in mean also holds, so that the mean exposure value E[T^{-1} ∫0T X(t) dt] may be consistently estimated by Monte-Carlo simulations as well.
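To fix ideas, the Monte-Carlo procedure for the expected maximum can be sketched as follows; all distributional parameters below (Gamma inter-intake times, Burr intakes, Gamma half-lives) are purely illustrative stand-ins for the fitted Ĝ(n), F̂U(n) and Ĥ(n), not the values estimated from the MeHg data.

```python
import math, random

random.seed(0)

# Illustrative stand-ins for the fitted input distributions (assumptions):
# G ~ Gamma inter-intake times (days), F_U ~ Burr intakes,
# H ~ Gamma biological half-lives (days).
G_SHAPE, G_SCALE = 2.0, 1.5
BURR_C, BURR_K = 2.0, 2.5
HL_SHAPE, HL_SCALE = 20.0, 2.2

def draw_burr(c, k):
    """Inverse-cdf sampling from F_U(x) = 1 - (1 + x**c)**(-k)."""
    v = random.random()
    return ((1.0 - v) ** (-1.0 / k) - 1.0) ** (1.0 / c)

def simulate_max(x0, horizon):
    """Maximum of one exposure path on [0, horizon]: between intakes the
    body burden decays at rate theta = log(2)/half-life, so the running
    maximum is attained right after an intake (or at time 0)."""
    t, x, sup = 0.0, x0, x0
    while True:
        dt = random.gammavariate(G_SHAPE, G_SCALE)
        if t + dt > horizon:
            return sup
        theta = math.log(2.0) / random.gammavariate(HL_SHAPE, HL_SCALE)
        x = x * math.exp(-theta * dt) + draw_burr(BURR_C, BURR_K)
        sup = max(sup, x)
        t += dt

M, T = 1000, 365.0
est_max = sum(simulate_max(0.0, T) for _ in range(M)) / M
```

Averaging the simulated suprema yields the Monte-Carlo estimate of E[sup_{t≤T} X(t)]; the same simulated paths can be averaged in time to estimate the mean exposure value.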
Average time spent over a critical threshold. Let u > 0 be some critical level. In a very similar fashion, it follows from the continuity of the mapping {x(t)}t∈[0,T] ∈ D_T → T^{-1} ∫0T I{x(t)≥u} dt that a Monte-Carlo procedure also allows one to estimate the expected average time spent by the exposure process above the threshold value u, namely E[T^{-1} ∫0T I{X(t)≥u} dt]. First passage times. Given the starting point x0 of the exposure process X, the distribution of the first passage time beyond a certain (possibly critical) level x ≥ 0, i.e. the hitting time τ+x = inf{t ≥ 0 : X(t) ≥ x}, is also a characteristic of crucial interest for toxicologists. The mapping X ∈ D((0, ∞), R) → τ+x being continuous w.r.t. the M2-topology (refer to Theorem 13.6.4 in [63]), we have τ̂+x = inf{t ≥ 0 : X̂(t) ≥ x} ⇒ τ+x as soon as X̂ ⇒ X.
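Both functionals can be evaluated exactly on each simulated path, since between intakes the trajectory decays exponentially; a minimal sketch, again with illustrative stand-in distributions rather than the fitted ones:

```python
import math, random

random.seed(1)

def path_functionals(x0, horizon, u, draw_dt, draw_intake, draw_theta):
    """Fraction of time spent above u, and first passage time over u, for
    one piecewise-exponential path: between intakes X(t) = X_k e^{-theta t},
    so the time above u on an interval of length dt is
    min(dt, log(X_k/u)/theta) whenever X_k > u, and level u can only be
    first crossed at an intake instant."""
    t, x = 0.0, x0
    time_above = 0.0
    tau = 0.0 if x0 >= u else math.inf
    while t < horizon:
        dt = min(draw_dt(), horizon - t)
        theta = draw_theta()
        if x > u:
            time_above += min(dt, math.log(x / u) / theta)
        x *= math.exp(-theta * dt)
        t += dt
        if t < horizon:
            x += draw_intake()
            if x >= u and tau == math.inf:
                tau = t
    return time_above / horizon, tau

# Illustrative stand-in distributions (not the fitted MeHg estimates).
frac_above, tau_u = path_functionals(
    x0=0.0, horizon=365.0, u=3.0,
    draw_dt=lambda: random.gammavariate(2.0, 1.5),
    draw_intake=lambda: random.expovariate(1.0),
    draw_theta=lambda: math.log(2.0) / random.gammavariate(20.0, 2.2),
)
```

Averaging frac_above over replicated paths estimates the expected fraction of time spent above u, while the empirical distribution of tau_u over paths estimates the law of τ+u.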
In practice, one is also concerned with steady-state characteristics, describing the long-term behavior of the exposure process. For instance, the steady-state mean exposure mμ, or the limiting time average spent above a given critical value u > 0, μ([u, ∞[) = lim_{T→∞} T^{-1} ∫0T I{X(t)≥u} dt, can be pertinently used as quantitative indicators for chronic risk characterization (see also [62]). As shall be seen below, such quantities may be consistently estimated in a specific asymptotic framework in which both T and n tend to infinity. As a matter of fact, one may naturally decompose the estimation error as in (17); the term between brackets on the left-hand side of (17) tends to 0 as T → ∞ by virtue of Theorem 3, while it follows from the coupling argument of Theorem 5 (see A.4 in the Appendix) that the term on the right-hand side is, up to a multiplicative constant, bounded by the Mallows distances between the estimated and true input distributions. Hence, if T and n simultaneously tend to infinity in such a way that the latter quantity converges to 0, the consistency of E[T^{-1} ∫0T X̂(n)(t) dt] as an estimator of mμ is clearly established.
Besides, with regard to statistical applications, Theorem 5 paves the way for studying the asymptotic validity of bootstrap procedures in order to construct accurate confidence intervals (based on sample paths simulated from bootstrapped versions of the estimates Ĝ(n), F̂U(n) and Ĥ(n)). This is beyond the scope of the present paper but will be the subject of further investigation.

Application to methyl mercury data
As an illustration of the mathematical toxicological model analyzed above, some numerical results related to dietary methyl mercury (MeHg) contamination are now exhibited. As previously mentioned, this chemical is present almost exclusively in seafood, and a clear indication of its adverse effects on human health has been given by observational epidemiological studies (see [60], [15], [20] and [31] and the references therein), recently leading regulatory authorities to develop seafood standards for protecting consumer safety. At present, dietary risk assessment is conducted from a static viewpoint, comparing weekly intakes to a reference dose called the Provisional Tolerable Weekly Intake (PTWI), which is considered to represent the contaminant dose an individual can ingest every week over an entire lifetime without appreciable risk. For methyl mercury, the PTWI has been set to 1.6 micrograms per kilogram of body weight per week (µg/kgbw/w in abbreviated form) by the international expert committee of the FAO/WHO (see [28]). Another reference dose of 0.7 µg/kgbw/w, established from a previous evaluation by the (U.S.) National Research Council [61], is sometimes used from a more conservative perspective. Hence, in a dynamic approach, a deterministic exposure process of reference for risk assessment may be built by considering weekly intakes exactly equal to one of these static reference doses (d = 1.6 or 0.7) and a fixed mean half-life HL expressed in weeks. In this case, the body burden at the n-th intake is given by the (affine) recurrence relation X_{n+1} = X_n · 2^{−1/HL} + d. The dynamic reference dose is obtained by taking the limit as n tends to infinity: X_{ref,d} = d / (1 − 2^{−1/HL}). Numerically, this yields X_{ref,0.7} = 6.42 µg/kgbw and X_{ref,1.6} = 14.67 µg/kgbw when the MeHg biological half-life is fixed at 6 weeks, as estimated in [56].
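Assuming weekly intakes of exactly d and exponential elimination with half-life HL weeks, so that X_{n+1} = X_n · 2^{−1/HL} + d, the limiting body burden is the geometric-series sum d/(1 − 2^{−1/HL}); the following quick check reproduces the reported values:

```python
def dynamic_reference_dose(d, half_life_weeks):
    """Limit of the recurrence X_{n+1} = X_n * 2**(-1/HL) + d as n grows,
    i.e. the geometric-series sum d / (1 - 2**(-1/HL))."""
    return d / (1.0 - 2.0 ** (-1.0 / half_life_weeks))

x_ref_07 = dynamic_reference_dose(0.7, 6)   # about 6.42 ug/kgbw
x_ref_16 = dynamic_reference_dose(1.6, 6)   # about 14.67 ug/kgbw
```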
Datasets and empirical estimates of the distributions FU, G and H. Food contamination data related to fish and other seafood available on the French market have been collected by accredited laboratories during official national surveys performed between 1994 and 2003 by the French Ministry of Agriculture and Fisheries [40] and the French Research Institute for Exploitation of the Sea [34]. Our dataset comprises 2832 analytical results.
The national individual consumption survey INCA (see [19]) provides the quantities consumed over a week of an extensive list of foods, among which fish and seafood, as well as the times at which consumption occurred, with at least the information about the type of meal (breakfast, lunch, dinner or snack). The survey is composed of two samples: 1985 adults aged 15 years or over, and 1018 children aged between 3 and 14 years. However, as shown by the hazard characterization step (see [28]), the group most critically exposed to the neuro-developmental adverse effects of MeHg is the foetus: the MeHg in the mother's fish diet can pass into the developing foetus and cause irreversible brain damage. Here we thus focus on women of childbearing age (between 15 and 45).
For simplicity, MeHg intakes are computed at each observed meal through a deterministic procedure currently used in national and international risk assessments. From the INCA food list, 92 different fish or seafood species are identified, and a mean contamination level is computed for each from the contamination data, as in [20,59]. Intakes are then obtained by applying relation (1). For the sake of comparability, all consumptions are divided by the associated individual body weight, also provided in the INCA survey.
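Relation (1) is not reproduced in this section; the sketch below only illustrates the standard convention of combining consumed quantities with mean species contamination levels and normalizing by body weight. The function name and all figures are hypothetical.

```python
def intake_per_meal(quantities_g, mean_levels_ug_per_g, body_weight_kg):
    """Body-weight-normalized intake (ug per kg bw) for one meal, combining
    consumed quantities with mean species contamination levels.
    Hypothetical sketch; relation (1) itself is not reproduced here."""
    total_ug = sum(q * c for q, c in zip(quantities_g, mean_levels_ug_per_g))
    return total_ug / body_weight_kg

# Two species eaten at one meal (all figures invented for illustration).
u = intake_per_meal([120.0, 40.0], [0.15, 0.30], 60.0)  # -> 0.5 ug/kgbw
```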
Following the work of [57], the intake distribution FU is modeled by a heavy-tailed distribution, namely the Burr distribution (whose cumulative distribution function is of the form 1 − (1 + x^c)^{−k}, with c > 0 and k > 0). It is noteworthy that it fulfills assumption H3 (see Remark 3). It is fitted here by means of standard maximum likelihood techniques (see Fig. 2(a) below for a probability plot illustrating the goodness of the fit).
The consumption times available from the INCA survey allow us to compute the inter-intake times, or at least to produce right-censored values (providing the information that some durations between successive intakes are larger than a certain time). A Gamma distribution (which has an increasing hazard rate when its shape parameter exceeds one) is retained for modeling the inter-intake distribution G: its parameters are fitted using a right-censored maximum likelihood procedure. As shown by the probability plot displayed in Fig. 2(b), the chosen Gamma distribution provides a good fit to the left (uncensored) part of the distribution.
The pharmacokinetics of MeHg in man has been thoroughly investigated in several studies (see [49], [55], [56] and [35] for instance), almost all coming to the conclusion that the half-life of methyl mercury in man (see Remark 1) fluctuates around six weeks. As no raw data related to MeHg half-life in the human body were available to us, we rely on the most documented study, in which the collected half-life data (in days) are reported to range from 36 to 65 with a sample mean of 44 (refer to [56] and [35]): a Gamma distribution with mean 44 days and 5th percentile 36 days is chosen here for modeling the variability of the biological half-life log 2/θ.
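The (α, β) parametrization matching a mean of 44 days and a 5th percentile of 36 days can be recovered numerically; a self-contained sketch using the power series of the regularized incomplete gamma function and bisection on the shape parameter (the bracketing interval is an assumption):

```python
import math

def gamma_cdf(x, shape, scale):
    """Regularized lower incomplete gamma P(shape, x/scale), via its power
    series (adequate here, where x lies below the mean)."""
    a, z = shape, x / scale
    term = total = 1.0 / a
    n = a
    while term > total * 1e-14:
        n += 1.0
        term *= z / n
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def calibrate(mean, p05, q=0.05, lo=5.0, hi=500.0):
    """Bisection on the shape parameter, with scale = mean/shape, so the
    resulting Gamma law has the prescribed mean and 5th percentile.
    The bracketing interval [lo, hi] is an assumption."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(p05, mid, mean / mid) > q:
            lo = mid   # too much mass below p05: increase the shape
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    return shape, mean / shape

shape, scale = calibrate(44.0, 36.0)
```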
Table 1 sums up the characteristics of the three input distributions, with the convention that a Gamma distribution with parameters α and β has density Γ(α)^{−1} β^{−α} x^{α−1} exp(−x/β) and mean αβ. Recall that the tail index of the Burr distribution is given by κ = (ck)^{−1}, and that its moment of order r is finite if ck > r, in which case it is given by Γ(k − r/c) × Γ(1 + r/c)/Γ(k).
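The moment formula can be checked numerically against the Burr density f(x) = ckx^{c−1}(1 + x^c)^{−k−1}; the parameters below are illustrative, not those of Table 1:

```python
import math

def burr_moment(c, k, r):
    """Closed-form E[X^r] = Gamma(k - r/c) * Gamma(1 + r/c) / Gamma(k),
    finite when ck > r."""
    return math.gamma(k - r / c) * math.gamma(1 + r / c) / math.gamma(k)

def burr_moment_numeric(c, k, r, n=100000, upper=100.0):
    """Midpoint-rule integration of x^r against the Burr density
    f(x) = c*k*x^(c-1)*(1 + x^c)^(-k-1)."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += (x ** r) * c * k * x ** (c - 1) * (1.0 + x ** c) ** (-k - 1)
    return total * h

# Illustrative parameters with ck = 5 > r = 1, hence a finite mean.
closed = burr_moment(2.0, 2.5, 1.0)
numeric = burr_moment_numeric(2.0, 2.5, 1.0)
```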
Time to steady-state. As underlined in Remark 5, the question of determining how much time is needed for the exposure process to be roughly at equilibrium is crucial for assessing the practical relevance of steady-state characteristics. In order to evaluate the time to steady-state, Monte-Carlo simulations have been carried out: using the instrumental distributions described above, M = 1000 trajectories have been started from different initial values x0. For each path, the (temporal) mean exposure over the time interval [0, T] is computed, and the individual results are averaged. In Figure 3(a), the resulting estimates for the Adult Female (15-45) subpopulation are displayed as time T grows: as expected, all empirical mean exposures converge to the same quantity (namely mμ), and the relative error falls below 10% after 29 half-lives (approximately 3.5 years), whatever the starting value x0. The same procedure is used for the mean time spent beyond the reference threshold u = X_ref,0.7: as shown by Figure 3(b), the limit is approximately reached after 70 half-lives (about 8.5 years), whatever the initial state x0. The convergence is slower in this case, since the quantity of interest is related to an event of relatively small probability (see also Remark 10).
Naturally, the time to steady-state strongly depends on the initial value x0 and on the functional of interest. However, on the basis of these simulation results, it may be stated that, for realistic initial values, the time to steady-state for basic quantities such as those considered here fluctuates between 3 and 10 years, which is, anyhow, a reasonable horizon at the human scale.
Remark 8 (Convergence rate bounds) The problem of determining computable bounds on the convergence rate of ergodic Markov processes has recently received much attention in the applied probability literature (see [42], [53], [39], [51] or [52]). Using suitably calibrated parameters of the drift and minorization conditions (Equations (18), (19), (22) and (23)) established in the proofs of Theorems 2 and 3 in the Appendix, rough numerical estimates of the constants involved in the rate bounds (10) and (12) can be computed from Theorem 2.2 in [52] (in the discrete-time case) and Theorem 4.1 in [52] (in the continuous-time setup). Explicit computations based on these theoretical results shall be carried out in a forthcoming paper (see [7]). However, as such computable bounds are quite loose in general and are, furthermore, related to the total variation norm (whereas one rather focuses on specific functionals of interest in practice), the problem is handled here through the simulation approach.
Estimation by computer simulation. We now focus on estimating, from our MeHg datasets, some relevant time-dependent and steady-state quantities among those enumerated in the previous section, via the simulation approach studied in Section 4. Estimates of the quantities of interest are computed by averaging over M = 1000 replicated trajectories on [0, T], taking T equal to 5 years. In order to ensure that the retained trajectories are approximately stationary, a burn-in period of 5 years is used (see the preceding paragraph). Numerical results related to the estimation of the steady-state mean exposure (mμ), the steady-state probabilities of exceeding the dynamic reference doses (μ([X_ref,0.7, ∞[) and μ([X_ref,1.6, ∞[)), the mean time to exceed the lowest reference dose (E_μ[τ+_{X_ref,0.7}]) and the expected maximum exposure over 5 and 10 years (E_μ[max_{t≤5 years} X(t)] and E_μ[max_{t≤10 years} X(t)]) are displayed in Table 2.
We observe that the average times spent over the EU-based threshold (X_ref,1.6) and the US-based one (X_ref,0.7) are close to zero in the Adult Female subpopulation: 0.003% and 0.575%, respectively. Regarding the time required to reach such threshold levels, further simulations have been conducted using the estimated stationary mean as the initial point (namely, x0 = 2.92). Only the distribution of the time to reach the US-based threshold (X_ref,0.7) has been estimated, using a standard Monte-Carlo procedure: as explained in Remark 10 below, estimating the distribution of the time required to exceed level X_ref,1.6 involves computing rare-event probabilities and thus requires the use of more sophisticated simulation methods. Over M = 1000 trajectories, the mean (respectively, the median) of τ+_{X_ref,0.7} is 7.23 years (resp. 5.05 years). Figure 4 displays the (highly skewed) Monte-Carlo distribution estimate (obtained by standard kernel smoothing of the M = 1000 simulated values) of the time needed to exceed X_ref,0.7 for the studied subpopulation.
Table 2: Estimated features of the exposure process for the Adult Female (15-45) subpopulation.

Remark 9 (Sensitivity to the choice of the instrumental distributions) The distribution models used here for the governing probability measures FU and G have been chosen because they provide a good overall fit to the data (and may easily be seen to satisfy all the assumptions required in the ergodicity and stability analyses). According to our own practical experience, the numerical results displayed here would not have been significantly different if one had chosen to model G by a Weibull distribution, for instance. However, in a future study (see [7]), special attention shall be given to the statistical issues involved in validating the mathematical toxicological model, by investigating how sensitive the latter is to the estimation method chosen (considering also the use of nonparametric approaches).
Remark 10 (Naive Monte-Carlo simulation and high thresholds) From the perspective of public health guidance, it is of prime importance to evaluate the probability of occurrence of rare (extremely risky) events, such as the probability of exceeding a large threshold u, e.g. u = X_ref,1.6. In this respect, we point out that the naive Monte-Carlo simulation proposed here estimates this probability by zero (see Table 2), so seldom is this threshold reached on a time interval [0, T] of reasonable length.
Following in the steps of [16], it is shown in [7] that one may remedy this problem by implementing a suitable particle filtering algorithm.

A Appendix -Technical proofs
A.1 Proof of Theorem 2
Under the conditions required by Theorem 1, aperiodicity and irreducibility are immediately established for the discrete-time chain X. Besides, under mild irreducibility conditions, the stability of the random-coefficient autoregressive model Y_{n+1} = α_{n+1} Y_n + β_{n+1} on R^d, where the (α_n, β_n), n ≥ 1, are i.i.d. r.v.'s on R*+ × R^d, has been investigated in detail since the seminal contribution of [37] (see [48] and the references therein). Under the assumption that E[log(1 ∨ β_1)] < ∞ and E[log(1 ∨ α_1)] < ∞, it is well known that a necessary and sufficient condition for the chain Y to have a (unique) invariant probability measure is that E[log(α_1)] < 0 (see Corollary 2.7 in [11] for instance). Based on this result, it is then straightforward that, under the assumptions of Theorem 1 and (H1), the chain X is positive recurrent with absolutely continuous stationary probability distribution μ(dx) = f(x)dx. In the discrete-time context, the stability analysis of a Markov model (Y_n)_{n∈N} may be carried out by establishing suitable conditions on the 'drift' ΔV(y) = E[V(Y_1) | Y_0 = y] − V(y) for appropriate non-negative test functions V. Such 'Foster-Lyapunov' criteria stipulate the existence of a 'small set' S (i.e. an accessible set S to which the chain returns in a given number of steps with positive probability, uniformly bounded from below; see Section 5.2 in [44]) toward which the chain drifts, in the sense that ΔV(y) ≤ −f(y) + b I_S(y) for some 'norm-like' function f ≥ 1 and b < ∞. Now, for the chain X, any compact interval [0, s] with s > 0 is small. Indeed, it follows from (9) that, for all x ∈ [0, s], the transition kernel satisfies the minorization condition P(x, ·) ≥ δ_s U_s(·), with δ_s = s × inf_{y∈[0,s]} f_U(y) > 0 and denoting by U_s(·)
the uniform probability distribution on [0, s]. When γ = 1, for instance, take V(x) = 1 + x. The affine drift related to X is then given by ΔV(x) = (E[e^{−θ_1 ΔT_1}] − 1) x + E[U_1], which satisfies the geometric drift condition above since E[e^{−θ_1 ΔT_1}] < 1. Applying Theorem 15.0.1 in [44], we thus get that X is geometrically ergodic, with invariant probability measure μ such that μ(V) < ∞. In particular, μ has finite expectation, and there exist constants r > 1 and R < ∞ such that for all x > 0: ||P^n(x, ·) − μ||_V ≤ R r^{−n} V(x), with ||ν||_V = sup_{ψ:|ψ|≤V} |∫ ψ(x) ν(dx)| for any bounded measure ν on the real line. When V ≡ 1, ||·||_V coincides with the total variation norm ||·||_TV. For γ > 1, the result is proved in a similar fashion by taking V(x) = 1 + x^γ. Finally, the last assertion of Theorem 2 immediately derives from Theorem 1 in [32].
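To make the affine drift explicit in the case γ = 1 with V(x) = 1 + x, write X_1 = X_0 e^{−θ_1 ΔT_1} + U_1, with (θ_1, ΔT_1, U_1) drawn independently of X_0, as in the coupled construction of A.4 (a sketch of the computation):

```latex
\begin{aligned}
\Delta V(x) &= \mathbb{E}\bigl[V(X_1)\,\big|\,X_0=x\bigr]-V(x)\\
            &= \Bigl(1+x\,\mathbb{E}\bigl[e^{-\theta_1\Delta T_1}\bigr]
               +\mathbb{E}[U_1]\Bigr)-(1+x)\\
            &= \bigl(\mathbb{E}\bigl[e^{-\theta_1\Delta T_1}\bigr]-1\bigr)\,x
               +\mathbb{E}[U_1].
\end{aligned}
```

Since θ_1 ΔT_1 > 0 almost surely, E[e^{−θ_1 ΔT_1}] < 1, so the linear coefficient is negative and the Foster-Lyapunov drift condition holds outside a sufficiently large compact interval [0, s].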

A.4 Proof of Theorem 5
Observe first that (16) immediately follows from (15) by virtue of standard properties of Wasserstein metrics. In order to prove (15), we construct a specific coupling of the laws L_X̂(n)(T) and L_X(T). Let (V_k)_{k∈N}, (V′_k)_{k∈N} and (V″_k)_{k∈N} be three independent sequences of i.i.d. r.v.'s, uniformly distributed on [0, 1]. For all (n, k) ∈ N², couple the inter-intake times, intakes and release rates of the two processes by inverting the corresponding distribution functions at the same uniform variates (e.g. ΔT_{k+1} = G^{−1}(V_k) and ΔT̂(n)_{k+1} = (Ĝ(n))^{−1}(V_k)), and define recursively, for k ∈ N, X_{k+1} = X_k e^{−θ_k ΔT_{k+1}} + U_{k+1} and X̂(n)_{k+1} = X̂(n)_k e^{−θ̂(n)_k ΔT̂(n)_{k+1}} + Û(n)_{k+1}. The expected M2 distance between the coupled paths is then bounded in terms of the Mallows distances between the input distributions, using in particular that E(N(T)²) ∨ E(N̂(T)²) ≤ C′T² (refer to Propositions 6.1 and 6.3 of Chap. V in [4] for instance). Observe that the constants C and C′ may be chosen independently of the integer n indexing the sequence Ĝ(n), since by assumption the sequences m_Ĝ(n) and σ²_Ĝ(n) are bounded. This establishes the desired result (15).
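The key property of this quantile coupling, namely that plugging the same uniform variate into two inverse distribution functions realizes the L1-Mallows distance as an expected gap, can be illustrated numerically; the two exponential laws below are chosen purely for illustration:

```python
import math, random

random.seed(2)

def inv_exp(v, rate):
    """Inverse cdf (quantile function) of an exponential law."""
    return -math.log(1.0 - v) / rate

# Feed the SAME uniform draw through both quantile functions, as in the
# coupling Delta T = G^{-1}(V), hat Delta T = (hat G)^{-1}(V).
M, total = 200000, 0.0
for _ in range(M):
    v = random.random()
    total += abs(inv_exp(v, 1.0) - inv_exp(v, 1.25))
mallows_mc = total / M

# For these two laws the quantile functions are ordered pointwise, so the
# L1-Mallows distance is simply |1/1 - 1/1.25| = 0.2.
mallows_exact = abs(1.0 / 1.0 - 1.0 / 1.25)
```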
Here Γ_Z′ and Γ_Z denote the completed graphs of Z′ and Z, and m_H^(T) the Hausdorff metric on the set of all compact subsets of [0, T] × R related to the distance m_M2^(T).

Fig. 2: Probability plots assessing goodness of fit: (a) intake distribution; (b) inter-intake time distribution.

Table 1: Parameters of the intake and inter-intake distributions for the Adult Female subpopulation, fitted by maximum likelihood.