PARAMETER ESTIMATION IN A STRUCTURED ERYTHROPOIESIS MODEL

. We develop a numerical method for estimating parameters in a structured erythropoiesis model consisting of a nonlinear system of partial dif-ferential equations. Convergence theory for the computed parameters is pro- vided. Numerical results for estimating the growth rate of precursor cells as a function of the erythropoietin concentration and the decay rate of erythropoi- etin as a function of the total number of precursor cells from computationally generated data are provided. Standard errors for such parameters are also given.


1.
Introduction. Erythropoiesis is the process in which the body produces red blood cells. This production is regulated by the hormone erythropoietin (EPO). This hormone, which is released from the kidneys when the body is unable to deliver oxygen adequately [24], recruits stems cells from the bone marrow into the erythrocyte progenitor population. Erythropoietin may also have the effect of accelerating the maturation process of precursor cells [14]. Once stem cells are recruited into the precursor population, they begin a maturation process. The primary measure of cell maturity is relative to its hemoglobin content, which enables mature erythrocytes to effectively carry oxygen throughout the blood stream.
To our knowledge there are no studies that provide information on the functional forms of the maturation rate of precursor cells as a function of the EPO or the decay rate of EPO as a function of the total precursor cells. Thus, one must rely on an inverse problem approach to identify possible function forms for these rates from observed data, which is the main goal of this paper.
There have been many studies (see, [1]- [3], [7]- [12], [16], [18]- [22], [26]- [27]) that focused on the development of inverse problem techniques for identifying parameters in continuous age/size structured population models. The basic idea in several of these papers is the set up of a least-squares cost functional that compares model output to data and the development of a numerical approximation that solves this least-squares problem and provides an approximation to the optimal parameter choice. Convergence theory for the computed minimizers of the approximate leastsquares cost functional to a minimizer of the true least-squares functional has been developed in these papers. To our knowledge, there has been only one study [25] involving parameter estimation for a structured model of erythropoiesis and no convergence results were provided.
The focus in this paper is two-fold: First, we develop an explicit finite difference scheme, which is a modification of the implicit one developed in [4] and is in the spirit of the one developed in [28]. The reason for adopting an explicit scheme is that for an inverse problem as the one considered here, the model must be solved numerous times (iterating over the parameter space) until the optimal parameter is found. Thus, a more practical and computationally faster method becomes crucial to solving the problem in a reasonable amount of time. And second, utilizing the previously mentioned finite difference scheme, we develop a least-squares approach for estimating parameters in the structured erythropoiesis model studied in [4] and we provide a convergence theory of the computed parameter estimates.
As the use of numerical methods for estimating parameters becomes more accepted in the biological sciences, it is becoming increasingly important for researchers to support the efficacy of proposed numerical algorithms with not only numerical simulation results but also with estimates of uncertainty in forms of standard errors and confidence intervals for the resulting parameter estimates. In this paper we will present a statistical technique for computing standard errors for such parameter estimates (e.g., [3,17]). While in our efforts we emphasize ordinary least square estimators, the ideas and methods presented in this paper can readily be used with maximum-likelihood estimators as well as other standard estimators found in the statistical literature.
This paper is organized as follows: in Section 2 we present the structured model, in Section 3 we present the explicit finite difference method and discuss its convergence to the unique weak solution of the model, and in Section 4 we present the least-squares cost functional and provide convergence theory for the computed minimizers to a minimizer of the least-squares cost functional. Following the development of the convergence theory, in Section 5 we demonstrate the feasibility of this approach by utilizing computationally generated data to obtain estimates of the maturation velocity of the precursor population as a function of the EPO concentration and the decay rate of EPO as a function of the precursor population.
2. The structured erythropoiesis model. Our model consists of three main components. The first is the density of the precursor population, denoted by p(t, µ). Here µ is the maturity level of the precursor cells. We assume that precursor cells must reach the maturity level µ F before joining the mature population, m(t, ν), which is the second main component of our model. Mature cells are structured by their age ν. The third component is the EPO concentration, which we denote by E(t). The maturation rate of the precursor cells is assumed to be a function of the EPO concentration and is denoted by g(E) in our model. The function σ(t, µ, E) represents the net change in the birth and death rate of precursor cells. Note that this function depends on E(t) since it is known that EPO prevents apoptosis in erythroid progenitors [15]. We use γ(t, ν, M (t)) to represent the death rate of  mature red blood cells and f (t, M ) is the negative feedback that represents the kidney's response to low levels of mature erythrocytes. Further, a E (P ) is the decay rate of EPO as a function of the total precursor population. It is assumed that the rate at which stem cells are recruited into the precursor population is proportional to the EPO concentration and we use φ(t) to denote the function of proportionality. With these parameters defined as such the model can be represented graphically as in Figure 1. Mathematically, the model takes the form ∂p(t,µ) ∂t + g(E(t)) ∂p(t,µ) ∂µ = σ(t, µ, E(t))p(t, µ), In the following section, we make regularity assumptions on the parameters mentioned above, introduce an explicit finite difference scheme, and prove the convergence of this scheme to the unique weak solution of the model. 3. Convergence of finite difference approximation. Throughout the discussion BV (Ω) (Ω ⊂ R n an open set) denotes the space of functions in L 1 (Ω) with bounded variation (e.g. [23]), and we assume that the parameters in (1) satisfy the following properties: There exist positive constants c and L such that 604 AZMY S. ACKLEH AND JEREMY J. THIBODEAUX (A1) The function g(E) is continuous, 0 < g(E) ≤ c, and |g( The function σ(t, µ, E) is continuous with respect to t, continuously differentiable with respect to E, and for any (t, µ, E) The function γ(t, ν, M ) is continuous with respect to t, continuously differentiable with respect to M , and for any (t, ν, M ) As in [4], we define a weak solution of (1) to be a triple (p, m, , and [0, T ] into n 1 , n 2 , and K subintervals, respectively, and use the following notation throughout the paper: ∆µ = µ F /n 1 , ∆ν = ν F /n 2 , and ∆t = T /K denote the spatial and time mesh sizes, respectively. The mesh points are given by: µ i = i∆µ, i = 0, 1, · · · , n 1 , ν j = j∆ν, j = 0, 1, · · · , n 2 , and t k = k∆t, k = 0, 1, · · · , K. We denote by p k i , m k j , M k , and E k the finite difference approximations of p(t k , µ i ), m(t k , ν j ), M (t k ), and E(t k ), respectively. We let . Also we define the ℓ 1 and ℓ ∞ norms of p k and m k to be respectively. We then discretize the differential equation system (1) using the following explicit finite difference approximation with the initial conditions It is sometimes more convenient to express the first three equations in (3) as The following results were established in [5].
With the help of Lemmas 3.2 and 3.3 we will establish the following result.
where λ 1 = ∆t ∆µ . Summing over i = 2, . . . , n 1 , and then adding |p k+1 After a few calculations, we obtain Using estimate (7) in (6) we arrive at for some constant c 1 . By adding and subtracting terms and using (A1) and (A6), we get the following: for some c 2 . Thus, we have for some constant c 3 . Similar calculations give By using the boundary condition for m in (3) we can easily see that Hence, (11) becomes for a constant c 6 . Multiplying (10) by g k+1 and adding the result to (13) yields Adding and subtracting terms, using (A1), and observing that |E k+1 − E k | ≤ ∆tc 8 we get Clearly, (16) implies the result.
Using the above lemmas the following results easily follows (see [5]).
Following [29] we define a family of functions {P ∆t,∆µ }, {M ∆t,∆ν }, and {E ∆t } by Recalling that the uniqueness of the weak solution was established in [4], and applying techniques as in [4,29] we have the following result. 4. The inverse problem and convergence results. The main focus of this paper is estimating parameters contained in (1). Our goal, given observations X r,s1 , Y r,s2 , Z r which correspond to the total number of precursor cells with maturation level in the interval [α s1 , α s1+1 ), s 1 = 1, . . . , S 1 , the total number of mature cells in the age interval [β s2 , β s2+1 ) s 2 = 1, . . . , S 2 and the total concentration of erythropoietin at time t r , respectively, is to find a parameter q = (g, σ, γ, a E , f, φ) ∈ Q such that the functional is the space of continuous uniformly bounded functions on Ω. Throughtout, this section we assume that our admissible parameter space, Q, is a compact subset of D such that each q = (g, σ, γ, a E , f, φ) ∈ Q satisfies (A1)-(A6). Since our parameter set is infinite dimensional, a finite-dimensional approximation of the parameter space is also necessary for computing minimizers. To this end, we consider the following finite-dimensional approximations of (17) J ∆t,∆µ,∆ν (q) = R r=1 S1 s1=1 αs 1 +1 αs 1 P ∆t,∆µ (t r , µ, q)dµ − X r,s1 each of which is minimized over Q N , a compact finite-dimensional approximation of the parameter space Q. To establish the convergence results for the parameterestimation technique, we use an approach in the spirit of that used in [1,2,3], which is based on the abstract theory in [13].
Theorem 4.1. Suppose that q l → q in Q and that ∆t l , ∆µ l and ∆ν l → 0 as l → ∞. Let P ∆t l ,∆µ l (q l ), M ∆t l ,∆ν l (q l ) and E ∆t l (q l ) denote the family of functions obtained from the solution of the finite difference equations (3) with parameter q l and let p(q), m(q) and E(q) be the unique weak solution of (1) with initial conditions p 0 , m 0 and E 0 and parameter q. Then, the sequence of functions ({P ∆t l ,∆µ l (q l )}, {M ∆t l ,∆ν l (q l )}, {E ∆t l (q l )}) converges to (p(q), m(q), E(q)) as l → ∞, in the sense defined in Theorem 3.6.
Proof. We begin by defining p k,l i = p k i (q l ), m k,l j = m k j (q l ), and E k,l = E k (q l ), where (p k i (q l ), m k j (q l ), E k (q l )) is the solution of (3) with parameter q l . Since q ∈ Q, we have from the previous section that there exists functionsp,m, andÊ such that P ∆t l ,∆µ l (q l ) →p, M ∆t l ,∆ν l (q l ) →m, and E ∆t l (q l ) →Ê in the sense described in Theorem 3.6 as l → ∞. So by the uniqueness of the weak solution established in [4], we only need to show that (p,m,Ê) is a weak solution of (1) (with parameter q).
The next result follows immediately from Theorem 4.1. Corollary 1. Let P ∆t l ,∆µ l , M ∆t l ,∆ν l and E ∆t l denote the numerical solution from (3), with parameter q l → q in Q and let ∆t l , ∆µ l , ∆ν l → 0 as l → ∞. Then J ∆t l ,∆µ l ,∆ν l (q l ) → J(q) as l → ∞.
In the next theorem, we establish the continuity of the finite difference approximation with respect to the parameter. From this it immediately follows that the approximate cost functional is continuous with respect to the parameter. Hence, the computational problem of finding the approximate minimizer is well posed.
Theorem 4.2. Let ∆t, ∆µ, ∆ν be fixed. For each q ∈ Q let P ∆t,∆µ (q), M ∆t,∆ν (q) and E ∆t (q) denote the numerical solution of (3) and let q h → q in Q, as h → ∞. Then as h → ∞, in the sense defined in Theorem 3.6.
Proof. Let p h,k i , m h,k j , E h,k and p k i , m k j , E k denote the solutions of (3) with parameter q h and q, respectively. Further, let u h,k Then from the first equation in (4), we have For constants c 12 and c 13 . Using the first boundary condition in (3) yields Hence, there exist constants c 14 and c 15 such that Using similar techniques, we arrive at v h,k+1 and for some constants c i , i = 16, · · · , 20. Now if we let S h,k = u h,k for positive constants c 21 , c 22 and c 23 . Since q h → q in Q, the result is established.
Using the above results, the following convergence result follows from a direct application of the abstract theory in [13]. Theorem 4.3. Suppose that Q N is a sequence of compact subsets of Q. Moreover, assume that for each q ∈ Q, there exists a sequence of q N ∈ Q N such that q N → q as N → ∞. Then the functional J ∆t,∆µ,∆ν has a minimizer over Q N . Furthermore, if q i N denotes a minimizer of J ∆ti,∆µi,∆νi over Q N and ∆t i , ∆µ i , ∆ν i → 0, then any subsequence of q i N has a further subsequence which converges to a minimizer of J.

5.
Numerical results. In this section, we test the parameter estimation numerical method from the previous section. Since there are no current estimates of the decay rate of erythropoietin as a function of the total precursor population and the maturation velocity of the precursor cells as a function of the Epo concentration, we focus on implementing the parameter estimation scheme to provide estimates for these two functions.
To generate a data set we fix the model parameters to these given in Table 1 and let µ F = 5.9, ν F = 50 and T = 5. These parameters are similar to the ones used in [4,14,25]. We then solve (3) with these parameters and ∆t = 0.01, ∆µ = ∆ν = 0.02. From the resulting numerical solution we obtain the computational data for: 1) the total population of the precursor cells (S 1 = 1) denoted by X r , 2) the total population for the mature cells (S 2 = 1) denoted by Y r , and 3) the erythropoietin level denoted by Z r where r = 1, . . . , R. These values were collected at t r = 0.5r, r = 1, . . . , 10. Thus, R = 10. To test how noisy data would affect the minimization results we generated a second data set with noise as follows:X r = X r (1 + 0.05ǫ 1,r ),Ỹ r = Y r (1 + 0.05ǫ 2,r ), andZ r = Z r (1 + 0.05ǫ 3,r ), where each ǫ i is a normally distributed random number with mean zero and standard deviation equal to one.
In our first parameter estimation experiment we assume that where the vector α = (α 0 , . . . , α 5 ) is unknown. We minimize (18) over the vector α = (α 0 , . . . , α 5 ) ∈ Q which is assumed to be a compact subset of R 6 + . In this experiment we set ∆t = 0.02, ∆µ = ∆ν = 0.04 which are different from those chosen above to generate the data. Because of this difference in the mesh sizes, the data (even without adding noise) is not exactly attained by the model output as there is a numerical error. The parameter estimates resulting from this experiment, which are quite satisfactory, are presented in Figure 2.
So if then we take the values √ V 11 , √ V 22 , . . . , √ V 66 to be the standard errors for the parameters α 0 , α 1 , . . . , α 5 , respectively. The results are give in Table 2. One can see from these results that the model lacks sensitivity to the parameters α 1 , α 2 for this particular choice of g and range of E.
For our second numerical experiment, we test an infinite dimensional estimation problem. In particular, we assume that a E (P ) is the only unknown parameter and instead of choosing the form of a E given in the table, we let φ N i (P ), i = 0, . . . , N denote the ith linear spline and approximate a E (P ) by the function a N E (P ) = N i=0 α i φ N i (P ) for P ∈ [0, 2] (which contains the normal range for humans) and extend it by a constant function for P > 2, i.e., a N E (P ) = α N for P ∈ (2, ∞). Thus, our estimation problem reduces to minimizing (18) over the vector α = (α 0 , . . . , α N ) ∈ Q N a compact subset of R N + . We choose N = 5 and present the parameter estimates in Figure 3.  Using the procedure described previously, we calculate the standard errors for this case. The results are given in Table 3.