Parameter Estimation of Population Pharmacokinetic Models with Stochastic Differential Equations : Implementation of an Estimation Algorithm

Population pharmacokinetic (PPK) models play a pivotal role in quantitative pharmacology study, which are classically analyzed by nonlinear mixed-effects models based on ordinary differential equations. This paper describes the implementation of SDEs in population pharmacokinetic models, where parameters are estimated by a novel approximation of likelihood function. This approximation is constructed by combining the MCMC method used in nonlinear mixed-effects modeling with the extended Kalman filter used in SDEmodels.The analysis and simulation results show that the performance of the approximation of likelihood function for mixed-effects SDEs model and analysis of population pharmacokinetic data is reliable. The results suggest that the proposed method is feasible for the analysis of population pharmacokinetic data.


Introduction
Population pharmacokinetic (PPK) models play a pivotal role in quantitative pharmacology study.They combine the classic pharmacokinetic (PK) analysis with population statistical models, which makes it possible to investigate the determined factors of drug concentration in patients group quantitatively, to guide the adjustment of administration, and to provide comprehensive quantitative information for a more rational and effective clinical administration regimen.In recent years, with the trend that model based drug development (MBDD) is promoted and strongly advocated by the FDA, the population pharmacokinetic theories and practice researches have been greatly developed.
Nonlinear mixed-effects modeling has been proven to be a kind of widely used and effective tool for describing the pharmacokinetic (PK) and pharmacodynamic (PD) properties of drug [1,2].However, traditionally the most commonly used nonlinear mixed-effects PPK models are based on ordinary differential equations (ODEs), supplemented by a model for the interindividual variations in the structural model parameters and a model for the variation of the residuals and those two models are assumed to be independent, so that the residuals are determinate.However, the pharmacokinetic process will be affected by much stochastic volatility, which cannot be described by the determinate errors, so the preceding assumption is not appropriate [3].
Stochastic differential equations (SDEs) are a kind of modeling method developed recently, which could be used to deal with such kind of questions.In nonlinear mixed-effects modeling with SDEs, the differences between individual values and observations are explained by three fundamentally different types of noise: (1) the dynamic noise, which enters through the dynamics of the system and may arise from structural model misspecification or unpredictable random behavior of underlying process; (2) random effect, which describes the interindividual variations; (3) the measurement noise, which represents the uncorrelated part of the residual 2 Journal of Probability and Statistics variability, that may be due to assay error [4].The simulation results show that the introduction of the SDE may contribute to the better estimates of the interindividual variations and structural parameters [5].
The first paper encouraging the introduction of random fluctuations in PK/PD was published by D' Argenio et al. [6][7][8].The authors underlined that both deterministic and stochastic component contribute to PK/PD models.However, estimating parameters in nonlinear mixed-effects model with SDE is a difficult problem and not straightforward, except for simple cases.Naturally likelihood inference would be a feasible approach, but the transition densities are rarely known; thus, explicit likelihood function is usually hard to get.Actually, there are hardly any theories for SDE models at present.Kristensen NR proposed maximum likelihood method and maximized a posterior to estimate the parameters in PK models with SDE, but this method only focused on single subject modeling where no interindividual variance components were estimated [9].Overgaard RV suggested applying the Kalman filter to approximate the likelihood function for a SDE model, with a nonlinear drift term and a constant diffusion term [10].Tornoe CW used this algorithm to estimate SDEs in NONMEM [11], but the NONMEM implementation cannot be used to form Kalman smoothing estimates, which is an important feature of the SDE approach, where all data is used to give optimal estimates at each sampling point.Sophie Donnet proposed a Bayesian inference to analyze growth curves using mixed-effects models based on stochastic differential equations and obtained good results [12].Struthers and McLeish applied Bayesian method to the multicenter AIDS cohort study [13].
Inspired by this, the present work describes an approximation of likelihood function of nonlinear mixed-effects pharmacokinetic model, which is constructed by combining the extended Kalman filter with the MCMC method.The parameters in the nonlinear mixed-effects models are estimated based on stochastic differential equation and the whole process is implemented in Matlab.Section 2 presents the classical nonlinear mixed model and the mixed model defined by SDEs.In Section 3, the algorithm of nonlinear mixed-effects model with SDEs is proposed.In Section 4, example with simulated data and a case study are described.Conclusion and discussion are listed in Section 5.

Models and Notation
Nonlinear mixed-effects models can be regarded as hierarchical models where the variability in concentration/effect is split into intraindividual variability described by the firststage distribution and interindividual variability described by the second-stage distribution.This section describes the notation for nonlinear mixed-effects models used in the present paper.

Nonlinear Mixed-Effects Model
where  is a possibly nonlinear function and Φ = (  ) 1≤≤ is the individual parameter vectors.  is assumed to be independently and identically normally distributed with expectation  and variance Ω.   are the measurement errors and are also assumed to be independently and identically normally distributed with null mean and variance  2  .

Nonlinear Mixed-Effects Model Based on Stochastic Differential Equation.
While the introduction of SDEs does not change the fundamental hierarchical structure, they do change the entities in the first-stage density and the construction.Under the framework of ordinary differential equations, noise is only introduced through the measurement equation; see (1).This allows the measurement noise term to absorb the whole error due to model miss-specification or true random fluctuations of the states and hence may ignore the correlated residuals.In order to consider the correlated residuals, a stochastic process is added to the state space model, such that the nonlinear mixed-effects model based on SDEs can be written as where Γ(  , ,  2 )   is called the diffusion term and describes the stochastic part of the system.  is a standard Wiener process defined by   2 −   1 ∼ (0, | 2 −  1 |).(  , , )  is called the drift term and describes the deterministic part.The stochastic dynamics of the system is defined by the drift and diffusion terms together.
In nonlinear mixed-effects model based on SDEs (see (2)), the total variance is divided into three fundamentally different noises: the interindividual variability Ω describing the individual difference, the system noise  2 reflecting the random fluctuations around the corresponding dynamic model, and the measurement noise  2  representing the uncorrelated residuals originating from measurement assay or sampling errors.

Algorithm of the Nonlinear Mixed-Effects Model with SDEs
To solve the nonlinear mixed-effects model with SDEs, we calculate the approximation of likelihood function constructed by combining the extended Kalman filter with the Bayesian inferences.The details are described in what follows.
3.1.Extended Kalman Filter.We use the extended Kalman filter to calculate the one-step predictions and the onestep predicted variances for a stochastic differential equation with additive diffusion and measurement noise.The Kalman filter is a recursive estimator, which means that only the estimated state from the previous time step and the current measurement are needed to compute the estimate for the current state.The algebra presented in the following is all performed on the individual level and the  index referring to the individual had been dropped for simplicity.The general intraindividual model can be written as where  is the vector of state variables,   is the vector of measurements at time   ,   are the associated normally distributed measurement errors with covariance matrix ∑, and    is the system noise.The state of the filter is represented by two variables as follows.
(1) x| is a posteriori state estimate at time  given observations up to and including time ; (2)  | is a posterior error covariance matrix.
And we need to initiate the extended Kalman filter (EKF) with a prediction of the initial state x1|0 and a prediction of the covariance of the initial state  1|0 .
From this point, the EKF is most often conceptualized as two distinct phases: "predict" and "update." The predict phase uses the state estimate from the previous time step to produce an estimate of the state at the current time step, which is achieved by The update phase uses the actual measurement to update our state prediction and variance.This is performed by the update equations; that is, where   is the Kalman gain.
The final step in the recursive algorithm is to predict the state and the state variance at the time of the following measurement.This is performed by solving the prediction equations; that is, where So the algorithm of extended Kalman filtering proceeds as follows.
(1) Given parameters and initial prediction , x1|0 and  1|0 (2) For  = 1 to   do (3) Use (4) to compute, ŷ|−1 and R|−1 After the prediction of the state value at the following measurement, we start again with predictions of the actual measurements until all the one-step prediction  |−1 and all the one-step prediction variance  |−1 have been calculated.

The Likelihood Function for the Nonlinear Mixed-Effects
Model with SDEs.In the present work, we will use a quasilikelihood method, that is, a method that uses the Gaussian approximation, so we assume that the conditional densities are well approximated by Gaussian densities.The calculation of the conditional densities for the intraindividual model is facilitated by the extended Kalman filter (EKF).
The EKF approximates the conditional densities with Gaussian distributions.The conditional densities describe the distribution of the following measurement on the condition of all the previous measurements, so that the mean of the distribution is identical to the prediction of the following measurement, that is, the one-step prediction ŷ(|−1) .Likewise, the covariance of the conditional density will be the onestep prediction covariance  (|−1) .We have thus completely described the approximate Gaussian conditional densities by the conditional mean and covariance, which are The one-step prediction error   is given by Using the notation above, the Gaussian approximation of the first-stage distribution density function can be written as The second-stage density can be written as  2 (  | Ω), which is included in the same way as for ordinary differential equations.This gives us the full nonlinear mixed-effects likelihood function where is the approximate of a posterior log-likelihood function for the random effects of the th individual.It is observed that the likelihood function is based on the one-step prediction error.

The Parameter Estimation for the Nonlinear Mixed-Effects
Model with SDEs.As usual for nonlinear mixed-effects models, the likelihood function cannot be solved analytically.
Approximations, therefore, have to be made in order to estimate the parameters and Bayesian inferential method is considered.The Bayesian approach allows prior distributions to be incorporated with the likelihood function to evaluate the posterior distribution of the population parameters in the SDE model.Thus the first step is the choice of the prior distributions.Usual diffuse prior distributions can be chosen but the resulting posterior distributions may not be proper.Therefore, we propose to use standard prior distributions suggested by de la Cruz-Mesía and Marshall [14]: where  and Γ are the Wishart distributions and Gamma distributions, respectively.We select a uniform prior on  2 .In practice, the specification of hyper parameters  prior  , V prior  , ,  prior  , and  prior  may be very difficult; therefore, we choose the noninformative priors for the hyperparameters.
For the ODE model (see ( 1)) Gibbs sampling algorithms have been proposed in the literature [15].These algorithms will not be detailed here.For the SDE model, we propose to use a Gibbs algorithm.In the case, when the posterior distributions of the parameters have no explicit form, we propose to approximate it using the Metropolis-Hastings random walks.The convergence of the Metropolis-Hastings algorithm is ensured by the theorem proposed by Mengersen and Tweedie [16].
Given the starting value  (0) at iteration  = 0, the algorithm proceeds as follows.

Numerical Studies
In this section, we introduce the algorithm to compute the estimator of the parameters in the population pharmacokinetic model.The model structure is a linear twocompartment model with elimination from the central compartment.We use the simulation and an application to the real data to illustrate the reliability of the algorithm.

Simulation Study.
In the simulation, we simulate 100 datasets consisting of  = 10 individuals and   = 10 measurements from the following linear model which is proposed with independent Brownian motions on each equation and diffusion coefficients [17]: where  1 is the concentration of drug in the central compartment,  2 is the concentration of drug in the peripheral compartment,  10 ,  12 , and  21 are the rate constant,  is the measurement,  is the measurement error,   is the coefficient of variation for the measurement error,  is a standard Wiener process, and  1 and  2 are the magnitude of the system noise.For all simulations the sampling time period of the simulation interval was 5.The initial values for population parameters were   = [0.2,0.5, 0.25], diag(Ω) = (0.01 2 , 0.1 2 , 0.02 2 )  2  = 0.04,  1 = 0.1,  2 = 0.1.On the simulated datasets, parameters were estimated using algorithm described previously.We chose 10000 iterations to make sure of the convergence of the MCMC algorithm.Estimates were obtained as the expectation of the parameter posterior distribution of the last 5000 simulated trajectories of the SDE generated during the MCMC algorithm.The true values and estimators are reported in Table 1, where the 95% credibility intervals are given in brackets.The results are presented in Figure 1. Figure 2 shows the basic goodness-offit graph for simulated data.From the results, we can conclude that the parameters are well estimated.
The results of SDE and ODE, as presented in Table 1, demonstrate that the SDE estimation works much better than the ODE estimation.The estimation of SDE is much closer to the true value and the 95% confidence interval is smaller than the ODE method.Only the  12 has a relative rational result, and the other two parameters are not well estimated in ODE simulation.The result gives an obvious comparison of the two methods on parameters estimation, and it is believed that the SDE does better.
When the time points turned out to be less than 10, maybe 6 or even less, the SDE method does not perhaps perform well enough.Sparse data cannot satisfy the modeling need so the parameter estimation may be unreliable.Considering the sparse data, data augmentation, for instance, local polynomial, is a good tool dealing with sparse data in pharmacokinetics.The local polynomial data augmentation, based on limited data, inserts data into the adjacent data to get more data points, so as to satisfy the statistical need [18].Then the SDE models are applied to the augmented data, and expected estimation results can be reached.

Application to C-Peptide Data.
The method was used to model the C-peptide data [19].C-peptide, also known as linker peptide, is the secretion of islet  cells and released to capillary along with insulin.It reflects the capacity of   islet  cells to synthesize and release insulin, so it is useful in the diagnosis of islet cell tumor.In this chapter, we applied the proposed method to in vivo metabolism data of C-peptide.In 14 normal humans, a bolus (average mass of about 50000 pmol) of biosynthetic CP was intravenously administered.In order to avoid the confounding effect of endogenously secreted CP, CP pancreatic was suppressed through a somatostatin infusion (started two hours before the bolus administration and thereafter continued throughout the experiment).Blood samples were collected at min 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 14, 17, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 110, 120, 140, 160, and 180 and C-P plasma concentration was measured.For simplicity, we choose the plasma concentration data at min 2-11 of seven subjects with same initial dose and time interval and establish a mixed-effect model based on ODE to analyze the data.Twocompartment models are recommended in some literature to describe this process.
So the CP plasma concentration data is modeled with two-compartment model for intravenous administration and parameters are estimated by Kalman filter and Bayesian inference method.The results are listed in Table 2.
Figure 3 describes the CP concentrations data (given by circle) and prediction data (given by asterisk).
Figure 4(a) describes the relationship between observed concentration and population predictions.In Figure 4, "pred" represents population predictions, "ipred" represents individual predictions, and "DV" represents latent variables (observations).Figures 3 and 4 show a good linear relationship between observations and predictions of CP concentrations which illustrates that the actual data are well fitted and parameters are accurately estimated.

Conclusion and Discussion
The Kalman filter, also known as linear quadratic estimation, is an algorithm that uses a series of measurements observed over time, containing random variations and other inaccuracies, and produces estimates of unknown variables that tend to be more precise than those based on a single measurement alone.The Kalman filter is a recursive estimator, which means that only the estimated state from the previous time step and the current measurement are needed to compute the estimate for the current state.The Kalman filter has been widely used in many fields.Extended Kalman filter is an extension to Kalman filter, which gradually becomes the standard method to deal with the parameter estimation in nonlinear system.
In the present work, we apply extended Kalman filter to population pharmacokinetics model based on SDE.We bring the model to a more generic situation by changing single status equation to status equation set.To overcome the difficulty of getting the likelihood function for complex model, we use extended Kalman filter method to get the approximation likelihood function and then estimate parameter values by Bayesian inference, but there are still many more problems in statistical inference of stochastic differential equations worth our deeper study.An interesting area for future research is the exploration of the model with covariate.Moreover, the extension of this work to multidimensional SDEs would also be an interesting direction.

Figure 1 :
Figure 1: Simulation data are given by the circle and prediction data are given by the asterisk linked in a line.The 95% credibility intervals are given by the dotted line.

Figure 2 :
Figure 2: Basic goodness-of-fit graph for simulated data.(a) Plot of observed versus population predicted value and (b) observed versus individual predicted value.The solid lines are the lines of identity.
Figure 4(b)   describes the relationship between observed concentration and individual predictions.

Figure 3 :
Figure 3: C-peptide concentrations data are given by the circle and prediction data are given by the asterisk.

Figure 4 :
Figure 4: (a) Relationship between observed concentration and population predictions.(b) Relationship between observed concentration and individual predictions.
Based on Ordinary Differential Equation.Let  = (  ) 1≤≤ = (  ) 1≤≤,1≤≤  ,  = 1, . . ., ,  = 0, . . .,   , be the true observations, where   is measurement for individual  at time   ,  is the number of individuals, and   is the number of measurements for individual .Traditional nonlinear mixed-effect models usually model this process by nonlinear mixed-effects model based on ordinary differential equations.Formally, the classical nonlinear mixed-effects model is written as

Table 1 :
Mean estimates and 95% credibility interval in brackets obtained from the SDE mixed-effects models and ODE models on 500 simulated datasets.

Table 2 :
The parameters estimators of C-peptide concentrations.