Hepatitis C Viral Dynamics Using a Combination Therapy of Interferon , Ribavirin , and Telaprevir : Mathematical Modeling and Model Validation

Groundbreaking new drugs called direct acting antivirals have been introduced recently for the treatment of chronic Hepatitis C virus infection. We introduce a mathematical model for Hepatitis C dynamics treated with the direct acting antiviral drug, telaprevir, alongside traditional interferon and ribavirin treatments to understand how this combination therapy affects the viral load of patients exhibiting different types of response. We use sensitivity and identifiability techniques to determine which model parameters can be best estimated from viral load data. Parameter estimation with these best estimable parameters is then performed to give patient-specific fits of the model to partial virologic response, sustained virologic response and breakthrough patients.


Introduction
Over 200-300 million people worldwide are infected with a virus called Hepatitis C (HCV) that affects the liver, which was discovered in 1989 [1].It is usually spread by blood-to-blood contact via intravenous drug use, poorly sterilized medical equipment and transfusions.
Scarring of the liver and ultimately cirrhosis are just a few of the more severe complications associated with HCV [2].
Six different genotypes of HCV exist due to the highly error prone RNA polymerase with the most common being genotype 1 that has the lowest levels of response to standard treatment [3,4].Genotype 1 patients have about a 50% chance for sustained virologic response (SVR), while nongenotype 1 patients have about an 80% chance for SVR [5].The clinical data used for this study were provided by the University of Sao Paulo, School of Medicine in Sao Paulo, Brazil and consist of genotype 1 patients.
One of the first treatments for HCV was 6-12 months monotherapy with interferon glycoproteins.Interferon is naturally secreted from our bodies to fight off infection and monotherapy treatment with them is associated with around 10% SVR [6].The addition of ribavirin (RBV), a drug believed to render some of the virus non-infectious, increased SVR to around 30% [6].RBV monotherapy is not recommended because it does not give a significant benefit to SVR [7].Until recently, the most common therapy was a combination of pegylated Interferon (IFN) and RBV for 24-48 weeks that yielded about a 45% SVR [5,6].One of the major differences between IFN and standard inteferon glycoproteins is that the pegylation allows the drugs to stay in the body longer [8].There have also been clinical trials with RBV monotherapy before and after IFN + RBV therapy as described in [9].Recently, new drugs called direct-acting antiviral agents (DAAs) have raised the chance for SVR for HCV patients.
DAAs give an increase to about an 80% chance for SVR for genotype 1 [10].According to the FDA, DAAs are drugs that interfere with specific steps in the HCV replication cycle by taking advantage of the biological makeup of HCV [11].HCV is a single-stranded RNA molecule that is several nucleotides in length.During HCV's life cycle, it is translated into a polyprotein that is composed into structural and nonstructural proteins that aid in replication.During posttranslational processing, DAAs called protease inhibitors block a key protease from the replication process and hinders further infection [10,12].Among the protease inhibitors available are boceprevir, telaprevir and simeprevir.Simeprevir is recommended over telaprevir and boceprevir because of both improved efficacy and less side effects, but telaprevir continues to be used because of its cost efficiency [13,14].
Integration of mathematical modeling of viral dynamics with clinical data has led to further understanding of how different treatment strategies dictate viral load dynamics.One of the first mathematical models was given by Neumann et al. which attempted to describe HCV dynamics with interferon monotherapy [4].Improvements were made to the Neumann's model to better describe different mechanisms in the liver during treatment including the regeneration of liver cells.Adjustments were also made to include the standard of care, IFN, and RBV.Some of these modifications can be found in [5,15].In particular, Snoeck et al. [5] had data after the end of the treatment phase so that the model can give a more accurate representation of its prediction of SVR.The introduction of DAAs has ushered in more mathematical models that include this type of therapy [16].For example, mathematical models have been proposed using telaprevir monotherapy [17][18][19][20] and in combination with IFN and RBV [18] that uses Bayesian feedback to estimate the parameters in the model.The challenges that come with modeling DAAs is that since they are relatively new, there is not as much data available [17].It can be difficult to predict SVR because of a lack of data after the treatment phase ends due to how recent the drugs have been approved.
This chapter introduces a novel approach for the development of a mathematical model describing HCV dynamics given the triple-drug combination treatment of IFN, RBV, and the DAA telaprevir.In Section 2, we describe how we adapted a previously known HCV model to include telaprevir and the available clinical data.Section 3 describes the a priori analysis of sensitivity and identifiability and its incorporation into the parameter estimation problem.Section 4 gives the parameter estimation results using several patient specific clinical data including partial virologic response, sustained virologic response and breakthrough.Finally, concluding remarks are provided in Section 5.

Mathematical models of HCV dynamics
The original model for HCV dynamics in Neumann et al. [4] was frequently used to assess viral-load profiles after short-term treatment and is given by where T and I denote the concentrations of healthy and infected hepatocytes, and V represents viral concentration in the liver fluid.One of the key contributions of the model was the understanding of the mechanism of IFN.It was unknown whether it acted through η > 0 (i.e., inhibiting the infection of healthy liver cells) or ε > 0 (i.e., reducing virion production in infected cells).In [4], it is determined that it is through ε which inhibits production of the virus.The drawback to (1) is that it cannot describe patients exhibiting breakthrough, relapse, and most importantly SVR.These responses are reasons that early viral response does not uniformly predict responses in the long term.Another important aspect is the handling of viral load measurements below the lower limit of quantification (LLOQ).Previous analysis omitted the data below LLOQ, but it can contain critical information regarding long-term treatment outcome.Snoeck et al. [5] present a mathematical model for the dynamics of HCV with the drug treatment combination of IFN and RBV that attempts to address both the long-term responses and the use of the LLOQ.The model described in [5] is given by the following system of nonlinear differential equations where T (uninfected hepatocytes), I (infected hepatocytes), V I (infectious virions) and V NI (noninfectious virions) are natural states (international units IU/mL).This model was adapted from a standard model of viral infection [4].The number of uninfected hepatocytes increases each day with reproduction rate s and regeneration rate r.That number decreases each day as those hepatocytes die naturally at a rate d or infected at a rate β.The maximum number of hepatocytes per mL is T max .The number of infected hepatocytes increases when the healthy liver cells are infected and when the infected cells regenerate themselves.That number decreases when they die off naturally at a rate δ.Infected hepatocytes produce both infectious and noninfectious virions at a rate p. Virions are naturally cleared at a rate c.IFN inhibits virus production while RBV renders some of the virus noninfectious.The drug efficacies of IFN and RBV are represented by ε and r, respectively.The bounds for IFN and RBV are 0 < ε ≤ 1 and 0 < r ≤ 1 where the more effective the drug is, the closer the efficacy of the drug will be to 1. Snoeck uses data that extend beyond treatment for patients so the terms ε and r in (2) account for the exponential decays of the efficacies of the drugs after treatment has ceased.The exponential decay of the drug efficacies is given by and where k is the efficacy decay rate, t end marks the end of treatment, and

&
The drug efficacies ε and r are related to the drug dosage levels by the following expressions and where Dose PEG is the weekly subcutaneous dose of IFN and ED 50 PEG is the estimated weekly dose that causes 50% inhibition of virion production.Dose RBV represents the daily dose of RBV/kg body weight, and ED 50RBV represents the estimated daily dose in mg/kg that makes 50% of the virions noninfectious.Biologically, all state variables and parameters are nonnegative.Typical values for model parameters used by Snoek et al. [5] are given in Table 1.

HCV model with DAA
Snoeck's model is adapted to incorporate the DAA, telaprevir.Recall that a DAA targets specific parts of the genome of the virus to inhibit both replication and infection.The hindrance of replication of the virus in the infected hepatocytes results in the virus not being produced by those cells.This means that the DAA should be implemented as part of the infection term, βTV I , for inhibiting infection and viral production terms, pV I and pV NI , for inhibiting replication of the virus in (2).However, after simulations and analysis, it is concluded in this study that the obstruction of the infection and replication of the virus by telaprevir can be described solely as an amplifier for mitigating the production of virions alongside IFN.With this assumption, the model in [5] is modified to include the triple drug combination of IFN, RBV and telaprevir as follows: where γ represents the exponential decay of the telaprevir efficacy and is defined similarly as for ε and r (see ( 3) and ( 4)).In [21], existence and uniqueness of solutions to this updated HCV dynamical model were established, and a steady-state stability analysis was also performed.

Treatment schedule
The data in this research uses the treatment schedule timeline as follows (also summarized in Figure 1).
1.The patient is treated with the triple drug combination of IFN + RBV + telaprevir for the first 12 weeks.

5.
End of treatment at 48 weeks.

Subset selection
The forward problem refers to using a model to predict the future behavior of a system given a set of parameters.The inverse problem, on the other hand, is the parameterization of a model from empirical data [22][23][24].There have been extensive studies about parameter selection while solving the inverse problem for biological models and other applications that can be found in [3,22,[25][26][27] and references therein.In this study, we use a simple algorithm to choose a subset of parameters to be estimated from clinical data based on both sensitivity and identifiability as follows: 1. Start with the full parameter set Q.

2.
Remove parameters that are not locally sensitive to attain Q S ⊂ Q.

3.
Remove parameters that are not locally identifiable from Q S to obtain sensitive and identifiable parameter set Q SI Since these are local analyses, this procedure is repeated over a large number of parameter sets and the parameters that appear most often in Q SI are the parameters that are estimated.All other parameter values are fixed to values from the literature.A biological and structural explanation for some of the fixed parameters is given in the next section.

Fixed parameters
The assumptions for fixed parameters are the same as in [5].Since the maximum number of hepatocytes in the liver is 2:50 Â 10 11 and HCV RNA is distributed in plasma and extracellular fluids with a volume of $ 1:35 Â 10 4 ml, then T max ¼ 2:50Â10 11 1:35Â10 4 ¼ 1:85 Â 10 7 .d is obtained from hepatocyte turnover being every 300 days and s ¼ T max Á d can be deduced in the absence of liver disease.p is always fixed because p 1 À ε ð Þ appears in _ V and _ V NI making p and ε, impossible to estimate uniquely.The rest of the parameters will be considered in the sensitivity analysis.

Sensitivity analysis
A sensitivity analysis is the process of understanding how the model output is affected by changes in the parameters.Sensitivity analyses are used in many branches of mathematics such as statistics, partial differential equations (PDEs), and control design [28,29].The parameters that give the most change in the output are said to be sensitive parameters.This is important in the forward problem because it allows an understanding of which parameters will give useful information.Once the parameters have been identified, a sensitivity analysis for the inverse problem is usually performed to determine the sensitive parameters.Parameters with minimal impact are fixed from the literature.There are two different types of sensitivity analysis: global and local.A global sensitivity analysis heavily depends on the structure of the model and quantifies how uncertainties in outputs can be apportioned to uncertainties in inputs.We refer the reader to [30] for a more comprehensive discussion.Our study uses a local sensitivity analysis that depends on the prescribed values of the parameters.

Sensitivity equations
The sensitivity analysis presented in this section uses a derivative-based approach.Consider the general form of an ODE model and a function z of its output whereby the vectors y and q contain the variables and parameters of the model, respectively.Since we are concerned with how our model output, z, is influenced by changes to our parameters, q, then we consider the partial derivative of z, ∂z ∂q , with respect to q.One approach to computing this partial derivative is by solving the associated sensitivity equations.Differentiating both sides of the output Eq. ( 8) with respect to the parameter q yields since ∂t ∂q ¼ 0 and ∂q ∂q ¼ 1.The two components ∂g ∂y and ∂g ∂q can be directly calculated from g, but can be cumbersome to do by hand depending on the complexity of the function g.Thus, one can employ automatic differentiation to evaluate these derivatives.Since any mathematical function can be decomposed into elementary functions, automatic differentiation numerically implements the chain rule and basic arithmetic equations repeatedly to compute the total derivative of a function with accuracy to working machine precision [31].This is achieved with table lookups and tabulating all the functional compositions [32,33].An automatic differentiation (AD) code developed by Martin Fink in MATLAB was employed [34].Finally, to calculate ∂y ∂q , it is noted that y is continuous in t and q.Since ∂y ∂q exists, by taking the partial derivative with respect to q of the state equations and reversing the order of differentiation [35], we obtain Similar to ∂g ∂y and ∂g ∂q , ∂f ∂y and ∂f ∂q are calculated using automatic differentiation.From (10), the sensitivity equations are given by the following coupled system of differential equations Solving the sensitivity equations yields ∂y ∂q , which, in turn, gives ∂z ∂q from (9).

Model considerations and sensitivity results
The sensitivities of each parameter are ranked to obtain which parameters are most sensitive.Since there is a large range of parameter and viral load values, each parameter, q j , is log scaled in association with the state variable, y, that is, . This allows a comparison of the sensitivities of each parameter using similar magnitudes.The l 2 À norm is used to nondimensionalize the sensitivities over time so the following sensitivity coefficient is considered for each parameter Eq. ( 12) is defined to be the relative ranking sensitivity of each variable y i in y with respect to each individual parameter q j .
Since the local sensitivity analysis depends on values in q, independent sets of parameters that have a log-normal distribution are created from the population-based model fit in Snoeck et al. [5].That is, a sequence of independent parameter sets q k È É is generated from this distribution using the typical values from [5] as the mean.To determine pseudo-global sensitivities, a sensitivity coefficient, S k ij , is computed for each parameter in the k th parameter set.Then, if B parameter sets are to be analyzed, then an average for all the parameter sets is computed by A cutoff is determined based on the ranking of the averages attained in (13).Those parameters above the cutoff are further examined in the identifiability analysis.This method is a version of what is referred to as Morris Screening in [30].Similar to the work done here, the Morris algorithm [36] averages local derivative approximations to provide more global sensitivity measures.The difference being that the variance in the parameter sets is also considered.Here that variance would be given by As explained in [30], while the mean (13) quantifies the individual effect of the input on the output, the variance ( 14) estimates the combined effects of the input due to nonlinearities or interactions with other inputs.The reader is referred to [30,36] and references therein for a more detailed analysis of Morris Screening.It is noted that only the marginal distributions are given in [5], so computations are ignorant of any covariances between parameters.The data that were used contain only the viral load observations.So the sensitivities of 8) is considered where with output Two different sets of time points are used during this analysis.The first and second set of time points come from the partial virologic response (PVR) case and Breakthrough case, respectively.This will provide a better illustration of sensitivities given that treatment decays in the Breakthrough case, but does not in PVR.The sensitivity rankings are given in Figure 2 for over , respectively.These parameters are considered in the identifiability analysis.Note that γ is always considered in the identifiability analysis due to there not being a value from the literature to fix it to for this model.It is used to determine whether it affects the identifiability of other parameters.

Identifiability analysis
After deciding which parameters are sensitive, consideration is given to understanding which sensitive parameters can uniquely be identified from the data.In this study, we employed a sensitive-based approach for local identifiability analysis.To this end, we consider the parameters contained in q which minimize the cost function with V t i ; q ð Þ denoting the model output and V i d denoting the corresponding data value at time point t i for i ¼ 1, …N, where N is the number of data values.Similar to [37], let us assume that q * is the minimum of this cost function.Then by using a Taylor series expansion around q * , we obtain If we only consider the first two elements of V t i ; q ð Þ under the assumption that q ≈ q * and substitute this expression into the cost function we find that

J q
where we used the fact that q * is the minimum of the cost function so that be an N Â l ð Þsensitivity matrix relating to the sensitivities dV dq j t i ð Þ of the output with i ¼ 1, …, N and j ¼ 1, …, l, where l denotes the number of parameters.The cost function of ( 15) is rewritten in terms of this sensitivity matrix where Δq ¼ q À q * .Rearranging Δq ¼ q À q * , we formulate the cost function in terms of q * þ Δq: If we suppose that Δq is an eigenvector of S T S with S T SΔq ¼ λΔq, then we have We note that if Δq is an eigenvector with eigenvalue λ ¼ 0, then the cost function to secondorder approximation is J q * þ hΔq ð Þ¼0: The least squares cost function does not change values when moving from q * to q * þ hΔq, with h arbitrary.Thus, the parameters are locally unidentifiable at q * .If S T S has very small eigenvalues, this can also be a problem for parameter identification.There have been studies about how the Fisher Information Matrix (S T S) can be used for parameter identification [38,39].For example, in [38], they search all possible parameter combinations and choose them based on the rank of the sensitivity matrix, S, and asymptotic standard error uncertainty.We use the following algorithm as described in [39] to determine which of the parameters in our model will be unidentifiable.
1. Create the matrix S T S, compute its eigenvalues, and order them such that is less than some threshold ε (typically taken to be 10 À4 ), we say that there is a parameter that is unidentifiable.

3.
The largest magnitude component of the eigenvector Δq 1 associated with the eigenvalue λ 1 corresponds to the least identifiable parameter.Remove the corresponding column from S and repeat step 1.
After performing this procedure, we now have a set of sensitive and locally identifiable parameters to estimate.The rest of the parameters are set to "typical values" found in the literature.The identifiability algorithm is applied to all the parameter sets of sensitive parameters, Q PVR and Q Brk , obtained in the previous section.It is observed from Figure 3 that the parameters in Q PVR ¼ δ; c; β; γ È É are identifiable at least 50% of the time and the parameters in Q Brk ¼ δ; c; β; γ; ε È É are identifiable at least 50% of the time.The parameters contained in Q PVR and Q Brk are those that will be estimated from the clinical data.

Parameter estimation
The parameters in Q PVR and Q Brk are estimated using the weighted sum of squares of errors (WSSE) given by Hepatitis C -From Infection to Cure J where w i is the weight for the error term log d is data measurement of viral load at the i th time point and V t i ; q ð Þ is the model output with parameters q.We used both sampling and gradient based methods to minimize this function implemented in MATLAB.The model was fit to three data sets; namely, PVR, ETR (end-oftreatment response) and Breakthrough.PVR represents when the patient has an initial positive reaction to the therapy, but then the viral load rebounds during treatment and never goes below detection.ETR represents when the viral load drops below detection and does not rebound.Breakthrough represents when the patient's viral load drops below detection, but rebounds.In our data, the LLOQ is 15 IU/ml.When the data drop below the LLOQ, least squared estimation does not suffice as a statistically rigorous methodology.Instead, we employ the expectation maximization (EM) [40] to compute maximum likelihood estimates of our patient specific parameters.For a detailed description of the EM algorithm, we refer the reader to [41].The RBV dosage depends on the patient's body weight and was sometimes modified during treatment due to different symptoms of the patients such as blood thinning.The patients experiencing PVR and Breakthrough had constant RBV dosage for the entire treatment while the patient exhibiting ETR had modified dosage.The RBV efficacy is fixed to r ¼ :1222 from [22] for the PVR and Breakthrough patients.The efficacies for the ETR patient were modified based on time, t, in days since initial treatment and are presented in Table 2.
The parameters not in Q PVR or Q Brk are fixed to the values in Table 3 from [5,22].As in [5], the infected steady state is used for the initial conditions for (7) because the patients considered had chronic infection.The values in Table 4 are obtained after estimating the parameters in Q PVR and Q Brk .These estimates produce the model fits (graphs on the left) and residuals (graphs on the right) in Figures 4-6.It is noted that in Figure 6, the ETR patient's viral load goes to zero, and the residuals for censored data are set to zero.
In practice, the mathematical model is never exact (model misspecification), and the data contain noise (human errors, instrument errors).Hence, confidence and prediction intervals are used to understand the extent of uncertainty involved in estimating our parameters.In   calculating these intervals, standard errors are computed from the model predictions using the parameters that have been estimated.Moreover, 95% parameter and predictive confidence intervals and prediction intervals for the PVR parameters (attached as half-widths in Table 4) and predictions are calculated using the asymptotic theory outlined in [22,27,30,41,42].The predictive confidence intervals and prediction intervals are shown in Figure 7.

Discussion
The higher values in c and δ in the ETR patient lead us to believe that the immune response along with the drugs has a stronger impact on the mutation and clearance of the virus.It is known that the immune response is strongly correlated with the clearance  of the virus.Since the initial conditions of (7) are at the infected steady state, introduction of the drugs could be a mechanism to jump start the immune response.We note that even when the virus is not cleared, telaprevir still has a strong impact on viral load decay.This behavior corresponds with how powerful DAAs can be in reducing viral load even when it rebounds.The rebound could be because of mutations which are neglected in this model as stated earlier.There is a dip at around the 150th day in the Breakthrough response that is unquantifiable due to lack of information regarding the other three states or a dynamic immune response.However, this type of dip is observed in [5,27] where data are available around this time.We conjecture that this dip is due to the immune response being stimulated by the spike in viral load and infection.The residuals in the PVR fit in Figure 4 seem to be i.i.d. because the errors seem to be randomly distributed and are on both sides of the zero axis.This is unlike the Breakthrough fit in Figure 5 which have most of the residuals above the zero axis.The predictive confidence intervals and prediction intervals look almost the same because the variance is very small, and the model fits the data very well.The reader is referred to [30] for further details on differences between the predictive confidence intervals and prediction intervals.

Conclusion
The missing data between weeks 12-24, 24-36 and 36-48 for the ETR and Breakthrough patients make parameter estimation challenging.The predictions would also be more robust if information concerning states T, I, and V NI were available.These issues should be considered when making remarks about the estimations and confidence measures.DAAs were introduced in 2011, so there is not as much data available, but in the future, we hope for a larger quantity of data to make more precise estimations.
This chapter describes a model for patients with HCV who are treated with IFN, RBV, and telaprevir combination therapy.The development of this model was motivated by the desire for a model that can be validated and calibrated using sensitivity and identifiability techniques while simultaneously incorporating the new DAA, telaprevir.The model can be used to accurately describe patients exhibiting PVR, ETR, and Breakthrough.

Figure 1 .
Figure 1.Treatment schedule for patients used for data received from patients treated at University of Sao Paulo, School of Medicine in Sao Paulo, Brazil.
2000 (a) and 400 (b) parameter sets, respectively.Error bars that are two standard deviations from the mean are included.The sensitive parameters for the PVR and Breakthrough time points are Q PVR ¼ δ; c; β; r; γ È É and Q Brk ¼ δ; c; β; r; r; γ; ε È É

Figure 2 .
Figure 2. Sensitivity rankings using PVR (a) and Breakthrough time points (b).

Figure 3 .
Figure 3. Final subset percentages using PVR (a) and Breakthrough time points (b).

Figure 4 .
Figure 4. Viral load model fit (a) and residual plot (b) for PVR patient data.

Figure 5 .
Figure 5. Viral load model fit (a) and residual plot (b) for Breakthrough patient data.

Figure 6 .
Figure 6.Viral load model fit (a) and residual plot (b) for ETR patient data.