BLASTING NEUROBLASTOMA USING OPTIMAL CONTROL OF CHEMOTHERAPY

Abstract. A mathematical model is used to investigate the effectiveness of the chemotherapy drug Topotecan against neuroblastoma. Optimal control theory is applied to minimize the tumor volume and the amount of drug utilized. The model incorporates a state constraint that requires the level of circulating neutrophils (white blood cells that form an integral part of the immune system) to remain above an acceptable value. The treatment schedule is designed to simultaneously satisfy this constraint and achieve the best results in fighting the tumor. Existence and uniqueness of the optimality system, which is the state system coupled with the adjoint system, is established. Numerical simulations are given to demonstrate the behavior of the tumor and the immune system components represented in the model.

1. Introduction.Neuroblastoma is a type of cancer that consists of crest cells found in tumors of nerve tissues [4,6].Crest cells, which resemble the early stages of cells developing in the embryo or fetus, take part in the development of the nervous system and other tissues [2].Two-thirds of this cancer originate in the abdomen.Another one third begins in the adrenal glands or the sympathetic ganglia found in the chest, neck, or pelvis.Children under age ten and infants are the most likely patients who acquire neuroblastoma [3].According to the Center for Disease Control statistics for the year 2003 [8], one quarter of all youth cancer deaths are attributable to the neurological system.
One particular drug that has proven successful in the reduction of cancer cells is the drug Topotecan (TPT), which is a novel semisynthetic antiderivative of the anticancer agent Camptothecin [7].This drug interacts with Topoisomerase I, which is an intranuclear enzyme in the body, to inhibit the replication of DNA and thus result in cell death [5, 23,36].Several studies have been conducted using several dosage strategies of TPT in order to kill tumors and reduce side effects [36].One technique often used to determine appropriate dosage levels is called pharmacokinetic-based (PK-based) dosing [5].This approach utilizes careful monitoring of drug levels in order to tailor treatments to the particular physiology of each patient [4,5].
Optimal control theory is a useful mathematical approach for maximizing the results of various treatment strategies.This theory has been applied to biological and mathematical models such as the interaction between tumor cells and chemotherapy [13,24,25].Studies by G. W. Swan helped open the door to using optimal control with biology.In 1980, Swan [34] published a study on optimal control in some chemotherapy problems; two years later, he applied optimal control to diabetes mellitus [35].Ten years after Swan's paper, J. M. Murray published a more in depth paper on the same subject by adding aspects such as general growth and loss functions [28].Murray is just one of many authors who have furthered the study of using optimal control with biological models.Control theory has successfully been applied to models that maximize the effect of the chemotherapy while minimizing the damage due to toxicity by Kim et.al [21] , Swan and Vincent [33], and Murray [29].Optimal control has also been applied to studies for other treatments, such as immunotherapy for cancer (dePillis et.al [11], Ledzewicz et.al [24,25]) and HIV by Kirschner et.al [22].Costa et.al [9,10] also published work involving optimal chemotherapeutic protocols which include inequality constraints of the control and state variables.
We will discuss a system of differential equations which model TPT plasma PK, tumor growth, and absolute neutrophil count (ANC) (Section 2).We then analyze the existence, characterization, and uniqueness of the optimal control in Sections 3 and 4 respectively.In Section 5, numerical analysis and results are then given with explanations relating the numerical results to clinical findings.

2.
Model.The model used in this study was developed by a research team at St. Jude Children's Research Hospital [31], using clinical data generated at St. Jude's.The model consists of three components: the TPT plasma PK, the tumor growth model, and the ANC (Absolute Neutrophil Count) model.The state variables are defined as follows: y c -amount of drug in the plasma y p -peripheral compartment for the TPT concentration P -total number of proliferating tumor cells Q -total number of quiescent tumor cells N p -concentration of proliferating immune cells, including stem cells, colony forming units, myeloblasts, promyeloblasts, and myelocytes concentration of nonproliferating blasts, including metamyelocytes, bands, and segmented blasts.
N circ -concentration of circulating neutrophils u(t) -amount of TPT per unit time (control variable) The system of equations describing the collective behavior of these states is given by where the value of each parameter is found in Table 1.Note that the values used represent averages taken from clinical data, [31].In addition, values of certain parameters are altered in the presence of G-CSF (granulocyte colony-stimulating factor).G-CSF is produced naturally and increases both the growth rate of stem cells and the rate at which new neutrophils are formed.Typically, additional G-CSF is administered at the end of a TPT treatment cycle in an attempt to rejuvenate levels of circulating neutrophils.Please note that the effects of the body's natural production of endogenous G-CSF is already incorporated into the determination of the parameters k in , K m , and k bp ; the parameters k ingcsf , K mgcsf , and k bpgcsf all have adjusted values in the presence of exogenous G-CSF.The implementation of these adjusted parameters is discussed in the numerical simulation section.
For clarity, we give a brief explanation of the system and refer the reader to the aforementioned article for more details.A two compartment model is used to describe the first two equations in which the equations explore transfer from a plasma TPT compartment to a peripheral TPT compartment with rates k cp and k pc .There is natural decay associated with the plasma TPT and the control is implemented as the amount of drug in the plasma compartment.Within the second compartment, equations ( 3) and (4) model the tumor growth.Within this model, the α and β parameters describe the transition from proliferating and quiescent compartments.In the third compartment, the neutrophil production in the bone marrow is modeled from stem cell production and differentiation to circulation.Michaelis-Menton terms for circulating neutrophils as well as the TPT concentration are included in equation ( 5) to take into account that lower neutrophil counts stimulate G-CSF.This growth factor in turn activates the production of stem cells.Equations ( 6)-( 8) incorporate the transitions from movement of the blasts from various phases of differentiation.
The inclusion of this constraint represents the need for the immune system to be at a sufficient strength to handle the harmful side effects of chemotherapy, as measured through the ANC.We incorporate this constraint as a component of the state system, as in Kirk [20].To begin, we rewrite this constraint in the form h(x(t), t) ≥ 0, which is, in this situation, h(x(t), t) = (N circ (t) − 500) ≥ 0. We then define an additional state variable Z through the following: where H : R → R is defined by H(t) = 0 if t < 0 and H(t) = 1 elsewhere.Furthermore, Z(t) must satisfy the boundary conditions Z(0) = Z(T ) = 0. We now treat Z(t) as an additional state variable and include equation ( 11) with the state system (1) -( 9).
For convenience, we define the system (1)-( 11) in matrix form: let System (1)-( 11) can now be expressed as x ′ = Ax+B +u.Note that Ax contains only linear terms, while B contains the nonlinear terms.
3. Quadratic Control.We apply optimal control theory to the model given by equations ( 1)-( 11) in order to find the treatment strategy that produces the best results.In particular, we examine a specific aspect of the research done by Panetta, et.al. [31]; namely, the effect of varying topotecan systemic exposure given a fixed administration schedule.The object of applying optimal control theory to this problem is to determine the ideal topotean systemic exposure; that is, the dosage of topotecan that results in a minimization of both tumor volume and myelosuppression.Note that, in [31], the authors also investigate the effects of altering the TPT administration schedule; the research discussed in this work only addresses variations in the amount of topotecan administered according to a fixed schedule.
3.1.Objective Functional.The cost functional balances the competing objectives of minimizing the tumor size and the detrimental cost associated with the drugs administered.This minimization of the quadratic cost can be considered as a minimization of the cost to the system, whether it be a financial cost or a cost in relation to the health of the patient.In other words, the objective functional is to be minimized over the set of admissible controls where ε is a weight parameter used to emphasize the importance of minimizing the control relative to the salvage terms (the choice of value for ε will be discussed in the numerical simulations section).We choose a quadratic control term in the objective functional for convenience in finding an analytic representation.
3.2.Existence.First, we obtain boundedness of the state system given an optimal control in the admissible control set U .We then establish the existence of an optimal control using a standard theorem from Fleming and Rishel [15].
Proof.We recognize that the system (1)-( 11) componentwise has a lower bound independent of u.We consider upper bounds of the state system using the following notation: We then require a uniform upper bound for the state system; therefore, we define x as the supersolution of the system.Then T , and u = 1 0 0 0 0 0 0 0 0 0 T .
We denote x as the supersolution of the system.Since both N p and N circ are finite populations of cells, we let N max p be the upper bound for N p (t) and N max circ the upper bound for N circ (t) 2 .Since this is a linear system in finite time with bounded coefficients then the supersolution x is uniformly bounded.We can use a comparison result to obtain that the original state system is bounded.See [12] for similar analyses.
With the boundedness of the state system established, we now prove the existence of the optimal control using a result from ([ [15], Thm 4.1, pg.68-69]).Theorem 3.2 (Existence of a Quadratic Optimal Control).Given the objective functional (12), subject to the system given by equations (1)- (11), with y c (0) = y c0 , y p (0) = y p 0 , P (0 and the admissible control set (13) then there exists an optimal control u * (t) such that if the following conditions are met: 1.The class of all initial conditions with a control u(t) such that u(t) is a Lebesgue integrable function on [0, T ] with values in the admissible control set and such that the state system is satisfied is not empty.2. The admissible control set U is closed and convex.3. The right hand side of the state system is continuous, is bounded above by a sum of the bounded control and the state, and can be written as a linear function of u(t) with coefficients depending on time and the state variables.4. The integrand of the objective functional (12) is convex on U and is bounded below by −c + c 2 |u| β where c 2 > 0 and β > 1.
Proof.The system (1)-( 11) is bounded from Lemma 3.2.As the coefficients of this system are also bounded, a solution to the system exists using a theorem from Lukes [27], which fulfills condition 1.The second condition is fulfilled from the definition of the admissible control set U .For the third condition, note that since the parameters V, K m , and IC 50 are all positive and N circ (t) and y c (t) are nonnegative for all t in the finite time interval, then the right hand side of system (1)-( 11) is continuous.Using the boundedness of the solutions, we also have that where Ω is the maximum of |A| and |B| .For the final condition, it is clear that since the objective functional is squared then it is convex and satisfies the needed bound with β = 2 and c 2 dependent on ǫ.

Characterization.
Having proven the existence of an optimal control, we now determine the analytic representation for that control.The form of the adjoint equations and the transversality conditions are standard results from Pontryagin's Minimum Principle [32], with the exception of the additional conditions imposed by constraint (10).The constraint is incorporated into the state system through equation ( 11), as before.The Hamiltonian, used to find the characterization of the optimal control, is given by Theorem 3.3 (Characterization of the Optimal Control).Given an optimal control, u * (t), and solutions of the corresponding state system, there exist adjoint variables λ i for i = 1, 2, ..., 10 that satisfy the system λ ′ = −A T λ + C, where λ is the vector of adjoint variables, A T is the transpose of the coefficient matrix defined in Section 2, and C is the vector of nonlinear terms given by In addition, the adjoint variables satisfy the transversality conditions λ 3 (T ) = 1, λ 4 (T ) = 1, and λ i (T ) = 0 for i = 1, 2, 5, 6, 7, 8, 9, 10.
Furthermore, u * (t) can be represented by Proof.Suppose u * (t) ∈ U is an optimal control and X = (y c , y p . . ., Z) is a corresponding solution to the system (1)-( 11), (11).The existence of adjoint variables λ 1 , . . ., λ 10 satisfying the transversality conditions stated in Theorem 3.3 is guaranteed by Pontryagin's Minimum Principle [32].By analyzing the bounds on the controls and the representation in the interior of the set, we find that 3.4.Uniqueness.Using the bounds for the state equations, the adjoint system has bounded coefficients and is linear in each adjoint variable.Hence, the solutions of the adjoint system are bounded for the final time sufficiently small.Theorem 3.4.For T sufficiently small, the solution to the optimality system is unique.
Proof.We suppose that we have two solutions to our optimality system.One solution has the previously mentioned variable names and the second has the use of bars associated with each variable name.We will choose m > 0 such that for the state equations, for example, y c = e mt h 1 , and for the adjoint variables associated with the equations moving backward in time, we have i.e. λ 1 = e −mt w 1 .
Next, we subtract the equations for h 1 and h1 , h 2 and h2 , etc.The resulting equations are then multiplied by an appropriate function and integrated from zero to T.
To complete the proof for the uniqueness of the optimal control, the integral representations of (h 1 − h1 ), (h 2 − h2 ), (h 3 − h3 ), etc. and (w 1 − w1 ), (w 2 − w2 ), and so forth are combined, and estimates are utilized to obtain where D 1 , C depend on all coefficients and bounds on all solution variables.We choose m such that m−D 1 − Ce 2mT > 0. Since the logarithm is an increasing function, then if m > C + D 1 .Thus, this gives that T < 1 2m ln m−D1

C
. With this choice of m and T , we obtain that each of the squared term in the integral terms must be zero.Based on the uniqueness of the optimality system, the optimal control is thus unique.
4. Numerical Simulations.Numerical investigations were implemented using the Forward-Backward Sweep method found in [26].This technique incorporates a Runge-Kutta method to solve the state system forward in time and the adjoint system backward in time.Note that the problem of solving the adjoint system backward in time is actually an initial value problem, using the transversality conditions given in Theorem 3.3 as initial values.The control is then updated using its analytic representation, which is given in terms of the state and adjoint variables.In theory, this process is repeated until the relative error between the current control and the control found during the previous iteration is sufficiently small; i.e., u− oldu u ≤ η, where oldu represents the control vector from the previous iteration, • is the ℓ 1 norm, and η > 0 is negligibly small.In practice, however, we must allow for the control to be zero; thus, the convergence requirement we use is η u − u− oldu ≥ 0.More information about the convergence and stability of this numerical method may be found in [17].
The optimal control representation provided the impetus for the numerics given using the requirements that the state constraint had to be satisfied.Numerical simulations were performed according to the fixed schedule investigated in [31]: five concurrent days of TPT treatments, two days of rest, another five days of TPT, followed by the administration of G-CSF for 8 days and two days to rest before beginning the cycle again.This 22-day cycle was repeated three times.Note that the control is only implemented during those phases of the treatment where TPT is being administered; this represents the goal of optimizing TPT systemic exposure.The parameters k in , K m , and k bp are replaced with the parameters k ingcsf , K mgcsf , and k bpgcsf during each 8-day administration of exogenous G-CSF in the simulations.In addition, the value of the weight parameter ε in the objective functional (12) is set equal to 1 for all simulations; varying this parameter had no noticeable effect for several orders of magnitude.
Simulations were performed with various combinations of tumor size and immune system strength.The following two cases correspond to simulations run with average values for both tumor size and ANC levels.The first case represents a patient whose neutrophils regenerate at an average rate throughout treatment, while the second case simulates a patient with slower rate of regeneration.The first case also incorporates a higher rate of effectiveness for the TPT.The parameter values used for each scenario are given in Table 2.All parameters used during the numerical simulations are within observed ranges determined during clinical trials [31].
Figure 1 shows the behavior of the optimal control u * (t).The optimal control on each period of TPT administration is found to be maximal.Note that the control The reactions of the tumor populations -both proliferating and quiescent -are plotted in Figure 2. In this case, the TPT is effective in fighting the tumor.The proliferating population is suppressed whenever TPT is administered; however, it resurges during the G-CSF administration and the rest periods.This resurgence does appear to be waning as the treatment continues, as the final population of the proliferating cells is smaller than the initial population.The TPT appears to have a more significant effect on the quiescent tumor population; in this simulation, the number of quiescent tumor cells decreases steadily throughout the treatment.Figure 3 displays the levels of the circulating neutrophils over the course of treatment.The N circ populations must be kept above 500 cells per microliter of blood, as anything below this level constitutes a condition called neutropenia, in which the body is severely limited in its ability to fight off infection.During the simulations, we notice a consistent pattern; the levels of circulating neutrophils behave cyclically in response to the various phases of the treatment schedule.To relate these three graphs, we note that when the control turns on there is a general decrease in the amount of tumor cells and circulating neutrophils.When the drug is not administered, both the tumor cells and the circulating neutrophils begin to regenerate, with the tumor cells reproducing at a faster rate.In the presence of G-CSF, the circulating neutrophils rebound to slightly above the initial level, while the tumor cells also regenerate substantially.
The second case is not as successful.The implementation of the control is once again maximal on each period of TPT administration; however, here the treatment is not as effective.This case represents a patient whose tumor cells do not respond as well to the TPT and whose immune system is sluggish during treatment.Figure 4 illustrates the reduced efficacy of the TPT in this second case.Here we see a similar behavior to that in case 1; however, here the final tumor volume at the end of each 22-day treatment cycle is greater than the initial tumor volume.This behavior occurs consistently in both the proliferating and quiescent populations.
The levels of circulating neutrophils during the second case are depicted in Figure 5.Here the N circ levels increase sharply in response to the initial TPT dose, then decrease equally sharply before settling into a cyclic behavior similar to that in case 1.Note that while the TPT is not having quite the effect desired on the tumor populations, here the levels of circulating neutrophils remain relatively stable.

5.
Discussion.This research indicates the existence of dosing schedules that significantly reduce tumor size while maintaining the ANC above an acceptable level.However, the simulations suggest that the effectiveness of TPT as a treatment option for neuroblastoma patients will vary significantly from individual to individual.The concept of giving bolus injections at the beginning of a treatment cycle has been demonstrated as optimal here; other scientists [13,16,24,25] have shown that this strategy is effective in reducing tumors.In the clinical setting, it has been understood that treatments have been administered as bolus injections because of the restrictions due to patient health and clinical staff availability [30].It is interesting to note the similar results found in this work and [30].Yet, we recognize that we are giving the maximum treatment during each time interval since the problem has been constrained by the predetermined time intervals.Here, the optimal strategy for the patients with average immune systems have higher success rates for reduction of the proliferating tumor cells.In a patient with a weakened immune system, the optimal mathematical strategy produces unsatisfactory results for the patient in that the proliferating cells are increasing.Although treatment in [30] is given over six 21-day cycles, the behavior of the ANC is quite similar and reflects the oscillatory behavior (induced by the chemotherapy treatments) encountered here.As noted in the weakened immune system patient, the mathematical strategy gives an answer but it is not beneficial to the patient.Therefore, future work will address the utilization of other mechanisms as the control of the G-CSF to see if this will allow for greater success for sicker patients.Another logical direction for future research is to optimize the dosage schedules outlined in [31].The results of that research indicate that alterations to the schedule may be more effective than increasing dosage levels.A third option is to implement linear control in the objective functional and investigate the possible existence of bang-bang and/or singular control.This could produce a clearer idea of the toxicity levels that the TPT produces in the mathematical setting.
Acknowledgements.The authors would like to thank Dr. J. Carl Panetta and the team from St. Jude Children's Hospital for their permission to further study the model presented in [31].This study owes its existence to the support of the Committee on Institutional Studies and Research (CISR) at Murray State University, as well as the National Science Foundation grant NSF-DMS-053-1865.We would also emphasize that all work contained herein is the sole opinion of the authors and does not necessarily reflect that of the CISR or NSF.
Appendix.The equations that define the adjoint system discussed in Theorem 3.3 are given below.

Table 2 .
Parameter values used in simulations.