Dose Finding for Drug Combination in Early Cancer Phase I Trials Using Conditional Continual Reassessment Method

We describe a dose escalation algorithm for drug combinations in cancer phase I clinical trials. Parametric models for describing the association between the doses and the probability of dose limiting toxicity are used assuming univariate monotonicity of the dose-toxicity relationship. Trial design proceeds using the continual reassessment method, where at each stage of the trial, we seek the dose of one agent with estimated probability of toxicity closest to a target probability of toxicity given the current dose of the other agent. A Bayes estimate of the maximum tolerated dose (MTD) curve is proposed at the conclusion of the trial for continuous doses or a set of MTDs is determined in the case of discrete dose levels. We evaluate design operating characteristics in terms of safety of the trial and percent of dose recommendation at dose combination neighborhoods around the true MTD under various model generated scenarios and misspecification. The method is further assessed for varying algorithms enrolling cohorts of two and three patients receiving different doses and compared to previous approaches such as escalation with overdose control and two-dimensional design.

Citation: Diniz MA, Quanlin-Li, Tighiouart M (2017) Dose Finding for Drug Combination in Early Cancer Phase I Trials Using Conditional Continual Reassessment Method. J Biom Biostat 8: 381. doi: 10.4172/2155-6180.1000381


Introduction
Cancer management and treatment have seen major advances over the last two decades with the widespread use of targeted therapy and, more recently, immunotherapy. These treatments often combine cytotoxic agents with biologic and immunotherapy drugs, and possibly radiation, because combining several drugs can help reduce tumor resistance by targeting different signaling pathways simultaneously. Using the optimal combination of these drugs in large phase II and III trials is a challenging problem and requires safe, efficient, and robust designs in early phase cancer trials. Unlike single agent phase I trials, where the ordering of the doses with respect to the probability of dose limiting toxicity (DLT) is completely specified [1][2][3], dose combination trials where the doses of two or more agents are allowed to vary during the trial imply only a partial ordering of the doses. Hence, many strategies for exploring the space of dose combinations safely are possible, and no single algorithm described in the literature seems to perform uniformly better than the others in estimating the maximum tolerated dose. In general, parametric model-based designs that describe the dose combination-toxicity relationship [4][5][6][7][8][9][10][11][12][13], the partial ordering approach [14,15], and the nonparametric method [16] all proceed by treating successive cohorts of patients with dose escalation starting from the lowest dose combination, with the model parameters and estimated probabilities of toxicity sequentially updated. Dose allocation to the next cohort proceeds by using variations of the continual reassessment method (CRM) [17][18][19][20][21] or escalation with overdose control (EWOC) [22][23][24][25][26][27][28][29][30].
In this manuscript, we examine the performance of the EWOC based designs described in Tighiouart et al. [11,13] when the dose combinations for the next cohort of two patients are determined according to the CRM scheme. In addition, we propose a new dose escalation algorithm that treats cohorts of three patients receiving different dose combinations. Briefly, cohorts of two patients that receive doses along the dose levels of each drug as described in Tighiouart et al. [13] are supplemented by a third patient who will be treated with a dose combination along the diagonal defined by the two drugs. The suitability of such an escalation scheme depends on the nature of the agents under consideration. For instance, this approach may not be appropriate for two cytotoxic drugs but can be an option when studying two biologic agents, since trial duration may be shortened considerably. The performance of these extensions will be evaluated by extensive simulations and compared to the approaches of Tighiouart et al. [13] and Wang and Ivanova [5] with respect to safety of the trial and efficiency of the estimated MTD under a large number of practical scenarios.
This paper is organized as follows. In Section 2, we review the class of dose-toxicity models from the previous work and introduce the dose escalation/de-escalation algorithm in the case of continuous dose levels. In Section 3, we derive the operating characteristics of the designs under various scenarios for the location of the true MTD curve and under model misspecification. The algorithm is adapted to the case of discrete dose combinations in Section 4 and compared to previous approaches. Section 5 contains some discussion and plans for future work.

Dose-toxicity model
The class of linear logit models described in [9][10][11][12][13] for synergistic drugs will be adopted to describe the dose-toxicity relationship, referred to as model (1) throughout. Here, the interaction coefficient η is non-negative, the link function F(⋅) is known, Z is the binary indicator of DLT, and the continuous dose levels (x,y) of two drugs A and B are bounded in the space [X_min, X_max]×[Y_min, Y_max]. In the remainder of this paper, dose levels of these two drugs are standardized to be in the unit interval [0,1]. Following the usual assumption of monotonicity of the probability of DLT as a function of dose for cytotoxic or biologic agents, we presume that for all y∈[0,1], Prob(Z=1|x,y) is increasing in x and for all x∈[0,1], Prob(Z=1|x,y) is increasing in y. These conditions are satisfied if and only if the main-effect coefficients of the two drugs are positive, i.e., β>0 and γ>0.
By definition, the MTD is the set C of dose combinations (x*,y*) that satisfy Prob(Z=1|x*,y*)=θ, which we refer to as definition (2). The parameter θ is known as the target probability of DLT and must be pre-specified by the clinician. Typically, θ is set relatively high when the DLTs anticipated in the trial are reversible and low when they are hazardous or life-threatening, with common values selected in [0.1,0.4]. It follows from (1) and the definition of the MTD in (2) that the MTD curve can be written in closed form, with y* expressed as a function of x* and the model parameters.
In order to facilitate elicitation of priors on model parameters that have a practical interpretation, model (1) is reparametrized in terms of ρ_kl, the probability of DLT when the levels of drugs A and B are k and l, respectively, for (k,l)∈{(0,0),(0,1),(1,0)}, and the interaction parameter η. This is a one-to-one transformation, so the MTD set can be re-expressed in terms of these new parameters.
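The relationships described in this section can be sketched in LaTeX as follows. This is a reconstruction from the verbal definitions above rather than a verbatim copy of the published displays: the intercept symbol \alpha is an assumed name, \beta and \gamma are taken to be the coefficients of x and y, and only the numbering (1) and (2) is taken from the text.

```latex
% Sketch of model (1); \alpha is an assumed symbol for the intercept.
\begin{equation}
\mathrm{Prob}(Z = 1 \mid x, y) = F(\alpha + \beta x + \gamma y + \eta x y) \tag{1}
\end{equation}

% Solving F(\alpha + \beta x^* + \gamma y^* + \eta x^* y^*) = \theta, i.e. definition (2), for y^*:
\begin{equation*}
C = \left\{ (x^*, y^*) : \; y^* = \frac{F^{-1}(\theta) - \alpha - \beta x^*}{\gamma + \eta x^*} \right\}
\end{equation*}

% Reparametrization: \rho_{kl} is the probability of DLT at dose levels (k, l).
\begin{align*}
\rho_{00} &= F(\alpha), & \rho_{10} &= F(\alpha + \beta), & \rho_{01} &= F(\alpha + \gamma), \\
\alpha &= F^{-1}(\rho_{00}), & \beta &= F^{-1}(\rho_{10}) - F^{-1}(\rho_{00}), & \gamma &= F^{-1}(\rho_{01}) - F^{-1}(\rho_{00}).
\end{align*}

% MTD set in terms of (\rho_{00}, \rho_{01}, \rho_{10}, \eta):
\begin{equation*}
C = \left\{ (x^*, y^*) : \; y^* =
\frac{F^{-1}(\theta) - F^{-1}(\rho_{00}) - \bigl(F^{-1}(\rho_{10}) - F^{-1}(\rho_{00})\bigr) x^*}
     {F^{-1}(\rho_{01}) - F^{-1}(\rho_{00}) + \eta x^*} \right\}
\end{equation*}
```

Under this form, the derivative of the linear predictor with respect to x is \beta + \eta y, which is positive for all y in [0,1] exactly when \beta > 0 given \eta \ge 0; the same argument applies to \gamma.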

Trial design
The dose escalation/de-escalation algorithm is similar to the one described in [13]. The trial enrolls consecutive cohorts of two patients receiving different dose combinations determined using univariate CRM. Specifically,
1. The two patients in the first cohort receive the minimum dose combination available in the trial, (x_i,y_i)=(0,0) for i=1,2.
2. In the i-th cohort of two patients,
(a) If i is even, patient (2i-1) receives dose (x_{2i-1},y_{2i-3}) and patient 2i receives dose (x_{2i-2},y_{2i}), where x_{2i-1} and y_{2i} are the newly recommended doses of drugs A and B, respectively.
(b) If i is odd, the roles of the two drugs are interchanged.
In (a) and (b) above, the Bayes estimates (ρ̂_00, ρ̂_01, ρ̂_10, η̂) are the medians of the posterior distribution π(ρ_00, ρ_01, ρ_10, η|D_{2i-2}) and the recommended doses are obtained by minimizing the distance between the plug-in Bayes estimate of the probability of DLT and the target probability of DLT θ. This algorithm implies that a patient in the current cohort can be treated at a dose (x,y) if and only if a patient in the previous cohort was treated at a dose on the same horizontal (along drug A) or vertical (along drug B) line within the dose range.
When treating consecutive cohorts of three patients, the first two patients in any given cohort receive dose combinations as in (a) or (b) above and the third patient is given a dose combination (x_{3i}, y_{3i}) along the diagonal defined by the doses of the two agents (y=x) according to the CRM principle.
3. Repeat step 2 and terminate the trial when a prespecified total number of patients n have been enrolled or the following stopping rule holds.
Stopping rule: Due to the monotonicity assumption of the dose-toxicity relationship with respect to each drug, a stopping rule for safety can be tested at the minimum dose combination (0,0). Specifically, enrollment to the trial is stopped if there is statistical evidence that the minimum dose combination available in the trial is too toxic, i.e., P(Prob(Z=1|(x,y)=(0,0)) > θ+δ_1 | data) > δ_2. The design parameters δ_1 and δ_2 are chosen to achieve good operating characteristics for a given set of scenarios.
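The conditional CRM step and the safety stopping rule can be illustrated with a minimal Python sketch. It assumes a logistic link, a linear predictor with main effects and an interaction term, and plug-in estimates for which the dose-toxicity curve is increasing; the function names, the numerical values, and the use of posterior draws of ρ_00 (for example from an MCMC sampler) are hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def logit(p):
    """Inverse of the logistic link F."""
    return math.log(p / (1.0 - p))


def expit(t):
    """Logistic link F."""
    return 1.0 / (1.0 + math.exp(-t))


def prob_dlt(x, y, rho00, rho01, rho10, eta):
    """Plug-in probability of DLT at standardized doses (x, y)."""
    alpha = logit(rho00)          # logit of the DLT probability at (0, 0)
    beta = logit(rho10) - alpha   # effect of drug A alone
    gamma = logit(rho01) - alpha  # effect of drug B alone
    return expit(alpha + beta * x + gamma * y + eta * x * y)


def crm_dose(fixed_dose, theta, rho00, rho01, rho10, eta, vary="A"):
    """Dose of the varying drug in [0, 1] whose estimated probability of DLT
    is closest to theta, holding the other drug at fixed_dose."""
    alpha = logit(rho00)
    beta = logit(rho10) - alpha
    gamma = logit(rho01) - alpha
    if vary == "A":
        offset, slope = alpha + gamma * fixed_dose, beta + eta * fixed_dose
    else:
        offset, slope = alpha + beta * fixed_dose, gamma + eta * fixed_dose
    if slope <= 0:
        raise ValueError("estimated dose-toxicity curve is not increasing")
    # The curve is increasing in the varying dose, so the minimizer of
    # |P(DLT) - theta| over [0, 1] is the root clipped to the interval.
    root = (logit(theta) - offset) / slope
    return min(1.0, max(0.0, root))


def stop_for_safety(rho00_draws, theta, delta1, delta2):
    """True if the estimated P(rho00 > theta + delta1 | data) exceeds delta2.
    Prob(Z=1 | (0, 0)) equals rho00, so posterior draws of rho00 suffice."""
    exceed = sum(1 for r in rho00_draws if r > theta + delta1)
    return exceed / len(rho00_draws) > delta2


# Hypothetical posterior medians, for illustration only.
est = dict(rho00=0.05, rho01=0.30, rho10=0.30, eta=2.0)
x_new = crm_dose(0.5, 0.33, vary="A", **est)  # dose of A given B fixed at 0.5
```

Because the estimated probability of DLT is monotone in the varying dose, the closed-form root followed by clipping gives the same answer as a numerical search over [0,1].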
At the conclusion of the trial, we propose estimating the MTD curve by evaluating the expression for the MTD set C at (ρ̂_00, ρ̂_01, ρ̂_10, η̂), the posterior medians given the data D_n.
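As an illustration, the estimated curve can be traced on a grid of doses of drug A. The sketch below assumes a logistic link and the linear predictor with interaction described in the model section; the function name `mtd_curve`, the grid size, and the example values are hypothetical.

```python
import math


def logit(p):
    """Inverse of the logistic link."""
    return math.log(p / (1.0 - p))


def mtd_curve(theta, rho00, rho01, rho10, eta, n_grid=101):
    """Points (x, y) of the plug-in MTD curve that lie in the unit square."""
    a = logit(rho00)       # logit of the DLT probability at (0, 0)
    b = logit(rho10) - a   # effect of drug A alone
    g = logit(rho01) - a   # effect of drug B alone
    points = []
    for k in range(n_grid):
        x = k / (n_grid - 1)
        # Solve a + b*x + g*y + eta*x*y = logit(theta) for y.
        y = (logit(theta) - a - b * x) / (g + eta * x)
        if 0.0 <= y <= 1.0:
            points.append((x, y))
    return points


# Hypothetical posterior medians, for illustration only.
curve = mtd_curve(0.33, rho00=0.05, rho01=0.5, rho10=0.5, eta=2.0)
```

Grid points whose solution falls outside [0,1] are dropped, so the returned list covers only the part of the curve inside the standardized dose range.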

Simulation Studies

Simulation set-up and scenarios
To facilitate comparison with the work of Tighiouart et al. [13], we derive the operating characteristics of this design under the same scenarios [13] using the logistic link function for the working model. The corresponding true parameters (ρ_00, ρ_01, ρ_10, η) are shown in Table 1. The target probability of DLT was set to θ=0.33 and the number of trial replicates is m=1000. When treating cohorts of size 2, the trial sample size is n=40, and we choose n=42 when using the algorithm that treats consecutive cohorts of three patients. Vague priors for ρ_00, ρ_01, ρ_10 were selected by setting a_i=b_i=1, i=1,2,3, and a diffuse prior for η was selected by taking E(η)=21 and Var(η)=542; see Tighiouart et al. [13] for the rationale behind this choice.
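To make the prior on η concrete, the stated mean and variance can be converted to distribution parameters by moment matching. The sketch below assumes a gamma prior in the shape/rate parametrization; the text above specifies only the two moments, so the distribution family and the helper name are assumptions.

```python
def gamma_from_moments(mean, var):
    """Shape and rate of a gamma distribution with the given mean and variance
    (mean = shape / rate, variance = shape / rate**2)."""
    rate = mean / var
    shape = mean * rate
    return shape, rate


# E(eta) = 21 and Var(eta) = 542 as stated above; gamma family is assumed.
shape, rate = gamma_from_moments(21.0, 542.0)
```

Under this assumption the shape is 441/542, which is below 1, and the rate is 21/542, so the prior has a long right tail.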

Design operating characteristics
We evaluate the performance of the CRM based design treating cohorts of two patients (CRM2p) in terms of safety, efficiency of the estimate of the MTD, and under model misspecification. This design is compared to the EWOC based dose escalation algorithm described in Tighiouart et al. [13]. Finally, CRM2p is compared to the CRM based algorithm that treats consecutive cohorts of three patients (CRM3p).

Safety and efficiency:
To evaluate trial safety, we report the observed percent of DLTs averaged across all m=1000 trials and the percent of trials with an excessive rate of DLT, e.g., a DLT rate exceeding θ+δ, for δ=0.1. This provides an estimate of the probability that a prospective trial will result in an excessive rate of DLT. For efficiency of the estimate of the MTD curve, we use the same summary statistics proposed in [13]. The first is an estimate of the MTD curve obtained from the averages, across all m=1000 trials, of the posterior medians of the parameters ρ_00, ρ_01, ρ_10, η. The next measure is the pointwise average bias, in whose definition C_i is the estimated MTD curve from the i-th trial, C_true is the true MTD curve, and y′ is such that (x,y′)∈C_i for all (x,y)∈C_true. The last measure of efficiency is the pointwise percent selection for a selected tolerance probability p, in whose definition ∆(x,y) is the Euclidean distance between the minimum dose combination (0,0) and the point (x,y) on the true MTD curve. A detailed description of these summary statistics and their properties can be found in [11,13].
Table 1 shows that the average DLT rates when the true and working models are logistic are close to the target probability of DLT θ=0.33 except for scenario (a), where the true MTD curve is close to the maximum dose combination available in the trial. In this case, the DLT rate is around 14%, consistent with the results obtained in [11,13]. These DLT rates remain unchanged under model misspecification and the percent of trials with an excessive DLT rate is very low under all scenarios. We conclude that the CRM2p design is safe for this class of practical scenarios. The estimated MTD curve (dashed line) in Figure 1 is very close to the true MTD curve (solid line) under all scenarios. The gray cloud of points consists of the last doses received by the two patients in the last cohort. The 95% confidence region is fairly tight around the true MTD curve and illustrates the uncertainty of the estimated MTD should a clinician choose to select the last dose given to the last patient in the trial for future phase II trials.
Figure 2 shows the pointwise average bias using the true model (logistic) and under model misspecification for the four scenarios. Given that the dose range for each agent is the unit interval [0,1], the average bias is negligible under scenarios (a-c). For scenario (d), the average bias tends to be higher near the top of the true MTD curve, consistent with previous results [13]. These results are also consistent with the pointwise percent selection shown in Figure 3. The pointwise percent selections with tolerance p=0.1 are high for phase I trials, ranging from 55% to 100% across all scenarios. The variability of these pointwise percent selections under model misspecification is small except when using the probit model. We conclude that the CRM2p design is safe and achieves good pointwise percent MTD recommendation even under model misspecification.
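The two safety summaries used in this section can be computed from simulated trials as in the following minimal sketch. The function name and the example counts are hypothetical, and the sketch assumes every simulated trial enrolls the same number of patients, which would not hold for trials stopped early.

```python
def safety_summary(dlt_counts, n_patients, theta, delta):
    """Return (average percent of DLTs across trials,
    percent of trials whose DLT rate exceeds theta + delta)."""
    rates = [count / n_patients for count in dlt_counts]
    avg_pct = 100.0 * sum(rates) / len(rates)
    n_excess = sum(1 for r in rates if r > theta + delta)
    excess_pct = 100.0 * n_excess / len(rates)
    return avg_pct, excess_pct


# Four hypothetical simulated trials of n=40 patients each.
avg_pct, excess_pct = safety_summary([12, 14, 20, 10], 40, theta=0.33, delta=0.1)
```

In this toy example only the trial with 20 DLTs out of 40 has a rate above θ+δ=0.43, so one trial in four is flagged as having an excessive DLT rate.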

Comparison of CRM2p and EWOC2p designs
The average DLT rates for the CRM2p design shown in Table 1 are very similar to those of the EWOC2p design, with the largest difference of 3% obtained under scenario (a) using the probit link as the true model. Similarly, the percent of trials with an excessive rate of DLT is similar for the two algorithms. Figure 4 shows the plots of the estimated MTD curves for the CRM2p and EWOC2p designs. These curves are very close to one another across all scenarios. Although the pointwise average bias is still negligible for these two designs as shown in Figure 5, the bias for CRM2p is consistently lower (in absolute value) than that of the EWOC2p design except for scenario (a), where the true MTD curve is near the maximum dose combination. This is also reflected in the pointwise percent selection shown in Figure 6, where the percent selection for CRM2p is higher relative to EWOC2p under scenarios (b-d), with the largest difference of 17% obtained near the middle of the MTD curve under scenario (d). Based on the results of these scenarios, the CRM2p design generally outperforms EWOC2p in terms of the efficiency of the estimate of the MTD.

Comparison of CRM2p and CRM3p designs
Treating a third patient in a cohort along the diagonal does not seem to affect the safety of the trial. If anything, Table 1 shows that the average DLT rate for the CRM3p design is slightly lower than that of CRM2p. Therefore, safety of the trial is not compromised by escalating both drugs along the diagonal according to these four scenarios. As in the previous section, the plots of the estimated MTD curves for the CRM2p and CRM3p designs in Figure 4 show no difference between these estimates. The pointwise average bias for CRM3p is negligible but consistently higher than that of the CRM2p design except under scenario (a). With respect to the pointwise percent selection shown in Figure 6, neither design is clearly superior, since the comparison depends on both the scenario under study and the value of the dose combination. For example, under scenario (a), CRM3p outperforms CRM2p across the dose range, but in scenarios (b) and (c), CRM2p does better across half the dose range. Given the results for these scenarios, the main advantage of using the CRM3p design over CRM2p is to shorten trial duration.

Discrete Dose Combinations
In this section, we review how the methodology can be adapted to a set of discrete pre-specified dose combinations and further compare the CRM2p design to the two-dimensional design [5] in addition to EWOC2p and CRM3p designs.

Approach
Denote by (x_1,…,x_r) and (y_1,…,y_s) the doses of agents A and B, respectively, standardized in the unit interval [0,1]. The CRM algorithm in Section 2.3 is applied by rounding the recommended continuous dose to the nearest discrete dose level sequentially for each patient in the trial. The estimated set of MTDs is obtained as described in Tighiouart et al. [13] and is briefly reviewed here. Let d((x_j,y_k), Ĉ) be the Euclidean distance between dose combination (x_j,y_k) and the estimated MTD curve Ĉ. The final set Γ consists of the recommended MTDs at the end of the trial with design parameters δ_1, δ_2 selected to achieve good operating characteristics for a pre-specified set of scenarios.
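The rounding step can be sketched in a few lines of Python. The function name and the dose grid are hypothetical, and breaking ties toward the lower dose is an assumption made here for illustration, since the text does not say how ties are handled.

```python
def nearest_level(dose, levels):
    """Discrete dose level closest to a recommended continuous dose.
    Ties go to the level listed first, i.e. the lower dose when levels
    are sorted in increasing order (an assumed, conservative choice)."""
    return min(levels, key=lambda level: abs(level - dose))


# Hypothetical standardized dose levels for one agent.
levels_a = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
rounded = nearest_level(0.47, levels_a)
```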

Operating characteristics
The performance of the method is evaluated by calculating the percent of MTDs selection introduced in Tighiouart et al. [13], which estimates the probability that, for a given scenario, a prospective trial will recommend a set of dose combinations that are all MTDs. Here, the set of true MTDs is defined for a pre-specified threshold parameter δ set by a clinician independently of the scenarios under study, and Γ_i is the estimated MTD set from the i-th trial obtained by the procedure reviewed above.
Another measure of percent selection is the probability of obtaining at least K MTDs. In addition, we define the average proportion of the recommended set of dose combinations that are MTDs. However, such a measure is not well defined if the recommended set Γ_i is empty; thus, we consider a weighted average proportion. This statistic gives an estimate of the probability that any given dose in the set of recommended doses at the end of the trial is an MTD.
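These selection statistics can be illustrated with a minimal Python sketch. It takes a true MTD to be any combination whose probability of DLT is within δ of θ, counts an empty recommended set as a failure to select MTDs, and weights the average proportion by the size of each recommended set; the last two conventions and all names are assumptions made for illustration and may differ from the definitions in [13].

```python
def true_mtd_set(p_true, theta, delta):
    """Dose combinations whose true probability of DLT is within delta of theta."""
    return {combo for combo, p in p_true.items() if abs(p - theta) <= delta}


def selection_stats(recommended_sets, mtds, k=1):
    """Return (proportion of trials recommending only MTDs,
    proportion of trials recommending at least k MTDs,
    weighted proportion of recommended doses that are MTDs)."""
    m = len(recommended_sets)
    # An empty recommended set is counted as not selecting MTDs (assumption).
    all_mtds = sum(1 for g in recommended_sets if g and g <= mtds) / m
    at_least_k = sum(1 for g in recommended_sets if len(g & mtds) >= k) / m
    n_recommended = sum(len(g) for g in recommended_sets)
    n_correct = sum(len(g & mtds) for g in recommended_sets)
    weighted = n_correct / n_recommended if n_recommended else float("nan")
    return all_mtds, at_least_k, weighted


# Hypothetical 2x2 scenario and four hypothetical trial recommendations.
p_true = {(1, 1): 0.05, (1, 2): 0.15, (2, 1): 0.20, (2, 2): 0.35}
mtds = true_mtd_set(p_true, theta=0.2, delta=0.1)
trials = [{(1, 2)}, {(1, 2), (2, 2)}, set(), {(1, 2), (2, 1)}]
stats = selection_stats(trials, mtds, k=2)
```

Pooling the counts over trials keeps the weighted proportion well defined when some trials recommend no dose, which is the difficulty with the unweighted average noted above.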

Simulation setup and scenarios
We consider five scenarios from Wang and Ivanova [5] presented in Table 2 with r=6 and s=3 for scenarios 1-4 and r=s=4 for scenario 5. The target probability of DLT is θ=0.2 and the threshold parameter is fixed at δ=0.1. For each scenario, we simulated m=1000 trials using the same vague priors for ρ_00, ρ_01, ρ_10 and η from Section 3. Trial sample size is n=54 for scenarios 1-4 and n=60 for scenario 5. The design parameters are δ_1=0.1, δ_2=0.1 as suggested in [13].
Summary statistics for trial safety and efficiency of the estimated set of MTDs are shown for CRM2p, CRM3p, EWOC2p, and the two-dimensional design of Wang and Ivanova [5] (WI2p) in Table 3. The average percent of DLTs does not exceed the target probability of DLT θ=0.2 for any design, apart from WI2p, which reaches a DLT rate of 20.1% under scenario 3. The percent of trials with an excessive rate of DLT is also small, with the highest rate of 7% achieved by CRM2p under scenario 2.
The percent selections of the CRM2p and EWOC2p designs are very similar for all five measures of efficiency except for PS under scenario 4, where CRM2p exceeds EWOC2p by 9%. On the other hand, CRM3p seems to perform better than CRM2p on average, with the highest difference of 14% achieved under scenario 2 for the statistic PS. Similar to the results obtained in Tighiouart et al. [13], the WI2p design performs uniformly better than EWOC2p, CRM2p, and CRM3p across all five statistics under scenario 2. For scenarios 1 and 3-5, there is no clear advantage of the two-dimensional design WI2p and the results depend on the statistic of interest. When using informative priors to match the priors in [5], as was done in [13], the three designs outperform the two-dimensional design for scenarios 1 and 3-5 and for most of the efficiency statistics (data not shown). We note that the percent selections here are exceptionally high because the number of true MTDs is very high for all scenarios when the threshold parameter is δ=0.1. This means that any dose combination with probability of DLT within 0.1 of the target probability θ=0.2 is considered an MTD. Such an assumption is not uncommon in practice due to the heterogeneity of patients in cancer phase I trials, and higher DLT rates are deemed acceptable in phase II trials relative to the target θ that is set in phase I.

Concluding Remarks
The main objectives of this manuscript are to evaluate the performance of the dose escalation algorithm described in Tighiouart et al. [13] for early phase drug combination cancer trials when the CRM scheme is used to estimate the recommended doses for consecutive cohorts of patients and the effect of treating an extra patient along the diagonal defined by the dose levels of the two drugs. We assessed the performance of these new designs under similar parametric models used in the previous approaches under vague prior distributions for the model parameters and settings of continuous and discrete dose levels.
We found that the trials are very safe under EWOC and CRM based designs, even when we treat cohorts of three patients with a third patient along the diagonal, i.e., when increasing the dose levels of both agents. With respect to trial efficiency for continuous dose levels, the CRM2p design outperforms EWOC2p in all scenarios except when the true MTD curve is near the maximum dose combination available in the trial. Given the extent of the difference in percent selection overall, we recommend the use of the CRM2p design. Since neither CRM2p nor CRM3p outperforms the other, the CRM3p design can be used to shorten trial duration without compromising the safety of the trial. Similar conclusions can be drawn in the case of discrete dose combinations because CRM2p had better percent selection PS under scenario 4 relative to EWOC2p and CRM3p outperforms CRM2p on average. We note that these recommendations hold for the class of scenarios used in [5] and may not generalize to settings with a different number of dose levels, sample size, target probability of DLT, or prior information about the drugs when used as single agents. As in any early phase cancer trial, a set of all possible practical scenarios must be defined with the help of the clinician, and operating characteristics derived using one or more algorithms, to select the most appropriate design for the prospective trial.
We are currently working on extending this work to phase I/II trials where a binary efficacy endpoint is assessed relatively quickly and situations where baseline characteristics thought to be related to toxicity and efficacy outcomes are available.