Challenges of phase III trial design for novel treatments in diseases with no standard treatment: The AZA-001 myelodysplasia study model

For cancers lacking standard treatments, comparing new agents with existing treatments is problematic. Here we discuss the study design from the AZA-001 trial, which compared azacitidine with 3 frequently used conventional care regimens (CCR) for higher-risk myelodysplastic syndromes. Before randomization, physicians preselected the most appropriate of 3 CCR for each patient, after thorough examination. Patients were then randomized to azacitidine or CCR. Patients randomized to CCR received their preselected treatment, thus including patients otherwise excluded as poor candidates for a single comparator. This design may serve as a template in other cancers lacking standard therapy.


Introduction
Many cancers have accepted standard treatments that improve remission rates and/or survival, while others lack standard treatment [1]. In such cancers, particularly those with low incidence, designing randomized trials to compare new drug treatments with existing therapies in representative patient populations is difficult, especially when the only available comparators are non-standard and suboptimal.
Designing phase III randomized studies to compare the effects of azacitidine and existing treatments on overall survival (OS) prolongation was challenging. Comparing azacitidine with only one commonly used CCR would have limited patient enrollment to good candidates for that CCR, reducing the validity of trial results. We planned an international phase III trial to compare azacitidine with the 3 most commonly used CCR for higher-risk myelodysplastic syndromes (MDS) at the time [17], thus increasing the likelihood that enrolled patients would receive appropriate treatment, and maximizing the applicability of the results in a diverse multinational patient population.
We describe the AZA-001 preselection trial design and its applicability in higher-risk MDS and in other hematological disorders lacking standard therapy.

Preselection study design
Preselection trial design (Fig. 1) requires site investigators to assign patients at screening (before randomization) to one of two or more unblinded protocol-specified CCR, based on patient medical history, comorbidities, vital signs, laboratory results, disease status, prognosis and demographics, together with national, regional, and local treatment guidelines and patient preference. After preselection is recorded, patients are randomized (1-to-1) to the investigational drug arm or to the collective CCR arm, in which they receive their preselected treatment (Fig. 1).
An important goal of investigator-preselection design is to compare treatments between cohorts of patients with similar pretreatment disease and demographic characteristics. Investigator preselection of the most appropriate protocol-specified treatment before randomization increases the likelihood that baseline characteristics are similar, both for the comparison of the investigational drug cohort with the collective comparator group and for the comparisons of the investigational drug with each individual preselection subgroup (Fig. 1). Randomizing patients after stratification based on disease-risk category or other relevant characteristics further ensures balanced patient assignment. Preselection design also gives site investigators treatment options, minimizing patient assignment to inappropriate treatment and enabling inclusion of patients who might otherwise opt out because of the 50% chance of randomization to inappropriate treatment. Additionally, preselection can facilitate adherence to treatment guidelines based on patient demographics, performance status, and disease characteristics.
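The preselect-then-randomize sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AZA-001 randomization procedure: the class name, the permuted-block scheme, the block size of 4, and the choice to stratify on both the preselected CCR and a disease-risk category are all hypothetical.

```python
import random
from collections import defaultdict

ARMS = ("investigational", "CCR")


class PreselectionRandomizer:
    """Hypothetical 1:1 permuted-block randomizer, stratified by
    (preselected CCR, disease-risk category)."""

    def __init__(self, block_size=4, seed=None):
        if block_size % 2:
            raise ValueError("block_size must be even for 1:1 allocation")
        self.block_size = block_size
        self.rng = random.Random(seed)
        # Each stratum keeps its own partially used block of assignments.
        self.blocks = defaultdict(list)

    def assign(self, preselected_ccr, risk_category):
        # The CCR is preselected and recorded BEFORE the arm is drawn,
        # so the investigator's choice cannot depend on the arm.
        stratum = (preselected_ccr, risk_category)
        if not self.blocks[stratum]:
            block = list(ARMS) * (self.block_size // 2)
            self.rng.shuffle(block)
            self.blocks[stratum] = block
        arm = self.blocks[stratum].pop()
        # Patients in the collective CCR arm receive their preselected regimen.
        treatment = "investigational drug" if arm == "investigational" else preselected_ccr
        return arm, treatment


randomizer = PreselectionRandomizer(block_size=4, seed=2024)
for ccr in ("BSC", "LDAC", "IC", "BSC"):
    print(ccr, randomizer.assign(ccr, "high"))
```

Because blocks are kept per stratum, each preselection subgroup stays close to 1:1 between the investigational drug and its own comparator, which is what makes the within-subgroup comparisons interpretable.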

Experience with the preselection study design in AZA-001
The international, multicenter, randomized, open-label AZA-001 trial compared the treatment effect of azacitidine vs CCR (the collective CCR cohort) on OS prolongation in higher-risk MDS.

Statistical analyses
Approximately 354 patients with higher-risk MDS were to be randomized to receive azacitidine or CCR in a 1:1 ratio. Data from several sources, including published CALGB 9221 data [5] for the best supportive care (BSC)-only group and similarly designed trials that used any of the 3 comparator options, were used for sample size calculation. Median survival was assumed to be 11 months for CCR and 18.3 months for azacitidine. The number of expected deaths was 167, assuming 18 months accrual, at least 12 months of follow-up, and an overall attrition rate of 30%. In addition, a misdiagnosis rate of 15% was assumed for both groups. A log-rank analysis with a 2-sided alpha level of 0.05 provided 90% power to detect a 67% improvement in OS in the azacitidine group (hazard ratio [HR] = 0.60).
Because of slow enrollment, actual enrollment and follow-up required a 42-month study period, during which 195 deaths occurred, giving the trial 95% power to detect the HR specified in the trial design. OS was defined as time from randomization to death from any cause. The time-to-event analysis used Kaplan-Meier methodology and stratified log-rank tests.
Comparisons within the 3 treatment preselection subgroups (azacitidine vs BSC in patients preselected to BSC, azacitidine vs low-dose cytarabine [LDAC] in patients preselected to LDAC, and azacitidine vs intensive chemotherapy [IC] in patients preselected to IC; Fig. 2) were prospectively planned to assess consistency of treatment effect across preselection subgroups and to determine whether azacitidine survival outcomes were influenced by preselection.
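The relationship between the assumed medians, the number of deaths, and power can be checked approximately with Schoenfeld's formula for the log-rank test. The sketch below is an assumption-laden check, not the trial's actual calculation: it assumes exponential survival (so the HR equals the ratio of medians), 1:1 allocation, and a normal approximation, and it ignores the accrual, attrition, and misdiagnosis adjustments mentioned above. The function names are hypothetical.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

NORMAL = NormalDist()  # standard normal distribution


def required_deaths(hr, alpha=0.05, power=0.90):
    """Schoenfeld approximation: deaths needed for a 2-sided log-rank
    test with 1:1 allocation to detect hazard ratio `hr`."""
    z_alpha = NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = NORMAL.inv_cdf(power)
    return ceil(4 * (z_alpha + z_beta) ** 2 / log(hr) ** 2)


def power_from_deaths(deaths, hr, alpha=0.05):
    """Approximate power of the same test given an observed death count."""
    z_alpha = NORMAL.inv_cdf(1 - alpha / 2)
    return NORMAL.cdf(sqrt(deaths) * abs(log(hr)) / 2 - z_alpha)


# Under exponential survival, HR = median(CCR) / median(azacitidine).
hr_from_medians = 11 / 18.3
print(f"HR implied by medians: {hr_from_medians:.2f}")
print(f"Deaths for 90% power at HR 0.60: {required_deaths(0.60)}")
print(f"Power with 195 deaths at HR 0.60: {power_from_deaths(195, 0.60):.2f}")
```

Under these simplifications the required number of deaths comes out somewhat below the 167 stated in the protocol assumptions, which is expected given the adjustments the sketch omits, while the power with 195 deaths agrees with the reported value of about 95%.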

Results
Patient demographic characteristics were well balanced between the azacitidine and collective CCR arms and within the BSC and LDAC preselection subgroups. Differences were observed in the IC preselection subgroup, with patients being generally younger, with better ECOG performance status, and with higher-risk disease [17]. Azacitidine significantly prolonged OS in patients with higher-risk MDS vs collective CCR [17]. Comparison of OS prolongation within the BSC and LDAC preselection subgroups (Fig. 2) was significant for azacitidine and consistent with the primary analysis. In the IC preselection group, however, median OS of azacitidine-treated patients was not statistically different from the median OS of IC-treated patients [17].

Lessons learned from AZA-001 and their applicability to future studies in hematology
The AZA-001 preselection study design allowed comparison of azacitidine effect on OS with that of the 3 CCR overall and within the preselection subgroups. A study designed with 3 comparators but without treatment preselection before randomization would likely not have demonstrated the robust results of AZA-001, since baseline characteristics of patients randomized to azacitidine or CCR would probably have been less balanced. While AZA-001 preselection comparisons demonstrated the superiority of azacitidine for prolonging OS over BSC (9.6 months) and LDAC (9.2 months), no significant difference between azacitidine and IC was observed, despite a similar 9.3-month OS advantage [17], likely due to diminished statistical power from the small number of IC-preselected patients (N = 42). Future studies might require minimum sample sizes for each preselection subgroup, based on statistical power considerations, to account for less frequently preselected treatments. Alternatively, patient enrollment could continue in a specific preselection subgroup until sufficient power was achieved, and then be discontinued, while enrollment continued for the other preselection treatments until they too reached sufficient power. Such a design, however, might prolong trial duration.
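To illustrate why a subgroup of 42 patients limits power, the following sketch applies the Schoenfeld approximation for a 2-sided log-rank test with 1:1 allocation. The assumptions are loud ones: the design HR of 0.60 is assumed to apply within the IC subgroup, and the best case in which every one of the 42 patients has died is used as an upper bound, because the actual number of deaths in that subgroup is not given here. Function and variable names are hypothetical.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

NORMAL = NormalDist()  # standard normal distribution


def logrank_power(deaths, hr, alpha=0.05):
    """Approximate power of a 2-sided log-rank test (1:1 allocation)."""
    z_alpha = NORMAL.inv_cdf(1 - alpha / 2)
    return NORMAL.cdf(sqrt(deaths) * abs(log(hr)) / 2 - z_alpha)


def deaths_for_power(hr, power, alpha=0.05):
    """Approximate number of deaths needed to reach the target power."""
    z_alpha = NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = NORMAL.inv_cdf(power)
    return ceil(4 * (z_alpha + z_beta) ** 2 / log(hr) ** 2)


ASSUMED_HR = 0.60   # assumption: design HR also holds in the IC subgroup
IC_PATIENTS = 42    # IC-preselected patients reported in the text

# Upper bound on power: deaths cannot exceed the number of patients.
best_case_power = logrank_power(IC_PATIENTS, ASSUMED_HR)
min_deaths_80 = deaths_for_power(ASSUMED_HR, 0.80)

print(f"Best-case power with {IC_PATIENTS} deaths: {best_case_power:.2f}")
print(f"Deaths needed for 80% power: {min_deaths_80}")
```

Under these assumptions, even complete follow-up of the IC subgroup would leave power well below conventional targets, which is the kind of check a prespecified per-subgroup minimum sample size would formalize.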
As with all open-label studies, the AZA-001 trial was subject to potential biases associated with knowledge of treatment. For example, 4 patients (2.2%) randomized to azacitidine discontinued before treatment initiation, compared with 14 patients (7.8%) randomized to CCR. Of these 14 patients, 3 had been preselected to BSC (2.9%), 5 to LDAC (10.2%), and 6 to IC (24.0%) [17]. Bias is difficult to confirm because of small patient numbers, but investigator treatment preselection may have influenced attrition. Thus, a trial designed to evaluate IC, for example, would have a selection bias for patients able and willing to tolerate IC. Equally, a trial designed to evaluate active treatment versus only BSC could have a selection bias for patients and physicians unwilling to accept the possibility of not receiving active therapy. Even when multiple treatment options are available, as in AZA-001, an investigator may have declined to enroll a patient for whom he or she believed a specific treatment was appropriate, because the patient could have been randomized to study drug instead. Nevertheless, for those patients who were enrolled, the preselection bias in AZA-001 reflects investigator judgment of a patient's best therapeutic option, thus giving the trial a "real world" aspect. Another potential treatment bias could occur in international studies because treatment preferences can differ among countries. An investigation into country-specific treatment preselections in the AZA-001 study across eight European countries showed that in pooled results from France and the UK, the most frequent preselection option was LDAC (74%), whereas in pooled results from Germany, Italy, Spain, Sweden, Greece, and the Netherlands the most frequent preselection option was BSC (79%) [23]. Survival analysis within each geographical subgroup showed an advantage for the azacitidine group vs the CCR group similar to that observed in the overall AZA-001 analysis. In future studies, stratifying within countries may minimize treatment preference biases.
At least one other study has utilized a similar design: the phase III study of temsirolimus in relapsed mantle-cell lymphoma (r-MCL) [24]. While numerous agents have some activity against r-MCL, no clear consensus on standard therapy exists [25]. That study required site investigators to nominate an intended single-agent treatment, with patients subsequently randomized to the "physician-choice" treatment or to one of two temsirolimus dosing schedules. The original protocol prespecified 6 allowed physician-choice therapies, but additional agents were allowed for small numbers of patients. The primary study endpoint was progression-free survival (PFS), with statistical powering for pair-wise comparisons of physician-choice with each temsirolimus dose. The study was positive for the higher-dose temsirolimus arm and not for the lower-dose arm. Trial results led to FDA approval of temsirolimus for r-MCL. However, unlike the AZA-001 trial, the large number of physician-choice treatments prevented direct comparison between temsirolimus and each physician-choice therapy.
The preselection study design could also assess new drugs in other malignancies for which no standard treatment exists, e.g., newly diagnosed AML in elderly patients. Currently, no consensus exists regarding elderly patient "fitness" for IC. Survival remains poor, even with IC. Hypomethylating agents have demonstrated activity in elderly patients with AML [26] and should now be compared with CCR treatments: IC, LDAC, and BSC. Two clinical trials could be designed to study hypomethylating therapy, one enrolling patients considered "fit" for IC, another enrolling patients considered "unfit" for IC. Alternatively, a single trial employing preselection trial design could compare IC with a hypomethylating agent in patients deemed "fit" per investigator preselection to IC, and compare the hypomethylating agent with LDAC in patients preselected as "unfit" for IC. In the absence of consensus criteria for "fitness," however, the validity of this approach might be questioned.
The preselection study design provides an effective model for other randomized trials in diseases lacking standard therapy. Investigator preselection of comparator treatment enables inclusion of a range of patients with varying disease states, allows comparison with current CCR, and provides direct comparison of treatment effects across comparable patient subgroups. As more is learned about molecular and cytogenetic heterogeneity of diseases such as MDS, it is clear that patients cannot be treated as a homogeneous population for evaluation of emerging therapies. This might result in smaller future trials, or larger trials using preselection methodologies, allowing physicians to consider patient heterogeneity in selecting among treatment options to assess experimental drugs.