Postremission sequential monitoring of minimal residual disease by WT1 Q‐PCR and multiparametric flow cytometry assessment predicts relapse and may help to address risk‐adapted therapy in acute myeloid leukemia patients

Abstract Risk stratification of acute myeloid leukemia (AML) patients using prognostic parameters at diagnosis is effective, but may be significantly improved by the use of on-treatment parameters, which better define the actual sensitivity to therapy in the individual patient. Minimal residual disease (MRD) monitoring has been shown to be crucial for the identification of AML patients at high risk of relapse, but the best method and timing of MRD detection are still debated. We therefore retrospectively analyzed 104 newly diagnosed AML patients, consecutively treated and monitored by quantitative polymerase chain reaction (Q-PCR) for WT1 and by multiparametric flow cytometry (MFC) for leukemia-associated immunophenotypes (LAIPs) at baseline, after induction, after 1st consolidation, and after 1st intensification. By multivariate analysis, the factors independently associated with adverse relapse-free survival (RFS) were: bone marrow (BM)-WT1 ≥ 121/10⁴ ABL copies (P = 0.02) and LAIP ≥ 0.2% (P = 0.0001) after 1st consolidation (RFS at the median follow-up of 12.5 months: 51% vs. 82% [P < 0.0001] and 57% vs. 81% [P = 0.0003], respectively), and peripheral blood (PB)-WT1 ≥ 16/10⁴ ABL copies (P = 0.0001) after 1st intensification (RFS 43% vs. 95% [P < 0.0001]). Our data confirm the benefits of sequential MRD monitoring with both Q-PCR and MFC. If confirmed by further prospective trials, they may significantly improve the possibility of risk-adapted, postinduction therapy of AML.


Introduction
During the last decades, the treatment of acute myeloid leukemia (AML) has not significantly changed and the outcome has remained largely unsatisfactory [1,2]. Complete remission (CR) can be achieved in 70-80% of newly diagnosed AML patients treated with conventional induction/consolidation regimens, but relapses still occur in 40-50% of cases and, in the end, no more than 30-40% of adult AML patients can be cured [2,3]. Therefore, the optimization of postremission therapy to maintain CR represents the greatest challenge in AML treatment.
Thanks to a wider use of allogeneic stem cell transplantation (allo-SCT), which is the most powerful postremission treatment [2], some advances have been registered in younger adults with high-risk disease. Nowadays, there is general agreement on offering allo-SCT in first remission to AML patients falling into the category of unfavorable cytogenetics [4]. These account for 15-20% of newly diagnosed patients, and <5-10% of them may become long-term survivors without allo-SCT [4]. Excluding another 15-20% of patients with favorable cytogenetics, who can be cured in up to 60-70% of cases without allo-SCT [2], in the remaining 40-50% of patients with intermediate cytogenetic risk the therapeutic decision is problematic, owing to their heterogeneity and to the difficulty in precisely defining their prognosis. Different clinical (e.g., age, secondary AML, extramedullary involvement) and laboratory (e.g., white blood cell count, LDH serum levels) factors have been identified and correlated with prognosis [2], but none of them, alone or in combination, has been universally recognized and systematically applied to guide a risk-adapted therapeutic strategy.
More recently, advances in defining the prognostic relevance of genomic alterations have created enormous expectations for understanding the biological heterogeneity of AML and for guiding therapy [5]. Unfortunately, owing to the variety and infrequency of molecular abnormalities, the translation of this genomic information into the clinic is difficult and, as a consequence, an AML therapeutic strategy based on molecular data at diagnosis still remains controversial for the great majority of patients.
Data reported by the Medical Research Council (MRC) and the Gruppo Italiano Malattie Ematologiche dell'Adulto (GIMEMA) strongly suggest that the use of posttreatment factors indicating the speed and quality of response can improve outcome prediction in AML patients [6-12]. The MRC study showed that patients who entered CR after the 2nd induction course had a worse outcome [6], whereas GIMEMA studies showed that assessment of minimal residual disease (MRD) may be a powerful and accurate tool to refine risk evaluation, as initially established on the basis of cytogenetic and molecular markers [8]. In this view, multiparametric flow cytometry (MFC) and quantitative polymerase chain reaction (Q-PCR) on target genes, such as the pan-leukemic marker WT1, are the techniques currently used to evaluate the quality of response after chemotherapy-induced morphological CR [7,9]. At present, the predictive power of each technique and the identification of the most accurate time-point for MRD assessment are not well defined and remain open questions. Furthermore, it is a matter of debate how and when MRD data should be used in the context of the AML treatment program.
The aim of this study was to comparatively analyze the predictive impact of sequential MRD monitoring with leukemia-associated immunophenotypes (LAIP)-MFC and WT1 Q-PCR in a cohort of 104 consecutively treated AML patients.

Patients
One hundred and four consecutive AML patients admitted to the Haematology Department of Spedali Civili in Brescia from 2010 to 2013 and consecutively treated with a conventional induction/consolidation treatment program [13] were retrospectively analyzed. The clinical and biological features of the patients are reported in Table 1; … of the patients were positive for the Flt3 internal tandem duplication, 42% for the NPM1 mutation, and 14% for both. The median follow-up of the study population is 12.5 months (range: 1-47).

Plan of treatment
All patients, stratified according to the ELN risk criteria [1], received a treatment program according to the NILG (Northern Italy Leukemia Group) AML protocol [13].
Briefly, patients younger than 70 years received induction chemotherapy consisting of the ICE regimen (idarubicin 12 mg/m² per day on days 1, 2, and 3; etoposide 100 mg/m² per day on days 1-5; cytarabine 100 mg/m² bid on days 1-7), followed by the SPLIT regimen (idarubicin 17.5 mg/m² per day on days 1 and 8; cytarabine 3 g/m² bid on days 2, 3, 9, and 10) in case of no response (NR) to ICE; patients older than 70 years were treated according to the MICE regimen (mitoxantrone 7 mg/m² per day on days 1, 3, and 5; cytarabine 100 mg/m² per day by continuous infusion on days 1-7; etoposide 100 mg/m² per day on days 1, 2, and 3). Consolidation treatment included one cycle of idarubicin 10 mg/m² per day on days 1, 2, and 3 plus cytarabine 200 mg/m² per day on days 1-7, followed by one cycle of intermediate-dose cytarabine (2 g/m² per day for 4 days) for patients younger than 70 years, and one cycle of the mini-ICE regimen followed by one cycle of intermediate-dose cytarabine (2 g/m² per day for 3 days) for patients older than 70 years. Both consolidation programs were followed by peripheral blood stem cell (PBSC) collection. Patients with low- and intermediate-risk leukemia were addressed to the intensification phase, which consisted of a maximum of three repetitive high-dose cytarabine cycles (at a dose ranging between 2 and 4 g/m² per day on days 1-5, plus idarubicin 8 mg/m² per day on days 1 and 2, or 10 mg/m² on day 1), followed or not by autologous PBSC rescue, whereas high-risk patients were addressed to HLA-matched allo-SCT.
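Purely as an illustration of how these body-surface-normalized doses translate into absolute per-cycle amounts, here is a minimal sketch in Python (our own hypothetical helper, not part of the NILG protocol) for the ICE induction regimen described above:

```python
# Hypothetical helper (not part of the NILG protocol): absolute per-cycle
# doses for the ICE induction regimen, given a patient's body surface
# area (BSA) in square meters.
def ice_cycle_doses(bsa_m2: float) -> dict:
    return {
        "idarubicin_mg": 12 * bsa_m2 * 3,       # 12 mg/m2 on days 1-3
        "etoposide_mg": 100 * bsa_m2 * 5,       # 100 mg/m2 on days 1-5
        "cytarabine_mg": 100 * bsa_m2 * 2 * 7,  # 100 mg/m2 bid on days 1-7
    }

print(ice_cycle_doses(1.8))
# -> idarubicin ~64.8 mg, etoposide 900 mg, cytarabine 2520 mg per cycle
```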

Sequential MRD monitoring
MRD monitoring by LAIP-MFC on BM samples and by WT1 Q-PCR on both BM and peripheral blood (PB) samples was systematically performed at the following time-points: after the induction course, after the first consolidation course, and after the first intensification course. At each time-point, MRD evaluation was performed at the time of recovery from PB cytopenia (usually between days +30 and +45 from each chemotherapy cycle).
Bone marrow and PB quantitative assessment of WT1 molecular levels was performed by Q-PCR according to the ELN method as previously published [15,16].
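As a minimal sketch of the underlying arithmetic (Python assumed; the function name and the example readout are ours, and the cut-off value is taken from the Results below), WT1 transcript levels are conventionally expressed per 10⁴ copies of the ABL control gene before comparison with a positivity threshold:

```python
def normalized_wt1(wt1_copies: float, abl_copies: float) -> float:
    """WT1 transcript level expressed per 10^4 ABL control-gene copies."""
    return wt1_copies / abl_copies * 1e4

# Hypothetical raw Q-PCR readout: 350 WT1 copies against 21,000 ABL copies.
level = normalized_wt1(350, 21_000)   # ~166.7/10^4 ABL copies
print(level >= 121)                   # True: above the postconsolidation BM cut-off
```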

Statistical analysis
Survival distributions (relapse-free survival, RFS) were estimated using the Kaplan-Meier method [17]. RFS was calculated from the date of 1st remission until the date of relapse or death, whichever occurred first. Transplanted patients were censored at the time of SCT. Differences in RFS were evaluated by the log-rank test. The Cox proportional hazards regression model was used for univariate and multivariate analysis of factors associated with RFS. The following variables were analyzed for all patients: age, sex, FAB subtype, WBC count, ELN risk category, unfavorable cytogenetics, morphological/cytogenetic remission after induction, and biological parameters (LAIP, FLT3-ITD, FLT3-TKD, NPM1 mutations, and WT1 expression) at the time-points previously indicated. Continuous variables were categorized as follows: each variable was first divided into four categories at approximately the 25th, 50th, and 75th percentiles. If the hazard ratios (HRs) in two or more adjacent categories were not substantially different, these categories were grouped together. If no clear pattern was observed, the median was taken as the cut-point. All P values were two-sided and P < 0.05 was considered statistically significant.
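A minimal sketch of this workflow, under the assumption of Python with the lifelines package and hypothetical column names (the paper does not specify the software actually used):

```python
import pandas as pd
from lifelines import CoxPHFitter, KaplanMeierFitter
from lifelines.statistics import logrank_test

# Hypothetical dataset: one row per patient; rfs_months runs from 1st
# remission to relapse/death (transplanted patients censored at SCT).
df = pd.read_csv("aml_mrd.csv")  # assumed columns: rfs_months, event, bm_wt1, age, wbc

# Quartile-based categorization of a continuous covariate, as described above;
# adjacent categories with similar HRs would then be merged by inspection.
df["bm_wt1_cat"] = pd.qcut(df["bm_wt1"], q=4, labels=False)

# Kaplan-Meier estimate and log-rank test for one dichotomized MRD marker.
pos = df["bm_wt1"] >= 121
kmf = KaplanMeierFitter()
kmf.fit(df.loc[pos, "rfs_months"], df.loc[pos, "event"], label="BM-WT1 >= 121")
result = logrank_test(df.loc[pos, "rfs_months"], df.loc[~pos, "rfs_months"],
                      df.loc[pos, "event"], df.loc[~pos, "event"])
print(result.p_value)  # two-sided; P < 0.05 considered significant

# Cox proportional hazards regression for multivariate analysis.
cph = CoxPHFitter()
cph.fit(df[["rfs_months", "event", "bm_wt1_cat", "age", "wbc"]],
        duration_col="rfs_months", event_col="event")
cph.print_summary()
```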

Risk stratification at diagnosis
The patients were stratified at diagnosis according to the ELN risk criteria [1]. Neither by univariate nor by multivariate analysis did the ELN risk category impact RFS. Other baseline clinical (age and sex) and biological variables (WBC count, cytogenetics alone, PB- and BM-WT1 levels, NPM1 mutation, and Flt3-ITD/TKD mutations) were included in the univariate and multivariate analyses, but only a WBC count greater than 58,500/mm³ was independently associated with adverse RFS (HR 4.0; 95% CI 1.4-11.7; P = 0.01).
The median levels of PB- and BM-WT1 at diagnosis were 1747/10⁴ and 3621/10⁴ ABL copies, respectively. By univariate analysis on RFS, at least in our cohort of patients, neither value was able to discriminate patients at different risk of relapse (HR 1.4 [95% CI 0.4-5.2], P = 0.62 and HR 0.9 [95% CI 0.3-2.3], P = 0.79, for PB and BM, respectively).
No significant association was found between a particular LAIP or BM/PB WT1 overexpression and any baseline clinical-pathological characteristic of the patients.

Evaluation of predictive impact of postremission sequential MRD monitoring
The postremission predictive impact of MRD monitoring by LAIP-MFC and by WT1 Q-PCR on RFS was evaluated on the basis of MRD results assessed after induction, consolidation, and intensification, as stated above.

Incidence of morphological relapse in MRD-positive patients
According to the baseline characteristics, first-line allo-SCT was planned as the intensification treatment in ELN high-risk patients. These patients represented 20/104 (19%) of the cohort; of these, 4 (20%) relapsed and 18 (90%) were allotransplanted in first line. Since allo-SCT was mandatory for this category of patients, we excluded them from the subsequent analysis.

Discussion
In the absence of new and more effective antileukemic drugs, the improvement in outcome observed in adult AML patients during the last decades has been mainly due to efforts in optimizing postremission therapy [2,3]. Two major factors have played a role in this regard: the progressively better identification of patients at high risk of relapse [1,18] and, in particular, the more extensive use of allo-SCT in 1st CR in patients selected on the basis of cytogenetic and molecular high-risk markers [4]. This last category of patients accounts for 40-50% of cases. Therefore, approximately half of AML patients, those included in the ELN low/intermediate-risk categories, are generally excluded from first-line allo-SCT intensification procedures in order to avoid the risk of high transplant-related mortality.
However, both low- and intermediate-risk patients at diagnosis may eventually relapse in 20-40% of cases, and they must then be retreated before being transplanted in 2nd CR, with more resistant disease and suboptimal clinical conditions. In our cohort of low- and intermediate-risk AML patients, relapses occurred in 27% and 34% of cases, respectively, and only a minority of these patients were actually allotransplanted in 2nd CR (Table 3). With a view to postremission therapy optimization, the evaluation of MRD during and after treatment may be a powerful and accurate tool to improve risk assignment, as initially established in many hematological diseases [19,20]. With respect to other diseases, AML almost represents an exception, even if studies by other groups clearly indicate that the combination of prognostic parameters detected before treatment (cytogenetic/molecular) and during treatment (CR after one or two cycles) may improve outcome prediction [6], and that MRD monitoring, as an expression of disease debulking and drug sensitivity, may be extremely important to guide postremission risk-adapted treatment [7, 9-12, 21-26].
In this study, we first performed a risk stratification of our patients according to their baseline clinical and biological characteristics. None of the studied variables except a WBC count greater than 58,500/mm³ was independently associated with adverse RFS (HR 4.0 by multivariate analysis; 95% CI 1.4-11.7; P = 0.01). The result of this analysis may have been influenced by the intensive therapeutic program adopted in our series, which may have hampered the prognostic significance of some commonly accepted outcome predictors, such as cytogenetics. We then focused on the longitudinal monitoring of MRD with LAIP-MFC and WT1 Q-PCR on PB and BM simultaneously, with the aim of evaluating their usefulness in the prediction of relapse, comparing the efficiency of the two methods, and identifying the most accurately predictive time-point for relapse. Multivariate analysis clearly identified BM-WT1 ≥ 121/10⁴ ABL copies (HR 4.1; P = 0.02) and LAIP ≥ 0.2% (HR 3.3; P = 0.0001) after 1st consolidation, and PB-WT1 ≥ 16/10⁴ ABL copies (HR 10.2; P = 0.0001) after 1st intensification, as associated with dismal prognosis (Table 2, Fig. 1).
A potential prospective use of these MRD-positive values to plan postremission risk-adapted therapy is suggested in Table 3. Based on the predictive value of MRD monitoring on relapse, 5/37 (14%) ELN favorable-risk patients could have been considered for allotransplant in 1st CR, facing a relapse risk of 25% or 80%, respectively, if MRD positivity was detected after consolidation or after intensification. Similarly, 22/47 (47%) ELN intermediate-risk patients could have been considered for allotransplant in 1st CR, facing a relapse risk of 55% or 73%, respectively, if MRD positivity was detected after consolidation or after intensification. Finally, in both ELN groups the persistence of MRD positivity after consolidation and intensification was predictive of relapse in at least 80% of cases, thus supporting the indication for an allotransplant. We are aware that the number of patients in each group is relatively small and that any conclusion should be drawn cautiously, but we think that this approach could help clinicians and patients to better weigh the risk of relapse against that of an allotransplant, the latter being of benefit when the risk of leukemia relapse exceeds 35-40% [27].
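To make the logic behind Table 3 concrete, a minimal sketch of such a decision rule (our own illustration under the cut-offs above, not a validated clinical algorithm; all names are hypothetical):

```python
# Cut-off values from our multivariate analysis (WT1 per 10^4 ABL copies,
# LAIP as % of events); the decision-rule framing itself is illustrative.
BM_WT1_CUTOFF_CONS = 121   # BM-WT1 after 1st consolidation
LAIP_CUTOFF_CONS = 0.2     # LAIP-MFC after 1st consolidation
PB_WT1_CUTOFF_INT = 16     # PB-WT1 after 1st intensification

def mrd_positive_after_consolidation(bm_wt1: float, laip_pct: float) -> bool:
    return bm_wt1 >= BM_WT1_CUTOFF_CONS or laip_pct >= LAIP_CUTOFF_CONS

def consider_allo_sct_in_1st_cr(eln_risk: str, bm_wt1_cons: float,
                                laip_cons: float, pb_wt1_int: float) -> bool:
    """Flag patients whose MRD course would support discussing allo-SCT."""
    if eln_risk == "high":
        return True  # allo-SCT already planned first line in this cohort
    return (mrd_positive_after_consolidation(bm_wt1_cons, laip_cons)
            or pb_wt1_int >= PB_WT1_CUTOFF_INT)

# Example: an intermediate-risk patient, MRD-negative after consolidation
# but with PB-WT1 = 20/10^4 ABL copies after intensification -> flagged.
print(consider_allo_sct_in_1st_cr("intermediate", 50, 0.05, 20))  # True
```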
MFC on LAIPs and WT1 Q-PCR have become very popular techniques for detecting leukemic cells at the submicroscopic level, mainly because they offer the opportunity of longitudinal MRD measurement [28]. MFC has the advantage of being available in almost all hematology-oriented laboratories and is relatively easy to perform, but it is operator-dependent, not completely standardized, and the immunophenotypic shift of the leukemic clone may hamper its power in predicting relapse. On the other hand, Q-PCR on target genes is easy, highly standardized, and relatively cheap, but it is applicable mainly in AML patients with known molecular aberrations (e.g., molecular rearrangements arising from chromosomal translocations, such as in CBF leukemias, or gene mutations, such as NPM1 and Flt3-ITD), which are observed in a minority of cases [29-31]. Nevertheless, the possibility of quantifying by Q-PCR the WT1 gene, which is overexpressed in up to 80-90% of AML cases, overcomes this limitation and offers the opportunity to monitor an MRD marker in the great majority of patients, although its specificity in detecting the leukemic clone remains controversial. We used both LAIP-MFC and WT1 Q-PCR in order to evaluate the efficiency of the two methods, and we decided to monitor MRD at a very early time-point (after induction), an intermediate time-point (after consolidation), and a late time-point (at the end of the intensification program). In the end, we cannot say that one method is clearly better than the other, and we have seen that both are useful to identify patients at high risk of relapse, as similarly reported by other authors [32-34]. These results have been confirmed by other groups, who reported the power of WT1 monitoring in detecting patients at high risk of relapse, but no conclusive data are available concerning the cut-off for positive versus negative samples or the optimal time-point for its assessment [23, 36-43]. Concerning this latter point, the data coming from published studies are not conclusive. In particular, a very early time-point for WT1 monitoring (postinduction) [15, 35-39, 41-43], but also a later time-point (postconsolidation or pre-allo-SCT) [38, 41-43], have been reported to significantly predict outcome. Concerning LAIP-MFC sensitivity and efficiency, our data are concordant with those reported by the GIMEMA group and suggest that the most accurately predictive time-point for MRD assessment is probably after consolidation [14,44]. In particular, we have seen that, within the favorable- and intermediate-risk groups, MRD positivity after consolidation predicts relapse in about 40% of MRD-positive AML patients, but the accuracy of relapse prediction increases to 75% when the evaluation of MRD positivity is made after intensification. On the other hand, other groups observed that LAIP-MFC positivity detected at an earlier time-point (after induction) is significantly associated with increased relapse risk [25,45,46].
According to our experience and to the data reported in the literature, MRD monitoring has to be considered dynamically: the further we advance in the treatment program, the lower the level of MRD positivity becomes, but, at the same time, the greater the accuracy of the predictive power of MRD positivity on relapse. Only by taking into account the dynamic nature of this phenomenon can we explain the discordance of the cut-off values for positive versus negative samples observed by different groups. In our experience as well, for example, the predictive value of PB-WT1 ≥ 16/10⁴ ABL copies measured after 1st intensification is greater than that of PB-WT1 ≥ 5/10⁴ ABL copies measured before transplantation, after completion of the treatment program, as already published for a selected cohort of the same patients [16].
As both LAIP and WT1 monitoring proved to be effective and highly concordant in relapse prediction, either of them could be chosen, depending on the expertise and facilities available in different centers. In the meantime, we think the most important efforts are a sound choice of the method used for MRD monitoring, its rigorous standardization, and close sequential monitoring of MRD. As suggested by our data, the accuracy of the indication for allo-SCT depends on when MRD positivity is detected and on the number of time-points (single or multiple) used for this detection. Moreover, MRD positivity as an indication for allo-SCT intensification should be combined with other clinical and biological factors detected at disease onset. Only through this integrated system of evaluation do we think the treatment program can be optimized and customized for each patient.