Defining low-risk high hyperdiploidy in patients with paediatric acute lymphoblastic leukaemia: a retrospective analysis of data from the UKALL97/99 and UKALL2003 clinical trials

Summary
Background
High hyperdiploidy is the most common genetic subtype of childhood acute lymphoblastic leukaemia and is associated with a good outcome. However, some patients relapse and, given its prevalence, patients with high hyperdiploidy account for a large proportion of all relapses. We aimed to evaluate putative risk factors and determine the optimal pattern of trisomies for predicting outcome.

Methods
We used discovery and validation cohorts from consecutive trials, UKALL97/99 (n=456) and UKALL2003 (n=725), to develop the prognostic profile. UKALL97/99 recruited patients aged 1–18 years between Jan 1, 1997, and June 15, 2002, and UKALL2003 recruited children and young adults aged 1–24 years between Oct 1, 2003, and June 30, 2011, from the UK and Ireland who were newly diagnosed with acute lymphoblastic leukaemia. Cytogenetic and fluorescence in-situ hybridisation testing was performed on pre-treatment bone marrow samples by regional UK National Health Service genetic laboratories or centrally by the Leukaemia Research Cytogenetics Group, and results were reported using established nomenclature and definitions. We examined the prognostic effect of previously proposed genetic and non-genetic risk factors among patients with high hyperdiploid acute lymphoblastic leukaemia treated on UKALL2003. We used Bayesian information criterion, targeted projection pursuit, and multivariate analysis to identify the optimal number of trisomies, and best subset regression and multivariate analysis to identify the optimal combination. Survival analysis considered three endpoints, as follows: event-free survival, defined as time to relapse, second tumour, or death, censored at last contact; relapse rate, defined as time to relapse for those reaching complete remission, censored at death in remission or last contact; and overall survival, defined as time to death, censored at last contact.

Findings
The median follow-up time was 10·59 years (IQR 9·25–12·06) for UKALL97/99 and 9·40 years (8·00–11·55) for UKALL2003. UKALL97/99 included 208 female patients and 248 male patients, and UKALL2003 included 345 female patients and 380 male patients. We deduced that the trisomic status of four chromosomes provided the optimal information for predicting outcome. The good risk profile comprised karyotypes with both +17 and +18, or with either +17 or +18 in the absence of +5 and +20. All remaining cases were classified in the poor risk profile. The ratio of patients with good risk and poor risk was 82:18 and 80:20 in the discovery and validation cohorts, respectively. In the validation cohort, patients with the high hyperdiploid good risk profile had an improved response to treatment compared with other patients with high hyperdiploidy at 10 years (relapse rate 5% [95% CI 3–7] vs 16% [10–23]; p<0·0001; event-free survival 92% [90–94] vs 81% [73–86]; p<0·0001; and overall survival 96% [94–97] vs 86% [79–91]; p<0·0001). The outcome for high hyperdiploid poor risk patients was similar to that of patients with an intermediate cytogenetic profile. The prognostic effect of the UKALL high hyperdiploid profile was independent of minimal residual disease and the profile outperformed other high hyperdiploid risk profiles.

Interpretation
Future clinical trials and treatment protocols using high hyperdiploidy as a risk stratification factor should consider modifying the definition beyond chromosome count to incorporate this novel UKALL high hyperdiploid profile.

Funding
Blood Cancer UK.


Introduction
High hyperdiploidy is the most common cytogenetic abnormality in childhood acute lymphoblastic leukaemia, and accounts for 30-35% of B-cell precursor acute lymphoblastic leukaemia cases. 1 High hyperdiploidy is defined by the non-random gain of chromosomes, increasing the modal chromosome number of leukaemic blasts from 46 to between 51 and 65 or 67. [2][3][4] A DNA index of more than 1·16 is also frequently used to define this subgroup. Gain of chromosomes X, 4, 6, 10, 14, 17, 18, and 21 accounts for more than 75% of aneuploidy events. 5 High hyperdiploidy has been associated with a favourable outcome (survival >90%) in multiple studies over several decades. 1,[6][7][8] Despite this association with good outcomes, the high frequency of patients with high hyperdiploid acute lymphoblastic leukaemia means that this subgroup accounts for up to 25% of all relapses. 9 Therefore, identification of robust risk factors within this group is clinically relevant. Numerous studies have examined cytogenetic risk factors within the subgroup of patients with high hyperdiploid acute lymphoblastic leukaemia, ranging from modal chromosome number to specific trisomies and the presence of structural abnormalities. 1,6,[10][11][12][13][14][15][16] Studies have shown improved outcomes for patients with a higher modal chromosome number [10][11][12] and specific trisomies (+6; 13 +4, +10; 14 +10, +17; 6 +18 1,15 ) and triple trisomies (+4, +10, +17), 16 which often are related to each other. 17 However, none of these studies investigated the pattern of all possible combinations of chromosomal gains and there is no consensus regarding the optimal risk factors in patients with high hyperdiploid acute lymphoblastic leukaemia. Additionally, the aforementioned risk factors have not been assessed within the context of end of induction minimal residual disease (MRD) risk stratification, 18 which is now commonplace in acute lymphoblastic leukaemia treatment protocols.
In this study, we evaluated all previously published high hyperdiploid acute lymphoblastic leukaemia data from the UKALL97/99 and UKALL2003 studies and did a comprehensive analysis to identify new risk profiles by investigating all possible combinations of gained chromosomes. Additionally, we defined and validated a novel profile that remains relevant in the context of risk stratification using MRD.

Study design and participants
In this retrospective analysis, patients were diagnosed with B-cell precursor acute lymphoblastic leukaemia by standard flow-cytometric criteria and were treated in the UKALL97/99 (between Jan 1, 1997, and June 15, 2002) or UKALL2003 (between Oct 1, 2003, and June 30, 2011) treatment trials, as previously described (appendix pp 22-23). All patients aged 1-18 years (in UKALL97/99) and 1-24 years (in UKALL2003) from the UK and Ireland who were newly diagnosed with B-cell precursor acute lymphoblastic leukaemia were eligible for the trials. In UKALL2003, MRD was evaluated by real-time qPCR analysis of immunoglobulin and T-cell receptor gene rearrangements. 18 Cytogenetics and fluorescence in-situ hybridisation were done on

Research in context
Evidence before this study
Among children with acute lymphoblastic leukaemia, high hyperdiploidy, defined by a modal chromosome number of 51-65 chromosomes, is the largest subtype. Despite a strong association with good outcomes, the size of this subgroup means that a large number of patients relapse in absolute terms. Many studies have sought to define a smaller subgroup of these patients that is associated with a uniform very low risk of relapse and can be considered for treatment reduction. However, robust validation studies are rare and there is no consensus definition of low-risk high hyperdiploidy. Before this study, we searched PubMed for all publications citing high hyperdiploidy before March 30, 2020, using the search terms "high hyperdiploidy", "acute lymphoblastic leukaemia", "childhood", "prognosis", and "HeH". This search was updated on January 10, 2021. All abstracts were screened by AE and reviewed in detail by AE and AVM.

Added value of this study
To our knowledge, this is the most comprehensive analysis to date of risk factors in childhood high hyperdiploid acute lymphoblastic leukaemia. We identified and validated a clinically useful profile that redefines low-risk high hyperdiploid acute lymphoblastic leukaemia. Our profile, which is based on four constituent trisomies (+5, +17, +18, and +20), identified most of the high-risk relapses in this subgroup of patients. These trisomies have previously been proposed as risk factors in patients with high hyperdiploid acute lymphoblastic leukaemia. This novel profile outperformed previously published risk profiles in terms of prediction accuracy and prognostic impact, and although statistically independent of minimal residual disease, it could be further refined by its integration. The proposed profile identified low-risk patients with high hyperdiploid acute lymphoblastic leukaemia who should be considered for treatment de-intensification and a group of patients with high hyperdiploid acute lymphoblastic leukaemia who should be treated with other intermediate risk patients.

Implications of all the available evidence
The outcomes for patients with high hyperdiploidy are heterogeneous. Simply counting the number of chromosomes does not identify a subset of patients with a uniform outcome. There is evidence from this study and others that the pattern of chromosome gain can be used to define subgroups of patients with high hyperdiploidy with a distinct risk of relapse. The UKALL-high hyperdiploidy profile is simple to compute from a full karyotype or single nucleotide polymorphism array profile and outperforms existing high hyperdiploidy risk profiles. We propose that this new definition be used prospectively to define patients with high hyperdiploid acute lymphoblastic leukaemia who have an excellent chance of cure.
pre-treatment bone marrow samples by the member laboratories of the UK Cancer Cytogenetics Group or centrally by the Leukaemia Research Cytogenetics Group, and results were reported with established nomenclature and definitions. 1 Each trial was approved by the relevant ethics committee and patients or parents or guardians gave written informed consent in accordance with the Declaration of Helsinki.

Procedures
Patients with high hyperdiploidy and concomitant BCR-ABL1, ETV6-RUNX1, KMT2A, or TCF3-PBX1 fusions were excluded from this analysis on the assumption that the fusion gene was the primary genetic abnormality and would be used to direct therapy. All karyotypes were scrutinised for cases of masked hypodiploidy and, if found, were removed from the cohort. 19 To develop the UKALL high hyperdiploidy prognostic profile, we used the two trial datasets as discovery (UKALL97/99) and validation (UKALL2003) cohorts. To ensure the development of a robust profile, only chromosomal abnormalities and trisomies visible by conventional G-banded karyotyping were included in the analyses. To simulate the real-world setting, we validated the profile using cytogenetic data generated as part of the standard-of-care genetic tests done by regional genetic laboratories for UKALL2003 patients. The SALSA MLPA kit P335 (MRC Holland; Amsterdam, Netherlands), which includes probes directed to IKZF1, CDKN2A/B, PAX5, EBF1, ETV6, BTG1, RB1, and PAR1 (P2RY8/CSF2RA/IL3RA/CRLF2), was used to identify copy number alterations.

Statistical analysis
Initially, the evaluation of previously reported risk factors in high hyperdiploidy was done using the UKALL2003 trial only, because this trial had used MRD to direct therapy. The development of the UKALL high hyperdiploid profile was done using discovery (UKALL97/99) and validation (UKALL2003) cohorts.
We examined modal chromosome number both as a continuous variable and across three predefined categories (51-53, 54-57, and 58-65). 11,12 We also examined the prognostic impact of double trisomies (+4, +10) and triple trisomies (+4, +10, +17) as proposed by the Children's Oncology Group (COG). 6,16 Previously, Heerema and colleagues 17 proposed subgroups of high hyperdiploidy based on the pattern of chromosomal gains and modal chromosome number. We replicated this approach using correlation coefficients to identify clusters of gained chromosomes and their correlation with relapse. We analysed age, white cell count, and MRD as both continuous and categorical variables. Survival analysis considered three endpoints, as follows: event-free survival, defined as time to relapse, second tumour, or death, censored at last contact; relapse rate, defined as time to relapse for those reaching complete remission, censored at death in remission or last contact; and overall survival, defined as time to death, censored at last contact. The median follow-up times for the discovery and validation cohorts were 10·59 years (IQR 9·25-12·06) and 9·40 years (8·00-11·55), respectively. We calculated and compared survival rates at 10 years (unless otherwise specified) using Kaplan-Meier methods, log-rank tests, and Cox regression models (univariate and multivariate analyses). p<0·05 was considered to indicate a significant difference. We compared the outcomes of patients with and without a given chromosomal gain, using a univariate Cox regression model.
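As an illustration of how censoring enters the endpoint definitions above, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is not the authors' analysis code (survival analyses were run in Stata), and the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate, returned as [(time, survival)] at each event time.

    times  -- follow-up time for each patient (for example, in years)
    events -- 1 if the endpoint occurred at that time, 0 if the patient was censored
    """
    events_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)  # events and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving_at):
        d = events_at.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving_at[t]
    return curve


# Hypothetical toy cohort: three events, five censored observations
toy_times = [2, 3, 3, 5, 8, 10, 10, 10]
toy_events = [1, 1, 0, 1, 0, 0, 0, 0]
print(kaplan_meier(toy_times, toy_events))
# approximately [(2, 0.875), (3, 0.75), (5, 0.6)]
```

For relapse rate, the quantity reported would be one minus this estimate, with deaths in remission treated as censoring events, as in the definition above.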
To investigate the optimal number of chromosomal gains, we used area under the receiver operating characteristic curve (AUC), generalised linear models, 20 and targeted projection pursuit. 21 We used best subset regression to evaluate the best subset models. 20 We measured Mallows' Cp for each combination and the model with the lowest value was chosen as the best fitting. To assess the optimal combination of chromosomes for predicting outcome, we used multivariate analysis, generalised linear models, network analysis, and the coefficients of the risk model. The model with the smallest Bayesian information criterion with a forward stepwise criterion score over the complete set of possible models was deemed the least complex and best-fitting model (appendix p 24). 22 We plotted the risk of relapse against unique MRD thresholds for all patients. Additionally, the cumulative distribution function for each group was plotted to identify the optimal discriminative threshold. Hazard ratios (HRs) comparing relapse rate, event-free survival, and overall survival between subgroups were calculated using univariate and multivariate Cox models. Finally, we did a sensitivity analysis to confirm the utility of the profile in key patient subgroups.
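For reference, the two model-selection criteria named above are conventionally defined as follows. These are the standard textbook forms; the exact variants used in the analysis are described in the appendix and might differ (for example, for Cox models the partial likelihood and the number of events are often used in place of the full likelihood and sample size).

```latex
% Bayesian information criterion for a model with k estimated parameters,
% sample size n, and maximised likelihood \hat{L}
\mathrm{BIC} = k \ln(n) - 2 \ln(\hat{L})

% Mallows' C_p for a candidate model with p parameters (including the intercept),
% residual sum of squares \mathrm{SSE}_p, and error variance \hat{\sigma}^2
% estimated from the full model
C_p = \frac{\mathrm{SSE}_p}{\hat{\sigma}^2} - n + 2p
```

In both cases a lower value indicates a better trade-off between fit and complexity, which is why the model with the smallest value is preferred.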
We did all survival analyses using Stata 14.0 and we did all subsequent analyses using R 3.4.4 (appendix p 24).

Role of the funding source
The funder of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.

Results
Data from 456 patients from the UKALL97/99 cohort and 725 patients from the UKALL2003 cohort were analysed (figure 1) and a summary of the patient characteristics of both cohorts is shown in table 1. Using the UKALL2003 cohort, we evaluated the prognostic effect of previously published cytogenetic risk factors. We did not observe any significant correlation between modal chromosome number and any of the three survival endpoints examined (appendix pp 20-21). Six trisomies (+3, +5, +7, +17, +18, and +20) were associated with outcomes (appendix p 10). Trisomies of 5 and 18 were significant for all three endpoints, whereas trisomies of 7, 17, and 20 were each significant for only one endpoint (event-free survival, relapse rate, and overall survival, respectively). Among UKALL2003 patients, double trisomies and triple trisomies were associated with a significantly better relapse rate but not event-free survival and overall survival compared with other patients with high hyperdiploidy (HRs for relapse rate 0·51 [95% CI 0·28-0·91]; p=0·024 and 0·38 [0·19-0·77]; p=0·0070 for double trisomies and triple trisomies, respectively; appendix p 20). Like Heerema and colleagues, 17 we identified five distinct clusters of chromosomes (appendix p 13), which mapped closely to the five groups reported by Heerema and colleagues, 17 supporting the notion that chromosomes are gained nonrandomly and within groups. Moreover, we showed that the risk of relapse varied by cluster (appendix p 21). Specifically, group II (HR 3·16 [95% CI 1·32-7·54]; p=0·0093) and group V (3·64 [1·46-9·05]; p=0·0054), which comprised 130 (18%) and 91 (13%) cases, respectively, showed a significant increase in relapse rate compared with group I.
Next, we sought to identify the optimal number of chromosomes required to maximise prediction. Using the discovery cohort (UKALL97/99), we generated all possible combinations of up to six gained chromosomes and compared them using the C-index as a measure of the predictive power of the profile. The prediction performance of the models increased with each additional chromosome up to four chromosomes, after which no additional benefit was gained from adding more chromosomes (appendix p 5). We found further support for this result using the Mallows' Cp statistic (appendix p 14).
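The C-index used to compare candidate profiles can be illustrated with a short sketch of Harrell's concordance index in plain Python. This is an assumption about the general form of the statistic rather than the authors' implementation (the appendix gives the actual method); the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i had the event strictly before
    patient j's follow-up time. The pair is concordant when the patient with
    the earlier event has the higher risk score; ties in score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: higher score means higher predicted risk of relapse
print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [3, 2, 2, 1]))
```

A value of 0·5 corresponds to a profile with no discriminative ability and 1·0 to perfect ranking, so comparing this statistic across combinations of one to six chromosomes shows where adding further chromosomes stops improving prediction.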
We explored which combination of four chromosomes provided the optimal information for predicting outcome. Using the discovery cohort, we used univariate Cox models to identify five chromosomes (5, 11, 17, 18, and 20) as being the most informative (appendix p 6). We used multivariate stepwise Cox regression modelling to produce a final model that comprised four chromosomes-5, 17, 18, and 20. We validated this result using the Bayesian information criterion (appendix p 15). Correlation network analysis (appendix p 16) and volcano plot (appendix p 17) supported that these chromosomes had the greatest prognostic impact.
Therefore, the optimal set of chromosomes for predicting relapse was 5, 17, 18, and 20. The good risk profile comprised patients with both +17 and +18 together or patients with either +17 or +18 and neither +5 nor +20 (figure 2A). The remaining patients were classified as having a poor risk profile and were characterised either by the absence of both +17 and +18 or by the presence of just one of them in combination with +5 or +20. Overall, the high hyperdiploid good risk cluster comprised 373 (82%) of 456 patients in the discovery cohort whereas the high hyperdiploid poor risk cluster comprised 83 (18%) of 456 patients. We found no correlation between high hyperdiploid risk group (good risk vs poor risk) and age, sex, or white cell count (table 1). In the discovery cohort, patients with a high hyperdiploid poor risk profile had a significant increase in relapse rate (HR 2·50 [95% CI 1·51-4·14]; p<0·0001) and a significant decrease in event-free survival (2·04 [1·27-3·30]; p=0·0030), but no difference in overall survival (1·82 [0·93-3·55]; p=0·080; table 1) compared with those with a good risk profile.
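The classification rule described above can be written as a short decision function. The sketch below is an illustrative reading of the rule as stated in the text, not software distributed with the study; the function name `ukall_heh_risk` and the example karyotypes are hypothetical, and it assumes the list of gained chromosomes has already been derived from the karyotype or array profile.

```python
def ukall_heh_risk(gained_chromosomes):
    """Assign the UKALL high hyperdiploid risk profile from gained chromosomes.

    gained_chromosomes -- iterable of chromosome labels gained in the clone,
                          for example {4, 6, 10, 14, 17, 18, 21, "X"}
    Returns "good" or "poor".
    """
    gained = set(gained_chromosomes)
    has_17 = 17 in gained
    has_18 = 18 in gained
    has_5_or_20 = 5 in gained or 20 in gained

    # Both +17 and +18: good risk regardless of +5 or +20
    if has_17 and has_18:
        return "good"
    # Exactly one of +17 or +18: good risk only if neither +5 nor +20 is present
    if (has_17 or has_18) and not has_5_or_20:
        return "good"
    # Neither +17 nor +18, or just one of them together with +5 or +20
    return "poor"


# Hypothetical example karyotypes
print(ukall_heh_risk({4, 6, 10, 14, 17, 18, 21, "X"}))  # good
print(ukall_heh_risk({4, 10, 17, 20, 21}))              # poor
```

Only the status of chromosomes 5, 17, 18, and 20 affects the result; all other gains are ignored by the rule.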
We validated this risk profile using patients treated in the UKALL2003 trial. The relative sizes of the high hyperdiploid good risk and high hyperdiploid poor risk groups were similar in the validation and discovery cohorts (82% vs 80% and 18% vs 20% in UKALL97/99 and UKALL2003, respectively; table 1). As in the discovery cohort, we found no correlation between high hyperdiploid risk group (good risk vs poor risk) and age, sex, or white cell count in the validation cohort (table 1; figure 2B-D). MRD is one of the key risk factors used to direct therapy in patients with childhood acute lymphoblastic leukaemia. Therefore, we examined the interaction of MRD and the risk profile in UKALL2003. The distribution of patients across different MRD categories was similar between the two high hyperdiploidy risk groups despite the difference in outcome (table 2). However, for most of the MRD categories examined, including the 0·01% threshold used to stratify patients in the trial, patients in the high hyperdiploid poor risk group had an inferior outcome (table 2). We previously showed that the optimal prognostic MRD threshold varies by genetic risk group. 18 Threshold analysis 22 revealed MRD of 0·03% to be the optimal threshold for patients with high hyperdiploid acute lymphoblastic leukaemia in UKALL2003 (appendix p 18). Stratifying patients by this MRD threshold and the UKALL high hyperdiploid profile revealed that both risk factors were highly informative for predicting relapse (table 2). Importantly, patients with MRD less than 0·03% who were also in the UKALL high hyperdiploid good risk profile group accounted for 377 (60%) of 632 patients with high hyperdiploidy, with a relapse rate of 4% (95% CI 2-6), an event-free survival rate of 95% (92-97), and an overall survival rate of 98% (96-99; table 2; appendix p 9).
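The joint stratification by profile and MRD described above can be sketched as a small helper. This is an illustrative assumption about how the two factors might be combined in practice, not a validated clinical algorithm; the function name `stratify` and its return labels are hypothetical, and the only value taken from the text is the 0·03% threshold.

```python
def stratify(profile, mrd_percent, threshold=0.03):
    """Combine the high hyperdiploid risk profile with end-of-induction MRD.

    profile     -- "good" or "poor" (UKALL high hyperdiploid profile)
    mrd_percent -- end-of-induction MRD expressed as a percentage
    threshold   -- MRD cut-off in percent; 0.03 is the value reported as
                   optimal for high hyperdiploidy in UKALL2003
    """
    if profile not in ("good", "poor"):
        raise ValueError("profile must be 'good' or 'poor'")
    mrd_group = "below" if mrd_percent < threshold else "at_or_above"
    return (profile, mrd_group)


# Hypothetical patient: good risk profile with MRD of 0.01%
print(stratify("good", 0.01))  # ('good', 'below')
```

The ("good", "below") stratum corresponds to the 377 patients described above as having MRD less than 0·03% and a good risk profile.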
Patients classified as slow early responders were treated on the high-risk protocol (regimen C) in UKALL2003. Slow early responders had a poorer outcome in terms of relapse rate compared with rapid early responders (19% [95% CI 11-32] vs 6% [4-8]; p=0·0003). Slow early responders with a high hyperdiploid good risk profile had a significantly lower relapse rate than did slow early responders with a high hyperdiploid poor risk profile (12% vs 46%; p=0·0010; table 2). In UKALL2003, patients with MRD less than 0·01% were treated on the standard regimen (A or B) and received one or two delayed intensification blocks. The excellent outcome of patients with a high hyperdiploid good risk profile was maintained even when they were treated with the least intensive therapy (table 2). Patients with MRD of 0·01% or greater were randomly assigned between staying on regimen A or B and regimen C or, if they had other risk factors, were assigned regimen C. The UKALL high hyperdiploid profile retained its prognostic effect among patients treated on these different pathways.
To determine the best risk-based definition of high hyperdiploidy, we compared the UKALL high hyperdiploid good risk group with the triple trisomy and double trisomy groups used by the COG in the USA (table 3). The UKALL high hyperdiploid good risk profile identified a cohort of patients with high hyperdiploidy with similar demographics, responses, and outcomes, but which accounted for a much higher proportion of the entire high hyperdiploid population as defined by modal chromosome number (table 3). As the UKALL high hyperdiploid good risk group was considerably larger than the triple trisomy and double trisomy groups, it captured a higher proportion of relapses (table 3). However, most of these relapses were standard risk and hence had a high chance of salvage. 9 In the UKALL2003 high hyperdiploid cohort, 47 relapses occurred and five (11%) were classified as high risk. 9 Among all patients with high hyperdiploidy, Cox regression analysis adjusted and unadjusted for MRD confirmed that the UKALL high hyperdiploid good risk profile was more discriminatory than both triple trisomy and double trisomy for predicting outcome (table 3). Both C-index and AUC measurements were higher for the UKALL high hyperdiploid profile than for triple trisomy and double trisomy (table 3; appendix pp 7-8).

Discussion
High hyperdiploidy is the most prevalent genetic subgroup of B-cell precursor acute lymphoblastic leukaemia. 2,4,15 Although associated with a good outcome, high hyperdiploid relapse remains a major clinical problem. Many clinical trials now use high hyperdiploidy either alone or in combination with MRD to assign patients to risk groups. The original definition of high hyperdiploidy was based on the number of chromosomes, using 51 chromosomes as the threshold. 24 Currently, the optimal definition of the good risk high hyperdiploid
We propose that clinical trials that seek to define a good risk subgroup based on high hyperdiploidy should use our profile rather than rely on counting chromosomes. We estimate that around 80% of patients with high hyperdiploid acute lymphoblastic leukaemia have the high hyperdiploid good risk profile; therefore, around 25% of all patients with B-cell precursor acute lymphoblastic leukaemia will fall into this category. The UKALL high hyperdiploid good risk group was larger than both the triple trisomy and double trisomy groups and yet had a similar good outcome. The outcome for the UKALL high hyperdiploid good risk group was similar to that reported for patients with ETV6-RUNX1-positive acute lymphoblastic leukaemia treated on the same trial. 18 By contrast, the outcome of the remaining patients with high hyperdiploidy was more akin to that reported for patients with intermediate risk cytogenetics. 18 Using the high hyperdiploid good risk profile to define good risk high hyperdiploidy, rather than modal chromosome number, would define a group of patients whose predicted outcome was better than a group defined by modal chromosome number.
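The estimate of around 25% can be checked with simple arithmetic, assuming the 30-35% prevalence of high hyperdiploidy among B-cell precursor cases quoted in the Introduction and the roughly 80% good risk share observed in both cohorts. This is a back-of-envelope check rather than a figure measured directly in the trials.

```latex
% Share of all B-cell precursor ALL expected to carry the good risk profile
0.80 \times 0.30 = 0.24, \qquad 0.80 \times 0.35 = 0.28
```

The resulting range of 24-28% is consistent with the rounded figure of around 25%.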
Even though the UKALL high hyperdiploid poor risk group accounted for just 20% of patients with high hyperdiploidy, it captured nearly half of those who went on to have high-risk relapse. The outcome of patients with high-risk relapse is poor, so it is important that such patients avoid any treatment reduction during frontline therapy. Applying the UKALL high hyperdiploid profile prospectively could ensure that patients at greatest risk of a high-risk relapse are assigned to the intermediate risk treatment group and avoid treatment reduction interventions or randomisations.
We previously reported that end of induction MRD was log-normally distributed and that different genetic subtypes had different distributions with distinct optimal MRD thresholds. 18 As a result, we identified that 0·03% MRD, rather than the 0·01% cutoff used to direct therapy, was the optimal threshold for patients with high hyperdiploidy. The outcome of UKALL high hyperdiploid poor risk patients did not vary much by MRD, especially given the relatively modest number of patients in each category, and event-free survival rates were always less than 80%. However, two small overlapping very high-risk groups were identified, namely UKALL high hyperdiploid poor risk patients with MRD of 0·03% or greater or slow early responders. Patients with a UKALL high hyperdiploid good risk profile and MRD less than 0·03% had an excellent outcome. Hence the overall effect of MRD among UKALL high hyperdiploid good risk patients was borderline. Many protocols now classify patients with MRD of 5% or greater as refractory and treat them as very high risk. 26 In this study, very few patients with a high hyperdiploid good risk profile had MRD of 5% or greater and their relapse rate was high, so these patients should be treated as refractory like other patients with MRD of 5% or greater. Further delineation of the high hyperdiploid good risk profile by MRD had a marginal effect in terms of the main endpoints. However, there could be some scenarios where such delineation is required (eg, some treatment reductions), and in such scenarios 0·03% would be the most appropriate threshold.
The UKALL high hyperdiploid profile builds on much of the previous evidence of outcome heterogeneity in patients with high hyperdiploid acute lymphoblastic leukaemia. We previously reported that +18 was associated with a good outcome. 15 Additionally, +17 is part of the triple trisomy used by the COG to define low risk high hyperdiploid acute lymphoblastic leukaemia. 27 Both +5 and +20 have been associated with a poor outcome in paediatric and adult patients with high hyperdiploid acute lymphoblastic leukaemia. 6,28 The four chromosomes driving our profile map to three of the four high hyperdiploid chromosome sets proposed by Heerema and colleagues; 17 hence, our profile might represent a simplification of their model. One hypothesis to explain the good outcome of patients with high hyperdiploidy is that high hyperdiploid cells preferentially take up methotrexate. Although the functional consequences of high hyperdiploidy itself remain to be fully elucidated, differential expression of genes on the trisomic chromosomes is the dominant theory. 29 This idea would support the concept that specific patterns of trisomies could influence the response to therapy and hence prognosis. Further studies are needed to understand the functional consequences of the different combinations of trisomies within high hyperdiploidy.
There are a few limitations to our study. The extensive analysis in this study required access to two large datasets. Hence, our discovery dataset comes from the pre-MRD era, but this fact was mitigated to some degree by using a contemporary dataset from an MRD-based protocol. Also, we had to rely on cytogenetic data derived from classical karyotyping. Bone marrow samples can sometimes produce poor quality metaphases, which makes identifying chromosomes, particularly smaller chromosomes, challenging. However, comparing the distribution of gained chromosomes in high hyperdiploidy determined by cytogenetics and single nucleotide polymorphism (SNP) array revealed no differences. 15,29 Additionally, 53% of patients had both +17 and +18 and our profile automatically assigns these patients to the good risk group. Patients with either +17 or +18 are also assigned to the good risk group unless they have +5 or +20. Our data suggest that +5 and +20 are rare and SNP array studies confirm this observation. 29 Therefore, despite the limitations of using cytogenetic data, we are confident that the misclassification of patients in this study is low. Moving forward, widespread use of SNP arrays during the diagnosis of acute lymphoblastic leukaemia will ensure the accurate identification of all gained chromosomes in patients with high hyperdiploidy. The application of this novel profile to additional datasets will provide further evidence regarding its robustness and valuable information regarding its clinical usefulness across different treatment protocols.
The UKALL high hyperdiploid profile is simple to compute from a full karyotype or, preferably, SNP array profile. Although we acknowledge the difficulty of recognising the smaller chromosomes, our profile can also be readily determined using centromere or locus-specific fluorescence in-situ hybridisation in the event of a normal, failed, or incomplete karyotype. Although DNA index can be used to determine ploidy, it cannot ascertain which chromosomes are gained or lost, so the UKALL high hyperdiploid profile cannot be determined by DNA index. The most widely used trisomy profile in patients with high hyperdiploid acute lymphoblastic leukaemia is the triple trisomy developed by the COG. With respect to outcome metrics, the UKALL high hyperdiploid profile matches or outperforms triple trisomy. The major advantage of our new profile is that the good risk group captures a much larger group of patients with a low risk of relapse, while defining a smaller poor risk group that is more likely to have high-risk relapses. Therefore, the profile offers advantages in the clinic in terms of risk stratification.
In conclusion, by performing, to our knowledge, the most comprehensive analysis to date of risk factors in childhood high hyperdiploid acute lymphoblastic leukaemia, we identified a robust and clinically useful profile on the basis of the trisomic status of four chromosomes. It is reassuring that the constituent trisomies (+5, +17, +18, and +20) have all previously been proposed as risk factors in high hyperdiploidy and provide a framework for further investigations to elucidate precisely which genes are determining treatment response. Our risk profile outperformed previously reported risk profiles in high hyperdiploidy in terms of prediction accuracy. The prognostic effect of this profile is independent of MRD but can be refined by its integration, supporting the concept that integrating key risk features in childhood acute lymphoblastic leukaemia will improve risk stratification. If high hyperdiploidy is to be included in clinical trial stratification criteria to identify patients eligible for treatment de-intensification, it is crucial to remove high hyperdiploid poor risk patients from this group, as they have an intermediate prognosis and should be considered for treatment intensification. Therefore, we propose that the UKALL high hyperdiploid good risk profile is superior in defining good risk high hyperdiploidy compared with counting chromosomes.
Contributors AE and AVM conceived and designed the study. AE, AVM, and CJH collected and assembled the data. AE and AVM analysed and interpreted the data. AE, CJH, and AVM provided financial support. CJH and AVM provided administrative support. CJH, AVM, AV, and JM provided study materials or patients. AE and AVM wrote the manuscript. All authors approved the final manuscript, were responsible for the decision to submit for publication, and had full access to all the data in the study. AE and AVM accessed and verified the data.

Declaration of interests
We declare no competing interests.

Data sharing
The National Cancer Research Institute Children's Cancer and Leukaemia Group Leukaemia Subgroup will consider data sharing requests from researchers investigating questions regarding the biology and treatment of acute lymphoblastic leukaemia. Data, including individual patient data, and study details will be released if the project is deemed pertinent. Initial requests should be directed to Prof Anthony Moorman (anthony.moorman@newcastle.ac.uk).