Introduction

Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is a curative treatment modality for adults with acute myeloid leukemia (AML). However, allo-HSCT is still associated with high long-term non-relapse mortality (NRM). NRM may be attributed to infections, graft-versus-host disease (GVHD), organ toxicity, and secondary malignancies (SMs). One of the most widely used myeloablative conditioning (MAC) regimens is CYTBI (twice-daily 2 Gy fractions of total body irradiation [TBI] over 3 days to a total dose of 12 Gy, followed by intravenous cyclophosphamide 60 mg/kg × 2 days) [1]. TBI-based myeloablative conditioning (MAC 8–12 Gy) is usually preferred in fit and younger patients (< 50 years of age) with high-risk mutations or cytogenetic risk factors, for whom the combination of TBI and chemotherapy potentiates therapeutic efficacy [2]. Reduced intensity conditioning (RIC) combines lower doses of chemotherapy and/or TBI (doses of ≥ 4 Gy and < 8 Gy) to reduce toxicity while maintaining anti-leukemic effects [2]. It is still unclear how TBI-based conditioning compares to non-TBI-based protocols in patients with AML with respect to long-term mortality. In this retrospective study, we therefore analyzed the long-term mortality (including NRM) of patients after TBI-based conditioning with a standardized fractionated TBI technique, as compared to non-TBI-based conditioning (melphalan-based protocols), focusing on AML patients and 1st allo-HSCT.

Patients and methods

Data collection

We retrospectively compared long-term outcomes following TBI-based and non-TBI-based conditioning in patients with AML who received their 1st allo-HSCT at the Department of Hematology of the University Hospital Regensburg between 1999 and 2020. The eligibility criteria for this retrospective analysis included adult patients with primary or secondary AML who underwent their 1st allo-HSCT from matched sibling donors (MSD), matched unrelated donors (MUD), mismatched unrelated donors (MMUD), or haploidentical/mismatched related donors (MMRD) following TBI-based or non-TBI-based conditioning. Non-TBI-based conditioning included FBM (fludarabine, BCNU, melphalan), FTM (fludarabine, thiotepa, melphalan), and FM (fludarabine, melphalan). Stem cell sources were peripheral blood or bone marrow. The exclusion criteria were cord blood transplantation and non-myeloablative (NMA) conditioning (n = 11), previous autologous transplantation (n = 18), and non-melphalan-based protocols (n = 115). In summary, 339 patients were included in this analysis. The choice of conditioning regimen was at the oncologists’ discretion and depended on patient age, disease risk, and/or presence of comorbidities. Clinical data were extracted from the medical charts of the Departments of Hematology and Radiation Oncology, University Hospital Regensburg. Transplantation variables included patient age at the time of 1st allo-HSCT, sex, diagnosis, Karnofsky performance score (KPS), hematopoietic cell transplantation-comorbidity index (HCT-CI) as described by Sorror et al. [3], 2017 European LeukemiaNet (ELN) genetic risk stratification as described by Döhner et al. [4], disease status before allo-HSCT, stem cell source, intensity of conditioning regimen, recipient and donor characteristics (donor age, HLA compatibility, sex match, cytomegalovirus serostatus, ABO blood group compatibility), GVHD prophylaxis, and the use of antithymocyte globulin (ATG).
Variables related to outcome were the cumulative incidences of relapse (CIR), NRM, grade II–IV acute GVHD (aGVHD), and chronic GVHD (cGVHD) requiring immunosuppressive treatment, as well as overall survival (OS), progression-free survival (PFS), and causes of death including mortality caused by secondary malignancies (SMs). For patients with long intervals between site visits, the Clinical Cancer Registry at the Tumor Center Regensburg and local Viability Statistics Registration Offices were contacted before statistical analysis to confirm survival status, resulting in data completeness of 100%. The data cutoff was April 2021. The local Ethics Board of the University of Regensburg approved this study (Number 20–1810-101).

TBI

From 1999 to 2013, two Siemens Primus linear accelerators (Siemens Medical Systems, Inc., Concord, CA) were used for TBI, and from 2013 to 2020 two Elekta Synergy™ linear accelerators with an Agility™ head (Elekta Ltd, Crawley, UK). Clinically good dose distributions and similar parameters were demonstrated for both types of linear accelerator [5]. All patients received 6 megavoltage (MV) photon beams. We adopted a twice-daily fractionation with a minimum of 6 h between fractions. Patients lay on a couch at floor level in supine and prone positions to extend the source-to-skin distance. A 1 cm thick plate of Makrolon® polycarbonate was placed on a stand above the patient to counteract the skin sparing caused by the buildup effect. The smaller diameter of the neck region was compensated with a bolus of plastic modeling mass. Eight rotational arcs were used per patient position. The average time to deliver each fraction was 50–60 min per side (supine and prone), and the average dose rate to the total body was 4 cGy/min. Additional fixed beams were used in the cranial and caudal directions to compensate for the effects of inverse square variation with increasing distance. For doses > 8 Gy, two individual lung shields made of MCP96 with calculated thickness were designed to reduce the total dose to the center of the lung to 3.5 Gy each in supine and prone positions (total dose of 7 Gy). Radio-oncologists contoured two individual lung blocks for each patient on a CT scan with a 1 to 2 cm margin between the edge of the lung on the CT film and the edge of the block. Lung blocks were tailored to avoid shielding of the vertebrae. MV imaging verified the shielding positions. Areas of the chest wall that were shielded by the blocks were supplemented once a day with electron beams to achieve the full dose to the thoracic walls. The electron fields delivered a supplemental dose of 5 Gy for 12 Gy regimens.
In vivo dosimetry was used to verify the dose delivery on several points on the patient’s body, demonstrating the uniformity of the dose distribution [5].
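As a plausibility check on the dose figures above, the bookkeeping for the 12 Gy regimen can be sketched in a few lines of Python. The sketch uses only numbers stated in the text; the constant and function names are illustrative, and treating the chest wall under the lung blocks as receiving roughly the reduced lung dose from the photon fields is a simplifying assumption, not a dosimetric statement.

```python
import math

# Figures taken from the text (12 Gy regimen).
TOTAL_DOSE_GY = 12.0
FRACTION_DOSE_GY = 2.0
FRACTIONS_PER_DAY = 2
LUNG_DOSE_PER_POSITION_GY = 3.5   # supine and prone, each
ELECTRON_BOOST_GY = 5.0           # supplemental chest wall dose


def fractionation(total_gy, fraction_gy, fractions_per_day):
    """Return (number of fractions, number of treatment days)."""
    n_fractions = round(total_gy / fraction_gy)
    n_days = math.ceil(n_fractions / fractions_per_day)
    return n_fractions, n_days


n_fractions, n_days = fractionation(
    TOTAL_DOSE_GY, FRACTION_DOSE_GY, FRACTIONS_PER_DAY
)

# Dose to the center of the lung: one shielded contribution per position.
lung_total_gy = 2 * LUNG_DOSE_PER_POSITION_GY

# Assumption: the shielded chest wall receives about the lung dose from
# the photon fields, so the electron boost restores the prescribed dose.
chest_wall_gy = lung_total_gy + ELECTRON_BOOST_GY

print(f"{n_fractions} fractions over {n_days} days")
print(f"lung center: {lung_total_gy} Gy, shielded chest wall: {chest_wall_gy} Gy")
```

Under these assumptions the numbers are internally consistent: six fractions over three days, 7 Gy to the center of the lung, and 12 Gy to the shielded chest wall once the electron fields are added.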

Definitions and statistical endpoints

The primary endpoint was the cumulative incidence of NRM, with relapse considered a competing event. Secondary endpoints were cumulative incidences of relapse (CIR), grade II–IV aGVHD, cGVHD (requiring immunosuppressive treatment), PFS, OS, and causes of death including SMs. All times to the endpoints were calculated from the date of allo-HSCT (day 0). NRM was defined as death from any cause in the absence of prior relapse of the initial AML. Relapse was defined as manifest hematologic relapse requiring treatment. Isolated mixed chimerism or molecular detection of minimal residual disease (MRD) not requiring intervention was not regarded as relapse. For CIR, death from NRM was counted as a competing event. Acute GVHD and cGVHD were defined according to published standard criteria [6, 7]. Acute GVHD was classified as clinically significant at grade II–IV. The cumulative incidence of grade II–IV aGVHD was estimated considering death or relapse without grade II–IV aGVHD as a competing event. For the cumulative incidence of cGVHD (requiring immunosuppressive treatment), relapse or death without prior cGVHD (requiring immunosuppressive treatment) was counted as a competing event. PFS was defined as the time from allo-HSCT to the date of relapse, progression, or death from any cause. If patients were transplanted with active disease and did not reach complete remission after allo-HSCT, the date of relapse was defined as day 0. OS was defined as the time from allo-HSCT to the date of death from any cause. If a patient was event-free for all of the endpoints, the patient was censored at the last date of follow-up with confirmation of being event-free. To adjust for potential bias arising from imbalanced patient characteristics between TBI-based and non-TBI-based conditioning, multivariable regression analysis was used.
Covariates were ELN 2017 risk stratification, diagnosis, disease status, HCT-CI, patient age, conditioning intensity, KPS, donor type, graft source, sex match, donor age, donor/recipient CMV status, and the use of ATG. Standardized consensus definitions of hematopoietic recovery, graft failure, and donor chimerism were used [8]. Peripheral blood PCR-based chimerism analyses were performed on a regular schedule. Full donor chimerism was defined as 99% or greater donor chimerism. Patients were censored from the engraftment analysis if they died or had persistent leukemia/early relapse within the first 28 days after allo-HSCT. Causes of death are subdivided in a hierarchical manner with descending priority: AML, GVHD, infection, and other causes [9]. Deaths from AML include cases with disease progression or relapse after allo-HSCT. Deaths from GVHD include cases with acute and/or chronic GVHD on active treatment without infection and without evidence of disease progression or relapse after allo-HSCT. Deaths from GVHD include cases with active treatment of GVHD and documented infections contributing to death (GVHD is severe enough to cause death even if infection did not occur). Death from infection includes infections causing death without evidence of disease progression, relapse, or GVHD. Death from infection includes cases of infections with a history of GVHD that had resolved and was not treated at the time of infection. Infectious deaths are analyzed as total (bacterial, fungal, parasitic, viral, mixed, and unknown). Other causes of death include secondary malignancies, graft failures, accidents, suicides, hemorrhage, and thrombosis without evidence of disease progression or relapse after allo-HSCT, and without infection or GVHD.
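The hierarchical cause-of-death assignment described above (AML, then GVHD, then infection, then other causes) can be illustrated with a short Python sketch. The record fields and function name are hypothetical and are not taken from the study database; the sketch only mirrors the priority order stated in the text.

```python
from dataclasses import dataclass


@dataclass
class DeathRecord:
    """Hypothetical minimal record of findings at the time of death."""
    relapse_or_progression: bool     # AML relapse/progression after allo-HSCT
    gvhd_on_active_treatment: bool   # acute and/or chronic GVHD being treated
    infection: bool                  # documented infection contributing to death


def cause_of_death(record: DeathRecord) -> str:
    """Assign one cause of death in descending priority."""
    if record.relapse_or_progression:
        return "AML"
    if record.gvhd_on_active_treatment:
        # Includes cases with concurrent infection, as stated in the text.
        return "GVHD"
    if record.infection:
        # Includes cases with resolved, untreated GVHD in the history.
        return "infection"
    # Secondary malignancy, graft failure, accident, suicide, hemorrhage,
    # thrombosis, and similar causes.
    return "other"


examples = [
    DeathRecord(True, True, True),     # relapse outranks everything
    DeathRecord(False, True, True),    # treated GVHD plus infection
    DeathRecord(False, False, True),   # infection only
    DeathRecord(False, False, False),  # e.g., secondary malignancy
]
for rec in examples:
    print(cause_of_death(rec))
```

Ordering the checks by priority guarantees that each death is counted exactly once, which is what allows the categories to be reported as mutually exclusive percentages.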

Statistical analysis

Transplant-related characteristics for the TBI and non-TBI groups are presented as median and interquartile range (IQR) for continuous variables and as absolute and relative frequencies for categorical variables. The Mann–Whitney U-test was used for comparisons of continuous variables, and the chi-square test of independence for categorical variables. The time-to-event endpoints CIR, NRM, grade II–IV aGVHD, and cGVHD were analyzed using uni- and multivariable Fine and Gray proportional hazard regression models to account for the respective competing events. The proportional hazard assumption of the Fine and Gray models was tested using rescaled Schoenfeld-type residuals. PFS and OS were analyzed by uni- and multivariable Cox proportional hazard regression models. Hazard ratios (HRs) with 95% confidence intervals (95% CIs) are presented as effect estimates. Cumulative incidence functions at fixed time points were compared using the method with Gaynor’s variance proposed by Chen et al. [10]. Median follow-up time was estimated by the reverse Kaplan–Meier method. All P-values were two-sided, and P-values < 0.05 were considered significant. Statistical analysis was performed using SPSS 26.0 (SPSS Inc., Chicago, IL, USA) and R, version 4.1.2 (R Core Team. R: a language for statistical computing. 2014. The R Foundation for Statistical Computing, Vienna, Austria).
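To make the competing-risk terminology concrete, the following Python sketch shows a nonparametric cumulative incidence estimate and a reverse Kaplan–Meier median follow-up. It is not the authors' SPSS or R code and does not reproduce the Fine and Gray regression; the function names and the toy data are hypothetical and serve only to illustrate how competing events and censoring enter the estimates.

```python
from fractions import Fraction


def cumulative_incidence(times, events, cause):
    """Cumulative incidence of `cause` in the presence of competing events.

    events: 0 = censored, any other integer = event type.
    Returns a list of (time, cumulative incidence) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0          # probability of no event of any type so far
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j, d_cause, d_any = i, 0, 0
        while j < len(data) and data[j][0] == t:
            d_any += data[j][1] != 0
            d_cause += data[j][1] == cause
            j += 1
        if d_any:
            cif += event_free * d_cause / at_risk
            event_free *= 1 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= j - i
        i = j
    return curve


def reverse_km_median(times, died):
    """Median follow-up: Kaplan-Meier with censoring treated as the event."""
    data = sorted(zip(times, died))
    n = len(data)
    remaining = Fraction(1)   # exact arithmetic avoids rounding at 0.5
    i = 0
    while i < n:
        t = data[i][0]
        j, censored = i, 0
        while j < n and data[j][0] == t:
            censored += not data[j][1]
            j += 1
        if censored:
            remaining *= Fraction(n - i - censored, n - i)
        if remaining <= Fraction(1, 2):
            return t
        i = j
    return None               # median not reached


# Hypothetical toy data: 0 = censored, 1 = NRM, 2 = relapse.
times = [1, 2, 3, 4, 5]
events = [1, 2, 0, 1, 0]
nrm_curve = cumulative_incidence(times, events, cause=1)
relapse_curve = cumulative_incidence(times, events, cause=2)
median_follow_up = reverse_km_median([1, 2, 3, 4, 5],
                                     [True, True, False, True, False])
print(nrm_curve, relapse_curve, median_follow_up)
```

In the toy data, the relapse at time 2 removes a patient from the risk set without adding to NRM, which is why a competing-risk estimate differs from one minus a Kaplan–Meier curve that treats relapse as censoring.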

Results

Patient and transplantation characteristics

Table 1 summarizes patient, disease, and transplant characteristics of the included patients (n = 339). Patients received their 1st allo-HSCT for de novo/primary AML (n = 227) or secondary AML (n = 112) after TBI-based conditioning (n = 91) or non-TBI-based conditioning (n = 248) with peripheral blood (n = 314) or bone marrow (n = 25) as stem cell source. Median follow-up time of TBI patients was longer in comparison to patients of the non-TBI group (12.6 years vs. 6.7 years; P < 0.001). Patients of the TBI group were younger at the time of 1st allo-HSCT (median 41.6 years, IQR, 32.2–50.7) compared to patients of the non-TBI group (median 56.8 years, IQR, 48.9–63.0; P < 0.001), and had more de novo/primary AML (75.8% vs. 63.7%; P = 0.038) and a lower HCT-CI (score 0: 44.0% vs. 26.2%; P < 0.001). All conditioning regimens are summarized in Table 2.

Table 1 Transplant characteristics of TBI-based conditioning and non-TBI-based conditioning
Table 2 Conditioning regimens before 1st allogeneic hematopoietic stem cell transplantation

Engraftment and chimerism analysis

The TBI and non-TBI groups achieved an absolute neutrophil count (ANC) of > 500 cells/μL (ANC500) at a median of 17.0 days (IQR, 14.0–20.0) and 16.0 days (IQR, 14.0–19.7), respectively (P = 0.740). Median times to reach a platelet count of 20,000/μL (PLT20,000) were 16.0 days (IQR, 13.0–21.0) for the TBI group and 18.0 days (IQR, 14.0–24.0) for the non-TBI group (P = 0.082). A total of 4 patients had primary graft failure: 2 patients (0.8%) of the non-TBI group and 2 patients (2.2%) of the TBI group (P = 0.293). One patient (1.1%) of the TBI group and 5 patients (2.0%) of the non-TBI group died before day 28 (P = 1.000) and were not evaluable for chimerism analysis on day 28. The remaining patients were analyzed for chimerism on day 28. The TBI group and non-TBI group showed no difference in full donor chimerism on day 28 (TBI 90.9%, non-TBI 93.4%; P = 0.475).
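The graft-failure comparison can be retraced from the counts given above (2 of 91 in the TBI group vs. 2 of 248 in the non-TBI group). The following Python sketch implements a Pearson chi-square test for a 2 × 2 table without continuity correction; whether a correction or an exact test was applied to sparse tables is not stated in the text, so this choice is an assumption, and the function name is illustrative. With these counts the sketch gives a P-value of roughly 0.29, in line with the reported P = 0.293.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout:
        group 1: a events, b non-events
        group 2: c events, d non-events
    Returns (chi-square statistic, two-sided P-value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Primary graft failure: 2 of 91 (TBI) vs. 2 of 248 (non-TBI).
statistic, p_value = chi_square_2x2(2, 91 - 2, 2, 248 - 2)
print(f"chi-square = {statistic:.3f}, P = {p_value:.3f}")
```

With expected cell counts this small, an exact test is often preferred, so the agreement here should be read as a consistency check rather than as a recommendation of the method.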

Time-to-event analyses for all endpoints

The causes of death (including mortality caused by SMs) after TBI-based conditioning and non-TBI-based conditioning are listed in Table 3. Relapse of AML was the most frequent cause of death for the overall population (47.6%), followed by NRM-GVHD (22.2%) and NRM-infectious deaths (19.0%). Two patients died due to SMs after TBI-based conditioning and 3 patients after non-TBI-based conditioning (Table 3).

Table 3 Causes of death including mortality caused by secondary malignancies after TBI-based conditioning and non-TBI-based conditioning

Cumulative incidence rates of clinical outcomes comparing TBI vs. non-TBI on specific time points are shown in Table 4.

Table 4 Cumulative incidence rates of clinical outcomes on specific time points

Figure 1 shows the estimates of the cumulative incidences of NRM and relapse (CIR) in a competing risk setting for both treatment groups (TBI vs. non-TBI).

Fig. 1 Transplantation outcomes after TBI-based conditioning and non-TBI-based conditioning (FBM, FTM, FM): estimates of the cumulative incidences of non-relapse mortality (NRM, solid line) and relapse (CIR, dotted line) in a competing risk setting (TBI vs. non-TBI)

Figures 2 and 3 show the Kaplan–Meier estimates of OS and PFS, respectively, for both treatment groups (TBI vs. non-TBI). OS and PFS were similar in both treatment groups.

Fig. 2 Transplantation outcomes after TBI-based conditioning and non-TBI-based conditioning (FBM, FTM, FM): Kaplan–Meier estimates of overall survival (OS) by treatment group (TBI vs. non-TBI)

Fig. 3 Transplantation outcomes after TBI-based conditioning and non-TBI-based conditioning (FBM, FTM, FM): Kaplan–Meier estimates of progression-free survival (PFS) by treatment group (TBI vs. non-TBI)

Table 5 summarizes the results of the univariable and multivariable analyses of clinical outcomes. TBI was not a risk factor for NRM, CIR, PFS, or OS in the multivariable regression models. Adverse ELN risk stratification translated into a higher CIR (HR, 2.48; 95% CI, 1.21–5.08) and worse PFS (HR, 2.10; 95% CI, 1.33–3.33) and OS (HR, 1.81; 95% CI, 1.12–2.91) compared to favorable ELN risk stratification. Furthermore, advanced disease status (> second complete remission, CR2) negatively affected CIR (HR, 1.99; 95% CI, 1.18–3.36), NRM (HR, 2.52; 95% CI, 1.46–4.35), PFS (HR, 2.95; 95% CI, 2.04–4.26), and OS (HR, 3.61; 95% CI, 2.44–5.33) compared to transplantation in first complete remission (CR1). Older patient age at the time of 1st allo-HSCT translated into a higher risk of NRM (HR, 1.03; 95% CI, 1.01–1.06) and worse PFS (HR, 1.02; 95% CI, 1.00–1.03) and OS (HR, 1.02; 95% CI, 1.00–1.04). Patients with HCT-CI scores of 3 had worse PFS (HR, 1.63; 95% CI, 1.13–2.35) and OS (HR, 1.49; 95% CI, 1.01–2.18) compared to patients with HCT-CI scores of 0. A female donor for a male recipient was a poor prognostic factor for NRM (HR, 2.15; 95% CI, 1.25–3.72) compared to other gender combinations. Patients with a Karnofsky performance status ≥ 80 had better OS (HR, 0.60; 95% CI, 0.40–0.90) and PFS (HR, 0.55; 95% CI, 0.37–0.82) compared to patients with a Karnofsky performance status < 80 (Table 5).
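The hazard ratios above can be read more easily with two small calculations, sketched here in Python. The sketch assumes that the confidence intervals are Wald-type intervals that are symmetric on the log scale and that the age effect is expressed per year of age; neither is stated explicitly in the text. Function names are illustrative.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Recover the log-scale standard error and the CI midpoint.

    For a Wald-type interval, the midpoint on the log scale (the geometric
    mean of the limits) should be close to the reported hazard ratio.
    """
    standard_error = (math.log(upper) - math.log(lower)) / (2 * z)
    midpoint = math.sqrt(lower * upper)
    return standard_error, midpoint


def scale_hazard_ratio(hr_per_unit, units):
    """Hazard ratio for a difference of `units` in a continuous covariate."""
    return hr_per_unit ** units


# Adverse vs. favorable ELN risk, CIR: HR 2.48 (95% CI, 1.21-5.08).
se_cir, mid_cir = log_scale_summary(1.21, 5.08)
print(f"SE(log HR) = {se_cir:.3f}, CI midpoint = {mid_cir:.2f}")

# Age and NRM: HR 1.03, assumed to be per year of age.
hr_10_years = scale_hazard_ratio(1.03, 10)
print(f"HR for a 10-year age difference = {hr_10_years:.2f}")
```

Under these assumptions, the geometric mean of the CIR interval limits is about 2.48, matching the reported hazard ratio, and an NRM hazard ratio of 1.03 per year corresponds to roughly 1.34 for a 10-year age difference.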

Table 5 Univariable and multivariable analysis of clinical outcomes

TBI-based conditioning was not a risk factor for grade II–IV aGVHD or cGVHD. Patient age at the time of transplantation had a significant impact on cGVHD (HR, 0.98; 95% CI, 0.96–0.99). Advanced disease at the time of 1st transplantation (> CR2, refractory/active AML) was associated with a lower risk of cGVHD (HR, 0.32; 95% CI, 0.17–0.61), but a higher risk of grade II–IV aGVHD (HR, 1.67; 95% CI, 1.06–2.62) compared to transplantation in CR1 (Table 5).

Discussion

This retrospective study analyzed the long-term outcome in patients after TBI-based and non-TBI-based conditioning, focusing on AML patients and 1st allo-HSCT. The cumulative incidences of 2-year and 5-year NRM after TBI-based conditioning were 15% and 17%, respectively. Similar NRM after TBI-based conditioning has been observed in some multicenter studies of TBI-containing regimens [11, 12]. Our findings indicate that long-term NRM is low after modern TBI-based conditioning and that long-term outcome appears to be identical to that after non-TBI-based regimens. TBI-based conditioning was not a risk factor for NRM, including mortality caused by secondary malignancies, which is one of the major concerns in long-term survivors after allo-HSCT. Our results are consistent with those of Morsink et al. [13], who compared high-dose (HD)-TBI-based (TBI 12 Gy or TBI 13.2 Gy) and non-HD-TBI-based MAC (busulfan/cyclophosphamide, busulfan/fludarabine, treosulfan/fludarabine ± TBI 2 Gy) among adults with AML who underwent a first allo-HSCT in first or second morphologic remission. HD-TBI was not associated with different outcomes (relapse, RFS, OS, and NRM) compared to non-HD-TBI conditioning.

TBI-based and non-TBI-based regimens had similar cumulative incidences of aGVHD and cGVHD; thus, TBI was not a risk factor for GVHD. Our results show that the addition of ATG had no negative effects on CIR, NRM, and OS in the multivariable model [14]. In summary, relapse of AML remains the prime cause of transplant failure independent of the conditioning regimen.

Our study was underpowered to answer the question of whether patients had disadvantages regarding cataracts, thyroid diseases, or pulmonary complications after TBI-based conditioning, or to draw a firm conclusion of the different TBI-based regimens (8 vs. 12 Gy) regarding efficacy. Nevertheless, the multivariable analysis did not associate TBI-based regimens with any of the outcome variables analyzed.

Unfortunately, the literature shows considerable variability in planning, prescription, and treatment with TBI [15]. Variations involve treatment techniques, methods of fractionation, dose rates, methods of dosimetry, and lung shielding. Dose rates can vary from 2.25 to 37.5 cGy/min and photon energy from 6 to 25 MV [15]. Overall, literature data support the use of lung shielding and dose rates of 7.5 cGy/min or less rather than 15 cGy/min, as well as twice-daily fractionation, to reduce pulmonary complications and toxicity to normal tissue [16, 17]. Reasons for our favorable long-term NRM after TBI-based conditioning may be the superior lung shielding for doses > 8 Gy and the consistent average dose rate of 4 cGy/min to the total body, as well as the twice-daily fractionation. However, the improved NRM is not solely based on optimized TBI technologies, but also results from a selection bias, as TBI-conditioned patients were overall younger and in better health. This risk-based patient selection by the transplant physicians was defined in institutional guidelines and was in line with recommendations and clinical practice at most transplant centers. New TBI technologies, such as total marrow irradiation (TMI) in combination with volumetric modulated arc therapy (VMAT), may further improve results by delivering targeted forms of TBI [18, 19].

This study is limited by its retrospective nature and the comparatively small number of patients conditioned with TBI. Moreover, the selection bias to treat younger patients with fewer comorbidities and high-risk cytogenetics with TBI, and older patients with non-TBI-based regimens, prohibits matched pair analyses. The primary strength of the present study is the consistent delivery of TBI over 20 years with no major variations of other variables. Additionally, data completeness was 100% through active monitoring of all transplanted patients.

Conclusions

The findings indicate that long-term NRM is low after modern TBI-based conditioning and that outcome appears to be identical to that after non-TBI-based regimens. Therefore, TBI-based conditioning can be considered part of the standard of care for AML patients eligible for a 1st allo-HSCT.