A Comprehensive Model to Predict Severe Acute Graft-Versus-Host Disease after Haploidentical Hematopoietic Stem Cell Transplantation


Background: Acute graft-versus-host disease (aGVHD) remains the major cause of early mortality after haploidentical related donor (HID) hematopoietic stem cell transplantation (HSCT). We aimed to establish a comprehensive model that could predict severe aGVHD after HID HSCT. Methods: A total of 470 consecutive acute leukemia patients receiving HID HSCT according to the protocol registered at https://clinicaltrials.gov (NCT03756675) were enrolled; 70% of them (n = 335) were randomly selected as the training cohort and the remaining 30% (n = 135) were used as the validation cohort. Results: The equation was as follows: Probability (grade III-IV aGVHD) = 1/[1 + exp(–Y)], where Y = –0.0288 × (age) + 0.7965 × (gender) + 0.8371 × (CD3+/CD14+ cells ratio in graft) + 0.5829 × (donor/recipient relation) – 0.0089 × (CD8+ cell counts in graft) – 2.9046. A probability threshold of 0.057392 separated patients into high- and low-risk groups. The 100-day cumulative incidence of grade III-IV aGVHD in the low- and high-risk groups was 4.1% (95%CI, 1.9%–6.3%) versus 12.8% (95%CI, 7.4%–18.2%) (P = 0.001), 3.2% (95%CI, 1.2%–5.1%) versus 10.6% (95%CI, 4.7%–16.5%) (P = 0.006), and 6.1% (95%CI, 1.3%–10.9%) versus 19.4% (95%CI, 6.3%–32.5%) (P = 0.017) in the total, training, and validation cohorts, respectively. The rates of grade III-IV skin and gut aGVHD in the high-risk group were both significantly higher than those in the low-risk group. This model could also predict grade II-IV and grade I-IV aGVHD. Conclusions: We established a model that could predict the development of severe aGVHD in HID HSCT recipients.


Introduction
Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is the most important curative method for acute leukemia (AL) and can significantly improve long-term survival [1,2]. Human leukocyte antigen (HLA) haploidentical related donors (HIDs) have become one of the most important donor sources, accounting for 42% of allo-HSCT from family donors in Europe [3] and 60% of all allo-HSCT in China [4].
Although many strategies (e.g., antithymocyte globulin [ATG] and post-transplant cyclophosphamide [PTCy]) are used to prevent acute graft-versus-host disease (aGVHD), it still cannot be fully avoided [5]. Only half of aGVHD patients achieve durable responses to initial corticosteroid therapy [6]; there is no standard therapy for steroid-refractory aGVHD, and survival among these patients is poor [7]. Thus, severe aGVHD remains the major cause of early mortality after HID HSCT [8][9][10]. An early-warning method for severe aGVHD could support risk-stratification-directed prophylaxis for aGVHD and significantly improve the survival of patients receiving HID HSCT.
Thus, in the present study, we aimed to establish a comprehensive model that could predict severe aGVHD after HID HSCT.

Study Design
Consecutive AL patients receiving HID HSCT between January 21, 2020 and May 31, 2021 at Peking University, Institute of Hematology (PUIH) were enrolled. The end point of the last follow-up for all survivors was November 11, 2021. A total of 67 patients had been previously reported by Ma et al. [20], and all of them were followed up further. All patients were treated according to the protocol registered at https://clinicaltrials.gov (NCT03756675). Informed consent was obtained from all patients or their guardians. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Institutional Review Board of Peking University People's Hospital.

Transplant regimens
The major conditioning regimen consisted of cytarabine, busulfan, cyclophosphamide, and semustine [21,22]. Twelve patients received a total body irradiation (TBI)-based conditioning regimen. G-PB harvests were infused into the recipients on the day of collection [20]. ATG, cyclosporine A, mycophenolate mofetil, and short-term methotrexate were administered to prevent GVHD. In particular, patients with collateral related donors (CRDs) or maternal donors (MDs) could receive low-dose cyclophosphamide after transplantation in addition to ATG for GVHD prophylaxis (Supplementary methods) [23].

Evaluation of graft composition
The methods for graft composition evaluation are described in the Supplementary methods [18, 24].

Building machine learning models
Our method consisted of three steps: selecting features, building models, and finding the optimal threshold (Fig. 1 and Supplementary methods).

Backward feature selection strategy
We randomly selected 70% of the entire population (n = 335) as the training cohort; the remaining 30% were used as the validation cohort (n = 135). For the primary outcome (i.e., grade III-IV aGVHD), the model building steps were performed in the training cohort and validated in the validation cohort. The sensitivity, specificity, area under the curve score, and accuracy score were calculated in both the training and validation cohorts.
We used feature selection techniques to select the predictive variables (Supplementary methods) [28]. This reduced the complexity of the machine learning model and improved its generalizability. Age and gender were set as obligatory variables in the machine learning model. Among the other variables, we selected the three most significant using a backward feature selection strategy. In detail, we started with all variables, including age and gender. At each iteration, we removed the least significant variable (the variable with the highest p-value), never removing age or gender. In addition to the variables involved, we added a constant term to make the feature selection more robust. The selection was implemented with generalized linear models with a binomial family distribution in statsmodels v0.13.0, using Python 3.8 on the anaconda3 development platform [29].
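The backward elimination loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function names `backward_select` and `glm_binomial_pvalues`, the column names, and the use of a pandas DataFrame are hypothetical. The statsmodels calls reflect the library named in the text, but the exact invocation used in the study is not reported.

```python
from typing import Callable, Dict, List, Sequence

# A p-value provider: given the column names in the current model,
# return a mapping from column name to p-value.
PValueFn = Callable[[List[str]], Dict[str, float]]


def backward_select(candidates: Sequence[str], forced: Sequence[str],
                    fit_pvalues: PValueFn, n_keep: int = 3) -> List[str]:
    """Drop the least significant non-forced variable until n_keep remain."""
    selected = list(candidates)
    while len(selected) > n_keep:
        pvals = fit_pvalues(list(forced) + selected)
        # Forced variables (here: age and gender) are never eligible for removal.
        worst = max(selected, key=lambda name: pvals[name])
        selected.remove(worst)
    return list(forced) + selected


def glm_binomial_pvalues(df, outcome: str) -> PValueFn:
    """Build a p-value provider from a binomial GLM with a constant term.

    Assumption: `df` is a pandas DataFrame with numeric predictor columns and
    a 0/1 outcome column. statsmodels is imported only when this is called.
    """
    import statsmodels.api as sm

    def fit(columns: List[str]) -> Dict[str, float]:
        design = sm.add_constant(df[columns])  # adds the constant term
        result = sm.GLM(df[outcome], design,
                        family=sm.families.Binomial()).fit()
        return result.pvalues.to_dict()

    return fit
```

With a data frame `df` holding the candidate variables and a 0/1 outcome column (both names hypothetical), a call might look like `backward_select(candidates, ["age", "gender"], glm_binomial_pvalues(df, "agvhd_grade_3_4"))`. Passing the p-value provider as an argument keeps the elimination logic separate from the model-fitting library.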

Building models
We used generalized linear models with a binomial family distribution to implement the logistic regression models; the two formulations are equivalent. In addition to the selected variables, we added a constant term to the prediction model to strengthen it. We used statsmodels v0.13.0 with Python 3.8 on the anaconda3 development platform to build the models. The model parameters were left at their defaults [30][31][32].

Finding the optimal threshold
The logistic regression model produced values between 0 and 1, which could be treated as the probability of a positive prediction. We needed to determine the threshold that separates positive predictions (1) from negative predictions (0). In detail, we drew receiver operating characteristic (ROC) curves [33] and calculated the geometric mean (g-mean) for each threshold [34]. The best threshold corresponded to the largest g-mean. The g-mean was calculated as sqrt(tpr × (1 – fpr)), where tpr and fpr represent the true positive rate and the false positive rate, respectively, at a given threshold.
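The threshold search can be illustrated with a short sketch. It assumes that a case is called positive when its predicted probability is greater than or equal to the threshold and that each distinct predicted probability is tried as a candidate threshold; neither convention is stated in the text. The function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def roc_points(y_true: Sequence[int],
               scores: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, tpr, fpr) for each distinct score used as a cut-off.

    Assumption: a case is predicted positive when score >= threshold.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((thr, tp / n_pos, fp / n_neg))
    return points


def best_gmean_threshold(y_true: Sequence[int],
                         scores: Sequence[float]) -> Tuple[float, float]:
    """Pick the threshold with the largest g-mean = sqrt(tpr * (1 - fpr))."""
    best_thr, best_g = None, -1.0
    for thr, tpr, fpr in roc_points(y_true, scores):
        g = math.sqrt(tpr * (1.0 - fpr))
        if g > best_g:  # ties keep the higher threshold, which is seen first
            best_thr, best_g = thr, g
    return best_thr, best_g


# Toy data, for illustration only (not study data).
labels = [0, 0, 1, 0, 1, 1, 1]
probs = [0.02, 0.04, 0.05, 0.06, 0.08, 0.10, 0.12]
threshold, gmean = best_gmean_threshold(labels, probs)
```

Because the g-mean multiplies sensitivity by specificity, it penalizes thresholds that do well on only one of the two, which matters when positive cases are rare.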

Evaluation of the model
ROC-AUC was defined as the area under the curve of the true positive rate versus the false positive rate at thresholds ranging from zero to one. The confusion matrix is a summary table of predictions; in this paper it was a two-by-two table. The diagonal cells show the counts of correct predictions, and the off-diagonal cells show the counts of incorrect predictions. We also normalized the counts by the number of cases with each True Label (Outcome) or with each Predicted Label (Prediction). To better visualize the matrix, we colored the values using the Blues colormap.
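The two normalizations of the confusion matrix can be sketched as follows. The function names and the toy labels are hypothetical and the data are not from the study. Rows are taken to be true labels and columns predicted labels, which is an assumed layout.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> List[List[int]]:
    """2x2 counts: rows are true labels (0, 1), columns are predictions (0, 1)."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def normalize(matrix: List[List[int]], by: str = "true") -> List[List[float]]:
    """Divide counts by row totals (by='true') or column totals (by='pred')."""
    if by == "true":
        totals = [sum(row) for row in matrix]
        return [[c / totals[i] if totals[i] else 0.0 for c in row]
                for i, row in enumerate(matrix)]
    if by == "pred":
        totals = [matrix[0][j] + matrix[1][j] for j in range(2)]
        return [[row[j] / totals[j] if totals[j] else 0.0 for j in range(2)]
                for row in matrix]
    raise ValueError("by must be 'true' or 'pred'")


# Toy labels, for illustration only.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
counts = confusion_matrix(y_true, y_pred)
by_true = normalize(counts, by="true")   # diagonal holds specificity and sensitivity
by_pred = normalize(counts, by="pred")   # diagonal holds negative and positive predictive value
accuracy = (counts[0][0] + counts[1][1]) / len(y_true)
```

Normalizing by true label puts specificity and sensitivity on the diagonal, whereas normalizing by predicted label puts the negative and positive predictive values there.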

Statistical methods
In the present study, the primary outcome was grade III to IV aGVHD. The secondary outcomes included grade II to IV aGVHD, grade I to IV aGVHD, relapse, non-relapse mortality (NRM), leukemia-free survival (LFS), and overall survival (OS).
The Mann-Whitney U test was used to compare continuous variables, and the χ² and Fisher's exact tests were used for categorical variables. The Kaplan-Meier method was used to estimate the probability of LFS and OS. Competing risk analyses were performed to calculate the cumulative incidence of aGVHD, relapse, and NRM [35]. Testing was two-sided at the P < 0.05 level. Statistical analysis was performed with SPSS 22.0 software (SPSS, Chicago, IL) and R software (version 4.0.0) (http://www.r-project.org).
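To make the competing-risk calculation concrete, the sketch below implements the standard nonparametric cumulative incidence estimator. It is an assumption that this is the estimator behind the analyses cited above; the text reports only that SPSS and R were used. Python is used here only for consistency with the modelling code, and the function name, event coding, and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def cumulative_incidence(times: Sequence[float], events: Sequence[int],
                         cause: int) -> List[Tuple[float, float]]:
    """Cumulative incidence of `cause` in the presence of competing events.

    events: 0 = censored, any positive integer = event type.
    Returns (time, cumulative incidence) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before this time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (tt, e) in data[i:] if tt == t]
        d_cause = sum(1 for e in tied if e == cause)
        d_any = sum(1 for e in tied if e != 0)
        if d_any:
            # Increment: event-free probability just before t times the
            # cause-specific hazard at t.
            cif += event_free * d_cause / at_risk
            event_free *= 1.0 - d_any / at_risk
            curve.append((t, cif))
        at_risk -= len(tied)
        i += len(tied)
    return curve


# Toy data: 1 = event of interest, 2 = competing event, 0 = censored.
days = [5, 10, 10, 20, 30, 40]
status = [1, 2, 0, 1, 0, 1]
curve = cumulative_incidence(days, status, cause=1)
```

Unlike one minus the Kaplan-Meier estimate, this estimator does not treat competing events as censoring, so the cumulative incidences of all event types cannot sum to more than one.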

Patient characteristics
A total of 470 patients were enrolled, and the characteristics were comparable between the training and validation cohorts (Table 1). All patients achieved neutrophil engraftment, and the median time from HSCT to neutrophil engraftment was 12 days (range, 9-28 days). Four hundred and fifty-eight (97.4%) patients achieved platelet engraftment, and the median time from HSCT to platelet engraftment was 13 days (range, 7-144 days).
In the training cohort, the sensitivity, specificity, area under the curve score, and accuracy score were 0.632, 0.680, 0.685, and 0.678, respectively. The ROC curve for the model and the confusion matrix are shown in Fig. 2A and Table S2. In the validation cohort, the sensitivity, specificity, area under the curve score, and accuracy score were 0.500, 0.760, 0.673, and 0.733, respectively. The ROC curve for the model and the confusion matrix are shown in Fig. 2B and Table S3.
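The model evaluated here is the logistic equation reported in the Abstract, and applying it to a single patient can be sketched as follows. The coefficients, intercept, and threshold are taken from the Abstract. Everything else is an assumption made for illustration: gender and donor/recipient relation are treated as 0/1 indicators whose coding is not given in this text, the unit of the CD8+ cell count is not given, and a probability equal to the threshold is assigned to the high-risk group. The function and variable names are hypothetical.

```python
import math

# Coefficients, intercept, and threshold as reported in the Abstract.
COEF = {
    "age": -0.0288,
    "gender": 0.7965,
    "cd3_cd14_ratio": 0.8371,
    "donor_relation": 0.5829,
    "cd8_count": -0.0089,
}
INTERCEPT = -2.9046
THRESHOLD = 0.057392


def severe_agvhd_probability(age: float, gender: int, cd3_cd14_ratio: float,
                             donor_relation: int, cd8_count: float) -> float:
    """Probability of grade III-IV aGVHD = 1 / (1 + exp(-Y)).

    Assumption: gender and donor_relation are 0/1 indicators and cd8_count is
    in the unit used in the study; neither coding nor unit is stated here.
    """
    y = (INTERCEPT
         + COEF["age"] * age
         + COEF["gender"] * gender
         + COEF["cd3_cd14_ratio"] * cd3_cd14_ratio
         + COEF["donor_relation"] * donor_relation
         + COEF["cd8_count"] * cd8_count)
    return 1.0 / (1.0 + math.exp(-y))


def risk_group(probability: float) -> str:
    """Assumption: probabilities at or above the threshold count as high risk."""
    return "high" if probability >= THRESHOLD else "low"


# Hypothetical inputs, for illustration only.
p = severe_agvhd_probability(age=30, gender=1, cd3_cd14_ratio=1.0,
                             donor_relation=0, cd8_count=50.0)
group = risk_group(p)
```

The threshold is far below 0.5 because severe aGVHD is uncommon in this cohort, so a cut-off chosen to balance sensitivity and specificity sits close to the overall event rate.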

Validation of the prediction model for other clinical outcomes after HSCT
In the total population, the probabilities of relapse, NRM, LFS, and OS at 100 days after HID HSCT were all comparable between the low- and high-risk groups (Fig. S2A-D).

Discussion
In the present study, we established a prediction model for grade III to IV aGVHD that included patient age, gender, donor/recipient relation, CD8+ T cell count, and CD3+/CD14+ cells ratio in the graft in the training cohort, and it was verified in the validation and total cohorts. To the best of our knowledge, this is the first comprehensive model that can effectively predict severe aGVHD in allo-HSCT recipients.
Donor/recipient relation is an important risk factor for aGVHD after HID HSCT [36]. Wang et al. [13] reported that MDs carried a higher risk of aGVHD among immediate relative donors. Mo et al. [14] reported that the risk of aGVHD was comparable between the CRD and MD groups and was significantly higher in both than in the paternal donor group. Wang et al. [23] reported that low-dose PTCy combined with ATG could help to decrease the incidence of grade III to IV aGVHD in HID HSCT recipients with CRDs and MDs to a level comparable with that of HID HSCT recipients with immediate relative donors other than MDs [37]. However, these studies included patients with myelodysplastic syndromes and chronic myeloid leukemia, which differs from the present study. We observed that in the disease-specific population of patients with AL, although some patients received low-dose PTCy in addition to ATG for GVHD prophylaxis, CRDs and MDs were still associated with the development of severe aGVHD.
We observed that the CD3+/CD14+ cells ratio in the graft was another important variable in the model. T cells play a critical role in the pathogenesis of aGVHD [38], and depleting T lymphocytes could significantly reduce the risk of GVHD [39,40]. Czerw et al. [15] reported that a higher CD3+ cell count in the graft was associated with an increased risk of aGVHD. On the other hand, the count of CD14+ cells, which mainly represent monocytes in the graft, might also be associated with aGVHD [41]. Similarly, Liu et al. [19] observed that the CD3+/CD14+ cells ratio in G-PB could predict aGVHD in patients receiving HID HSCT.
The CD8+ T cell count in the graft was also included in the prediction model. CD8+ T cells are thought to be critical for aGVHD pathogenesis [38, 42,43]. CD8+ T cells were more abundant than CD4+ T cells in the blood of mice after aGVHD induction, and the severity of aGVHD was associated with the infiltration of CD8+ T cells in target tissues [44]. Several studies reported that the CD8+ T cell count in the graft could predict aGVHD in patients receiving ISD HSCT [16], URD HSCT [45], and umbilical cord blood transplantation [46].
We observed that our prediction model was associated with grade III to IV and grade II to IV gut aGVHD after HID HSCT, which suggests that routine GVHD prophylaxis methods were not sufficient to prevent severe gut aGVHD in high-risk patients. Severe gut aGVHD is difficult to treat and is the greatest cause of GVHD-related mortality [47]. Thus, our prediction model could help to direct more intense prophylaxis for gut aGVHD in high-risk patients.
Our comprehensive model could predict severe aGVHD; however, it was not associated with 100-day mortality or survival. Several studies observed a significantly higher incidence of NRM and lower OS and LFS in patients with severe aGVHD [48][49][50][51][52]. In contrast, the 100-day cumulative incidence of NRM after HID HSCT was only 1.5% in the present study. This may be because PUIH is the largest transplant center in China and has rich experience in GVHD therapy [53][54][55]. Thus, GVHD-related death was not the leading cause of death at PUIH [56]. This may not reflect the level of GVHD therapy in other transplant centers; for example, Deng et al. [57] reported that the mortality of patients with grade III to IV aGVHD was as high as 81.0% in their hospital in China. In addition, it does not alter the fact that patients with severe aGVHD experience great pain and economic burden.
The present study had some limitations. Firstly, the model was not associated with the development of grade III to IV liver aGVHD after HID HSCT, which might be due to the small number of severe liver aGVHD cases in the present study. However, we observed that the rate of grade I to IV liver aGVHD in the high-risk group was higher than that in the low-risk group. Secondly, although we verified the model successfully in the validation cohort, this was a single-center study and the validation cohort was relatively small. Thus, the model should be further evaluated in independent cohorts in multicenter studies. Lastly, we did not monitor plasma cytokines (e.g., interleukin [IL]-2) and biomarkers (e.g., ST2, REG3α, TNFR1, and IL-2Rα) [58, 59], which may further improve the efficacy of our prediction model.

Conclusions
We established a comprehensive model that could predict the development of severe aGVHD in HID HSCT recipients. This was the first prediction model for severe aGVHD that can be readily applied in practice; it can help to provide risk-stratification-directed aGVHD prophylaxis and may further decrease the risk of severe aGVHD in HID HSCT recipients. In the future, prospective, multicenter studies can further confirm the efficacy of our prediction model.

Ethics approval
The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Institutional Review Board of Peking University People's Hospital.

Consent to Participate
Informed consent was obtained from all individual participants or their guardians included in the study.

Consent to Publish
Not applicable.

Availability of data and materials
The datasets generated during the analysis of the current study are available from the corresponding author on reasonable request.

Competing Interests
The authors have no relevant financial or non-financial interests to disclose.

Figure: The 100-day cumulative incidence of grade III to IV aGVHD in the low- and high-risk groups in the total (A), training (B), and validation (C) cohorts, and (D) the rates of grade III to IV aGVHD of each organ in the low- and high-risk groups.

Figure: The association between the prediction model and other GVHD endpoints in the total population. (A) The 100-day cumulative incidence of grade II to IV aGVHD in the low- and high-risk groups; (B) the rate of grade II to IV aGVHD of each organ in the low- and high-risk groups; (C) the 100-day cumulative incidence of grade I to IV aGVHD in the low- and high-risk groups; (D) the rate of grade I to IV aGVHD of each organ in the low- and high-risk groups.