Predicting the efficiency of chidamide in patients with angioimmunoblastic T-cell lymphoma using machine learning algorithm

Background Chidamide is a subtype-selective histone deacetylase (HDAC) inhibitor that showed promising results in clinical trials for improving the prognosis of patients with angioimmunoblastic T-cell lymphoma (AITL). However, in real-world settings, reports conflict as to whether chidamide improves overall survival (OS). Therefore, we aimed to develop an interpretable machine learning (ML)–based model to predict the 2-year overall survival of AITL patients based on chidamide usage and baseline features. Methods A total of 183 patients with AITL were randomly divided into a training set and a testing set. We used 5 ML algorithms to build predictive models. The recursive feature elimination (RFE) method was used to filter for the most important features. The ML models were interpreted and the relevance of the selected features was determined using the Shapley additive explanations (SHAP) method and the local interpretable model–agnostic explanation (LIME) algorithm. Results A total of 183 patients with newly diagnosed AITL from 2012 to 2022 from 3 centers in China were enrolled in our study. Seventy-one patients died within 2 years after diagnosis. Five ML models were built based on chidamide usage and 16 baseline features to predict 2-year OS. The CatBoost model was the best predictive model. After RFE screening, the model with 12 variables demonstrated the best performance (AUC = 0.8651). Chidamide usage ranked third among all the variables that correlated with 2-year OS. Conclusion This study demonstrated that the CatBoost model with 12 variables could effectively predict the 2-year OS of AITL patients. Adding chidamide to the treatment regimen was positively correlated with longer OS in AITL patients.


Introduction
Angioimmunoblastic T-cell lymphoma (AITL) is a distinct subtype of peripheral T-cell lymphoma (PTCL) with a poor prognosis (Swerdlow et al., 2016). For AITL patients, the 5-year overall survival (OS) rate was 44% and the progression-free survival (PFS) rate was 32% (Advani et al., 2021). Anthracycline-based chemotherapy regimens are frequently used, yet their effectiveness is limited. Given the unsatisfactory outcomes of traditional treatment, the NCCN Clinical Practice Guidelines in Oncology recommend enrollment in clinical trials as the preferred management strategy (Horwitz et al., 2022). Notably, although some patients had identical staging or prognostic scores that are commonly used to evaluate T-cell lymphoma, their clinical outcomes varied considerably. The differences in prognosis may be due to the heterogeneity of AITL (Zhang et al., 2023). Therefore, novel models that can better stratify patients are required.
Chidamide is a benzamide-type, subtype-selective histone deacetylase (HDAC) inhibitor (Gong et al., 2012). In recent years, chidamide has emerged as a promising treatment for PTCL, especially AITL. In a phase II study of chidamide in relapsed or refractory (r/r) AITL, the overall response rate (ORR) was 50% (Shi et al., 2015). In a multicenter phase II clinical trial combining chidamide with prednisone, etoposide, and thalidomide in untreated AITL, the ORR was 90.2%. The 2-year PFS rate and OS rate were 66.5% and 82.2%, respectively (Wang et al., 2022b). However, in real-world analyses, results conflict as to whether combining chidamide with chemotherapy improves OS compared with chemotherapy alone (Shi et al., 2017; Liu et al., 2021; Wang et al., 2022a). Further evidence is required to clarify the efficacy of chidamide in the real-world setting.
Machine learning (ML) is a key area of artificial intelligence that learns from complex data, using computational methods to identify features useful for prediction (Haug and Drazen, 2023). Compared with conventional generalized linear models, ML based on advanced algorithms is more tolerant with respect to data distribution and completeness, and more flexible in mining the value of data (Elemento et al., 2021). Therefore, ML has been widely used in the medical field in recent years and has developed into a potent tool for physicians making clinical decisions (Radakovich et al., 2020; Haug and Drazen, 2023; Swanson et al., 2023). Hence, the purposes of the present study were to establish ML models to predict the prognosis of AITL and to evaluate the benefit of chidamide in the real-world setting.

Patients
This retrospective multicenter study included patients with newly diagnosed AITL between 2012 and 2022 from three centers in China. The study was approved by the Ethics Committees of the three hospitals.
Patients were treated with chemotherapy, mostly anthracycline-based regimens, with or without chidamide. We followed up patients monthly and recorded the detailed treatment strategy and clinical outcomes. Participants with significant missing data were excluded to maintain the integrity and reliability of our research.

Study design and machine learning algorithms
A total of 17 variables were included to assess their prognostic value, comprising 16 baseline features and "chidamide usage". Among 183 patients, 71 died within 2 years after diagnosis, while 112 were still alive. About 75% (137) of patients were randomly assigned to the training set, while the remaining 46 patients formed the testing set. Five models were built using logistic regression (LR), random forest (RF), light gradient boosting machine (LGBM), extreme gradient boosting (XGBoost), and categorical boosting (CatBoost) to predict 2-year OS. The models were selected based on a comprehensive consideration of performance, interpretability, computational efficiency, and generalizability, as well as support from the literature documenting similar research contexts, to ensure the relevance and robustness of our approach.

Model validation
We explored five different machine learning algorithms and used grid search to identify the best hyperparameter combinations, setting candidate value ranges for each parameter and evaluating each parameter combination with 5-fold cross-validation. The receiver operating characteristic (ROC) curve was used to assess the performance of the different models. The performance of each model was evaluated by calculating the area under the curve (AUC), accuracy, precision, recall, specificity, and F1 score. After comparing the AUC values of the five algorithms, we chose CatBoost as our final model because of its superior performance and high AUC value on the test set. The parameters for the five models are provided in Supplementary Table 1.
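The evaluation metrics listed above can be illustrated with a minimal, library-free sketch. The labels and scores below are hypothetical toy values, not study data, and the 0.5 decision threshold is an assumption; the AUC is computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, specificity, and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died within 2 years, 0 = alive at 2 years.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Unlike the other five metrics, the AUC does not depend on the chosen threshold, which is one reason it is convenient for comparing models.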
We implemented recursive feature elimination (RFE) with the CatBoost algorithm to reduce the number of features and enhance model efficiency. RFE is a model-based feature selection method that systematically reduces the feature set by recursively removing the least significant feature in each iteration. By comparing performance across different feature subsets, we selected the 12 features that contribute most to the model's predictive power. The model demonstrated good performance and robustness on the test set, suggesting that it generalizes to unseen data.
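The elimination loop at the core of RFE can be sketched as follows. This is an illustration only: the feature names and data are hypothetical, and the importance function (absolute difference in group means) is a simple stand-in for the feature importances of a refitted CatBoost model, which the study used.

```python
def recursive_feature_elimination(feature_names, importance_fn, n_keep):
    """Drop the least important feature one at a time until n_keep remain.

    importance_fn(features) is expected to refit on the given features and
    return a dict mapping each feature name to a non-negative importance.
    """
    remaining = list(feature_names)
    eliminated = []  # order of removal, least important first
    while len(remaining) > n_keep:
        importances = importance_fn(remaining)
        weakest = min(remaining, key=lambda name: importances[name])
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated


# Hypothetical toy data: four binary features and a binary outcome.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
data = {
    "strong": [1, 1, 1, 1, 0, 0, 0, 0],
    "medium": [1, 1, 1, 0, 0, 0, 0, 1],
    "weak": [1, 1, 0, 0, 1, 0, 0, 0],
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],
}


def mean_difference_importance(features):
    """Stand-in importance: absolute gap between outcome-group means."""
    importances = {}
    for name in features:
        dead = [v for v, y in zip(data[name], labels) if y == 1]
        alive = [v for v, y in zip(data[name], labels) if y == 0]
        importances[name] = abs(sum(dead) / len(dead) - sum(alive) / len(alive))
    return importances


kept, removed = recursive_feature_elimination(
    list(data), mean_difference_importance, n_keep=2
)
print(kept, removed)
```

In practice the model is retrained at every iteration and the subset size is chosen by comparing a validation metric such as the AUC across subsets, as described above.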

Model interpretation
The predictions produced by the models were interpreted using Shapley additive explanations (SHAP) values. SHAP explains the significance of each variable by computing its marginal contribution averaged over all possible feature orderings. The predictions were also explained using the local interpretable model-agnostic explanation (LIME) algorithm. LIME explains the prediction for a single sample using a local linear approximation of the model's behavior, which can enhance the trustworthiness of the model's explanation.
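To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for one sample by brute-force enumeration of feature coalitions. It is an illustration under stated assumptions: the three-feature linear risk score and its weights are hypothetical, absent features are replaced by a baseline value, and the SHAP library itself relies on faster model-specific algorithms rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def risk_score(v):
    """Hypothetical linear score; the weights are illustrative, not fitted."""
    b_symptom, effusion, chidamide = v
    return 0.20 + 0.30 * b_symptom + 0.25 * effusion - 0.20 * chidamide


x = [1, 1, 1]         # sample to explain
baseline = [0, 0, 0]  # reference sample with all features absent
phi = shapley_values(risk_score, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that gives SHAP its name. A negative value, as for the third feature here, pushes the prediction away from the positive class.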

Statistical analysis
Patients were categorized into two groups based on their 2-year status: alive and dead. Binary variables were compared using either a Fisher exact test or a χ² test. For continuous variables that followed a normal distribution, a Student's t-test was employed, and the data were displayed as means with standard deviations. For variables that did not follow a normal distribution, a Mann-Whitney U test was used, and the data were presented as medians with interquartile ranges (IQRs). The development and validation of the five machine learning algorithms and the statistical analysis were performed using Python (version 3.9.13). Statistical significance was set at p ≤ 0.05.
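As an illustration of the test used for binary variables, the following sketch computes the Pearson χ² statistic for a 2×2 table. The counts are hypothetical, no continuity correction is applied (an assumption, since the paper does not state whether one was used), and the p-value uses the closed form available for one degree of freedom.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability has a closed form.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows = binary feature present/absent,
# columns = died within 2 years / alive at 2 years.
table = [[10, 20], [20, 10]]
statistic, p_value = chi_square_2x2(table)
print(statistic, p_value)
```

When expected cell counts are small, the χ² approximation becomes unreliable, which is the usual reason for switching to the Fisher exact test.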

Patient characteristics
A total of 538 patients were newly diagnosed with AITL between 2012 and 2022 at three centers in China. Following the screening process, we included 183 patients who had a comprehensive baseline examination and long-term follow-up data of at least 24 months (Figure 1). Cohort characteristics are presented in Table 1. The median age was 63 (IQR = 54-68) years and 63.39% were male. AITL patients were categorized into two groups based on 2-year OS. Among 183 patients, 112 had an OS longer than 2 years.

Predicting 2-year OS by baseline features
Our modeling used a set of 17 variables: age, gender, Ann Arbor stage, B symptom, extranodal involvement, ECOG, IPI, rash, edema/serous effusion, hemoglobin (Hb), platelet count (PLT), absolute lymphocyte count (ALC), absolute eosinophil count (AEC), albumin (ALB), globulin (GLB), lactate dehydrogenase (LDH), and chidamide. The patients were divided into a training set for model development and a testing set for model performance evaluation, using stratified random sampling with a ratio of 3:1. The clinical characteristics of the two data sets are presented in Table 2. The p-values for all features between the training set and the test set were greater than 0.05, indicating that the two sets were comparable.
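A stratified 3:1 split of this kind can be sketched as below. The outcome counts (71 deaths, 112 survivors) are taken from the paper; the function name, the seed, and the rounding rule are assumptions, and the paper does not report how many patients of each class ended up in each set.

```python
import random


def stratified_split(labels, train_fraction, seed):
    """Return (train_idx, test_idx), sampling each outcome class separately."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        n_train = round(len(members) * train_fraction)
        train_idx.extend(members[:n_train])
        test_idx.extend(members[n_train:])
    return sorted(train_idx), sorted(test_idx)


# Outcome counts reported in the paper: 71 died within 2 years, 112 were alive.
labels = [1] * 71 + [0] * 112
train_idx, test_idx = stratified_split(labels, train_fraction=0.75, seed=0)

print(len(train_idx), len(test_idx))
```

Allocating 75% of each outcome class to training yields 137 training and 46 testing patients, matching the set sizes reported in the paper, while keeping the proportion of deaths similar in both sets.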
Using all 17 variables, predictive models for 2-year survival were developed with five algorithms: LR, RF, LGBM, XGBoost, and CatBoost. The best predictive performance was observed with CatBoost (training set AUC = 0.8949, testing set AUC = 0.8571). The AUC, accuracy, precision, recall, specificity, and F1 score of the five models in the training and testing data sets are shown in Table 3. The ROC curves of the five models are presented in Figure 2.
The CatBoost model was further optimized by using recursive feature elimination (RFE) to retain the most important features. The feature importance ranking by RFE is presented in Supplementary Figure 1. The optimized CatBoost model used a total of 12 variables: Age, B symptom, Extranodal involvement, ECOG, IPI, Edema/Serous effusion, ALC, AEC, ALB, GLB, LDH, and Chidamide (Figure 3). Table 4 shows the AUC, accuracy, precision, recall, specificity, and F1 score of the optimized CatBoost model.

Interpretation and evaluation of machine learning model
We employed the SHAP approach to quantitatively assess the impact of each feature on the model's predictions (Figure 4). The feature ranking revealed that the five most significant features were B symptom, Edema/Serous effusion, Chidamide, Extranodal involvement, and ECOG. Chidamide usage was negatively correlated with the predicted outcome (OS shorter than 2 years), suggesting that chidamide is associated with improved OS in AITL patients.
Moreover, the LIME algorithm was applied to explain the influence of the different variables in the CatBoost model on the prediction results. Two cases were randomly selected to illustrate the visualized prediction results (Figure 5).

Discussion
AITL is a life-threatening lymphoma with a heterogeneous nature. Current staging systems are far from satisfactory for stratifying AITL patients. Chidamide, an HDAC inhibitor, has emerged in recent years as a promising targeted therapy for PTCL patients. In clinical trials, chidamide remarkably prolonged the OS of AITL patients. However, in real-world analyses, results conflict as to whether combining chidamide with chemotherapy improves OS compared with chemotherapy alone (Shi et al., 2017; Liu et al., 2021; Wang et al., 2022a). Further evidence is required to clarify the efficacy of chidamide in the real-world setting. Using ML algorithms, we demonstrated that chidamide usage was among the most important features that influence 2-year OS. Specifically, in the optimized CatBoost model, chidamide ranked third among all the variables that correlated with OS. In general, our study supports that, in the real-world setting, adding chidamide to the treatment strategy could prolong the OS of AITL patients.
Chidamide was administered as the initial treatment for 68 patients, as second-line treatment for 22 patients, and as maintenance treatment for 43 patients. For patients treated with chemotherapy without chidamide, no substitute was added. Nevertheless, the subset analysis revealed that the difference in response rates between chemotherapy with and without chidamide was not statistically significant in either first-line or second-line treatment, which may reflect the small sample size. Of the 43 patients who received chidamide as maintenance therapy, 24 had available treatment responses both before and after the maintenance period. The treatment responses in the majority of patients (22 cases) were consistent before and after maintenance. It is worth mentioning that two patients showed an improved treatment response following the administration of chidamide. There was no evidence of any progressive disease. Previously, the benefit of chidamide as maintenance therapy in the real-world setting was also reported by Guo et al. (2022).

FIGURE 2
The ROC curves of the five ML models.

FIGURE 3
Using the RFE method to screen the optimal variables in the CatBoost model (A). The ROC curves of the optimized CatBoost model (B).

Frontiers in Pharmacology, frontiersin.org, Zhang et al., 10.3389/fphar.2024.1435284
Consistent with the commonly used prognostic scores in T-cell lymphoma, including the International Prognostic Index (IPI), Prognostic Index for T-cell lymphoma (PIT), International Peripheral T-cell Lymphoma Project score (IPTCLP), and modified Prognostic Index for T-cell lymphoma (mPIT), age, B symptom, Extranodal involvement, ECOG, PLT, and LDH were correlated with OS in our study (Gutiérrez-García et al., 2011). Moreover, ALC was also significantly related to 2-year OS, in agreement with the AITL score established from 282 patients with AITL enrolled between 2006 and 2018 in the international prospective T-cell Project (Advani et al., 2021). Interestingly, we also showed that AEC was associated with prognosis, which, to our knowledge, has not been reported before.
Remarkably, unlike the aforementioned prognostic scores, we showed that Edema/Serous effusion and ALB level were also significantly correlated with prognosis. In the optimized CatBoost model, Edema/Serous effusion was the second most significant variable. Previously, Sun et al. reported the correlation of serous effusion with OS in a cohort of 55 AITL patients (Sun et al., 2021), while Huang et al. reported that ALB <30 g/L was significantly associated with poor prognosis in a cohort of 64 AITL patients (Huang et al., 2020). From a clinical perspective, Edema/Serous effusion is correlated with a reduction in ALB levels. This pair of features warrants more attention in further studies of AITL.
The current investigation has several limitations that should not be ignored. First, this study was conducted retrospectively and had a limited sample size. Additional validation in large cohorts is required to further substantiate the predictive value of the model. Second, the present study identified chidamide as an effective treatment that prolonged the OS of AITL patients. Nevertheless, the treatment regimens employing chidamide varied among patients in this retrospective investigation, including in the timing and duration of therapy. Hence, it is imperative to further validate the efficacy of chidamide in a large prospective cohort within a real-world context. Furthermore, incomplete data resulted in the exclusion of certain potentially influential factors, such as C-reactive protein and beta-2-microglobulin, from the machine learning models. Moreover, with the development of genome and transcriptome analysis, molecular abnormalities that may affect the prognosis of lymphoma have gradually been unveiled (Cortes et al., 2018; Lone et al., 2021; Leca et al., 2023; Zhang et al., 2024). In the age of precision medicine, as sequencing technology becomes increasingly prevalent, it is imperative to incorporate these molecular abnormalities into future prognostic models of AITL.

Conclusion
The CatBoost model developed in our study showed a strong predictive capability for the 2-year OS of patients with AITL. Chidamide, in particular, was one of the most significant factors that influenced OS. Chidamide usage was strongly correlated with an improved outcome. Further validation of the model's predictive value should be conducted using larger cohorts.

FIGURE 4
Attribution of 12 features in the optimized CatBoost model based on the SHAP algorithm. (A) Summary of SHAP analysis on the data set. One dot represents a case in the data set, and the color of a dot indicates the value of the feature. Blue indicates the lowest range and red the highest range. (B) Ranking of feature importance indicated by SHAP. SHAP values show how each feature influences the model's prediction, indicating whether the impact is positive or negative. For example, from the SHAP plot, we can observe that B symptoms make the largest contribution. A higher value of B symptoms corresponds to a positive SHAP value, indicating a positive impact on the prediction and supporting a likelihood of death. Conversely, a lower value of B symptoms results in a negative SHAP value, indicating a negative impact on the prediction and supporting survival.

TABLE 1
Clinical and laboratory data of AITL patients.

TABLE 2
Clinical and laboratory data of AITL patients categorized by training set and testing set.

TABLE 4
Predictive performance of the optimized CatBoost model.