Development of a machine learning-based model to predict major adverse events after surgery for type A aortic dissection complicated by malnutrition

Objective: This study aims to develop a predictive model, based on machine learning (ML) algorithms, for the risk of major adverse events (MAEs) after surgery in patients with acute type A aortic dissection (AAAD) and malnutrition. Methods: We retrospectively collected clinical data from AAAD patients with malnutrition who underwent surgical treatment at our center. Preoperative and intraoperative feature variables were screened using least absolute shrinkage and selection operator (LASSO) regression analysis. We then constructed an ML predictive model based on the random forest (RF) algorithm and further evaluated and interpreted it. Results: Through LASSO regression analysis and univariate analysis, we ultimately selected seven feature variables for modeling. After comparing six different ML models, we confirmed that the RF model demonstrated the best predictive performance in this dataset. We then used the RF algorithm to construct a model predicting the risk of postoperative MAEs in AAAD patients with malnutrition. The test set results indicated that this model has excellent predictive efficacy and clinical applicability. Finally, we employed the Shapley additive explanations (SHAP) method to interpret the model's predictions. Conclusion: We have constructed a risk prediction model for postoperative MAEs in AAAD patients with malnutrition using the RF algorithm and interpreted it with the SHAP method. This model helps clinicians identify patients at high risk of MAEs early, thereby potentially mitigating adverse clinical outcomes associated with malnutrition.


Introduction
Aortic dissection (AD) is an extremely dangerous cardiovascular emergency, especially acute type A aortic dissection (AAAD). Without prompt surgical intervention, its mortality rate increases by approximately 1%-2% per hour, reaching as high as 50% within 48 hours (1-3). Despite significant advances in surgical techniques and life support technologies such as cardiopulmonary bypass in recent years, the prognosis of AAAD remains relatively poor compared to other cardiovascular diseases (4). Therefore, effectively assessing and predicting the risk of postoperative adverse clinical events is crucial before making diagnostic and therapeutic decisions for AAAD patients.
Malnutrition is a common comorbidity upon admission and a significant adverse prognostic factor for cardiovascular diseases (5,6). Research by Shirley suggests a close association between malnutrition and mortality in patients with atrial fibrillation (7). Similarly, Al-Kassou et al. (8) indicate that malnutrition increases the mortality rate following aortic valve surgery. However, research on the relationship between malnutrition and the prognosis of AAAD is currently limited. In addition, no model is available that can effectively predict the clinical outcomes of AAAD patients with concurrent malnutrition following surgical treatment.
With the widespread adoption of artificial intelligence technology, its application in biomedicine is advancing rapidly. Machine learning (ML), a specialized form of artificial intelligence, has been widely used in the diagnosis, treatment, and prognosis of various diseases (9,10). Compared to traditional logistic regression prediction models, ML models demonstrate greater flexibility and accuracy in handling complex, non-linear relationships within data (11-13). This study aims to construct a prognostic model based on ML algorithms to predict the risk of in-hospital major adverse events (MAEs) for AAAD patients with concurrent malnutrition following surgical treatment.

Study population
This study retrospectively analyzed the clinical data of consecutive AAAD patients over 18 years old who underwent surgical treatment at our center from January 2018 to January 2022. The exclusion criteria were as follows: (1) chronic aortic dissection, (2) preoperative comorbidities such as malignant tumors, hematologic disease, or systemic inflammatory diseases, (3) history of previous thoracotomy, (4) significantly incomplete medical history data. This study was approved by the Ethics Committee of Fujian Medical University Union Hospital and complied with the Declaration of Helsinki. Informed consent was waived due to the retrospective nature of the study.

Definition and endpoint
The nutritional risk index (NRI) is a commonly used clinical nutritional assessment tool. In this study, the patients' nutritional status upon admission was assessed using the NRI (14). The NRI was calculated with the formula proposed by Buzby: 1.519 * serum albumin (g/L) + 41.7 * (current body weight [kg]/usual body weight [kg]). Usual body weight was replaced by ideal body weight. For males, ideal body weight was calculated as height (cm) − 100 − ([height (cm) − 150]/4); for females, as height (cm) − 100 − ([height (cm) − 150]/2.5). When the current body weight exceeded the ideal body weight, the ratio of current body weight to ideal body weight was set to 1 (14-16). Taking into account the application of the NRI in other cardiovascular disease studies, we defined patients with NRI < 97.5 upon admission as having malnutrition.
The primary endpoint of this study was the occurrence of MAEs during hospitalization. According to the consensus statement from the international study group for aortic arch surgery on the grading criteria for complications after aortic arch surgery (17), MAEs include: (1) cardiovascular complications (myocardial infarction, malignant arrhythmias, and heart failure requiring support with intra-aortic balloon pump), (2) respiratory complications [acute lung injury, acute respiratory distress syndrome, prolonged mechanical ventilation, reintubation, tracheostomy, or respiratory failure requiring extracorporeal membrane oxygenation (ECMO) therapy], (3) new-onset acute kidney injury (AKI) [serum creatinine level increased more than three times baseline or renal failure requiring continuous renal replacement therapy (CRRT)], (4) gastrointestinal bleeding, (5) wound complications requiring reoperation for hemostasis or further surgical intervention, (6) postoperative death.
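The NRI calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the example patient values are assumptions, while the coefficients, the sex-specific ideal body weight formulas, the capping of the weight ratio at 1, and the 97.5 cutoff are taken from the text.

```python
def ideal_body_weight(height_cm: float, sex: str) -> float:
    """Ideal body weight (kg) from the sex-specific formulas quoted in the text."""
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    divisor = 4.0 if sex == "male" else 2.5
    return height_cm - 100.0 - (height_cm - 150.0) / divisor


def nutritional_risk_index(albumin_g_per_l: float, weight_kg: float,
                           height_cm: float, sex: str) -> float:
    """NRI = 1.519 * albumin (g/L) + 41.7 * (current weight / ideal weight)."""
    ratio = weight_kg / ideal_body_weight(height_cm, sex)
    ratio = min(ratio, 1.0)  # ratio is set to 1 when weight exceeds ideal weight
    return 1.519 * albumin_g_per_l + 41.7 * ratio


def is_malnourished(nri: float, cutoff: float = 97.5) -> bool:
    """Malnutrition is defined in this study as NRI below 97.5 on admission."""
    return nri < cutoff


# Hypothetical patient: male, 170 cm, 58.5 kg, serum albumin 35 g/L.
example_nri = nutritional_risk_index(35.0, 58.5, 170.0, "male")
print(round(example_nri, 3), is_malnourished(example_nri))
```

For this hypothetical patient the ideal body weight is 65 kg, the weight ratio is 0.9, and the resulting NRI falls below the 97.5 cutoff.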

Data collection
Clinical data of AAAD patients were collected through the hospital's electronic medical record system, including demographic information such as gender, age, and body mass index (BMI), as well as past medical history including hypertension, diabetes, and coronary artery disease (CAD). Preoperative comorbidities such as chronic kidney disease (CKD), aortic valve regurgitation, and pericardial effusion were also documented. Preoperative laboratory test results, including leukocyte, neutrophil, monocyte, and lymphocyte counts, among others, were recorded. Intraoperative details such as operation time, cardiopulmonary bypass time, and aortic cross-clamp time were documented. Postoperative clinical outcomes included ICU stay time, mechanical ventilation time, 48-hour thoracic drainage volume, and postoperative complications.

Model training and performance evaluation
All AAAD patients with malnutrition were randomly divided into training and test datasets at a ratio of 8:2. The training dataset was used for modeling, while the test dataset was used for model evaluation. In the training set, LASSO regression analysis was used to select feature variables related to MAEs from the preoperative and intraoperative variables. Models were then constructed using six algorithms: eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Random Forest (RF), Multilayer Perceptron (MLP), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN). To prevent overfitting, we employed 10-fold cross-validation by further dividing the training set into 10 subsets. During each iteration, nine of the subsets were used to train the model, while the remaining subset was used for validation. Subsequently, we performed a grid search for hyperparameter tuning to select the optimal model (Supplementary Table S1). In the final evaluation on the training set data, model performance was assessed along three dimensions: model discrimination, predictive accuracy, and clinical applicability. Model discrimination was quantitatively evaluated through metrics such as the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. Predictive accuracy was assessed by comparing the deviation between predicted and observed probabilities (Supplementary Tables S2, S3). Clinical applicability was judged through decision curve analysis (DCA).
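The 10-fold partitioning and the threshold-based discrimination metrics described above can be illustrated with a minimal sketch that uses only the Python standard library. The function names, the fixed random seed, and the toy labels are assumptions made for illustration; the text does not specify the software calls used in the study's actual pipeline.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal subsets
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


def classification_metrics(y_true, y_pred):
    """Discrimination metrics from binary labels (1 = MAE, 0 = no MAE)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy usage: fold sizes for a set of 246 samples, and metrics on made-up labels.
fold_sizes = [len(val) for _, val in k_fold_indices(246, k=10)]
toy_metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                                     [1, 1, 0, 1, 0, 0, 0, 0])
print(fold_sizes, toy_metrics)
```

In each iteration a candidate model would be fitted on the `train` indices and scored on the `validation` indices, and the scores would then be averaged across the ten folds.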
After determining the optimal model, we further validated its performance using the test set data. Additionally, we constructed calibration curves, DCA curves, and model learning curves for the optimal model. Finally, we used the Shapley additive explanations (SHAP) method to further explain the clinical significance of the model.
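Decision curve analysis compares the net benefit of acting on the model's predictions with that of treating every patient as high risk, across a range of threshold probabilities. The sketch below uses the standard net benefit definition, TP/n − (FP/n) × pt/(1 − pt); the function names and the toy data are assumptions for illustration and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold: float) -> float:
    """Net benefit of classifying patients with predicted risk >= threshold as high risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold: float) -> float:
    """Reference strategy: every patient is treated as high risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical): outcomes and predicted probabilities for five patients.
outcomes = [1, 1, 0, 0, 0]
predicted = [0.9, 0.6, 0.7, 0.2, 0.1]
for pt in (0.25, 0.5, 0.75):
    print(pt, net_benefit(outcomes, predicted, pt), net_benefit_treat_all(outcomes, pt))
```

A decision curve plots these values against the threshold; the "treat none" strategy has a net benefit of zero at every threshold.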

Statistical analysis
All data were analyzed using SPSS 24.0, R 4.2.1, and Python 3.7. Continuous data were expressed as mean ± standard deviation (SD) or median (interquartile range), and analyzed using Student's t-test or the Mann-Whitney U test. Categorical data were presented as frequency or percentage (%), and analyzed using the chi-square test or Fisher's exact test. Initially, all AAAD patients with malnutrition were randomly divided into training and test sets at an 8:2 ratio. In the training set, clinical feature variables were selected through LASSO regression analysis, and models were built using six ML algorithms. The best model was selected and its performance was validated. Finally, the SHAP method was used to explain the model. A two-tailed p < 0.05 was considered statistically significant.

Patient characteristics
A total of 708 patients with AAAD were included in this study, of whom 308 (43.5%) presented with malnutrition upon hospital admission. We compared the postoperative clinical outcomes of patients with normal nutrition and malnutrition (Table 1). Compared to the normal nutrition group, the malnutrition group had greater thoracic drainage volume within 48 hours postoperatively and longer mechanical ventilation time, ICU stay, and postoperative hospital stay. Additionally, the proportion of patients requiring CRRT or ECMO therapy, as well as the occurrence of AKI, permanent neurological dysfunction (PND), and low cardiac output syndrome (LCOS), was higher. The incidence of MAEs during hospitalization (28.25% vs. 18.00%) and the in-hospital mortality rate (12.99% vs. 5.28%) were significantly higher (p < 0.05).
Among AAAD patients with malnutrition, 246 patients (80%) were randomly assigned to the training set, while 62 patients (20%) were assigned to the test set. The workflow diagram of the study is shown in Figure 1. There were no significant differences in demographic data, preoperative comorbidities, or intraoperative conditions between the training and test sets (Table 2).

Feature variable selection
Feature variables related to postoperative MAEs were first screened through LASSO regression analysis. The lambda value with the minimum mean squared error was 0.004, with 35 preoperative and intraoperative feature variables included (Figure 2). Subsequently, in both the training and test sets, univariate analysis was performed to compare preoperative and intraoperative conditions between the MAEs group and the non-MAEs group (Table 3). Finally, seven feature variables were selected as the clinical features for building the ML model: preoperative NRI, preoperative hypertension, preoperative leukocyte, preoperative lymphocyte, preoperative albumin (ALB), preoperative D-dimer, and preoperative C-reactive protein (CRP).

Comparative analysis of multiple models and validation of the optimal model
After selecting the feature variables required for modeling, the XGBoost, LR, RF, MLP, SVM, and KNN algorithms were applied to the training set data, and the discriminative performance of the models was evaluated through AUC. The AUC of all models was validated through 10-fold cross-validation. The final results show that, among all models, the RF model demonstrates the best performance on both the training and validation sets (Figure 3). The AUC of the RF model is 0.963 [95% confidence interval (CI): 0.940-0.986] on the training set and 0.899 (95% CI: 0.815-0.982) on the validation set. The Brier score of the RF model in the calibration curve is 0.122 (95% CI: 0.108-0.136), indicating good calibration. The DCA also illustrates its favorable clinical utility. Moreover, the precision-recall curve results indicate that the RF model has the highest average precision values on both datasets. Therefore, we conclude that the RF model is the optimal model choice for this dataset.
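The two summary statistics reported above, AUC and Brier score, have simple definitions that can be computed directly. The sketch below is illustrative only: the function names and toy data are assumptions, and the confidence intervals reported in the text would require an additional procedure such as bootstrapping, which is not shown here.

```python
def auc_score(y_true, y_prob) -> float:
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    positives = [p for t, p in zip(y_true, y_prob) if t == 1]
    negatives = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def brier_score(y_true, y_prob) -> float:
    """Mean squared difference between predicted probability and observed outcome;
    lower values indicate better-calibrated, more accurate probabilities."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Toy data (hypothetical): four patients, two with MAEs.
outcomes = [1, 1, 0, 0]
predicted = [0.9, 0.4, 0.6, 0.1]
print(auc_score(outcomes, predicted), brier_score(outcomes, predicted))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred cases; rank-based formulas give the same value more efficiently.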
Subsequently, modeling and 10-fold cross-validation were conducted on the test dataset using the RF algorithm (Figure 4). The results indicate that the average AUC on the training set is 0.999, and on the validation set, it is 0.919. Ultimately, the RF model achieves an average AUC of 0.975 on the test set with an accuracy of 88.71%. The calibration curve results reveal a Brier score of 0.078 (95% CI: 0.047-0.119) for the RF model, indicating its good predictive accuracy. Furthermore, the DCA results also demonstrate its favorable clinical utility. The results of the model learning curve indicate a good fit of the RF model on both the training and validation sets.

FIGURE 1 Model construction methodology and study flowchart.

Model explanation and clinical significance analysis
To further elucidate the clinical significance of this model, we explained the prediction process and results of the RF model through the SHAP method. Based on the concept of the Shapley value in cooperative game theory, we quantified the contribution of each feature to the model's output and calculated the SHAP value of each feature variable to assess its impact on a single prediction made by the model.
Figure 5 presents the SHAP summary plot and the ranking of feature variables based on their impact on MAEs. Additionally, we further illustrate the model through two samples from the test dataset: one where the model predicts no postoperative MAEs and none occur [f(x) = 0], and another where the model predicts postoperative MAEs and MAEs actually occur [f(x) = 0.91].
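The Shapley value underlying SHAP can be made concrete with a brute-force sketch: a feature's contribution is its marginal effect on the prediction, averaged over every order in which features can be "switched on". The code below makes several simplifying assumptions that differ from practical SHAP implementations: features are "switched off" by substituting a single baseline value rather than averaging over background data, all orderings are enumerated (feasible only for a handful of features), and `toy_risk`, with its coefficients and input values, is a made-up stand-in rather than the study's RF model.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values of each feature for one prediction.

    Features absent from a coalition take their baseline value; each feature's
    marginal contribution is averaged over all feature orderings.
    """
    n = len(instance)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        x = list(baseline)
        previous = model(x)
        for j in order:
            x[j] = instance[j]          # add feature j to the coalition
            current = model(x)
            contributions[j] += current - previous
            previous = current
    return [c / len(orderings) for c in contributions]


def toy_risk(x):
    """Hypothetical risk score from (albumin g/L, D-dimer), with an interaction term."""
    albumin, d_dimer = x
    return 0.3 - 0.01 * (albumin - 40.0) + 0.02 * d_dimer + 0.001 * d_dimer * (40.0 - albumin)


patient = [30.0, 10.0]    # hypothetical patient values
reference = [40.0, 2.0]   # hypothetical baseline values
phi = shapley_values(toy_risk, patient, reference)
# Efficiency property: contributions sum to f(patient) - f(reference).
print(phi, sum(phi), toy_risk(patient) - toy_risk(reference))
```

The efficiency property shown in the last line is what allows a single prediction to be decomposed additively into per-feature contributions, as in the force plots for individual patients.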

Discussion
Malnutrition, as a common comorbidity upon admission, increases the risk of complications during hospitalization and is a key factor affecting the prognosis of many diseases (18,19). Several scoring systems are used clinically to assess the nutritional status of hospitalized patients, such as the controlling nutritional status score and the prognostic nutritional index (20,21). Among them, the NRI serves as a simple and effective scoring indicator. It primarily evaluates patients' nutritional status based on serum albumin, actual body weight, and ideal body weight. Initially developed as a risk scoring tool to assess the nutritional status of elderly hospitalized patients and the incidence of malnutrition-related complications and mortality, the NRI has gained popularity in recent years due to its simplicity, universality, and strong prognostic value across different surgical patient populations (14). However, no research has yet specifically applied the NRI in patients with AAAD. We therefore referred to the application of the NRI in other cardiovascular diseases and classified patients with an NRI < 97.5 upon admission as having malnutrition (22). By comparing postoperative outcomes between the normal nutrition group and the malnutrition group of patients with AAAD, we found that patients with malnutrition experienced significantly prolonged mechanical ventilation time, ICU stay, and total hospital days, as well as an increased incidence of serious complications. Their short-term postoperative prognosis was poorer. Therefore, there is an urgent need for a reliable and effective predictive model to stratify early risk and assess the prognosis of AAAD patients with malnutrition, with the aim of improving the adverse postoperative outcomes caused by malnutrition.
This study represents the first application of artificial intelligence in prognostic prediction for AAAD patients with malnutrition. We established six ML models and evaluated their efficacy in terms of model discrimination, accuracy, and clinical applicability. Ultimately, we developed an ML model capable of predicting the risk of MAEs following surgery in AAAD patients with malnutrition. Compared to traditional logistic regression models, ML has greater flexibility, generalization ability, and accuracy in predictive model construction. As one of the most common ML algorithms, the RF algorithm has the advantages of higher accuracy, stronger resistance to overfitting, ease of interpretation, and suitability for large-scale data. Moreover, RF is robust against missing data and outliers and provides assessments of the importance of each feature variable, aiding in understanding their contributions to the model's operation (23,24).
To further interpret this predictive model, we used the SHAP method to generate feature density scatter plots and feature importance ranking plots based on SHAP values. We described the contributions of the included feature variables to postoperative MAEs and the actual predictive results in the test dataset, enhancing the interpretability of the model.
The feature importance ranking results indicate that preoperative ALB, NRI, and D-dimer are the top three feature variables in the RF model. ALB, as a crucial plasma protein, plays a vital role in maintaining normal plasma oncotic pressure and balancing fluid inside and outside blood vessels. Additionally, it holds significant value in reflecting long-term nutritional status. Hutter et al. (25) have reported a strong correlation between preoperative ALB levels and postoperative complications in male surgical patients. The relationship between low preoperative ALB levels and postoperative mortality has also been widely recognized in general surgical patients (26). In malnourished patients, by contrast, low albumin is more likely to lead to decreased immune function and organ insufficiency, which increases the risk of postoperative MAEs.
The NRI, as an indicator of nutritional risk, takes into account not only serum albumin levels but also changes in the patient's body weight. According to research by Jabbour et al. (27), when serum albumin levels and body weight are altered simultaneously, their impact on postoperative outcomes surpasses that of age, which is known to be a strong predictor of prognosis. A lower NRI indicates a higher risk of malnutrition and, consequently, a greater risk of adverse postoperative outcomes. Therefore, the NRI is also identified as a crucial factor influencing prognosis.
D-dimer is commonly regarded as a reliable indicator of coagulation and fibrinolysis. Because the systemic inflammatory storm induced by AAAD persistently activates both endogenous coagulation and fibrinolysis, D-dimer levels in the serum of AAAD patients are significantly elevated. Its prognostic value in different types of aortic dissection has been widely acknowledged (28). Elevated D-dimer levels are closely associated with extensive false lumen formation in preoperative dissection, and high D-dimer levels often indicate a more severe condition and a higher risk of postoperative complications. Thus, D-dimer serves as an effective predictive variable for postoperative MAEs.
The other features in the predictive model, including leukocyte, lymphocyte, and CRP, reflect the significant association of inflammatory responses and immune activation with prognostic outcomes in malnourished patients. Nutrition influences all physiological processes, including those related to immune system development and function (29). Chronic inflammation present in malnourished individuals can weaken organ resilience against disease stress (30), making immune-inflammatory blood cells and biomarkers relevant predictors of potential postoperative complications.
In conclusion, we have developed an ML model based on the RF algorithm that effectively predicts the risk of postoperative MAEs in AAAD patients with malnutrition. This model demonstrated strong predictive performance in both the validation and test sets. With this model, clinicians can identify high-risk patients among AAAD patients with malnutrition early, facilitating risk stratification and decision-making. This, in turn, can help reduce the adverse clinical outcomes associated with malnutrition and improve patients' short-term prognosis.

Limitations
This study has the following limitations. Firstly, as a single-center retrospective study, it inherently carries some biases. Secondly, the training and test sets both come from the same center, so external data from multiple centers are lacking to further validate the model's efficacy and clinical utility. Therefore, further multicenter, large-sample randomized clinical trials are still needed to corroborate our research findings. Finally, because unified diagnostic criteria for malnutrition are lacking, different nutritional scoring systems diagnose malnutrition at vastly different rates. Although this study selected the widely used NRI as the evaluation basis, further integration with other nutritional scoring systems is still necessary to improve the assessment of prognostic risk for AAAD surgical patients with malnutrition.

Conclusion
This study, for the first time, successfully developed and validated a predictive model based on ML algorithms for predicting the risk of postoperative MAEs in AAAD patients with malnutrition. The model demonstrated excellent predictive performance and clinical applicability. It can provide clinicians with a reliable basis for assessing the postoperative outcomes of AAAD patients with malnutrition, facilitating early risk stratification and decision-making.

FIGURE 2 Feature variable selection based on the LASSO regression analysis. (A) Plot of the LASSO coefficient profiles. (B) Tuning parameter selection cross-validation error curve (LASSO, least absolute shrinkage and selection operator).

FIGURE 3 The comprehensive analysis of six machine learning models. (A) The ROC curve and AUC of the training set. (B) The ROC curve and AUC of the validation set. (C) The calibration curve plot of six models. (D) The DCA curve of the validation set. (E) The PR curve of the training set. (F) The PR curve of the validation set (ROC, receiver operating characteristic; AUC, area under the curve; DCA, decision curve analysis; PR, precision recall).

FIGURE 4 The comprehensive analysis of Random Forest model. (A) The ROC curve and AUC of the training set. (B) The ROC curve and AUC of the validation set. (C) The ROC curve and AUC of the test set. (D) The calibration curve plot of the Random Forest model. (E) The DCA curve of the Random Forest model. (F) Random Forest model learning curve (ROC, receiver operating characteristic; AUC, area under the curve; DCA, decision curve analysis).

TABLE 1 Comparison of postoperative outcomes between patients with normal nutrition and malnutrition.

TABLE 2 Baseline characteristics of enrolled patients.

TABLE 3 Comparison of preoperative and intraoperative conditions between the MAEs group and the non-MAEs group in different datasets.