Early prediction of acute respiratory distress syndrome complicated by acute pancreatitis based on four machine learning models

Highlights
• ML can be a practical and effective early prediction method of AP complicated by ARDS.
• PaO2, CRP, PCT, LA, Ca2+, NLR, WBC, and AMY were used as the optimal subset of features to early identify AP patients with a high risk for developing ARDS in ML.
• BC was the superior predictive model and EDTs could be promising for predicting large samples.


Introduction
Acute Pancreatitis (AP) is a common inflammatory disorder that can lead to Systemic Inflammatory Response Syndrome (SIRS), local and systemic complications, and life-threatening organ injury or Multiple Organ Failure (MOF). Although most patients (80%) develop a mild episode of AP with a good prognosis, about 20% develop moderately severe or severe AP (MSAP or SAP) with local complications and transient or persistent organ failure. 1 Acute Respiratory Distress Syndrome (ARDS) is a syndrome of inflammatory pulmonary edema that causes hypoxia and is associated with increased permeability of the lung epithelium 2 and vascular endothelium; it occurs in approximately 30% of patients with SAP. 3 The lung is often damaged early in the course of AP, and ARDS is a common complication. Respiratory failure is the most common type of organ failure (92%) during the early and late phases of AP, with a 37% mortality rate. 4 The high fatality rate may be mainly attributable to the failure to predict organ failure early and to shortcomings in the management strategy. However, ARDS is somewhat preventable, and clinical outcomes may improve after appropriate interventions during the early phase of ARDS. 5 Therefore, it is important to identify early those patients with AP who are at high risk for developing ARDS. A more accurate and convenient early predictive tool is needed to help physicians identify and prevent progression to ARDS.
Applications of Artificial Intelligence (AI), such as Machine Learning (ML), have become more practical in the field of disease outcome prediction with continuous improvements in computer science. ML is an emerging field and is now widely used in clinical medical studies. Notably, ML analysis relies on iterative algorithms to integrate candidate variables, which can yield highly accurate predictions.
This study developed ARDS risk prediction models for patients with AP in the early stage from a larger set of clinical parameters. All of the models were tested in an independent cohort of AP patients. The ability to stratify risk accurately may facilitate more timely interventions and support the management of high-risk patients through early identification.

Participants
The authors performed a retrospective observational study of AP patients based on the STROBE checklist. Cases were drawn from patients admitted to the Xuanwu Hospital of Capital Medical University from January 2017 to August 2022. The hospital has an independent acute pancreatitis therapy center, including a gastroenterology intensive care unit. The inclusion criteria were age ≥18 years and a confirmed diagnosis of AP. The exclusion criteria were admission more than 24 h after the onset of symptoms, a history of AP attacks, chronic obstructive pulmonary disease, malignant tumors, chronic heart failure or kidney disease, pregnancy, and HIV/AIDS or another immune-deficiency disorder. All patients received standard medical treatment to manage AP according to international guidelines.
The AP diagnostic criteria followed the 2012 revised Atlanta classification of acute pancreatitis. 6 At least two of the following three criteria had to be satisfied for the AP diagnosis: abdominal pain, increased serum levels of Amylase (AMY) and/or Lipase (LPS) to at least three times the upper limit of normal, and imaging findings of AP on abdominal ultrasound and/or a Computed Tomography (CT) scan. Hypertriglyceridemia associated with AP was defined as a triglyceride level ≥11.3 mmol/L (1000 mg/dL), or ≥5.65 mmol/L (500 mg/dL) accompanied by milky serum. 6 The ARDS diagnosis was made according to the Berlin definition as acute hypoxemia, a PaO2/FiO2 index <300 mmHg, and bilateral lung infiltration on an X-ray/CT scan that was not fully explained by fluid overload or cardiac failure. 7 Arterial blood gas analysis was performed for patients at admission as well as when a patient developed dyspnea during hospitalization.
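The thresholds above can be written as simple rule checks. The following is a minimal sketch, not the authors' implementation: the `Presentation` record and all function names are hypothetical, and the oxygenation check covers only the PaO2/FiO2 threshold stated in this section, not the imaging or cardiac-exclusion parts of the ARDS diagnosis.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Hypothetical record of the three AP criteria for one patient."""
    abdominal_pain: bool
    enzyme_ratio: float      # AMY or LPS divided by the upper limit of normal
    imaging_positive: bool   # findings of AP on ultrasound and/or CT


def meets_ap_criteria(p: Presentation) -> bool:
    # AP is diagnosed when at least two of the three criteria are satisfied.
    criteria = [p.abdominal_pain, p.enzyme_ratio >= 3.0, p.imaging_positive]
    return sum(criteria) >= 2


def is_htg_associated(tg_mmol_l: float, milky_serum: bool = False) -> bool:
    # >= 11.3 mmol/L alone, or >= 5.65 mmol/L together with milky serum.
    return tg_mmol_l >= 11.3 or (tg_mmol_l >= 5.65 and milky_serum)


def meets_oxygenation_threshold(pao2_mmhg: float, fio2_fraction: float) -> bool:
    # PaO2/FiO2 index below 300 mmHg, with FiO2 given as a fraction (0.21 = room air).
    return pao2_mmhg / fio2_fraction < 300.0
```

For example, a PaO2 of 60 mmHg on room air (FiO2 0.21) gives an index of about 286 mmHg and passes the oxygenation check, whereas 79.1 mmHg on room air gives about 377 mmHg and does not.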

Data collection
The data included clinical characteristics and laboratory findings, and patients were admitted in ≤24h. Demographic

Development of the ML models
Missing values in the original data were imputed using the bagImpute method, which is based on a bagged tree model. The complete data were randomly divided into training and testing cohorts at a 4:1 ratio. The training cohort was used to develop the models with ML algorithms; variables that differed significantly (p<0.05) in the univariate analysis between AP patients with and without ARDS were entered as inputs to predict the risk for ARDS. Four ML algorithms were selected: Support Vector Machine (SVM), Ensembles of Decision Trees (EDTs), Bayesian Classifier (BC), and the nomogram algorithm. These algorithms were applied using Matlab 2014 (MathWorks, Natick, MA, USA). Internal validation was accomplished with five-fold cross-validation of the training set in each ML model after selecting the optimal feature subset. The five-fold cross-validation procedure was repeated 10 times.
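The partitioning scheme described above (a random 4:1 split followed by five-fold cross-validation repeated 10 times on the training cohort) can be sketched as index bookkeeping. This is an illustrative Python sketch, not the Matlab code used in the study; the function names, the seeds, and the use of a plain (non-stratified) shuffle are assumptions.

```python
import random


def split_4_to_1(n_samples, seed=0):
    """Randomly split sample indices into training and testing sets at 4:1."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = n_samples * 4 // 5
    return indices[:n_train], indices[n_train:]


def repeated_kfold(indices, k=5, repeats=10, seed=0):
    """Yield (train, validation) index lists for k-fold CV repeated several times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        for i in range(k):
            validation = folds[i]
            training = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield training, validation


train_idx, test_idx = split_4_to_1(460)       # cohort size reported in this study
splits = list(repeated_kfold(train_idx))      # 10 repeats x 5 folds
```

With 460 patients this yields 368 training and 92 testing indices and 50 training/validation partitions; the testing indices are never touched during cross-validation.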

Evaluation and testing of the ML models
The final Receiver Operating Characteristics (ROC) curve, the average Area Under the Curve (AUC), accuracy, precision, recall, True Negative Rate (TNR), F1 score, Negative Predictive Value (NPV), and False Discovery Rate (FDR) were used to evaluate and compare the predictive performance of the models. The four ML models trained on the optimal feature subsets were tested with an independent test set.
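Apart from the ROC curve and AUC, all of the metrics listed above follow from the four cells of a confusion matrix. A minimal sketch using the standard definitions is given below; the function name and dictionary keys are illustrative, and the counts in the usage line are hypothetical rather than taken from the study's tables.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts.

    Assumes at least one predicted positive, one predicted negative,
    one actual positive, and one actual negative (no zero denominators).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # sensitivity, true positive rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "tnr": tn / (tn + fp),           # specificity, true negative rate
        "f1": 2 * precision * recall / (precision + recall),
        "npv": tn / (tn + fn),           # negative predictive value
        "fdr": fp / (tp + fp),           # false discovery rate = 1 - precision
    }


# Hypothetical counts for a 92-patient test set, for illustration only.
example = classification_metrics(tp=8, fp=2, tn=74, fn=8)
```

Because FDR and precision share a denominator, they always sum to 1, so a model with the highest precision necessarily has the lowest FDR.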

Baseline demographic and clinical characteristics
In all, 497 patients with AP were initially identified and 37 were excluded according to the exclusion criteria. Ultimately, 460 patients were included in the study (Fig. 1). The characteristics of the patients with and without ARDS are summarized in Table 1. ARDS occurred in 83 of the 460 patients (18.04%). In all, 368 patients were included in the training cohort and 92 in the testing cohort. ARDS occurred in 66 patients (17.93%) in the training cohort and 17 (18.48%) in the testing cohort. Hypertriglyceridemia (45.22%) was the most common cause of AP.
Thirty-one parameters differed significantly between patients with and without ARDS (Table 1). A significant difference was observed in the proportion of patients with hypertriglyceridemia as the etiology between the two groups. No differences in gender, age, or history of hypertension, diabetes, or NAFLD were observed between the two groups.

Feature selection and development of the ML models
The features that were significantly different between the two groups were used for feature selection using the random forest algorithm and the Recursive Feature Elimination (RFE) strategy to determine an optimal subset of features that effectively predicted the risk for ARDS in patients with AP (Supplementary Fig. 1). As some features had strong internal correlations, the authors tested all feature correlations and, among correlated features, retained those with the strongest correlations with the target variable ARDS (Supplementary Fig. 2). Ultimately, the best eight features (Fig. 2) were identified as the optimal subset of features. These were entered into the ML models. The authors optimized the ML models using a Bayesian hyperparameter optimizer, which builds a probabilistic model of the objective function and uses it to select the most promising set of hyperparameters to evaluate (Supplementary Fig. 3).
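One common way to carry out the correlation step described above is a greedy filter: rank features by the strength of their correlation with the target, then keep a feature only if it is not strongly correlated with any feature already kept. The sketch below illustrates that idea under stated assumptions; the use of Pearson correlation, the 0.8 cutoff, the function names, and the toy feature names are all hypothetical and are not reported in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def filter_correlated(features, target, threshold=0.8):
    """Greedy filter over a dict of {feature name: list of values}.

    Features are visited from strongest to weakest absolute correlation
    with the target; a feature is dropped if its absolute correlation with
    an already retained feature reaches the (assumed) threshold.
    """
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: feat_b is nearly collinear with feat_a but tracks the target less closely.
toy_target = [0, 0, 0, 1, 1, 1]
toy_features = {
    "feat_a": [1, 2, 3, 7, 8, 9],
    "feat_b": [1, 2, 3, 4, 5, 6],
    "feat_c": [3, 1, 2, 2, 3, 1],
}
selected = filter_correlated(toy_features, toy_target)
```

In the toy data, `feat_b` is removed because it duplicates the information in `feat_a`, while `feat_c` survives the filter even though it is uninformative; ranking by importance (here, by RFE in the study) is therefore still needed as a separate step.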

Feature importance in the optimal feature subset
The authors quantified the importance of each feature in the optimal feature subset using an RFE strategy in the random forest algorithm. As shown in Fig. 2, PaO2 was the most important feature, followed by CRP, NLR, Ca2+, WBC, PCT, LA, and AMY in order of predictive importance.

ML model training and validation
The ROC curves of the four different models for predicting ARDS are shown in Fig. 3. Fig. 4 shows the ROC curves of the models after the five-fold cross-validation of the training set. The AUC values of the optimal feature subset in the SVM, EDT, BC, and nomogram models were 0.91, 0.94, 0.87, and 0.91, respectively. The EDT algorithm achieved the highest AUC, accuracy, precision, recall, TNR, F1 score, and NPV compared to the other three algorithms. Table 2 presents a set of detailed metrics for the four models in the training dataset. Fig. 5 is a nomogram of the visual results of logistic regression, indicating the association between the predictor variables and the occurrence of ARDS in patients with AP.
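The AUC reported for each model can be read as the probability that a randomly chosen patient with ARDS receives a higher predicted score than a randomly chosen patient without ARDS. A minimal sketch of that pairwise definition follows; it is equivalent to the area under the empirical ROC curve, but the function name and the toy scores are illustrative and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g., ARDS), 0 for the negative class.
    Tied scores count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because this definition depends only on the ranking of scores, AUC is insensitive to the classification threshold, which is why it is reported alongside threshold-dependent metrics such as precision and recall.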

Comparison of predictive performance among the four models
The authors generated four models to predict the early onset of ARDS in AP patients after admission. Then, the authors evaluated the predictive performance of each ML model trained using the optimal feature subset. All detailed performance metrics obtained by the four models in the testing set are shown in Table 3. The AUC values were 0.870 for SVM, 0.813 for EDTs, 0.891 for BC, and 0.874 for the nomogram. The ROC curve obtained for each model in the testing set is shown in Fig. 3. The BC model achieved the best predictive performance, with the highest AUC (0.891), recall (0.563), and NPV (0.909) among the four models. EDTs also performed well, with the highest accuracy (0.891), precision (0.800), and F1 score (0.615), the lowest FDR (0.200), and the second-highest NPV (0.902).
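The reported EDT metrics can be cross-checked against one another using the standard definitions. The short derivation below assumes the reported values are rounded to three decimals; here P denotes precision and R denotes recall, and the implied recall is a back-calculation, not a value quoted from Table 3.

```latex
\begin{align}
\mathrm{FDR} &= \frac{FP}{TP + FP} = 1 - P = 1 - 0.800 = 0.200, \\
F_1 &= \frac{2PR}{P + R}
\;\Longrightarrow\;
R = \frac{F_1 P}{2P - F_1}
  = \frac{0.615 \times 0.800}{2 \times 0.800 - 0.615}
  = \frac{0.492}{0.985} \approx 0.50 .
\end{align}
```

The lowest FDR therefore follows directly from the highest precision, and the implied EDT recall of about 0.50 is below the 0.563 reported for BC, in line with BC having the highest recall.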

Discussion
ARDS is the triggering point in the development of MOF in patients with AP, which is associated with high mortality. 8 Therefore, it is extremely important to predict the risk for ARDS early, which can help prevent the development of ARDS and further deterioration of other organs. However, there are no validated serum biomarkers or scoring systems to predict ARDS in patients with AP. ML techniques are increasingly recognized by medical professionals because of their extraordinary ability to analyze information. Here, the authors developed and tested four ML algorithms as convenient tools to predict AP complicated by ARDS in the early phase. The authors performed correlation analysis on the features and quantified the importance of each feature for the target variable. A set of high-quality optimal features was obtained, and the prediction models were constructed with the smallest number of features and the lowest redundancy of feature information; hyperparameter optimization was performed for each model.
Clinical data from a routine blood test, biochemistry, coagulogram, inflammatory markers, and arterial blood gas analysis were collected to develop the ML models. Although all four models yielded satisfactory predictive performance, the BC and EDT models more accurately predicted the risk for ARDS in patients with AP. BC had the best predictive performance in the testing set. EDTs had the highest AUC value and superior accuracy, specificity, and sensitivity in the training set.
In this study, a lower PaO2 and a lower Ca2+ level, as well as a higher CRP, PCT, LA, NLR, WBC, and AMY at admission were correlated with a higher risk of developing ARDS in patients with AP. Among them, PaO2 was the foremost feature.
Hypoxemia is not only a diagnostic criterion for ARDS, but the respiratory symptoms it causes are also the earliest clinical manifestations of AP. 9 As no specific drug treatment exists for ARDS, good supportive care reduces damage and improves the prognosis. 10,11 Therefore, early diagnosis benefits patients. In this study, the arterial PaO2 in patients with ARDS was 64.20 (60.70, 72.00) mmHg, which was significantly lower than the 79.10 (72.83, 87.00) mmHg in patients without ARDS, suggesting that ARDS should be suspected in all AP patients once hypoxemia and related symptoms appear. 11 CRP was the second-most important feature for predicting ARDS in our study and has been used to predict the severity of AP. This result also confirms that inflammation is closely associated with ARDS in patients with AP, which is consistent with the prevailing view that systemic inflammatory response syndrome is the first stage of ARDS in AP patients. 12,13 The WBC count and NLR had early predictive value for the severity of AP and persistent organ failure, 14-16 and are also clinical markers for predicting mortality and fatal complications in patients with ARDS. 17,18 The NLR served as the third-most important predictive feature in our models, similar to a previous study. 19 PCT is associated with MOF and ARDS in patients with SAP. 20-22 The authors observed that patients with a higher PCT at admission were more likely to develop ARDS, consistent with previous findings. 23 The significantly lower Ca2+ concentrations in patients with ARDS compared to those without ARDS suggest that tissue necrosis triggers a systemic inflammatory response, resulting in the release of inflammatory cells and mediators, which further triggers ARDS. Here, LA and Ca2+ were independent variables for ARDS, indicating that these features should be monitored.
Although serum levels of AMY were not associated with AP severity, AMY levels at admission were a risk factor for predicting ARDS, similar to previous results. 24 However, further study on the relationship between these factors in patients with AP is warranted.
Hypertriglyceridemia-induced AP (HTG-AP) accounts for 10% to 30% of AP cases in different countries, 25-27 and high TG levels are associated with the severity and clinical prognosis of AP. 28,29 The incidence of HTG-AP is increasing gradually, especially in China. 30-33 In our study, hypertriglyceridemia accounted for 45.22% of the etiology, consistent with the 40% to 49% reported in recent studies. 19,23 In addition, the authors found that the proportion of HTG-AP was significantly higher in the ARDS group than in the group without ARDS, consistent with findings in pneumonia-initiated ARDS. 34 This result may be due to the fat embolism syndrome caused by high levels of free fatty acids in HTG-AP patients, which can lead to pulmonary vascular endothelial damage and microcirculatory disorder. No significant differences in age or comorbidities such as diabetes, hypertension, or NAFLD were detected between the two groups, suggesting that, unlike pneumonia-initiated ARDS, age and comorbidities cannot be used as predictors of ARDS caused by AP.
Two recent studies used nomograms to predict ARDS in AP with AUC values of 0.821 and 0.814, respectively. 19,23 The authors performed a two-step feature selection strategy to filter the optimal subset of features, followed by optimizing the parameters to develop the predictive models. Compared to complex scoring systems (e.g., APACHE II), ML models offer a convenient way to obtain a predicted probability. Compared with traditional statistical methods, ML has the advantage of analyzing the nonlinear relationships between various markers and ARDS, which allows for early prediction before significant changes in classical metrics occur. Based on the prediction performance, the authors recommend the BC algorithm, which had the highest AUC value of 0.891 in the testing set, indicating that it is more robust in extrapolation. Second, the authors recommend the EDT algorithm, which had superior evaluation metrics in the training set, indicating the strongest fitting ability. The unbalanced distribution of the original data may have directly affected the extrapolation ability of the model. Therefore, the authors believe that BC provided the most accurate prediction given the available data and that EDTs have greater potential as sample size increases.
Several limitations of our study should be mentioned. First, our data were derived from a single AP center and the number of cases was small. Some differences in the performance of the ML models may occur when applied to datasets from different institutions with different distributions of covariates. Second, the authors reported ARDS as a dichotomous variable (presence or absence) rather than across time; thus, our results cannot predict the time course of ARDS development. Third, the small sample size prevented the evaluation of subgroups according to ARDS severity. Finally, our study was retrospective and there may be patient selection bias, which is an unavoidable limitation of such studies. Further multicenter prospective studies with larger samples should be conducted to verify our ARDS predictive models in patients with AP.