Prediction of individual mortality risk among patients with chronic obstructive pulmonary disease: a convenient, online, individualized, predictive mortality risk tool based on a retrospective cohort study

Background Chronic obstructive pulmonary disease (COPD) is a serious condition with a poor prognosis. No clinical study has reported an individual-level mortality risk curve for patients with COPD. As such, the present study aimed to construct a prognostic model for predicting individual mortality risk among patients with COPD, and to provide an online predictive tool to more easily predict individual mortality risk in this patient population. Patients and methods The current study retrospectively included data from 1,255 patients with COPD. Random survival forest analysis and Cox proportional hazards regression were used to screen for independent risk factors in patients with COPD. A prognostic model for predicting mortality risk was constructed using eight risk factors. Results Cox proportional hazards regression analysis identified eight independent risk factors among COPD patients: B-type natriuretic peptide (hazard ratio [HR] 1.248 [95% confidence interval (CI) 1.155–1.348]); albumin (HR 0.952 [95% CI 0.931–0.974]); age (HR 1.033 [95% CI 1.022–1.044]); globulin (HR 1.057 [95% CI 1.038–1.077]); smoking years (HR 1.011 [95% CI 1.006–1.015]); partial pressure of arterial carbon dioxide (HR 1.012 [95% CI 1.007–1.017]); granulocyte ratio (HR 1.018 [95% CI 1.010–1.026]); and blood urea nitrogen (HR 1.041 [95% CI 1.017–1.066]). A prognostic model for predicting risk for death was constructed using these eight risk factors. The areas under the time-dependent receiver operating characteristic curves for 1, 3, and 5 years were 0.784, 0.801, and 0.806 in the model cohort, respectively. Furthermore, an online predictive tool, the “Smart Survival Predictive System for COPD patients”, was developed, providing an individual mortality risk predictive curve, and the predicted mortality rate and 95% CI at a specific time.
Conclusion The current study constructed a prognostic model for predicting an individual mortality risk curve for COPD patients after discharge and provides a convenient online predictive tool for this patient population. This predictive tool may provide valuable prognostic information for clinical treatment decision making during hospitalization and health management after discharge (https://zhangzhiqiao15.shinyapps.io/Smart_survival_predictive_system_for_COPD/).


INTRODUCTION
Chronic obstructive pulmonary disease (COPD) is a common chronic airway condition that seriously affects the quality of life of affected individuals. It has been estimated that COPD is the third most common cause of death globally (Lozano et al., 2012). The prevalence of COPD in adults ≥20 years of age is approximately 8.6%, whereas the prevalence among those >40 years of age is as high as 13.7% (Wang et al., 2018). The prevalence of stage ≥II COPD can reach 10.1% in the general population (Buist et al., 2007). The three-year mortality rate of COPD patients has been reported to be 10.0%–36.9% according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2017 classification criteria (Gedebjerg et al., 2018). More than 3.2 million individuals die from COPD annually (anonymous, 2017). Therefore, COPD is a serious public health challenge that requires urgent attention from government departments and medical institutions.
Several prognostic models have been developed to predict prognosis among COPD patients, including B-AE-D-C (Boeck et al., 2016), extended ADO (Puhan et al., 2012), ADO (Puhan et al., 2009), updated BODE (Puhan et al., 2009), PSI (Hu et al., 2015), and CURB65 (Chang et al., 2011). These previous models divide patients into high- and low-risk groups and evaluate mortality risk in each group. To the best of our knowledge, these models are not able to describe individualized survival curves for specific patients. One Japanese research team established a prognostic model for predicting mortality among patients who experience acute exacerbation of COPD during hospitalization (Sakamoto et al., 2017). Although that study could predict the risk for death during hospitalization based on individual patient information, it did not provide a predictive mortality curve for individual patients during the follow-up period after discharge.
With the development of big data analytics and data mining algorithms, precision medicine has witnessed significant advances in several research fields. Several precision medicine studies have been able to predict individual mortality curves for specific patients based on clinical information and have provided convenient predictive web tools for patients (Zhang et al., 2021; He et al., 2021; Lin et al., 2021). Such web tools could help patients better evaluate their risk for death and support individual treatment decisions.
The current study aimed to construct a prognostic model for predicting mortality risk in individual COPD patients based on baseline characteristics. Furthermore, it aimed to develop and maintain an online tool that provides an individualized, predictive mortality risk curve for patients with COPD.

METHODS
Patients
COPD patients hospitalized in the Department of Respiratory Medicine of Shunde Hospital, Southern Medical University (Foshan City, Guangdong Province, China) and the Department of Internal Medicine of The Affiliated Chencun Hospital of Shunde Hospital, Southern Medical University, between September 2009 and December 2019, were included. All patients underwent pulmonary function examination after inhaling a bronchodilator before enrollment and were diagnosed according to a forced expiratory volume in 1 s (FEV1)/forced vital capacity <70%, which fulfilled the diagnostic criteria for COPD (n = 1309). The deadline for follow-up of enrolled patients was May 1, 2020. Patients with missing survival time data were excluded from the survival analysis (n = 54). This study was reviewed and approved by the Ethics Committee of The Affiliated Chencun Hospital of Shunde Hospital, Southern Medical University (ID: 202202001). Due to the retrospective nature of the study and the use of anonymized data, the requirement for informed consent was waived by the same Ethics Committee. The current study was conducted in accordance with the Declaration of Helsinki, relevant guidelines, and local regulations.

Information collection
The following information was collected and recorded for survival analysis: general information, including age, sex, body mass index (BMI), smoking history, and smoking time; and clinical/biochemical results within 24 h after admission, including body temperature, systolic blood pressure, diastolic blood pressure, heart rate, respiratory rate, respiratory index, partial pressure of arterial oxygen (PaO2), pH, oxygenation index, partial pressure of arterial carbon dioxide (PaCO2), sodium, potassium, calcium, blood urea nitrogen (BUN), creatinine, serum albumin (ALB), serum globulin (GLB), C-reactive protein, B-type natriuretic peptide (BNP), platelets, white blood cell count, granulocyte ratio (GR), blood glucose, and state of consciousness. Original BNP values were converted into an ordered categorical variable according to the expert consensus on BNP clinical application recommendations published by the American College of Cardiology (Silver et al., 2004), as follows: no heart failure (BNP <80 ng/L); grade I heart failure (BNP 95-221 ng/L); grade II heart failure (BNP 221-459 ng/L); grade III heart failure (BNP 459-1006 ng/L); and grade IV heart failure (BNP >1006 ng/L). The end date of follow-up in this study was May 1, 2020. The survival time of deceased patients was calculated by subtracting the date of discharge from the date of death. The survival time of surviving patients was calculated by subtracting the discharge date from May 1, 2020.
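The two derived variables described above (BNP grade and survival time) can be illustrated with a minimal sketch. The function names are hypothetical and not taken from the study. The published ranges share boundary values and leave 80-95 ng/L unassigned, so the sketch assumes that a shared boundary value belongs to the lower grade and that values below 95 ng/L are grade 0; both choices are assumptions.

```python
from bisect import bisect_left
from datetime import date

# End of follow-up as stated in the text.
FOLLOW_UP_END = date(2020, 5, 1)

# Upper bounds (ng/L) of grades 0-III. ASSUMPTION: a value equal to a
# bound falls in the lower grade, and values in the unassigned
# 80-95 ng/L gap are treated as grade 0.
BNP_UPPER_BOUNDS = [95, 221, 459, 1006]


def bnp_grade(bnp_ng_per_l):
    """Map a raw BNP value to an ordered grade 0 (no heart failure) to 4."""
    return bisect_left(BNP_UPPER_BOUNDS, bnp_ng_per_l)


def survival_time_days(discharge, death=None):
    """Return (days from discharge, event flag).

    Deceased patients: death date minus discharge date, event = True.
    Surviving patients: follow-up end minus discharge date, event = False.
    """
    end = death if death is not None else FOLLOW_UP_END
    return (end - discharge).days, death is not None


print(bnp_grade(300))                                         # grade II
print(survival_time_days(date(2019, 1, 1), date(2019, 1, 31)))
print(survival_time_days(date(2019, 5, 1)))
```

The event flag is what allows surviving patients to enter the Cox model as censored observations rather than being dropped.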

Model building
The current study constructed a prognostic predictive model for COPD patients using a Cox proportional hazards regression algorithm. The Cox proportional hazards regression model is a semi-parametric regression model that takes survival outcome and survival time as dependent variables and can simultaneously analyze the independent effects of multiple factors on them (Fisher & Lin, 1999; Moolgavkar et al., 2018). As a semi-parametric regression model, it can also be used to analyze data with censored survival times. The Cox proportional hazards regression model has been widely used to construct prognostic models for various diseases (Han et al., 2021; Royston & Altman, 2013; Luo et al., 2017).
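For reference, the general form of the Cox model and the individual survival curve it implies can be written in standard notation as follows. The symbols are introduced here for illustration and do not appear in the original text: x is a patient's covariate vector, β the vector of regression coefficients, h_0(t) the unspecified baseline hazard, and S_0(t) the corresponding baseline survival function.

```latex
\begin{align}
  h(t \mid x) &= h_0(t)\,\exp\!\left(\beta^{\top} x\right), \\
  \mathrm{HR}_j &= \exp(\beta_j), \\
  S(t \mid x) &= S_0(t)^{\exp\left(\beta^{\top} x\right)}.
\end{align}
```

The second line is why each coefficient is reported as a hazard ratio per unit increase of the corresponding variable, and the third line is the relation that turns one fitted model into a separate survival curve for each patient.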

Statistical analysis
Statistical analysis in the current study was performed using R version 3.6.1 (R Core Team, 2019). Continuous variables were compared between the two groups (i.e., model versus validation) using the t-test for normally distributed data and the Mann-Whitney U test for data that were not normally distributed. The chi-squared test (the default method for contingency table analysis) or Fisher's exact test (when any cell frequency was <1) was used to compare categorical variables between the two groups. The random survival forest method was used to identify valuable risk factors for prognosis (Hsich et al., 2011). Differences with P < 0.05 were considered to be statistically significant.

Baseline characteristics
A total of 1,255 patients were ultimately included in the current analysis and were divided into a model cohort (n = 627) and a validation cohort (n = 628) using a random sampling method. The mortality rate was 78.3% (491/627) in the model cohort and 76.6% (481/628) in the validation cohort (P = 0.509). A comparison of baseline characteristics between the model and validation cohorts is summarized in Table 1.
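The between-cohort comparison of mortality can be checked by hand from the counts above. The sketch below assumes that the reported P value came from a chi-squared test with Yates continuity correction, which is the default for 2×2 tables in R's `chisq.test`; the text does not state this explicitly. The function name is hypothetical.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Chi-squared test with Yates correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Deaths and cohort sizes as reported in the text.
deaths_model, n_model = 491, 627
deaths_valid, n_valid = 481, 628

chi2, p = yates_chi2_2x2(
    deaths_model, n_model - deaths_model,
    deaths_valid, n_valid - deaths_valid,
)
print(f"mortality: {deaths_model / n_model:.1%} vs {deaths_valid / n_valid:.1%}")
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Under that assumption the computed P value rounds to 0.509, in agreement with the reported figure.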

Variable selection
The relative importance of various independent variables was explored using the random survival forest algorithm (Fig. 1). Eight variables (BNP, ALB, age, GLB, smoking years, PaCO2, GR, and BUN) were identified as independent risk factors for the prognosis of COPD patients according to multivariate Cox proportional hazards regression analysis (Fig. 2, Table 2). A prognostic predictive nomogram chart is presented in Fig. 3 according to the results of multivariate Cox proportional hazards regression analysis.
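To illustrate how the eight hazard ratios combine into a single risk estimate, the following minimal sketch computes the hazard of one patient relative to another under the Cox model. The hazard ratios are those reported in the abstract. The variable names, the patient values, and the units are hypothetical, and the sketch is not the authors' nomogram.

```python
import math

# Hazard ratios per one-unit increase, as reported in the abstract.
HAZARD_RATIOS = {
    "bnp_grade": 1.248,
    "albumin": 0.952,
    "age": 1.033,
    "globulin": 1.057,
    "smoking_years": 1.011,
    "paco2": 1.012,
    "granulocyte_ratio": 1.018,
    "bun": 1.041,
}


def relative_hazard(patient, reference):
    """Hazard of `patient` relative to `reference`: exp(sum(beta_j * dx_j))."""
    log_hr = sum(
        math.log(hr) * (patient[name] - reference[name])
        for name, hr in HAZARD_RATIOS.items()
    )
    return math.exp(log_hr)


# HYPOTHETICAL patients; values and units are assumptions for illustration.
reference = {
    "bnp_grade": 1, "albumin": 35, "age": 70, "globulin": 28,
    "smoking_years": 30, "paco2": 45, "granulocyte_ratio": 75, "bun": 6,
}
older = dict(reference, age=reference["age"] + 10)

# Ten additional years of age multiply the hazard by 1.033 ** 10.
print(round(relative_hazard(older, reference), 3))
```

Because the baseline hazard cancels when two patients are compared, this ratio can be computed from the published hazard ratios alone. Absolute mortality predictions additionally require the baseline survival function, which is not reported in the text.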

Performance in the model cohort
The areas under the time-dependent receiver operating characteristic curves (AUROC) for 1, 3, and 5 years were 0.784, 0.801, and 0.806 in the model cohort, respectively (Fig. 4A). Survival curve analysis revealed that the mortality rate of patients in the high-risk group was significantly higher than that in the low-risk group in the model cohort (Fig. 4B). The calibration curves suggested good consistency between the predicted and actual mortality rates in the model cohort (Fig. 5).

Performance in the validation cohort
For the validation cohort, the AUROC values for 1, 3, and 5 years were 0.765, 0.779, and 0.798, respectively (Fig. 6A). Survival curve analysis revealed that the mortality rate of patients in the high-risk group was significantly higher than that in the low-risk group in the validation cohort (Fig. 6B). The calibration curves suggested good consistency between the predicted and actual mortality rates in the validation cohort (Fig. 7).

Online predictive tool
To help clinicians and COPD patients use the predictive model to predict the mortality risk curve of an individual COPD patient, an online predictive tool, the "Smart Survival Predictive System for COPD patients", was developed (https://zhangzhiqiao15.shinyapps.io/Smart_survival_predictive_system_for_COPD/). The user can freely set the values of the eight variables on the interactive webpage and then click the "prediction" button to obtain the mortality predictive curve for an individual COPD patient. A representative mortality risk predictive curve generated by the Smart Survival Predictive System for an individual COPD patient is shown in Fig. 8. In addition, the Smart Survival Predictive System for COPD patients can provide the predicted mortality rate and 95% CI at a specific time.
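How a Cox model yields an individual mortality curve of this kind can be sketched as follows. This is not the implementation of the online tool: the baseline survival values below are hypothetical placeholders (the fitted baseline is not reported in the text), the function name is invented for illustration, and the 95% CI that the tool reports is not reproduced.

```python
def mortality_curve(baseline_survival, relative_hazard):
    """Predicted cumulative mortality 1 - S0(t) ** relative_hazard per time point.

    baseline_survival: dict mapping time (years) to baseline survival S0(t).
    relative_hazard:   exp(linear predictor) of the patient versus baseline.
    """
    return {
        t: 1.0 - s0 ** relative_hazard
        for t, s0 in sorted(baseline_survival.items())
    }


# HYPOTHETICAL baseline survival at 1, 3, and 5 years after discharge.
baseline = {1: 0.90, 3: 0.70, 5: 0.50}

for label, rh in [("baseline patient", 1.0), ("higher-risk patient", 1.5)]:
    curve = mortality_curve(baseline, rh)
    summary = ", ".join(f"{t} y: {m:.1%}" for t, m in curve.items())
    print(f"{label}: {summary}")
```

A relative hazard above 1 shifts the whole curve upward, which is why two patients with different covariates receive different predicted mortality at every time point.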

DISCUSSION
The current study identified eight independent risk factors for the prognosis of COPD patients using random survival forest screening and multivariate Cox proportional hazards regression. Based on these eight risk factors, we constructed a prognostic model to predict mortality risk curves for individual COPD patients, and further developed a convenient web predictive tool. The AUROC and calibration curves suggested that the prognostic model demonstrated good predictive and discriminative ability in predicting the prognosis of COPD patients. High BUN level at admission was an independent risk factor for death among COPD patients (Chen et al., 2021). BUN was related to the severity of disease in patients with COPD and could be used to assess prognostic risk (Shorr et al., 2011). PaCO2 was an independent risk factor for death of hospitalized COPD patients (Hu et al., 2016). A higher PaCO2 level in COPD patients suggested a poorer prognosis than a lower PaCO2 level (Wen et al., 2014). Albumin and GLB could be used to evaluate the severity of disease and predict the risk for death in elderly patients with COPD (Qin et al., 2018).
Low albumin level was associated with poor 10-year survival in COPD patients (Tang et al., 2021). There was an independent correlation between albumin and the severity of illness in patients with COPD (Li et al., 2021). GLB level was associated with the severity of COPD and could be used to identify high-risk patients (Li et al., 2021). NT-proBNP may be a predictor of poor prognosis in COPD patients (Sánchez-Marteles et al., 2009). BNP was an independent risk factor for secondary pulmonary hypertension in COPD patients, suggesting a close relationship between BNP and the clinically poor prognosis of COPD patients (Yang et al., 2019). Age has been shown to be an independent risk factor and has been used to construct a prognostic model for COPD patients (Puhan et al., 2012). Age was a statistically significant independent risk factor for death among patients with COPD (Shorr et al., 2011). Smoking years has been used to predict the risk for deterioration of COPD patients (Bertens et al., 2013). A close relationship between smoke exposure and poor prognosis in COPD patients has been reported (Golpe et al., 2015). The neutrophil ratio has been used to predict in-hospital mortality of COPD patients (El-Gazzar et al., 2020). The neutrophil ratio was an independent risk factor for exacerbation in patients with COPD (Ye et al., 2019). The results of these previous clinical studies support the eight independent risk factors found in our study.
The current study had several strengths. First, it constructed an online tool to predict mortality risk curves for individual COPD patients, which could provide subject-level risk prediction for this patient population. Second, it constructed a predictive nomogram chart to predict the mortality risk of individual COPD patients at 1, 3, and 5 years using a Cox proportional hazards regression model. Third, the present investigation was a novel clinical study aiming to provide individualized predictive mortality risk curves for COPD patients, which offers a valuable avenue for individualized prognostic prediction research in patients with COPD.
Nevertheless, the present study also had some limitations. The first was the relatively small sample size (n = 1255), which may, to a certain extent, have affected the stability and reliability of the research results. As such, a larger sample size is necessary for future research to strengthen the reliability of the conclusions drawn. Second, because most patients enrolled in the current study were hospitalized with severe COPD, many did not undergo pulmonary function examination during hospitalization due to serious illness. As such, the final research data did not include real-time FEV1 data and several other parameters during hospitalization. Third, several important clinical indicators were not included in the current study (FEV1, modified Medical Research Council dyspnea scale, and 6 min walk distance test); thus, it is difficult to compare our results with several previously reported prognostic models. Future research should include these important clinical variables and perform a comprehensive comparison with previously reported prognostic models. Fourth, because all patients in the current study were from China, our predictive model is not necessarily generalizable to patients with COPD in other geographical regions. Future research cohorts from different regions will help clarify the clinical applicability of the current predictive model to populations across different regions.
In conclusion, the current study constructed a prognostic model for predicting individual mortality risk curves for COPD patients after discharge and provided a convenient online predictive tool. This prognostic predictive tool demonstrated good ability to discriminate between high- and low-risk patients, and can provide valuable prognostic information for clinical treatment decision making during hospitalization and health management after discharge.

ROC receiver operating characteristic
HR hazard ratio
CI confidence interval
GR granulocyte ratio
BUN blood urea nitrogen
ALB albumin
GLB globulin
BNP B-type natriuretic peptide

ADDITIONAL INFORMATION AND DECLARATIONS
Funding
The current research is a project supported by the Foshan Science and Technology Bureau (2020001005121). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.