Administrative and Claims Data Help Predict Patient Mortality in Intensive Care Units by Logistic Regression: A Nationwide Database Study

Background. The predictive power of different prognostic scoring systems has received increasing attention for decades. In this study, we compared the abilities of three commonly used scoring systems to predict short-term and long-term mortalities, with the intention of building a better prediction model for critically ill patients. We used data from the National Health Insurance Research Database (NHIRD) in Taiwan, which included information on patient age, comorbidities, and presence of organ failure, to build a new prediction model for short-term and long-term mortalities.
Methods. We retrospectively collected the medical records of patients in the intensive care unit of a regional hospital in 2012 and linked them to the claims data from the NHIRD. The Acute Physiology and Chronic Health Evaluation II (APACHE II) score, Elixhauser Comorbidity Index (ECI), and Charlson Comorbidity Index (CCI) were compared for their predictive abilities. Multiple logistic regression analyses were performed, and the results were presented as receiver operating characteristic curves and C-statistics.
Results. The APACHE II score had the best predictive power for inhospital mortality (C-statistic 0.79; 95% CI 0.77-0.83) and 1-year mortality (C-statistic 0.77; 95% CI 0.74-0.79). The ECI and CCI alone had poorer predictive power and needed to be combined with other variables to be comparable to the APACHE II score as predictive tools. A model combining the CCI with age, sex, and whether or not the patient required mechanical ventilation had an estimated C-statistic of 0.773 (95% CI 0.744-0.803) for inhospital mortality, 0.782 (95% CI 0.76-0.81) for 30-day mortality, and 0.78 (95% CI 0.75-0.80) for 1-year mortality.
Conclusions. We present a new prognostic model that combines the CCI with age, sex, and mechanical ventilation status and predicts mortality with a performance comparable to that of the APACHE II score.


Introduction
Intensive care units (ICUs) provide crucial medical care to critically ill patients. Owing to advances in diagnostics and therapeutics, human life expectancy has increased, leading to a growing demand for ICU care [1]. The increase in ICU care is accompanied by a growing use of risk assessment tools aimed at evaluating treatment [2], triaging patients [3], and achieving better resource allocation [4].
In the past decades, several risk scoring systems have been introduced [5]. These include the Acute Physiology and Chronic Health Evaluation (APACHE) II score [6], Charlson Comorbidity Index (CCI) [7][8][9], and Elixhauser Comorbidity Index (ECI) [10,11]. However, the predictive power of these scoring systems varies according to previous surveys, and how to choose the best system is still not clear.
This study compares the abilities of these three scoring systems to predict short-term and long-term mortalities by combining data from ICU medical records with claims data from Taiwan's National Health Insurance Research Database (NHIRD) [12]. In this study, we have examined and evaluated the different variables using multiple logistic regression tests to identify and compare the strongest predictors. We aim to improve the predictive power of the current scoring systems and to build a new prediction model for ICU mortality.

Methods
2.1. Data Acquisition and Extraction. All ICU admissions (n = 2201) in 2012 at the National Yang-Ming University Hospital, a regional hospital in Eastern Taiwan, were identified. The study was approved by the Institutional Review Board of the National Yang-Ming University Hospital (IRB number: 2014A021). Informed consent was obtained from the patients or their guardians for all data used in this study.
The enrolled patients were admitted to our ICU between January 1 and December 31, 2012. If a patient was admitted more than once during the study period, only the first admission was included, so that a small group of patients would not dominate the characteristics of the study population. As a result, 591 repeat admissions were excluded. Whether some of the remaining patients had been admitted to an ICU other than ours could not be determined because we lacked access to the medical records of other hospitals. We also excluded the following patients: (a) those who did not have Taiwanese citizenship, as noncitizens did not have the national identification number required to link the ICU medical records to the claims and mortality data of NHIRD; (b) those who were under 20 years of age; and (c) those whose data could not be linked to NHIRD because of administrative errors (n = 1). Finally, we enrolled 1608 patients in our study.
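The selection rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used: the record layout (`national_id`, `admit_date`, `age`, `linkable`) and the function name `select_cohort` are hypothetical, and the sample records are invented for demonstration.

```python
from datetime import date

def select_cohort(admissions, start, end, min_age=20):
    """Keep each patient's first ICU admission in the window, then apply exclusions.

    `admissions` is a list of dicts with hypothetical keys:
    national_id (None for noncitizens), admit_date, age, linkable.
    """
    first_by_patient = {}
    for adm in sorted(admissions, key=lambda a: a["admit_date"]):
        if not (start <= adm["admit_date"] <= end):
            continue  # outside the study period
        if adm["national_id"] is None:
            continue  # no national ID, so no linkage to claims data
        # setdefault keeps only the earliest admission per patient
        first_by_patient.setdefault(adm["national_id"], adm)
    return [a for a in first_by_patient.values()
            if a["age"] >= min_age and a["linkable"]]

# Invented records, for illustration only.
admissions = [
    {"national_id": "A1", "admit_date": date(2012, 3, 1), "age": 70, "linkable": True},
    {"national_id": "A1", "admit_date": date(2012, 1, 5), "age": 70, "linkable": True},
    {"national_id": "B2", "admit_date": date(2012, 2, 1), "age": 19, "linkable": True},
    {"national_id": None, "admit_date": date(2012, 4, 1), "age": 50, "linkable": False},
    {"national_id": "C3", "admit_date": date(2012, 6, 1), "age": 45, "linkable": False},
    {"national_id": "D4", "admit_date": date(2012, 7, 1), "age": 33, "linkable": True},
]
cohort = select_cohort(admissions, date(2012, 1, 1), date(2012, 12, 31))
cohort_ids = [a["national_id"] for a in cohort]
```

In this toy input, patient A1 contributes only the January admission, and B2 (under 20), the noncitizen, and C3 (unlinkable) are dropped.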
The ICU medical records were linked to the claims data of NHIRD from 2010 to 2013. The Taiwan NHI program is a public insurance system in which the enrollment is compulsory for Taiwanese citizens [12]. We used the patients' national identification numbers for linkage to the NHIRD   Table 1). The look-back period for comorbidities was 1 year before ICU admission. A 1-year look-back period is thought to improve the ability of a model to predict posthospitalization mortality according to previous studies [14,15]. A patient was considered to have a comorbid condition in a certain year if there were at least two claim records with an ICD-9 code for that condition during that year [15,16]. A higher CCI score indicates a higher number of comorbidities. See Appendix 1 for the ICD-9-CM codes for the different comorbidities [11].
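The two-claim rule within a 1-year look-back can be illustrated as follows. This is a sketch under assumptions: the look-back is taken as the 365 days before ICU admission, the function name and the claim layout (date, ICD-9 code string) are hypothetical, and the two-entry code map is an illustrative stand-in for the full code list in Appendix 1.

```python
from collections import Counter
from datetime import date, timedelta

def comorbidities_in_lookback(claims, admit_date, code_map,
                              lookback_days=365, min_claims=2):
    """Return conditions with at least `min_claims` matching claims in the look-back window."""
    window_start = admit_date - timedelta(days=lookback_days)
    counts = Counter()
    for claim_date, icd9 in claims:
        if not (window_start <= claim_date < admit_date):
            continue  # claim falls outside the look-back period
        for condition, prefixes in code_map.items():
            if icd9.startswith(prefixes):  # prefixes is a tuple of code prefixes
                counts[condition] += 1
    return {c for c, n in counts.items() if n >= min_claims}

# Illustrative mapping only; the authoritative code list is in Appendix 1.
code_map = {
    "diabetes": ("250",),
    "congestive_heart_failure": ("428",),
}
# Invented claims for one hypothetical patient.
claims = [
    (date(2011, 11, 2), "25000"),
    (date(2012, 1, 15), "25002"),
    (date(2011, 12, 1), "4280"),
    (date(2010, 5, 1), "4280"),   # too old to count
]
flags = comorbidities_in_lookback(claims, date(2012, 3, 1), code_map)
```

Here diabetes is flagged because two claims fall inside the window, whereas heart failure is not, because only one of its two claims is recent enough.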

Other Variables. The other variables evaluated in the study were age, sex, hemodialysis, surgery, number of outpatient and emergency department visits in the previous year, number of inpatient admissions in the previous year, admissions department, and use of a ventilator.

Outcomes. The primary outcome measured in this study was inhospital mortality, which was defined as death during the hospital stay; the censoring point was discharge from the hospital. The secondary outcomes were all-cause 30-day mortality (defined as death within 30 days after hospital discharge) and overall 1-year mortality (defined as death within 1 year after hospital discharge).
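The three outcome definitions can be expressed as indicator variables derived from the discharge and death dates. The sketch below rests on two assumptions that the text does not state: inhospital deaths are also counted toward the 30-day and 1-year outcomes, and 1 year is taken as 365 days. The function name `mortality_flags` is hypothetical.

```python
from datetime import date

def mortality_flags(discharge_date, death_date):
    """Return (inhospital, within_30_days, within_1_year) death indicators.

    `death_date` is None for patients with no recorded death.
    """
    if death_date is None:
        return (False, False, False)
    inhospital = death_date <= discharge_date
    days_after = (death_date - discharge_date).days
    # Assumption: inhospital deaths also count toward the later windows.
    return (inhospital,
            inhospital or days_after <= 30,
            inhospital or days_after <= 365)

discharge = date(2012, 3, 10)
died_in_hospital = mortality_flags(discharge, date(2012, 3, 10))
died_day_20 = mortality_flags(discharge, date(2012, 3, 30))
died_month_6 = mortality_flags(discharge, date(2012, 9, 1))
survived = mortality_flags(discharge, None)
```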

Statistical Analysis. Data were expressed as absolute numbers and percentages. The categorical variables of surviving and deceased subjects were compared using the chi-square test. To evaluate the risk of mortality, an odds ratio with a 95% confidence interval (CI) was determined for each variable via logistic regression analysis. We developed regression models to identify the strongest predictors of mortality, which were then entered into multiple logistic regression models to evaluate the overall model performance and to predict the risk of mortality. We computed the areas under receiver operating characteristic (ROC) curves (AUROCs) [17,18] as a measure of the ability of a model to predict mortality over different risk categories. The AUROC is often referred to as the concordance index (C-statistic) and ranges from 0.5 (no discrimination) to 1.0 (perfect discrimination), with values above 0.7, 0.8, and 0.9 considered reasonable, strong, and exceptional, respectively [18]. The discrimination performance of the CCI and ECI (with/without additional variables) was compared to that of the APACHE II score, which was used as the reference model. Differences in the AUROC between the fitted models were analyzed. Statistical analysis was performed using the SPSS statistical software (SPSS Inc., Chicago, IL, USA).
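The C-statistic has a direct interpretation: it is the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The following sketch computes it by pairwise comparison. It is an illustration only (the analysis in this study was run in SPSS), the function names are hypothetical, and the scores are invented.

```python
def c_statistic(scores, died):
    """Pairwise concordance: P(score of a death > score of a survivor), ties count 0.5."""
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p in deaths:
        for n in survivors:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))

def describe(c):
    """Map a C-statistic to the qualitative labels used in the text."""
    if c > 0.9:
        return "exceptional"
    if c > 0.8:
        return "strong"
    if c > 0.7:
        return "reasonable"
    return "below the 0.7 threshold"

# Invented severity scores and outcomes (1 = died), for illustration only.
scores = [25, 18, 30, 12, 18, 9]
died = [1, 0, 1, 0, 1, 0]
c = c_statistic(scores, died)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of this size; rank-based formulas give the same value more efficiently.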

Results
The characteristics of the patients (n = 1608), stratified by inhospital mortality, are summarized in Table 2. The inhospital mortality rate was 16.11%. Patients who died in hospital and those who survived differed significantly in terms of age, department, use and duration of the use of a ventilator, number of inpatient admissions, and CCI, ECI, and APACHE II scores. Tables 3 and 4 show the results of univariate analyses (odds ratios, 95% confidence intervals (CIs), and p values) assessing the association between mortality and the CCI and ECI, respectively, alone and in combination with other variables.
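For a single binary predictor, the odds ratio from a univariate logistic regression equals the cross-product ratio of the corresponding 2 x 2 table, and its Wald 95% CI can be computed from the four cell counts. The sketch below shows that calculation. The function name is hypothetical and the counts are invented for illustration; they are not taken from Tables 3 or 4.

```python
import math

def odds_ratio_wald(exposed_died, exposed_survived,
                    unexposed_died, unexposed_survived, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table."""
    a, b, c, d = exposed_died, exposed_survived, unexposed_died, unexposed_survived
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Invented counts, e.g. ventilated vs. non-ventilated patients.
or_est, ci_low, ci_high = odds_ratio_wald(40, 60, 20, 80)
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level.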
The ECI as the sole independent variable had a lower predictive power for inhospital, 30-day, and 1-year mortality than that of the APACHE II score and CCI. It had a C-statistic of 0.55 (95% CI: 0.52-0.59) for inhospital mortality, which increased slightly when age, sex, and mechanical ventilation were added to the model (ECI: MODEL 2). It was highest (0.78, 95% CI: 0.75-0.80) when all variables were included (ECI: MODEL 1). The ECI alone had a predictive power of 0.57 (95% CI: 0.54-0.61) for 30-day mortality and 0.62 (95% CI: 0.59-0.65) for 1-year mortality. Figures 1-3 present the analysis of the AUROC for the prediction of inhospital, 30-day, and 1-year mortality. Contrast tests revealed that the APACHE II score and CCI (CCI MODEL 1) predicted the three mortality outcomes comparably.

Discussion
This study investigated and compared the predictive power of the APACHE II score, CCI, and ECI for short-term and long-term mortalities in ICU patients. The impact of comorbidities on predictive ability was analyzed using administrative data, which are more accurate than are ICU medical records. Our results show that the CCI and ECI have less predictive power than does the APACHE II score. However, when the CCI was combined with age, sex, and ventilator use, its ability to predict mortality increased and was comparable to that of the APACHE II score. The ECI had a poorer predictive power than that of the CCI, even when combined with age, sex, and mechanical ventilation. However, further adding the number of outpatient visits, emergency department visits, and inpatient admissions during the past year improved its predictive power slightly.
The APACHE II score has been used worldwide for measuring ICU performance. The scoring system was validated and outlined in 1985 by Knaus et al. [6] and is still a popular prognosis evaluation tool in ICU settings [19]. The APACHE II score takes into account various parameters, including acute physiological variables and chronic health conditions, all of which have significant effects on the outcome prediction for ICU patients. The CCI was introduced in 1987 to predict 1-year mortality using comorbidity data from medical charts [7]. The ECI was developed using patient data from 438 acute care hospitals in California in 1992 [10], and its outcome measures were selected from those commonly available in administrative databases.
Previous studies have investigated the predictive power of the various scoring systems and their ability to determine mortality rates using administrative data. Quach et al. [20] compared the discriminative ability of the CCI and APACHE II score in predicting hospital mortality in adult multisystem ICU patients and found the former to be less effective. In their study, the CCI did not provide significantly better results even when adjusted for age, sex, and acute physiology score. However, its predictive power slightly improved when it was added to the full APACHE II model. Despite its underperformance, the CCI can be considered an alternative method of risk assessment when data for the variables included in the APACHE II score are unavailable or not recorded in a standard manner. Christensen et al. [21] studied 469 adult patients admitted to a tertiary university-affiliated ICU and found that there were no major differences in predictive power for mortality between physiology-based systems and the CCI combined with other administrative data. Fortin et al. [22] studied the predictive performance of the ECI for inhospital mortality in adult patients at a health center and found that it demonstrated excellent discrimination for all-cause inhospital mortality. Comorbidity indices and APACHE II scores have also been used to study the severity of and mortality risk adjustments for specific health conditions such as ischemic stroke [23], acute intracerebral hemorrhage [24], trauma [25], and cancer [26]. Previous studies that enrolled different types of patients have shown that the APACHE II score has the highest predictive power for mortality compared with other comorbidity indices [20,27,28]. These studies usually focused on short-term mortality, and studies on longer-term mortality are limited [29].
One possible reason for the significantly higher predictive power of the APACHE II score for mortality in ICU patients when compared to that of the CCI and ECI is that the acute physiology status is usually more critical in ICU patients than in other patients and varies significantly among patients. Our study results are similar to those of Ho et al. [27], who showed that replacing the chronic condition measures of the APACHE II score with those of the CCI or ECI did not significantly improve the mortality-predicting power of the APACHE II score. On the other hand, Quach et al. [20] found that if the CCI was combined with the APACHE II score, its predictive power increased from 0.626 to 0.74. Our study shows that acute physiology variables should not be replaced by other comorbidity measures when predicting mortality in ICU patients.
When comparing short-term and long-term mortalities, we found that the APACHE II score had a slightly better predictive power for short-term mortality, whereas the CCI had a higher predictive power for long-term mortality. However, the predictive power of the CCI for long-term mortality was still lower than that of the APACHE II score. The poorer performance of the CCI may be because of its coarse weights. If the comorbidity variables of the CCI were entered into a regression model as separate 0-1 binary indicators, the predictive ability of the CCI might improve. Like the CCI, the ECI also predicted long-term mortality better than short-term mortality. This supports our conclusion that acute physiology status is more important for short-term than for long-term mortality. While the APACHE II score was originally designed to measure the severity of the conditions in critical care patients, the CCI and ECI were not. This probably explains the higher predictive power of the APACHE II score in our study [21,28]. However, considering the time and cost involved in data collection, comorbidity measures derived from administrative data (such as those used in the CCI) may still have substantial advantages in terms of data accessibility [30].
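The difference between a weighted summary score and separate binary indicators can be made concrete with a small sketch. The weights below are a subset of the original Charlson weights, shown for illustration only; the condition names and function names are hypothetical, and the full weighting used in this study is the one given in Table 1.

```python
# Subset of the original Charlson weights, for illustration only.
CCI_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "hemiplegia": 2,
    "moderate_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
}

def cci_score(conditions):
    """Collapse a patient's comorbidities into a single weighted score."""
    return sum(CCI_WEIGHTS[c] for c in conditions if c in CCI_WEIGHTS)

def binary_indicators(conditions):
    """Encode each comorbidity as its own 0-1 covariate, in CCI_WEIGHTS order."""
    return [1 if name in conditions else 0 for name in CCI_WEIGHTS]

# Two hypothetical patients with different comorbidity profiles.
patient_a = {"myocardial_infarction", "hemiplegia"}
patient_b = {"moderate_severe_liver_disease"}

score_a, score_b = cci_score(patient_a), cci_score(patient_b)
vector_a, vector_b = binary_indicators(patient_a), binary_indicators(patient_b)
```

Both hypothetical patients receive the same summary score, so a model that uses only the score cannot distinguish them, whereas the indicator vectors differ and would allow a regression model to estimate a separate coefficient for each condition.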

Limitations
This study has some limitations. First, this study retrospectively collected patient information from a single regional hospital in Eastern Taiwan. Considering the limited study period and single geographic location, our findings cannot be extrapolated to other ICUs in Taiwan. A study design including different hospitals would, therefore, produce more reliable data. Second, our findings need to be validated by prospective analysis of subsequent ICU admissions. Finally, although the use of the NHIRD has advantages (e.g., large sample sizes, long observation periods, updated information, and easy access to different information sources), it also has some disadvantages, such as possible misclassification of diseases and difficulty in controlling confounding factors [31].

Conclusions
For ICU patients, the APACHE II score has the strongest predictive power for short-term mortality, followed in turn by the CCI and ECI. The ability of our new model, which combines the CCI with age, sex, and use of mechanical ventilation, to predict short-term and long-term mortality in ICU patients is comparable to that of the APACHE II score.