Comparison of the predictive value of scoring systems on the prognosis of cirrhotic patients with suspected infection

Abstract Cirrhotic patients with infection are prone to develop sepsis or even septic shock, which worsens prognosis. However, few methods are available to predict the prognosis of cirrhotic patients with infection, although several scoring systems can be used to predict outcomes in general patients with cirrhosis. Therefore, we aimed to explore the predictive value of scoring systems in determining the outcome of critically ill cirrhotic patients with suspected infection. This was a retrospective cohort study based on a single-center database. The prognostic accuracy of the systemic inflammatory response syndrome (SIRS) criteria, quick Sequential Organ Failure Assessment (qSOFA), chronic liver failure (CLIF)-SOFA, quick CLIF-SOFA (qCLIF-SOFA), CLIF-consortium organ failure (CLIF-C OF), Model for End-Stage Liver Disease (MELD), and Simplified Acute Physiology Score (SAPS) II was compared using the area under the receiver operating characteristic (AUROC) curve and the net benefit from decision curve analysis. The primary endpoint was in-hospital mortality, while the secondary endpoints were duration of hospital and intensive care unit (ICU) stay and ICU mortality. A total of 1438 cirrhotic patients with suspected infection were included in the study. About half of the patients (50.2%) were admitted to the ICU due to hepatic encephalopathy, and the overall in-hospital mortality was 32.0%. Hospital and ICU mortality increased as the score of each scoring system increased (P < .05 for all trends). The AUROCs of CLIF-SOFA (AUROC, 0.742; 95% confidence interval, CI, 0.714–0.770), CLIF-C OF (AUROC, 0.741; 95% CI, 0.713–0.769), and SAPS II (AUROC, 0.759; 95% CI, 0.733–0.786) were significantly higher than those of SIRS criteria (AUROC, 0.618; 95% CI, 0.590–0.647), qSOFA (AUROC, 0.612; 95% CI, 0.584–0.640), MELD (AUROC, 0.632; 95% CI, 0.601–0.662), or qCLIF-SOFA (AUROC, 0.680; 95% CI, 0.650–0.710) (P < .05 for all). In the decision curve analysis, the net benefit of implementing CLIF-SOFA and CLIF-C OF to predict the prognosis of cirrhotic patients with suspected infection was higher than that of SIRS, qSOFA, MELD, or qCLIF-SOFA. CLIF-SOFA and CLIF-C OF scores, as well as SAPS II, were better tools than SIRS, qSOFA, MELD, or qCLIF-SOFA to evaluate the prognosis of critically ill cirrhotic patients with suspected infection.


Introduction
Liver cirrhosis is a common disease, and decompensated cirrhosis is frequently complicated by infection. [1] Infection rates in hospitalized patients with cirrhosis are 4- to 5-fold higher than those in the general patient population and are associated with poor prognosis. [2] Because cirrhosis is commonly associated with immune dysfunction, cirrhotic patients who become infected are prone to develop severe sepsis and septic shock. [3] Despite advances in clinical care and treatment, the mortality rate of sepsis and septic shock has remained at 20% to 60%. [4] To stratify the risk of patients with sepsis more accurately, the Sepsis-3 definition was proposed in 2016; it recommended the Sequential Organ Failure Assessment (SOFA) score to define sepsis and the quick SOFA (qSOFA) to identify sepsis patients early in the ward and emergency department. [5] Thereafter, systemic inflammatory response syndrome (SIRS), SOFA, and qSOFA were validated for sepsis patients in the intensive care unit (ICU) and emergency department in different countries. [6][7][8] For general cirrhotic patients, on the other hand, liver-specific scores such as the Model for End-Stage Liver Disease (MELD), [9] chronic liver failure (CLIF)-SOFA, [10] and CLIF-consortium organ failure (CLIF-C OF) [11] have long been used to evaluate patients' outcomes. More recently, the quick CLIF-SOFA (qCLIF-SOFA) [12] showed good discriminative ability for the outcome of cirrhotic patients. However, for cirrhotic patients with suspected infection, there is no consensus on which of these scores best predicts prognosis. Hence, we sought to answer this question by comparing the prognostic accuracy of the abovementioned scoring systems using the MIMIC (Medical Information Mart for Intensive Care) III database.

Data source and extraction
This study was based on a publicly available ICU database named MIMIC-III (version 1.4), a large, single-center database containing information on more than 40,000 patients admitted to Beth Israel Deaconess Medical Center (a teaching hospital of Harvard Medical School in Boston, MA) from 2001 to 2012. The database contains general information, treatment processes, and survival data. Access to the MIMIC-III database for research was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center after completion of the NIH web-based course named "Protecting Human Research Participants." Since all patients were de-identified, informed consent was waived. Data were extracted from MIMIC-III using structured query language (SQL) with pgAdmin4 PostgreSQL 9.6.
Table 1 Components of the seven scoring systems.

Inclusion criteria and definitions
Patients with liver cirrhosis and suspected infection were included. Cirrhotic patients were extracted according to International Classification of Diseases (ICD)-9 codes (5712, 5715, and 5716, indicating "alcoholic cirrhosis of liver," "cirrhosis of liver without mention of alcohol," and "biliary cirrhosis," respectively). Of the cirrhotic patients, those with suspected infection were extracted if one of the following criteria was fulfilled: the ICD-9 diagnosis contained any of the terms "infection," "pneumonia," "meningitis," "peritonitis," "bacteremia," "sepsis," or "septic"; or a microbiological culture was positive. [13] Of all the cirrhotic patients, those diagnosed with hepatic encephalopathy (HE) were identified by ICD-9 code 5722. According to previous studies, [14,15] the Glasgow Coma Scale (GCS) is associated with the West Haven grade to some degree. Therefore, in this study, HE patients with a GCS of 15 and of 3 to 5 were categorized as grade 1 and 4, respectively, while those with a GCS of 13 to 14 and of 6 to 12 fell into grade 2 and 3, respectively. Demographic, laboratory, and clinical data on ICU admission were collected, including age, gender, ethnicity, admission location, Simplified Acute Physiology Score (SAPS) II, comorbidities, complications of cirrhosis, in-hospital and ICU outcomes, and selected laboratory and clinical parameters. Prognostic scoring systems including SAPS II, CLIF-SOFA, SIRS, qSOFA, qCLIF-SOFA, MELD, and CLIF-C OF were calculated for all patients (Table 1). [5,10,11,12,16,17] CLIF-SOFA and CLIF-C OF share the same 6 components: bilirubin, kidney, HE grade, international normalized ratio (INR), circulation, and lungs. However, the subscore of each component ranges from 0 to 4 for CLIF-SOFA but from 1 to 3 for CLIF-C OF. [10,11] The qCLIF-SOFA includes bilirubin, creatinine, INR, mean arterial pressure, and vasopressin usage, each with a subscore of 0 or 1. [12] The recently proposed qSOFA score contains respiratory rate, mentation status, and systolic blood pressure. [5]
The primary endpoint was in-hospital mortality. The secondary endpoints included ICU mortality and the length of stay in the ICU and hospital.
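The cohort definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the extraction code used in the study: the function names, the input shapes (a list of ICD-9 diagnosis titles and a boolean culture flag), and the case-insensitive substring match are hypothetical choices, and the GCS mapping is meant for patients already flagged with HE.

```python
# ICD-9 title keywords listed in the text for suspected infection.
INFECTION_TERMS = ("infection", "pneumonia", "meningitis", "peritonitis",
                   "bacteremia", "sepsis", "septic")


def has_suspected_infection(icd9_titles, positive_culture):
    """Return True if any diagnosis title contains a listed term
    or a microbiological culture was positive.

    Assumption: titles are plain strings and matching is case-insensitive.
    """
    text_hit = any(term in title.lower()
                   for title in icd9_titles
                   for term in INFECTION_TERMS)
    return text_hit or positive_culture


def he_grade_from_gcs(gcs):
    """Map a Glasgow Coma Scale total (3-15) to the HE grade used in the text:
    15 -> grade 1, 13-14 -> grade 2, 6-12 -> grade 3, 3-5 -> grade 4.
    """
    if not 3 <= gcs <= 15:
        raise ValueError("GCS total must be between 3 and 15")
    if gcs == 15:
        return 1
    if gcs >= 13:
        return 2
    if gcs >= 6:
        return 3
    return 4
```

For example, `he_grade_from_gcs(14)` falls in the 13 to 14 band and returns grade 2, while a patient whose only diagnosis title is "Septic shock" is flagged by `has_suspected_infection` even without a positive culture.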

Statistical analysis
The Kolmogorov-Smirnov test and histograms were used to assess the normality of the distribution of quantitative variables. Normally distributed quantitative variables were presented as mean ± SD, while skewed variables were summarized as median and interquartile range (IQR). Categorical variables were compared using the χ² test or Fisher exact test. Quantitative variables were compared using analysis of variance (ANOVA) or t test for normally distributed data and the Kruskal-Wallis test or Mann-Whitney test for non-normally distributed data.
The predictive accuracy of each score was determined by comparing the area under the receiver operating characteristic (AUROC) curve. Clinical significance and net benefit were estimated by decision curve analysis. Decision curve analysis was first proposed in 2006 and has been used in studies published in the Lancet [18] and Journal of Clinical Oncology. [19] The decision curve is a complement to the ROC curve for a risk model. It is more informative than an ROC curve because the true- and false-positive fractions are displayed as functions of the risk threshold, whereas the risk threshold is suppressed in the ROC curve. [20] All analyses were performed using R 3.3.3 (http://www.r-project.org/), and a P-value less than .05 was considered statistically significant.
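To make the notion of net benefit concrete, the following minimal Python sketch computes it at a single risk threshold using the standard decision curve formula (true-positive fraction minus false-positive fraction weighted by the odds of the threshold). It is an illustration only: the study's analyses were run in R, and the function names and the toy inputs in the usage note are hypothetical.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of classifying patients with predicted risk >= threshold
    as high risk.

    y_true: 0/1 outcomes (1 = died); risk: predicted probabilities of death.
    Formula: TP/n - FP/n * threshold / (1 - threshold).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every patient as high risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

With the toy data `y_true = [1, 1, 0, 0]` and `risk = [0.9, 0.6, 0.7, 0.2]`, a threshold of 0.5 gives 2 true positives and 1 false positive, so the net benefit is 2/4 − (1/4 × 1) = 0.25, against 0 for the treat-all strategy. Evaluating `net_benefit` over a grid of thresholds and plotting the results yields a decision curve.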

Basic characteristics
A total of 1438 ICU patients with cirrhosis and suspected infection were included in the present study (Table 2). Of them, about 64% were male and nearly half were aged over 60 years. In the cohort, the median SAPS II was 56, the in-hospital mortality was 32.0%, and about half of the patients (50.2%) were admitted to the ICU for hepatic encephalopathy. Most of the cirrhotic patients had sepsis according to the Sepsis-3 definition. Approximately 30% of patients had diabetes mellitus, and 29% had alcohol abuse or dependence. With respect to laboratory data, white blood cell (WBC) count, creatinine, lactate, INR, partial thromboplastin time, total bilirubin, and aspartate aminotransferase were significantly higher in non-survivors than in survivors (P < .001 for all).
The incidence of respiratory, bloodstream, and peritoneal infections was significantly higher among non-survivors (P < .05 for all). The causative agents associated with worse outcome were predominantly fungi (P < .001), for example, Candida albicans (P = .013) and Aspergillus (P = .001) (Table 3).
Table 4 Primary and secondary endpoints based on the score of each scoring system.

Outcomes of all patients
The distribution of each scoring system and its association with in-hospital mortality are shown in Figure 1. As expected, in-hospital and ICU mortality increased as the score of each scoring system increased (P < .05 for all trends) (Table 4).

Prognostic accuracy of scoring systems
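The predictive value for in-hospital mortality was significantly higher for CLIF-SOFA (AUROC, 0.742), CLIF-C OF (AUROC, 0.741), and SAPS II (AUROC, 0.759) than for SIRS criteria (AUROC, 0.618), qSOFA (AUROC, 0.612), MELD (AUROC, 0.632), or qCLIF-SOFA (AUROC, 0.680) (P < .05 for all) (Table 5).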
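The AUROC used to compare the scores can be read as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half. The short Python sketch below computes it by direct pairwise comparison. It is a hedged illustration rather than the study's R code, the function name is hypothetical, and it does not reproduce the confidence intervals or the significance tests between curves.

```python
def auroc(y_true, scores):
    """AUROC via pairwise comparison (equivalent to the Mann-Whitney U
    statistic divided by the number of positive-negative pairs).

    y_true: 0/1 outcomes (1 = died); scores: severity score per patient.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    # A pair counts 1 if the non-survivor scores higher, 0.5 on a tie.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

For instance, with outcomes `[1, 1, 0, 0]` and scores `[3, 1, 2, 0]`, three of the four non-survivor/survivor pairs are ordered correctly, giving an AUROC of 0.75. The quadratic loop is acceptable for a cohort of this size (1438 patients).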
To facilitate the comparison of clinical significance among the different scoring systems, decision curve analysis was performed. In the present analysis, models based on CLIF-SOFA, CLIF-C OF, and SAPS II had higher net benefit than models based on SIRS, qSOFA, MELD, or qCLIF-SOFA across a wide range of decision threshold probabilities (approximately 10%-70% risk of death) (Fig. 3 and Table 6). SIRS and qSOFA appeared to produce no net benefit once the probability of death exceeded 40%. However, at extremely high risks of death (> 70%), models based on CLIF-SOFA and CLIF-C OF showed poor predictive value with negative net benefit (Fig. 3).

Discussion
Bacterial infections are responsible for the death of 30% to 50% of cirrhotic patients. [21] Thus, an optimal prognostic score helps ICU physicians to identify patients at high risk of death early and to intervene in a timely manner. In this study, we compared the predictive value of seven scoring systems for the prognosis of critically ill cirrhotic patients with suspected infection. We confirmed that CLIF-SOFA, CLIF-C OF, and SAPS II had superior prognostic value for in-hospital mortality compared with SIRS criteria, qSOFA, MELD, or qCLIF-SOFA. However, at extremely high risks of death, CLIF-SOFA and CLIF-C OF scores showed poor predictive ability.
Several scores have been developed to evaluate the severity of cirrhosis, mainly focusing on the loss of liver function and its complications. Previous studies showed that the MELD score had better accuracy than the Child-Turcotte-Pugh (CTP) score, which was applied for the allocation of donor livers, in predicting 3-month mortality among patients with chronic liver disease who needed liver transplantation. [16] However, a negative value of MELD scores may not always be favorable. Later, the Royal Free Hospital (RFH) score was proposed in an observational study including 312 cirrhotic patients in the ICU, and it showed discriminative ability similar to that of SOFA and CLIF-SOFA and better than that of the MELD score. [22] Another score, the modified score for critically ill cirrhosis (MSCIC), which includes prothrombin time, bilirubin, vasopressin usage, HE, and SIRS, was developed and shown to be superior to CTP, MELD, Acute Physiology and Chronic Health Evaluation (APACHE) II, and CLIF-SOFA. [23] However, the SIRS criteria within MSCIC were modified, and the score was too complex to calculate. The prognostic accuracy of liver-specific scoring systems was validated and supported by the present study. CLIF-SOFA, a modified SOFA score, was customized for chronic hepatic disease and includes more hepatic components than SOFA. CLIF-C OF is a simplified version of CLIF-SOFA and performed similarly to CLIF-SOFA in a previous study, [24] which is consistent with our results. Bilirubin, HE, and INR, which are included in CLIF-SOFA and CLIF-C OF, were significantly associated with the prognosis of cirrhotic patients and were recommended as predictors of outcome. [25] In this sense, CLIF-SOFA and CLIF-C OF are more appropriate for patients with chronic liver disease than SIRS or qSOFA.
Table 5 Prognostic accuracy of SIRS criteria, qSOFA, MELD, CLIF-SOFA, CLIF-C OF, qCLIF-SOFA, and SAPS II among critically ill cirrhotic patients with suspected infection.
A large study had previously shown that SIRS was not a reliable score for predicting ICU mortality and had poor ability to define a transition point in the risk of death. [26] Some authors argued that the SIRS criteria were too sensitive and not specific enough, so that the severity of illness was overestimated. [27] As our analysis indicated, SIRS showed nearly no prognostic value when the risk probability surpassed 40%. Similar to the SIRS criteria, qSOFA, which has no laboratory measurements and was originally developed for bedside use, also failed to predict and discriminate patients at high risk. Although it showed great prognostic value among patients with suspected infection outside of the ICU, [7,28] qSOFA may be inappropriate for severely ill patients. Since lactate has long been regarded as a predictor for sepsis or septic shock, previous studies found that adding lactate levels to qSOFA significantly increased its predictive value. [29] To improve the predictive validity of qSOFA, some authors even recommended replacing mentation status with lactate levels, because the assessment of altered mentation is potentially biased by the physician's judgment and is thus difficult to validate. [28] Nevertheless, this may raise the concern that the "quick" SOFA would no longer be quick. Inspired by qSOFA, some authors argued that CLIF-SOFA was too time-consuming and proposed qCLIF-SOFA. [12] Compared with CLIF-SOFA, HE grade was excluded and the subscore of each component was just 0 or 1. Although they found that the simpler qCLIF-SOFA had accuracy comparable to that of CLIF-SOFA for predicting 28- and 90-day mortality, the results were not verified in populations outside the ICU, such as the emergency department. Moreover, risk factors and severe complications such as HE were not taken into account, which may result in missed diagnoses.
Table 6 Net benefit, true- and false-positive rate of using SIRS, qSOFA, MELD, CLIF-SOFA, CLIF-C OF, qCLIF-SOFA, and SAPS II on different decision thresholds.
It is worth mentioning that 2 of the 5 components within qCLIF-SOFA are circulation-related (mean arterial pressure and vasopressin application), which may cause some overlap and underestimate the illness of the liver itself. However, as a classical severity score in the ICU, [17] SAPS II retained better predictive value than MELD, which was consistent with a recent observational study. [23] Although we support the use of CLIF-SOFA and CLIF-C OF as screening tools for the in-hospital mortality of cirrhotic patients with suspected infection, both showed poor prognostic value for patients at high risk of death (above a 70% risk of death in the present analysis, as shown in Fig. 3). This probably explains why the AUROCs of CLIF-SOFA and CLIF-C OF were marginally smaller than those reported in previous studies. [23,30,31] Jalan et al [11] suggested adding age and log-transformed white blood cell count to CLIF-C OF to produce a specific prognostic score for acute-on-chronic liver failure (ACLF) named the CLIF Consortium ACLF score (CLIF-C ACLFs). The CLIF-C ACLFs showed good discriminative ability at ACLF diagnosis and proved to have potential use in stratifying the risk of death in ACLF patients. Recently, authors found that the discriminative ability of CLIF-SOFA improved when temperature was incorporated into the original CLIF-SOFA. [32] This indicated that inflammatory factors had a great impact on mortality and could complement the prognostic accuracy of CLIF-SOFA or CLIF-C OF.
There are several limitations to the present study. First, as a retrospective cohort study, it has inherent limitations. For example, we could not avoid heterogeneity because the cohort included cirrhotic patients of various etiologies. Second, HE grade was only roughly defined according to the Glasgow Coma Scale, because data on altered mentation and nervous reflexes were limited and other items such as electroencephalogram were not available. Third, the study was based on a single-center database, which may raise concerns about selection bias and the generalizability of the conclusions.
In summary, although there is considerable discussion and controversy about the predictive value of the various prognostic scores, the CLIF-SOFA, CLIF-C OF, and SAPS II scoring systems are currently the best available tools to predict the prognosis of critically ill cirrhotic patients with suspected infection. However, large multicenter prospective studies are needed to improve the predictive ability of scoring systems.