Child–Pugh Versus MELD Score for the Assessment of Prognosis in Liver Cirrhosis

Supplemental Digital Content is available in the text


INTRODUCTION
Liver cirrhosis carries high morbidity and mortality. It is the 14th most common cause of death worldwide and the 4th in central Europe, leading to 1.03 million deaths per year worldwide 1 and 170,000 deaths per year in Europe. 2 The prevalence of liver cirrhosis may be underestimated, because patients in the early phase are often asymptomatic and most patients with liver cirrhosis are admitted because of related complications. The 1-year mortality of liver cirrhosis varies widely, from 1% to 57%, according to the complications present. 3 Prognostic models are therefore needed to identify high-risk patients.
The Child-Pugh score was first proposed by Child and Turcotte to predict operative risk in patients undergoing portosystemic shunt surgery for variceal bleeding. The original version included ascites, hepatic encephalopathy (HE), nutritional status, total bilirubin, and albumin. Pugh et al 4 modified the classification by adding prothrombin time or international normalized ratio (INR) and removing nutritional status. The Child-Pugh score has been widely used to assess the severity of liver dysfunction in clinical practice.
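As an illustration of how the 5 components combine, the following is a minimal sketch of a Child-Pugh calculation. The cut-off values and the class boundaries (A: 5-6, B: 7-9, C: 10-15) are the commonly cited ones and are not stated in this text; the function name `child_pugh` and the category labels are hypothetical.

```python
# Commonly cited point assignments (assumption: not taken from this paper).
ASCITES_POINTS = {"none": 1, "mild": 2, "moderate_to_severe": 3}
HE_POINTS = {"none": 1, "grade_1_2": 2, "grade_3_4": 3}


def child_pugh(bilirubin_mg_dl, albumin_g_dl, inr, ascites, encephalopathy):
    """Return (total points, class) for one patient."""
    points = 0
    # Total bilirubin: <2, 2-3, >3 mg/dL
    points += 1 if bilirubin_mg_dl < 2 else 2 if bilirubin_mg_dl <= 3 else 3
    # Albumin: >3.5, 2.8-3.5, <2.8 g/dL
    points += 1 if albumin_g_dl > 3.5 else 2 if albumin_g_dl >= 2.8 else 3
    # INR: <1.7, 1.7-2.3, >2.3
    points += 1 if inr < 1.7 else 2 if inr <= 2.3 else 3
    # Ascites and HE are graded clinically (the subjective components).
    points += ASCITES_POINTS[ascites]
    points += HE_POINTS[encephalopathy]
    grade = "A" if points <= 6 else "B" if points <= 9 else "C"
    return points, grade


# Hypothetical patient, for illustration only.
print(child_pugh(2.5, 3.0, 1.1, "none", "none"))
```

The two dictionary lookups make the dependence on clinical grading explicit: the same laboratory values can yield a different class if ascites or HE is graded differently.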
The model for end-stage liver disease (MELD) score was initially created to predict survival in patients undergoing transjugular intrahepatic portosystemic shunts (TIPS). 5 The original version included the etiology of liver cirrhosis, but this variable was shown to be unnecessary. 6 The present version incorporates only 3 objective variables: total bilirubin, creatinine, and INR. It is currently used to rank the priority of liver transplantation (LT) candidates.
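For reference, the 3-variable MELD score can be sketched as follows. The coefficients are those of the widely cited formula, which is not reproduced in this text; flooring each input at 1.0 and capping creatinine at 4.0 mg/dL follow the usual allocation convention and are assumptions here, as is the function name `meld`.

```python
import math


def meld(bilirubin_mg_dl, creatinine_mg_dl, inr):
    """Widely cited 3-variable MELD formula (assumed, not from this paper)."""
    # Values below 1.0 are set to 1.0 so that no logarithm is negative.
    bilirubin = max(bilirubin_mg_dl, 1.0)
    inr = max(inr, 1.0)
    # Creatinine is floored at 1.0 and capped at 4.0 mg/dL by convention.
    creatinine = min(max(creatinine_mg_dl, 1.0), 4.0)
    score = (
        3.78 * math.log(bilirubin)
        + 11.2 * math.log(inr)
        + 9.57 * math.log(creatinine)
        + 6.43
    )
    return round(score)


# Hypothetical laboratory values, for illustration only.
print(meld(bilirubin_mg_dl=2.0, creatinine_mg_dl=2.0, inr=2.0))
```

Unlike the Child-Pugh score, every input is a laboratory value, so the result does not depend on clinical grading of ascites or HE.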
Child-Pugh and MELD scores have been widely used to predict the outcomes of cirrhotic patients. However, they have some drawbacks. First, 2 variables included in the Child-Pugh score (i.e., ascites and HE) are subjective and may vary according to the physician's judgment and the use of diuretics and lactulose. Second, INR, which is a component of both Child-Pugh and MELD scores, does not sufficiently reflect coagulopathy and consequently liver function in

Description of Statistical Results
The statistical results are summarized in Table 2. There were 269 comparisons between MELD and Child-Pugh scores. In 60 comparisons, a statistically significant difference (P < 0.05) was observed: MELD score was superior to Child-Pugh score in 44 comparisons, and Child-Pugh score was superior to MELD score in 16 comparisons. In 99 comparisons, no statistically significant difference (P ≥ 0.05) was observed. In 110 comparisons, statistical significance was not reported.
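The counts above can be checked and expressed as proportions with a short tally. This is only an arithmetic sketch using the numbers reported in this paragraph; the variable names are illustrative.

```python
# Counts as reported in the text.
comparisons = {"significant": 60, "not_significant": 99, "not_reported": 110}
favouring = {"MELD": 44, "Child-Pugh": 16}

total = sum(comparisons.values())

# The three categories should account for all 269 comparisons, and the
# two directions of superiority for all 60 significant comparisons.
assert total == 269
assert sum(favouring.values()) == comparisons["significant"]

for label, count in comparisons.items():
    print(f"{label}: {count}/{total} = {100 * count / total:.1f}%")
for score, count in favouring.items():
    share = 100 * count / comparisons["significant"]
    print(f"significant, favouring {score}: {count} ({share:.1f}%)")
```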

Study Quality
A brief summary of study quality is presented in Table 3. Regarding the risk of bias, 48 and 71 studies had low and unclear risks in terms of patient selection, respectively; 119 studies had low risks in terms of index tests; 117 and 2 studies had low and unclear risks in terms of reference standard, respectively; and 91 and 28 studies had low and unclear risks in terms of flow and timing, respectively. Regarding applicability concerns, 94 and 25 studies had low and high concerns in terms of patient selection, respectively; 2, 1, and 116 studies had low, unclear, and high concerns in terms of index test, respectively; and 1 and 118 studies had low and high concerns in terms of reference standard, respectively.

Subgroup Analysis According to the Clinical Presentations
Two studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score in patients with ACLF. 40,119 The mean AUSROC of MELD score was larger than that of Child-Pugh score. There was no statistically significant diagnostic threshold effect in the meta-analysis of Child-Pugh or MELD score. The 95%CIs of the DORs, NLRs, and PLRs overlapped between the 2 scores, but the 95%CIs of the sensitivities and specificities did not. Child-Pugh score had a higher summary sensitivity than MELD score, whereas MELD score had a higher summary specificity than Child-Pugh score.
Four studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score in patients with UGIB. 84,94,109,117 The mean AUSROC of MELD score was larger than that of Child-Pugh score. There was a statistically significant diagnostic threshold effect in the meta-analysis of MELD score. Thus, the DOR, NLR, PLR, sensitivity, and specificity of MELD score were not calculated.
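The accuracy measures compared in these subgroup analyses (sensitivity, specificity, PLR, NLR, and DOR) all derive from a 2×2 table of predicted versus observed outcomes. The sketch below shows the standard definitions for a single study; the function name `diagnostic_metrics` and the example counts are hypothetical and are not data from the included studies. It does not perform the pooling used in the meta-analysis.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from one 2x2 table.

    tp/fn: patients with the endpoint who were / were not flagged by the score
    fp/tn: patients without the endpoint who were / were not flagged
    Assumes no zero denominators (no continuity correction is applied).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                         # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PLR": plr,
        "NLR": nlr,
        "DOR": dor,
    }


# Hypothetical counts, for illustration only.
example = diagnostic_metrics(tp=40, fp=20, fn=10, tn=80)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

A smaller NLR and a higher sensitivity move together, since the NLR numerator is 1 minus the sensitivity.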

Subgroup Analysis According to the Etiology of Liver Diseases
Two studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score in patients with liver cirrhosis related to alcohol alone. 19,61 The mean AUSROC of Child-Pugh score was larger than that of MELD score. There was no statistically significant diagnostic threshold effect in the meta-analysis of Child-Pugh or MELD score. The 95%CIs of the DORs, NLRs, PLRs, sensitivities, and specificities overlapped between the 2 scores.
Two studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score in patients with liver cirrhosis related to hepatitis B virus alone. 56,119 The mean AUSROC of MELD score was larger than that of Child-Pugh score. There was a statistically significant diagnostic threshold effect in the meta-analysis of MELD score. Thus, the DOR, NLR, PLR, sensitivity, and specificity of MELD score were not calculated.

Subgroup Analysis According to the Patients' Conditions
Six studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score in patients admitted to the ICU. 42,80,107,108,110,112 The mean AUSROC of MELD score was larger than that of Child-Pugh score. There was no statistically significant diagnostic threshold effect in the meta-analysis of Child-Pugh or MELD score. The 95%CIs of the DORs, PLRs, and specificities overlapped between the 2 scores, but the 95%CIs of the NLRs and sensitivities did not. MELD score had a smaller summary NLR and a higher summary sensitivity than Child-Pugh score.
Four studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score in LT candidates. 48,67,87,115 The mean AUSROC of MELD score was larger than that of Child-Pugh score. There was no statistically significant diagnostic threshold effect in the meta-analysis of Child-Pugh or MELD score. The 95%CIs of the DORs, NLRs, PLRs, sensitivities, and specificities overlapped between the 2 scores.

Subgroup Analysis According to the Treatment Options
Five studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score in patients who underwent surgery. 17,32,52,104,111 The mean AUSROC of Child-Pugh score was larger than that of MELD score. There was no statistically significant diagnostic threshold effect in the meta-analysis of Child-Pugh or MELD score. The 95%CIs of the DORs, NLRs, PLRs, and sensitivities overlapped between the 2 scores, but the 95%CIs of the specificities did not. Child-Pugh score had a higher summary specificity than MELD score.
Two studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score in patients who underwent TIPS. 11,91 Because only 2 comparisons were eligible for the subgroup meta-analysis, the mean AUSROCs of Child-Pugh and MELD scores could not be calculated. The 95%CIs of the DORs, NLRs, PLRs, sensitivities, and specificities overlapped between the 2 scores.

Subgroup Analysis According to the Endpoints
Five studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score for predicting in-hospital mortality. 62,84,110–112 The mean AUSROC of MELD score was larger than that of Child-Pugh score. There was a statistically significant diagnostic threshold effect in the meta-analysis of Child-Pugh score. Thus, the DOR, NLR, PLR, sensitivity, and specificity of Child-Pugh score were not calculated.
Eight studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score for predicting 3-month mortality. 11,19,32,74,91,94,117,119 The mean AUSROC of MELD score was larger than that of Child-Pugh score. There were statistically significant diagnostic threshold effects in the meta-analyses of both Child-Pugh and MELD scores. Thus, the DORs, NLRs, PLRs, sensitivities, and specificities of Child-Pugh and MELD scores were not calculated.
Seven studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score for predicting 6-month mortality. 19,24,25,56,67,76,127 The mean AUSROC of MELD score was larger than that of Child-Pugh score. There was a statistically significant diagnostic threshold effect in the meta-analysis of Child-Pugh score. Thus, the DOR, NLR, PLR, sensitivity, and specificity of Child-Pugh score were not calculated.
Eight studies were eligible for the subgroup meta-analysis comparing the diagnostic accuracy of Child-Pugh versus MELD score for predicting 12-month mortality. 13,24,61,65,77,94,117,127 The mean AUSROC of Child-Pugh score was larger than that of MELD score. There was no statistically significant diagnostic threshold effect in the meta-analysis of Child-Pugh or MELD score. The 95%CIs of the DORs, NLRs, PLRs, sensitivities, and specificities overlapped between the 2 scores.

DISCUSSION
To our knowledge, this is the most comprehensive review to evaluate the diagnostic accuracy of Child-Pugh and MELD scores in patients with liver cirrhosis. Several narrative reviews of their prognostic value have previously been published by leading experts. 129–131 By comparison, our study employed a systematic search strategy to maximize the number of relevant papers. Additional strengths were as follows: the study and patient characteristics were systematically analyzed; the study quality was carefully evaluated; the clinical significance of Child-Pugh and MELD scores was further subdivided according to the study population; and meta-analysis was employed to synthesize the statistical results. The main findings are summarized as follows.
First, in patients with ACLF, Child-Pugh score had a significantly higher sensitivity than MELD score, because the 95%CIs did not overlap and the lower limit of the 95%CI of Child-Pugh score was higher than the upper limit of the 95%CI of MELD score (0.73 > 0.71). By contrast, MELD score had a significantly higher specificity than Child-Pugh score, because the 95%CIs did not overlap and the lower limit of the 95%CI of MELD score was higher than the upper limit of the 95%CI of Child-Pugh score (0.70 > 0.58). These findings suggested that Child-Pugh score might better identify patients with ACLF who will develop an endpoint event, and that MELD score might better identify those who will remain free of endpoint events.
Second, in patients admitted to the ICU, MELD score had a significantly smaller NLR than Child-Pugh score, because the 95%CIs did not overlap and the upper limit of the 95%CI of MELD score was smaller than the lower limit of the 95%CI of Child-Pugh score (0.35 < 0.36). MELD score also had a significantly higher sensitivity than Child-Pugh score, because the 95%CIs did not overlap and the lower limit of the 95%CI of MELD score was higher than the upper limit of the 95%CI of Child-Pugh score (0.76 > 0.71). These findings suggested that MELD score might better identify patients in the ICU who will develop an endpoint event.
Third, in patients undergoing surgery, Child-Pugh score had a significantly higher specificity than MELD score, because the 95%CIs did not overlap and the lower limit of the 95%CI of Child-Pugh score was higher than the upper limit of the 95%CI of MELD score (0.79 > 0.73). This finding suggested that Child-Pugh score might better identify surgical patients who will remain free of endpoint events.
Fourth, Child-Pugh and MELD scores had statistically similar discriminative abilities in some subgroups (i.e., patients with liver cirrhosis related to alcohol alone, LT candidates, patients undergoing TIPS, and 12-month mortality as the endpoint).
Fifth, because of statistically significant diagnostic threshold effects, the DORs, NLRs, PLRs, sensitivities, and specificities could not be compared in some subgroups (i.e., patients with acute gastrointestinal bleeding, patients with liver cirrhosis related to hepatitis B virus alone, in-hospital mortality as the endpoint, 3-month mortality as the endpoint, and 6-month mortality as the endpoint).
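The non-overlap criterion behind the first 3 findings can be written as a simple comparison of interval limits. The sketch below uses only the limits quoted in the text; the function name `significantly_higher` and the tuple layout are illustrative, and comparing 95%CIs in this way is a conservative rule rather than a formal test.

```python
def significantly_higher(lower_limit_a, upper_limit_b):
    """True when interval A lies entirely above interval B."""
    return lower_limit_a > upper_limit_b


# (subgroup, measure, score with the higher value,
#  lower limit of its 95%CI, upper limit of the other score's 95%CI)
reported = [
    ("ACLF", "sensitivity", "Child-Pugh", 0.73, 0.71),
    ("ACLF", "specificity", "MELD", 0.70, 0.58),
    # For NLR a smaller value is better: Child-Pugh's interval lies above
    # MELD's, so MELD has the significantly smaller NLR.
    ("ICU", "NLR", "Child-Pugh", 0.36, 0.35),
    ("ICU", "sensitivity", "MELD", 0.76, 0.71),
    ("surgery", "specificity", "Child-Pugh", 0.79, 0.73),
]

for subgroup, measure, higher, lower_a, upper_b in reported:
    ok = significantly_higher(lower_a, upper_b)
    print(f"{subgroup}, {measure}: {higher} higher ({lower_a} > {upper_b}): {ok}")
```

Some of these margins are narrow (for example, 0.36 versus 0.35 for the NLR in ICU patients), so the conclusion is sensitive to rounding of the reported limits.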
Our study had 2 major limitations. First, although a large number of papers were included in the systematic review, not all included studies were eligible for our meta-analysis. Additionally, in some subgroup analyses, the DORs, NLRs, PLRs, sensitivities, or specificities were not available. Thus, combining data from a selected subset of papers could introduce bias. Second, the cut-off values of Child-Pugh and MELD scores for the assessment of prognosis differed among the included studies. Therefore, we could not derive accurate thresholds for identifying high-risk or low-risk patients.
In conclusion, we provided an overview of the comparison between Child-Pugh and MELD scores for the assessment of prognosis in liver cirrhosis. The 2 scores had similar prognostic significance in most cases. However, given their distinctive benefits in some specific conditions, further studies might be necessary to clarify which patients should be assessed with Child-Pugh or MELD score and when each score should be used. New scores, based on prospective studies, should also be proposed to assess the prognosis of patients with liver disease more accurately.