Screening for hepatocellular carcinoma in chronic liver disease: A systematic review and meta-analysis of randomized controlled trials comparing screening methodologies

Background Hepatocellular carcinoma (HCC) is the 5th most prevalent cancer and the second most common cause of cancer-related mortality worldwide. HCC is often asymptomatic until an advanced stage. Current guidelines recommend ultrasound surveillance with or without measurement of serum alpha-fetoprotein. Our objective was to determine whether screening for HCC is beneficial or harmful in patients with chronic liver disease. Primary outcomes were all-cause mortality and quality of life. Secondary outcomes were mortality due to HCC, the number of cases of HCC detected and adverse events.
Methods This is a systematic review and meta-analysis of data from randomized controlled trials. To be included, trials had to randomize patients to an HCC screening group or a non-screening group, to different screening frequencies, or to different screening methods. All published reports of randomized trials on screening for HCC were eligible for inclusion, irrespective of the language of publication. Studies had to include patients with chronic liver disease. Data extraction and analysis were performed independently by two reviewers.
Results When screening with six-monthly alpha-fetoprotein and abdominal ultrasound was compared to no screening, there was no evidence of a difference in HCC-related mortality after adjustment for clustering across a range of intracluster correlation coefficients (intracluster correlation coefficient (ICC) 0.02, odds ratio (OR) 0.60, 95% confidence interval (CI) 0.31-1.15). Screening with six-monthly alpha-fetoprotein, compared to a single alpha-fetoprotein check, did not result in a statistically significant difference in all-cause mortality (OR 1.02, 95% CI 0.65-1.60), mortality due to HCC (OR 1.01, 95% CI 0.57-1.78) or the number of HCC cases detected (OR 1.11, 95% CI 0.64-1.92).

Hepatocellular carcinoma (HCC) is the 5th most prevalent cancer and the second most common cause of cancer-related mortality worldwide (1). In 2012 there are estimated to have been 782,000 new cases of hepatocellular carcinoma worldwide. 83% of these cases occurred in less developed regions with 50% occurring in China alone (1). It is 2-4 times more common in men than women (2).
Chronic liver diseases such as hepatitis B, hepatitis C, alcoholic liver disease and non-alcoholic fatty liver disease are major risk factors for HCC (3,4). HCC usually occurs in patients whose liver disease has progressed to cirrhosis (5,6). The incidence of hepatocellular carcinoma is increasing in certain regions such as the United States (incidence men: 8.11/100,000, women: 2.47/100,000). This may be related to an increase in the prevalence of hepatitis C in the United States and increased immigration from high prevalence regions (7). The incidence in China decreased over the period 2000-2014 (incidence men: 38.3/100,000, women: 14.3/100,000) (8,9). The median age at diagnosis varies between populations. According to data from the Surveillance, Epidemiology, and End Results (SEER) program, the median age at HCC diagnosis by region of birth in those living in the United States was 70-74 years for people from Europe, 65-69 years for people from Asia and 40-45 years for people from West Africa (9). Hepatocellular carcinoma is often asymptomatic until an advanced stage. Early-stage tumors are more likely to be amenable to treatment and have better overall survival (10). There is no curative treatment for intermediate or late-stage tumors. Patients with symptomatic hepatocellular carcinoma have a 3-year survival of 8% (11). Patients with well-compensated liver disease may be considered for surgical resection. In non-cirrhotic patients undergoing surgical resection, survival rates are 58% at 3 years and 42% at 5 years (12). Liver transplantation is the main course of therapy in patients with cirrhosis of the liver. The 5-year survival rate after liver transplantation is 69%, with a tumor recurrence rate of 7% (13). Patients with intermediate-stage hepatoma treated with transcatheter arterial chemo-embolization (TACE) have a median overall survival of 19-20 months (10).
Screening is the periodic application of a test in people at risk of developing a given disease to identify an early or latent stage. Hepatocellular carcinoma fits many of the requirements of a screening program because the disease follows a known clinical course, has an early treatable stage and is an important health problem. The rationale for screening is that patients at high risk of HCC, such as those with chronic liver diseases, can be identified and invited to participate. However, HCC often occurs in those with undiagnosed liver cirrhosis (14). Patients with undiagnosed liver cirrhosis or chronic liver disease will not be captured by screening programs founded on this rationale. Screening may also result in harm. Invasive procedures may be performed based on a positive screening test. Liver biopsy carries a 0.5% incidence of hematoma, a 0.1% risk of infection and a 0.05% risk of death (15). Negative psychosocial consequences occur in 3-20% of patients undergoing screening (16). In the absence of an effective means of identifying patients with early-stage HCC, however, most patients will die. Observational studies have shown that patients undergoing screening had earlier stage disease compared to patients who did not undergo screening (17,18). Non-randomized studies are subject to lead-time and length-time bias. Lead time is the interval by which screening brings forward the diagnosis relative to symptomatic detection of disease. Length-time bias is an overestimation of survival duration due to the relative excess of slowly progressing cases among those detected by screening. Even when non-randomized studies account for lead-time bias, the effects of screening on survival vary with the assumed tumor doubling time (19).
Survival time in one study was significantly longer in a screened group compared to a non-screened group when the tumor doubling time was assumed to be less than 90 days, but this was not evident if the tumor doubling time was assumed to be greater than 90 days (20). Most clinical guidelines recommend screening for hepatocellular carcinoma (Table 1). A well-designed randomized controlled trial could eliminate lead-time bias. The purpose of this review is to determine whether randomized controlled trials provide evidence for the efficacy of screening for HCC in patients with chronic liver disease.

OBJECTIVES
The objective was to determine if screening for HCC is effective in reducing mortality while being safe and acceptable to the screening population.

METHODS
Included studies were randomized controlled trials. Other types of studies were excluded to minimize confounding from the selection bias, performance bias and detection bias to which non-randomized studies are prone. A serious adverse event was defined as any untoward medical occurrence that results in death, is life-threatening, requires inpatient hospitalization or prolongation of existing hospitalization, results in persistent or significant disability or incapacity, or is a congenital anomaly/birth defect (28).

Data collection and analysis
The first systematic reviewer screened titles and abstracts for inclusion. The full articles of studies deemed eligible were then assessed by both the first and second systematic reviewers for inclusion. All excluded studies were recorded with the reason for exclusion. Any disagreements between the first and second systematic reviewers were resolved by discussion. It was planned that a third reviewer would be invited to resolve any discrepancy arising during data extraction on which the first and second reviewers could not reach agreement. Data extraction was performed independently by two reviewers using a standardized data extraction spreadsheet. Any disagreement between systematic reviewers about data extracted was resolved by discussion. The risk of bias in included studies was assessed using the Cochrane Collaboration's recommended domain-based evaluation (30). The odds ratio was used as the measure of treatment effect for all-cause mortality, mortality secondary to hepatocellular carcinoma and the number of cases of hepatocellular carcinoma. Odds ratios were calculated using the Mantel-Haenszel method with a random-effects model. Cluster randomized controlled trials that had not accounted for clustering in their statistical analyses were adjusted for clustering using guidance from the Cochrane Consumers and Communication Group (31). No intracluster correlation coefficients were reported in the included studies, so an intracluster correlation coefficient (ICC) of 0.02 was used for adjustment. This value was chosen based on empirical estimates for randomized controlled trials with continuous and binary outcome measures, which were generally less than 0.064 with a median value of 0.02 (32). An intracluster correlation coefficient of less than 0.05 is generally considered reasonable for outcome variables.
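The clustering adjustment described above can be illustrated with a minimal sketch. It assumes the common design-effect approach, in which event counts and arm sizes are divided by 1 + (m - 1) x ICC, where m is the mean cluster size, and it uses a simple Woolf (log odds ratio) confidence interval for a single trial rather than the Mantel-Haenszel random-effects pooling performed in RevMan. The function names and all trial numbers below are hypothetical and are not taken from the included studies.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Design effect for a cluster randomized trial: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def odds_ratio_ci(events_1, total_1, events_0, total_0, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval."""
    a, b = events_1, total_1 - events_1
    c, d = events_0, total_0 - events_0
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def cluster_adjusted_or(events_1, total_1, events_0, total_0, n_clusters, icc):
    """Shrink both arms to their effective sample size, then compute the OR."""
    mean_cluster_size = (total_1 + total_0) / n_clusters
    deff = design_effect(mean_cluster_size, icc)
    return odds_ratio_ci(
        events_1 / deff, total_1 / deff, events_0 / deff, total_0 / deff
    )


if __name__ == "__main__":
    # Hypothetical trial: 30/9000 deaths with screening, 50/9000 without,
    # randomized in 300 clusters (illustrative values only).
    for icc in (0.02, 0.05, 0.1):
        or_, lo, hi = cluster_adjusted_or(30, 9000, 50, 9000, 300, icc)
        print(f"ICC {icc}: OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Under this approach the point estimate is unchanged because every cell is divided by the same factor; only the confidence interval widens as the ICC or the mean cluster size increases.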
Missing data were deemed acceptable if the explanation provided by the study authors was unlikely to affect the study outcomes (e.g. when a participant with a history of hepatocellular carcinoma was randomized but subsequently excluded in a cluster randomized study that had pre-specified the exclusion of anyone with a history of hepatocellular carcinoma) or if the number missing was negligible. We accepted intention-to-treat analyses and modified intention-to-treat analyses but not per-protocol analyses.
Heterogeneity was assessed using forest plots of the measures of treatment effect. We looked for overlap of the confidence intervals of the measure of treatment effect and whether the point estimates of treatment effect were on the same side of the line of no effect. If all point estimates were on the same side of the line of no effect, we compared the magnitude of those treatment effects. The chi-square test was used to assess for heterogeneity, with a P-value of less than 0.10 as the cut-off for the detection of heterogeneity. The I² statistic was also used. An I² of 30-60% was considered to represent moderate heterogeneity, 50-90% substantial heterogeneity and 75-100% considerable heterogeneity. Interpretation of the I² also required concomitant interpretation of the chi-square test as well as the direction and magnitude of the treatment effect point estimates and their confidence intervals.
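As a rough illustration of how the chi-square (Cochran's Q) and I² statistics relate, the sketch below computes both from study-level log odds ratios. It assumes inverse-variance weights for simplicity, which is not identical to the Mantel-Haenszel weighting used in RevMan, and the function name and input values are hypothetical.

```python
def cochran_q_i2(effects, variances):
    """Return Cochran's Q, its degrees of freedom and I² (as a percentage).

    effects   -- study-level effect estimates, e.g. log odds ratios
    variances -- the corresponding within-study variances
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I² is the share of total variation attributed to heterogeneity
    # rather than chance; it is truncated at zero when Q is below df.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances from three studies.
    q, df, i2 = cochran_q_i2([-0.5, 0.1, -0.2], [0.04, 0.09, 0.06])
    print(f"Q = {q:.2f} on {df} df, I2 = {i2:.0f}%")
```

Q would then be compared with a chi-square distribution on df degrees of freedom to obtain the P-value that is judged against the 0.10 cut-off.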
If sufficient studies (10 or more) met the review inclusion criteria, it was planned to generate a funnel plot and perform the Egger and Begg tests to check for asymmetry as an indicator of reporting bias. Data were synthesized and analyzed using RevMan (Review Manager) 5.2 (33).
Any heterogeneity detected between studies was investigated by considering both clinical and methodological factors. The clinical factors considered were the baseline risk of hepatocellular carcinoma of study participants, whether the study was performed in primary care, secondary care or both, and the etiology of cirrhosis of trial participants. Heterogeneity due to potential methodological factors was investigated by assessing the risk of bias in each study. For any heterogeneity detected, we planned to change the measure of treatment effect from odds ratio to risk difference to determine whether this altered heterogeneity. Where studies reported the necessary data, it was planned to perform a subgroup analysis for all outcomes in participants who had hepatitis B or hepatitis C, to explore any heterogeneity and determine the primary and secondary outcomes in these subgroups. It was also planned to perform a sensitivity analysis to examine the effect of non-hepatitis B or C participants on the review outcomes where studies reported the necessary data. The quality of evidence was assessed using guidance from the GRADE handbook for grading quality of evidence and strength of recommendations and GRADEpro software (34,35).

Results of the search
We identified a total of 7753 references through the electronic searches of the Cochrane Central

Included studies
Included studies are shown in Table 2 (summarised from Table S2).

Risk of bias in included studies
The risk of bias assessment was performed using the Cochrane Risk of Bias tool for Randomized Controlled Trials (Table 3) (30).

Allocation (selection bias)
Information regarding allocation concealment was not available in the studies by Zhang et al (36) Chen et al (37) (41). The risk of selection bias in these studies is therefore unclear. Allocation concealment in the study by Trinchet et al (40) was achieved using a centralized phone procedure to the data-management center. The risk of selection bias in this study is low.

Blinding (performance bias and detection bias)
In the study by Wang et al (41) (40). The risk of performance bias and detection bias, as a result, is unclear.

Incomplete outcome data (attrition bias)
In the study by Chen et al (37) (40) Wang et al (41). The risk of bias due to incomplete outcome data is unclear in these studies.

Selective reporting (reporting bias)
In the studies by Chen et al (37) and Pocha et al (38), detection rates, the characteristics of the screening test and the outcomes with respect to incidence, stage, survival, and mortality in the screened and control groups were reported. The risk of selective reporting bias in these studies is low. In the study by Sherman et al (39), the authors did not report all-cause mortality or disease-specific mortality in the screening group compared to the control group; however, they state that the study was not designed to evaluate these outcomes and that the trial was a preliminary study preparatory to a larger study. The risk of selective reporting bias in this study is low. Trinchet et al (40) reported on most important outcomes and the risk of reporting bias is low. Wang et al (41) reported on overall survival in patients with hepatocellular carcinoma but did not report on all-cause mortality. The risk of reporting bias in this study is therefore high. Similarly, Zhang et al (36) did not report all-cause mortality and the risk of reporting bias is also high.

Effects of interventions (Forest Plots in Figures S1-22)
In the study by Chen et al (37), there was no statistically significant difference in all-cause mortality when screening with six-monthly alpha-fetoprotein was compared to a single alpha-fetoprotein check (OR 1.02, 95% CI 0.65-1.60). In the same study there was no evidence of a difference in mortality due to hepatocellular carcinoma (OR 1.01, 95% CI 0.57-1.78) or in the number of cases of hepatocellular carcinoma detected (OR 1.11, 95% CI 0.64-1.93). There were 5581 participants in this study.
There was no evidence of difference in all-cause mortality when ultrasound was compared to CT. No studies reported on quality of life or adverse events. It was not possible to perform a subgroup analysis for all outcomes of participants who had hepatitis B or hepatitis C because studies had not reported the necessary data to explore any heterogeneity and determine the primary and secondary outcomes in these subgroups. It was not possible to perform a sensitivity analysis to examine the effect of non-hepatitis B or C participants on the review outcomes. Too few studies met the review inclusion criteria to generate the planned funnel plot or to perform the Egger and Begg tests for asymmetry as an indicator of reporting bias.

Quality of the evidence
The number of hepatocellular carcinoma-related deaths and the number of cases of hepatocellular carcinoma detected in the study by Zhang et al (36) were small. The confidence intervals were wide, suggesting imprecision. There was no inconsistency or indirectness. The quality of evidence in this study is therefore moderate. In the studies by Chen et al (37), Pocha et al (38), Wang et al (41), and Trinchet et al (40) there was a high risk of attrition bias because of poor compliance with follow-up screening tests. The confidence intervals were wide and would likely include a minimal clinically important difference for a trial comparing screening frequencies or screening methodologies for hepatocellular carcinoma. There was no inconsistency or indirectness in these four studies. For these reasons the quality of evidence from these studies is low. There were insufficient studies to formally assess for publication bias with a funnel plot.

DISCUSSION
There was no evidence from this review to support screening for hepatocellular carcinoma in patients with chronic liver disease to reduce mortality. An important finding was the effect of adjusting for clustering on study outcomes. Understandably, clustering is used in randomized controlled trials for logistical and feasibility reasons. Observations within clusters are correlated, because participants within a cluster are more likely to be similar to each other than to participants in other clusters, and this must be accounted for in the statistical analysis. Zhang et al reported a 37% reduction in HCC-related mortality (36). When adjusted for clustering there was no evidence of a difference in HCC-related mortality. This finding remained true across a range of intracluster correlation coefficients and across a range of assumed numbers of clusters (the number of clusters in the study was reported as "more than 300"; we performed our analysis assuming there were 301, 350 and 399 clusters respectively). A potential bias in the review process was the use of an intracluster correlation coefficient of 0.02 to adjust for clustering. We could not find previous similar studies to give guidance on the estimation of a suitable intracluster correlation coefficient. The choice of too small an intracluster correlation coefficient can have a substantial impact on confidence interval width. We demonstrated that the use of a range of intracluster correlation coefficients (0.02, 0.05, 0.1) did not change the outcome dramatically. Of note, there was a lack of studies that included quality of life and adverse events as outcomes. These are important outcomes when considering a screening test. A screening test with significant adverse events or reductions in quality of life may not be acceptable to the target population.
It is important to consider the treatments patients received when discussing the benefits of screening and its impact on mortality. In the study by Zhang et al (36), no reduction in HCC-related mortality was seen despite the screening group having more early-stage tumors detected and receiving more liver transplantation compared to the control group (46.5% vs. 7.5%). Similarly, in the study by Trinchet et al (40) more patients with HCCs <10 mm were detected. More patients received liver transplantation when screened every 3 months with ultrasound compared to 6-monthly screening in this trial (18.9% vs. 4.3%). This did not translate to a reduction in all-cause mortality or HCC-related mortality. Wang et al (41) found that screening every 4 months, compared to every 12 months, detected more early-stage tumors and resulted in more curative therapy being given but did not confer a survival benefit over the 4-year follow-up. From these trials it appears that, despite early diagnosis and treatment, mortality was unchanged, suggesting treatments were not an effective means of cure in HCC detected by screening.
Absence of evidence of an effect is not evidence of no effect. The studies included in this review may have failed to detect a true effect for reasons related to the patients enrolled. Within some studies, for example Chen et al (37), patients with non-cirrhotic chronic liver disease and patients with cirrhotic liver disease were included in each arm. The patients with non-cirrhotic liver disease would be at low risk of HCC, and their inclusion may have diluted any potential significant findings in patients with cirrhotic liver disease. The studies by Chen et al (37) and Zhang et al (36) were performed in what the authors note to be high prevalence settings. This may limit the applicability of evidence from these studies to lower prevalence settings. However, given the lack of evidence of effect in a high prevalence setting, an effect is unlikely to materialize in a low prevalence setting.
A limitation of this study may have been the choice of all-cause mortality as the primary outcome. All-cause mortality would be a standard primary outcome for studies evaluating screening tests. However, patients with chronic liver disease die for many reasons other than HCC, often before HCC would have developed. As such, a screening test for HCC may not result in any significant reduction in all-cause mortality. A second limitation was the small number of studies that met the inclusion criteria (N=6). The strengths of this review include rigorous methodology and adherence to PRISMA guidelines for reporting on systematic reviews (Online Supplementary Document Table 1).
Current NICE guidelines state "offer ultrasound (with or without measurement of serum alpha-fetoprotein) every 6 months as surveillance for hepatocellular carcinoma (HCC) for people with cirrhosis who do not have hepatitis B virus infection" and "Perform 6-monthly surveillance for HCC by hepatic ultrasound and alpha-fetoprotein testing in people with significant fibrosis (METAVIR stage greater than or equal to F2 or Ishak stage greater than or equal to 3) or cirrhosis" (20). These recommendations are based on evidence from a systematic review and meta-analysis performed by the NICE guidelines development group. Very low-quality evidence from 2 retrospective cohort studies (n=351) indicated a clinical benefit of surveillance for survival when analyzed using time-to-event data (50,51). The group also evaluated annual versus 6-monthly surveillance and found only very low-quality evidence from 1 observational study (n=649), which indicated a clinical benefit of 6-monthly surveillance for survival (52). Low-quality evidence from the same study indicated a clinical benefit of 6-monthly surveillance for the detection of HCC beyond a very early stage. Moderate quality evidence from 1 randomized controlled trial (n=1278) indicated a clinical benefit of 3-monthly surveillance for survival and HCC occurrence when compared to 6-monthly surveillance (40). Evidence ranging from very low to moderate quality from the same randomized controlled trial indicated no clinical difference in HCC diameter >30 mm at detection, the number of HCC nodules detected or the HCC stage at detection. A Cochrane review performed in 2012 examined whether alpha-fetoprotein and/or ultrasound was effective for screening for hepatocellular carcinoma (53). Its authors concluded there was insufficient evidence.
The purpose of our systematic review was to determine if there was any high-quality evidence to support or refute screening for hepatocellular carcinoma, screening frequencies, and screening methodologies. Our systematic review and meta-analysis included studies of patients with all stages of chronic liver disease and was not limited to alpha-fetoprotein and ultrasound as the only interventions. Conducted in 2018, it is more up to date than the 2012 Cochrane review and has broader inclusion criteria. Of note, all studies identified in this review were published after 1995 and all were in English.
An increase in worldwide use of the hepatitis B vaccine and a reduction in global hepatitis C burden due to recent treatment advances may result in reductions in the global incidence of HCC. According to the WHO, at the end of 2017, 187 countries had introduced the hepatitis B vaccine nationwide. Global coverage with all 3 doses of the vaccine is estimated at 87% (54). Should the incidence of HCC decline, in the absence of a high-quality evidence base to support screening, resource-limited healthcare programs may opt to allocate resources to higher impact evidence-based interventions for other conditions of importance to their population. Screening has the potential to have the greatest benefit in high-risk populations such as those with hepatitis B. There is a need for new effective evidence-based hepatocellular carcinoma screening methods and treatments to reduce disease burden in these high-risk populations.
Screening according to current guidelines also requires resource allocation for laboratory measurement and reporting of alpha-fetoprotein and for radiology staffing and infrastructure to perform ultrasonography. Consideration should be given to the possibility that screening for HCC is potentially harmful to patients and consumes significant healthcare resources. Screening may even detract from other aspects of chronic liver disease care. In the absence of high-quality evidence evaluating adverse events and the psychosocial impact of screening for HCC, it may not be correct to advocate for screening on a population level. Current guidelines that recommend screening may even act as a deterrent to a high-quality randomized controlled trial being undertaken. A feasibility study in Australia found that 99.5% of people would not participate in a randomized controlled trial comparing screening for hepatocellular carcinoma to no screening. Of these, 88% elected for a non-randomized screening program (55). Notably, important information was omitted from the decision aid presented to patients in this study. There was no mention of the potential for invasive investigations such as liver biopsy being performed based on the results of screening tests, or of the risks associated with such investigations. Only non-invasive further investigations such as CT or MRI were discussed. There was no discussion of the potential consequences of radiation exposure with CT or of finding incidental radiological lesions elsewhere ("incidentalomas"). We feel the researchers also did not point out to patients that the best available evidence on screening for hepatocellular carcinoma comes from the randomized controlled trial by Zhang et al (36) which, despite its limitations, did not find evidence that screening reduced all-cause mortality or HCC-related mortality (when adjusted for clustering). We also feel the potential for negative psychosocial consequences with screening could have been discussed in greater detail. Providing this information, which is relevant when discussing the disadvantages of screening, may have changed the outcome of the feasibility study.
Hepatocellular carcinoma is an important health problem globally with no curative treatment for those diagnosed with intermediate or late-stage tumors. If a curative treatment for HCC emerges then the question of whether screening should be performed deserves a large well-conducted randomized controlled trial to evaluate its effect on all-cause mortality, HCC related mortality, adverse events, quality of life and cost-effectiveness.

CONCLUSION
There is no evidence of effect from randomized controlled trials that screening for hepatocellular carcinoma reduces all-cause mortality, HCC related mortality or results in more HCCs being detected.
There is a need for a high quality randomized controlled trial which should include adverse events and quality of life as outcomes.