Recall patterns and risk of primary liver cancer for subcentimeter ultrasound liver observations: a multicenter study

Background: Patients with cirrhosis and subcentimeter lesions on liver ultrasound are recommended to undergo short-interval follow-up ultrasound because of the presumed low risk of primary liver cancer (PLC). Aims: We aimed to characterize recall patterns and risk of PLC in patients with subcentimeter liver lesions on ultrasound. Methods: We conducted a multicenter retrospective cohort study among patients with cirrhosis or chronic hepatitis B infection who had subcentimeter ultrasound lesions between January 2017 and December 2019. We excluded patients with a history of PLC or concomitant lesions ≥1 cm in diameter. We used Kaplan-Meier and multivariable Cox regression analyses to characterize time-to-PLC and factors associated with PLC, respectively. Results: Of 746 eligible patients, most (66.0%) had a single observation, and the median diameter was 0.7 cm (interquartile range: 0.5–0.8 cm). Recall strategies varied, with only 27.8% of patients undergoing guideline-concordant ultrasound within 3–6 months. Over a median follow-up of 26 months, 42 patients developed PLC (39 HCC and 3 cholangiocarcinoma), yielding an incidence of 25.7 cases (95% CI, 6.2–47.0) per 1000 person-years, with 3.9% and 6.7% developing PLC at 2 and 3 years, respectively. Factors associated with time-to-PLC were baseline alpha-fetoprotein >10 ng/mL (HR: 4.01, 95% CI, 1.85–8.71), platelet count ≤150 × 10⁹/L (HR: 4.90, 95% CI, 1.95–12.28), and Child-Pugh B cirrhosis (vs. Child-Pugh A: HR: 2.54, 95% CI, 1.27–5.08). Conclusions: Recall patterns for patients with subcentimeter liver lesions on ultrasound varied widely. The low risk of PLC in these patients supports short-interval ultrasound in 3–6 months, although diagnostic CT/MRI may be warranted for high-risk subgroups such as those with elevated alpha-fetoprotein levels.


INTRODUCTION
Patients with cirrhosis are at high risk of developing HCC, with an annual incidence of ~1%-2%. [1] HCC is a leading cause of death in those with compensated cirrhosis, although prognosis varies widely by tumor stage at diagnosis. Patients with early-stage HCC can achieve 5-year survival exceeding 60% if they are eligible for liver transplantation, surgical resection, or local ablative therapy, whereas patients with more advanced tumor burden have a median survival of 2-3 years. [2] Therefore, professional society guidelines from the American Association for the Study of Liver Diseases (AASLD) and the European Association for the Study of the Liver recommend HCC surveillance in at-risk patients, including those with cirrhosis from any etiology or subgroups with noncirrhotic chronic hepatitis B infection. [3,4] Several case-control and cohort studies have demonstrated that HCC surveillance is associated with significantly improved clinical outcomes, including early tumor detection and overall survival. [5,6] Surveillance is performed using semiannual abdominal ultrasound and a serum biomarker, alpha-fetoprotein (AFP), although this is one step in the larger screening continuum, which also requires timely diagnostic evaluation in those with abnormal surveillance results. [7] Similar to the Liver Imaging Reporting and Data System (LI-RADS) for CT and MRI findings, the American College of Radiology has proposed a classification system for ultrasound visualization and findings. [8,9] The AASLD has recommended recall strategies based on ultrasound findings. Patients with liver lesions ≥ 1 cm (US LI-RADS 3) and those with AFP ≥ 20 ng/mL are recommended to undergo diagnostic multiphase CT or dynamic contrast-enhanced MRI, given a high risk of HCC. [3,4] In contrast, patients with liver lesions <1 cm in maximum diameter (US LI-RADS 2) are recommended to undergo short-interval ultrasound within 3-6 months.
This latter recommendation is largely based on historical studies suggesting a low risk of primary liver cancer (PLC) in patients with subcentimeter lesions. [10-14] However, most studies evaluating the natural history of subcentimeter liver lesions were limited by small sample sizes, included a majority of patients with active viral hepatitis, and predated current HCC diagnostic criteria, highlighting a need for data from a contemporary cohort of patients.
Despite the guideline recommendations for ultrasound-based follow-up, there has been an increasing utilization of CT or MR imaging in clinical practice considering ultrasound's suboptimal sensitivity, particularly in obese patients and those with nonviral liver disease. [15,16] Therefore, there is also a need to better understand practice patterns for patients with subcentimeter lesions, as this informs the risk of surveillance harms and the cost-effectiveness of surveillance programs. [17] To address these gaps, we conducted a multicenter cohort study to characterize the risk of PLC and variation in surveillance practice patterns in patients with subcentimeter liver lesions on ultrasound.

METHODS
Study population
We conducted a retrospective cohort study among adult patients with cirrhosis from 12 US health systems in the North American Liver Cancer Consortium. [18,19] All sites were academic tertiary care referral centers with associated liver transplant programs, although 1 site had an associated safety-net health system. We included patients with cirrhosis who had at least 1 subcentimeter liver lesion on ultrasound between January 2017 and December 2019. Cirrhosis diagnosis was based on (1) histology, (2) noninvasive markers of fibrosis (eg, transient or MR elastography or blood-based biomarker panels) demonstrating F4 fibrosis, or (3) cirrhotic-appearing liver on imaging with signs of portal hypertension (eg, intra-abdominal varices, ascites). Individuals with coexistent liver lesions ≥ 1 cm or any history of PLC were excluded. This study was approved by the institutional review boards at each site.

Data collection
Demographic, clinical, and laboratory data were collected at baseline by review of the electronic medical record. Cirrhosis etiology was classified as hepatitis C (viremic vs. post-SVR), hepatitis B, alcohol-associated liver disease, NAFLD, or other. [20] Body mass index (BMI) was categorized according to World Health Organization classification: normal (BMI <25), overweight (BMI: 25-29.99), class I obesity (BMI: 30-34.99), class II obesity (BMI: 35-39.99), and class III obesity (BMI ≥ 40). Liver disease severity was assessed by the Child-Pugh class, with ascites and HE classified as none, mild or controlled, and severe or uncontrolled. Laboratory indices of interest included platelet count, aspartate transaminase, alanine transaminase, bilirubin, albumin, and INR. For multivariable models, age was dichotomized at the median value (60 y), whereas laboratory values were dichotomized based on the upper limit of normal.
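The BMI categories above can be expressed as a simple lookup. The following is a minimal sketch for illustration only, not the code used in the study; the function name `bmi_category` is hypothetical, and the cutoffs are taken directly from the text.

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value (kg/m^2) to the categories listed in the text.

    Hypothetical helper for illustration; cutoffs follow the text:
    <25, 25-29.99, 30-34.99, 35-39.99, and >=40.
    """
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    if bmi < 35:
        return "class I obesity"
    if bmi < 40:
        return "class II obesity"
    return "class III obesity"


if __name__ == "__main__":
    for value in (22.0, 27.5, 31.0, 36.2, 41.0):
        print(value, bmi_category(value))
```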
Ultrasound exams at each site were performed according to LI-RADS technical recommendations. [21] Ultrasound exams were interpreted by abdominal radiologists per routine clinical care, and findings were abstracted from radiology reports. We recorded the number, maximum diameter, and location of any liver observations on each imaging study. For those who developed PLC, we recorded the method of detection (surveillance, incidental, and diagnostic) and tumor stage.
Patients were followed per institutional standard of care from the time of index imaging until progression to PLC, death, liver transplantation, or end of follow-up (date of last available CT or MRI imaging), whichever occurred earliest. We documented the receipt and imaging findings of follow-up imaging (ultrasound, CT, or MRI) or other diagnostic evaluation (eg, receipt of liver biopsy) after the index liver observation. For patients who underwent liver transplantation, we recorded explant findings, including the presence of PLC, dysplastic nodules, or any other potential pathologic correlates of interest.
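The follow-up rule above (earliest of PLC, death, liver transplantation, or end of follow-up) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: the function name `follow_up_end` and the event labels are hypothetical, and the tie-breaking rule (PLC takes priority when two dates coincide) is an assumption that the text does not specify.

```python
from datetime import date
from typing import Optional, Tuple


def follow_up_end(
    plc: Optional[date],
    death: Optional[date],
    transplant: Optional[date],
    last_imaging: date,
) -> Tuple[date, str]:
    """Return the earliest follow-up end date and the reason for it.

    Assumption: if two dates coincide, the first listed reason wins,
    so a PLC diagnosis takes priority over the other reasons.
    """
    candidates = [
        (plc, "PLC"),
        (death, "death"),
        (transplant, "transplant"),
        (last_imaging, "end of follow-up"),
    ]
    observed = [(d, label) for d, label in candidates if d is not None]
    # min() returns the first minimal element, preserving the priority order
    return min(observed, key=lambda item: item[0])


if __name__ == "__main__":
    # Toy dates, not study data
    print(follow_up_end(None, None, date(2020, 5, 1), date(2021, 1, 1)))
```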

Statistical analysis
We described variation in recall patterns after detection of the subcentimeter liver observation, including the proportions with guideline-concordant versus nonconcordant follow-up. We performed a generalized estimating equation analysis, accounting for clustering by site, to identify predictors of the most common recall strategies. Variables with p < 0.10 in univariable analyses were retained in the multivariable models, as well as observation size and AFP level given a priori clinical importance. For the multivariable model, we used a significance threshold of p < 0.05.
Our primary outcome was patient-level progression to PLC, that is, LR-5 or LR-M on follow-up CT/MRI or histological confirmation, per AASLD criteria. [3] We used Kaplan-Meier analysis to characterize time-to-PLC development, with patients censored at liver transplantation, death, or end of follow-up. Univariable and multivariable Cox regression analyses were performed to identify the factors associated with PLC. As above, observation size, AFP level, and variables with p < 0.10 in univariable analyses were retained in the multivariable models, which relied on a backward selection process using a significance threshold of p < 0.05. All statistical analyses were performed using SAS 9.4.
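The abstract names Kaplan-Meier analysis for time-to-PLC. For readers unfamiliar with the estimator, the product-limit calculation can be sketched as below. This is a teaching sketch on toy data, not the study's SAS code; the function name `kaplan_meier` and the example values are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Product-limit estimate of event-free probability.

    times:  follow-up duration for each patient (e.g., months)
    events: 1 if PLC was observed at that time, 0 if censored
    Returns (time, S(time)) at each distinct time with at least one event.
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at time t leaves the risk set afterwards
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Toy data, not study data
    for t, s in kaplan_meier([6, 12, 12, 18, 24], [1, 0, 1, 0, 1]):
        print(f"t={t}: event-free={s:.3f}, cumulative incidence={1 - s:.3f}")
```

The cumulative proportion developing PLC by a given time is 1 minus the event-free estimate at that time.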

RESULTS
Study cohort
The baseline characteristics of the study cohort (n = 746) are presented in Table 1. The median age was 59 years, and the majority (54.0%) of the cohort was male. The cohort was diverse regarding liver disease etiology (25.7% hepatitis B, 14.9% active hepatitis C, 13.0% post-SVR, 17.4% NAFLD, and 15.1% alcohol-associated) and race/ethnicity (33.8% non-Hispanic White, 21.8% Asian, 20.0% Hispanic, and 16.6% non-Hispanic Black). Most patients had compensated liver disease (78.4% Child-Pugh A). Most patients (66.0%) had a single lesion, 11.1% had 2 lesions, and the remainder had 3+ lesions. The median lesion diameter was 0.7 cm (interquartile range: 0.5-0.8 cm), with 61.3% being > 0.5 cm in diameter. Median AFP was 3.4 ng/mL (interquartile range: 2.0-6.0 ng/mL), with 11.1% of patients having an AFP of > 10 ng/mL. Adequate ultrasound visualization was reported in 44.8% of patients, whereas moderate and severe visualization limitations were reported in 20.8% and 2.5%, respectively. The interpreting radiologist provided a recommendation for follow-up CT or MRI in 23.6% (n = 175) of cases.

Variation in recall procedures
Follow-up of patients with subcentimeter ultrasound lesions was variable, with only 27.8% receiving guideline-concordant ultrasound within 3-6 months, ranging from 0% to 40.3% across sites (Figure 1). There were 5 sites in which ≤ 10% of patients underwent ultrasound within 3-6 months and 4 sites with ≥ 30% of patients. The most common alternative strategies were CT/MRI within 3 months (20.8%) and ultrasound within 6-12 months (16.0%). One fourth (24.8%) of patients failed to receive repeat imaging within 1 year of index ultrasound, ranging from 9.7% to 41.7% across sites. Three sites had more than one third of patients fail to undergo repeat imaging within 1 year of the subcentimeter liver lesion detection.
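The recall categories reported above can be sketched as a classification of each patient's first follow-up imaging. This is an illustrative sketch under stated assumptions, not the study's abstraction protocol: the function name `recall_category`, the modality codes, and the exact boundary handling (for example, whether exactly 3 or 6 months counts as guideline-concordant) are assumptions, since the text does not define them.

```python
from typing import Optional


def recall_category(modality: Optional[str], months: Optional[float]) -> str:
    """Classify the first follow-up imaging after the index ultrasound.

    modality: "US", "CT", or "MRI" (hypothetical codes); None if no imaging
    months:   time from index ultrasound to follow-up imaging
    Boundary handling is assumed, not taken from the text.
    """
    if modality is None or months is None or months > 12:
        return "no repeat imaging within 1 year"
    if modality == "US" and 3 <= months <= 6:
        return "guideline-concordant ultrasound (3-6 months)"
    if modality in ("CT", "MRI") and months <= 3:
        return "CT/MRI within 3 months"
    if modality == "US" and 6 < months <= 12:
        return "ultrasound within 6-12 months"
    return "other"


if __name__ == "__main__":
    print(recall_category("US", 4.5))
    print(recall_category("MRI", 1.0))
    print(recall_category(None, None))
```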

DISCUSSION
In this multicenter contemporary cohort of patients with subcentimeter liver observations on abdominal ultrasound, we observed large variation in recall patterns, with less than one third undergoing guideline-concordant follow-up ultrasound in 3-6 months. This variation is highlighted by the finding that one-fifth of patients underwent diagnostic CT/MRI within 3 months, whereas one fourth failed to have any repeat imaging within 1 year. Patients had a PLC incidence of 25.7 per 1000 person-years, supporting ultrasound in 3-6 months as a guideline recommendation for this group of patients. However, diagnostic CT/MRI may be warranted in some patient subgroups with higher PLC risk, such as those with Child-Pugh B cirrhosis, clinically significant portal hypertension, or elevated AFP levels.
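The incidence figure cited above follows the usual definition of events divided by person-time. The sketch below illustrates the arithmetic; the function names are hypothetical, and the total person-years is back-calculated from the reported 42 cases and 25.7 per 1000 person-years rather than taken from the source.

```python
def incidence_per_1000_py(events: int, person_years: float) -> float:
    """Incidence rate expressed per 1000 person-years."""
    return 1000 * events / person_years


def implied_person_years(events: int, rate_per_1000: float) -> float:
    """Person-years implied by an event count and a rate per 1000 person-years."""
    return 1000 * events / rate_per_1000


if __name__ == "__main__":
    # 42 PLC cases and 25.7 per 1000 person-years are reported in the text;
    # the person-years value printed here is derived, not reported.
    py = implied_person_years(42, 25.7)
    print(f"implied person-years: {py:.0f}")
    print(f"rate check: {incidence_per_1000_py(42, py):.1f} per 1000 person-years")
```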
Guideline recommendations for short-interval ultrasound are based on 3 principles: a low short-term risk of PLC in these patients, the low sensitivity of diagnostic imaging in patients with lesions <1 cm, and sufficiently long tumor doubling times for those with HCC. However, we found only 1 in 4 patients received guideline-concordant follow-up using ultrasound within 6 months. We noted both surveillance "overuse," with ~20% undergoing short-interval CT/MRI, and "underuse," with > 40% undergoing only intermittent surveillance. The former has implications for the enumeration of physical harms and cost-effectiveness, whereas the latter can diminish surveillance benefits. [16,22] We found radiologist recommendation for CT/MRI was associated with significantly lower odds of ultrasound within 3-6 months, highlighting this as a potential intervention target to promote guideline-concordant follow-up.
Variation in the follow-up of subcentimeter ultrasound lesions may also be related to evolving data regarding tumor doubling times and accuracy of diagnostic imaging to characterize small liver lesions. Recent studies demonstrate a median tumor doubling time of 5-7 months, although over one fourth of patients have a rapid doubling time of <3 months. [23,24] Notably, one of the most consistent correlates of rapid growth was small tumor size, likely in part related to tumor growth kinetics. [24] Although few studies specifically examine the accuracy of diagnostic imaging for lesions <1 cm, MRI seems to have a sensitivity of 69% (95% CI, 54%-81%) for these lesions. [25] Outside of HCC detection, diagnostic imaging can also help differentiate those patients with suspicious lesions (LR-4) versus those with indeterminate lesions (LR-3). [26,27] These observations have differential risks of developing HCC over time, and differentiating the 2 can help inform which patients are best followed by cross-sectional imaging and which patients are at sufficiently low risk that ultrasound surveillance is acceptable.
FIGURE 1 Variation in recall strategies for patients with subcentimeter liver observations on ultrasound.
Our study directly informs the expected natural history and risk of PLC in subcentimeter liver lesions on ultrasound. Prior studies reported a wide variation in PLC risk, ranging from 15% (2 of 13 lesions) in a study by Forner et al [10] to 69% (33 of 48 lesions) in a study by Caturelli et al. [11] Notably, Trinchet et al [13] found a higher proportion of subcentimeter lesions in patients undergoing quarterly ultrasound-based surveillance than semiannual surveillance; however, only 19% were confirmed as HCC at the end of the trial follow-up. In this contemporary cohort of patients, we found patients with subcentimeter liver lesions on ultrasound had a PLC incidence of 25.7 per 1000 person-years. This incidence rate parallels that reported in broader cohorts of patients with cirrhosis, suggesting that ultrasound within 3-6 months is a reasonable strategy for these patients. Risk stratification models, using clinical risk factors such as AFP level and degree of liver dysfunction, may help identify patient subgroups who could benefit from MRI or CT imaging. [28] More nuanced approaches incorporating radiomics or blood-based biomarkers may also be helpful to augment the accuracy of risk stratification models. [29,30] A prior modeling study suggested a risk-stratified surveillance strategy among patients with cirrhosis would be cost-effective compared with a "one-size-fits-all" ultrasound-based approach. [31] Of course, one unintended consequence of this approach would be adding health care visits and resultant indirect costs to patients. [32,33] Another workflow could be same-day contrast-enhanced ultrasound, although its performance characteristics in this patient population with lesions <1 cm would need to be defined.
We acknowledge several study limitations. First, there was variable follow-up among patients, given the retrospective nature of the study, which may have resulted in ascertainment bias for PLC diagnoses. Our findings should be validated using prospectively collected data from a large patient cohort with standardized imaging follow-up. Second, our study was retrospective in nature and, therefore, liable to residual confounding. For example, we identified factors associated with recall strategies but were unable to identify some potential drivers of behavior, including fear of medical malpractice litigation. Third, our study relied on reports from interpreting radiologists, which could result in measurement error given the poor interobserver reliability of ultrasound interpretation. [34] Studies in which ultrasounds are independently reviewed by expert radiologists, with or without radiomics for lesion detection, should be considered to better characterize the natural history of subcentimeter liver lesions. Fourth, a limited number of patients in our cohort progressed to PLC, so we may have been underpowered to identify predictors of disease progression. Finally, we included multiple sites in the US, although our results, particularly those describing practice variation, may not generalize to nonacademic settings or those outside the US. We believe these limitations are balanced by strengths of our study, including the use of a large, contemporary multicenter cohort of patients and the availability of detailed clinical, laboratory, and imaging data over long-term follow-up.
In summary, we found a large variation in follow-up imaging performed in patients with cirrhosis and subcentimeter liver lesions on abdominal ultrasound. The risk of PLC in these patients supports short-interval ultrasound as a reasonable recall recommendation, although diagnostic CT/MRI may be warranted in some subgroups with higher PLC risk, such as patients with more advanced cirrhosis or those with elevated AFP levels.

AUTHOR CONTRIBUTIONS
Amit G. Singal: had full access to all data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis, study concept and design, drafting of the manuscript, obtained funding, and study supervision.

ACKNOWLEDGMENT
The authors thank Imon Banerjee, PhD, who helped with the algorithm used to identify patients at one of the study sites.

FUNDING INFORMATION
This study was conducted with support from NIH U01 CA230694 and R01 CA212008. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. The funding agencies had no role in design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation of the manuscript.