Application of the Doylestown algorithm for the early detection of hepatocellular carcinoma

Background We previously developed a logistic regression algorithm that uses AFP, age, gender, ALK and ALT levels to improve the detection of hepatocellular carcinoma (HCC). In 3,158 patients from 5 independent sites, this algorithm, referred to as the “Doylestown” algorithm, increased the AUROC of AFP by 4% to 12% and had equal benefit regardless of tumor size or the etiology of liver disease. Aims To analyze the Doylestown algorithm using samples taken from individuals before their diagnosis of HCC. Methods The algorithm was tested using samples at multiple time points from (a) 120 patients with established chronic liver disease without HCC and (b) 116 patients with an HCC diagnosis (85 patients with early stage HCC and 31 patients with recurrent HCC), taken at the time of, and up to 12 months prior to, cancer diagnosis. Results Among patients who developed HCC, comparing the Doylestown algorithm at a fixed cut-off to AFP at 20 ng/mL, the Doylestown algorithm increased the True Positive Rate (TPR) in identification of HCC from 36% to 50% at a time point of 12 months prior to conventional HCC detection. Similar results were obtained in patients with recurrent HCC, where the Doylestown algorithm increased the TPR in detection of HCC from 18% to 59% at 12 months prior to detection of recurrence. Conclusions This algorithm improves on the prediction of HCC by AFP alone and may have value in the early detection of HCC.


Introduction
Hepatocellular carcinoma (HCC) is often associated with chronic hepatitis B virus (HBV) infection, chronic hepatitis C virus (HCV) infection, excessive alcohol consumption and nonalcoholic steatohepatitis (NASH) [1]. HCC is the second leading cause of cancer-related death worldwide and the leading cause of death in patients with cirrhosis [1]. Prognosis for patients with HCC is related to tumor stage at the time of diagnosis, and there are higher rates of curative treatment and better overall survival among those with early stage tumors [2]. Therefore, HCC surveillance using ultrasonography has been recommended for "at-risk" patients, in both antiviral treatment naïve and experienced patients [3]. Surveillance can be performed with or without serum levels of the oncofetal glycoprotein, alpha-fetoprotein (AFP) [4,5]. There has been extensive debate about the utility of AFP, given its suboptimal sensitivity and specificity [6][7][8]. On the other hand, AFP is an economical and readily available test in both developed and developing countries.
We have recently developed a logistic regression algorithm that utilizes AFP, age, gender, alkaline phosphatase (ALK) and alanine aminotransferase (ALT) levels to improve the detection of HCC, particularly for those with a background of liver cirrhosis [9]. We define this as the Doylestown algorithm. In an analysis of 3,158 patients from 5 independent sites, this algorithm improved detection of HCC as compared to AFP [9]. However, in that previous work, only patients with either cirrhosis or HCC at the time of cancer detection were analyzed. In the present study, we further evaluate the predictability and accuracy of the algorithm by applying it to data from multiple time points from patients with chronic hepatitis without HCC, and from those with cirrhosis before and up to the time of their diagnosis of HCC.

Study populations
For the non-HCC subjects, electronic health data were used to randomly select ambulatory clinic patients with chronic liver disease at Beth Israel Deaconess Medical Center (BIDMC), Harvard Medical School. Patients were excluded if they had less than 12 months of clinical observation or if a suspicious liver mass or HCC was identified prior to study enrollment. In addition, only those patients who had the required components of the Doylestown algorithm (age, gender, ALT, ALK and AFP) were included in this study. All patients with chronic hepatitis B virus, regardless of cancer status, were on anti-viral therapy (entecavir (Baraclude), tenofovir (Viread), or telbivudine (Tyzeka)). Patients with HCC were identified using ICD9 codes and patient lists of the weekly multidisciplinary HCC conference at BIDMC. All cases were adjudicated to confirm they met HCC criteria based on the AASLD practice guideline [10]. Patients with tumor size greater than 5 cm at the time of initial HCC diagnosis were excluded from the analysis. As stated, for both the controls and patients with HCC, only patients with available AFP and other algorithm inputs (age, gender, ALK and ALT) at least every 6 months were included.
Briefly, the diagnosis of HCC was made based on accepted standard criteria for one of the following modalities: histopathology, magnetic resonance imaging (MRI), computed tomography (CT) and, in a single case, magnetic resonance cholangiopancreatography (MRCP) [10]. Diagnosis of cirrhosis was based on liver histology, Fibroscan score, or clinical and imaging features. For cirrhotic patients to serve as controls, they must have had no evidence of HCC by baseline US or MRI of the liver if the AFP was elevated within 3 months prior to enrollment, and for another 6 months after enrollment. The controls in this cohort have been followed for a mean period of 61.6 months (range of 12.7-170.6). Tumor staging was determined using the United Network of Organ Sharing-modified TNM staging system for HCC, and 85 patients with early stage HCC (out of 115 possible patients) were included in this study. Early HCC was defined as T1 (single lesion < 2 cm in diameter) or T2 (single lesion between 2 and 5 cm in diameter) lesions, which met criteria for liver transplantation in the United States.
Patients with recurrent HCC who had at least 12 months of clinical data prior to their cancer diagnosis were also included in this study. Patients were identified systematically and consecutively from the BIDMC electronic HCC registry. In this case, all HCC tumor sizes and stages were included. The detection of HCC was as above. The research study was performed under a BIDMC IRB approved protocol.
Serum AFP and lab values for ALT and ALK were determined using commercially available immunoassays at the BIDMC clinical laboratory and were obtained prior to analysis using the Doylestown algorithm. The data for all patients are provided as supplementary data (S1 File).
The output value is a continuous variable ranging from 0 to 1. In our prior work an output value of >0.5 was used to identify patients with HCC and had a specificity of >90% [9]. The selection of patients, data collection and application of the algorithm, as found in a base Microsoft Excel file, were uniformly performed at BIDMC.
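The coefficients of the algorithm are not reproduced in this text, so the following is only a minimal sketch of how a logistic regression output between 0 and 1 is computed and compared with the 0.5 cut-off. The function names, the feature encoding (including the log10 transform of AFP) and every coefficient value are hypothetical placeholders chosen for illustration; they are not the published Doylestown coefficients [9].

```python
import math


def logistic_output(features, coefficients, intercept):
    """Return a logistic regression output in the open interval (0, 1).

    features and coefficients are dicts keyed by the same variable names.
    """
    linear = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def classify(output, cutoff=0.5):
    """Flag a sample as HCC-positive when the output exceeds the cut-off."""
    return output > cutoff


# HYPOTHETICAL coefficients and intercept, for illustration only.
# They are NOT the published values of the Doylestown algorithm.
PLACEHOLDER_COEFFS = {
    "age": 0.05,
    "male": 0.5,
    "log10_afp": 1.0,  # assumes AFP enters on a log10 scale (an assumption)
    "alk": 0.005,
    "alt": -0.005,
}
PLACEHOLDER_INTERCEPT = -6.0

# HYPOTHETICAL patient record.
patient = {
    "age": 60,
    "male": 1,
    "log10_afp": math.log10(15.0),
    "alk": 120,
    "alt": 45,
}

score = logistic_output(patient, PLACEHOLDER_COEFFS, PLACEHOLDER_INTERCEPT)
flagged = classify(score)
print(f"output = {score:.3f}, flagged as HCC: {flagged}")
```

Keeping the cut-off as a parameter makes it straightforward to evaluate alternative thresholds such as 0.25 or 0.75 without changing the model itself.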
Descriptive statistics were used. For statistical analysis among groups, in the case of longitudinal data, a linear mixed effects model was applied to compare values among the multiple time points. The Kruskal-Wallis one-way ANOVA test was used for multiple independent groups. All analyses were performed with R software (www.r-project.org) and GraphPad Prism 5.0 (GraphPad Software, La Jolla, CA).
For comparisons between the sensitivity of AFP and the Doylestown algorithm, we used the two-sample proportion test to check for a statistical difference between the true positive rate of AFP and that of the Doylestown algorithm. The power calculation for each time point followed the guide of Cohen et al 1988 [11].
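As an illustration of the comparison described above, the sketch below implements a one-sided two-sample proportion z-test and a power estimate based on Cohen's effect size h. It is a simplified stand-in written with the Python standard library, not the authors' R code: it uses a pooled standard error without continuity correction (R's `prop.test` applies a continuity correction by default), so it should not be expected to reproduce the reported p values exactly. The function names and the example counts are hypothetical.

```python
from math import asin, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def one_sided_two_proportion_test(x1, n1, x2, n2):
    """Pooled z-test of H0: p2 == p1 against H1: p2 > p1.

    x1/n1 are positives/total for the reference marker (e.g. AFP),
    x2/n2 for the comparison marker. Returns (z, one-sided p value).
    Not defined when the pooled proportion is exactly 0 or 1.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    return z, 1 - _STD_NORMAL.cdf(z)


def cohen_h(p1, p2):
    """Cohen's effect size h for two proportions (arcsine transform)."""
    return 2 * asin(sqrt(p2)) - 2 * asin(sqrt(p1))


def power_one_sided(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of the one-sided test using Cohen's h.

    For unequal group sizes the harmonic mean of n1 and n2 is used.
    """
    h = cohen_h(p1, p2)
    n_eff = 2 * n1 * n2 / (n1 + n2)
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - _STD_NORMAL.cdf(z_alpha - h * sqrt(n_eff / 2))


# HYPOTHETICAL counts: 10 of 40 positive by marker A, 20 of 40 by marker B.
z, p = one_sided_two_proportion_test(10, 40, 20, 40)
power = power_one_sided(10 / 40, 20 / 40, 40, 40)
print(f"z = {z:.3f}, one-sided p = {p:.4f}, power = {power:.3f}")
```

Because both markers are measured in the same patients, a paired method could also be considered; the independent-samples form is shown here only because it matches the test named in the text.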
Ethical issues. This study was approved by the Institutional Review Boards of the Beth Israel Deaconess Medical Center, Harvard Medical School.

Performance of the Doylestown algorithm in those with chronic liver disease
In our previous work, we analyzed the performance of the Doylestown algorithm in individuals with a diagnosis of cirrhosis or HCC at the time of serum sample collection [9]. However, it was unclear how well this equation would perform with samples from those with chronic liver disease but without liver cirrhosis or HCC. In addition, it was of interest to determine how early, prior to a diagnosis of HCC, this equation would predict the cancer. To that end, we applied the Doylestown algorithm to data from 120 "control" patients with chronic liver disease (see Table 1) but without cirrhosis or HCC (see Fig 1 for study design). As shown in Table 1, the mean age of these patients was 44.4 years, with a range of 20 to 82 years. The majority of patients were male (56%) and had mean values of ALK, ALT and AFP within the normal range (see Table 1). Although most of the 120 patients in the control group were properly classified as not having HCC, two patients in this group were incorrectly classified (i.e., a Doylestown algorithm output >0.5, indicating HCC). Both of these patients were older, being 65 and 82 years of age, and had AFP <10 ng/mL. One patient (male, aged 65) had "normal" levels of AFP (8.9 ng/mL), ALK (67 IU/L) and ALT (38 IU/L). The oldest patient (female, aged 82) also had normal levels of AFP (4.5 ng/mL), ALK (94 IU/L) and ALT (12 IU/L).
Longitudinal [...] Table 2. As Table 2 shows, for these controls, ALK, ALT and AFP values remained [...] [9]. Importantly, as Table 2 shows, the fluctuations in ALK and ALT values had little impact on the Doylestown algorithm. None of these 25 patients had Doylestown algorithm values ≥0.5 (Fig D in S2 Fig).

Table 1. Clinical characteristics of control patients without HCC.
Number of patients: 120
Gender (% male): 56

Performance of the Doylestown algorithm in those with HCC
We previously examined the performance of the Doylestown algorithm only at the time of HCC diagnosis. Here we evaluated its performance in early HCC detection longitudinally (see Fig 1 for study design). A total of 85 patients with clinical data available from 1 month to 12 months before the diagnosis of HCC were examined (Tables 3 and 4). An analysis of the clinical data at the various time points shows that the levels of ALK and ALT do not appreciably change in these individuals as a function of time, similar to that observed with controls. Patients with cirrhosis have much higher levels of ALK and ALT compared to controls (Tables 1 & 3). As expected, AFP levels increased close to the time of cancer diagnosis, though the change was not statistically significant (p = 0.067). Importantly, the mean Doylestown algorithm output values were >0.5 at time points 9 months prior to HCC diagnosis (p = 0.067). Therefore, the algorithm was able to identify patients who ultimately developed HCC as early as 9 months prior to the HCC diagnosis.

Table 3. Clinical characteristics of HCC patients.
Number of patients (N) 2): 85
Diagnosis [...]
2) The total number of patients examined in this experiment. 1-4 samples were available per patient.

S4 Fig presents the receiver operating characteristic (ROC) curves for AFP and the Doylestown algorithm at 1-3 months prior to HCC diagnosis (Fig A in S4 Fig), 6-9 months prior to HCC diagnosis (Fig B in S4 Fig) and 12 months prior to HCC diagnosis (Fig C in S4 Fig). As this figure shows, increases in AUC were observed at all time points, as were increased sensitivities at all relevant specificity values. In Table 5, the performance of the Doylestown algorithm was compared to AFP using an output value of ≥0.5 for the Doylestown algorithm and 20 ng/mL for AFP as cut-offs. In our prior analysis, an output of ≥0.5 was associated with HCC with a specificity of ≥90% [9]. The 20 ng/mL cut-off value of AFP was used because historically this has been the accepted level for HCC detection and, in one of the largest studies conducted by the National Cancer Institute, it was associated with 90% specificity [12][13][14][15]. Using these values, at 12 months prior to HCC diagnosis, AFP classified 36% of individuals as having HCC and the Doylestown algorithm increased this rate to 50%. However, this difference did not reach statistical significance (p = 0.2250). At 6-9 months prior to HCC diagnosis, AFP classified 30% of individuals as having HCC and the Doylestown algorithm increased this rate to 50% (p = 0.036). At 1-3 months prior to HCC detection, AFP identified 58% of those with HCC and the Doylestown algorithm identified 71% (p = 0.0550). Thus, consistent with our previous data, improved performance as compared to AFP was observed with the application of the Doylestown algorithm.
The impact of altering the cut-off for AFP or the Doylestown algorithm output value on HCC detection is shown in Table 5. As this table shows, decreasing the cut-off to 10 ng/mL for AFP or to 0.25 for the Doylestown algorithm increased the detection of HCC for both markers. However, as Table 5 shows, the Doylestown algorithm maintained its superior detection of HCC as compared to AFP and actually increased the improvement in sensitivity over AFP alone. Increasing the cut-off of AFP to 100 ng/mL or the Doylestown algorithm to 0.75 decreased detection of HCC but once again, the Doylestown algorithm maintained its superior detection of HCC as compared to AFP.
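The cut-off comparison described above can be sketched as follows. This is an illustrative calculation only: the marker values are invented, the function names are hypothetical, and a sample is counted as detected when its value is at or above the cut-off (an assumption about how ties are handled). The three cut-off pairs (10 ng/mL and 0.25, 20 ng/mL and 0.50, 100 ng/mL and 0.75) are the ones named in the text.

```python
def sensitivity(values, cutoff):
    """Fraction of HCC cases whose marker value is at or above the cut-off."""
    if not values:
        raise ValueError("at least one case is required")
    return sum(v >= cutoff for v in values) / len(values)


def sensitivity_table(afp_values, algorithm_outputs, cutoff_pairs):
    """Return rows of (AFP cut-off, AFP sensitivity,
    algorithm cut-off, algorithm sensitivity)."""
    rows = []
    for afp_cut, alg_cut in cutoff_pairs:
        rows.append(
            (
                afp_cut,
                sensitivity(afp_values, afp_cut),
                alg_cut,
                sensitivity(algorithm_outputs, alg_cut),
            )
        )
    return rows


# HYPOTHETICAL values for six patients who later developed HCC.
afp = [4.0, 8.5, 12.0, 25.0, 150.0, 6.0]          # ng/mL
outputs = [0.20, 0.55, 0.60, 0.80, 0.95, 0.30]    # algorithm output, 0 to 1

# Cut-off pairs taken from the text: (AFP in ng/mL, algorithm output).
CUTOFF_PAIRS = [(10, 0.25), (20, 0.50), (100, 0.75)]

table = sensitivity_table(afp, outputs, CUTOFF_PAIRS)
for afp_cut, afp_sens, alg_cut, alg_sens in table:
    print(f"AFP >= {afp_cut}: {afp_sens:.2f}   output >= {alg_cut}: {alg_sens:.2f}")
```

Lowering a cut-off can only keep or raise sensitivity, which is why specificity at each cut-off also has to be checked in controls before a threshold is chosen.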

Performance of the Doylestown algorithm in those with recurrent HCC
Next, we compared AFP to the Doylestown algorithm for those patients who had HCC recurrence (n = 31). Clinical data were available at <3 months, 6 months, 9 months and 12 months prior to cancer recurrence. The mean size of the tumor at time of diagnosis was 1.8 cm (range 0.8-5.5 cm). A scatter plot of the three main components of the Doylestown algorithm (AFP, ALK [...] Table 6. Using individual patient clinical values from 12 months prior to the confirmed diagnosis of HCC recurrence by imaging or biopsy, AFP and the Doylestown algorithm identified 18% and 59% of the recurrent HCC, respectively (Table 7, p = 0.0021). At 9 months prior to HCC recurrence, AFP detected 29% of individuals as having HCC and the Doylestown algorithm increased this to 57% (p = 0.0202). Similar results were obtained at 6 months prior to recurrence; AFP and the Doylestown algorithm identified 27% and 64% of the HCC, respectively (p = 0.0024). Within 3 months of HCC recurrence, AFP correctly identified HCC in 47% and the Doylestown algorithm increased this to 67% (p = 0.0999). The impact of altering the cut-off for AFP or the Doylestown algorithm output value on HCC detection is shown in Table 7. As this table shows, decreasing the cut-off to 10 ng/mL for AFP or to 0.25 for the Doylestown algorithm increased the detection of HCC for both markers. However, as Table 7 shows, the Doylestown algorithm maintained its superior detection of HCC as compared to AFP. Increasing the cut-off of AFP to 100 ng/mL or of the Doylestown algorithm to 0.75 decreased detection of HCC but, once again, the Doylestown algorithm maintained its superior detection of HCC as compared to AFP.

Footnotes to Table 5.
1) Sensitivity of AFP at either 1-3 months, 6-9 months, or 12 months before HCC detection using a cut-off of 20 ng/mL.
2) Sensitivity of the Doylestown algorithm at either 1-3 months, 6-9 months, or 12 months before HCC detection using a cut-off of 0.50 output units.
3) P value between AFP and the Doylestown algorithm using a one-sided alternative hypothesis test.
4) Power level of this difference.
5) Sensitivity of AFP at either 1-3 months, 6-9 months, or 12 months before HCC detection using a cut-off of 10 ng/mL.
6) Sensitivity of the Doylestown algorithm at either 1-3 months, 6-9 months, or 12 months before HCC detection using a cut-off of 0.25 output units.
7) Sensitivity of AFP at either 1-3 months, 6-9 months, or 12 months before HCC detection using a cut-off of 100 ng/mL.
8) Sensitivity of the Doylestown algorithm at either 1-3 months, 6-9 months, or 12 months before HCC detection using a cut-off of 0.75 output units.

Footnotes to Table 7.
1) Sensitivity of AFP at either <1-3 months, 6 months, 9 months or 12 months before HCC recurrence using a cut-off of 20 ng/mL.
2) Sensitivity of the Doylestown algorithm at either <1-3 months, 6 months, 9 months or 12 months before HCC recurrence using a cut-off of 0.50 output units.
3) P value between AFP and the Doylestown algorithm using a one-sided alternative hypothesis test.
4) Power level of this difference.
5) Sensitivity of AFP at either <1-3 months, 6 months, 9 months or 12 months before HCC recurrence using a cut-off of 10 ng/mL.
6) Sensitivity of the Doylestown algorithm at either <1-3 months, 6 months, 9 months or 12 months before HCC recurrence using a cut-off of 0.25 output units.
7) Sensitivity of AFP at either <1-3 months, 6 months, 9 months or 12 months before HCC recurrence using a cut-off of 100 ng/mL.
8) Sensitivity of the Doylestown algorithm at either <1-3 months, 6 months, 9 months or 12 months before HCC recurrence using a cut-off of 0.75 output units.

Discussion
The development and validation of the Doylestown algorithm in predicting HCC was done using measurements of the variables collected at a single time from case-control or nested case-control studies [9]. Here, we present a retrospective, longitudinal study applying the Doylestown algorithm to serial time points in patients with chronic liver diseases with and without HCC to demonstrate the robustness of the algorithm. The inclusion of control patients with chronic liver diseases but without liver cirrhosis or HCC confirmed the specificity of this algorithm. It was essential to determine that fluctuations in the key components of the algorithm (such as an individual's AFP, ALK and ALT values) do not adversely impact the performance of the Doylestown algorithm and falsely identify those who do not have HCC. The application of the Doylestown algorithm did lead to false positive results in 2 patients. Both patients were older and had values close to the 0.5 cut-off. The 82-year-old patient had normal AFP, ALK and ALT levels; advanced age accounted for the false positive value of the Doylestown algorithm in these 2 cases. The algorithm was originally developed using a cohort of patients younger than 70 years old. Thus, this algorithm may not be accurate for those > 70 years of age, and we are currently developing novel biomarkers and algorithms for those patients without cirrhosis but with chronic HBV infections.
The Doylestown algorithm, which combines AFP with clinical values, increased the detection of HCC as compared to AFP alone at all time points examined. Using a fixed cut-off of 20 ng/mL for AFP and 0.5 output units for the Doylestown algorithm resulted in an absolute increase in sensitivity of 13 percentage points at the time point closest to HCC detection (1-3 months), 20 percentage points at the 6-9 month time point and 14 percentage points at the 12 month time point (Table 5). This improvement in cancer prediction also applied to those with HCC recurrence (Table 7). Indeed, at 12 months prior to the detection of recurrent HCC, the Doylestown algorithm increased the sensitivity of AFP alone by over 3 fold. It is noted that one patient with a large tumor at recurrence (5.5 cm lesion) was AFP negative (AFP value of 6.2 ng/mL) at the time of diagnosis, which may explain this person's late diagnosis. The Doylestown algorithm values for this individual were 0.5 at the 12 month time period (with an AFP of 6.5 ng/mL) and rose to 0.9 at the time of diagnosis. Importantly, as we have shown here and in our previous report [9], the specificity of the Doylestown algorithm was similar to or better than that of AFP. Together, these data suggest that the Doylestown algorithm enhances the prediction of HCC compared to AFP alone, with increased sensitivity and similar specificity.
A limitation of the current study is the sample size that was available. For example, clinical data were only available from 35 patients at 12 months prior to HCC diagnosis. A larger study with time points up to 2 years prior to HCC diagnosis will have to be performed to fully demonstrate the benefit of this algorithm for HCC surveillance. It is noted that in no case did the use of the Doylestown algorithm reduce the number of patients detected as compared to AFP, suggesting again, as was observed in our previous study, that the use of this algorithm does not impart any harm.
It is noted that in the case of patients with recurrent cancer, the sample size and the difference between AFP and the Doylestown algorithm were large enough to provide >85% power and highlight the benefit of using the algorithm rather than AFP in this setting. However, future work will need to be performed in those who do not have recurrent disease. Such a study will require careful analysis and the inclusion of patients with MRI/CT-scan proof of no HCC at a time point 12-24 months after the last data point.
This study demonstrated that the Doylestown algorithm, by using readily available clinical parameters, is superior to AFP alone in accurately predicting the development of initial and recurrent HCC among patients with chronic liver diseases. While this algorithm is practical and easy to adopt in routine clinical use, it is important to emphasize that the current algorithm can be further complemented and improved with novel biomarkers to achieve early detection of hepatocellular carcinoma.