Diagnostic accuracy of serum dickkopf-1 protein in the diagnosis of hepatocellular carcinoma

Abstract
Background: To verify the accuracy of serum dickkopf-1 protein (DKK-1) in the diagnosis of hepatocellular carcinoma (HCC) by an updated meta-analysis. Methods: We searched PubMed and Embase for potentially eligible studies published before July 8, 2018. Sensitivity (SN), specificity (SP), positive likelihood ratio (PLR), negative likelihood ratio (NLR), summary receiver operating characteristic curve (sROC), and diagnostic odds ratio (DOR) were pooled with their 95% confidence intervals (CIs) using a bivariate random-effects model. Results: A total of 8 articles containing 10 studies on diagnosis of HCC with DKK-1 alone, 7 articles containing 9 studies on diagnosis of HCC with α-fetoprotein (AFP) alone, and 5 articles containing 7 studies on diagnosis of HCC with DKK-1 + AFP were identified. For DKK-1 alone, AFP alone, and DKK-1 + AFP, respectively, the pooled SN was 0.72 (95% CI: 0.70–0.75), 0.62 (95% CI: 0.59–0.64), and 0.80 (95% CI: 0.78–0.83); the pooled SP was 0.86 (95% CI: 0.84–0.87), 0.82 (95% CI: 0.80–0.84), and 0.87 (95% CI: 0.85–0.88); the pooled PLR was 4.91 (95% CI: 2.73–8.83), 3.60 (95% CI: 2.01–6.44), and 6.18 (95% CI: 4.68–8.16); the pooled NLR was 0.32 (95% CI: 0.22–0.47), 0.49 (95% CI: 0.40–0.60), and 0.20 (95% CI: 0.15–0.26); and the pooled DOR was 17.21 (95% CI: 9.10–32.57), 7.45 (95% CI: 3.69–15.01), and 31.39 (95% CI: 23.59–43.20). The area under the sROC was 0.88, 0.70, and 0.92 for the 3 diagnostic methods. Conclusions: Serum DKK-1 + AFP showed high accuracy for the diagnosis of HCC, and serum DKK-1 alone had moderate accuracy as compared to a previous meta-analysis, while AFP alone showed unsatisfactory diagnostic performance for HCC. Due to the limitations of the current analysis, further well-designed studies are needed to confirm the diagnostic value of DKK-1 and DKK-1 + AFP in HCC diagnosis.


Introduction
Hepatocellular carcinoma (HCC) is one of the most common malignant tumors, with about 78,200 newly diagnosed cases per year and the second highest mortality rate worldwide. [1,2] Its incidence is expected to increase in the next 10 to 20 years. The 5-year survival rate differs by stage: it is 50% to 75% in the early stage and decreases to 3% for HCC patients with distant metastasis. [3,4] Hepatitis B/C virus infection, alcohol, nonalcoholic fatty liver disease, Budd-Chiari syndrome, and aflatoxin, among others, have been identified as risk factors for HCC. In clinical practice, serum α-fetoprotein (AFP) and ultrasonography are widely used for early detection of HCC. [5] However, because AFP has a sensitivity (SN) of only 53% and a specificity (SP) of 90% at a cut-off value of 20 ng/ml, Western countries have excluded it from HCC diagnosis for lack of accuracy. [6-8] Furthermore, AFP-negative HCC could be missed if AFP is used as the marker for diagnosis of HCC.
Surgery, local treatment, radiation therapy, and systemic therapy, among others, are currently used in the management of different stages of HCC. However, the clinical application of surgery is limited, and nonsurgical treatments cannot significantly improve overall survival or prevent relapse of HCC. [4,9] Current methods for early screening of HCC include imaging and tumor biomarkers. [10,11] Circulating cell-free nucleic acids could also contribute to the diagnosis of HCC. [12] Among these methods, biomarkers seem to be more convenient and cost-effective.

Methods
The present study was carried out based on the published studies. Thus, the approval from an ethics committee or institutional review board was not required.

Search strategy
This systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. [33] PubMed and Embase were searched for relevant articles published in English before July 8, 2018. The search terms used were "hepatocarcinoma or hepatoma or liver cancer or hepatocellular carcinoma or HCC" and "dickkopf-1 or DKK-1." The reference lists of all relevant articles were manually searched for additional eligible studies. The search was conducted by 2 independent investigators.
The inclusion criteria were: (1) the study used DKK-1 as a biomarker to diagnose HCC; (2) the sample type was serum DKK-1; (3) the diagnosis of HCC was established by pathological methods or in line with correlated accepted guidelines; (4) the study provided sufficient data to calculate the SN and SP of DKK-1.

Data extraction
Data extraction was performed by 2 independent investigators (Xueyi Tang and Yi Zeng), and any disagreements were resolved by a third author (Yongqiang Zhan). The data extracted from each study included first author, date of publication, geographical region, study design, reference standards, measuring methods and cut-off values, gender and sex ratio of HCC patients, and the number of true positive, false positive, false negative, and true negative subjects.
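The accuracy measures pooled in this analysis can all be derived from the four extracted counts. The following minimal sketch shows the arithmetic for a single study; the counts are hypothetical and are not taken from any included article, and the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one diagnostic study (index test vs. reference standard)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return SN, SP, PLR, NLR, and DOR for a single 2x2 table.

    Assumes no cell is zero; otherwise a continuity correction
    (e.g. adding 0.5 to every cell) would be needed.
    """
    sn = t.tp / (t.tp + t.fn)      # sensitivity
    sp = t.tn / (t.tn + t.fp)      # specificity
    plr = sn / (1 - sp)            # positive likelihood ratio
    nlr = (1 - sn) / sp            # negative likelihood ratio
    dor = plr / nlr                # equals (tp * tn) / (fp * fn)
    return {"SN": sn, "SP": sp, "PLR": plr, "NLR": nlr, "DOR": dor}


# Hypothetical study: 100 HCC patients and 100 controls.
example = TwoByTwo(tp=72, fp=14, fn=28, tn=86)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Because the DOR combines SN and SP into one number, it is convenient for comparing tests, but it does not show whether errors are mainly false positives or false negatives.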

Study quality assessment
The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool, which was developed from QUADAS, was used to assess the quality of each paper. [34,35] QUADAS-2 has 4 domains: patient selection, index test, reference standard, and flow and timing. The signaling questions in each domain were answered "yes," "no," or "unclear," and the answers were used to judge the risk of bias as "high," "low," or "unclear." A third author (Zuhui Pu) was consulted to resolve any disagreements.

Statistical analysis
Two independent investigators (Xueyi Tang and Yi Zeng) performed the statistical analysis using Meta-DiSc version 1.4, RevMan version 5.3, and STATA version 12.0, and P < .05 was considered statistically significant. SN, SP, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) were pooled with their 95% confidence intervals (CIs). Heterogeneity among studies was quantified with the I² statistic; an I² value > 50% was taken to indicate substantial heterogeneity, [36] and a random-effects model was adopted. The DOR was also pooled because it is a single indicator of diagnostic test performance that can be calculated from the PLR and NLR (DOR = PLR/NLR). The pooled diagnostic SN, SP, and heterogeneity were displayed in forest plots. Summary receiver operating characteristic (sROC) curves represented the overall diagnostic performance of DKK-1. Threshold effect was evaluated by calculating the Spearman correlation coefficient between the logit of SN and the logit of 1-SP, and P < .05 indicated a threshold effect. [37] If heterogeneity could not be attributed to a threshold effect, subgroup analysis was used for further exploration. Deeks' funnel plot asymmetry test was used to assess publication bias, [38] and a sensitivity analysis was also performed.
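To make the I² criterion concrete, the sketch below pools ln(DOR) across studies with a univariate DerSimonian-Laird random-effects model and reports Cochran's Q and I². This is an illustration under stated assumptions, not a reproduction of the bivariate model or of the software output used in this analysis; the 2x2 tables are hypothetical and the function names are illustrative.

```python
import math


def log_dor_and_variance(tp, fp, fn, tn):
    """ln(DOR) and its approximate variance for one 2x2 table (no zero cells)."""
    log_dor = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return log_dor, variance


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling; returns (pooled effect, Cochran's Q, I^2 in %)."""
    k = len(effects)
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Method-of-moments estimate of the between-study variance tau^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_star = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, q, i2


# Hypothetical (tp, fp, fn, tn) tables, not data from the included studies.
tables = [(72, 14, 28, 86), (150, 30, 50, 170), (40, 5, 20, 55), (60, 30, 15, 45)]
pairs = [log_dor_and_variance(*t) for t in tables]
pooled, q, i2 = pool_random_effects([p[0] for p in pairs], [p[1] for p in pairs])
print(f"pooled DOR = {math.exp(pooled):.2f}, Q = {q:.2f}, I2 = {i2:.1f}%")
```

Pooling is done on the log scale because ln(DOR) is closer to normally distributed than the DOR itself; the pooled value is exponentiated at the end.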

Quality assessment
The quality of the 8 articles is shown in Figure 2. In the patient selection domain, the risk of bias was rated "unclear" for almost all included articles because they did not state whether consecutive or random patients were enrolled; the exceptions were 3 articles. [22,23,28] The risk of bias in the patient selection domain was rated "high" for 3 studies because they included only a high-risk population in the control group. [27,30] In the index test domain, the risk of bias was rated "unclear" for the included articles without prespecified diagnostic thresholds, that is, all except 1. [26] In the flow and timing domain, the risk of bias was rated "low" for 4 articles because they used histopathology as the reference standard for all included HCC, [22,23,26,29] whereas it was rated "unclear" for the remaining articles because they did not use the same reference standard for all included HCC.

To analyze the source of heterogeneity, we first calculated threshold effects. The Spearman correlation coefficient between the logit of SN and the logit of 1-SP was 0.309 (P = .385), indicating that a threshold effect did not account for the heterogeneity among the included studies. Consequently, subgroup analyses were performed to identify the potential sources of heterogeneity.
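The threshold-effect check described above can be sketched as follows. The per-study SN and SP pairs are hypothetical and do not come from the included studies, the helper names are illustrative, and the P-value is omitted because it would require a statistics library or tables.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical per-study (SN, SP) pairs.
studies = [(0.69, 0.91), (0.74, 0.87), (0.81, 0.78), (0.66, 0.88), (0.77, 0.84)]
logit_sn = [logit(sn) for sn, _ in studies]
logit_fpr = [logit(1 - sp) for _, sp in studies]
rho = spearman(logit_sn, logit_fpr)
print(f"Spearman rho = {rho:.3f}")
```

A strong positive correlation means that studies with higher SN also have higher false-positive rates, which is the pattern expected when studies differ mainly in their cut-off values.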

Subgroup analyses
Three subgroup analyses were conducted according to the stage of HCC and the use of high-risk participants (patients with risk factors for HCC) as controls. The first analysis, based on the composition of the control group, included the studies that used a high-risk population as controls. The second calculated the diagnostic performance of DKK-1 for early HCC. The last calculated the diagnostic performance of DKK-1 for distinguishing early HCC from high-risk controls (Table 3). A total of 1751 patients, comprising 1129 HCC patients and 622 patients with high-risk factors, were identified in 7 studies from 6 articles, [23,27-31] and the pooled results were as follows: SN was 0.72 (95% CI: 0.69-0.74), SP was 0.82 (95% CI: 0.79-0.85), DOR was 16.79 (95% CI: 10.17-27.72), and AUC was 0.87. The I² values of SN, SP, and DOR were 95.4%, 91.8%, and 61.1%, respectively (Table 3).

Pooled diagnostic accuracy of AFP in HCC diagnosis
The pooled SN and SP of AFP in HCC diagnosis were 0.62 (95% CI: 0.59-0.64) and 0.82 (95% CI: 0.80-0.84), with I² values of 74.5% and 96.7%, respectively. A bivariate random-effects model was applied because of the substantial heterogeneity. The pooled PLR, NLR, and DOR were 3.60 (95% CI: 2.01-6.44), 0.49 (95% CI: 0.40-0.60), and 7.45 (95% CI: 3.69-15.01), with I² values of 96.3%, 84.7%, and 92.5%, respectively. The sROC curve was plotted, and the AUC was 0.70 (SE = 0.0484) (Table 3). To analyze the source of heterogeneity, we calculated threshold effects. The Spearman correlation coefficient between the logit of SN and the logit of 1-SP was −0.050 (P = .898), indicating that a threshold effect did not account for the heterogeneity among the included studies. Consequently, subgroup analyses were performed to identify the potential sources of heterogeneity.

Subgroup analyses
Three subgroup analyses were conducted as previously described (Table 3).

Sensitivity analysis and publication bias
A sensitivity analysis was performed to estimate the impact of each study on the diagnosis of HCC with DKK-1 alone and DKK-1 + AFP, and the results showed that the data were stable. We used Deeks' funnel plot asymmetry test to evaluate publication bias; the P-values were .585 (DKK-1 alone) and .693 (DKK-1 + AFP), which indicated no evidence of publication bias among the included studies (Fig. 11).
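Deeks' test regresses ln(DOR) on the inverse square root of the effective sample size (ESS), weighting each study by its ESS; a slope that differs significantly from zero suggests funnel plot asymmetry. The sketch below computes the slope and its t statistic from hypothetical 2x2 tables; it is an illustration rather than the software routine used in this analysis, and the function names are illustrative.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x.

    Returns (slope, intercept, standard error of the slope).
    """
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(wi * (yi - intercept - slope * xi) ** 2
              for wi, xi, yi in zip(w, x, y))
    se_slope = math.sqrt(rss / (len(x) - 2) / sxx)
    return slope, intercept, se_slope


# Hypothetical (tp, fp, fn, tn) tables, not data from the included studies.
tables = [(72, 14, 28, 86), (150, 30, 50, 170), (40, 5, 20, 55),
          (210, 45, 90, 255), (60, 12, 15, 63)]
ess = [effective_sample_size(tp + fn, fp + tn) for tp, fp, fn, tn in tables]
x = [1 / math.sqrt(e) for e in ess]
y = [math.log((tp * tn) / (fp * fn)) for tp, fp, fn, tn in tables]
slope, intercept, se = weighted_linear_fit(x, y, ess)
print(f"slope = {slope:.2f}, t = {slope / se:.2f} on {len(tables) - 2} df")
```

The P-value is obtained by comparing the t statistic with a t distribution on k − 2 degrees of freedom, where k is the number of studies; that step is left to a statistics package.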

Discussion
Cirrhotic hepatitis patients and chronic HBV carriers are recommended to undergo regular surveillance for HCC, and timely diagnosis provides more therapeutic options and a better prognosis for patients. [2] When histopathology data are unavailable, serum AFP level combined with medical imaging can be used to detect HCC. [5] However, the low SN of AFP makes it a suboptimal marker for HCC screening, and in 5% to 7% of cases imaging cannot distinguish HCC from non-HCC tumors. [39,40] Hence, it is critical to search for a

Figure 6. The pooled diagnostic accuracy of DKK-1 in HCC diagnosis. DKK-1 = dickkopf-1 protein, HCC = hepatocellular carcinoma.

Table 3. Summary of diagnostic accuracy of DKK-1 and DKK-1 + AFP.

Although a previous meta-analysis has examined this issue, [32] the present analysis deserves attention because more studies were included and 2 different subgroup analyses were conducted. When screening potentially eligible studies, we set inclusion and exclusion criteria similar to those of the previous meta-analysis; thus, 2 studies that used plasma samples to explore the diagnostic value of DKK-1 in HCC diagnosis were excluded. [24,25] This exclusion also served homogeneity, because biomarker concentrations can differ considerably between sample types. [41] The previous meta-analysis placed no restriction on the language of included articles, whereas the present meta-analysis included only articles published in English. Compared with the previous meta-analysis, the AUC in the present meta-analysis was 0.88 versus 0.84, while the DOR decreased by more than one-third (17.21 vs 26.90), which indicated that serum DKK-1 alone may not be optimal for diagnosing HCC.
For the combination of DKK-1 and AFP, the AUC was 0.92 versus 0.88 and the DOR was 31.93 versus 24.60 in the current and previous meta-analyses, respectively, which indicated that DKK-1 + AFP was more suitable for HCC diagnosis than DKK-1 alone. Serum DKK-1 has shown diagnostic value for HCC in many studies, [22-31] and the majority of them concluded that DKK-1 could detect HCC well, except Mao et al in differentiating AFP (−) HCC from liver cirrhosis. [24] In another study published in 2012, serum DKK-1 showed moderate diagnostic value in distinguishing AFP (−) HCC from high-risk patients. [23] However, it is difficult to predict whether DKK-1 would display good diagnostic accuracy in AFP (−) HCC, as there were insufficient data for analysis in the current meta-analysis. Thus, more studies on diagnosing AFP (−) HCC with serum DKK-1 are needed.
The previous meta-analysis conducted by Zhang et al indicated that both DKK-1 and DKK-1 + AFP had satisfactory accuracy for diagnosing HCC, [32] [45][46][47] As compared to the results pooled in meta-analyses of different markers, our results for HCC diagnosis with serum DKK-1 alone, with an SN of 0.72, SP of 0.86, and AUC of 0.8596, might seem moderate. However, with an AUC of 0.92, the combination of DKK-1 and AFP showed diagnostic performance equivalent to that of OPN and DCP. [46,47]

Heterogeneity among the included studies was evaluated through different methods in this analysis, since it is an indicator of the reliability of the results. Threshold effect is thought to be a primary cause of heterogeneity in diagnostic studies. In the current meta-analysis, the Spearman correlation coefficients of DKK-1 alone, AFP alone, and DKK-1 + AFP in diagnosing HCC were 0.378 (P = .226), −0.050 (P = .898), and 0.119 (P = .779), respectively, which indicated that no threshold effect existed, as all P-values were >.05. We then performed subgroup analyses according to the stage of HCC and high-risk control in DKK-1 alone, AFP alone, and DKK-1 + AFP, respectively. The I² values of DOR in the 3 subgroups of DKK-1 were 61.1%, 64.5%, and 67.1%, respectively. The I² values of DOR in the 3 subgroups of AFP were 91.0%, 45.0%, and 78.1%, respectively. The I² values of DOR of DKK-1 + AFP were 54.2%, 25.3%, and 63.3% (Table 3). Comparing these with the I² value of the DOR of DKK-1 + AFP in diagnosing all HCC patients (37.1%), we found that the stage of HCC was the source of heterogeneity, as the I² value of DOR decreased by >10% and the I² values of both pooled SN and SP were 0.0%.
Similar to the previous meta-analysis, the I² value of pooled SN for DKK-1 alone in the early HCC subgroup was 0.0%, which indicated that the stage of HCC was a source of heterogeneity in the current meta-analysis. However, the stage of HCC failed to adequately explain the heterogeneity of SP for DKK-1 alone, even though the I² values of DOR decreased by >10%. Moreover, all of these I² values remained >50%.
The limitations of the included studies and of this meta-analysis were as follows: (1) As compared to the previous meta-analysis, the overall number of participants in the diagnostic test did not significantly increase, although more studies were included in the current meta-analysis (2678 vs 1115). However, 2 large samples predominated in the previous meta-analysis, [22,23] which might have biased its result. Hence, it is reasonable and necessary to further confirm the diagnostic performance of DKK-1 and DKK-1 + AFP. (2) All included studies were retrospective in design, and unfavorable results might have been omitted from the raw data. In addition, the purposes of the included studies were inconsistent. (3) There were only 3 studies with non-Chinese blood samples, [27,28,31] and only articles published in English were screened, which may have limited the geographical regions and languages represented. (4) Different cut-off values of serum DKK-1 were used among the included studies, which made it difficult to estimate the true diagnostic value. As a novel marker, DKK-1 should be tested for detecting HCC in future studies to explore the optimum cut-off value. (5) The reference standards for HCC diagnosis differed among the included studies, including biochemistry, imaging characteristics, and pathology. However, it is difficult to have uniform methods for the diagnosis of diseases in clinical practice. (6) Because of the small number of included studies, we did not perform meta-regression in the current meta-analysis to further search for the source of heterogeneity. Although subgroup analyses identified the stage of HCC as the source of heterogeneity of DKK-1 + AFP in HCC diagnosis, they could not confirm whether the stage of HCC was the source of heterogeneity for DKK-1.

Conclusion
Serum DKK-1 + AFP showed high accuracy for diagnosing HCC, while serum DKK-1 alone, with a lower DOR, showed moderate accuracy as compared to the previous meta-analysis. However, more studies are needed to ascertain the diagnostic value of serum DKK-1 in AFP (−) HCC. Due to the limitations of the current meta-analysis, further well-designed studies are needed to confirm the diagnostic value of DKK-1 and DKK-1 + AFP in HCC diagnosis.