Establishment and validation of a nomogram containing cytokeratin fragment antigen 21-1 for the differential diagnosis of intrahepatic cholangiocarcinoma and hepatocellular carcinoma

Background Our study aimed to develop a nomogram incorporating cytokeratin fragment antigen 21–1 (CYFRA21–1) to assist in differentiating between patients with intrahepatic cholangiocarcinoma (ICC) and hepatocellular carcinoma (HCC). Methods A total of 487 patients who were diagnosed with ICC and HCC at Qilu Hospital of Shandong University were included in this study. The patients were divided into a training cohort and a validation cohort based on whether the data collection was retrospective or prospective. Univariate and multivariate analyses were employed to select variables for the nomogram. The discrimination and calibration of the nomogram were evaluated using the area under the receiver operating characteristic curve (AUC) and calibration plots. Decision curve analysis (DCA) was used to assess the nomogram’s net benefits at various threshold probabilities. Results Six variables, including CYFRA21–1, were incorporated to establish the nomogram. Its satisfactory discriminative ability was indicated by the AUC (0.972 for the training cohort, 0.994 for the validation cohort), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) values. The Hosmer–Lemeshow test and the calibration plots demonstrated favorable consistency between the nomogram predictions and the actual observations. Moreover, DCA revealed the clinical utility and superior discriminative ability of the nomogram compared to the model without CYFRA21–1 and the model consisting of the logarithm of alpha-fetoprotein (Log AFP) and the logarithm of carbohydrate antigen 19–9 (Log CA19–9). Additionally, the AUC values suggested that the discriminative ability of Log CYFRA21–1 was greater than that of the other variables used as diagnostic biomarkers. Conclusions This study developed and validated a nomogram including CYFRA21–1, which can aid clinicians in the differential diagnosis of ICC and HCC patients.


Introduction
Primary liver carcinoma (PLC) represents a significant global public health issue (1) and encompasses three primary histological subtypes: hepatocellular carcinoma (HCC), intrahepatic cholangiocarcinoma (ICC), and mixed hepatocellular cholangiocarcinoma (2). Although the incidence of ICC is relatively low compared to that of HCC, recent studies indicate that it is gradually increasing (3). ICC and HCC differ in etiology, biology, and carcinogenic mechanism (4-6), and their treatment and prognosis therefore differ substantially (7-9). The differential diagnosis of patients with ICC and HCC remains a research focus, as effective treatment strategies depend on accurate and early differentiation.
Postoperative pathological biopsy is the gold standard for distinguishing HCC from ICC, but it is not feasible for patients with surgical contraindications. Therefore, simpler and more accurate diagnostic methods are urgently needed to facilitate early differential diagnosis and meet the diagnostic needs of patients with contraindications. While imaging technologies such as CT and MRI are prominent in differentiating HCC from ICC (7,10-14), they have notable limitations and depend on the interpretation skills of technicians. Ultrasound examinations also fail to provide satisfactory accuracy. Likewise, serological markers such as carbohydrate antigen 19-9 (CA19-9), alpha-fetoprotein (AFP), and inflammatory indices demonstrate limited distinguishing capabilities (15-17). Consequently, the search for more reliable diagnostic tools continues.
Cytokeratin fragment antigen 21-1 (CYFRA21-1), a fragment of cytokeratin 19, is a sensitive marker predominantly used for detecting non-small cell lung cancer (NSCLC) (18). Recent studies have shown that it is also specifically released in the serum of patients with liver and biliary diseases, particularly cholangiocarcinoma, and it has attracted increasing attention (19,20). Thus, the serum level of CYFRA21-1 shows promise as a marker for differentiating between ICC and HCC.
A nomogram is a predictive tool that creates simple charts based on a statistical model and is increasingly utilized to aid in clinical decision-making. The aim of this study was to develop and validate an accurate nomogram using clinical indicators obtained at hospital admission, enabling safer, simpler, and more cost-effective identification of ICC and HCC patients in the early stages of the disease, thereby facilitating individual clinical decision-making.

Patients
This retrospective study included a total of 365 patients with pathologically confirmed diagnoses of ICC or HCC from January 2016 to April 2022 at Qilu Hospital of Shandong University; these patients composed the training cohort. The inclusion and exclusion criteria were as follows:
Inclusion criteria:
1. Patients were diagnosed with ICC or HCC based on pathological examination.
2. Availability of complete clinical information.
Exclusion criteria:
1. Patients with mixed tumors confirmed histopathologically.
2. Individuals with other malignancy types.
3. Patients who had undergone previous surgical treatment.

Statistical analysis
Numerical variables are presented as means with standard deviations (SD) or medians with interquartile ranges (IQR). Student's t-test or the Mann-Whitney test was applied for variable comparisons, as appropriate. Categorical variables are expressed as frequencies and were compared using Pearson's χ² test. Variables with skewed distributions, such as CYFRA21-1, CA19-9, CA125, AFP, CEA, ALP, SA, LDH, FIB, and D-dimer, underwent a logarithmic transformation.
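The two simplest steps of this paragraph, the logarithmic transformation and Pearson's χ² test, can be sketched as follows. This is a minimal illustration in plain Python, not the software used in the study (which is not named here). The base-10 logarithm, the function names, and all input values are assumptions made for the example and are not study data.

```python
import math


def log10_transform(values):
    """Return the base-10 logarithm of each strictly positive value.

    Assumption: base 10 is used here for illustration; the base of the
    transformation is not specified in the text.
    """
    return [math.log10(v) for v in values]


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical marker concentrations (not study data).
marker = [1.0, 10.0, 100.0, 1000.0]
log_marker = log10_transform(marker)

# Hypothetical 2x2 table: rows = diagnosis group, columns = sign present/absent.
stat = pearson_chi2_2x2(10, 20, 30, 40)
p_value = chi2_p_value_df1(stat)
```

A 2x2 table has one degree of freedom, which is why the closed-form p-value based on the complementary error function suffices here; larger tables would need a general χ² distribution function.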
Multivariate logistic regression analysis identified independent differential factors for ICC and HCC. The training cohort was subjected to stepwise regression with the Akaike information criterion as the stopping rule. A nomogram was developed from these independent factors and validated in the validation cohort. The accuracy of the nomogram and its discriminative performance relative to other models were evaluated using receiver operating characteristic (ROC) curves and the area under the curve (AUC). Model consistency was assessed using calibration curves and the Hosmer-Lemeshow test. Decision curve analysis was used to quantify the net benefit at various threshold probabilities and to assess the clinical utility of the nomogram and other models.
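Two of the evaluation quantities named above have compact definitions that a short sketch can make concrete. The AUC equals the probability that a randomly chosen positive case (here, ICC) receives a higher predicted probability than a randomly chosen negative case (HCC). The net benefit at threshold probability p_t is TP/n - (FP/n) × p_t/(1 - p_t). The Python code below is a minimal illustration under these standard definitions, not the study's implementation; the function names, labels, and predicted probabilities are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels, probs, threshold):
    """Net benefit at a given threshold probability.

    Cases with predicted probability >= threshold are treated as
    test-positive: NB = TP/n - FP/n * threshold / (1 - threshold).
    """
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = ICC, 0 = HCC) and predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

example_auc = auc(labels, probs)               # 8 of 9 pairs ranked correctly
example_nb = net_benefit(labels, probs, 0.5)   # TP = 3, FP = 1, n = 6
```

Evaluating `net_benefit` over a grid of thresholds for each model, and for the treat-all and treat-none strategies, yields the curves that a decision curve analysis plots.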

Clinicopathologic characteristics of patients
During the study period, a total of 487 patients who underwent hepatectomy for primary hepatic carcinoma and met the inclusion criteria were included. The training cohort comprised 365 patients (279 with HCC and 86 with ICC), while the validation cohort consisted of 122 patients (87 with HCC and 35 with ICC). The demographics and clinicopathological variables of the patients in the training and validation cohorts are presented in Supplementary Table S1, and no significant differences were detected between the two cohorts. Additionally, the baseline clinicopathological data were compared between ICC patients and HCC patients in the training cohort, and the results are detailed in Table 1.
FIGURE 1 Nomogram containing CYFRA21-1 for differentiating ICC and HCC and calibration plots of the nomogram. Six variables, including Gender, Jaundice, Hepatitis, Log CYFRA21-1, Log CA19-9 and Log AFP, were selected to establish the nomogram. For example, a 71-year-old male patient with jaundice and no history of hepatitis, CA19-9 of 680.3 IU/ml, AFP of 3.11 ng/ml, and CYFRA21-1 of 20.30 ng/ml had a 99.9% probability of being diagnosed with ICC

Discussion
The epidemiology, risk factors, genetics, and epigenetics of ICC and HCC vary significantly (9), accompanied by notable differences in cellular metabolism, leading to distinct treatment approaches and prognoses for each (4,21-23). Precise differential diagnosis is therefore imperative in managing these distinct liver cancers.
Our study combined CYFRA21-1 with traditional differential diagnostic indicators (sex, jaundice, hepatitis, Log AFP, and Log CA19-9) to increase the accuracy and specificity of differentiating ICC from HCC and developed a nomogram that achieved greater benefit than did previous models, potentially aiding in therapeutic decision-making.
Female sex was found to be positively associated with ICC, and a history of hepatitis was negatively associated with ICC, which aligns with the findings of previous studies (7,24). Jaundice was positively associated with ICC, which can be attributed to the fact that ICC is located where it is more likely than HCC to cause biliary obstruction (25).
As CA19-9 and AFP are widely used biomarkers for diagnosing ICC and HCC, respectively, the combined use of CA19-9 and AFP levels is prevalent in distinguishing ICC from HCC in clinical practice (7,16,20). [...] can promote cytokeratin (CK) degradation, leading to high expression of CK fragments (30). Severe chronic liver damage induces a ductular reaction (DR) composed of ductal cells and liver progenitor cells (LPCs), with CK19 being a prominent histological marker for DR (31,32). Therefore, the expression of CK19 may be related to the diagnosis and progression of liver and biliary tract diseases. CYFRA21-1, a soluble fragment of CK19 and a useful marker for non-small cell lung cancer (NSCLC) (18), has been gaining attention for its potential role in the diagnosis and prognosis of liver and biliary tract diseases (33,34). A previous study linked CK19 expression with the progression of ICC and demonstrated higher CYFRA21-1 serum levels in ICC patients than in those with extrahepatic adenocarcinoma (35). However, few studies have compared the serum CYFRA21-1 concentration between patients with ICC and patients with HCC. Given the predominant expression of CK19 in chronic biliary tract disease and the absence of CK19 in hepatocytes (36,37), CYFRA21-1 levels are expected to be greater in ICC than in HCC, a hypothesis supported by our study. This study established that CYFRA21-1 is an independent factor for distinguishing between ICC and HCC, and the AUC of Log CYFRA21-1 was greater than that of Log CA19-9 and Log AFP, which indicates that CYFRA21-1 plays a significant role in the differential diagnosis of ICC and HCC.
However, this study has several limitations. The data were sourced from a single institution, highlighting the need for further validation with a larger external sample. Additionally, the relatively small sample size of this study necessitates further research with larger cohorts to ascertain the definitive impact of the serum CYFRA21-1 concentration in differentiating between ICC and HCC. Furthermore, this study included only patients with ICC and HCC, and further research is needed to explore whether CYFRA21-1 can play a role in differentiating ICC from other types of liver cancer, including mixed hepatocellular-cholangiocarcinoma and liver metastases.

Conclusion
In conclusion, we developed a nomogram with a superior AUC compared to that of previous models, and its predictive ability was assessed from various perspectives. Furthermore, this study underscores the clinical significance of CYFRA21-1 in differentiating between ICC and HCC patients and offers a novel approach for differential diagnosis.
(A). The calibration curves of the nomogram in the training (B) and validation (C) cohorts. The calibration curves of the nomogram showed good consistency between the predicted probability of ICC diagnosis and the actual probability. HCC, hepatocellular carcinoma; ICC, intrahepatic cholangiocarcinoma; Log CYFRA21-1, logarithm of cytokeratin fragment antigen 21-1; Log CA19-9, logarithm of carbohydrate antigen 19-9; Log AFP, logarithm of alpha-fetoprotein. * indicates a P value < 0.05 between the ICC group and the HCC group; ** indicates a P value < 0.01 between the ICC group and the HCC group; *** indicates a P value ≤ 0.001 between the ICC group and the HCC group.

FIGURE 2 ROC curves of the nomogram, models and variables in the training cohort. In the training cohort, ROC curves of the nomogram, Model 1 and Model 2 (A), and ROC curves and AUC of the six variables included in the nomogram (B). ROC curves, receiver operating characteristic curves; AUC, area under the curve; Log CYFRA21-1, logarithm of cytokeratin fragment antigen 21-1; Log CA19-9, logarithm of carbohydrate antigen 19-9; Log AFP, logarithm of alpha-fetoprotein.

FIGURE 3 ROC curves of the nomogram and other models in the validation cohort. ROC curves of the nomogram, Model 1 and Model 2 in the validation cohort. ROC curves, receiver operating characteristic curves; AUC, area under the curve.

FIGURE 4 DCA of the nomogram and models. DCA of the nomogram, Model 1 and Model 2 in the training (A) and validation (B) cohorts. The x- and y-axes show the risk threshold probability and the net benefit, respectively. DCA, decision curve analysis.

Table 2 summarizes the results of the univariate and multivariate logistic analyses. Twenty-two candidate variables

TABLE 1
Characteristics of patients with HCC and ICC in the training cohort.

TABLE 2
Univariate and multivariate logistic regression analysis of ICC presence based on preoperative data in the training cohort.
The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the nomogram, Model 1, and Model 2 in the training and validation cohorts were compared and are illustrated in Table 3.
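For reference, the four measures named above follow directly from a 2x2 confusion table once a probability cutoff has been chosen. The Python sketch below uses the standard definitions with ICC taken as the positive class. The function name and the counts are hypothetical and do not reproduce any value from this study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from confusion-table counts.

    tp: positives correctly identified    fp: negatives called positive
    fn: positives called negative         tn: negatives correctly identified
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true positives detected
        "specificity": tn / (tn + fp),  # fraction of true negatives detected
        "ppv": tp / (tp + fp),          # probability a positive call is right
        "npv": tn / (tn + fn),          # probability a negative call is right
    }


# Hypothetical counts (not study data): 50 positive cases, 150 negative cases.
metrics = diagnostic_metrics(tp=40, fp=10, fn=10, tn=140)
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of positive cases in the cohort, so they are not directly comparable between cohorts with different case mixes.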

TABLE 3
Diagnostic efficacy of different methods. AUC, area under the receiver operating characteristic curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value.