A predictive nomogram for lymph node metastasis of incidental gallbladder cancer: a SEER population-based study

Existing imaging techniques have limited ability to detect lymph node metastasis (LNM) of gallbladder cancer (GBC). Gallbladder removal by laparoscopic cholecystectomy can provide pathological information regarding the tumor itself for incidental gallbladder cancer (IGBC). The purpose of this study was to identify the risk factors associated with LNM of IGBC and to establish a nomogram to improve the ability to predict the risk of LNM for IGBC. A total of 796 patients diagnosed with stage T1/2 GBC between 2004 and 2015 who underwent surgery and lymph node evaluation were enrolled in this study. We randomly divided the dataset into a training set (70%) and a validation set (30%). A logistic regression model was used to construct the nomogram in the training set and was then verified in the validation set. Nomogram performance was quantified with respect to discrimination and calibration. The rates of LNM in T1a, T1b and T2 patients were 7, 11.1 and 44.3%, respectively. Tumor diameter, T stage, and tumor differentiation were independent factors affecting LNM. The C-index and AUC of the training set were 0.718 (95% CI, 0.676–0.760) and 0.702 (95% CI, 0.659–0.702), respectively, demonstrating good prediction performance. The calibration curves showed good agreement between the nomogram predictions and actual observations. Decision curve analysis showed that the LNM nomogram was clinically useful when the risk was decided at a probability threshold of 2–63%. The C-index and AUC of the validation set were 0.73 (95% CI: 0.665–0.795) and 0.692 (95% CI: 0.625–0.759), respectively. The nomogram established in this study has good prediction ability. For patients with IGBC requiring re-resection, the model can effectively predict the risk of LNM and compensate for the inaccuracy of imaging.

GBCs may be confirmed at the early stage through laparoscopic cholecystectomy (LC) so that early R0 resection may be performed; thus, progression of the disease may be avoided, and the overall survival rate may be improved [4,5].
More than 50% of GBCs are diagnosed by intraoperative or postoperative pathological examination after LC [4] and are considered incidental gallbladder cancer (IGBC), among which stage T1/2 GBCs are the most common [6]. IGBC often requires radical re-resection [5]. Among patients with lymph node metastasis (LNM), lymph node dissection is an important part of radical surgery [7]. Although an increasing number of clinical centers emphasize the importance of high-quality lymph node dissection [8][9][10], a study based on the SEER database showed that the lymph node resection rates for stages T1a, T1b, and T2 GBC were only 33.6, 39.2, and 53.7%, respectively [7], which indicated that preoperative lymph node examination was seriously insufficient. LNM is an independent factor influencing the prognosis of early GBC [11,12]. Therefore, the preoperative diagnosis of LNM is very important. However, current imaging is still not sensitive enough to identify LNM in the preoperative examination [13]. Owing to the low incidence rate of GBC, there is still no study with a large sample size on the risk factors for LNM in early GBC, and there is no quantified prediction model.
LC makes general pathological information on patients with IGBC available before the patients undergo re-resection [2]. In recent years, nomograms have been broadly used for preoperative prediction of the risk of LNM and have been proven to be effective [14][15][16]. Therefore, this study aims to use the pathological and demographic information contained in the SEER database to determine the LNM risk factors for IGBC and to establish a nomogram model for predicting the risk of LNM at the early stage of IGBC before re-resection.

Data collection
The SEER (Surveillance, Epidemiology, and End Results) database is currently the largest publicly available cancer database, covering approximately 28% of the US population [3]. The National Cancer Institute's SEER*Stat software (version 8.3.6) was used to collect data. The inclusion criteria were as follows: (1) site record: C23.9, according to the Third Edition of the International Classification of Diseases for Oncology (ICD-O-3); (2) pathological type: adenocarcinoma or squamous cell carcinoma; (3) T stage classified as T1a, T1b, or T2 and N stage classified as N0 or N1 according to the 6th edition of the AJCC staging system; (4) underwent surgery; (5) at least 1 regional lymph node examined; and (6) no preoperative radiotherapy. After inclusion, patients were excluded if their information regarding tumor size or tumor differentiation was unknown. We also excluded patients diagnosed with M1 stage, for whom surgery was not suitable [17].
We extracted the demographic and clinicopathologic data of patients with T1/2 GBC from the SEER database for model development and validation, including age, sex, race, tumor size, histology, differentiation, depth of invasion, and number of lymph nodes examined.
The whole dataset from the SEER database was randomly partitioned into a training set and a validation set, which included 70 and 30% of the dataset, respectively. Simple random sampling was used for allocation so that each record had the same chance of being assigned to the training set or the validation set. Specifically, we installed the caret package in R software version 3.6.2, loaded the foreign, survival and caret packages, and then ran the partitioning code. The code is provided in our Supplementary Material.
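The allocation step described above can be illustrated with a minimal Python sketch. It is not the R code from the Supplementary Material; the function name `split_train_validation`, the seed value, and the use of integer placeholders for patient records are all assumptions made for illustration only.

```python
import random


def split_train_validation(records, train_fraction=0.7, seed=2020):
    """Simple random sampling without replacement.

    Every record has the same chance of being assigned to the training set.
    The seed is a hypothetical value used only to make the sketch repeatable.
    """
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    n_train = round(train_fraction * len(records))
    train_idx = set(indices[:n_train])
    train = [r for i, r in enumerate(records) if i in train_idx]
    validation = [r for i, r in enumerate(records) if i not in train_idx]
    return train, validation


# Placeholder records standing in for the 796 included patients.
patients = list(range(796))
train, validation = split_train_validation(patients)
print(len(train), len(validation))
```

With a 70% fraction of 796 records, this sketch assigns 557 records to the training set and 239 to the validation set; the exact counts in the study depend on the authors' own code.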

Statistical analysis
Correlations between the clinicopathological characteristics of patients and LNM were analyzed using Pearson's chi-square test or Fisher's exact test when needed. To identify factors that were associated with LNM, binary logistic regression analysis was used for univariate and multivariable analyses. Odds ratios (ORs) were presented with 95% CIs. Preoperatively available variables were included in the logistic regression analysis. To construct a well-calibrated and discriminative nomogram for predicting LNM, a model was developed in the training set and then validated in the validation set. A logistic regression model was used to construct the nomogram with a backward stepwise procedure. Variables with P < 0.05 were included in the nomogram.
Nomogram performance was quantified with respect to discrimination and calibration. Discrimination (the ability of a nomogram to separate patients with different lymph node statuses) was quantified by concordance indexes (C-indexes) and the area under the receiver operating characteristic (ROC) curve (AUC). Calibration was assessed graphically by plotting the relationship between the actual (observed) probabilities and predicted probabilities (calibration plot) with the bootstrapping method (1000 replications). Clinical usefulness and net benefit were estimated with decision curve analysis (DCA).
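As a hedged illustration of the discrimination measure, the sketch below computes a concordance index for a binary outcome as the proportion of (LNM-positive, LNM-negative) patient pairs in which the positive patient received the higher predicted probability, with ties counted as one half. Under this pairwise definition the value coincides with the area under the ROC curve. The function name and the toy data are assumptions; how the authors computed their reported figures in R is not specified here.

```python
def concordance_index(predicted, observed):
    """Pairwise concordance for a binary outcome (1 = LNM, 0 = no LNM)."""
    concordant = 0.0
    pairs = 0
    for p_pos, y_pos in zip(predicted, observed):
        if y_pos != 1:
            continue
        for p_neg, y_neg in zip(predicted, observed):
            if y_neg != 0:
                continue
            pairs += 1
            if p_pos > p_neg:
                concordant += 1.0   # positive patient ranked higher
            elif p_pos == p_neg:
                concordant += 0.5   # tie counts as half
    if pairs == 0:
        raise ValueError("need at least one positive and one negative case")
    return concordant / pairs


# Toy data, not taken from the study.
toy_predicted = [0.05, 0.10, 0.40, 0.40, 0.60]
toy_observed = [0, 0, 0, 1, 1]
toy_c_index = concordance_index(toy_predicted, toy_observed)
print(round(toy_c_index, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of patients with and without LNM.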
Statistical analyses of correlations between clinicopathological characteristics were conducted using SPSS version 24.0 (IBM, NY, US). The partition of dataset, logistic regression analysis, construction and performance quantification of nomogram and DCA were conducted using R statistical software version 3.6.2. All tests were two-sided, and P < 0.05 was deemed significant.

Factors associated with preoperative LNM
As shown in Table 2, the logistic regression model was used to further verify the effectiveness of the included factors. Univariate analysis showed that a tumor diameter > 1 cm, stage T2, and poor differentiation or undifferentiation were closely related to LNM. Multivariate analysis further confirmed that a tumor diameter > 1 cm (OR = 3.628, 95% CI: 1.770-7.437), stage T2 (OR = 11.104, 95% CI: 2.590-47.597), and poor differentiation or undifferentiation (OR = 2.110, 95% CI: 1.184-3.762) were independent factors influencing LNM. Based on the OR values, T2 stage was the most strongly associated factor, followed by tumor diameter and then the degree of differentiation. Age, sex, race and pathological pattern were not significantly correlated with LNM.
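To make the relation between the reported odds ratios and the underlying logistic regression coefficients concrete, the following sketch converts each OR and its 95% CI back to the log-odds scale. It assumes Wald-type intervals (coefficient ± 1.96 standard errors), which is an assumption rather than something stated in the text; the function name `log_odds_summary` is hypothetical.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Return (coefficient, standard error) on the log-odds scale.

    Assumes a Wald-type interval: exp(beta - z*se) to exp(beta + z*se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Odds ratios and 95% CIs as reported in the multivariate analysis.
reported = {
    "diameter > 1 cm": (3.628, 1.770, 7.437),
    "stage T2": (11.104, 2.590, 47.597),
    "poor/undifferentiated": (2.110, 1.184, 3.762),
}

summaries = {name: log_odds_summary(*values) for name, values in reported.items()}
for name, (beta, se) in summaries.items():
    print(f"{name}: beta = {beta:.3f}, SE = {se:.3f}")
```

Under this assumption, each coefficient lies at the midpoint of its log-transformed interval, and the wide interval for stage T2 translates into the largest standard error of the three factors.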

Validation of the model
The nomogram demonstrated good accuracy for predicting positive lymph nodes, with a C-index of 0.718 (95% CI, 0.676-0.760) and an AUC of 0.702 (95% CI, 0.659-0.702). The calibration plot presented good agreement between the bias-corrected prediction and the ideal reference line with an additional 1000 bootstraps (mean absolute error = 0.02) (Fig. 2a, c).
The C-index and AUC of the validation set were 0.73 (95% CI: 0.665-0.795) and 0.692 (95% CI: 0.625-0.759), respectively, which revealed good concordance and reliable ability to estimate the status of lymph node involvement. The calibration plot of validation also demonstrated good agreement between the bias-corrected prediction and the ideal reference line with an additional 1000 bootstraps (mean absolute error = 0.035) (Fig. 2b, d).

Comparison between different prediction methods
Comparisons between different prediction methods were conducted by decision curve analysis. The decision curve has the ability to show the clinical usefulness of each method based on a continuum of potential thresholds for LNM risk (x-axis) and the net benefit of using the model to risk stratify patients (y-axis) relative to assuming that no patient will have LNM. Figure 3 reveals that the nomogram provided the largest net benefit across the range of LNM risk compared with the methods using tumor size, differentiation and T-stage alone.
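The net benefit plotted on the y-axis of a decision curve can be sketched as follows, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The function names and the toy data are assumptions for illustration and do not reproduce Figure 3.

```python
def net_benefit(predicted, observed, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(observed)
    true_pos = sum(1 for p, y in zip(predicted, observed) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(predicted, observed) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(observed, threshold):
    """Reference strategy: assume every patient has LNM."""
    prevalence = sum(observed) / len(observed)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data, not taken from the study. Assuming no patient has LNM
# gives a net benefit of 0 at every threshold.
dca_predicted = [0.05, 0.10, 0.40, 0.40, 0.60]
dca_observed = [0, 0, 0, 1, 1]
nb_model = net_benefit(dca_predicted, dca_observed, threshold=0.2)
nb_all = net_benefit_treat_all(dca_observed, threshold=0.2)
print(round(nb_model, 3), round(nb_all, 3))
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies; repeating the calculation over a grid of thresholds traces out the decision curve.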

Discussion
GBC is a highly occult cancer with no obvious clinical manifestations in its early stage [3]. With the development of laparoscopy, an increasing number of stage T1/2 IGBCs can be detected via pathological examination after LC [6]. For IGBCs, postoperative pathological evaluations need to be completed in combination with imaging for re-resection [18,19].
For patients with LNM, lymphadenectomy is an important part of radical resection, and all positive lymph nodes need to be cleared [20]. Although high-quality lymph node dissection has been emphasized, preoperative lymph node examination was seriously insufficient, given that the lymph node resection rates for T1a, T1b, and T2 GBC were only 33.6, 39.2, and 53.7%, respectively, according to a SEER-based study [7]. Although current NCCN guidelines recommend radical surgery for all patients with GBC at stages T1b and above [18], several studies have concluded that patients with T1b and T2 stages might not require radical surgery [21][22][23][24]. However, because some studies have shown that LNM is closely related to the malignant phenotype of early-stage GBC [25,26], we believe that patients diagnosed with LNM preoperatively should receive more aggressive surgical treatment and more extensive lymph node dissection than patients without LNM.
CT is the most commonly used clinical imaging method [27]. Although CT can accurately show the invasion of tumors into blood vessels and adjacent organs, its accuracy for the identification of LNM is very low [28]. Some studies have shown that more than half of the positive lymph nodes in GBC patients cannot be detected by preoperative CT examination [24,27,29]. Unfortunately, neither MRI nor PET-CT is a good supplement for CT [28,30,31]. The present nomogram may be combined with clinical imaging to further improve the estimation of the risk of LNM, which would help clinicians choose the most suitable surgical methods for patients.
Among the cases of GBC included in this study, the LNM rate was 7% for stage T1a, 11.1% for stage T1b, and 44.3% for stage T2. For a variety of early primary cancers in the digestive tract, such as gastric cancer [14], appendiceal cancer [15], and colon cancer [16], the SEER database has been used to establish a nomogram for predicting the risk of LNM. In this paper, the SEER database was used to predict the risk of LNM in IGBC and construct a nomogram. In the present study, tumor diameter, tumor differentiation degree and T stage were independent factors influencing metastasis, of which T stage was the most significant factor. Compared with stage T1a, stage T2 was associated with an approximately 11-fold increase in the risk of LNM. The second most significant factor was tumor diameter. When the tumor diameter was greater than 1 cm, the risk of LNM increased approximately 3.6-fold. According to the nomogram, there was little difference in the risk of LNM when the tumor diameter was greater than 1 cm, but the risk was reduced when the tumor diameter was greater than 4 cm. The least significant factor was tumor differentiation. The risk of LNM in patients with poorly differentiated or undifferentiated tumors was only about twice as high as that in patients with well-differentiated tumors.
Gallbladder adenocarcinoma (76-90%) and squamous cell carcinoma (2-10%) are the two most common pathological patterns of GBC, and the prognosis of squamous cell carcinoma is worse than that of adenocarcinoma [32]; however, our study indicated no significant correlation between pathological pattern and LNM. We believe that there are two possible explanations: (1) according to the relevant literature, squamous cell carcinoma is more likely to invade the liver than to metastasize to lymph nodes [33], which may further support the absence of a correlation between pathological pattern and LNM; and (2) the number of T1/2 squamous cell carcinomas is too small to reach statistical significance.
Considering the low incidence rate of GBC, few single-center studies have previously used clinical data to predict the risk of LNM in early GBC. Therefore, we used DCA to compare the differences in predictive power between the nomogram and the individual included variables. According to Fig. 3, the threshold probability ranges of differentiation, T stage, tumor size and the nomogram are 0.23-0.49, 0.03-0.45, 0.28-0.51 and 0.02-0.63, respectively. The curve of T stage is very close to that of the nomogram containing three factors, but the threshold range of T stage is narrower than that of the nomogram. When the risk is decided at a probability threshold lower than 0.38, the T-stage curve and the nomogram curve almost overlap, which indicates that the two prediction models have almost the same net benefit within this range, and both are higher than the reference line. However, when the risk is decided at a threshold higher than 0.38, the net benefit of T stage is not as good as that of the nomogram.
A comparison between tumor size and differentiation shows that when the risk is decided at a probability threshold of 0.23-0.28, the net benefits of tumor size and differentiation are very close and nearly equal to the reference line; at a probability threshold of 0.28-0.35, the net benefits of the two are still very close but higher than the reference line; at a probability threshold of 0.35-0.4, the net benefit of differentiation is relatively high; and at a probability threshold higher than 0.4, the net benefit of tumor size is less than 0, while the differentiation model retains a higher net benefit than the tumor size model. However, the net benefits of these two models within their threshold ranges are both smaller than that of the nomogram. To sum up, although the univariate models have a certain predictive power, DCA shows that the nomogram provides a net benefit over a wider range of thresholds.
For GBC patients with LNM, existing studies recommend cholecystectomy and lymph node dissection for patients at stage T1a [34] and radical surgery for patients at stage T1b/T2 [26]. The total score calculated by the nomogram corresponds to the risk of LNM. Zhu et al. [35] proposed that patients with a predicted risk of LNM ≤ 5.0% be considered the low-risk group, those with a predicted risk of 5-15% the intermediate-risk group, and those with a predicted risk > 15% the high-risk group. Combining these conclusions with our study, we suggest that patients in the low-risk group could choose long-term follow-up and that patients in the high-risk group should be recommended for re-resection; patients in the intermediate-risk group could choose long-term follow-up, but re-resection should preferably be recommended. Take a T1b IGBC patient as an example. In clinical practice, if a patient pathologically diagnosed with T1b IGBC after LC is reluctant to undergo re-resection and no LNM is found by imaging, which has a low ability to detect LNM [27,28,30,31], the clinician faces a dilemma as to whether re-resection is needed. In this case, the clinician may use our nomogram to support the decision. If the patient is pathologically confirmed to have a poorly differentiated or undifferentiated tumor with a diameter between 3 and 4 cm, the total score will be 113; the corresponding risk of LNM is nearly 19%, and the patient is allocated to the high-risk group. The clinical suggestion is that the patient should undergo radical re-resection. In contrast, if the T1b patient has a well-differentiated tumor with a diameter of less than 1 cm, the total score will be 20, the risk of LNM is nearly 3%, and the patient is allocated to the low-risk group. The clinical suggestion is that the patient could choose regular follow-up.
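The risk stratification described above can be written as a small Python sketch. The cut-offs follow the thresholds attributed to Zhu et al. [35] in the text; the function name `lnm_risk_group` and the wording of the returned labels are assumptions, and the sketch takes the predicted risk as an input rather than reproducing the nomogram's point scale.

```python
def lnm_risk_group(predicted_risk):
    """Map a predicted LNM probability (0 to 1) to a risk group.

    Cut-offs: <= 5% low, > 5% and <= 15% intermediate, > 15% high.
    """
    if not 0.0 <= predicted_risk <= 1.0:
        raise ValueError("predicted_risk must be a probability between 0 and 1")
    if predicted_risk <= 0.05:
        return "low"
    if predicted_risk <= 0.15:
        return "intermediate"
    return "high"


# The two T1b examples from the text: about 19% and about 3% predicted risk.
print(lnm_risk_group(0.19))  # poorly differentiated, 3-4 cm tumor
print(lnm_risk_group(0.03))  # well-differentiated, < 1 cm tumor
```

The first example falls in the high-risk group and the second in the low-risk group, matching the allocations given in the text.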
We must recognize the limitations of our study. First, all selected patients underwent lymph node biopsy, and the median number of lymph nodes examined in the training set was 2 (IQR: 1-5); however, selection bias between LN+ and LN- patients due to the non-randomized nature of this study cannot be ruled out. Steffen et al. [7] claimed that retrieval of even a few lymph nodes reliably predicts the lymph node status, which may compensate for this bias. Second, previous studies have concluded that age < 60, elevated CA199 levels [27], and hepatic-sided tumors [36] can also be used for predicting LNM. However, in this study, age was not significantly associated with LNM, and this study lacked information on preoperative CA199 levels and tumor location, which may mean that some influencing factors were not captured. Last but not least, the data in the SEER database originate from different sources and hospitals [3], so our study can be considered a multicenter study. However, GBC has regional differences in incidence [37]. Although the nomogram constructed in this study was validated internally and externally and showed good prediction ability, in our view, its generalization ability still needs to be verified with clinical data from outside the SEER database. Therefore, we hope that in the future, large samples of GBC patients from different regions can be obtained to construct a nomogram using the three variables selected in this study, for further external validation and for measurement of the generalization ability of the nomogram.
Despite the limitations above, this large-sample study predicts LNM with good discrimination and calibration in both the training and validation cohorts. The nomogram constructed in this study visualizes the risk factors and could better guide clinical decisions.

Conclusion
In conclusion, based on the clinical risk factors identified in a large population-based cohort, we established the first practical nomogram that could objectively and accurately predict the individualized risk of LNM for IGBC patients who require re-resection. Moreover, the validation set results demonstrated that the nomogram performed well and had good accuracy and reliability. Our nomogram was shown to be clinically useful in DCA, and it compensated for the inaccuracy of imaging.
Therefore, these results could help clinicians improve individual treatment and make clinical decisions regarding patients with T1/2 stage IGBC.