Modelling the probability of erroneous negative lymph node staging in patients with colon cancer

Background Patients in whom an insufficient number of lymph nodes (LNs) is analysed are more likely to receive an incorrect LN staging. The ability to calculate the overall probability of undiagnosed LN involvement in these patients could be very useful for approximating the real patient prognosis and for giving possible indications for adjuvant treatments. The objective of this work was to establish the predictive capacity and prognostic discriminative ability of the final error probability (FEP) among patients with colon cancer and with a potentially incorrectly-staged LN-negative disease.
Methods This was a retrospective multicentric population study carried out between January 2004 and December 2007. We used a mathematical model based on Bayes’ theorem to calculate the FEP, i.e. the probability of LN involvement given a negative test result. Cumulative sum graphs were used to calculate risk groups and the survival rates were calculated, by month, using the Kaplan–Meier method.
Results A total of 548 patients were analysed and classified into three risk groups according to their FEP score: low-risk (FEP < 2%), intermediate-risk (FEP 2%–15%), and high-risk (FEP > 15%). Patients with LN involvement had the lowest overall survival rate when compared to the three risk groups. This difference was statistically significant for the low- and intermediate-risk groups (P = 0.002 and P = 0.004, respectively), but the high-risk group presented similar survival curves to the pN+ group (P = 0.505). In terms of disease-free survival, the high-risk group presented similar curves to the intermediate-risk group until approximately 60 months’ follow-up (P = 0.906). After 80 months’ follow-up, the curve of the high-risk group coincided with that of the pN+ group (P = 0.172). Finally, we summarised the FEP according to the number of analysed LNs, accompanied by a contour plot which represents its calculation graphically.
Conclusions The application of Bayes’ theorem in the calculation of FEP is useful to delimit risk subgroups from among patients without LN involvement.


Introduction
Colon cancer is the most frequent malignancy in both sexes in Western countries, with an incidence of approximately 471,000 cases per year and a mortality of 228,000 cases per year in Europe [1]. Lymph node (LN) involvement is the prognostic factor most directly related to the survival and disease-free interval of cancer patients.
Thus, patients with stages I and II cancers have a 5-year overall survival (OS) rate higher than 75% compared to 30%-60% in patients with stages III and IV cancers [2][3][4]. Tumour-node-metastasis (TNM) classification is the gold standard method for staging colon cancer; however, this system recommends collecting at least 12 LNs for correct staging. Despite the multidisciplinary approach to LN analysis in colon cancer, for various reasons related to patients, surgeons, and pathologists, the number of LNs analysed varies widely between patients [5], and significantly fewer than 12 are usually analysed [6,7]. Without a doubt, patients with a pN0 LN staging have the highest risk (in terms of their therapeutic management) of suffering the most harmful consequences of being given an incorrect classification and prognosis. Calculation of the final error probability (FEP), i.e. the probability that the patients will present undiagnosed LN involvement, would be very useful for predicting the real prognosis and possible indications for adjuvant treatment among these patients.

Open Access
Cancer Communications. *Correspondence: carlosfortea@gmail.com. 1 Division of Colorectal Surgery, Department of Surgery, Consorci Hospitalari Provincial de Castelló, Castelló, Spain. Full list of author information is available at the end of the article.

Bayes' theorem can be used to calculate the probability that a patient has affected LNs even when the anatomopathological study is negative. It treats the anatomopathological study of the surgical specimen as a diagnostic test with a binary result: positive LNs (presence of disease) or negative LNs (absence of disease) [8]. The probability of a patient having the disease despite a negative test result (i.e., the probability that a patient with a negative histological study result has an unidentified lymph-node metastasis) is the complement of the negative predictive value. The objective of this work was to establish the predictive capacity and prognostic discriminative ability of the FEP among patients with colon cancer and with a potentially incorrectly-staged LN-negative disease.

Patients
This is a multicentric population study using data from a high-quality tumour sample registry included in the European cancer registry-based study on survival and care of cancer patients (EUROCARE study) [1]. The data used from this registry corresponded to the period between January 2004 and December 2007. Patients with colon cancer treated with surgery with curative intent and lymphadenectomy, a complete anatomopathological report, and a clear clinical status at their last follow-up were included. Patients with cancer of the rectum or caecal appendix, with metastases at diagnosis, scheduled surgery with palliative intention without lymphadenectomy, scheduled surgery without resection, incomplete anatomopathological reports, an uncertain vital status at the last follow-up control, and those with insufficient or no monitoring were excluded. The study was approved by the institutional review board of the Hospital General de Castellon (PIC: 2013/2/CIR). All participating patients provided their written informed consent.

Variables
The study variables were age, sex, tumour location, histology, differentiation grade, tumour size, number of analysed LNs, number of positive LNs, TNM classification, condensed T and N stages, chemotherapy, FEP, OS, disease-free survival (DFS), overall recurrence, locoregional recurrence, metastasis, and follow-up time.
Because all the data in the tumour registry were coded according to the sixth edition of the Union for International Cancer Control (UICC) TNM classification, we had to adapt them to the guidelines of the seventh edition. Although the N category was easily adapted, the T category could not be adapted to the new classification because the tumour registry contained insufficient data. As in other population studies, to minimise the effects of possible misclassifications, we used condensed TNM stages. The recurrence variable included patients who presented locoregional recurrence and those who presented distant metastases.

FEP
We used a well-known mathematical model based on Bayes' theorem to calculate the various diagnostic test parameters (sensitivity, specificity, and predictive values). According to Bayes' theorem, the FEP, i.e. the probability of LN involvement (N+) given a negative test result (n−), written p(N+/n−), can be deduced from the following mathematical formula [8]:

$$p(N{+}/n{-}) = \frac{p(n{-}/N{+})\, p(N{+})}{p(n{-}/N{+})\, p(N{+}) + p(n{-}/N{-})\, p(N{-})}$$

In the Bayes' theorem formula: p(N+) is the prevalence of pN1 cases in the series; p(N−) is the complement of p(N+); p(n−/N+) is the probability of a false negative (1 − Sensitivity) and is calculated as the hypergeometric probability determined by (1) the total LNs analysed from all the patients in the series; (2) the total number of positive LNs obtained in the series; (3) the number of positive LNs in a specific patient (equal to 0 for pN0 cases); and (4) the number of LNs analysed in a specific patient; the Specificity is p(n−/N−) and equals 1 because false-positive LN results are considered impossible.
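As an illustration of the calculation described above, the following is a minimal Python sketch rather than the authors' implementation. It assumes that the false-negative probability is the hypergeometric probability of finding zero positive nodes among the nodes analysed in one patient, and that specificity is fixed at 1 as the text states. The function names and the series figures (8000 nodes, 600 positive, 25% prevalence) are hypothetical and are not taken from the study.

```python
from math import comb


def false_negative_probability(total_lns, total_positive_lns, analysed_lns):
    """Hypergeometric probability of drawing zero positive LNs.

    Treats the nodes analysed in one patient as a sample, without
    replacement, from the pooled nodes of the whole series.
    """
    negative_lns = total_lns - total_positive_lns
    # P(X = 0) = C(K, 0) * C(N - K, n) / C(N, n), with C(K, 0) = 1
    return comb(negative_lns, analysed_lns) / comb(total_lns, analysed_lns)


def final_error_probability(prevalence, p_false_negative, specificity=1.0):
    """Bayes' theorem: p(N+/n-) from prevalence, p(n-/N+) and specificity."""
    numerator = p_false_negative * prevalence
    denominator = numerator + specificity * (1.0 - prevalence)
    return numerator / denominator


# Hypothetical series: 8000 nodes analysed overall, 600 of them positive,
# and a pN1 prevalence of 25%. None of these figures come from the study.
for n in (4, 12, 24):
    p_fn = false_negative_probability(8000, 600, n)
    print(n, round(final_error_probability(0.25, p_fn), 4))
```

Under these assumptions the FEP falls as more nodes are analysed, and it equals the prevalence when no nodes are analysed at all.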
Given that patients misclassified as pN0 (because they had an insufficient number of LNs analysed) are substantially more likely to have pN1 rather than pN2 or pN3 tumours, we decided to calculate the FEP of pN1 incorrectly being classified as pN0. Thus, all the FEP calculations refer to this adjusted FEP, set to pN1. Once the FEP was obtained, Cumulative Sum (CUSUM) [9] curves were used to calculate the optimal cut-off points following the method described by Barrio et al. [10], to obtain three risk groups for incorrect pN0 classification.
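Once cut-off points are available, assigning a patient to a risk group is a simple threshold rule. The sketch below uses the cut-offs reported in the abstract (FEP < 2%, 2%-15%, > 15%). It does not reproduce the CUSUM procedure used to derive them, and the function name and the handling of values exactly on a boundary are assumptions made for illustration.

```python
def fep_risk_group(fep, low_cutoff=0.02, high_cutoff=0.15):
    """Map an FEP (as a proportion, 0 to 1) to a risk-group label.

    Cut-offs default to the 2% and 15% reported in the abstract.
    Assumption: values exactly at 2% or 15% count as intermediate.
    """
    if not 0.0 <= fep <= 1.0:
        raise ValueError("FEP must be a proportion between 0 and 1")
    if fep < low_cutoff:
        return "low"
    if fep <= high_cutoff:
        return "intermediate"
    return "high"


# Hypothetical FEP values, not patient data.
for value in (0.005, 0.08, 0.22):
    print(value, fep_risk_group(value))
```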

Statistical analysis
Quantitative variables are expressed as the mean ± standard deviation (SD). Categorical variables are reported as frequencies and percentages. For the univariate analysis, the chi-square test (or Fisher's exact test in small samples) was used to compare two qualitative samples; the Student t-test was used to compare two quantitative samples; and the ANOVA test was used to compare more than two quantitative samples.
The follow-up time we considered was from the date of surgery until the day of death, or the last day of follow-up in patients who did not die. This was because the tumour registry did not contain any clear definition of the date of diagnosis. Survival analysis was performed using the Kaplan-Meier method and the log-rank test was used to assess the differences between groups in terms of OS and DFS. Probability values of P < 0.05 were accepted as the statistical significance cut-off level. Statistical analysis was carried out with the IBM SPSS Statistics® program version 22 (IBM®, Armonk, New York, USA). The CUSUM curves were calculated using the STATA® program version 14 (StataCorp LP®, College Station, Texas, USA).
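For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines of Python. This is an illustrative re-implementation, not the SPSS procedure used in the study. It does not include the log-rank test, and the function name and follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  -- follow-up in months from surgery
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # everyone who exits the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        if deaths[t]:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - deaths[t] / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up data for five patients (not from the registry).
months = [6, 12, 12, 30, 48]
died = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(months, died):
    print(t, round(s, 3))
```

Patients censored at a given time are still counted as at risk for deaths occurring at that same time, which is the usual convention.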

Results
During the period from January 2004 to December 2007, 944 patients were diagnosed with colon cancer in Castellon province (Spain), 140 of whom were not operated on because they had contraindications for anaesthesia or unresectable neoplasms. Eighty-three patients were operated on with palliative intention and did not have an accurate lymphadenectomy record, so they were not included in this work. We also excluded 116 patients with colon neoplasms who underwent surgery but also presented synchronous distant metastases. Insufficient data were obtained regarding the number of LNs analysed and affected in 49 cases, and for 8 patients no follow-up data were recorded. Thus, we eventually analysed 548 patients (Fig. 1).
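The cohort flow described above can be checked with a short calculation. The counts are those given in the text, while the dictionary labels are paraphrases chosen for this sketch.

```python
diagnosed = 944

# Exclusion counts as reported in the text; labels are paraphrased.
exclusions = {
    "not operated (anaesthetic contraindication or unresectable)": 140,
    "palliative surgery without accurate lymphadenectomy record": 83,
    "synchronous distant metastases": 116,
    "insufficient LN data": 49,
    "no follow-up data": 8,
}

analysed = diagnosed - sum(exclusions.values())
print(analysed)  # prints 548
```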
The clinical and histopathological characteristics of the 548 patients, and of the three risk groups, are shown in Table 1. Of note, (1) younger patients had a significantly lower risk (P = 0.002); (2) more tumours were located in the right colon in low-risk patients (P < 0.001); (3) the high-risk group contained more well-differentiated tumours (P = 0.013); (4) the low-risk group had more cases of pT3-T4 TNM classifications (P = 0.044); (5) as the patient risk increased, the tumour sizes tended to decrease (P < 0.001); and (6) as fewer LNs were analysed, the patient risk and overall mortality increased (P < 0.001 and P = 0.019, respectively). The following factors were identified as being related to OS (
Table 3 shows the FEP results according to the number of analysed LNs and is accompanied by a contour plot (Fig. 2) which represents its calculation graphically.
The OS charts (Fig. 3a) show that patients with LN involvement had the lowest OS rate when compared with the three risk groups. This difference was statistically significant for the low- and intermediate-risk groups (P = 0.002 and P = 0.004, respectively), but not for the high-risk group (P = 0.505). In other words, N0 patients with a high-risk FEP (> 15%) had an OS rate similar to that of pN+ patients.
We subsequently carried out a similar analysis in which we divided patients with LN involvement into pN1 and pN2 groups. As shown in Fig. 3b, the OS rates differed between the low- and high-risk groups (P = 0.014). The pN1 group differed significantly from the low-risk and pN2 groups (P = 0.008 and P = 0.034, respectively). The pN2 group, in turn, had a significantly worse prognosis than the low- and intermediate-risk groups (all P < 0.001), but did not differ significantly from the high-risk group (P = 0.066). Interestingly, the survival curves of the high-risk and pN1 patients were very similar (P = 0.980). In terms of DFS (Fig. 4a)

Discussion
LN involvement is one of the most important prognostic factors in colon cancer. Given its enormous importance, especially in terms of prognostic and therapeutic decisions, gaining a detailed picture of the true LN status of patients with colon cancer should be a priority for clinicians involved in the diagnostic-therapeutic process of these patients. In this sense, FEP may be useful, especially for patient groups in which staging errors are more frequent. Our group has extensive experience in using FEP to assess LN involvement in the contexts of colon [11], gastric [12], and breast cancers [13]. This method aims to calculate the probability that a pN+ patient in whom an insufficient number of LNs was analysed will be erroneously classified as pN0. This has obvious clinical and prognostic consequences, especially given that stage III patients (i.e., those with LN involvement) can significantly benefit from chemotherapy [2,14,15,16]. Correctly staging patients with colon cancer is very important given that around 60% of them do not currently present LN involvement at diagnosis [6,7], a figure similar to the 63.3% staged as pN0 in this study. FEP is particularly important for patients in whom an insufficient number of LNs was analysed because their TNM classification would not necessarily be accurate. This was the case in the series used in this study, in which the threshold of 12 LNs was not reached in 56% of the cases. Similar results have been reported in several previous studies, especially in population analyses like ours, in which this lymph-node threshold goal was not reached [2,6,17,18]. It is common for so few LNs to be analysed, likely because of factors related to the patient and surgeons, and to the type of anatomopathological study undertaken. Our results clearly show that FEP is negatively related to OS: the higher the FEP, the lower the OS rate (Fig. 3a).
Moreover, patients with LN metastases have a poorer prognosis than those classified into the three risk groups. Most importantly, there were no statistically
To try to increase the accuracy of our analysis, we further divided patients with LN infiltration into pN1 and pN2 groups and, as expected, pN2 patients had the poorest prognosis of the three groups whereas the low-risk group had the best OS rate (Fig. 3b). It is also important to highlight that the high-risk and pN1 patients had very similar OS curves. Therefore, the pN1 patients had a similar OS rate to the pN0 patients with a high staging error risk.
In terms of DFS, patients with an intermediate and high risk had a DFS rate similar to that of pN1 patients (Fig. 4b).
Thus, our data confirm that a minimum of 12 LNs must be analysed to reduce TNM classification staging errors. It should also be noted that low-risk patients had more right-colon neoplasms, were younger, and had larger tumours, i.e., their tumour characteristics favoured easier LN analysis [5,19]. Similarly, when younger patients were included in this group and more LNs were analysed, their OS rate was better.
The use of this mathematical model, which is based on Bayes' theorem, has been previously described in the literature for identifying groups with similar prognoses among patients with different cancers but with similar characteristics to those of our patient cohort. This model was first described by Kiricuta et al. [8] in 1992 and was applied in breast cancer to calculate the probability of tumour persistence after an incomplete axillary dissection, staging the patients according to the T category of the TNM classification. Later, Okamoto et al. [20] studied the probability of LN involvement in patients with negative sentinel-LN breast cancer using a Bayesian model. Iyer et al. [21] further demonstrated the usefulness of this method in 1652 patients with breast cancer in a study which aimed to evaluate the probability of LN involvement stratified by T1 and T2 stage. Following on from this work, Joseph et al. [22] studied 1585 patients with colorectal cancer and used this method to estimate the probability of LN involvement according to the number of analysed LNs, using the same staging as Kiricuta et al. [8]. In a study of 480 patients with colon cancer, Martínez et al. [11] also showed that the risk of an erroneous negative LN classification in colon cancer can be individualised by calculating its probability according to Bayes' theorem.