A nomogram composed of clinicopathologic features and preoperative serum tumor markers to predict lymph node metastasis in early gastric cancer patients

Accurate prediction of lymph node metastasis (LNM) is of great importance for formulating optimal preoperative treatment strategies for patients with early gastric cancer (EGC). This study aimed to explore risk factors that predict the presence of LNM in EGC. A total of 697 patients who underwent gastrectomy were enrolled and divided into a training set and a validation set, and the relationships between LNM and the clinicopathologic features and the preoperative serum combined tumor marker (CEA, CA19-9, CA125) were evaluated. Risk factors for LNM were identified using logistic regression analysis, and a nomogram predicting the probability of LNM was created in R using the training set, while receiver operating characteristic (ROC) analysis was applied to assess the predictive value of the nomogram model in the validation set. LNM was significantly associated with tumor size, macroscopic type, differentiation type, ulcerative findings, lymphovascular invasion, depth of invasion and combined tumor marker. In multivariate logistic regression analysis, tumor size, differentiation type, ulcerative findings, lymphovascular invasion, depth of invasion and combined tumor marker were independent risk factors for LNM. A predictive nomogram incorporating these independent factors was constructed for EGC patients, and the ROC curve demonstrated good discrimination, with an AUC of 0.847 (95% CI: 0.789-0.923), which was significantly larger than the AUCs produced by models from previous studies. Therefore, by including tumor markers that are convenient and feasible to obtain from serum preoperatively, the nomogram could effectively predict the incidence of LNM in EGC patients.


INTRODUCTION
The incidence of early gastric cancer (EGC), defined as adenocarcinoma limited to the mucosa or submucosa of the stomach irrespective of lymph node metastasis (LNM), has been increasing worldwide. [1][2][3] Apart from gastrectomy with lymphadenectomy, endoscopic techniques, including endoscopic mucosal resection (EMR) and endoscopic submucosal dissection (ESD), have gained increasing popularity and are widely regarded as an alternative treatment for some EGC patients [4,5], allowing patients to avoid a potentially morbid surgical procedure, preserve stomach function and maintain a high postoperative quality of life. [6][7][8][9] Nevertheless, endoscopic resection with curative intent should be considered only in the absence of regional lymph node metastases, as regional lymph nodes are left untreated by this procedure. [10,11] Thus, identifying the risk factors for LNM is of crucial importance in determining the optimal treatment for EGC patients.
Previous studies suggested that some clinicopathologic features, such as differentiation type, depth of invasion, tumor size and the presence of ulceration [12][13][14][15], and biological markers including P53, Ki67, Her-2 and E-cad [16,17], were independent risk factors for LNM, although unanimous agreement has not been reached. However, few studies have evaluated the correlation between the preoperative serum tumor markers (CEA, CA125, CA19-9) and LNM in EGC [18,19], and nomograms have been applied to quantify risk factors for LNM in several carcinomas other than EGC [20,21]. Furthermore, no predictive nomogram has combined clinicopathologic features and preoperative serum tumor markers to estimate the risk of LNM in EGC. Therefore, the aim of this study was to identify risk factors for LNM and to construct a nomogram based on these factors to guide treatment for EGC patients.

RESULTS
Correlation analysis between the clinicopathologic features and lymph node metastasis (LNM)
A total of 697 early gastric cancer (EGC) patients were enrolled in this study, including 446 male and 251 female patients. The average age was 56.6 years (range, 25-83 years). The training set comprised 598 patients, with 447 in the LNM (-) group and 151 in the LNM (+) group, while the validation set comprised 99 patients, with 67 in the LNM (-) group and 32 in the LNM (+) group. No clinicopathologic feature differed significantly between the training set and the validation set (all p values >0.05), indicating a similar composition and a balanced baseline between them.
As shown in Table 1, LNM was significantly associated with tumor size, macroscopic type, differentiation type, ulcerative findings, lymphovascular invasion, depth of invasion and combined tumor marker in both the training set and the validation set. Specifically, patients with larger tumor size, depressed/mixed macroscopic type, undifferentiated type, submucosal invasion, ulcerative findings or a positive combined tumor marker were significantly more frequent in the LNM (+) group than in the LNM (-) group.

Identification of risk factors and multivariate analysis for LNM
As illustrated in Table 2, logistic regression analysis was performed to determine the risk factors for LNM.

The nomogram for predicting LNM
A nomogram was then constructed from the independent risk factors in the training set to predict LNM in patients with EGC. This nomogram model, based on the risk factors that could affect the incidence of LNM, is displayed in Figure 1. For each patient, points were assigned for each of the clinicopathologic risk factors (tumor size, differentiation type, ulcerative findings, lymphovascular invasion, depth of invasion and combined tumor marker), and the total points, summed on the nomogram, corresponded to a predicted probability of LNM.
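The relationship between a logistic regression model and nomogram points can be sketched as follows. This is a minimal illustration, not the fitted model of this study: the function names are hypothetical, the coefficients must be supplied by the reader, and the scaling (the factor with the largest absolute coefficient scores 100 points) is a common convention that is assumed here rather than taken from the text.

```python
import math

def predicted_probability(intercept, coefficients, features):
    """Logistic model: P = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))).

    coefficients and features are dicts keyed by factor name;
    features are coded 0/1 (assumption: all factors are binary).
    """
    linear = intercept + sum(coefficients[name] * features[name]
                             for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))

def nomogram_points(coefficients, features):
    """Points per factor, scaled so the strongest factor scores 100.

    Assumption: this mirrors the usual nomogram convention; the
    scaling used for Figure 1 is not stated in the text.
    """
    largest = max(abs(b) for b in coefficients.values())
    return {name: 100.0 * coefficients[name] * features[name] / largest
            for name in coefficients}
```

Because the points are a linear rescaling of each factor's contribution to the linear predictor, a higher total always maps to a higher predicted probability, which is what allows the total-points axis of a nomogram to be read against a probability axis.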
In addition, an ROC curve and a calibration plot were used to validate the predictive accuracy of the nomogram model. Specifically, the ROC curve in Figure 2 yielded an AUC of 0.847 (95% CI: 0.789-0.923), which indicated good concordance and a reliable ability to estimate the status of lymph node involvement. The calibration plot in Figure 3 showed the performance characteristics of the nomogram. The x-axis was the probability of LNM predicted by the nomogram, while the y-axis was the actual probability of LNM. In the plot, the dotted line (blue) indicated the ideal nomogram, in which predicted and actual probabilities are identical; the dashed line (red) indicated the apparent performance of the nomogram; and the solid line (black) presented the bootstrap-corrected performance of our nomogram, an estimate of future accuracy. The predicted probability calculated using the nomogram corresponded closely to the actual outcomes, as the solid line was close to the dotted line.
To assess whether this model was trustworthy and to evaluate how much improvement was gained by using the clinicopathologic features and biomarkers in this study, we also validated several predictive models, composed of different factors reported in previous studies [7][22][23][24][25], and generated the corresponding areas under the curve (Table 3). We then compared them with the AUC produced by our predictive model and found that our AUC value was significantly larger than those of the previous models (all p<0.05, Table 3).
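The idea behind a calibration plot can be illustrated with a simple binned comparison of predicted and observed risk. This is only a sketch under stated assumptions: it uses equal-sized groups instead of the bootstrap-corrected curve shown in Figure 3, and the function name and the default of five groups are hypothetical choices made for illustration.

```python
def calibration_table(predicted, observed, n_bins=5):
    """Compare mean predicted probability with the observed LNM rate.

    predicted: predicted probabilities of LNM (0 to 1)
    observed:  actual outcomes, 1 for LNM (+) and 0 for LNM (-)
    Returns one (mean_predicted, observed_rate, group_size) tuple per
    non-empty group, ordered from lowest to highest predicted risk.
    """
    pairs = sorted(zip(predicted, observed))
    total = len(pairs)
    rows = []
    for i in range(n_bins):
        # Split the sorted patients into n_bins nearly equal groups.
        chunk = pairs[i * total // n_bins:(i + 1) * total // n_bins]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, obs_rate, len(chunk)))
    return rows
```

For a well-calibrated model the two values in each row are close, which corresponds to points lying near the ideal diagonal of the plot.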

DISCUSSION
Accurate prediction of lymph node metastasis (LNM) is of great importance for formulating optimal preoperative treatment strategies for patients with early gastric cancer (EGC). This study, which evaluated 697 EGC patients, provided detailed data on LNM risk factors and developed a nomogram to predict the risk of LNM in EGC patients. In this study, tumor size, differentiation type, ulcerative findings, lymphovascular invasion and depth of invasion were independent risk factors for LNM. Large tumor size has been reported to be generally characterized by aggressive tumor behavior, which is significantly related to worse overall survival. [26] The depth of tumor invasion, which reflects the progression of a primary tumor originating from the mucosal layer, has been significantly correlated with the presence of LNM in EGC. [13] In the present study, patients in the LNM (+) group more frequently had larger tumors (≥2 cm) and deeper tumor invasion (submucosa), which is consistent with previous studies. A majority of studies showed that patients with poorly differentiated tumors and ulcerative findings had higher rates of LNM [22][27][28][29][30] and a poor prognosis, while some authors reported that differentiation type and ulcerative findings were not significantly associated with LNM [15,16,31] and were not prognostic factors for EGC patients. Our findings suggested that LNM was more likely in patients with undifferentiated type, lymphovascular invasion or ulcerative lesions, which is consistent with the former reports. We believe that LNM, as an unfavorable factor, could be correlated with undifferentiated type and ulcerative lesions in gastric cancer because of worse biological behavior and tumor progression.
Tumor markers, which can be easily obtained from serum before gastrectomy or endoscopic intervention, were also evaluated in this study. In a recent study, elevated preoperative serum levels of CEA and CA15-3 were shown to be independent predictive factors of axillary lymph node metastasis in patients with breast cancer. [32] A previous study revealed that the tumor markers CA72-4, CA242, CA19-9 and CEA were significantly associated with LNM in gastric cancer patients, and that the combination of these four tumor markers could serve as a diagnostic index of LNM. [18] Although none of the preoperative tumor markers (CEA, CA19-9 and CA125), each divided into positive and negative subgroups by the cutoff points produced in this study, was an individual risk factor for LNM, the combined tumor marker proposed in our study, which integrates these three markers, was demonstrated to be an independent risk factor for LNM. Although it did not carry much weight in the nomogram model (OR=1.231, p=0.034), the combination could be more effective than any of the biomarkers considered alone. Thus, preoperative prediction of lymph node involvement could become more feasible, because monitoring a combination of these serum tumor markers is a promising, noninvasive method that is much more convenient than assessing other factors (e.g. tumor size, differentiation type, depth of invasion).
A nomogram, corresponding to a predictive model that includes the independent risk factors that may affect the incidence of LNM, was constructed in the training set. An ROC curve and a calibration plot were then developed to validate this nomogram; they illustrated good predictive accuracy, with good concordance and a reliable ability to estimate the status of lymph node involvement.
This nomogram provides a helpful method for predicting the likelihood of lymph node metastasis in EGC patients, so that each patient can receive appropriate treatment. For example, a patient with an undifferentiated submucosal EGC, ulcerative findings, tumor size ≥2 cm and a positive combined tumor marker may have a probability of LNM of more than 90%. In this case, the nomogram model suggests gastrectomy with lymphadenectomy rather than endoscopic therapy. Conversely, a patient with the opposite characteristics should receive endoscopic resection, as the risk of LNM is lower than 10%. Moreover, to evaluate the predictive improvement gained by using these clinicopathologic features and biomarkers, we also validated several previous predictive models and compared their AUC values with that of our model; our AUC value was significantly larger than those of the previous models (all p<0.05, Table 3), which suggests that the current model had the best discriminatory ability and predictive accuracy among the models compared. Therefore, we believe this nomogram model will assist surgeons in formulating the optimal treatment strategy for EGC patients according to the probability of LNM.
There were also limitations in our study. Firstly, as this was a retrospective single-center study, our findings could have been observed by chance, and the optimal cutoff points of the serum tumor markers may apply only to our cohort. Secondly, CA72-4 and CA15-3 were not routinely tested for GC patients in our center before 2012, so they were not evaluated in this study. Furthermore, the sample size was not large enough, and external validation in a different population is needed before stronger conclusions can be drawn. Moreover, most of the factors included in the nomogram were postoperative variables, and only the tumor markers could be obtained before surgery, which could limit its use by surgeons choosing the optimal treatment before surgery. However, given that tumor size, differentiation type, ulcerative findings and invasion depth can be roughly assessed by preoperative gastroscopy, EUS and CT, we suggest that endoscopic resection be considered first if the patient is evaluated to have a low probability of LNM according to these preoperative findings. After endoscopic resection, the need for additional surgical intervention could be determined using the proposed nomogram model on the basis of a comprehensive review of the endoscopic specimen. Therefore, until an accurate preoperative diagnostic method for LNM in early gastric cancer is established, the surgical strategy should be considered for each patient on a case-by-case basis.
As shown in our results, the nomogram proposed in this study could effectively predict the incidence of lymph node metastasis in EGC patients and could thereby help surgeons choose the optimal treatment strategy for these patients.

MATERIALS AND METHODS
Patients
The West China Hospital Research Ethics Committee approved the retrospective analysis of anonymous data involved in this study. The data retrieval of this study was based on the Surgical Gastric Cancer Patient Registry in West China Hospital [33]. Patient records were anonymized and de-identified prior to analysis and signed patient informed consent was waived per the committee approval because of the retrospective nature of the analysis.
A total of 697 consecutive EGC patients who received gastrectomy in West China Hospital from January 2000 to December 2015 were retrospectively enrolled in this study. Patients were included if: 1) they were histologically proven to have primary gastric cancer before surgery; 2) pathological examination confirmed that they had received R0 resection [34], a curative resection with negative residual margins; 3) there were no preoperative distant metastases; 4) the clinicopathologic features and the serum tumor markers CEA, CA19-9 and CA125 were clearly recorded. Patients were excluded for any of the following: 1) a history of gastrectomy; 2) any preoperative chemotherapy or radiotherapy; 3) positive residual margins; 4) another malignancy or any other life-threatening disease diagnosed within three years before the operation; 5) in-hospital death due to postoperative complications. Finally, 598 patients enrolled from 2000 to 2013 were used as the training set, while 99 patients enrolled from 2014 to 2015 were used as the validation set.

Definition of combined tumor marker and clinicopathologic features
The preoperative serum tumor markers CEA, CA19-9 and CA125 were each divided into negative and positive groups by the cutoff points of 3.54 ng/ml, 12.83 U/ml and 17.96 U/ml, respectively, produced by ROC analyses (Figure 4). We proposed a new clinicopathologic factor, the combined tumor marker, which was composed of the three tumor markers; it was regarded as positive if two or three of the tumor markers were positive and as negative if two or three of them were negative. The clinicopathologic features analyzed in this study were gender, age, tumor location (upper third, middle third, lower third), tumor size (the maximum diameter of the gastric tumor), lymph node count (number of lymph nodes retrieved at surgery), macroscopic type (elevated, flat, depressed, mixed), tumor differentiation (differentiated: well or moderately differentiated adenocarcinomas; undifferentiated: poorly differentiated or undifferentiated adenocarcinomas), ulcerative findings, lymphovascular invasion, depth of invasion (mucosa, submucosa) and the combined tumor marker. The presence of lymph node metastasis (LNM) was defined as LNM (+), while the absence of LNM was defined as LNM (-).
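The definition of the combined tumor marker can be written as a short rule. The sketch below uses the three cutoff points reported above; the function name is hypothetical, and treating a value strictly above its cutoff as positive is an assumption, since the text does not state how a value exactly at the cutoff is handled.

```python
# Cutoff points reported in the text (CEA in ng/ml; CA19-9 and CA125 in U/ml).
CUTOFFS = {"CEA": 3.54, "CA19-9": 12.83, "CA125": 17.96}

def combined_tumor_marker(cea, ca19_9, ca125):
    """Return True (positive) if at least two of the three markers are positive."""
    values = {"CEA": cea, "CA19-9": ca19_9, "CA125": ca125}
    # Assumption: a marker is positive when strictly above its cutoff.
    n_positive = sum(values[name] > CUTOFFS[name] for name in CUTOFFS)
    return n_positive >= 2
```

With three markers, "two or three positive" and "two or three negative" are mutually exclusive and exhaustive, so a single boolean is enough to encode the factor.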

Statistical analysis and nomogram construction
All statistical analyses and graphics in this study were performed with SPSS version 19.0 and R (version 3.1.2, http://www.R-project.org/). The optimal cutoff points for CEA, CA19-9 and CA125 were determined using receiver operating characteristic (ROC) analyses. The chi-square test was used to analyze unordered categorical variables, whereas the Mann-Whitney U test was used to evaluate ranked variables. Logistic regression analysis was used to analyze risk factors for LNM, and a nomogram was built as a model for predicting the risk of LNM; it illustrates graphically the factors that can be used to calculate the risk of LNM for a patient. The predictive accuracy of the nomogram was then validated using ROC analysis and quantified by the area under the curve (AUC). An AUC of 0.5 indicates no discriminative ability, while an AUC of 1.0 indicates perfect concordance. [35] Moreover, the nomogram was subjected to 1000 bootstrap resamples to reduce overfit bias and for internal validation with a logistic calibration plot. A two-sided p value of less than 0.05 was considered statistically significant.
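As an illustration of the two ROC-based quantities used here, the sketch below computes an AUC as the proportion of LNM (+)/LNM (-) patient pairs that are ranked correctly, and selects a cutoff by maximizing the Youden index. It is written in plain Python for readability and is not the SPSS or R procedure used in the study; in particular, the use of the Youden index is an assumption, because the text states only that the cutoff points were produced by ROC analyses. Both function names are hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. labels: 1 for LNM (+), 0 for LNM (-).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumption: a value strictly above the cutoff is called positive.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v > cut and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v <= cut and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j
```

A score that carries no information gives an AUC of 0.5 under this definition, and a score that separates the two groups completely gives 1.0, matching the interpretation stated above.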