Development and External Validation of a Nomogram for Predicting Overall Survival in Stomach Cancer: A Population-Based Study

Objective This study aimed to develop and externally validate a prognostic nomogram to predict the overall survival of patients with stomach cancer. Methods Demographic and clinical variables of patients with stomach cancer recorded in the Surveillance, Epidemiology, and End Results (SEER) database from 2007 to 2016 were retrospectively collected. Patients were divided into the Training Group (n = 4,456) for model development and the Testing Group (n = 4,541) for external validation. Univariate and multivariate Cox regressions were used to explore prognostic factors. The concordance index (C-index) and the Kolmogorov-Smirnov (KS) value were used to measure discrimination, and the calibration curve was used to assess the calibration of the nomogram. Results Prognostic factors including age, race, marital status, TNM stage, surgery, chemotherapy, grade, and the number of positive regional nodes were used to construct a nomogram. The C-index was 0.790 and the KS value was 0.45 for the Training Group, and the C-index was 0.789 for the Testing Group, suggesting good performance of the nomogram. Conclusion We have developed an effective nomogram with ten easily acquired prognostic factors. The nomogram could accurately predict the overall survival of patients with stomach cancer and performed well on external validation. It may help improve individualized survival prediction and decision-making, thereby improving the outcome and survival of patients with stomach cancer.


Introduction
Stomach cancer is a highly invasive and heterogeneous malignant tumor and a global health problem [1]. It remains the fifth most common cancer and the third leading cause of cancer-related death worldwide, despite decreasing trends in incidence and mortality [2,3]. It is estimated that about 783,000 people died of stomach cancer globally in 2018 and 769,000 in 2020 [2,4].
According to the American Cancer Society, approximately 26,560 new cases of stomach cancer and 11,180 deaths were expected in the United States in 2021 (https://www.cancer.org/cancer/stomach-cancer/about/key-statistics.html). The prognosis of early stomach cancer is relatively good, with a 5-year survival rate of about 69-82% [5]. Despite advances in radical surgical techniques and perioperative chemotherapy, the survival rate of patients with advanced stomach cancer remains poor. Their 5-year overall survival is mostly under 50% [6]. Therefore, identifying independent prognostic factors of stomach cancer is of great significance for improving treatment and prognosis.
Some known demographic and clinicopathological variables affect the survival of gastric cancer patients [3,7-10], and a comprehensive model based on these factors needs to be developed to predict individual survival. Several previous studies have proposed nomograms predicting the survival of patients with stomach cancer [11-13]. Kim et al. developed prognostic nomograms based on several clinicopathological variables, which can independently predict the overall survival of advanced gastric cancer patients (unresectable or metastatic gastric cancer after combined cytotoxic chemotherapy as first-line treatment) [11]. Another study presented a prognostic nomogram that uses the systemic immune inflammation index to predict the overall survival of patients with stomach cancer after surgery [12]. Although these nomogram models performed well on their evaluation indicators, most of the studies were based on patients who received surgery or on patients at a specific stage of disease, such as patients at advanced stages or after radiotherapy and chemotherapy. A later prognostic nomogram included nonresected patients, but it used only clinical prognostic factors [13].
In the present study, we aimed to develop a prognostic nomogram that covers patients with or without surgery and other treatments, using easy-to-collect demographic variables (age, race, and marital status) and clinicopathological variables (TNM stage, surgery, chemotherapy, grade, and the number of positive regional nodes).
The prognostic nomogram may help clinicians predict the overall survival of patients with stomach cancer more accurately, thereby optimizing treatment selection and improving prognosis.

Study Population.
The SEER database is the most comprehensive registry of cancer incidence and survival in the United States and is representative of 34.6% of the US population. It compiles patient-level data collected from 18 geographically diverse populations that represent rural, urban, and regional populations [10]. In the present study, 10,430 patients diagnosed with stomach cancer in the SEER database from 2007 to 2016 were retrospectively reviewed. After screening, 1,433 patients were excluded for incomplete clinical data on race, marital status, tumor stage, treatment methods, etc. Finally, 8,997 patients were included and divided into the Training Group and the Testing Group.

Data Collection.
The following baseline variables were collected: age, gender, race, marital status, primary tumor site, American Joint Committee on Cancer (AJCC) stage, TNM stage, tumor size, insurance status, treatment methods (surgery, radiotherapy, and chemotherapy), grade, vital status, the number of positive regional nodes, and follow-up time.

Statistical Analysis.
Based on the data from the Training Group, baseline variables were first included in the univariate analysis. The variables with statistical significance were then included in the multivariate Cox regression to explore prognostic factors associated with the overall survival of stomach cancer, and a nomogram was thereby developed. Subsequently, the data of the Testing Group were applied to externally validate the predictive effect of our nomogram. All statistical tests were two-sided, and P < 0.05 was considered statistically significant. The Kolmogorov-Smirnov (KS) test was used to assess the normality of the measurement data. Normally distributed data were described as mean ± standard deviation (Mean ± SD), and nonnormal data were described as median and quartiles, M (Q1, Q3). The enumeration data were described as the number of cases and constituent ratio, n (%). The Cox proportional hazards model was used in both univariate and multivariate analyses, where hazard ratios (HRs) and 95% confidence intervals (CIs) were determined. A nomogram was plotted according to the results of the multivariate Cox analysis. R version 4.0.2 (The R Foundation for Statistical Computing, Vienna, Austria) was used for plotting the nomogram, Kaplan-Meier (KM) curves, calibration curves, receiver operating characteristic (ROC) curve, and KS curve. KM curves were utilized to assess the survival of stomach cancer patients in terms of age, race, marital status, TNM stage, chemotherapy, surgery, radiotherapy, grade, primary site, and tumor size. The discrimination of the nomogram for predicting mortality was evaluated using ROC and KS curves. The agreement between the predicted and actual survival of patients was assessed by the calibration curve.
Patients in the Testing Group showed similar characteristics to those in the Training Group. The baseline characteristics of the Training Group and the Testing Group are summarized in Table 1.
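To make the discrimination measure concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data in plain Python. It is not the authors' R code; the function name `concordance_index` and the toy data are illustrative assumptions, and the quadratic pairwise loop is chosen for readability rather than speed.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if death was observed, 0 if censored
    risks  : predicted risk scores (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # shorter follow-up ended in an observed death.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier death had the higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not taken from SEER).
times = [5, 12, 20, 30, 42]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.3, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risks against survival times, which is the scale on which the reported C-index values should be read.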

Univariate and Multivariate Analyses.
By analyzing the five-year survival in the Training Group, the results suggested that compared with patients <65 years old, the mortality risk was 0.147-fold higher in patients between 65 and 80 years (HR = 1.147, 95% CI: 1.038-1.267) and 0.717-fold higher in those ≥81 years (HR = 1.717, 95% CI: 1.521-1.939). Compared with White patients, patients of other races (including American Indian, Alaska Native, and Asian-Pacific Islander) had a 0.294-fold reduced mortality risk (HR = 0.706, 95% CI: 0.663-0.788). As regards marital status, the risk of death in widowed patients was 1.457 times that in married patients (HR = 1.457, 95% CI: 1.293-1.643), and the risk in patients with other marital status (including single, divorced, separated, and unmarried) was 1.150 times that in married patients (HR = 1.150, 95% CI: 1.034-1.277). Compared with patients with an unclear primary site, patients whose primary site was the body of the stomach had a 0.300-fold reduced risk of mortality (HR = 0.700, 95% CI: 0.581-0.843); patients whose primary site was the antrum had a 0.184-fold reduced risk (HR = 0.816, 95% CI: 0.698-0.953); and patients whose primary site was the lesser curvature had a 0.297-fold reduced risk (HR = 0.703, 95% CI: 0.586-0.843). Concerning tumor stage, as compared with patients at the T1 stage, the mortality risk was 1.439 [...] (Figures 1 and 2).
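The "x-fold higher" and "x-fold reduced" statements above follow directly from the hazard ratios: the relative change in hazard is HR minus 1. A minimal sketch of that arithmetic in Python is given below, using four of the HRs reported in this section; the helper name `hazard_change_percent` and the dictionary labels are illustrative assumptions.

```python
def hazard_change_percent(hr):
    """Percent change in hazard relative to the reference group."""
    return (hr - 1.0) * 100.0


# Hazard ratios quoted in the text (reference group after "vs").
reported = {
    "age 65-80 vs <65": 1.147,
    "age >=81 vs <65": 1.717,
    "other races vs White": 0.706,
    "widowed vs married": 1.457,
}

# Positive values mean a higher hazard than the reference, negative a lower one.
changes = {name: round(hazard_change_percent(hr), 1) for name, hr in reported.items()}
```

Read this way, an HR of 1.457 means a hazard 45.7% above that of the reference group, that is, 1.457 times the reference hazard.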
After the univariate analysis, statistically significant variables were included in the multivariate Cox regression for further analysis.
The results showed that age, race, marital status, TNM stage, surgery, chemotherapy, grade, and regional nodes positive were all identified as independent prognostic factors for the survival of stomach cancer patients (Table 3).

Development and Validation of a Nomogram.
Based on the results of the multivariate analysis, the nomogram was plotted (Figure 3). Taking one case in the Training Group as an example, the patient was married, of another race, and 70 years old. The patient had a tumor grade of III + IV and was at M1, N3, and T4 stages with 11 positive regional nodes. The patient also received surgery and chemotherapy. In our Cox model, the patient had a total score of 545 points, and the probability of survival longer than 20 months was 70.5%. The outcome for the patient was death, confirming the accuracy of our model (Figure 4). The formula for prediction was $h(t, X) = h_0(t)\exp(\cdots)$ [...] (Table 4). The ROC curves for predicting the mortality of patients with stomach cancer are shown in Figure 5.
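For readers who want the link between a Cox model and the survival probabilities read off a nomogram, the standard relationships can be written as follows. This is the generic textbook form, not the fitted equation of this study: the symbols $\beta_i$, $X_i$, $p$, and $S_0(t)$ are notation assumed here, and the study's coefficients are not reproduced.

```latex
\begin{align}
h(t, X) &= h_0(t)\,\exp\!\Big(\sum_{i=1}^{p} \beta_i X_i\Big), \\
\mathrm{HR}_i &= \exp(\beta_i), \\
S(t \mid X) &= S_0(t)^{\exp\left(\sum_{i=1}^{p} \beta_i X_i\right)},
\qquad S_0(t) = \exp\!\Big(-\int_0^{t} h_0(u)\,du\Big).
\end{align}
```

In a nomogram, each term $\beta_i X_i$ is rescaled to points, so the total score is a linear transform of the linear predictor, and the survival axis maps that score through $S(t \mid X)$ at a fixed time $t$.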

Discussion
At present, stomach cancer is a very common cancer worldwide with a poor prognosis and poor long-term survival. In the current study, we developed a novel nomogram model to predict the overall survival of patients with stomach cancer. Variables including age, race, marital status, TNM stage, surgery, chemotherapy, tumor grade, and the number of positive regional nodes were significantly associated with the overall survival. As expected, our nomogram model showed [...]
Journal of Healthcare Engineering
[Figure: Kaplan-Meier survival curves by T stage (T1-T4) with numbers at risk]
The constructed nomogram includes several prognostic factors: age, race, marital status, TNM stage, surgery, chemotherapy, tumor grade, and the number of positive regional nodes. Many studies have demonstrated that age is an important prognostic factor in the survival of cancer patients [7,13-15]. Our results indicated that older age was associated with an increased risk of poor survival in patients with stomach cancer. Also, patients of races other than White or Black and those receiving surgery or chemotherapy showed lower mortality risks, which was consistent with previous studies [14,16]. A higher TNM stage and a higher tumor grade were also associated with worse survival. A possible explanation for the increased mortality risk at later TNM stages is that quite a few patients with stomach cancer were already at a later stage at the time of diagnosis and had never undergone surgery or chemotherapy, which may increase the risk of recurrence or even death [17]. Besides, in a Chinese population-based study, T stage, number of metastatic lymph nodes, lymph node-positive rate, adjuvant chemotherapy, and diameter of the tumor were included in the nomogram [16].
In a Korean study, age, gender, tumor location, depth of invasion, number of positive lymph nodes, and number of examined lymph nodes were significantly associated with the overall survival [15]. However, according to our multivariate Cox regression, some variables such as gender, tumor size, and tumor location were not significantly associated with the overall survival. This difference may be due to differences between study populations, which requires multicenter studies for verification.
Over the past few decades, the AJCC staging system has become the most widely accepted and used classification system for stomach cancer. However, recent studies have proposed that the AJCC staging system ignores the biological heterogeneity of patients and is not sufficient to predict the recurrence of cancer, resulting in great differences in treatment effects even among patients at the same stage receiving the same treatment regimen [18-20]. To date, some studies have established nomograms to predict the overall survival of stomach cancer [7,13,14,16,21]. However, most of them were based on patients who received surgical treatment, and patients who did not were excluded. The proposed nomogram also included patients who did not receive surgery or other treatment, and whether patients received surgery, chemotherapy, and radiotherapy was set as variables for analysis. At the same time, the prognostic factors in our nomogram are all available and easily collected in clinical practice. To further assess the performance of the nomogram model, the calibration, ROC, and KS curves were plotted. The nomogram showed good discrimination with a C-index of 0.790 and a KS value of 0.45 for the training set. Moreover, external validation was also performed, and the C-index of 0.789 for the testing set confirmed the good performance of our nomogram.
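The text does not spell out how the KS value was computed. A common definition for a discrimination measure, assumed in this sketch, is the maximum gap between the cumulative distributions of predicted risk among patients who died and among those who survived, which equals the maximum of |TPR - FPR| along the ROC curve. The function name `ks_statistic` and the toy data are illustrative assumptions, not the authors' code.

```python
def ks_statistic(scores, labels):
    """Maximum gap between the cumulative score distributions of
    events (label 1) and non-events (label 0)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    # Walk through the patients from highest to lowest predicted risk.
    ordered = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ordered):
        threshold = ordered[i][0]
        # Consume all patients tied at this score before comparing rates.
        while i < len(ordered) and ordered[i][0] == threshold:
            if ordered[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best


# Toy data (hypothetical): predicted risks and observed deaths (1 = died).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
ks = ks_statistic(scores, labels)
```

Under this definition, 0 means the two risk distributions coincide and 1 means they are fully separated.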
However, there are still some limitations in our study. Although we selected the patient data from 2012-2016 for external validation, the data were all derived from the SEER database, which mainly covers the American population, so the generalizability of the results is limited. For special medical images, the treatments involve not only regional assessments and surgical planning but also segmentation and thickness computation [22-25]. In addition, the SEER database is an open data platform that collects data on patient demographics, primary tumor site, tumor morphology, stage at diagnosis, and first course of treatment, and follows patients up for vital status. Potential factors that could affect the survival of stomach cancer patients, including rural or urban residence and physical health (such as height, weight, and diet), were not recorded in the database, so further studies should be conducted to improve the nomogram. In the future, the results would be more accurate if our nomogram were externally validated in other cohorts covering more populations within the same period.

Conclusion
In the present study, we have developed an effective nomogram with ten easily acquired prognostic factors including age, race, marital status, TNM stage, surgery, chemotherapy, tumor grade, and the number of regional nodes positive. The nomogram could accurately predict the overall survival of patients with stomach cancer and performed well on external validation. We expect that the nomogram would be helpful for both patients and clinicians to improve the individualized survival prediction and decision-making, thereby improving the outcome and survival of stomach cancer.

Data Availability
The data utilized to support the findings are available from the corresponding authors upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.