Establishment and Verification of Synchronous Metastatic Nomogram for Gastrointestinal Stromal Tumors (GISTs): A Population-Based Analysis

Aim. To assess the risk of synchronous metastasis and establish a nomogram for patients with GISTs. Methods. The Surveillance, Epidemiology, and End Results database (2004-2014) was accessed. A nomogram was constructed on the basis of a logistic regression model. Results. A total of 7,256 patients were included in our study. The nomogram for mGIST prediction showed that tumor size contributed most to synchronous metastasis, followed by lymph nodes, extension, pathologic grade, tumor location, and mitotic count. The C-index values of the predictions were 0.821 (95% CI, 0.805-0.836) and 0.815 (95% CI, 0.800-0.831), and the Brier scores were 0.109 and 0.112 in the training and validation groups, respectively. The areas under the ROC curves were 0.813 (p < 0.001) in the primary cohort and 0.819 (p < 0.001) in the validation cohort. The calibration curves (see the figures) showed excellent agreement between the nomogram predictions and actual metastatic disease. Conclusion. A new nomogram was created that can evaluate the risk of synchronous metastatic disease in patients with GISTs.


Introduction
Gastrointestinal stromal tumors (GISTs) are the most frequent mesenchymal malignancy of the digestive tract, with an approximate incidence of 7.8 cases per million people per year [1]. It is now well accepted that all GISTs can exhibit malignant behavior and that none can be labeled as definitely benign based on clinicopathologic features. Like other malignant tumors, GISTs can metastasize to other sites of the body [2], and the incidence of metastatic disease is 15-50% in patients with GISTs [3,4]. Timely detection of metastatic disease is crucial, since metastases are considered a major factor associated with mortality.
Because GISTs are rare, there is currently no scoring system to assess the risk of synchronous metastatic GISTs (mGISTs). Identifying risk factors for synchronous metastases could substantially influence surveillance and management strategies in GISTs, such as the administration of preoperative imatinib and the length of follow-up cycles. It is therefore important to create a nomogram that can evaluate the risk of synchronous metastatic disease in patients with GISTs. Moreover, the focus should be on clinical characteristics, since they are readily available and widely applicable [5]. The Surveillance, Epidemiology, and End Results (SEER) database is a population-based cancer registry system of the USA that covers 34.6% of Americans; it provides the necessary clinical data and is well suited to the study of GISTs. In this study, we therefore analyzed the SEER database to create a nomogram that assesses the risk of synchronous metastasis in patients with GISTs based on clinical factors.

Materials and Methods
2.1. Patients. Data for this retrospective analysis were extracted from the SEER-linked database. The SEER Program of the National Cancer Institute is an authoritative source of information on cancer incidence and survival in the United States (US) and is updated annually. SEER currently collects and publishes cancer incidence and survival data from population-based cancer registries covering approximately 34.6% of the US population [6]. The target population was limited to patients with GISTs (GIST esophagus, GIST stomach, GIST small intestine, GIST peritoneum, GIST appendix, and GIST colorectum) diagnosed in 2004-2015, 7,635 patients in total. The exclusion criteria were unknown metastatic status and a CS tumor size code of 0. The final study sample contained 7,256 patients.
For each patient, the following data were acquired: age at diagnosis, gender, race, insurance, tumor size, location, grade, extension, mitotic count, lymph node status, and metastatic status. We divided patients into metastatic GISTs (code: 10-60) and nonmetastatic GISTs (code: 0) according to CS Mets at Dx. Based on CS extension, we classified patients with codes 0-400 as mild extension and those with codes 440-800 as grievous extension. All patients were randomly separated into two groups (training group, n = 4,837, and validation group, n = 2,419).
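The coding and splitting rules above can be sketched in Python. This is a minimal illustration and not the authors' pipeline: the function names, the fixed random seed, and the choice to return None for codes outside the stated ranges are all assumptions.

```python
import random

def metastatic_status(cs_mets_at_dx):
    """Label a CS Mets at Dx code using the ranges stated in the text."""
    if cs_mets_at_dx == 0:
        return "nonmetastatic"
    if 10 <= cs_mets_at_dx <= 60:
        return "metastatic"
    return None  # assumption: any other code is treated as unknown

def extension_category(cs_extension):
    """Label a CS extension code using the ranges stated in the text."""
    if 0 <= cs_extension <= 400:
        return "mild"
    if 440 <= cs_extension <= 800:
        return "grievous"
    return None  # assumption: codes outside both ranges are left unclassified

def split_two_to_one(patient_ids, seed=0):
    """Shuffle the patients and split them 2:1 into training and validation."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded only to make the sketch repeatable
    n_train = round(len(ids) * 2 / 3)
    return ids[:n_train], ids[n_train:]
```

Applied to 7,256 patient identifiers, `split_two_to_one` yields groups of 4,837 and 2,419, which matches the group sizes reported above.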

Methods
Intergroup comparisons were made using Pearson's chi-squared test. Odds ratios (ORs) and 95% confidence intervals (CIs) were estimated with univariate and multivariate logistic regression models. Variables with significant differences in univariate analysis were included in the logistic regression model for multivariate analysis. Based on the multivariate analysis results, a nomogram was constructed using R 3.4.1 software (Institute for Statistics and Mathematics, Vienna, Austria; http://www.r-project.org/). Statistical analyses were performed with IBM SPSS Statistics trial ver. 25.0 (IBM, Armonk, NY, USA). p values lower than 0.05 were considered significant.
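As a brief illustration of how an OR and its 95% CI follow from a logistic regression coefficient, the sketch below uses the usual Wald interval. The text does not state which interval method was used, so the Wald form, the function names, and the critical value z = 1.96 are assumptions.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a coefficient beta with standard error se.

    Assumes a Wald interval: exp(beta - z * se) to exp(beta + z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def is_significant(beta, se, z=1.96):
    """Two-sided Wald test at about the 0.05 level: the CI excludes OR = 1."""
    _, lower, upper = odds_ratio_with_ci(beta, se, z)
    return lower > 1.0 or upper < 1.0
```

Under these assumptions, a variable would pass the univariate screen and enter the multivariate model when `is_significant` returns True.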

Results
4.1. Patients Characteristics. A total of 7,256 patients were included in our study. Patients were randomly assigned to a training cohort and a validation cohort in a ratio of 2:1 for building the metastatic predictive model. The characteristics of patients in the training and validation cohorts are listed in Table 1. The percentages of patients with metastatic GIST in the training cohort and the validation cohort were 17.12% (828/4837) and 17.49% (423/2419), respectively. The majority of the patients were older (>55 years) and male. White patients comprised 68.22% of the study population. More than half of the patients were married (57.35%) and had health insurance (76.34%). Tumor location, pathologic grade, tumor size, lymph nodes, extension, and mitotic count differed significantly between patients with and without metastasis in both the training and validation groups.
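Pearson's chi-squared test for a 2 x 2 table can be sketched as follows. For illustration only, it is applied to the metastatic counts of the two cohorts reported above (828/4837 and 423/2419). This particular comparison is not reported in the study, and the function name and the omission of a continuity correction are assumptions.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Metastatic vs. nonmetastatic counts in each cohort, taken from the text.
train_met, train_n = 828, 4837
valid_met, valid_n = 423, 2419
stat, p = chi_squared_2x2(train_met, train_n - train_met,
                          valid_met, valid_n - valid_met)
```

With these counts the test does not reach p < 0.05, which is what one would expect after a random 2:1 split.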

Establishment of Metastatic Nomogram.
Variables with significant differences in univariate analysis were included in the logistic regression model for multivariate analysis. The independent odds ratios (ORs) of GIST metastasis for gender, insurance, tumor location, pathologic grade, tumor size, lymph nodes, extension, and mitotic count from the logistic model are presented in Table 2. The metastatic status of GIST was strongly associated with most characteristics in this study, including tumor location, pathologic grade, tumor size, lymph nodes, extension, and mitotic count.
The significant independent risk factors identified by multivariate analysis were integrated to construct the nomogram for predicting synchronous metastatic disease. The point scale at the top of the nomogram is used first to assign a score to every risk variable; the scores of all variables are then added up, and the scale at the bottom of the nomogram is used to read off the predicted rate of synchronous mGIST. The nomogram showed that tumor size contributed most to metastasis, followed by lymph nodes, extension, pathologic grade, tumor location, and mitotic count (Figure 1).
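The way a nomogram turns logistic regression coefficients into points, and total points back into a predicted probability, can be sketched as follows. This is a generic construction in which the variable with the widest coefficient range spans 0-100 points; it is not taken from the study. The coefficients, level names, and intercept are hypothetical values chosen only to make the arithmetic easy to follow, and at least one coefficient is assumed to be nonzero.

```python
import math

def nomogram_points(coefs):
    """coefs: {variable: {level: log-odds coefficient}}, reference level = 0.0.

    Returns ({variable: {level: points}}, log-odds represented by one point).
    """
    ranges = {v: max(lv.values()) - min(lv.values()) for v, lv in coefs.items()}
    per_point = max(ranges.values()) / 100.0  # widest variable spans 100 points
    points = {
        v: {lvl: (b - min(lv.values())) / per_point for lvl, b in lv.items()}
        for v, lv in coefs.items()
    }
    return points, per_point

def predicted_probability(patient, coefs, intercept):
    """Sum a patient's points, then map the total back to a probability."""
    points, per_point = nomogram_points(coefs)
    total = sum(points[v][patient[v]] for v in coefs)
    baseline = intercept + sum(min(lv.values()) for lv in coefs.values())
    linear_predictor = baseline + total * per_point
    return total, 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical coefficients, NOT the fitted values from Table 2.
EXAMPLE_COEFS = {
    "tumor_size": {"small": 0.0, "medium": 0.8, "large": 2.0},
    "lymph_nodes": {"negative": 0.0, "positive": 1.5},
}
EXAMPLE_INTERCEPT = -3.0
```

In this hypothetical example, a large tumor scores 100 points and positive lymph nodes score 75, so the total of 175 points corresponds to a linear predictor of -3.0 + 175 x 0.02 = 0.5, the same value obtained by adding the coefficients directly.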

Verification of Metastatic Nomogram.
Internal validation of the nomogram was performed using the C-index, Brier score, receiver operating characteristic (ROC) curve, and calibration. The C-index values of the predictions were 0.821 (95% CI: 0.805-0.836) and 0.815 (95% CI: 0.800-0.831), and the Brier scores were 0.109 and 0.112 in the training and validation groups, respectively (Table 3). Both measures suggest that the model made accurate predictions. The areas under the ROC curves were 0.813 (p < 0.001, Figure 2(a)) in the primary cohort and 0.819 (p < 0.001, Figure 2) in the validation cohort.
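The C-index and the Brier score named above can be computed for a binary outcome as in the following sketch. The function names are assumptions, the pairwise loop is written for clarity rather than speed, and the confidence intervals reported in the text are not reproduced here.

```python
def c_index(y_true, y_prob):
    """Share of (metastatic, nonmetastatic) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome class")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)
```

A C-index of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, while a lower Brier score indicates predictions closer to the observed outcomes.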

Discussion
The results of this study show that 1 out of 6 patients with GISTs presents with synchronous metastases. Given such a high metastatic ratio, we created an effective and accurate nomogram that yields a cumulative risk score for synchronous mGIST based on tumor and demographic variables available at the time of diagnosis; it could be incorporated into clinical practice to guide surveillance and management strategies for GISTs.
Gaitanidis et al. also explored the risk factors for synchronous metastasis of GISTs based on the SEER database [7], but their research had some shortcomings. First, it was limited to liver metastases. Second, it did not construct a nomogram, which can better guide clinical practice. This study was dedicated to building a comprehensive scoring system and addresses these shortcomings. Meanwhile, most studies estimated the metastatic risk of GIST based on tumor location, tumor size, and mitotic count but ignored other tumor features [2,7,8]. Our nomogram showed that the small intestine carried the highest metastatic risk. Moreover, synchronous metastatic risk was positively correlated with tumor size and mitotic count in patients with GISTs. These results are consistent with most previous studies.
It is now well accepted that all GISTs can exhibit malignant behavior and that none can be labeled as definitely benign based on clinicopathologic features. Lymphatic metastasis and tumor extension are characteristic features of malignant tumors. In fact, Gaitanidis et al. showed that lymph node metastasis was associated with mGIST [9]. Pathological grade is also associated with recurrence and metastasis in many tumors other than GIST [10-12]. We believe that regional lymph nodes, grade, and extension should not be overlooked, given their significance in the logistic regression model. In addition, these clinical features may be relevant to special types of GISTs. For example, regional lymph nodes, but not mitotic count, can be used as a metastatic risk factor in patients with succinate dehydrogenase- (SDH-) deficient GISTs [13]. Therefore, we incorporated these clinical features into the nomogram. Surprisingly, grade, regional lymph nodes, and extension carried higher risk scores than tumor location and mitotic count. In addition, this study unfortunately included some demographic variables without statistical significance, such as age, sex, and race.
Some clinicians lack the knowledge base to assess the metastatic risk of GISTs because they are a rare tumor type, which is why part of the data for patients with GISTs is recorded as not otherwise specified (NOS). This study suggests that NOS should not be ignored, and it was included in the predictive nomogram. The various   [14][15][16][17]. Interestingly, NOS carried the highest risk score among these recommended indicators but an intermediate risk score for grade and lymph nodes. This reflects the increased metastatic risk associated with inexperienced clinicians who are unaware of advances in the diagnosis and treatment of GISTs. Therefore, the nomogram is equally applicable to referral patients, even if the information from previous visits is incomplete. To the best of our knowledge, this is the first study to derive a metastatic risk score for GISTs based on the SEER database. However, our study has some limitations. First, this was a retrospective study, which might have selection bias: not all GIST patients underwent routine tests for metastasis, so the incidence of metastasis may be underestimated. Second, the SEER database does not provide information on the status of KIT, DOG1, and PDGFRA, which are important immunohistochemical markers for GIST diagnosis and prognosis. Finally, external validation is still required to verify the validity of this nomogram.

Conclusion
This study created and validated a new nomogram that can evaluate the risk of synchronous metastatic disease in patients with GISTs; this predictive tool could be incorporated into clinical practice to guide surveillance and management strategies for GISTs.