Prognostic biomarker SGSM1 and its correlation with immune infiltration in gliomas

Glioma is the most common type of intracranial malignant tumor. Even after standard treatment, the recurrence and malignant progression of lower-grade gliomas (LGGs) are almost inevitable. The overall survival (OS) of patients with LGG varies widely, which makes prognostic prediction critical. Small G Protein Signaling Modulator 1 (SGSM1) has hardly been studied in gliomas. Therefore, we aimed to investigate the prognostic role of SGSM1 and its relationship with immune infiltration in LGGs. We obtained RNA sequencing data from The Cancer Genome Atlas (TCGA) to analyze SGSM1 expression. Functional enrichment analyses, immune infiltration analyses, immune checkpoint analyses, and clinicopathological analyses were performed. Univariate and multivariate Cox regression analyses were used to identify independent prognostic factors, and a nomogram model was developed. Kaplan–Meier survival analysis and the log-rank test were used to estimate the relationship between OS and SGSM1 expression. The survival analyses and Cox regression were validated in datasets from the Chinese Glioma Genome Atlas (CGGA). SGSM1 was significantly down-regulated in LGGs. Functional enrichment analyses revealed that SGSM1 was correlated with immune response. Most immune cells and immune checkpoints were negatively correlated with SGSM1 expression. The Kaplan–Meier analyses showed that low SGSM1 expression was associated with a poor outcome in LGG and its subtypes. The Cox regression showed that SGSM1 was an independent prognostic factor in patients with LGG (HR = 0.494, 95% CI = 0.311–0.784, P = 0.003). SGSM1 may therefore serve as a new prognostic biomarker for patients with LGG, and our study suggests a potential therapeutic target for LGG treatment.


Introduction
Gliomas are the most common primary intracranial malignant tumors and originate from glial cells [1][2][3]. According to the World Health Organization (WHO) grading system, grade II and III gliomas are classified as lower-grade gliomas (LGGs) [4][5][6]. The median overall survival (OS) of patients with grade II and grade III glioma is 78.1 months and 37.6 months, respectively [7]. Although LGG is a more indolent and less invasive precursor to glioblastoma (GBM), it causes considerable morbidity and poses a difficult therapeutic challenge because of its heterogeneous clinical behavior [8,9]. Complete resection of LGG is still considered impossible because of its invasive nature. Despite radiotherapy and chemotherapy, local recurrence and progression to GBM are almost inevitable, which reduces the therapeutic effect and leads to a poor prognosis [10][11][12]. Therefore, prognostic biomarkers have been explored to predict patients' survival and response to individualized therapy.
Small G Protein Signaling Modulator 1 (SGSM1), located on chromosome 22q11.2, is mainly expressed in brain tissue [13]. Previous research showed a strong association of SGSM1 with neuronal function. The SGSM1 protein is localized in the trans-Golgi network. Furthermore, it possesses a RUN domain and a TBC domain, which are associated with RAP- and RAB-mediated cellular signaling. SGSM1 mediates the interaction between intracellular signaling pathways and vesicle transport. A recent study found that SGSM1 degradation led to the invasion and metastasis of nasopharyngeal carcinoma [14]. Another parallel sequencing study showed that SGSM1 is a potential candidate gene for schwannomatosis [15]. However, the role of SGSM1 has hardly been studied, and its prognostic value in LGGs remains unclear.
In this study, we obtained data from TCGA, investigated the expression patterns of SGSM1 in LGGs, and evaluated its prognostic value. SGSM1 was down-regulated as glioma grade increased, and its low expression indicated a poor prognosis in LGG patients. Moreover, SGSM1 was associated with immune responses, which provides new insight for personalized treatment. Therefore, SGSM1 could be a prognostic indicator and a potential therapeutic target for LGGs.

RNA-sequencing data acquisition
We downloaded the pan-cancer RNA-seq data of TCGA and GTEx, uniformly processed with the Toil pipeline, from UCSC XENA (https://xenabrowser.net/datapages/) [16,17]. For further analyses, we obtained level 3 HTSeq-FPKM and HTSeq-Count data of 529 LGG samples from the TCGA database (https://portal.gdc.cancer.gov/). This study fully followed the publication guidelines provided by TCGA and GTEx.

Differential expression gene (DEG) analysis
The median SGSM1 expression was used as the cut-off value to divide the LGG samples (HTSeq-Count) into low- and high-expression groups, and DEGs between the two groups were identified with the DESeq2 R package (1.26.0) [18].
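The median split described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the sample identifiers and expression values are hypothetical, and placing samples that sit exactly at the median in the high group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: samples exactly at the median go to the 'high' group;
    the source does not state how ties were handled.
    """
    cutoff = median(expression.values())
    return {
        sample: ("high" if value >= cutoff else "low")
        for sample, value in expression.items()
    }


# Hypothetical SGSM1 expression values for six samples (illustrative only).
toy_expression = {"S1": 2.1, "S2": 5.4, "S3": 3.3, "S4": 7.8, "S5": 1.0, "S6": 4.9}
groups = split_by_median(toy_expression)
```

With an even number of samples and no ties, the two groups have equal size, which is what makes the median a convenient cut-off for a two-group comparison.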

Functional enrichment analysis
DEGs with |logFC| > 2 and an adjusted P-value < 0.05 were used for functional enrichment analysis. Gene Ontology (GO) analysis, comprising biological process (BP), cellular component (CC), and molecular function (MF), and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were performed with the clusterProfiler R package (3.14.3) [19,20].
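As a sketch of how the DEG thresholds could be applied to a table of differential expression results, the following Python snippet filters genes and separates them by direction. The gene names and values are hypothetical, and treating logFC as a log2 fold change (the DESeq2 convention) is an assumption.

```python
def select_degs(results, lfc_cutoff=2.0, padj_cutoff=0.05):
    """Split genes passing |logFC| > lfc_cutoff and padj < padj_cutoff by direction.

    `results` maps gene -> (logFC, adjusted P-value). A missing adjusted
    P-value (None) is treated as not significant.
    """
    up, down = [], []
    for gene, (lfc, padj) in results.items():
        if padj is None or padj >= padj_cutoff or abs(lfc) <= lfc_cutoff:
            continue
        (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical results table (illustrative values, not from the study).
toy_results = {
    "geneA": (3.1, 0.001),    # passes both thresholds, up-regulated
    "geneB": (-2.6, 0.01),    # passes both thresholds, down-regulated
    "geneC": (2.4, 0.20),     # fails the adjusted P-value threshold
    "geneD": (1.5, 0.0001),   # fails the fold-change threshold
    "geneE": (-4.0, None),    # no adjusted P-value available
}
up_genes, down_genes = select_degs(toy_results)
```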

Gene set enrichment analysis (GSEA)
We used the clusterProfiler R package (3.14.3) to explore the functional and pathway differences between the two SGSM1 expression groups [21]. For each analysis, the number of permutations was set to 1000. Enrichment results with p.adj < 0.05 and FDR q-value < 0.25 were considered statistically significant.

Prognostic model development
We performed univariate and multivariate Cox regression analyses to evaluate whether SGSM1 could be used as an independent prognostic factor. The clinical parameters included age, gender, WHO grade, IDH status, and 1p/19q codeletion. Furthermore, a nomogram and calibration plots for predicting 1-year, 3-year, and 5-year OS were generated with the rms package (version 6.2-0) and the survival package (version 3.2-10) [2,25]. The nomogram included the same variables as the Cox regression analyses. The calibration plots were evaluated graphically by mapping the nomogram-predicted probabilities against the observed rates, with the diagonal representing the ideal prediction. The concordance index (C-index) was used to assess discrimination and was calculated with the bootstrap method using 1000 resamples [26]. In addition, receiver operating characteristic (ROC) curves were used to evaluate the predictive accuracy of the nomogram.
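To make the discrimination measure concrete, the sketch below computes a concordance index on toy data in Python. It assumes Harrell's definition of the C-index, which the text does not specify, and it omits the bootstrap resampling step; the follow-up times, event indicators, and risk scores are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i has the shorter follow-up
    time and an observed event (events[i] == 1). The pair is concordant
    when subject i also has the higher predicted risk; tied risk scores
    contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), event flags, and risk scores.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]          # 1 = death observed, 0 = censored
perfect_risk = [0.9, 0.7, 0.4, 0.2]
mixed_risk = [0.9, 0.2, 0.4, 0.7]

c_perfect = harrell_c_index(toy_times, toy_events, perfect_risk)
c_mixed = harrell_c_index(toy_times, toy_events, mixed_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a C-index is read on that scale.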

Validation for survival analyses
Gene expression data and clinicopathological information of 625 LGG samples were retrieved from two RNA-sequencing datasets of the CGGA database (http://www.cgga.org.cn/) [27]. These samples served as the validation set to verify the survival analyses and the prognostic role of SGSM1.

Statistical analyses
All statistical analyses and graphs were produced with the R programming language (version 3.6.3). SGSM1 expression in unpaired samples was compared with the Wilcoxon rank-sum test. Cox regression analyses were used to estimate the hazard ratios (HRs) and 95% confidence intervals (CIs) of clinical characteristics and to identify independent prognostic factors. Kaplan-Meier survival analyses and log-rank tests were used to estimate and compare the survival distributions. A two-sided P value less than 0.05 was considered statistically significant.
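For readers unfamiliar with the Kaplan-Meier method, the following Python sketch implements the product-limit estimator on toy data. It is an illustration only, not the R survival package routine used in the study, and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    Returns a list of (time, survival probability) pairs, one per distinct
    time at which at least one event was observed. Censored subjects
    (event == 0) leave the risk set without changing the estimate.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event flags (1 = death, 0 = censored).
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

In this toy cohort the estimate drops to 0.8 at month 2, to 0.6 at month 3, and to 0.3 at month 5; the censored subject at month 8 does not add a step.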

The expression of SGSM1 in pan-cancers and LGG
Comparing SGSM1 expression between normal tissues and tumor samples from TCGA and GTEx databases, we found that SGSM1 was significantly down-regulated in most types of cancer (Fig. 1a), including LGG (P < 0.001, Fig. 1b).

Identification of DEGs with SGSM1 and functional enrichment analyses
A total of 836 DEGs were identified between the low- and high-expression groups of SGSM1 with the criteria of |logFC| > 2 and Padj < 0.05, including 454 up-regulated and 382 down-regulated genes (Fig. 2).
The results of the GO and KEGG enrichment analyses were as follows. BP included humoral immune response, lymphocyte mediated immunity, regulation of humoral immune response, phagocytosis, and regulation of immune effector process. CC included immunoglobulin complex, synaptic membrane, synaptic vesicle, ion channel complex, and transmembrane transporter complex. MF included antigen binding, immunoglobulin receptor binding, neurotransmitter receptor activity, passive transmembrane transporter activity, and ion channel activity (Fig. 3a). KEGG included neuroactive ligand-receptor interaction, retrograde endocannabinoid signaling, synaptic vesicle cycle, GABAergic synapse, cAMP signaling pathway, and calcium signaling pathway (Fig. 3b).
We performed GSEA with the MSigDB collection to further identify the biological functions involved in LGGs with different SGSM1 expression levels. Among the significantly enriched gene sets, five GO categories (lymphocyte mediated immunity, phagocytosis, humoral immune response, immunoglobulin production, and immune response regulating signaling pathway) were significantly enriched in the SGSM1 low-expression phenotype (Fig. 4a), and five GO categories (neurotransmitter transport, neurotransmitter secretion, synaptic vesicle membrane, synaptic vesicle exocytosis, and regulation of synaptic plasticity) were significantly enriched in the SGSM1 high-expression phenotype (Fig. 4b). Five KEGG categories (pathways in cancer, B cell receptor signaling pathway, natural killer cell mediated cytotoxicity, leukocyte transendothelial migration, and T cell receptor signaling pathway) were significantly enriched in the SGSM1 low-expression phenotype (Fig. 4c), and five KEGG categories (neuroactive ligand receptor interaction, long term potentiation, calcium signaling pathway, gap junction, and phosphatidylinositol signaling system) were significantly enriched in the SGSM1 high-expression phenotype (Fig. 4d). Five hallmark gene sets (epithelial mesenchymal transition, IL6-JAK-STAT3 signaling, TNFα signaling via NFκB, inflammatory response, and IL2-STAT5 signaling) were significantly enriched in the SGSM1 low-expression phenotype, whereas none were enriched in the SGSM1 high-expression phenotype (Fig. 4e). These results indicated a potential role of SGSM1 in the tumor microenvironment and immune responses, which are critically important in LGG patients.

Immune infiltration analyses in LGG
Tumor immune infiltration plays an important role in predicting OS. Comparing the proportions of 24 immune cell subtypes between the SGSM1 expression groups showed that mast cells (P = 0.011), NK CD56bright cells (P < 0.001), TFH (T follicular helper, P < 0.001), Th1 cells (P = 0.042), TReg (P < 0.001), and pDCs (plasmacytoid dendritic cells, P = 0.001) differed significantly between the groups (Fig. 5b, 5c). We also assessed the possible correlations among the 24 types of immune cells. The heat map showed that the proportions of different tumor-infiltrating immune cell subtypes were weakly to moderately correlated (Fig. 5d).

Association between SGSM1 expression and clinical features
The main clinical features of the low and high SGSM1 expression groups in LGGs were analyzed (Table 1). In the high-expression group, the proportions of WHO grade II (P < 0.001), IDH mutation (P < 0.001), and 1p/19q codeletion (P < 0.001) cases were significantly higher than in the low-expression group. Moreover, we evaluated the SGSM1 expression level across different clinical characteristics (Fig. 7). The results showed that SGSM1 was significantly down-regulated in the WHO grade III group (P < 0.001), the IDH wild-type group (P < 0.001), and the 1p/19q non-codeletion group (P < 0.001).

Fig. 7 Association between SGSM1 expression and clinical features

Relationship between SGSM1 expression and prognosis
We analyzed the potential predictors by Cox regression analyses, including age, gender, WHO grade, IDH1 status, 1p/19q status, and SGSM1 expression level. The univariate analysis showed that age, WHO grade, IDH1 status, 1p/19q status, and SGSM1 expression level were significantly associated with the OS (P < 0.001 for all, Table 2). These risk factors were further included in multivariate Cox regression (Fig. 8). The results suggested that SGSM1 was an independent prognostic factor (HR = 0.494, 95%CI = 0.311-0.784, P = 0.003). Then we analyzed the correlation between risk score, survival time, and SGSM1 expression profiles (Fig. 9).
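The reported hazard ratio, confidence interval, and P value can be checked against one another. The Python sketch below assumes a Wald-type 95% CI that is symmetric on the log scale with a critical value of 1.96 (the text does not state how the interval was computed). Under that assumption it recovers the log hazard ratio, its standard error, and the two-sided P value from the published numbers.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.96):
    """Recover log(HR), its standard error, the Wald z, and a two-sided P value.

    Assumes the CI is exp(beta +/- z_crit * SE), i.e. symmetric on the log scale.
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Values reported for SGSM1 in the multivariate Cox regression.
beta, se, z, p = wald_from_hr(0.494, 0.311, 0.784)
```

Under these assumptions the recovered P value is about 0.003, which agrees with the reported P = 0.003, and the negative log hazard ratio reflects the protective direction of higher SGSM1 expression.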
Kaplan-Meier analyses showed the relationship between SGSM1 expression and OS of LGG patients (Fig. 10). Patients with high SGSM1 expression had a significantly better prognosis than those with low SGSM1 expression (P < 0.001). We further performed Kaplan-Meier analysis in the subgroups of WHO grade, and the results showed that high SGSM1 expression was correlated with better prognosis in grade II (P = 0.026) and grade III (P < 0.001), respectively.

Fig. 8 Multivariate Cox analysis of SGSM1 and other clinicopathological variables
The clinical features were integrated into the nomogram model (Fig. 11a), and the C-index was 0.804 (95% CI = 0.779-0.828). We generated time-dependent ROC curves and calibration plots for predicting the probability of 1-year, 3-year, and 5-year OS (Fig. 11b). The AUCs at 1, 3, and 5 years were 0.685, 0.742, and 0.636, respectively. The probabilities predicted in the calibration plots were consistent with the observed results (Fig. 11c).

Validation of survival analyses
Using the CGGA database, we confirmed by Cox regression analyses that SGSM1 was an independent prognostic factor for LGG (HR = 0.597, 95% CI = 0.451-0.791, P < 0.001, Table 3). We also performed Kaplan-Meier survival analyses in the CGGA database (Fig. 12). The results showed that low SGSM1 expression was associated with poor outcome in LGG overall (P < 0.001) and in the WHO grade II (P < 0.001) and grade III (P = 0.001) subgroups.

Discussion
Glioma is the most common type of intracranial malignant tumor [1]. Although LGG is less invasive, recurrence and malignant progression are almost inevitable even after standard treatment [28]. Thus, immunotherapy, gene therapy, and other new therapies have become promising options for LGG treatment [29]. It is therefore necessary to identify prognostic factors to optimize treatment for patients. SGSM1 is mainly expressed in the brain and is thought to be involved in small G protein-mediated signal transduction pathways [13]. Few studies have examined the potential prognostic role of SGSM1 in LGGs. Our results showed that SGSM1 expression was significantly associated with immune infiltration and OS in patients with LGG.
In this study, we first compared SGSM1 expression in different tumors. The expression of SGSM1 was significantly down-regulated in most types of cancer, including LGG. We then analyzed the gene function of SGSM1 with enrichment analyses, which indicated that SGSM1 was related to immune response. With the development of tumor microenvironment research, immune cells are considered to play a complex and important role in tumor progression [30][31][32][33].
Based on the results of the enrichment analyses, we explored immune infiltration levels by ssGSEA. We found a substantial negative correlation between SGSM1 expression and most immune cells, which were highly infiltrated in tumors with low SGSM1 expression. We consider that an excessive immune response and a disorganized immune microenvironment contributed to the short survival of these patients [34][35][36]. Among the immune cells, macrophages (P < 0.001) had the strongest correlation with SGSM1 expression, and their infiltration level was indicative of prognosis. Increased infiltration of macrophages in tumors with low SGSM1 expression suggested that the immune microenvironment was driven from an anti-tumor state to an immunosuppressive state by the phenotypic transformation of tumor-associated macrophages, indicating a higher risk of tumor invasion [37]. NK CD56bright cells (r = 0.483, P < 0.001) were positively correlated with SGSM1 expression; thus, the infiltration of NK CD56bright cells in tumors with low SGSM1 expression was low. NK CD56bright cells have a strong ability to produce cytokines and mainly play an immunomodulatory role [38,39]. Their reduced infiltration might lead to dysregulation of tumor immunosurveillance and of the anti-tumor effect. Moreover, we revealed a negative correlation between SGSM1 expression and immune checkpoints, including PD1, PD-L1, CTLA4, LAG-3, TIM3, and CD48. SGSM1 potentially influences tumor immunology and could be a therapeutic target for immunotherapy rather than simply a prognostic biomarker. The proportions of WHO grade II, IDH mutation, and 1p/19q codeletion cases were significantly higher in the high SGSM1 expression group, and SGSM1 expression was higher in the WHO grade II, IDH mutation, and 1p/19q codeletion subgroups. This suggested that SGSM1 might be a marker of favorable prognosis. We then analyzed the prognostic role of SGSM1 in LGG patients.
Cox regression analyses showed that SGSM1 was an independent prognostic factor for LGGs in addition to traditional risk factors, including age, WHO grade, and IDH status. Kaplan-Meier survival analyses showed that SGSM1 expression was correlated with OS: low SGSM1 expression was related to a poor outcome in LGGs overall and in the WHO grade II and grade III subgroups. The survival analyses and Cox regression were validated in the CGGA database. A nomogram prognostic model based on SGSM1 expression level was further established to predict the 1-year, 3-year, and 5-year OS of LGG patients. The C-index was 0.804 (95% CI = 0.779-0.828). Time-dependent ROC curves and calibration plots illustrated the reliable predictive ability of the nomogram. Our model could offer a new approach to outcome prediction and personalized assessment of LGG patients. However, this study has some limitations. Clinical samples should be included for validation. The regulatory mechanisms and signaling pathways related to SGSM1 need further investigation. The prediction model should be verified in future multicenter studies.

Conclusion
In summary, SGSM1 expression was low in LGGs, and its down-regulation was related to a poor prognosis. Our study suggests that SGSM1 is a promising prognostic factor and a potential therapeutic target for LGGs. Our future work will focus on the mechanism of SGSM1 in LGGs.