S100A gene family: immune-related prognostic biomarkers and therapeutic targets for low-grade glioma

Background: Despite the better prognosis given by surgical resection and chemotherapy in low-grade glioma (LGG), progressive transformation is still a huge concern. In this case, the S100A gene family, being capable of regulating inflammatory responses, can promote tumor development. Methods: The analysis was carried out via ONCOMINE, GEPIA, cBioPortal, String, GeneMANIA, WebGestalt, LinkedOmics, TIMER, CGGA, R 4.0.2 and immunohistochemistry. Results: S100A2, S100A6, S100A10, S100A11, and S100A16 were up-regulated and S100A1 and S100A13 were down-regulated in LGG compared to normal tissues. S100A3, S100A4, S100A8, and S100A9 expression was up-regulated during the progression of glioma grade. In addition, genetic variation of the S100A family was high in LGG, and the S100A family genes mostly function through IL-17 signaling pathway, S100 binding protein, and inflammatory responses. The TIMER database also revealed a relationship between gene expression and immune cell infiltration. High expression of S100A2, S100A3, S100A4, S100A6, S100A8, S100A9, S100A10, S100A11, S100A13, and S100A16 was significantly associated with poor prognosis in LGG patients. S100A family genes S100A2, S100A3, S100A6, S100A10, and S100A11 may be prognosis-related genes in LGG, and were significantly associated with IDH mutation and 1p19q codeletion. The immunohistochemical staining results also confirmed that S100A2, S100A3, S100A6, S100A10, and S100A11 expression was upregulated in LGG. Conclusion: The S100A family plays a vital role in LGG pathogenesis, presumably facilitating LGG progression via modulating inflammatory state and immune cell infiltration.


INTRODUCTION
hypothesis that regulation of the tumor immune process may potentially play a role in the treatment of glioma [11]. However, there are currently many challenges in glioma immunotherapy. For example, gliomas themselves secrete inhibitory cytokines, which leads to an immunosuppressive microenvironment [12,13]. Moreover, the immune resistance of tumor cells in gliomas is significantly increased, and immune cells may transition from an anti-tumor phenotype to a pro-tumor phenotype [14,15]. Therefore, appropriate therapeutic targets for glioma need to be identified. Secreted S100A proteins are detectable in the extracellular space and in certain body fluids, such as serum, urine, sputum, cerebrospinal fluid, and feces [16-18]. S100A family members are widely involved in the regulation of a variety of inflammatory diseases, including ischemic heart inflammation, Kawasaki disease, eye inflammation, and chorioamnionitis, among others [19-23]. Multiple studies have shown that some S100A family members regulate tumor development by mediating tumor immune processes [24-27]. Therefore, we investigated the role of the S100A protein family in glioma pathology.
We assessed differential expression of 12 S100A family genes across LGG, GBM, and normal tissues via the GEPIA database Single Gene Analysis module. The results showed that S100A2, S100A6, S100A10, S100A11, and S100A16 were upregulated and S100A1 and S100A13 were downregulated in LGG compared to normal tissues (p<0.05). S100A2, S100A3, S100A4, S100A6, S100A8, S100A9, S100A10, S100A11, and S100A16 were upregulated and S100A1 was downregulated in GBM compared to normal tissues (p<0.05). There was no significant difference in the expression of S100A3, S100A4, S100A8, and S100A9 in LGGs compared with normal tissues. However, expression of these genes was upregulated during the progression of glioma grade (Figure 2A-2L).

Gene mutations and PPI networks of S100A family in LGG
Via the cBioPortal database, we analyzed genetic variation of the S100A family based on the LGG sample data from the TCGA database. The results are presented in Figure 3A. The mutation rate was 4% for S100A2, S100A4, and S100A6, 5% for S100A1, S100A3, S100A8, S100A9, S100A11, S100A13, and S100A14, and 6% for S100A10 and S100A16.

Survival analysis of S100A family genes in LGG
We analyzed the relationship of S100A family genes with disease-free survival (DFS) and overall survival in LGG patients based on GEPIA. The results showed that differences in the expression of S100A2, S100A3, S100A4, S100A6, S100A8, S100A9, S100A10, S100A11, S100A13, and S100A16 were significantly associated with the overall survival of LGG patients, and that high expression of the aforementioned genes could be a risk factor for poor prognosis in LGG patients (Figure 6A-6L).

Correlation analysis and prognostic value analysis of S100A family in LGG
We performed correlation analysis of S100A family genes in CGGA LGG samples with the R corrplot package and found significant positive correlations among S100A family genes (Figure 7A). S100A2, S100A3, S100A6, S100A10, and S100A11 were subsequently identified as possible prognosis-related genes in LGG patients by Cox regression analysis (Figure 7B).
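The pairwise correlation step described above can be illustrated with a minimal sketch. It assumes Pearson's coefficient, which the text does not specify, and uses made-up expression values; it is not the authors' R code.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(expr):
    """Return {gene: {gene: r}} for a dict mapping gene -> expression values."""
    genes = list(expr)
    return {g1: {g2: pearson(expr[g1], expr[g2]) for g2 in genes} for g1 in genes}

# Hypothetical expression values for four samples (illustrative only).
toy_expr = {
    "S100A6": [1.0, 2.0, 3.0, 4.0],
    "S100A10": [2.0, 4.1, 5.9, 8.0],
    "S100A11": [4.0, 3.0, 2.0, 1.0],
}
corr = correlation_matrix(toy_expr)
```

A matrix of this shape is what a correlation plot visualizes: each cell holds the coefficient for one gene pair, and the diagonal equals 1.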
Table 2. Kinase targets, transcription factor targets, and miRNA targets of S100A family genes in LGG.
Survival analysis revealed statistically significant differences between the high- and low-expression groups of S100A3, S100A6, S100A10, and S100A11, and high expression of these genes may be a risk factor for poor prognosis in LGG patients (P<0.05, Figure 8A-8E). The AUC (area under the curve) values of the 3-year and 5-year survival ROC curves for S100A2, S100A3, S100A6, S100A10, and S100A11 were >0.6, indicating moderate accuracy for predicting LGG prognosis (Figure 8F-8J). Clinical correlation analysis showed that the expression of S100A2, S100A3, S100A6, S100A10, and S100A11 was significantly lower in IDH-mutant and 1p19q-codeleted LGG than in IDH-wildtype and 1p19q non-codeleted LGG (Figure 9A-9J).
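To make the AUC values concrete, the following sketch computes a rank-based AUC for a binary outcome. It is a simplification under stated assumptions: survival status at a fixed horizon is treated as a plain 0/1 label and censoring is ignored, whereas time-dependent ROC methods for survival data account for censoring. The data are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half, which matches the Mann-Whitney U formulation.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical data: gene expression and death within 3 years (1) or not (0).
expression = [5.1, 3.2, 4.8, 2.9, 4.0, 3.5]
died_by_3y = [1, 0, 1, 0, 0, 1]
auc = roc_auc(expression, died_by_3y)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values only slightly above 0.6 are read as moderate accuracy.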

Immunohistochemistry
Immunohistochemical staining validated the database analyses, confirming that S100A2, S100A3, S100A6, S100A10, and S100A11 expression was upregulated in LGG. These proteins were primarily expressed in the cytoplasm, and their expression was significantly higher in LGG than in normal brain tissue (Figure 10).
Thus, in the current study, we explored additional driver genes exerting immunosuppression. Calcium-binding activity of S100A family members was noted during our analysis. For instance, S100A4 has been reported to promote malignant progression of glioma [27].
The S100A protein family consists of 16 members, which regulate cell proliferation and differentiation, Ca2+ homeostasis, and inflammation among other processes [30]. During the early stages of inflammation, interleukin-1α (IL-1α), interleukin-33 (IL-33), and S100A family proteins function together in the regulation and warning of inflammation [31,32]. Once released into the extracellular space, specific S100A proteins modulate innate and acquired immune responses, direct cell migration and chemotaxis, and induce tissue development and repair by interacting with various receptors [20,30,31].
The S100A family genes are differentially expressed in many tumors. For example, through a comprehensive analysis of information from multiple patients, S100A2 was identified as a potential predictor of breast cancer [33]. Breast cancer studies showed that S100A2 is a tumor suppressor gene that is primarily regulated by BRCA1/p63 and plays a role in regulating the stability of mutant p53 [34]. In squamous cell carcinoma, the FaDu and RPMI-2650 cell lines showed high and low levels of S100A2, respectively, and S100A2 expression had a significant inhibitory effect on cell activity [35]. Glucose transporter type 1 (GLUT1) plays an important role in glycolysis. Studies showed that activation of the S100A2/GLUT1 axis can promote colon cancer progression by regulating the glycolytic process [36].
In the analysis of Wang et al., S100A3 was found to be a potential predictive marker for gastric cancer [37]. Some researchers reported that inhibiting the expression of S100A3 significantly reduced the invasive ability of prostate cancer cells and inhibited tumor growth [38]. In hepatocellular carcinoma (HCC), S100A3 expression is associated with tumorigenesis and tumor aggressiveness [39]. The pharmacological activity of all-trans retinoic acid (ATRA) is in part mediated by retinoic acid receptor (RAR) transcription factors. S100A3 knockdown reduced the amount of RARα in breast and lung cancer cells and thus induced resistance to ATRA-induced differentiation, suggesting that S100A3 is an important regulatory factor affecting breast and lung cancer cell differentiation [40]. However, the role of S100A2 and S100A3 in brain tumors has been less well studied. S100A family proteins are important regulators of immune-related biological behavior in glioma cell lines.
(A) S100A1, (B) S100A2, (C) S100A3, (D) S100A4, (E) S100A6, (F) S100A8, (G) S100A9, (H) S100A10, (I) S100A11, (J) S100A13, (K) S100A14, (L) S100A16.

Macrophages contribute to immune defense, immunomodulation, and tissue repair processes [41].
classical monocytes. In the process of differentiation from monocytes to macrophages, the expression and protein secretion of S100A12 is significantly decreased [47]. In a mouse model of peritonitis, S100A10 is directly involved in inflammation-stimulated, plasminogen-dependent macrophage recruitment [48]. Dulyaninova et al. established an S100A4-deficient mouse model and found that S100A4 can regulate macrophage invasion through myosin-dependent and myosin-independent mechanisms [49]. Moreover, S100A4 can regulate the recruitment and chemotaxis of macrophages in mice [50]. S100A4 is an upstream regulator of the epithelial-mesenchymal transition (EMT) master regulators SNAIL2 and ZEB, as well as other mesenchymal transition regulators of glioblastoma. Tumors with high S100A4 expression present higher tumor initiation and sphere-forming ability [27]. In glioma, S100A4 expression is affected by DNA methylation, β-linked proteins, and extracellular factors including epidermal growth factor and tumor necrosis factor alpha (TNF-α) [51]. By binding to calcium, S100A4 affects tumor cell motility and metastasis. Moreover, as the grade of glioma increases, S100A4 expression also increases [52]. S100A4 protein serves as a direct signaling target of receptor tyrosine kinase 2 (ERBB2) in medulloblastoma through a pathway involving phosphatidylinositol 3-kinase/AKT (PI3K/AKT), which is ultimately blocked by the ERBB tyrosine kinase inhibitor OSI-774 [53].
The infiltrating state of neutrophils is closely related to glioma development [54,55]. Neutrophils can promote the progression and growth of glioma through S100A4 [56]. S100A4 protein alters the expression of transcription factors and signal transduction pathway genes involved in T cell differentiation. Studies found that in S100A4-transfected T cells, the proportion of Th1-polarized cells was reduced and the Th1/Th2 balance shifted towards a Th2 tumorigenic phenotype [57]. Also, the expression of S100A4 and S100A6 was significantly increased, indicating a strong correlation between them [58]. S100A6 has a tumor-promoting effect in a variety of cancers. In colorectal and cervical cancer, S100A6 stimulates the proliferation and migration of cancer cells through the mitogen-activated protein kinase (MAPK) and PI3K/AKT signaling pathways, respectively [59,60]. Moreover, S100A6 can regulate acetylation of p53, thereby regulating the activity of lung cancer cells [61]. In an earlier study, changes in S100A6 protein expression levels were reported as markers for differentiating between low-grade astrocytic tumors [62]. Overall, S100A6 is differentially expressed in a variety of cancers, and detection of serum S100A6 levels may aid in cancer diagnosis [18]. S100A8 and S100A9 proteins are potential immune modulators. They are involved in the regulation of cyclin expression [63]. S100A8 regulates activation of monocyte toll-like receptor 4 (TLR4) and controls the development of immune processes [64,65]. Bone marrow-derived immunosuppressive cells were found to greatly hinder immune recognition of glioma cells [66]. High expression of S100A8 and S100A9 is strongly linked to tumor promotion [67-69]. High expression of S100A8 and S100A9 in glioma inhibits T cell function and their differentiation via interferon-alpha (IFN-α) to regulate production of macrophages or dendritic cells [70,71]. Furthermore, S100A8 and S100A9 are constitutively expressed in neutrophils and monocytes as Ca2+ sensors and are involved in cytoskeletal rearrangement and arachidonic acid metabolism. By stimulating leukocyte recruitment and inducing cytokine secretion, they regulate inflammatory responses [72]. The acute-phase response proteins serum amyloid A1 (SAA1) and A3 (SAA3) regulate the transcription of S100A4 through the TLR4/NF-κB signaling pathway. They can also moderate transcription of S100A8 and S100A9 and exert immunomodulatory functions [73].
Figure 10. Immunohistochemical staining of S100A2, S100A3, S100A6, S100A10 and S100A11 in normal brain and low-grade glioma. Magnification, ×200.
The cancer-promoting effect of S100A10 is evident in ovarian cancer [74,75]; the protein also stimulates production of breast cancer stem cells [76]. S100A10 mediates macrophage migration to tumor sites and increases fibrosarcoma invasion [77]. However, investigations of the role of S100A10 in brain tumor immunity are scarce. S100A11 is known to be associated with poor prognosis in glioma patients. S100A11 upregulation can activate the NF-κB pathway and stimulate the invasion and migration of glioma. S100A11, whose association with Annexin A2 (ANXA2) has been demonstrated, is suggested to be involved in regulation of the cell cycle [78].
In conclusion, multiple S100A family protein members contribute to tumor progression via their close relationship with inflammation.

CONCLUSIONS
Through the verification of multiple databases, we observed important roles of the inflammatory S100A family in glioma. Immune infiltration analysis showed that expression of multiple genes in the S100A family was highly correlated with the infiltration state of macrophages, neutrophils, and dendritic cells. There is considerable evidence of the correlation between S100A4 and glioma. There are relatively few studies on S100A2 and S100A3 in glioma immunity, and there is currently insufficient evidence to establish their specific roles. S100A6 has a tumor-promoting effect in a variety of cancers, and S100A6 expression, potentially measured in serum, may be an important factor for distinguishing glioma grade. In addition, S100A8, S100A9, S100A10, and S100A11 also showed a strong correlation with immune infiltration. In our data, S100A10 and S100A11 had the strongest correlation with immune cell infiltration in glioma.
Therefore, we believe that the immunomodulatory effect of the S100A family members is an important factor affecting the progression of glioma.

MATERIALS AND METHODS
A study flowchart is presented in Figure 11.

ONCOMINE
Oncomine (https://www.oncomine.org) is a large-scale oncogene microarray database, covering 65 gene microarray datasets, 4,700 microarrays, and 480 million gene expression measurements, which can be used for analyzing gene expression differences, finding outliers, predicting co-expressed genes, etc. Data from Oncomine can be classified based on clinical information such as tumor stage, grade, and tissue type. We retrieved the mRNA expression data of S100A family genes in tumor tissue and normal tissue from the database. In our analysis, p < 0.05, 2-fold change, and a top 10% gene rank were set as thresholds.
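The three thresholds can be read as a simple conjunctive filter, sketched below. The record layout, field names, and example values are hypothetical and do not reflect Oncomine's export format; treating a 2-fold decrease as meeting the fold-change cut-off is also an assumption.

```python
from dataclasses import dataclass

@dataclass
class DiffExprResult:
    gene: str
    p_value: float
    fold_change: float    # tumor / normal on a linear scale (assumed)
    gene_rank_pct: float  # rank percentile; 5.0 means within the top 5%

def passes_thresholds(r, p_cut=0.05, fc_cut=2.0, rank_cut=10.0):
    """True when a result meets all three cut-offs.

    Both over-expression (fc >= 2) and under-expression (fc <= 1/2) count
    as a 2-fold change here; that symmetric reading is an assumption.
    """
    changed = r.fold_change >= fc_cut or r.fold_change <= 1.0 / fc_cut
    return r.p_value < p_cut and changed and r.gene_rank_pct <= rank_cut

# Hypothetical results, for illustration only.
results = [
    DiffExprResult("S100A6", 0.001, 3.2, 2.0),
    DiffExprResult("S100A1", 0.01, 0.4, 8.0),
    DiffExprResult("S100A14", 0.20, 2.5, 4.0),
    DiffExprResult("S100A13", 0.03, 1.4, 6.0),
]
selected = [r.gene for r in results if passes_thresholds(r)]
```

Requiring all three criteria at once is deliberately conservative: a gene with a large fold change but a non-significant p-value, or a significant but small change, is excluded.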

GEPIA
The GEPIA database (http://gepia.cancer-pku.cn/index.html) integrates TCGA cancer profiles and GTEx normal profiles to address important questions in cancer biology. Bioinformatics methods can reveal cancer subtypes, driver genes, alleles, and differentially expressed or carcinogenic factors, enabling in-depth exploration of novel cancer targets and markers. We employed the GEPIA Single Gene Analysis module to analyze mRNA expression of the S100A gene family in LGG, GBM, and normal tissues. Finally, we used the survival analysis module to assess the relationship between S100A family gene expression and overall survival in LGG patients and to plot survival curves. Hazard ratios were calculated based on the Cox proportional hazards (PH) model, the 95% confidence interval (CI) is indicated by dashed lines, the x-axis unit is months, and p < 0.05 was considered statistically significant.

cBioPortal
The cBioPortal database integrates data mining, data integration, and visualization. We obtained the mutation profile and genetic variation of the S100A family in 511 LGG samples (samples with log2 copy-number data) from Brain Lower Grade Glioma (TCGA, PanCancer Atlas). mRNA expression was z-scored relative to all samples (log RNA Seq V2 RSEM), and the z-score threshold was set at ±2.
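The z-score rule can be sketched as follows: each sample's expression is standardized against all samples, and values beyond ±2 are flagged as altered. This is a minimal illustration with hypothetical values; the use of the population standard deviation and the label names are assumptions, not a description of cBioPortal's internals.

```python
import statistics

def zscores(values):
    """Standardize values against the mean and SD of all samples."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD; this choice is an assumption
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / sd for v in values]

def call_alterations(values, threshold=2.0):
    """Label each sample 'high', 'low', or 'unaltered' by its z-score."""
    labels = []
    for z in zscores(values):
        if z > threshold:
            labels.append("high")
        elif z < -threshold:
            labels.append("low")
        else:
            labels.append("unaltered")
    return labels

def alteration_rate(labels):
    """Fraction of samples whose expression is flagged as altered."""
    altered = sum(1 for label in labels if label != "unaltered")
    return altered / len(labels)

# Hypothetical log-expression values for ten samples.
log_expr = [5.0] * 9 + [9.0]
labels = call_alterations(log_expr)
rate = alteration_rate(labels)
```

The per-gene percentages reported for genetic variation are frequencies of this kind, although they may also include mutations and copy-number changes that this sketch does not model.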

String
The String database (https://string-db.org/) is a searchable resource of known and predicted protein interactions in 2031 species, which includes 9.6 million proteins and 13.8 million protein interactions. It contains experimental data, results from text mining of PubMed abstracts, data synthesized from other databases, and results predicted via bioinformatics methods. We constructed protein-protein interaction (PPI) networks of the S100A gene family via the String database.

GeneMANIA
GeneMANIA (http://www.genemania.org) was used to predict protein interactions and to analyze co-expression, co-localization, pathways, and physical interactions of the S100A gene family. Website predictions and other profiles were used to explore the potential protein functions of the genes.

WebGestalt
WebGestalt (http://www.webgestalt.org/) is a widely used toolkit for functional enrichment analysis in different biological contexts. It is an integrated data mining system capable of managing, retrieving, organizing, visualizing, and statistically analyzing large numbers of genes. We performed Gene Ontology (GO) functional and KEGG pathway enrichment analysis of S100A family genes and the related genes in the GeneMANIA protein network via WebGestalt's Over-Representation Analysis (ORA). GO annotations are divided into three major categories: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). KEGG pathway analysis was used to identify the pathways in which the genes are involved. The top 10 most significant terms were retained, and the FDR threshold was set at 0.05.
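Over-representation analysis is commonly implemented as a one-sided hypergeometric test per term followed by multiple-testing correction. The sketch below follows that general recipe with Benjamini-Hochberg adjustment; whether WebGestalt was run with exactly these settings is an assumption, and the gene counts are hypothetical.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K belong to the term."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical inputs: 30 genes of interest, 20,000 background genes.
# Each term maps to (genes annotated to the term, overlap with the gene list).
N_BACKGROUND, N_LIST = 20000, 30
terms = {
    "inflammatory response": (400, 9),
    "IL-17 signaling pathway": (90, 6),
    "unrelated term": (500, 1),
}
names = list(terms)
pvals = [hypergeom_pvalue(terms[t][1], N_LIST, terms[t][0], N_BACKGROUND)
         for t in names]
fdr = benjamini_hochberg(pvals)
# Keep terms with FDR < 0.05, then at most the 10 most significant.
significant = sorted((q, t) for q, t in zip(fdr, names) if q < 0.05)[:10]
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for very small p-values, at the cost of speed on large term collections.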

LinkedOmics
The LinkedOmics database (http://www.linkedomics.org/) consists of multi-omics and clinical profiles from 32 cancer types, as well as 11,158 patient profiles from The Cancer Genome Atlas (TCGA) project. We studied the kinase targets, transcription factor targets, and miRNA targets of the S100A gene family using Gene Set Enrichment Analysis (GSEA) in the LinkInterpreter module. GSEA was conducted under the following criteria: Rank Criteria (from LinkFinder Result) is the FDR value, Minimum Number of Genes (Size) is 3, and Simulations is 500.

TIMER
TIMER (https://cistrome.shinyapps.io/timer/) is a web tool created by Harvard University. It provides profiles of six types of infiltrating immune cells (B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells) in tumor tissues, estimated from RNA-Seq expression profiling data. We assessed the correlation of S100A family expression levels with immune cell infiltration via the TIMER gene module and with LGG patient survival via the survival module, based on the Cox proportional hazards model.

CGGA and R (4.0.2)
The CGGA database (http://www.cgga.org.cn/) includes brain tumor datasets from a Chinese cohort of more than 2000 samples. It contains whole exome sequencing, DNA methylation, mRNA sequencing, mRNA and microRNA microarrays, and matched clinical data. R is an open-source programming language and operating environment for statistical analysis, graphics, and data mining. In this study, based on data from 218 LGG samples in the CGGA mRNAseq_693 and mRNAseq_325 datasets, we performed correlation analysis of S100A family genes using R 4.0.2. We also screened LGG prognosis-related genes via Cox regression models, and their accuracy as prognostic genes was verified by survival analysis of high and low expression groups, Receiver Operating Characteristic (ROC) curve analysis, and clinical correlation analysis.
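The comparison of high and low expression groups can be sketched with a median split followed by a Kaplan-Meier estimate per group. The text does not state how the groups were defined, so the median cut-off is an assumption; the data are hypothetical, and the log-rank test and Cox model fitting are omitted for brevity.

```python
import statistics

def median_split(expression):
    """Assign 'high' to samples above the median and 'low' otherwise."""
    cutoff = statistics.median(expression)
    return ["high" if v > cutoff else "low" for v in expression]

def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events: 1 = death observed, 0 = censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

# Hypothetical follow-up in months for one expression group.
times = [5, 8, 12, 12, 20]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
groups = median_split([1.0, 2.0, 3.0, 4.0])
```

Running `kaplan_meier` separately on the samples labeled "high" and "low" yields the two curves that a survival plot compares.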

Immunohistochemistry
Immunohistochemical staining was used to detect the expression of prognosis-related genes in normal brain and LGG tissues. The experiments utilizing human tissue were approved by the ethics committee of the First Hospital of Shanxi Medical University. Five samples of normal brain tissue from patients with epilepsy and traumatic brain injury and five LGG samples were collected from the First Hospital of Shanxi Medical University. All postoperative tissues were examined pathologically in the Department of Pathology, First Hospital of Shanxi Medical University. After routine paraffin-embedding, tissue sections were obtained, placed on glass microscope slides, de-paraffinized, and rehydrated. Antigen retrieval and blocking of endogenous peroxidases were performed, followed by exposure to polyclonal antibodies against the corresponding proteins (Sangon, Shanghai, China) and enzyme-labeled IgG polymers. Finally, antibodies were visualized using a diaminobenzidine (DAB) chromogenic solution with hematoxylin as a counterstain.

Ethics approval and consent to participate
This human tissue study was reviewed and approved by the Ethics Committee of the First Hospital of Shanxi Medical University (K042-2020-04-03), and patients provided written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
Yu Zhang, Xin Yang, and Xiao-Lin Zhu contributed to the entire project, from the design proposal, to the experimental supplement and collation of data, to the writing of the paper. Hao Bai helped retrieve and organize the data, while Zhuang-Zhuang Wang and Jun-Jie Zhang were responsible for statistical analysis. Chun-Yan Hao and Hu-Bin Duan were responsible for supervision and financial support.

CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest.
