Prognostic Value and Immune Inltration of Novel Biomarkers in Glioma: Evidence From a Comprehensive Analysis of Tumor Microenvironment

In the glioma microenvironment, inltrating immune cells has been shown to possess benecial effects for tumor progression. Immune cells and stromal cells dominate the tumor microenvironment in glioma. The complex interplay between the tumor progression with immune cells or stromal cells was still unknown. In this study, we used Estimation of stromal and immune cells in malignant tumor tissues using expression data (ESTIMATE) calculations to calculate the proportion of tumour-inltrating immune cells (TIC) and the number of immune and stromal components in glioma cases from the cancer Genome Atlas (TCGA) and the Chinese Glioma Genome Atlas (CGGA) databases. Differentially expressed genes (DEG) were analyzed by COX regression analysis and protein-protein interaction (PPI) network construction. Then, JAK3, IL2RB and CD3E were identied as predictors by the intersection analysis of univariate COX and PPI, and further analysis showed that the expression of them were positively correlated with survival and clinicopathological characteristics of glioma patients. Finally, the Cell type Identication By Estimating Relative Subsets Of known RNA Transcripts (CIBERSORT) deconvolution algorithm was applied to quantify the fraction and inltration of 22 types of immune cells in glioma.


Introduction
Glioma is the most common human malignancies in the brain. They are highly malignant and invasive, with a high rate of relapse and poor prognosis. At present, comprehensive treatments for patients with glioma include surgery, radiotherapy and chemotherapy. Surgical resection continues to be the principal treatment for glioma, but is only applicable in patients with early-stage glioma. Therefore, a deeper understanding of the molecular mechanisms involved in glioma tumorigenesis is fully urgent for the improvement of glioma diagnosis and treatment. Accumulating evidence indicates that the tumor microenvironment (TME) constitutes a robust prognostic indicator, the immune microenvironment conveys robust prognostic information [1,2]. The TME is the internal and external environment where tumors occur, grow, and metastasize, and it affects the structure, function, and metabolism of tumor tissue. The TME outside tumor cells is a complex mixture of tumor associated broblasts, in ltrating immune cells and signaling molecules and it has emerged as a key factor to drive tumor initiation and metastasis [3][4][5]. In the glioma microenvironment, in ltrating immune cells have demonstrated to possess bene cial effects for tumor progression. Overexpression of PD-L1 by tumor cells and on tumor-in ltrating nonmalignant cells in the tumor microenvironment has the potential to interact with PD-1-expressing T cells and B cells, which results in the inhibition of antitumor immune response [6,7]. However, the impact on the TME and anti-tumor immunity in the general population is not clear.
In order to evaluate the effects of stromal cells and immune cells on glioma in TME, we downloaded RNAseq expression data from The Cancer Genome Atlas (TCGA) and The Chinese Glioma Genome Atlas (CGGA, http://www.cgga.org.cn) to use ESTIMATE to get the scores of both stromal cells and immune cells. Here we embarked from differentially expressed genes (DEGs) generated by comparison between immune components and stromal components in glioma samples and revealed that the JAK3, IL2RB and CD3E might be a potential indicator for the alteration of TME status in glioma. Finally, we analyzed the immune in ltration relative levels of 22 subsets of immune cells in complex gene expression mixtures via a widely accepted evaluation algorithm, CIBERSORT [8,9]. The work presented here provides a detailed analysis of the role of JAK3, IL2RB and CD3E in glioma development, which will aid the understanding of the mechanisms underlying glioma.

Raw data
Expression data and clinical variables data of glioma cases were downloaded from TCGA database and CGGA database. From the TCGA database, the clinicopathological characteristics and RNA-seq data from 169 glioblastoma (GBM) samples and 529 low grade glioma (LGG) samples, ranging from WHO grade II to grade IV, were analyzed in our study, and we also validated our ndings using 325 GBM samples and 693 LGG samples from the CGGA database. A total of 698 samples from TCGA and 1018 samples from CGGA were processed expression pro les of each sample and corresponding clinical data were manually organized.
Generation of ImmuneScore, StromalScore, and ESTIMATEScore ESTIMATE algorithm by feat of R language version 3.5.1 loaded with "estimate" package (33) was used to estimate the ratio of immune-stromal component in TME for each sample, ImmuneScore, StromalScore, and ESTIMATEScore were exhibited and these three scores were positively correlated with the ratio of immune, stromal, and the sum of both, respectively, the higher score estimated in ImmuneScore or StromalScore were represented for the larger amount of the immune or stromal components in TME, and the ESTIMATEScore was the sum of ImmuneScore and StromalScore denoting the comprehensive proportion of both components in TME.
Survival analysis "Survival" and "survminer" packages with R-Studio were applied for the survival analysis. All tumor cases had a detailed survival time record which were used for survival analysis. Kaplan-Meier method was used to plot the survival curve, and log rank as the statistical signi cance test; p < 0.05 was considered signi cant.
DEGs were generated from High-Score and Low-Score Groups regarding ImmuneScore and StromalScore 698 tumor samples from TCGA and 1018 tumor samples from CGGA were de ned as high score or low score depending on the comparison to the median score in regarding ImmuneScore and StromalScore, respectively. Differentiation analysis of the gene expression was compared between the high-score samples and low-score samples performed by "Limma" package. DEGs with fold change larger than 1 after transformation of log2 (high-score group or low-score group) and false discovery rate (FDR) < 0.05 were considered signi cant. Heatmaps of DEGs were produced by R language with package "pheatmap".

Enrichment analysis
DEGs were performed GO and KEGG enrichment analyses with R-Studio, both p-and q-value of <0.05 were considered signi cantly enriched. Gene set enrichment analysis (GSEA) was used to investigate the potential mechanisms in the "Molecular Signatures Database" of Hallmark and C7 gene sets v6.2 collections. The number of random sample permutations was set at 1000, and the signi cance threshold was set at p < 0.05 and FDR < 0.25.

Difference analysis of scores with clinical characteristics
The clinical characteristics data corresponding to the samples was downloaded from the TCGA and CGGA databases. The analysis was performed by R language, and Wilcoxon rank sum or Kruskal-Wallis rank sum test as the signi cance test depending on the number of clinical stages for comparison. PPI network construction PPI network was constructed by STRING database, followed by reconstruction with Cytoscape of version 3.6.1. Nodes with con dence of interactive relationships larger than 0.95 were used for building networks.
COX regression analysis R language loaded with package "survival" was used for univariate and multivariate COX regression. The three genes in univariate and multivariate COX were shown in the plot.
Pro le of tumor in ltrating immune cells CIBERSORT computational method was applied for estimating the TIC abundance pro le in all tumor samples, which was followed by quality ltering that tumor samples with p < 0.05 were selected for the following analysis.

Results
Analysis process of this study Data of a total of 1716 patients diagnosed with glioma from TCGA and CGGA datasets were retrospectively analyzed in this study. The work ow of study is shown in Figure 1. In TCGA dataset, we obtained mRNAseq data of 698 glioma samples, and the other 1018 samples were collected from CGGA dataset, ranging from WHO grade II to grade IV. All samples were performed to estimate the proportion of TICs and the amount of immune and stromal component in glioma samples. Then the ESTIMATE score was calculated and DEGs shared by ImmuneScore and StromalScore were used to construct proteinprotein interaction (PPI) network and univariate COX regression analysis. We repeated the above operation for the data of CGGA database respectively and intersected the results of the two databases. Then genes were obtained from intersection between the core nodes in PPI network and the top signi cant factors of COX regression in TCGA and CGGA databases. Then we focused on JAK3, CD3E and IL2RB for the subsequent series of analysis, including survival and clinicopathological characteristics correlation analysis, COX regression, GSEA, and CGGA dataset was used as validation cohort. At the end, TICs pro le was analyzed by "CIBERSORT" in R-Studio.
ESTIMATEScores were correlated with the survival of glioma patients Kaplan-Meier survival analysis was used for ImmuneScore, StromalScore, and ESTIMATEScore to establish the correlation of the estimated proportion of immune and stromal with the survival rate. As shown in Figure 2, ImmuneScore, StromalScore, and ESTIMATEScore showed positive correlation with the survival rate. The results implied that the results of ESTIMATE in TME were suitable for indicating the prognosis of glioma patients.
Scores were associated with the pathological age gender and WHO grade of glioma patients The corresponding clinical information of glioma cases from TCGA database to determine the relationship between the proportion of immune and stromal components with the pathological characteristics. We decided two groups according to the age, one group is patients aged less than or equal to 50, and another group is more than 50. As shown in Figure 3 (A-C) and Supplement Figure 2, ImmuneScore, StromalScore, and ESTIMATEScore were related to the age, gender and WHO grade. Divided into two groups according to age of 50.
DEGs shared by ImmuneScore and StromalScore were predominantly presented as the enrichment of Immune-Related genes The comparison analysis between high-and low-score samples were carried out according to the median to ascertain the alterations of gene pro le in TME regarding immune and stromal components, In TCGA database, we obtained 2337 DEGs from ImmuneScore (up-regulated 1526 genes and 811 down-regulated genes) and 2592 DEGs from StromalScore(up-regulated 1841 genes and 751 down-regulated genes). All these DEGs were intersected as is shown by Venn plot( overall functions of DEGs seemed to map on some immune-related activities, which implied that the involvement of immune factors was a predominant feature of TME in glioma.

Intersection analysis of PPI network and COX regression
The intersectGenes of up-regulated DEGs and down-regulated DEGs were performed to construct PPI network based on the STRING database online, and then we use Cytoscape software [National Institute of General Medical Sciences (NIGMS) USA] to explore the underlying mechanism of these intersectGenes.
The interactions between 2117 genes are shown in Figure 6A and Supplement Figure 5A, and the bar plots were represented for the top 30 genes ranked by the number of nodes ( Figure 6B and Supplement Figure 5B). COX regression analysis for the survival of glioma patients was performed to determine the signi cant factors among all DEGs and only Hazard ratio > 5 and p < 0.05 DEGs were choose. And then, the intersection between the leading nodes in PPI network and the top 60 factors ranked by the p-value of univariate COX regression was carried out, and we repeated the above operation for the CGGA database and intersected the results of the two databases. Subsequently, many genes were overlapping from the above analyses and we focused on the JAK3, CD3E and IL2RB and for further analysis. Univariate and multivariate COX regression analysis with JAK3, IL2RB and CD3E were shown in Figure 7 and In our study, samples were grouped into high-expression group and low-expression group compared with the median expression. These results clearly indicated that the expression of JAK3, IL2RB and CD3E in TME was positive correlation with the WHO grade of glioma patients, which were shown in Figure 8 and Supplement Figure 7. The survival analysis showed that glioma patients with high expression had longer survival than that of low expression (Figure 9 and Supplement Figure 8).
Biological phenotypes associated with the JAK3, IL2RB and CD3E Gene expression data were analyzed to explore the potential biological phenotypes associated with the JAK3, IL2RB and CD3E model. First, we specially focused on the correlation between JAK3, IL2RB and CD3E and the expression of selected immune-related genes. For Hallmark collection de ned by MSigDB, GSEA results show that the genes in JAK3, IL2RB and CD3E high-expression group were mainly enriched in multiple biological processes, such as apoptosis, in ammatory response, complement, PI3K_AKT_MTOR_signaling and interferon response ( Figure 10A-C and Supplement Figure 9A Figure 11C, F). These results indicated that the levels of JAK3, IL2RB and CD3E affected the immune activity of TME.

Discussion
Despite the general decline in the incidence of glioma, poor prognosis is the most common problem for advanced and metastatic glioma. In recent years, immunotherapy has become a new treatment option, although PD-1, CAR-T and other new immunotherapies have made great progress in the treatment of glioma, the TME creates a complex and robust barrier for immune cells to penetrate tumor tissue. In particular, the interaction between tumor cells and the associated stroma represents a strong relationship that affects disease occurrence, progression, and patient prognosis [10]. Communication between cells and their microenvironment is essential for normal tissue dynamic balance and tumor growth. Within the glioblastoma TME, tumor cells, stromal cells, and immune cells continuously interact and exchange signals through various secreted factors including cytokines, chemokines, growth factors, and metabolites. Such interactions can determine fates and functions of tumor cells as well as immune cells. Tumor has evolved a series of ways to evade immune surveillance and even to promote angiogenesis and tumor metastasis in the face of constant immune stress in TME [11]. Given stromal cells and immune cells play a key role in the glioma microenvironment, it is important to further elucidate the correlation between progression and immune cells or stromal cells of glioma.
Most of the predictive models established in previous studies on glioma development and malignant transformation were based on differential expressed genes, but they may neglect that immune cells may also play an important role in tumorigenesis. In this study, we started with the immune cell and stromal cell scores of TME and downloaded the gene expression pro le and clinical database of glioma from the TCGA database, then acquired the composition of glioma TME including stromal cells and immune cells by ESTIMATE algorithm, and the association between survival and ESTIMATE score was analyzed in patients with glioma. Our results implied that the immune components in TME were suitable for indicating the prognosis of glioma patients and suggested that the ratio of immune and stromal components was associated with the progression of glioma. DEG was obtained from gene expression pro le by R-Studio and GO analysis, KEGG and PPI network construction were implemented for the interaction and feature of DEGs, we found the ratio of immune and stromal components was also associated with the grade of glioma patients. Gene ontology (GO) enrichment analysis indicated that the DEGs almost mapped to some immune-related GO terms, such as leukocyte chemotaxis, leukocyte chemotaxis, leukocyte migration neuroactive ligand−receptor interaction and myeloid leukocyte migration ( Figure 5A-C and Supplement Figure 4A-C). The Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis also displayed the enrichment of cytokine-cytokine receptor interaction, antigen processing and presentation, Th17 cell differentiation and phagosome ( Figure 5D-F and Supplement Figure 4 D-F). Next, we applied GSEA to understand glioma genes expression in a particular set of immune genes and whether there is any statistical signi cance in their expression. Meanwhile, through the comparison of TCGA and CGGA databases, we found three previously unreported genes (JAK3, IL2RB and CD3E) were associated with the survival, grade, and ESTIMATEscore of glioma for further analysis. TME is known as a critical regulator of cancer progression in primary and metastatic brain tumors, while several of these cell types are also prevalent in brain tumors, there are some important features that distinguish the normal brain from other tissues [12]. Therefore, the abundance ratio of 22 kinds of immune cells was obtained through CIBERSORT deconvolution algorithm. CIBERSORT is a versatile computational method for quantifying cell fractions from bulk tissue gene expression pro les, it can accurately estimate the immune composition of a tumor biopsy [8]. Understanding on proportion of immune cells in glioma raises the possibility to target them as a potential adjuvant therapy. Tumorassociated macrophages and microglia (TAMs) arise from two distinct sources, including the periphery (bone marrow-derived macrophages) or the brain-intrinsic microglia. TAMs engage in signi cant bidirectional cross-talk with tumor cells in the brain, whereby brain tumor cells release cytokines and chemoattractants to recruit TAMs to the microenvironment, and TAMs in turn supply pro-tumorigenic, prosurvival factors [12,13]. The majority of immune cells within brain tumors are macrophages, often comprising up to ~30% of the tumor mass [13][14][15]. TAMs are driven by glioma-secreted cytokines to acquire plasticity, which is divided into M1 or M2 phenotypes [16]. Our results showed that Macrophages M0, Macrophages M1, Macrophages M2, Monocytes, T cells CD4 resting and Mast cells activated cover a big proportion in tumor samples and associated with the expression of JAK3, IL2RB and CD3E, and there was a strong negative correlation between Macrophages M0 and Monocytes (Pearson correlation = 0.74), and a positive correlation between Mast cells activated and Eosinophils (Pearson correlation = 0.45).
According to our results, glioma tissues with higher expression of these genes are in ltrated with more glioma related immune cells, more understanding about these prognostic genes may enhance the e cacy of immunotherapy for these patients.
The Janus kinase 3(JAK3) is a member of the Janus kinase family of tyrosine kinases involved in cytokine receptor-mediated intracellular signal transduction. It is predominantly expressed in immune cells and transduces a signal in response to its activation via tyrosine phosphorylation by interleukin receptors. Mutations in this JAK3 are associated with autosomal severe combined immunode ciency disease [17] and several lymphoproliferative disorders, including adult T cell lymphoma/leukemia [18,19] and anaplastic large cell lymphoma [20]. JAK3 inhibitors could also bene t anti-cancer therapies and its mutations also detected upregulate PD-L1 expression patients who may bene t from anti-PD1 treatment [21]. The IL-7R/JAK3 / STAT5 signaling pathway promotes the sensitivity of NSCLC cells to cisplatin [22]. Further investigation on JAK3 in glioma may step forward towards a new understanding of PD-1 treatment and chemotherapy.
The interleukin 2 receptor, which is involved in T cell-mediated immune responses, is present in 3 forms with respect to the ability to bind interleukin 2. This protein encoded by this gene represents the beta subunit and is a type I membrane protein. The mutations in IL2RB and IL2RA share clinical features of severe immune dysregulation, re ecting an important role of regulatory T cells in maintaining immune tolerance [23]. Signal transduction for IL-2R is mediated by IL-2RB and IL-2RC, with common downstream signaling proteins JAK3 and signal transducer and activator of transcription 5 (STAT5) [24,25]. A systematic survival analysis suggested the strength of ceRNA-ceRNA interactions of three immune response genes (CCL22, IL2RB, and IRF4) is predictive of the glioma patient survival, which is validated in two independent cohorts [26]. At the same time, the IL2RB polymorphism is also involved in the progression of lung cancer [27]. Our results indicated that the IL2RB is a biomarker for the diagnosis and prognosis of glioma, and increased IL2RB expression referred to a higher presence of resting T cells CD8 cells, T cells CD4 memory activated cells, Macrophages M1 cells, with a low presence of Mast cells activated cells. IL2RB plays an important role in the maintenance of immune tolerance in regulatory T cells [23], Zhang et al [28] implies that IL-2RB related to T cell activation in addition to susceptibility to infection and autoimmunity in mice. The expression of IL-2R has been well documented in a number of nonhematopoietic cancer-derived tumors [29,30]. Arturo et al [31]demonstrated that cervical cancer cells not only express IL-2R, but also produce IL-2 as an autocrine growth factor that activates the JAK-STAT pathway [32], and the use of IL-2 by the tumor cells depletes this growth factor at the tumor site where the growth factor is absolutely needed by the tumor in ltrating lymphocytes to become cytotoxic. Therefore, it is possible that the capacity acquired by the tumor cells to use IL-2 for their own proliferation could result in a competition with lymphocytes for this growth factor. Inhibitors of JAK3 and the downregulation of JAK3 have been proposed for the treatment of leukemia [33,34]. Thus, it is important to ascertain whether these same inhibitors could be used in nonhematologic solid tumors, including glioma, insights from these studies can inform the development of IL2-JAK3 based immunotherapy for glioma.
In ammation is a key aspect of GBM although it remains unclear how it contributes to GBM pathogenesis [12,35]. The evidence shows that CD3E plays crucial roles in activating T cell activation GBM tissues exhibited signi cantly elevated expression of multiple immune (CD3E, CD163, CD68, MX1, ARG1) and in ammation [36]. Chen et al Despite the robust in ltration of macrophages and lymphocytes as suggested by the induction of both CD68 and CD3E as well as in ammasome genes in the present GBM samples, the evidence suggests in ammatory programmed cell death termed pyroptosis was not occurring [37]. Our results also provide evidence that the expression of CD3E is a biomarker for the diagnosis and prognosis associated with glioma, GSEA results indicated that apoptosis and complement were signi cantly enriched in the high-expression group.
Due to the lack of normal samples of brain tissue or the low number of normal samples, we were unable to make further comparisons between tumor tissue and normal brain tissue. The lack of clinical information of patients in the TCGA database also led us to fail to carry out further exploration of IDH mutation, 1p19q codeletion and MGMT_p methylation status. And in CGGA, there were opposite correlation between JAK3, IL2RB and CD3E expression and TIC of Neutrophils in TCGA database, TIC of Macrophages M2 also showed the opposite result in JAK3 expression( Figure 12 and Supplement Figure  11). That's probably due to the difference in LGG and GBM distribution between TCGA and CGGA databases, or the uniqueness of glioma patients in the Chinese population, which requires further study of larger sample size.
At present, few studies have explored the relationship between the prognosis of glioma and the in ltration of immune cells or stromal cells in TCGA and CGGA databases. However, the larger samples would help us better understand the changes in the tumor microenvironment of gliomas. In this study, the results of ESTIMATE showed that the scores were closely related to the survival rate of glioma. JAK3, IL2RB and CD3E may also be used as prognostic references for such tumor. In conclusion, we used the ESTIMATE algorithm for the rst time to evaluate the score of stromal cells and immune cells, CIBERSORT deconvolution algorithm was used to analyze the content ratio of 22 kinds of immune cells in cervical cancer based on TCGA and CGGA databases, and analyzed the DEGs between immune cells and stromal cells through systematic bioinformatics analysis. Potential relationship between glioma microenvironment and tumors were promoted, and JAK3, IL2RB and CD3E expression level were independent indicators for clinical prognosis, their high expression is correlated with malignancy of gliomas. Of course, these molecular markers for the diagnosis and prognosis of glioma still need to be further veri ed in clinical.

Conclusion
It is the rst time con rmed that the JAK3, IL2RB and CD3E can be used as diagnostic and prognostic biomarkers for glioma, and our results convey robust prognostic information of immune microenvironment of glioma, which might be a potential target for anti-glioma therapy. However, how JAK3, IL2RB and CD3E participate in the progression of glioma is still under further study.

Availability of data and materials
The data and material in this review all come from published databases.

Competing interests
The authors declare that they have no competing interests.

Funding
This study was supported by funds from Beijing Science and Technology Planning Project (CN) (No.2019-B02).
Authors' contributions BZ and ZH: design, data analysis and statistical analysis, and manuscript writing and revision. FG, GT and BD: development of methodology. BM and XL: data acquisition. All authors read and approved the nal manuscript.