Exploration of immune-related cells and ceRNA in squamous cell lung cancer

Abstract The treatment for squamous cell lung cancer (SqCLC) is limited, and the prognosis of SqCLC is poor. In this article, we aimed to analyze and identify immune-related cells and competition endogenous RNA (ceRNA) that influence the prognosis of SqCLC. SqCLC and lung adenocarcinoma data were downloaded from TCGA-GDC. A total of 22 types of immune cell fractions were estimated using CIBERSORT. R software was used to identify any significantly different transcriptome data, including mRNA, LncRNA, and miRNA. The univariate cox regression method was applied to screen for prognosis-related lncRNA, miRNA, mRNA and tumor-infiltrating immune cells. There were 504 patients included in this study. There was a higher proportion of memory activated CD4+ T cells and CD8+ T cells in younger women. Follicular helper T (Tfh) cells were predictive of a good prognosis and reflected immune activation in SqCLC. The SFTA1P/NKX2-1-AS1, hsa-mir-503, GREM2 ceRNA axes and NKX2-1-AS1, hsa-mir-96, PROK2 ceRNA axes were found to be important for the immune function, pathogenesis, and prognosis of SqCLC. Collectively, the immune-related ceRNA and tumor-infiltrating immune cells in SqCLC are likely important determinants of SqCLC pathogenesis, prognosis, and immune status.


Introduction
Lung cancer is the leading cause of cancer-related death worldwide and approximately 85% of lung cancers are non-small cell lung cancer (NSCLC). [1] Squamous cell lung cancer (SqCLC) represents around 20% to 30% of all NSCLC. In addition, SqCLC has specific clinicopathologic characteristics, including the central location of tumors, diagnosis at an advanced stage, older age, and presence of comorbidities (eg, chronic obstructive pulmonary disease, cardiovascular disease, and diabetes), which present limitations and challenges to treatment. Due to a lack of genetic alterations with approved targeted treatments, chemotherapy is the primary first-line treatment for SqCLC. The discovery of immune checkpoint inhibitors, particularly inhibitors of the tumor programmed death 1 (PD-1) axis, provide critical options for patients with SqCLC in whom PD-L1 expression ≥50%. [2] However, the prognosis of SqCLC remains extremely poor.
Many kinds of immune cells and immune-related genes are related to immune checkpoint inhibitor, such as, tumor mutation burden, the presence of tumor infiltrating lymphocytes (TILs). [3] Recent studies indicate that high tumor-infiltrating CD 68-positive cells are correlated with better progression-free survival for lung adenocarcinoma. [4] Based on TCGA data set, Kuo Hao Ho and so on have found that high proportion of tumor infiltrating B cells is associated with better prognosis of NSCLC. [5] Moreover, immune gene participate in brain metastasis in NSCLC. [6] In addition, patients with high expression of immune gene PSMB9 showed longer PFS in NSCLC. [7] However, since the immune-related cells and competition endogenous RNA (ceRNA) in SqCLC not fully research, there is a need to understand the factors associated with the immune mechanism, pathogenesis, and prognosis of SqCLC.
In this study, we report that there was a higher proportion of CD8+ T cells and memory-activated CD4+ T cells in younger women with SqCLC. We further showed that follicular helper T (Tfh) cells were predictive of a good prognosis and reflected immune activation in SqCLC. Moreover, the SFTA1P/NKX2-1-AS1, hsa-mir-503, GREM2 ceRNA axes and NKX2-1-AS1, hsamir-96, PROK2 ceRNA axes were found to be important for the immune function, pathogenesis, and prognosis of SqCLC.

Data processing and differential expression analysis
Genomic and clinical data were downloaded from The Cancer Genome Atlas (TCGA; https://portal.gdc.cancer.gov/). The RNA expression data included 502 tumor samples and 49 normal samples, the miRNA expression data contained 478 neoplasm samples and 45 normal samples. A total of 504 clinical data were obtained. We used the HUGO gene Nomenclature Committee (HGNC; http://www.genenames.org/) to annotate the lncRNA and mRNA genes for RNA expression profiling. The R (version 3.5.3; https://cran.r-project.org/) software limma package (version 3.47.2; http://www.bioconductor.org/packages/devel/bioc/ html/limma.html) was used to normalize and analyze the data we collected. Finally, we obtained the data regarding the differentially expressed mRNA, lncRNA, and miRNA molecules.

Estimation of the immune cell type fraction
CIBERSORT (https://cibersort.stanford.edu/) is a versatile computational method for quantifying the cell fractions from bulk tissue gene expression from the purified leukocyte subset, and has been confirmed by flow cytometry. [8] The LM22 gene signature contains 547 genes which can discriminate between the 22 human hematopoietic cell phenotypes (B cells, T cells, natural killer cells, macrophages, dendritic cells, and myeloid subsets). The mRNA expression data were extracted from the RNA expression profile, which was downloaded from the TCGA datasets. The R software Limma package was used to normalize the mRNA data. The CIBERSORT, LM22 gene signature, and normalized mRNA were used to quantify the proportion of immune cells in the SqCLC. Patients with a CIBERSORT P < 0.05 were considered eligible for further analysis.

Association between TILs, clinical-pathological variables and prognosis
Age, grade, sex, Primary tumor (T), Regional lymph node (N), distance metastasis (M), stage, and overall survival (OS), were collected from the clinical data downloaded from the TCGA database. Kruskal-Wallis tests and Wilcoxon tests were used to test the relationship between TILs and clinical-pathological variables. The R software survival package and univariate cox regression method were used to screen for prognosis-related TILs.
2.4. Immune-related lncRNA-miRNA-mRNA regulatory network construction A list of the genes involved in immune modulation was downloaded from the IMMPORT database(https://www.imm port.org/shared/home). The intersection of the IMMPORT genes and differentially expressed genes (DEGs) was used as the immune-related DEGs (iDEGs). We applied pearson correlation analysis in the R software to identify the potential DElncRNA related with iDEGs as immune-related DElncRNA (iDElncRNA). If the correlation coefficient (jRj) >0.4 was considered as a strong correlation, and P < 0.01 was statistically significant. Based on the comprehensive GENCODE gene annotation, miRcode (http://www.mircode.org/) provides "whole transcriptome" human miRNA target predictions. The miRcode database was conducted to predict the interaction between iDElncRNAs and miRNAs. iDElncRNAs correlated with DEmiRNA were perfectly matched. The target DEmiRNA mRNAs were predicted using TargetScan (http://www.targetscan.org/), miRDB (http://www. mirdb.org/miRDB/), and miRTarBase (http://mirtarbase.mbc. nctu.edu.tw) online analysis tools. Furthermore, we used a Venny diagram to obtain the intersecting part that was used as the DEmRNA between the target mRNAs and iDEGs. We eventually obtained the relationship pair: DElncRNA-DEmiRNA and DEmiRNA-DEmRNA. We used cytoscape software (version 3.6.0; https://cytoscape.org/) to construct the lncRNA-miRNA-mRNA ceRNA network.

Prognosis values of DElncRNA, DEmiRNA, and DEmRNA in the ceRNA network
The survival package (version 1.2; https://cran.r-project. org/ package=survival) and univariate cox regression method were used to screen the prognosis-related DElncRNA, DEmiRNA, and DEmRNA in the ceRNA network. We divide the expression of SFTA1P and NKX2-1-AS1 in tumor patients into high-expression group and low-expression group based on the average value. P < 0.05 was considered to be statistical significantly.

Relationship between ceRNA and TILs
Tumor Immune Estimation Resource (TIMER, https://cistrome. shinyapps.io/timer/), is a comprehensive resource for the clinical relevance of tumor-immune infiltrations. We uploaded GREM2 to explore the relationship among ceRNA and TILs.

Statistical analysis
The differential gene expression between SqCLC and the normal tissues was assessed using an unpaired t test. The association between tumor-infiltrating immune cells or immune related genes and the corresponding clinical follow-up were analyzed using Kaplan-Meier curves. P values of <.05 were considered significant.

Identification of different level of the 22 types of TILs
To elucidate the differential level of the 22 TILs between SqCLC, lung adenocarcinoma (LUAD) and normal lung tissues, we constructed a violin plot, as shown in Figure 1. The violet plot showed the percentage of the 22 TILs as well as any statistical differences. The percentage of plasma cells, CD4+ memory activated T cells, follicular helper T cells, M0 macrophages, and M1 macrophages were significantly higher in SqCLC than both in the normal lung tissues and LUAD. The number of resting CD4+ memory T cells, NK cells, monocytes, M2 macrophages, and

Association between the 22 types of TILs and clinicalpathological variables
The correlation between the percentage of the 22 types of TILs with the clinical-pathological variables was further investigated. The percentage of CD4+ memory-activated T cells and CD8+ T cells were higher in females and patients aged 65 years (Fig. 3).

Prognosis prediction for the 22 types of TILs
We conducted a univariate Cox regression to analysis to screen for OS-related TILs. A K-M curve was subsequently drawn according to the percentage of each TILs. The high percentage of Tfh cells was significantly associated with a favorable prognosis. The high percentage of resting dendritic cells was also associated with a good prognosis, but this difference is barely noticeable (Fig. 4).

Construction of the ceRNA
Using limma package, we obtained differentially expressed lncRNA, mRNA, and miRNA. We used heatmap to show the top 10 lncRNA, mRNA, and miRNA with the most significant differences (Fig. 5). Meanwhile, the top 10 immune-related DEmRNA and DElncRNA were also shown by heatmap (Fig. 6). As described in the methods, we created a ceRNA network composed of DEmRNA, DElncRNA, and DEmiRNA. As a result, 9 immune-related DEmRNA (TGFBR2, PROK2, NR4A2, FOS, GREM2 downregulated, ULBP2, FGFR3, and IL11 upregulated) were identified, 10 DEmiRNAs were upregulated and 19 DEmiRNA were downregulated. A total of 15 immune related DElncRNA (9 upregulated and 6 downregulated) were enrolled in the ceRNA network (Fig. 10). The results indicate that differentially expressed lncRNAs could indirectly interact with mRNAs though miRNA in SqCLC.

Association between TILs and ceRNA
We analyze the relationship between GREM2 which was an important member in the ceRNA network and immune related cells. We found that the expression level of GREM2 significantly correlated with the infiltration level of B Cell, CD8+ T Cell, CD4+ T cell, macrophage, neutrophil and dendritic cell, as shown in Figure 11.

Discussion
In the present study, based on the deconvolutions of bulk gene expression data from the TCGA cohort, we found that there was a higher proportion of memory activated CD4+ T cells and CD8+ T cells in younger women. The high fraction of Tfh cells was significantly associated with a favorable prognosis. We identified the DElncRNA (SFTA1P/NKX2-1-AS1), DEmiRNA (hsa-mir-503), DEmRNA(GREM2) ceRNA axes and DElncRNA (NKX2-   Several studies have shown that CD4+ T cells can recognize tumor antigens presented on the surface of tumor cells and directly activate anti-tumor immunity associated with MHC class II molecules. [9] Srivastava et al demonstrated that activated tumor-specific CD4+ T cells may be useful for the treatment of NSCLC. [10] In addition, CD4 + T cells were found to correspond to sex and age in mice. [11] Our findings revealed that there was a higher proportion of memoryactivated CD4+ T cells in younger women. This finding may provide a suitable population for CD4 + T cell-related immunotherapy for SqCLC. DCs are antigen-presenting cells that function to process antigenic material and present it on the cell surface to T cells. [12] Multiple studies have demonstrated that DCs used for immunotherapy can induce an immune response against NSCLC and prolong the OS time, without adverse effects. [13,14] We found that resting DCs were associated with a better outcome. However, the prognosis of DCs in SqCLC requires further study. Tfh cells may represent an important mechanism contributing to an exacerbated humoral response and autoantibody production. [15] Tfh cells function as a critical regulator in several human malignancies. Infiltrating Tfh cells play a protective role in breast cancer. [16] In ovarian cancer, Tfh cells are significantly enriched among TILs and are related to IL-21 and IL-10 secretion. [17] Moreover, infiltrating Tfh cells participate in anti-tumor immunity in NSCLC patients, and are associated with favorable clinical outcomes. [18,19] Our results were consistent with previous studies which show that Tfh cells were predictive of a good prognosis. NK cells can induce lung cancer regression, and NK cell immunotherapy has shown promising results. [20] Our  research has demonstrated that greater anti-tumor immunity was induced (including NK cells) as the number of infiltrating Tfh cells increased. Thus, Tfh cells may play an important role in killing squamous lung cancer cells by immune cells.
The dysregulation expression of lncRNA contribute to the biological functions of cancers, and lncRNA are becoming prognostic biomarkers in cancer. [21] More and more evidences shown that lncRNA could bind with miRNAs, and act as endogenous miRNA sponges. [22,23] In NSCLC, lncRNA KCNMB2-AS1 can sponge miRNA-374a-3p to facilitate the cancer progression. [24] lncRNA LINC01246 could also sponge miRNA-519d-5p to promote NSCLC progression. [25] Recent studies have demonstrated that lncRNAs play important roles in the development and differentiation of NSCLC. [26] Dong and so on [26] identified seven immune related lncRNA that could predict the prognosis of lung adenocarcinoma. However, only limited study has described the immune related lncRNA as ceRNA in SqCLC. In our study, we found that lncRNA (SFTA1P/NKX2-1-AS1), miRNA (hsa-mir-503), mRNA(GREM2) ceRNA axes and lncRNA (NKX2-1-AS1), miRNA (hsa-mir-96), mRNA (PROK2) ceRNA axes were associated with immune function and pathogenesis of SqCLC for the first time.
In addition, survival analysis indicated that lncRNA SFTA1P and NKX2-1-AS1 related with OS. Various studies have shown that SFTA1P is involved in the progression in many tumors. [27][28][29][30] In papillary thyroid cancer (PTC), SFTA1P is the main constituents of competing endogenous lncRNA, and involved in the formation of PTC. [21] Meanwhile, SFTA1P also action as competing endogenous RNA in head and neck squamous cell carcinoma. [31] In hepatocellular carcinoma, AFTA1P could accelerate the tumor proliferation by down regulating mi4766-5p. [32] Li et al [33] and Xiong et al [34] have shown that SFTA1P associated with prognosis and cisplatin chemosensitivity in lung squamous cell carcinoma. In gastric cancer, NKX2-1-AS1 is closely correlated with oncogenesis and progression. [35] In human lung carcinoma, NKX2-1-AS1 could negatively regulated CD274/PD-L1, cell interaction genes, and cell migration. [36] We reported that SFTA1P and NKX2-1-AS1 as an immune-related competing endogenous lncRNA in lung squamous cell carcinoma for the first time.
However, this article has some limitations. First, the clinical data were not perfect, no smoking history, brain metastasis, and liver metastasis data. Second, only lncRNAs in ceRNA network were significantly related with OS. Third, this study was based on the database and needs more basic experiments to further verify. In summary, there were a higher proportion of CD8+ T cells and memory activated CD4+ T cells in younger women. Tfh cells were predictive of a good prognosis and reflected immune activation in SqCLC. SFTA1P/NKX2-1-AS1, hsa-mir-503, GREM2 ceRNA axes and NKX2-1-AS1, hsa-mir-96, PROK2 ceRNA axes were identified to be associated with the immune function, pathogenesis, and prognosis of SqCLC.