An immune-related signature predicted survival in patients with Kidney papillary cell carcinoma

: Immune-related genes are important factors in tumor progression. The main aim of this study was to identify the immune-related genes in Kidney papillary cell carcinoma (pRCC) patients. We downloaded RNAseq data and clinical information of pRCC patients from the TCGA database and retrieved the immune-related genes list from Immport. From the data, we mined out 2468 differential expression genes (DEGs) and 183 immune-related DEGs. Four hub DEGs ( NTS , BIRC5 , ELN , and CHGA ) were identified after conducting Cox analysis and LASSO analysis. Moreover, the prognostic value of the signature based on 4 hub DEGs was verified using Kaplan-Meier analysis (P=0.0041 in the training set and p=0.021 in the test set) and ROC analysis (AUC: 0.957 in 1 year, 0.965 in 2 years, and 0.901 in 3 years in the training set, and 0.963 in 1 year, 0.898 in 2 years, and 0.742 in 3 years in the test set). Furthermore, we found that the high-risk score group had a higher percentage of B cells in the immune component, a higher expression of immune-related genes ( CTLA4 , LAG3 , PDCD1LG2 , and TIGIT ), and a better immunotherapy response.


Introduction
Kidney papillary cell carcinoma (pRCC) is the second most common type of renal cancer after renal clear cell carcinoma 1 . It is worth noting that the first choice treatment method for pRCC patients is maximum resection of the tumor. However, a previous study reported that partially treated patients face the challenge of disease progression 2 . Therefore, immunotherapy has become the latest choice for advanced metastatic pRCC patients 3 . Several new clinical trials have shown that immunotherapy can extend the overall survival time or delay tumor progression in partially advanced metastatic pRCC patients 4,5 . However, different trials exhibited different rates of clinical benefits. Moreover, all the trials have proven that immunotherapy is not beneficial to partially treated patients. According to EAU guidelines 6 , immunotherapy was a second-line therapy option for advanced metastatic pRCC patients by the end of 2020. The prognostic value of immune-related genes has become a subject of persistent focus in cancer research. Several previous studies have reported that some special immune-related genes or the signature had a significant survival prognostic value for tumor patients 7,8 . Our comprehensive literature review indicated that only few studies focused on the immune-related genes in pRCC patients. Therefore, we used COX analysis and LASSO analysis in this study to identify the valuable immune-related genes in pRCC patients. In addition, we developed an immune-related signature and validated its survival value.

Methods and Results
Identification of immune-related genes using differential expression data We downloaded the gene expression RNAseq data and clinical phenotype of TCGA kidney papillary cell carcinoma (KIRP) from the UCSC Xena database (https://xenabrowser.net/datapages/). In total, we obtained 321 sample data which contained 289 cases of kidney papillary cell carcinoma (pRCC) and 32 normal samples. The data was then pre-processed with the cancer tissue using the following steps: 1. Exclusion of the samples without clinical data; 2. Exclusion of genes with FPKM <1 from all samples. Finally, we enrolled 289 pRCC and 32 normal samples. Among them, 2468 differentially expressed genes (DEGs): 638 up-regulated genes and 1830 downregulated genes (fig 1 A-B), were identified using R package "limma" (condition: adjust P.Value<0.01, and |logFC|>2) 9 . Furthermore, we downloaded the list of the immune-related genes from the ImmPort Portal database (https://www.immport.org/home/) which contained 2483 immune-related genes 10 . Intersection of the immune-related genes and 2483 DEGs resulted in the identification of 183 immune-related DEGs.

Functional and pathway enrichment analysis of immune-related DEGs
We used an online analysis tool created by David (https://david.ncifcrf.gov/) to perform gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis for all the 183 immune-related DEGs 11 . GO analysis classified the DEGs into three groups: molecular function group, biological process group, and cellular component group. The biological results revealed that DEGs were primarily enriched in inflammatory response, positive regulation of ERP1 and ERK2 cascade, and immune response (fig 1 C). The cellular component results indicated that DEGs were mainly enriched in extracellular space, an integral component of the plasma membrane (fig 1 D). The molecular function results showed that DEGs were mainly enriched in semaphorin receptor binding, chemokine activity, and cytokine activity (fig 1 E). Moreover, KEGG analysis indicated that the DEGs' pathways were mainly enriched in neuroactive ligand-receptor interaction and cytokine-cytokine receptor interaction (fig 1 F).

Training and test sets
We divided 289 samples into a training set (n=203) and a test set (n=86). The two sets satisfied the following criteria: 1. Samples were randomly divided into two sets; 2. Age, clinical stage, and follow-up time between the two sets were similar. The clinical information obtained from the two sets is shown in Table 1.

Identification of survival genes from the immune-related DEGs in the training set
Firstly, we used univariate Cox proportional hazard regression to analyze the RNAseq expression and survival date of all the 183 DEGs in the training set. Secondly, LASSO Cox regression analysis was used to analyze the valuable DEGs. Finally, multivariate Cox regression analysis was performed to identify survival DEGs. R package "survival" and "glmnt' were used for the above calculation, and p<0.05 was considered to be statistically significant. Univariate Cox regression analysis results identified 39 DEGs (Table 2), of which, five were left after conducting LASSO Cox regression analysis for a thousand times (fig 2 A-B). Finally, four DEGs (NTS, BIRC5, ELN, and CHGA) were selected as the survival genes after conducting multivariate Cox regression analysis ( Table 3).
The survival state, risk score, and heatmap of 4 hub genes in the training set were showed (fig 2 C-E).

Analysis of hub DEGs
Three hub genes (NTS, ELN, and CHGA) had lower mRNA expression and one hub gene (BIRC5) had higher mRNA expression in primary cancer tissue compared to the normal kidney tissue (fig 3  A). In addition, Kaplan-Meier analysis results indicated that three genes (BIRC5, ELN, and CHGA) were survival-related (fig 3 B-E). We used the STRING online tool (https://string-db.org/) 12

Prognostic value verification of the 4-mRNA signature in the training set and test set
The risk score of each patient was calculated based on the coefficients: Risk score = (0.250656*Exp NTS) + (0.465259*Exp BIRC5) + (0.251223*Exp ELN) + (0.241936*Exp CHGA). We found that the 4-mRNA signature risk score was an independent factor in multivariate Cox analysis (table 4).
A total of 203 samples in the training set were divided into high-risk score group (n=101) and low-risk score group (n=102) according to the risk score. A comparison of the two groups indicated that the high-risk score group had a higher mortality rate, while the low-risk score group had a large number of surviving patients (fig 3 C-E). Moreover, similar results were observed when 86 samples in the test set were divided into high-risk score group (n=42) and low-risk score group (n=44). We also performed Kaplan-Meier analysis and ROC analysis for risk score on the training set and test set. Kaplan-Meier analysis in the training set found that the high-risk score group had a significantly shorter overall survival time (p =0.0041, fig 5 A)

Correlation between risk score and clinical feature
We analyzed the correlation between risk score and different clinical information (tumor, lymph node, metastasis degrees, and grades). The results showed that the risk score had significant differences in tumor, lymph node, metastasis degrees, and grades (fig 6 A). Moreover, time-dependent ROC analysis found that the 4-mRNA signature had better accuracy compared to other clinical features (fig 6 B).

The relationship between the risk score and immune cell component, expression of immune-related genes, and immunotherapy response
We explored the immune cell component using two online analysis tools (TIMER: https://cistrome.shinyapps.io/timer/ 18 and ImmuCellAI: http://bioinfo.life.hust.edu.cn/ImmuCellAI/ 19 ). TIMER analysis results showed that a significantly higher percentage of B cells (P=0.0029), T cell+CD4 (P=0.0002), T cell+CD8 (P=0.0029), Neutrophil cells (P=0.0004), and DC cells (P=0.0047) would appear in the high-risk score group in the training set (fig 7 A). On the other hand, ImmuCellAI analysis results indicated that the high-risk score group had a higher percentage of Exhausted cells (P=0.00003) and B cells (P=0.0003) (fig 7 B). We also analyzed the expression of seven immune-related genes (CTLA4, CD274, LAG3, SIGLEC15, PDCD1LG2, HAVCR2, and TIGIT) and correlated the expression with the risk score. The results showed that CTLA4, LAG3, PDCD1LG2, and TIGIT had a higher expression in the high-risk score group (P<0.05, fig 7 C). In addition, the expression of PDCD1LG2 and TIGIT genes correlated with the risk score value (P<0.05, fig 7 D-E). According to the immunotherapy response results obtained after ImmuCellAI analysis, the high-risk score group had a better immunotherapy response (P=0.0013, fig 7 F).

Discussion
In this study, we identified four hub mRNA genes (NTS, BIRC5, ELN, and CHGA) using univariate and multivariate Cox analysis, and LASSO analysis. Previous studies have reported that NTS is a neurotensin receptor that participates in the gastro-intestinal and cardiovascular functions 20,21 . Another study also found that the NTS gene activates the Wnt/β-catenin signaling pathway, thereby promoting tumor metastasis 22 . BIRC5 is a member of the apoptosis inhibitor gene family, which encode regulatory proteins that prevent apoptotic cell death 23 . It regulates several types of cancer cells by activating a multiple-step cell apoptosis process 24,25 . ELN encodes the elastin protein, which is a key protein in the tumor microenvironment 26 . Moreover, the protein encoded by the CHGA gene is a member of the chromogranin/secretogranin family of neuroendocrine secretory proteins 27 . Its gene product is a precursor of the peptides which act as autocrine or paracrine negative modulators of the neuroendocrine system 28 .
We then verified the prognostic value of the signature in both training and test sets. Kaplan-Meier analysis showed that the high-risk score group had a bad survival time, while ROC analysis found that the AUC of the signature was excellent (0.957 in 1 year, 0.965 in 2 years, and 0.901 in 3 years in the training set). A previous study had developed a 5-mRNA genes signature for pRCC patients and proved that the AUC of the signature was 0.82 29 . Furthermore, we conducted a correlation between the risk score and clinical classification, and found that the risk score was correlated with the TNM stage. The above results convinced us that the signature had an accurate prognostic value.
Finally, we conducted an analysis of the immune component and found different immune components in the two risk score groups. TIMER and ImmuCellAI analyses results indicated that the high-risk score group had a higher percentage of B cells in the immune component. Moreover, we conducted a correlation between the risk score and expression of immune-related genes. Our results indicated that the high-risk score group had a higher expression level of CTLA4, LAG3, PDCD1LG2, and TIGIT. In addition, the high-risk score group had a better immunotherapy response. In summary, this study has identified four hub immune-related genes (NTS, BIRC5, ELN, and CHGA) in pRCC patients. We also developed a signature of four hub genes which can act as an independent prognostic factor for overall survival. Our results suggest that pRCC patients with a high-risk score have a shorter survival time and a better immunotherapy response.      Declarations Ethics approval and consent to participate No application Consent for publication All the authors agreed to publish the paper.
Availability of data and material We used the public data from TCGA KIRC data.
Author contributions Shen junwen wrote the paper. Wang rongjiang and Zhang Lisha edited the paper. Chen yu, Fang zhihao, Yao jianxiang，Zhang xu and Ling yuhang analyzed the data. Tang jianer made the images out.

Funding
The Zhejiang province medical and health project (2020KY937) disclosure The authors have declared no conflicts of interest.