Bioinformatics profiling integrating a four immune-related long non-coding RNAs signature as a prognostic model for papillary renal cell carcinoma

Background: Papillary renal cell carcinoma (pRCC) was the 2nd most common subtype, accounting for approximately 15% incidence of renal cell carcinoma (RCC). Immune related long non-coding RNAs (IR-lncRs) plentiful in immune cells and immune microenvironment (IME) are potential in evaluating prognosis and assessing the effects of immunotherapy. A completed and meaningful IR-lncRs analysis based on abundant pRCC gene samples from The Cancer Genome Atlas (TCGA) will provide insight in this field. Results: 17 IR-lncRs were selected by Pearson correlation analysis of immune score and the lncRNA expression level, and 5 sIRlncRs were significantly correlated with the OS of pRCC patients. 4 sIRlncRs (AP001267.3, AC026471.3, SNHG16 and ADAMTS9-AS1) with the most remarkable prognostic values were identified to establish the IRRS model and the OS of the low-risk group was longer than that in the high-risk group. The IRRS was certified as an independent prognosis factor and correlated with the OS. The high-risk group and low-risk group showed significantly different distributions and immune status through PCA and GSEA. In addition, we further found the expression levels of SNHG16 was remarkably enhanced in female patients with more advanced T-stages, but ADAMTS9-AS1 showed the opposite results. Conclusion: The IRRS model based on the identified 4 sIRlncRs showed the significant values on forecasting prognoses of pRCC patients, with the longer OS in the low-risk group. Methods: We integrated the expression profiles of LncRNA and overall survival (OS) in the 322 pRCC patients based on the TCGA dataset. The immune scores calculated on account of the expression level of immune-related genes were used to verify the most relevant IR-lncRs. Survival-related IR-lncRs (sIRlncRs) were estimated by COX regression analysis in pRCC patients. The high-risk group and low-risk group were identified by the median immune-related risk score (IRRS) model established by the screened sIRlncRs. Functional annotation was displayed by gene set enrichment analysis (GSEA) and principal component analysis (PCA), and the immune composition and purity of the tumor were evaluated through microenvironment cell count records. The expression levels of sIRlncRs of pRCC samples were verified by real-time quantitative PCR.

(pRCC), following clear cell RCC (ccRCC), was the 2nd most common subtype, accounting for approximately 15% incidence of RCC [2,3]. Characterized by the heterogeneous multifocal or isolated tumor, pRCC also showed indolent or aggressive the two distinguish behavioral features [4]. In 2016, the WHO revised the new classification of renal tumors, of which pRCC was further classified into types I and II in greater detail [5,6]. Given the unique clinicopathological features of various subtypes, series of researchers began paying attentions to discovery of pRCC.
With the deeper understanding of the crucial roles of immune and stromal cells on tumor biological progress, a growing body of researchers were motivated to discover immunotherapy of pRCC, and increasing immunotherapy drugs obtained rewarding treatment effects [7][8][9][10]. However, the limited response rate and the emergence of resistance also occurred in another parts of patients [11]. Therefore, researchers started focusing subsequent studies on identifying more accurate and sensitive biomarkers to distinguish patients probably with the satisfied therapy response rate and prognosis, so as to achieve greater benefits on therapeutic efficiency and survival [12].
With the increasingly clear cognition of the crucial role of genetics and epigenetics on tumor pathological features, biological behaviors, therapy strategies and prognosis, more pRCC-related immune genes and genetic modification approaches were identified to be involved in the diagnosis process and prognosis prediction [13,14] Long non-coding RNAs (lncRNAs) are a kind of transcripts lacking the potential of coding proteins, but showed the pivotal position, such as antigen exposure, recognition and immune infiltration [15]. Thus, the potential of immune-related lncRNAs (IR-lncRs) on forecasting tumor progression and prognosis are drawing increasing attentions. Xiao Y. detected the prognostic effect of lncR-H19 in glioma [16]. Chen S. revealed that the expression level of lncR-PVT1 was significantly associated with the overall survival (OS) of patients with osteosarcoma [17].
However, the potential of certain genetic markers in immune microenvironment (IME) especially for iIR-lncRs on prognosis forecast have yet to be adequately elucidated [18,19]. Therefore, identifying some novel and sensitive genetic biomarkers which are potential in predicting progression and prognosis of pRCC might be helpful to provide more personalized guideline and appropriate therapeutic strategy.
To give an insight into the clinical potency of IME and IR-lncRs on prognosis evaluation of pRCC, we extracted a class of IR-lncRs in IME predicting poor prognosis in pRCC patients, together with their clinicopathologic signatures, we further calculated their correlations with OS. The results establish a more personalized precision predicting model of pRCC, and provide the guiding light for making clinical decision.

Acquisition of IR-lncRs
Transcriptome data and clinical data of patients with pRCC were fetched from TCGA database. Next, transcriptome data was processed to convert the data ensembl ID into gene names. Following that, transcriptome data were divided into lncRNA and mRNA. From the Immune system process M13664 and Immune response M19817 of Molecular Signatures Database, we identified 331 pRCC IRGs, of which 17 lncRNAs were validated to be the IR-lncRs by correlation analysis.

The relevance of IR-lncRs and prognosis
Based on COX Regression model, we then identified 5 IR-lncRs which were associated with prognosis (sIRlncRs), such as AP001267.3, SNHG16, AC021054.1, AC026471.3 and ADAMTS9-AS1. The relationships between these sIRlncRs and prognosis were clearly illustrated in the forest map ( Figure 1).

Clinicopathologic characteristics of the high-risk group and the low-risk group
The top 4 sIRlncRs (AP001267.3, AC026471.3, SNHG16 and ADAMTS9-AS1) among the 5 sIRlncRs were included to establish the risk evaluating model, by which the pRCC samples were divided into the highrisk group and the low-risk group based on the intermediate risk score (Figure 2A). The mortality rate constantly increased with the higher risk score ( Figure  2B). And with the increase of risk score, the expression levels of AC026471.3 and SNHG16 were elevated, while AP001267.3 and ADAMTS9-AS1 expressed decreasingly ( Figure 2C). The survival of the high-risk group was significantly shorter than that of the low-risk group ( Figure 3).

The clinical application of the IRRS and the relationships between the IRRS and clinicopathologic indicators
To investigate the relevance of the sIRlncRs and clinicopathological features of pRCC, we analyzed the correlation between the risk score and the clinical and demographic characteristics, such as age, gender, stage, T-stage, N-stage and M-stage. Under the IRRS, the scores of older patients ( Figure 4A), female patients AGING  Survival status between the high-risk group and the low-risk group (B). The heatmap of expression profile of contained sIRlncRs (C). In the heatmap, red parts represent up-regulation, green parts represent down-regulation, and black parts represent sIRlncRs without differential expression.
( Figure 4B), patients with advanced stage ( Figure 4C), advanced T-stage ( Figure 4D), advanced M-stage ( Figure 4E) and advanced N-stage ( Figure 4F) were significantly increased. The above results elucidate some clinical and demographic characteristics that are sensitive to the IRRS and further corroborate the clinicopathological application value of the model.
We also analyzed the relationships between the compositions of the IRRS and the aforementioned tumor characteristics which all illustrated in Table 1. The expression levels of SNHG16 and AC026471.3 were higher in female patients ( Figure 5A-5B). However, the expression levels of AP001267.3 and ADAMTS9-AS1 were lower in female patients ( Figure 5C-5D).     Note: t: t value of student's t test; P: P-value of student's t test.
We further found the expression levels of AP001267.3 and ADAMTS9-AS1 were gradually decreased in the more advanced stage, T-stage, M-stage and N-stage, while the expression levels of AC026471.3 and SNHG16 were enhanced ( Figure 6A-6D). Detection of the roles of different sIRlncRs in indicating tumor characteristics provides insight into the further discovery of biomarkers.
We then conducted the independent risk analysis, the results showed age, stage, N-stage, M-stage and risk score were significantly correlated with OS in univariate analysis (P<0.05). But in the multivariate analysis, Mstage and risk score showed more remarkable correlations with OS ( Table 2). The ROC curve represented the accuracy of the model. The AUC of risk score was 0.958 ( Figure 7). These results suggested the risk score was an independent prognostic factor.

Analysis of the immune status of the high-risk and low-risk population
We employed the PCA to detect the different distribution patterns between the low-risk group and the high-risk group by the immune gene sets and the genome-wide expression sets. In the IRGs set, the lowrisk group and the high-risk group were observably separated with the lower immune scores in the low-risk group ( Figure 8A). While we didn't detect the significant the separation of the immune scores on the basis of the genome-wide expression profiles ( Figure 8B). The results of GSEA further proved the functional annotation, with the more active immune-related responses and processes in the high-risk group ( Figure 8C and 8D).
In order to verify whether the immune genome accurately reflects the status of the tumor immune microenvironment, we analyzed the relationships between the sIRlncRs and immune cell infiltration ( Figure 9A-9F). We found that only B cell showed the most significant relationship with sIRlncRs ( Figure 9A).

SNHG16 and ADAMTS9-AS1 were respectively high-expression and low-expression in pRCC patients especially in female with advanced T-stages
We previously explored the roles of SNHG16 and ADAMTS9-AS1 on predicting prognosis by bioinformatics methods (Supplementary Figure 1). To further verify the relationships between sIRlncRs and the clinicopathologic features, as well as discover the roles of sIRlncRs on indicating clinical prognosis, we detected the expression levels of SNHG16 and ADAMTS9-AS1 in carcinoma and adjacent tissues of pRCC patients with different genders and T-stages.
Given age was insignificant associated with the five sIRlncRs, it was removed in the verification process. As illustrated in Figure 10, compared with adjacent tissues, the higher expression of SNHG16 ( Figure 10A) was detected in carcinoma tissues, but ADAMTS9-AS1 ( Figure 10B) showed the reversed expression level. Besides, the expression of SNHG16 ( Figure 10C) and ADAMTS9-AS1 ( Figure 10D) were significantly more and lower respectively in T3 and 4 stages compared with that in earlier T-stages. Female patients were detected to obtain the higher expression levels of SNHG16 ( Figure 10E) and the lower expression levels of ADAMTS9-AS1 ( Figure 10F) than male.

DISCUSSION
The significance of tumor immunization activities on tumorigenesis and development and the individual variation at the genetic level attracted a growing body of researchers pay attention to discover the implications of differential genes which were potential for distinguishing pRCC patients responding heterogeneously and predicting prognosis, thus increasing miRNAs and lncRNAs are being identified to be labelled on the immune correlation [20][21][22][23] However, the mechanisms of IR-lncRs on pRCC at the level of whole genome have yet to be adequately elucidated [24].  To detect the roles of lncRNAs in assessing prognosis, Seema Khadirnaikar established a lncRNA immune prognostic signature score model based on seven lncRNAs, and demonstrated a negative correlation between the score and survival of patients with ccRCC [25]. To elucidate the functional roles and regulatory mechanism of lncRNAs in pRCC, Xin Z. constructed a pRCC-associated competing endogenous RNA network, with 57 lncRNAs, 11 miRNAs, and 28 mRNAs, besides, they identified three mRNAs (ERG, RRM2 and EGF) regarded as negative indicators of prognosis [26]. Ze G. also developed an assessing model based on the expression profile of five mRNAs (CCNB2, IGF2BP3, KIF18A, PTTG1, and BUB1) to predict the survival of pRCC patients [27]. These discoveries offer us inspiration to focus subsequent studies on detecting some specific IR-lncRs and developing a practical immune-related risk scoring model to assess the immune status and indicate prognosis of pRCC patients. The AUC is 0.958. In the present study, 322 pRCC patients were enrolled in a genome-wide analysis for lncRNAs, combining with 311 IRGs screened in Molecular Signatures Database v4.0 (Immune system process M13664, Immune response M19817), and 17 IR-lncRs were identified eventually. We further detected the relation between the prognosis of pRCC patients and the expression levels of the 17 IR-lncRs, of which 5 sIRlncRs indicated the significant correlation with OS. Utilizing multivariate Cox and risk score model, we further identified 4 IR-lncRs to establish a risk evaluating model which was available to distinguish pRCC patients into the high-risk group and low-risk group with obviously differences of OS. Due to the molecular heterogeneity, the accuracy and sensitivity of the present clinical risk factors in predicting the survival of pRCC patients remain unsatisfying, we further validated the predicting value of the three sIRlncRs by multivariate analysis. We found the 4 sIRlncRs were independent of traditional risk factors and molecular characteristics. According to the IRGs set, the low-risk group and the high-risk group tended to be divided into two parts, with the low-risk group having lower immune scores than the high-risk group. GSEA was employed to further verify the functional annotation, and we found the more abundant immune-related responses and processes in the high-risk group. Therefore, immune-related scores are bound up with the immune status of pRCC, with higher scores indicating the poor prognosis. Besides, we found only B cell showed the most prominent relationship with sIRlncRs, which is probably due to the remarkable effects of macrophage polarization on the tumorigenesis, progression, prognosis and tumor behaviors of pRCC.

Variables
The results motivated us to further discover the underlying functions and mechanisms in future studies.     Furthermore, SNHG16 and ADAMTS9-AS1 were further verified in 22 pRCC samples, and the significant correlations between the two sIRlncRs and T stages and genders were detected. We found female patients obtained the higher expression levels of SNHG16 and the lower expression levels of ADAMTS9-AS1 than male. It is probably due to the transcription factors which affect genes expression levels are different between male and female. Some sex-biased genes have transcription factors with different expression levels between different genders in the promoter region where genes can be activated. Besides, sex chromosomes and hormones control the differences between male and female, and these differences could eventually affect transcription factors [28]. These findings suggest that the risk evaluating scores based on the 4 sIRlncRs can contribute to identify the high-risk patients from patients with the same clinical or molecular characteristics, thereby realize individualized and appropriate therapeutic strategy.
In the analyzing process, the large numbers of pRCC patients were enrolled to further enhance the reliability and persuasion in guiding clinical strategic decisions. Besides, some specific IR-lncRs with remarkable differences under variable risk factors have been further corroborated to be implicated in the progression and prognosis of pRCC. More importantly, these IR-lncRs as the molecular bioindicators, showed significant potency in forecasting and evaluating the OS.
Although we elucidated the roles of sIRlncRs on forecasting prognosis and verified the expression levels of SNHG16 and ADAMTS9-AS in tumor tissues, some limitations remain to be further discussed. Firstly, we didn't combine with the detection of proteomics, metabolomics and immunogenomics. Then, the practical application values of these sIRlncRs have yet to be adequately elucidated and need to be further wide verification. Thirdly, except for SNHG16 and ADAMTS9-AS, other sIRlncRs included in the IRRS model are also needed to be explored.
In conclusion, we comprehensively analyzed and verified the effects of IR-lncRs on forecasting clinical prognosis of pRCC. The results will contribute to develop a reliable and referable risk evaluating model and provide new insight into the immune-related researches and treatment strategies.

Ethics statement
Informed consent forms have been signed by all patients before this study. The research protocol has been approved by the Ethics Committee of The First Affiliated Hospital of Chongqing Medical University and is based on the ethical principles of medical research involving human subjects in the Helsinki Declaration.

Immune-related long non-coding RNA acquisition
The Molecular Signatures Database v4.0 (Immune system process M13664, Immune response M19817, http://www.broadinstitute.org/gsea/msigdb/index.jsp) was used to explore the immune-related gene participating in the immune process. Immune related gene was used to establish the immune score of pRCC gene by GSEA.
The correlation between the immune score and the expression of lncRNA in pRCC patients was analyzed by Pearson correlation analysis. IR-lncRs was identified by a criterion of |r|>0.8 and P<0.001.

Survival-related IR-lncRs
IR-lncRs associated with clinical outcomes in pRCC patients were regarded as survival-related IR-lncRs (sIRlncRs). Univariate COX analysis was used to screen sIRlncRs (p<0.01). Hazard ratio (HR) was used to specified sIRlncRs into protective and deleterious portion. These sIRlncRs were selected for follow-up study.

Establish the immune-related risk score model (IRRS)
To verify the reliability, sIRlncRs were submitted for the multivariate analysis, while the integrated sIRlncRs were still used as an independent prognostic indicator to develop the IRRS (p<0.05) ( Table 3). In order to explore the heterogeneous clinical prognostic outcomes, based on the differential expression of sIRlncRs, we performed a risk score model to divide pRCC patients into the highrisk group and the low-risk group according to the median risk score which was chosen as the cutoff point. . The values of IRRS were employed to evaluate various subtypes of pRCC patients. To further investigate the relevance of the sIRlncRs and clinicopathological features of pRCC, we analyzed the relationship between the IRRS and clinicopathologic characteristics, of which the "TNM staging method" is the most common way to describe the tumor status. The division of "T-stage" pRCC was based on the maximum diameter of tumors and extent of tumor invasion, with the bigger tumors and more extensive invasive statues in the more advanced T stages. "N-stage" reflects the lymph node metastasis conditions with more metastatic lymph nodes in the more advanced N stages. "M-stage" is distinguished according to whether the tumor exhibits distant metastasis, and advanced M stages usually represent poor tumor conditions. In addition, "stage" is a comprehensive method combining T-stage, N-stage and M-stage to divide patients with pRCC into I, II, III and IV stage.

Bioinformatics analysis
The survival ROC curve was employed to verify the prognostic performance through the survival ROC package of the R software. The survival time, survival status and risk scores of patients with pRCC were used to predict the prognosis over a 5-year period, then the ROC curve was drawn and the value of AUC was calculated. Abscissa was the false positive rate and ordinate represented the true positive rate. Principal component analysis (PCA) was displayed to demonstrate the expression levels of pRCC samples and gene set enrichment analysis (GSEA) was used to detect the different functional phenotypes between the lowrisk group and high-risk group. The tumor infiltrating immunocytes were evaluated through the TIMMER database. The levels of immune infiltration in pRCC patients were downloaded, and the relationship between IRRS and immune cell infiltration was calculated.

Real-time quantitative PCR
Triazole (Invitrogen) was used to extract total RNA from tissues and cell lines under various experimental conditions according to the manufacturer's instructions. cDNA Synthesis Kit (Osaka, Japan of TaKaRa) combining with RNA (1μg) was utilized to reverse transcribed cDNA. The reaction steps were as follows: 37°C for 15min, 85°C for 5s, and quantitative polymerase chain reaction (qPCR) was performed on an ABI 7500 real-time PCR system (Applied Biosystems) using SYBR-Green method (TaKaRa). The reaction cycle conditions were performed (95°C 30s, followed by 40 cycles of 95°C for 5 s and 60°C for 34 s).
Relative expression level of lncRNAs normalized to βactin was calculated by the 2 −ΔCt method. The primer sequences are shown in Table 4. Three replicate assays were performed for each cDNA sample.

Statistical analysis
Univariate Cox regression analysis and Pearson correlation analysis were used to identify the target IR-lncRs. Kaplan-Meier curve was used to evaluate the OS between low-risk group and high-risk group. Univariate and multivariate Cox regression analysis were used to verify the independent prognostic factors for pRCC patients. All statistical analysis was conducted using SPSS21.0 software (SPSS Inc, Chicago, IL) and GraphPad Prism5 (GraphPad Software Inc, La Jolla, CA). Varieties in clinical parameters were determined using independent t-tests. P<0.05 was considered significantly statistical difference.