Construction and Validation of an Immune-related lncRNA Prognosis Model in Thyroid Cancer

A growing number of studies have shown that immune-related long non-coding RNAs (lncRNAs) play an important role in the development of cancer. The aim of this study was to identify immune-related lncRNAs in thyroid cancer (THCA) and to develop a prognostic model for THCA.


Background
Thyroid cancer (THCA) is a malignant tumour of the thyroid epithelium.Five-year survival rate for patients with invasive THCA is less than 50%, most non-invasive patients have a good prognosis [1].
However, some patients with non-invasive THCA are prone to recur and distal metastases may occur [2].
Existing prognostic predictors for THCA do not accurately predict patients' survival and recurrence [3,4].Therefore, there is an urgent need to nd new and reliable markers for the prognosis of patients with THCA.
Tumour immunotherapy has developed into an important treatment approach.It is gradually becoming a hot topic in oncology therapeutics and brings hope to cancer patients.In advanced follicular carcinoma and anaplastic THCA, multiple preclinical studies of cancer immunotherapy showed prospect for anticancer applications [1].Anti-PD-1 treatment enhanced the effect of BRAF immunosuppression on tumour regression and the anti-tumour immune response in mesenchymal THCA [5].A variety of immune cells are present in the THCA microenvironment.These cells can ght tumours or promote them.In addition, pro-in ammatory cytokines and chemokines secreted by immune cells could promote or inhibit the proliferation of tumour cells [6].The fact that cancer can survive in this immune microenvironment illustrates the critical role of immune regulation in THCA.Studies also showed that chronic in ammation was associated with the presence and severity of THCA [7].The expression of immune-related genes in cancer cells can regulate the abundance of in ltrating immune cells.This affects tumour diagnosis, prognosis and sensitivity to clinical treatment.These ndings have prompted researchers to explore the expression of immune-related coding and non-coding genes in THCA and provide ideas for the clinical diagnosis, treatment and prognosis of THCA from an immune perspective.However, the immune-related genes associated with the prognosis of THCA patients have not been fully revealed to date.Long noncoding RNAs (lncRNAs) are important components of non-coding RNAs and play an important role in pretranscriptional, post-transcriptional and post-translational levels as well as in tumour development.
Liyanarachi S et al. found that the expression of lncRNAs XLOC051122 and XLOC006074 correlated signi cantly with lymph node metastasis and mutations in BRAF (V600E) in patients with papillary THCA [8].Lei H et al. found that lncRNA TUG1 was related to tumor progression of THCA [9].Teng H et al. showed that lncRNA PACERR and CYP4A22-AS1 were associated with the aggressiveness and diseasefree survival of papillary THCA [10].The clinical relevance and prognostic importance of immune-related lncRNAs need to be further investigated in THCA.
In this study, we provided insight into the lncRNAs associated with THCA prognosis.A prognostic model was constructed and demonstrated that this model could act as an independent prognostic factor.Our study provides potential therapeutic targets for THCA.
2 Materials And Methods

Data Collection
Expression pro les and clinical information of mRNA and lncRNA of THCA patients were obtained from The Cancer Genome Atlas (TCGA) (https://cancergenome.nih.gov/).It contains 58 paracancerous tissues and 509 THCA tissues.Patients with a survival time of less than 30 days and unknown survival time were excluded.

Immune-related lncRNAs
We searched the Gene Set Enrichment Analysis (GSEA) (https://www.gseamsigdb.org/gsea/msigdb/index.jsp), using immune-related keywords 'IMMUNE_RESPONSE' and 'IMMUNE_SYSTEM_PROCESS', then downloaded the gene sets.The expression of immune-related genes was extracted from TCGA and analysed by Perl.We got immune-related lncRNAs by analyzing the relevance between lncRNAs and the immune-related genes, by the limma in R. The lter criteria were: (1) the level of all genes in all samples > 0.5; (2) the level of lncRNAs in all samples were in uctuations.

Prognostic model construction
Screening for genes signi cantly associated with prognosis based on univariate and multivariate COX regression analyses of the expression of lncRNAs in relation to patient survival time and status using the 'survival' package in R (P < 0.01).We use the function predict to calculate the risk value for each patient.

Patients and Samples Collection
24 patients with TCGA admitted to Chongqing University Cancer Hospital from January to June 2017 were randomly selected.The pathological diagnosis of all patients was assessed by three pathologists simultaneously according to the WHO classi cation criteria.Patients who received preoperative chemotherapy, radiotherapy or any other adjuvant treatment were excluded from this study.The ethics has been approved by Chongqing University Cancer Hospital.All the participants signed informed consent forms.

RNA Extraction and Quantitative Real-Time PCR (qRT-PCR) analysis
Tissues frozen in liquid nitrogen were taken and RNA was extracted according to the miRNeasy FFPE kit (Qiagen).MiRcute Enhanced miRNA cDNA First Strand Synthesis Kit (TIANGEN) was used to reverse transcribe the RNA into complementary DNA.The qRT-PCR analysis was performed using qPCR miRcute enhanced miRNA uorescence quanti cation kit (TIANGEN).Results were normalised to glyceraldehyde-3-phosphate dehydrogenase (GAPDH) expression.We conducted three replicate experiments.The primers synthesized by thermo sher and were shown in Table S1.

Statistical analysis
All analyses were performed using R 3.4.3.The "survival" and "survminer" packages were used to plot survival curves for the high-risk and low-risk groups.Heat maps were drawn using the "heatmap" package.The "ggpubr" package analyzed the correlation between lncRNAs and clinical features.Principal component analysis was performed using the 'limma' and 'scatterplot3d' packages.Univariate and multivariate Cox regression analysis were used to test whether the constructed model could be used as an independent prognostic factor.SPSS 24.0 statistical software was used to analyze the experimental data.Relative expression of lncRNA was calculated using the 2 −ΔΔCt method relative to GAPDH.The difference was considered signi cant when P < 0.05.

Acquired the immune-related lncRNAs in THCA
To make the research process clear, a ow chart was drawn to illustrate the research design and analysis methodology (Fig. 1).We obtained sequencing data and clinicopathological information from TCGA for 58 paraneoplastic and 497 cancerous tissues (Table 1).Immune-related gene sets named "Immune Response" and "Immune System Processes" were downloaded from the GSEA website.331 immunerelated genes were obtained by gene expression analysis.Next, 822 immune-related lncRNAs were obtained by correlation analysis of immune-related genes with lncRNAs (|Cor |>0.4,P < 0.001).

Constructed the prognostic model
To get the immune lncRNAs associated with prognosis, we analysed the relationship between the expression of immune lncRNAs and survival status.Univariate Cox regression analysis revealed that 26 immune lncRNAs were signi cantly associated with patients' survival (P < 0.01) (Table 2).Multivariate Cox regression analysis was performed on the above 26 immune-related lncRNAs to screen out 7 lncRNAs with high prognostic diagnostic value and to establish a prognostic model for immune-related lncRNAs in THCA (Table 3).The formula for calculating each patient's risk score was as follows: Risk

Predicted good and poor prognosis in THCA patients
Based on the median risk score, patients were divided into a high risk group (n = 248) and a low risk group (n = 249).We ranked the risk score and marked the survival status of all patients in a scatter plot.Mortality was higher in the high-risk group than in the low-risk group, and mortality increased with the increasing risk score (Fig. 2A).The heat map revealed the relationship between the expression of lncRNAs in the model and the risk score in each patient (Fig. 2B).As the expression risk score increased, ve lncRNAs had elevated expression and were de ned as high risk lncRNAs (i.e.LINC01614, AC017074.1,LINC01184, LINC00667, ACVR2B-AS1), while AC090673.1,LINC00900 had decreased expression and were de ned as low risk lncRNAs.In addition, the results of the survival analysis showed a signi cant difference between the high and low risk groups (P = 1.41e-05) (Fig. 3).

Discriminated between high-risk and low-risk groups in different genomes
To verify the importance of lncRNAs as prognostic markers for THAC patients in our model, we did a principal components analysis to explore whether lncRNAs could distinguish between the high and low risk groups in our model.There were signi cant differences in the expression pro les of immune-related lncRNAs between the two groups (Fig. 4).Results indicated that the high and low risk groups can be discriminated by the lncRNAs in our model.

Evaluated the prognostic models
To further validate the accuracy and speci city of the prediction model, we performed the univariate and multivariate analyses to assess whether the model could be used as an independent prognostic factor.Univariate independent prognostic analysis revealed that the following factors were statistically signi cant, namely, age (P < 0.001), stage (P = 0.003), T stage (P = 0.030) and the prognostic model (P < 0.001) (Fig. 5A).Multivariate independent prognostic analysis displayed that age (P = 0.002) and the model (P = 0.025) could predict the overall survival (OS) for patients with THCA (Fig. 5B).This indicated that the model we constructed could predict the prognosis of THCA patients.

Analyzed the correlation between lncRNAs and clinical features
To investigate the correlation between the expression of lncRNAs in the model and clinical characteristics, a correlation analysis was performed.The results suggested that the level of AC090673.1(P< 0.05),LINC01184(P < 0.001) LINC01614(P < 0.001) were related to the clinical stage.As the disease stage increased, the expression of AC090673.1 and LINC01614 went down then up, with overexpression in the late stage.In contrast, the expression of LINC01184 was relatively high and showed a trend of increasing before decreasing (Fig. 6A).The level of LINC00900 (P < 0.001) LINC01614 (P < 0.001) were associated with the T stage.As T stage increased, the expression of LINC00900 rose slightly and then fell, while the expression of LINC01614 decreased and then increased (Fig. 6B).These lncRNA prognostic markers could be seen to more accurately predict the clinicopathological characteristics of patients.

Detected the lncRNA expression levels in THCA patients
In order to further evaluate the reliability of the constructed model, we examined the expression levels of 7 lncRNAs in cancer and paracancer tissues of 24 THCA patients by qRT-PCR (Table 4).The results showed that LINC01614, ACVR2B-AS1, AC017074.1,LINC01184 and LINC00667 were up-regulated in THCA, while LINC00900 and AC090673.1 were down-regulated (Fig. 7).The experimental results were consistent with the bioinformatic analysis.4 Discussion lncRNAs play critical roles in different stages of cancer immunity, including antigen release, antigen presentation, immune activation, immune cell migration, in ltration of cancer tissue and killing of cancer cells, and immune escape of tumour cells.And lncRNAs may be key regulators in reshaping the tumour immune microenvironment [11,12].We analysed the correlation between lncRNAs and immune-related genes in THCA and developed a lncRNA prediction model consisting of seven lncRNAs (LINC01614, AC017074.1,LINC01184, LINC00667, ACVR2B-AS1, AC090673.1,LINC00900).The high-risk lncRNA LINC01614 in this model has been studied in breast cancer (BC).Vishnubalaji R et al. found an increased expression of the lncRNA LINC01614 in primary BC tissues and BC cell lines.It was an unfavorable prognostic marker for BC [13].Wang Y et al. con rmed this result [14].Sun Y et al. also found that high expression of LINC01614 indicated a low OS rate in non-small cell lung cancer (NSCLC) patients [15].In our model, another high-risk lncRNA INC00667 was proved to be a tumor promoter in NSCLC.Involved in the proliferation, migration and angiogenesis of NSCLC cells, it was signi cantly associated with OS time of patients [16].lncRNA INC00667 also correlated with the prognosis of BC, ovarian and hepatocellular carcinomas [17][18][19][20].Zhou M et al. found that LINC01184 was overexpressed in tumor-in ltrating B lymphocytes and was involved in immunosuppression of cancer [21].Our model suggested that LINC01184 was a high-risk lncRNA associated with clinical stage.ACVR2B-AS1 was also one of the highrisk lncRNAs in our model.High expression of ACVR2B-AS1 was an independent poor prognostic factor for OS and relapse-free survival in patients with liver cancer [22].With regard to LINC00900, we found it to be relevant to the clinical T stage.Wang W et al. showed that the expression of LINC00900 increased with tumour grade [23].
In THCA, the potential role of lncRNA as an immune-related prognostic marker remains unclear.The prognostic role of lncRNAs in THCA has been investigated from different perspectives, such as methylation and competing endogenouse RNA.Li Q et al. analyzed the methylation and transcriptome of THCA, identi ed 483 epigenetically regulated lncRNAs (EpilncRNAs) associated with the development of THCA, and constructed a prognostic model of EpilincRNAs.The methylation-driven 5-lncRNA-based signature was found to be an independent prognostic factor and to be involved in immune and in ammation-related biological processes [24].Li X et al. identi ed Lnc5m4 as an independent prognostic factor in a competing endogenouse RNA network consisting of lncRNA, miRNA and mRNA.Patients with THCA could be divided into high and low risk groups [25].In this study, we veri ed the accuracy and speci city of the predictive model through univariate and multivariate Cox regression analysis.The model we constructed could be used as an independent prognostic factor.
In addition, we further analysed the correlation between the clinical characteristics of the patients and the 7 lncRNAs in the model.The results showed that AC090673.1,LINC01184 and LINC01614 were associated with clinical stage.LINC00900 and LINC01614 were associated with the T stage.LINC01614 has been shown to correlate with clinical stage in patients with glioma and NSCLC [26,27].LINC01184 was correlated with several clinicopathological pro les of rectal cancer, including the patient's clinical stage, tumour size, lymph nodes and distal metastases [28].LINC00900 was not differentially expressed in neurogliomas of different clinical stage [29].The relationship between these lncRNAs and the clinical characteristics of THCA patients needs to be further investigated.
There are some limitations to this study and the constructed lncRNA prognostic model needs more sample validation before it can be put into clinical application.Recently, an immune-related lncRNA model was constructed based on the Gene expression omnibus database to predict the prognosis of THCA [30].The immune-associated gene sets in our study were derived from the GSEA database and the prognostic models constructed were validated in multiple clinical samples, which made our conclusions more convincing.The prognostic model constructed in that study crossed over with our model but was not fully consistent.This may be mainly due to the different sources of data and the different sets of immune-related genes selected.

Conclusion
In summary, we constructed and validated a prognostic model for immune-related lncRNAs in THCA.The The process of building the prognostic model. Abbreviations

Figure 2 The
Figure 2

Figure 3 Overall
Figure 3

Figure 4 Different
Figure 4