A prognostic model for melanoma patients on the basis of immune-related lncRNAs

The prognosis of melanoma patients is highly variable due to multiple factors conditioning immune response and driving metastatic progression. In this study, we have correlated the expression of immune-related lncRNAs with patient survival, developed a prognostic model, and investigated the characteristics of immune response in the diverse groups. The gene expression profiles and prognostic information of 470 melanoma patients were downloaded from TCGA database. Significantly predictive lncRNAs were identified by multivariate Cox regression analyses, and a prognostic model based on these variables was constructed to predict survival. Kaplan-Meier curves were plotted to estimate overall survival. The predictive accuracy of the model was evaluated by the area under the ROC curve (AUC). Principal component analysis was used to observe the distribution of immune-related genes. CIBERSORT and ESTIMATE were used to evaluate the composition of immune cells and the immune microenvironment. Eight immune-related lncRNAs were determined to be prognostic by multivariate COX regression analysis. The patient scores were calculated and divided into high- and low-risk groups. The model could effectively predict the prognosis in patients of different stages. The AUC of the model is 0.784, which was significantly higher than that of the other variables. There were significant differences in the distribution of immune-related genes between two groups; the immune score and immune function enrichment score were higher in the low risk group.


INTRODUCTION
Melanoma is a highly malignant tumor originating from melanocytes; its prevalence lately shows a global annual increase of about 3%-7% [1]. The prognosis is favorable when the disease is detected at an early stage (stages I and II); however, the prognosis of advanced diseases (stages III and IV) is extremely poor, and the 5-year survival rate on conventional treatment (including chemotherapy and biotherapy) is only 5-10% [2]. Melanoma development is strongly influenced by inflammation and immune cell infiltration [3]. In fact, immune checkpoint inhibitors (ICIs), especially anti-PD-1 and anti-PD-L1 blockers, have changed the traditional treatment mode and achieved unprecedented long-lasting responses in patients with melanoma. However, the effective rate of anti-PD-1 blockade fluctuates between 20% and 40%, and the complete response rate is only about 5% [4]. Therefore, risk stratification of melanoma patients by the combination of immune-related factors and pathological classification is key to predict the prognosis and treatment response.
Long noncoding RNAs (lncRNAs) refer to RNA molecules longer than 200 nucleotides which lack protein coding function. lncRNAs play an extensive regulatory role in different stages of tumor immune response, such as antigen exposure, antigen recognition, immune activation, immune cell infiltration, and tumor clearance [5]. Previous studies showed that a variety of lncRNAs, such as UCA1, DSCAM-AS1 and MIR155HG, can promote melanoma development and are potential therapeutic targets [6][7][8]. However, the role of immune-related lncRNAs in the prognosis of melanoma is still unclear. In this study, gene expression profiles from the TCGA database were analyzed to establish a relationship between immune-related lncRNAs and the prognosis of melanoma patients followed by construction of a new prognostic model.

Characteristics of patients
A total of 470 melanoma patients were analyzed in this study, including 290 males (61.7%) and 180 females (38.3%), with an average age of 59.2 years (18-90 years). The clinicopathological features taken into consideration consisted of age, gender, T stage, lymph node involvement, distant metastasis and clinical stage (Table 1). We drew a flowchart to illustrate the study design and the type of samples involved and analyzed at each step, where each sample represents one melanoma patient ( Figure 1).

Prognostic significance of the immune-related lncRNAs in melanoma
Three hundred and thirty-one immune-related genes were extracted from the TCGA data set based on the GSEA immune-related gene sets (Immune system process M13664, Immune response M19817). Sixty immune-related lncRNAs were identified with the co-expression method (correlation coefficient Cor>0.7, P<0.001). Furthermore, thirty two immune-related lncRNAs previously associated with the prognosis of melanoma were screened by univariate COX regression analysis ( Table 2).

Establishment of a prognostic model
Multivariate analysis was performed on the aforementioned 32 immune-related lncRNAs. Eight of 32 lncRNAs were finally identified based on the AIC value, and they were used to construct the prognostic model ( Table 3). The risk score of each patient was calculated according to the expression level of each lncRNA and its corresponding regression coefficient. Patients were divided into a high-risk (n = 223) and a low-risk group (n = 224), based on the median risk score. The survival rate was significantly different between two groups (P < 0.001, Figure 2A). The 5-year survival rate of patients in the highand low-risk groups was 42.7% and 73% respectively. The risk score of each patient and the expression levels of eight immune-related lncRNAs are shown in Figure 2B.

Verification and validation of the prognostic model
To further evaluate the ability of the prognostic model in stratification of patients with different TNM stages, we performed survival analyses and found that the model could effectively differentiate the prognosis among patients of stage I (P<0.001), stage II (P=0.023), stage III (P<0.001), and stage IV (P=0.013) ( Figure 3A). The area under the ROC curve (AUC) of the prediction model was 0.784 ( Figure 3B). In addition, the gene expression profiles of 95 melanoma patients derived from two public datasets (44 samples from GSE98394 and 51 samples from GSE91061) were mined to extract the expression levels of eight lncRNAs and validate the prognostic model using the same algorithm. Of significance, we also observed significant differences in survival between the two risk groups ( Figure 3C). The area under the ROC curve (AUC) of the validation dataset was 0.687 ( Figure 3D). Our validation analysis demonstrated that this prognostic model can be extended and applied to multiple panels of melanoma cancer patients.

Immune characteristics of high-and low-risk groups
By principal component analysis (PCA) we found that there were significant differences in the expression profile of immune-related genes within the high-and low-risk groups ( Figure 4A, 4B). Functional annotation was further performed by GSEA, revealing that differentially expressed genes between the two groups were enriched in immune system process and immune response pathway signatures ( Figure 4C, 4D). After performing the ESTIMATE analysis, we observed that the immune score of the low-risk group was higher than that of the high-risk group ( Figure 5A). The low-risk group contained a higher proportion of immune cells and stromal cells; however, the tumor purity is relatively low (P<0.001) ( Figure 5B). In the low-risk group, CD8 + T cells were relatively abundant, whereas a lower proportion of M0 macrophages were detected ( Figure 5C). Moreover, GO and KEGG enrichment analysis revealed that the biological functions associated the low risk group were mainly concentrated in immune related functions (Table 4).

DISCUSSION
The immune system is involved in recognizing and eliminating tumor cells. They can evade this surveillance through immune escape and immunosuppression; indeed, an abnormal immune response is closely associated with tumor development [9]. In recent years, studies have found that lncRNAs play an important regulatory role in the immune response, especially through the following mechanisms: i) they regulate hematopoietic stem cells' differentiation in the bone marrow into specific mature immune cell populations [10]; ii) they are involved in the peripheral differentiation of dendritic cells, NK cells and innate lymphocytes [11]; iii) they modulate the immune response by controlling the expression of immune-related AGING genes, toll-like receptors (TLRs), and cytokine receptors [12]. iv) they regulate differentiation and activation of T cells and B cells [13]; v) they participate in the activation of autophagy, and other inflammation-associated processes [14]. Immune-related lncRNAs have predictive value for survival and prognosis of a variety of tumors and are potential targets for cancer treatment [15,16].
In this study, we performed a genome-wide analysis of expression data derived from 470 melanoma patients found in TCGA database, and explored the relationship between immune-related lncRNAs and prognosis of melanoma through survival analysis and Cox regression model. Among the eight immune-related lncRNAs selected to construct the prognostic model, MIR205HG, AC004847.1 and AL590764.1 were poor outcome risk genes, while U62317.1, USP30-AS1, AL133371.2, AL365361.1 and HLA-DQB1-AS1 were protective genes. Previous studies have identified MIR205HG as a potential therapeutic target, as it can promote tumor development and progression by regulating gene transcription [17]; while other lncRNAs have not been characterized previously by relevant clinical or basic studies, and their mechanism of action is still unclear, warranting future investigation.
The incidence and clinical features of melanoma, such as pathological classification, anatomical site and prognosis, are significantly different in different ethnic groups [18]; and the prognosis of patients defined at the same stage is also different. Therefore, a better understanding of the prognostic factors of melanoma is needed. In the proposed model, patients were divided into high-and low-risk groups, and the survival rate between the two groups showed statistically significant differences. AUC (area under the ROC curve) of this prognostic model was 0.784, which was significantly higher than those of the other clinicopathological factors, such as age, gender, T stage, lymph node involvement, distant metastasis and clinical stage. Furthermore, this prognostic model can effectively stratify the risk of melanoma patients belonging to different stages, which indicate that it can be used as a useful complement to the TNM staging system.
Immune checkpoint inhibitors are used as a major therapeutic option for melanoma patients in advanced stage; however, their curative effect depends on the immunogenicity of the tumor. At present, the predictive markers of ICIs in melanoma treatment include the expression level of PD-L1 in tumor cells, the tumor mutation burden (TMB), the presence of tumor infiltrating lymphocytes (TIL), the presence of insertion and deletion (indel) mutations, the number of POLE mutations, specific gut microbiota, etc. [19]. Of course, the characteristics of the tumor microenvironment and of infiltrating lymphocytes play a major role in the process of anti-tumor immunity [20]. In this study, we found that the tumor       involved in inhibition of the recruitment and activation of T cells, thereby leading to resistance to ICI drugs. Recently, many ongoing clinical trials focus on therapies that inhibit the proliferation or polarization of TAMs to further enhance the anti-tumor immune response [22]. Consistent with previous studies, we found that the low-risk group with higher proportion of CD8 + T cells and lower proportion of M0 macrophages had better prognosis. GO and KEGG enrichment analyses showed that immune-related functions were represented mostly in this group. There were significant differences in terms of expression levels of the immune-related genes in two groups. Of note, the immune score and the PD-L1 expression levels were higher in the low-risk group. Based on the GSEA, the low-risk group contained pathways related to immune response and immune system process. Thus, the low-risk group might be more suitable for immunotherapy as a result of high immunogenicity. Nevertheless, the mechanism is still unclear. Eventually, the response rate to ICIs in different risk groups need to be further evaluated in future studies to validate our findings.

AGING
In summary, we analyzed the relationship between immune-related lncRNAs and survival status of melanoma patients and constructed a model to predict the prognosis of patients. There were significant differences in immune status between the high-risk and the low-risk patient group, which may also be applied as a biomarker to evaluate their suitability for immunotherapy. The establishment of this model may thus become convenient for clinicians to choose the most appropriate treatment. We validated this model using two public datasets and their corresponding survival data. Although we observed significant statistical difference of survival outcomes between the high-and low-risk groups, the relatively low frequency of melanoma cancer patients limits the opportunities of validation using additional cohorts. The application value of this proof-of-concept study could be further validated by multi-centric clinical studies including high number of patients.

Data collection
The gene expression profile and corresponding prognosis information of 470 patients with melanoma derived tumor and normal tissues were downloaded from the TCGA database (The Cancer Genome Atlas, https://cancergenome.nih.gov/). Patients with the survival time less than 30 days were excluded from further analyses because they may have died of other fatal complications. Ultimately, 447 patients were enrolled in our research and each tumor case corresponded to one independent patient. Data collection time: October 1, 2019.

Extraction of immune-related lncRNAs
Methods for the extraction of immune related lncRNAs have been described in previous studies [23,24]. The immune-related gene sets (Immune system process M13664, Immune response M19817) were downloaded from GSEA web site (http://software.broadinstitute.org/ gsea/index.jsp) [25]. We obtained the expression levels of immune genes and lncRNAs in each sample, followed by identification of a co-expression cohort of immune-related lncRNAs through Pearson's correlation analysis by the cor.test function of R (correlation coefficient Cor>0.8, P<0.001).

Prognostic model construction and validation
The significance of lncRNAs for survival was analyzed using the Cox proportional hazards model by the Survival package of R (3.5.2) software. Univariate Cox analysis showed that expression of 32 lncRNAs were correlated with survival. Multivariate Cox analysis of these 32 lncRNAs revealed 8 lncRNA with significant prognostic value. We fitted all models with different variable combinations, and used Akaike Information Criterion (AIC) to evaluate goodness of fit (GOF) of each model. Finally, we selected AIC to build the model with the smallest variable. The risk score of each patient was determined according to the lncRNA expression level and the regression coefficient (β) of the weighted linear combination in the multivariate analysis. The calculation formula of risk score for each patient is as follows: Risk score =βgene1×expr(gene1)+ βgene2×expr(gene2)+...+ βgeneN×expr(geneN), (expr: lncRNA expression level), β: regression coefficient). The median risk score was used to divide the group into a low-risk and a high-risk group. To evaluate the accuracy of the prognostic model, the same algorithm was applied in the validation sets (GSE91061 and GSE98394) with survival outcomes. Melanoma patients' mRNA sequencing data from GSE91061 and GSE98394 were downloaded using the "prefetch" software in SRA format. The SRA files were converted to fastq format data by "fastq-dump" program. "fastqc" was used to control quality. "trim_galore" was used to cut adapt. "hisat2" and "samtools" were used to generate bam files. "featureCounts" was used to get the counts of each genes. After including the clinical survival data, 44 samples from GSE98394 and 51 samples from GSE91061 were screened out for next analysis. "SVA" package in R program was used to move the batch effect in combining the two datasets.

Tumor component assessment
We use the CIBERSORT analytical tool to identify the relative percentages of 22 types of tumor infiltrating immune cells (TIICs) with normalized gene expression data [26]. The gene sets of twenty nine immune markers were defined according to the function of the immune genome [27]. The ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumors using Expression data) program was used to evaluate the immune score, stromal cell content, and tumor purity of each sample [28].

Statistical analysis
Overall survival (OS) was defined as follows: the time passed from the date of diagnosis to the end of the follow-up period, to the date of death from any cause, or to the date when patient cannot be followed-up anymore. Kaplan-Meier curves were plotted to estimate overall survival; and the log rank test was performed to evaluate statistical significance of differences in survival. Univariate and multivariate analysis was performed by Cox proportional hazards model. ROC curves and area under the curve (AUC) were calculated to determine their predictive value. Principal component analysis (PCA) was apply to observe the clustering of immune-related genes belonging to the two groups. All statistical analyses were carried out using R (3.5.2) software and P<0.05 (bilateral) was defined as a statistical difference.

AUTHOR CONTRIBUTIONS
YW and HB performed study conception and design, acquisition of data, analysis and interpretation of data, critical revision and drafting the manuscript; XW and MZ were involved in acquisition of data, analysis and interpretation of data; CK and LT were involved in interpretation of data and critical revision; LW and HY were involved in study conception and design, interpretation of data, and drafting of manuscript.