Identification of long noncoding RNA RP11-89K21.1 and RP11-357H14.17 as prognostic signature of endometrial carcinoma via integrated bioinformatics analysis

Endometrial carcinoma (EC) is one of the most common gynecological malignancies. The potential functions and mechanisms of long noncoding RNAs (lncRNAs) in the occurrence and progression of EC remain unclear, so identifying a lncRNA signature with prognostic value for EC is worthwhile. Differentially expressed lncRNAs and their prognostic values in EC were investigated based on The Cancer Genome Atlas (TCGA) database; the transcription factors (TFs), competing endogenous RNA (ceRNA) mechanism, functional regulatory network and immune infiltration of RP11-89K21.1 and RP11-357H14.17 were further explored with various bioinformatics tools and databases. We found for the first time that high expression of RP11-89K21.1 and RP11-357H14.17 was closely associated with shortened overall survival (OS) and poor prognosis in patients with EC. We also elucidated the networks of transcription factors and co-expressed genes associated with RP11-89K21.1 and RP11-357H14.17. Furthermore, a ceRNA network was constructed from 2 lncRNAs (RP11-89K21.1 and RP11-357H14.17), 11 miRNAs and 183 mRNAs. Functional enrichment analysis revealed that the target genes of RP11-89K21.1 and RP11-357H14.17 were strongly associated with microRNAs in cancer, vessel development, growth regulation, growth factor and cell differentiation, and were involved in pathways including pathways in cancer, microRNAs in cancer and the apoptotic signaling pathway. We demonstrated for the first time that RP11-89K21.1 and RP11-357H14.17 may play crucial roles in the occurrence, development and malignant biological behavior of EC, and can be regarded as potential prognostic biomarkers for EC.


Cancer Cell International, Open Access
*Correspondence: linbei88@hotmail.com. 2 Key Laboratory of Maternal-Fetal Medicine of Liaoning Province, Key Laboratory of Obstetrics and Gynecology of Higher Education of Liaoning Province, Liaoning, China. Full list of author information is available at the end of the article.

Background
Endometrial carcinoma (EC) is one of the most common malignancies of the female reproductive system, accounting for 20% to 30% of all malignant tumors of the female genital tract. In some developed countries, the incidence of EC is higher than that of cervical cancer, ranking first among gynecological tumors [1]. In recent years, the incidence of EC in China has increased year by year and now ranks second among gynecological cancers, with a trend toward younger patients, and it severely threatens the physical and mental health of women [2]. At present, the treatment of endometrial cancer includes surgery, radiotherapy and chemotherapy, but patients with advanced EC may develop distant metastasis and postoperative recurrence and have a poor prognosis. Therefore, exploring potential biomarkers closely associated with the occurrence and development of EC is of great value for early diagnosis and targeted therapy of endometrial carcinoma.
Long noncoding RNAs (lncRNAs) are noncoding RNAs more than 200 bp in length that have no or limited protein-coding function and lack a specific, complete open reading frame; they were first discovered in mice in 2002 [3]. As important regulators of transcription and translation, lncRNAs are involved in physiological and pathological processes, including chromatin remodeling, transcription, post-transcriptional translation, cell proliferation, differentiation and metabolic reprogramming [4,5], and also play a pivotal role in the occurrence and development of malignant tumors [6]. Abnormal expression of lncRNAs can affect the development and progression of many kinds of malignant tumors, such as prostate cancer [7], ovarian cancer [8], breast cancer [9] and gastric cancer [10]. In recent years, a variety of lncRNAs have been identified as essential for the initiation, progression and malignant behaviors of endometrial carcinoma [11]. High expression of MALAT1 [12], HOTAIR [13] and NEAT1 [14] was closely associated with poor prognosis of EC and promoted the proliferation, metastasis and EMT of endometrial cancer cells. Other studies have shown that the expression of MEG3 [15] and FER1L4 [16] was decreased in EC, and that high expression of MEG3 and FER1L4 inhibited the proliferation, migration and invasion of endometrial cancer cells. These studies suggest that lncRNAs play a crucial role in the prognosis and malignant biological behaviors of EC.
In this study, we investigated the differentially expressed lncRNAs in EC based on The Cancer Genome Atlas (TCGA) database and identified two lncRNAs, RP11-89K21.1 and RP11-357H14.17, together with their correlation with the occurrence, development, prognosis and functional regulatory network of EC. We also explored the upstream transcriptional regulatory factors, co-expressed genes and binding proteins of these lncRNAs and their relationship with immune infiltration. Furthermore, we explored their potential roles and molecular mechanisms in EC using the competing endogenous RNA (ceRNA) (lncRNA-miRNA-mRNA) hypothesis, which may provide a new strategy for early diagnosis and treatment of endometrial carcinoma.

Screening for differentially expressed lncRNAs by circlncRNAnet
CirclncRNAnet (http://120.126.1.61/circlnc/index.php) [17] is an online tool for exploring lncRNA and circRNA chip or sequencing expression data, integrated with more than 20 tumor types in the TCGA database. It includes several analysis modules, such as heat map, box plot, co-expression scatter plot, Circos plot, gene functional enrichment analysis, RNA-binding protein (RBP) network and miRNA network. We explored the differentially expressed lncRNAs in EC based on the Uterine Corpus Endometrial Carcinoma (UCEC) cohort with circlncRNAnet, and constructed a Circos map and heat map of lncRNA co-expressed genes. Pearson's correlation analysis was used to identify significant correlations between co-expressed genes and these lncRNAs (default: |r| > 0.5). A total of 10,978 lncRNAs were identified in the cohort, of which 121 were dysregulated (77 upregulated and 44 downregulated). The screening criteria were |log2 fold change| > 4 and P < 0.01.
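The screening step described above can be sketched in a few lines of Python. This is a minimal illustration and not circlncRNAnet's implementation: the record layout (`name`, `log2fc`, `p`), the function names and the toy values are all assumptions made for the example; only the cutoffs (|log2 fold change| > 4, P < 0.01, |r| > 0.5) come from the text.

```python
import math

def screen_lncrnas(records, lfc_cutoff=4.0, p_cutoff=0.01):
    """Split records into up- and downregulated lncRNA names.

    Each record is a dict with the hypothetical keys 'name', 'log2fc', 'p'.
    """
    up, down = [], []
    for rec in records:
        if rec["p"] >= p_cutoff or abs(rec["log2fc"]) <= lfc_cutoff:
            continue  # fails P < 0.01 or |log2FC| > 4
        (up if rec["log2fc"] > 0 else down).append(rec["name"])
    return up, down

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def is_coexpressed(x, y, r_cutoff=0.5):
    """Apply the |r| > 0.5 co-expression threshold."""
    return abs(pearson_r(x, y)) > r_cutoff

# Toy values, invented for illustration only.
toy = [
    {"name": "lnc_A", "log2fc": 5.2, "p": 0.001},
    {"name": "lnc_B", "log2fc": -4.6, "p": 0.004},
    {"name": "lnc_C", "log2fc": 4.8, "p": 0.03},    # P too large
    {"name": "lnc_D", "log2fc": 2.1, "p": 0.0001},  # fold change too small
]
up, down = screen_lncrnas(toy)
print(up, down)
```

Both conditions must hold at once, so a lncRNA with a very small P value but a modest fold change (lnc_D in the toy data) is not counted as dysregulated.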

The prognosis of dysregulated lncRNAs analyzed with GEPIA and Kaplan-Meier plotter
GEPIA (http://gepia.cancer-pku.cn/) [18] is a web-based tool that contains sequencing expression data from 9736 tumor samples of 33 cancer types and 8587 normal samples. The database includes a variety of analysis modules such as differential gene expression analysis, survival and prognosis analysis, correlation analysis and dimensionality reduction analysis. In this study, the GEPIA database was used to further analyze the expression and prognostic value of differentially expressed lncRNAs in UCEC. Expression analysis was performed by one-way ANOVA with the cutoffs |Log2FC| > 1 and P value < 0.05; survival analysis used the "median" group cutoff and reported the hazard ratio (HR) with its 95% confidence interval. The Kaplan-Meier (KM) Plotter (http://kmplot.com) [19] is an effective tool for assessing the prognosis of patients with tumors. According to the expression of lncRNAs, patients with EC were divided into two groups: a high expression group and a low expression group. The hazard ratio (HR) with 95% confidence interval and log-rank P-values were also calculated online. The filter conditions were as follows: cancer: pan-cancer RNA-seq (Uterine corpus endometrial carcinoma); survival: overall survival (OS); follow-up threshold: 120 months.
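The survival comparison described above (median split, Kaplan-Meier curve, log-rank test) can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the code behind GEPIA or KM Plotter: the function names are hypothetical, events are coded as 1 (death) or 0 (censored), and the P value uses the chi-square distribution with one degree of freedom.

```python
import math
import statistics

def split_by_median(expression):
    """Return indices of the high (> median) and low (<= median) groups."""
    cutoff = statistics.median(expression)
    high = [i for i, v in enumerate(expression) if v > cutoff]
    low = [i for i, v in enumerate(expression) if v <= cutoff]
    return high, low

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, curve, i = 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed  # events and censored cases leave the risk set
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, P value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    o_minus_e, var = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        o_minus_e += d_a - d * n_a / n  # observed minus expected in group A
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = o_minus_e ** 2 / var
    # Survival function of chi-square with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

A typical use would be to call `split_by_median` on the lncRNA expression values, pass the follow-up times and event indicators of each group to `logrank_test`, and plot the two `kaplan_meier` curves side by side.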

The cellular localization of lncRNAs
UCSC (https://genome-asia.ucsc.edu/index.html) [20] provides a web-based interface that helps users browse gene information, view genome annotation assemblies and download gene sequences. LNCipedia (https://lncipedia.org) [21] is a freely available annotated database of human lncRNA transcript sequences and structures, which uses secondary structure information to establish a standard and unified classification and naming system. The tool offers insights into functions of over 1500 human lncRNAs, including evaluation of coding ability and prediction of open reading frames and secondary structure. LncLocator (https://www.csbio.sjtu.edu.cn/bioinf/lncLocator/) [22] is a free public platform that predicts the subcellular localization of lncRNAs with a stacked ensemble classifier. Using only the lncRNA sequence, it quickly predicts the distribution of a lncRNA across 5 subcellular localizations: cytoplasm, nucleus, ribosome, cytosol and exosome. In this study, lncRNA sequence information was obtained from the UCSC and LNCipedia databases, and the cellular localization of the lncRNAs was then predicted by LncLocator.

Prediction and expression of candidate miRNA with AnnoLnc and starBase database
AnnoLnc (http://annolnc.cbi.pku.edu.cn) [23] is a web interface for systematically annotating newly identified human lncRNAs based on more than 700 data sources. The systematic annotation of lncRNAs covers a wide range of features, including genome location, secondary structure, expression pattern, transcriptional regulation, miRNA interaction, protein interaction, genetic association and evolution. StarBase (http://starbase.sysu.edu.cn/) [24] is a widely used open-source platform for exploring ncRNA interactions based on 10,882 RNA sequencing and 10,546 miRNA sequencing data sets from 32 cancer types; the platform can be used to perform survival and differential expression analysis of miRNAs and lncRNAs. We predicted lncRNA-binding miRNAs with AnnoLnc and further explored the expression of these miRNAs in UCEC with starBase.

Protein-protein interaction network and transcriptional regulatory network
GeneMANIA (http://www.genemania.org) [25] is a flexible and user-friendly platform that can predict gene function, analyze gene lists and prioritize genes for functional assays. It supports three main use cases: single gene queries, multiple gene queries and network search. The online tool can be used to construct protein-protein interaction (PPI) and protein-DNA interaction networks and to investigate potential signaling pathways, gene and protein expression, and protein domains. We explored lncRNA-related proteins and transcriptional regulatory molecules with AnnoLnc, and visualized the functions and regulatory networks of these molecules using GeneMANIA.

Construction of lncRNA-miRNA-mRNA regulatory network
miRTarBase (http://mirtarbase.mbc.nctu.edu.tw/php/index.php) [26] is a popular web interface that collects miRNA target genes verified by different validation experiments and provides supporting evidence from the literature or assays. The database can be queried by different categories, such as miRNA, gene, disease and pathway. Cytoscape [27] is a powerful software package for visualizing and analyzing network data, which allows users to construct complex biological networks. Nodes and edges are the two core elements of a network diagram constructed by Cytoscape. miRNA target genes were retrieved from miRTarBase, and only genes verified by at least one strong experimental method (reporter assay, Western blot or quantitative reverse transcription PCR) were accepted as miRNA targets. Cytoscape 3.7.1 was then used to construct the competing endogenous RNA (ceRNA) network (lncRNA-miRNA-mRNA).
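The network assembly described above amounts to joining two relations: lncRNA-miRNA binding pairs and validated miRNA-mRNA target pairs. The sketch below shows one way to produce an edge list that Cytoscape can import as a SIF file. The function names are hypothetical, and `TARGET_X` and `TARGET_Y` are placeholders rather than real targets; the lncRNA-miRNA pairs and the miR-503/CCND1 pair are taken from the Discussion of this article.

```python
def build_cerna_edges(lnc_to_mirnas, mirna_to_targets):
    """Join lncRNA-miRNA and miRNA-mRNA pairs into ceRNA network edges.

    miRNAs without any validated target are dropped, because they cannot
    link a lncRNA to an mRNA.
    """
    edges = set()
    for lnc, mirnas in lnc_to_mirnas.items():
        for mir in mirnas:
            targets = mirna_to_targets.get(mir, [])
            if not targets:
                continue
            edges.add((lnc, "lncRNA-miRNA", mir))
            for gene in targets:
                edges.add((mir, "miRNA-mRNA", gene))
    return sorted(edges)

def to_sif(edges):
    """Format edges as tab-delimited SIF lines: source, relation, target."""
    return "\n".join(f"{src}\t{kind}\t{dst}" for src, kind, dst in edges)

# Illustrative input; TARGET_X and TARGET_Y are placeholders.
lnc_to_mirnas = {
    "RP11-89K21.1": ["miR-143", "miR-139-5p"],
    "RP11-357H14.17": ["miR-143", "miR-503"],
}
mirna_to_targets = {
    "miR-143": ["TARGET_X", "TARGET_Y"],
    "miR-503": ["CCND1"],
}
edges = build_cerna_edges(lnc_to_mirnas, mirna_to_targets)
print(to_sif(edges))
```

Storing edges in a set removes duplicates when two lncRNAs share a miRNA, so each miRNA-mRNA edge appears once in the exported network.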

Functional enrichment analysis with Metascape
Metascape (http://metascape.org) [28] is a user-friendly and effective tool for comprehensively annotating and analyzing single or multiple gene lists, which integrates many authoritative database resources such as Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), UniProt and DrugBank. Metascape supports pathway enrichment and biological process annotation as well as construction of protein-protein interaction (PPI) networks. In this study, Metascape was used to analyze the GO and KEGG enrichment of differentially expressed genes related to the lncRNAs. Terms with P < 0.01, a minimum count of 3 and an enrichment factor > 1.5 were considered statistically significant. The PPI enrichment analysis in Metascape was based on the following databases: BioGrid, InWeb_IM and OmniPath. In addition, the Molecular Complex Detection (MCODE) algorithm was applied to identify densely connected network components.
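The three significance criteria above can be made concrete with a small over-representation calculation. The sketch assumes a standard one-sided hypergeometric test and defines the enrichment factor as the ratio of the observed hit fraction to the background fraction; whether this matches Metascape's internal computation exactly is an assumption, and the function names and numbers are illustrative.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K belong to the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def enrichment_factor(k, N, K, n):
    """Observed fraction of term genes in the list over the background fraction."""
    return (k / n) / (K / N)

def is_enriched(k, N, K, n, p_cutoff=0.01, min_count=3, min_ef=1.5):
    """Apply the cutoffs P < 0.01, count >= 3 and enrichment factor > 1.5."""
    return (
        k >= min_count
        and enrichment_factor(k, N, K, n) > min_ef
        and hypergeom_sf(k, N, K, n) < p_cutoff
    )

# Illustrative numbers: 5 of 10 input genes fall in a term that covers
# 10 of 100 background genes.
print(enrichment_factor(5, 100, 10, 10), is_enriched(5, 100, 10, 10))
```

The minimum count guards against terms that look highly enriched only because one or two genes happen to hit a very small gene set.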

ImmLnc
ImmLnc (http://bio-bigdata.hrbmu.edu.cn/ImmLnc) [29] is an online resource for investigating the immune-related functions of lncRNAs across 33 cancer types with high-throughput methods. Users can investigate lncRNA-pathway associations, correlations between lncRNAs and immune cell types, and cancer-related lncRNAs. ImmLnc serves as a valuable resource for exploring lncRNA function and for advancing the identification of immunotherapy targets. In this study, we explored the correlation between lncRNAs and immune cell infiltration with ImmLnc.
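As a rough illustration of what a correlation between lncRNA expression and immune cell infiltration measures, the sketch below computes a Spearman rank correlation coefficient (Rs) in plain Python. It does not reproduce ImmLnc's own pipeline; the function names and input vectors are hypothetical, and the choice of Spearman correlation is an assumption based on the Rs values mentioned in the Discussion.

```python
import math

def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rs(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-sample values: lncRNA expression and an infiltration score.
expression = [0.5, 1.2, 2.8, 3.1, 4.0]
infiltration = [0.30, 0.28, 0.15, 0.12, 0.05]
print(spearman_rs(expression, infiltration))
```

Because the coefficient depends only on ranks, it captures monotonic trends, but a coefficient close to zero says little about regulation, which is why small Rs values call for cautious interpretation.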

Functional enrichment analysis of lncRNA-related targets
To explore the potential functions and mechanisms of RP11-89K21.1 and RP11-357H14.17 in the development of UCEC, GO and KEGG enrichment analysis of  Table 3). RP11-357H14.17 target genes were mainly located in the perinuclear region of the cytoplasm, participated in the cyclin-dependent protein kinase holoenzyme complex, adherens junction and transcription factor complex, and also regulated protein kinase activity, transcription factor binding and kinase binding (Fig. 8a-d, Additional file 1: Table S5, 6). RP11-89K21.1 target genes were mainly involved in biological processes such as blood vessel development, regulation of transferase activity, apoptotic signaling pathway, regulation of cell death and cell differentiation, and response to oxygen levels (Fig. 8e-f and Table 4).
KEGG enrichment analysis showed that RP11-89K21.1 target genes were significantly enriched in pathways in cancer, endocrine resistance, microRNAs in cancer, the apelin signaling pathway, Th17 cell differentiation and the Hippo signaling pathway (Fig. 7g, h, Table 5). RP11-357H14.17 target genes were significantly enriched in pathways in cancer, microRNAs in cancer, the PI3K-AKT signaling pathway, the AGE-RAGE signaling pathway in diabetic complications, cell cycle and the cytokine-mediated signaling pathway (Fig. 8g, h, Table 6). These signaling pathways play key roles in the occurrence and development of a variety of tumors, including endometrial carcinoma. Moreover, to better understand the relationship between RP11-89K21.1, RP11-357H14.17 and UCEC, we performed protein-protein interaction (PPI) enrichment analysis using Metascape (Figs. 7i, 8i). The 10 and 8 most significant MCODE components, respectively, were identified in the PPI networks, and pathway and process enrichment analysis was applied to each MCODE component independently.

Discussion
A series of biological processes are involved in the occurrence and progression of EC, such as abnormal expression of genes and transcription factors, dysregulation of cellular signal transduction pathways and imbalance of cell microenvironment homeostasis. Pathological changes and molecular characteristics determine the level of risk and prognosis of patients with EC. In recent years, lncRNAs have been found to influence various malignant biological behaviors in EC, including differentiation, proliferation, invasion and metastasis [30]. Therefore, it is valuable to explore the potential functions and molecular mechanisms of lncRNAs in EC, which may contribute to prognostic prediction and the identification of therapeutic targets for endometrial carcinoma.
In this study, 121 differentially expressed lncRNAs in UCEC were identified by circlncRNAnet, including 77 upregulated and 44 downregulated lncRNAs. We further showed for the first time that, among these, only high expression of RP11-89K21.1 and RP11-357H14.17 was significantly associated with shortened OS and poor prognosis of patients with UCEC, which suggests that RP11-89K21.1 and RP11-357H14.17 play oncogenic roles in the occurrence and progression of endometrial carcinoma. The expression of lncRNAs has been reported to be regulated by transcription factors [31]. Using AnnoLnc, we found that EZH2 was a common transcriptional regulator of RP11-89K21.1 and RP11-357H14.17 in endometrial carcinoma. Moreover, EZH2 was positively correlated with the expression of RP11-89K21.1 and RP11-357H14.17. Some studies have shown that lncRNA DLEU2 interacted with EZH2 to promote the proliferation, migration and invasion of hepatocellular carcinoma, thus accelerating its malignant progression [32]. In gastric cancer, lncRNA UCA1 enhanced the translation of cyclin D1 by recruiting EZH2 and further promoted the proliferation and cell cycle progression of gastric cancer cells [33]. In lung cancer, EZH2 suppressed the expression of lncRNA-SVUGP2, and this suppression promoted the occurrence and development of lung cancer via the Wnt/β-catenin pathway [34]. These studies suggest that a potential regulatory mechanism between EZH2 and RP11-89K21.1 and RP11-357H14.17 is involved in the occurrence and development of endometrial carcinoma; the specific mechanism remains to be explored and verified.
Studies have shown that lncRNAs are located in different subcellular structures, including the cytoplasm, nucleus, ribosome, cytosol and exosome. The functions and regulatory mechanisms of lncRNAs are closely associated with their subcellular localization. RP11-89K21.1 and RP11-357H14.17 were predicted to be mainly located in the cytosol. Growing evidence suggests that, in the cytoplasm and cytosol, lncRNAs not only regulate the stability and translation of mRNA, but also affect the post-translational modification of proteins and cell signal transduction. Based on the ceRNA hypothesis, a lncRNA can competitively bind miRNAs and act as a miRNA sponge, sequestering miRNAs and thus relieving their inhibition of downstream target genes [35,36]. Some studies have shown that exosomal lncRNA ARSR could competitively bind miR-34 and miR-449 to regulate the expression of AXL and c-MET, and thereby promoted the drug resistance of renal cell carcinoma [37]. In prostate cancer, lncRNA TTTY15 acted as a ceRNA and negatively regulated miR-let-7 to promote expression of its target genes (CDK6, FN1) [38]. In recent years, a growing number of studies have shown that lncRNAs, acting as ceRNAs, regulate the expression of downstream oncogenes and tumor suppressor genes in endometrial carcinoma through a miRNA regulatory mechanism. In endometrial carcinoma, lncRNA HOTAIR facilitated the expression of NPM1 by negatively regulating miR-646, thereby promoting the proliferation, migration and invasion of EC cells [39]. Maziveyi et al. reported that lncRNA TUSC7 promoted the expression of SOCS4 (SOCS5) by acting as a sponge of miR-616, thus inhibiting the proliferation, migration and invasion of endometrial carcinoma [40]. We further explored the potential role of the ceRNA network regulated by RP11-89K21.1 and RP11-357H14.17 in the progression of EC.
The miRNAs binding RP11-89K21.1 and RP11-357H14.17 were retrieved from miRTarBase. Four miRNAs shared by both lncRNAs (miR-27b, miR-4770, miR-143, miR-204) were downregulated in UCEC, as were the other RP11-89K21.1-binding miRNAs (miR-125a-5p, miR-125b-5p, miR-139-5p, miR-670-3p) and RP11-357H14.17-binding miRNAs (miR-24-1-5p, miR-503). miRNAs are involved in a variety of malignant biological behaviors and mechanisms, such as proliferation, invasion and migration of tumors. The expression of miR-27b-3p and miR-204-5p was decreased in EC, and high expression of miR-27b-3p and miR-204-5p could significantly inhibit the proliferation, migration and invasion of EC cells [41,42]. The expression of miR-143 and miR-503 was also downregulated in EC, and high expression of miR-503 inhibited the proliferation and cell cycle of EC cells by negatively regulating CCND1 [43,44]. These findings indicate that RP11-89K21.1 and RP11-357H14.17 may play oncogenic roles in endometrial carcinoma by regulating candidate miRNAs and their target genes. We have thus constructed a new lncRNA-miRNA-mRNA ceRNA regulatory network associated with the prognosis of patients with EC; further experiments are required to verify the molecular mechanisms in this network.
Many studies have shown that lncRNAs affect the initiation and development of tumors by regulating a variety of molecular mechanisms and signaling pathways. LncRNA LSINCT5 was observed to promote proliferation, invasion and metastasis of EC cells by regulating the HMGA2/Wnt/β-catenin signaling pathway [45]. LncRNA OGFRP1 promoted the malignant progression of endometrial carcinoma by regulating the miR-124-3p/SIRT1 axis and activating the PI3K/AKT/GSK-3β pathway [46]. To better clarify the biological function, molecular mechanism and regulatory network of the target genes related to RP11-89K21.1 and RP11-357H14.17 in EC, we carried out GO and KEGG enrichment analysis. KEGG enrichment analysis showed that the target genes related to RP11-89K21.1 and RP11-357H14.17 were significantly enriched in microRNAs in cancer, the cytokine-mediated signaling pathway, the transmembrane receptor protein tyrosine kinase signaling pathway and the apoptotic signaling pathway. These signaling pathways are closely associated with the progression and biological behaviors of EC [47]. Therefore, we speculate that RP11-89K21.1 and RP11-357H14.17 can affect the occurrence, development and biological behavior of EC by regulating these tumor-related pathways, which provides more evidence for further exploring the molecular mechanisms of RP11-89K21.1 and RP11-357H14.17 in endometrial carcinoma.
Accumulating studies have demonstrated that a large number of immune cells and cytokines can be observed in EC, which can enhance the endogenous anti-tumor immune response and affect the prognosis and immunotherapy of EC. Immunotherapy plays a well-established role in the treatment of EC. Our results showed that the expression of RP11-89K21.1 was negatively correlated with CD8_T cells and macrophages and positively correlated with CD4_T cells and neutrophils. The expression of RP11-357H14.17 was negatively correlated with CD8_T cells and positively correlated with CD4_T cells. Studies have shown that CD8_T cells and macrophages can participate in the malignant progression of EC and serve as potential therapeutic targets for EC [48]. CD4_T cells can rapidly initiate an immune response and kill tumor cells directly, or indirectly by stimulating and recruiting CD8_T cells or other immune cells [49]. The number of CD4_T cells in the peripheral blood of patients with EC was significantly increased [50]. Neutrophils are a main type of immune cell in tumors; they eliminate pathogens and protect the host from microbial infection, and play a key role in chemotherapy resistance and anti-angiogenesis therapy of tumors [51]. Some studies have shown that neutrophils are closely correlated with survival and prognosis of patients with EC [52]. These results indicate that the above immune cells are of great value in the occurrence, development and prognosis of EC. However, because the correlation coefficients (Rs values) were small, they are not sufficient to conclude that RP11-89K21.1 and RP11-357H14.17 play a critical role in regulating the infiltration of immune cells in the tumor microenvironment, and further experiments are required to verify the function of RP11-89K21.1 and RP11-357H14.17 in immune infiltration.

Conclusion
In summary, using a series of integrated databases, we demonstrated for the first time that high expression of RP11-89K21.1 and RP11-357H14.17 was closely associated with poor prognosis of patients with EC. We further identified transcriptional regulatory factors, co-expressed genes and interacting proteins of RP11-89K21.1 and RP11-357H14.17. The regulatory networks of biological functions and signaling pathways may help to illuminate the potential function and mechanism of RP11-89K21.1 and RP11-357H14.17 in EC. Moreover, we speculate that the ceRNA network associated with RP11-89K21.1 and RP11-357H14.17 provides novel and valuable insights into the molecular mechanisms underlying the initiation and progression of EC. Therefore, RP11-89K21.1 and RP11-357H14.17 are potential tumor biomarkers for early diagnosis and prognosis evaluation, and potential therapeutic targets in EC. Although the present analysis has limitations, we consider the evidence adequate to conclude that RP11-89K21.1 and RP11-357H14.17 contribute to poor prognosis in EC. Functional experiments on RP11-89K21.1 and RP11-357H14.17 will be conducted in a follow-up study.