A Simple Competing Endogenous RNA Network Identifies Novel mRNA, miRNA, and lncRNA Markers in Human Cholangiocarcinoma

Background Cholangiocarcinoma (CCA) is the second most common malignant primary liver tumor and has shown an alarming increase in incidence over the last two decades. However, the mechanisms behind tumorigenesis and progression remain insufficient. The present study aimed to uncover the underlying regulatory mechanism on CCA and find novel biomarkers for the disease prognosis. Method The RNA-sequencing (RNA-seq) datasets of lncRNAs, miRNAs, and mRNAs in CCA as well as relevant clinical information were obtained from the Cancer Genome Atlas (TCGA) database. After pretreatment, differentially expressed RNAs (DERNAs) were identified and further interrogated for their correlations with clinical information. Prognostic RNAs were selected using univariate Cox regression. Then, a ceRNA network was constructed based on these RNAs. Results We identified a total of five prognostic DEmiRNAs, 63 DElncRNAs, and 90 DEmRNAs between CCA and matched normal tissues. Integrating the relationship between the different types of RNAs, an lncRNA-miRNA-mRNA network was established and included 28 molecules and 47 interactions. Screened prognostic RNAs involved in the ceRNA network included 3 miRNAs (hsa-mir-1295b, hsa-mir-33b, and hsa-mir-6715a), 7 lncRNAs (ENSG00000271133, ENSG00000233834, ENSG00000276791, ENSG00000241155, COL18A1-AS1, ENSG00000274737, and ENSG00000235052), and 18 mRNAs (ANO9, FUT4, MLLT3, ABCA3, FSCN2, GRID2IP, NCK2, MACC1, SLC35E4, ST14, SH2D3A, MOB3B, ACTL10, RAB36, ATP1B3, MST1R, SEMA6A, and SEL1L3). Conclusions Our study identified novel prognostic makers and predicted a previously unknown ceRNA regulatory network in CCA and may provide novel insight into a further understanding of lncRNA-mediated ceRNA regulatory mechanisms in CCA.


Introduction
Cholangiocarcinoma (CCA) is an epithelial cell malignancy arising from different locations within the biliary tree. According to its anatomical location, CCA is classified into intrahepatic (iCCA), perihilar (pCCA), and distal (dCCA) subtypes [1]. Surgical treatment is the preferred option for all subtypes [2]; however, most patients with CCA present with unresectable or metastatic disease. As one of the deadliest cancers, patient prognoses remain poor after systemic chemotherapy [3]. More than 90% of patients die within five years, and the majority of patients survive less than 12 months after diagnosis [4]. Additionally, the incidence of CCA seems to be increasing in many western countries [2], and because of this increase, cumulative CCA mortality rates have also increased by 39% [5]. The increased incidence and mortality have both contributed to the rising interest in this cancer.
In 2011, Leonardo Salmena and colleagues first proposed the famous competitive endogenous RNA (ceRNA) hypothesis, positing that RNAs can engage in crosstalk with one another and manipulate biological functions independently of protein translation [7]. lncRNAs are endogenous noncoding RNAs longer than 200 nucleotides and constitute the largest portion of the human noncoding transcriptome [8].
Recent studies have revealed that lncRNAs can act as ceRNAs to compete with mRNAs for binding to miRNAs like sponges, and thereby regulate the expression level of genes [9,10]. Moreover, this pattern has been extensively demonstrated to play important roles in carcinogenesis and in the development of numerous cancers, including bladder cancer, thyroid 2 BioMed Research International cancer, and gastric cancer [11][12][13]. However, compared with other malignancies, the ceRNA research in CCA remains sparse. Sun et al. discovered that KCNQ1OT1 can act as a ceRNA to improve CCA progression by regulating the miR-140-5p/SOX4 axis [14]. In addition, lncRNA H19, HULC, and TUG1 have also been shown to promote CCA progression as ceRNAs [15,16]. However, the essential biomarkers and mechanisms for the development and progression of CCA remain unclear.
With the rapid development of gene profile and next generation sequencing technology, bioinformatics analysis can provide more valuable information for new research [17]. However, like most bioinformatics studies on CCA, Huang and Zhong's groups have only identified differentially expressed genes (DEGs) between CCA and normal tissues and constructed a protein-protein interaction (PPI) network [18,19]. An analysis of lncRNA and miRNA functions and their interactions is absent. Additionally, bioinformatics studies related to the ceRNA network in CCA are rare. At present, only a few articles [20][21][22] can be found. In contrast to all the aforementioned studies, all of the molecules that we selected in the construction of our network were proven to be differentially expressed and prognosis-related. In addition, we determined the relationships between RNAs from the expression data in the CCA samples rather than in RNA interaction databases. The method of functional enrichment analysis was also different. We built a GO terms network to further visualize the biological process and molecular function of crucial genes. Then, the KEGG, Reactome and WikiPathways were used to enrich the related mRNAs pathways as opposed to the sole use of KEGG, which can help to obtain a more comprehensive analytic result. Therefore, there is some merit to our research in terms of a deeper understanding of CCA.
In the present study, with the aim of identifying the potential ceRNA interactions that contribute to patient survival, we constructed a ceRNA regulatory network that included mRNAs that were dysregulated in CCA as compared to normal tissues. The expression and clinical data were acquired from the Cancer Genome Atlas (TCGA), a publicly funded project that aims to catalogue and discover major cancercausing genomic alterations to create a comprehensive "atlas" of cancer genomic profiles [23]. All RNAs participating in the construction of the network are demonstrated to be dysregulated and prognosis-related. Because of this strict standard, the network appears to be simple; however, compared with networks containing hundreds of molecules, it may provide more reliable clues for ongoing basic research.
. . Identification of Differentially Expressed RNAs. RNAs with average expression values greater than 1 were considered for further analysis. lncRNAs and mRNAs are defined based on the annotations of the GeneCards database (https://www.genecards.org/). We used the R and Bioconductor packages of DESeq2 and limma to explore the significantly differentially expressed mRNAs (DEmRNAs), lncRNAs (DElncRNAs), and miRNAs (DEmiRNAs) between CCA tissues and normal tissues. Fold-change was used to measure the change degree. For p-values, the false discovery rate (FDR) was applied to determine the significance threshold for multiple tests. The cut-off value was |log 2 FoldChange| > 2 for mRNA and lncRNA, |log 2 FoldChange| > 1.5 for miRNA. FDR < 0.05 was considered significant.
. . Survival Analysis. Univariate Cox regression was performed to evaluate the association between survival and the expression levels of DEmRNAs, DElncRNAs and DEmiRNAs. Then, an intermediate expression value was used as the cut-off point and the patients were divided into high-expression and low-expression groups. Survival curves were generated using the Kaplan-Meier method, and the log-rank test was used to compare the differences between groups. P < 0.05 was the threshold for significance. DEmRNAs, DElncRNAs, and DEmiRNAs related to survival were considered as prognostic DEmRNAs, prognostic DElncRNAs, and prognostic DEmiRNAs, respectively.
. . Functional Enrichment Analysis. Generally, lncRNAs and miRNAs do not encode proteins but their functions are associated with related protein coding genes. To understand the underlying biological mechanisms, Gene Ontology (GO) annotation and pathway analyses of prognostic DEmRNAs were conducted by DAVID (https://david.ncifcrf.gov/), Kyoto Encyclopedia of Genes and Genomes (KEGG, https://www.kegg.jp/), Reactome (https://reactome.org/), and WikiPathways (https://www.wikipathways.org/). GO terms and pathway analyses with P<0.05 were considered to be significantly enriched functional annotations. The relationships between GO terms and related genes were visualized by Cytoscape.

Results
. . Identification of DElncRNAs, DEmRNAs, and DEmiR-NAs in CCA Patients. The lncRNA, mRNA, and miRNA expression data of 36 CCA samples and 9 adjacent normal samples were downloaded from the TCGA project. After annotating different RNA types, we identified a total of 7783 lncRNAs, 17685 protein coding mRNAs, and 370 miRNAs with average expression values greater than 1. Under the criteria of differential analysis, as described above, 3657 DEmRNAs (2287 upregulated and 1370 downregulated), 1622 DElncRNAs (970 upregulated and 652 downregulated), and 66 DEmiRNAs (23 upregulated and 43 downregulated) were significantly dysregulated. Heat maps of the clustering analysis of these RNAs are presented in Figures 1-3 and show that the three kinds of RNAs can well-distinguish tumor tissues from normal tissues.
. . Functional Enrichment Analysis of Prognostic DEmRNAs. After performing prognostic DEmRNAs gene ontology analysis (GO) with DAVID, we found that these genes were classified into the following three functional groups: the molecular function group, the biological process group, and the cellular component group. As shown in Table 5, in the biological process group, genes were mainly enriched in the regulation of cell proliferation, cell motion, cell migration, and reproductive cellular process. In the molecular function group, genes were mainly enriched in hormone activity, endopeptidase inhibitor activity, and growth factor activity.
In the cellular component group, genes were mainly enriched in extracellular region. The results are further visualized by a network ( Figure 5). Next, pathway enrichment was performed using KEGG, Reactome, and WikiPathways. We found that target genes were associated with salivary secretion, prolactin signaling, and complexed PIWIL4 binding to cleaved transposon RNA (Table 6).

Discussion
Cholangiocarcinoma (CCA), which emerges from the malignant proliferation of cholangiocytes, is an epithelial tumor of the biliary trees [24]. Concealed symptoms, a lack of early diagnostic markers, and low sensitivity to conventional radiotherapy and chemotherapy treatments are the crucial causes underlying poor prognoses for CCA patients [25]. Due to high morbidity and mortality in CCA, revealing the underlying molecular mechanisms, identifying molecular biomarkers for early diagnosis, prevention, and personalized therapy are critically important and highly demanded.
Currently, increasing studies have demonstrated the crucial biological functions of ceRNAs involved in the development and progression of many cancers. Additionally, the ceRNA hypothesis suggests a novel regulatory mechanism that can be mediated by lncRNAs. For example, the lncRNA XIST promotes the epithelial to mesenchymal transition of retinoblastoma via sponging miR-101 [26]. Moreover, Dong et al. found that knockdown of the lncRNA HOXA-AS2 could suppress chemoresistance in acute myeloid leukemia via the miR-520c-3p/S100A4 axis [27]. With the deeper understanding of ceRNA network, we can no longer regard miRNAs, lncRNAs, or mRNAs as separate elements in disease. Following the expression change of the affected miRNAs, the expression of target genes regulated by miRNAs will also change, leading to the occurrence of a variety of diseases, including tumors. Besides, the mechanism of occurrence and progression of cancer is too complex and the effect of single gene or single pathway is very limited. Therefore, systematic construction and analysis of ceRNA network can provide us with a more specific area to research and a new perspective on the underlying mechanism of cancer. However, few studies have described the interactions between different types of RNAs in CCA, and ceRNA network research has been rarely reported.
In our study, we chose three prognosis-related miRNAs (hsa-mir-1295b, hsa-mir-33b, and hsa-mir-6715a) to build the ceRNA network. hsa-mir-33b has been reported to be involved in multiple types of human cancers. For example, Tian et al. reported that hsa-mir-33b is downregulated in hepatocellular carcinoma and suppresses the proliferation and metastasis of HCC cells through the inhibition of Sal-like protein 4 (SALL4) [28]. Moreover, hsa-mir-33b can inhibit lung adenocarcinoma cell growth, invasion, and epithelialmesenchymal transition by suppressing Wnt/ -catenin/ZEB1 signaling [29]. hsa-mir-33b has also been shown to be dysregulated and prognosis-related in ovarian cancer [30]. However, a report of hsa-mir-33b has not been made for CCA, and the underlying regulatory mechanisms of this important molecule remain largely unclear. In addition, hsamir-1295b was found to be significantly downregulated in colorectal cancer [31,32], and to the best of our knowledge,  no research on hsa-mir-1295b has been reported for any other type of cancer. Regarding hsa-mir-6715a, we could not identify any report related to cancer. Therefore, it is necessary to further research these miRNAs. Our study demonstrates that low-expression levels of hsa-mir-33b, hsa-mir-1295b, and hsa-mir-6715a are associated with poor OS in CCA patients, and these miRNAs are candidate novel therapeutic targets that require further fundamental research.
Although COL18A1-AS1 has been reported to act as a biomarker involved competitive endogenous RNA in clear cell renal cell carcinoma [33,34], little is known about the function of COL18A1-AS1 in CCA, and research on other lncRNAs in CCA is scarce. Therefore, further investigations are needed to clarify the functions of these lncRNAs in CCA and other cancers.
In the present study, 90 prognostic DEmRNAs were identified based on univariate Cox regression analysis. GO enrichment and pathway analyses were performed to further study the biological functions of these genes. It was found that these genes were mainly involved in regulating cell proliferation, cell motion, cell migration, and reproductive   [35][36][37]. Further studies are required, and our prediction of these functional mechanisms may provide a foundation for future research.
After systematically analyzing the expression data in CCA tissues and normal tissues using TCGA, we identified prognostic tumor-specific miRNAs, lncRNAs, and mRNAs and constructed a ceRNA network with these genes. This network can prompt us to explore the regulation of ceRNAs in CCA further. For example, metastasis-associated in colon cancer 1 (MACC1), identified in colon cancer patients for the first time, has been found to play multiple important roles in many malignancies, such as hepatocellular cancer and lung cancer [38][39][40]. Lederer et al. found that MACC1 expression in hilar cholangiocarcinoma tissue was significantly higher than in corresponding normal tissue. They also discovered that patients with high MACC1 had a significantly shorter overall and disease-free survival [41]. However, despite the important clinical significance, the underlying mechanism of MACC1 remains largely unknown. In our network, MACC1 was also proved to be prognosis-related and we obtained a meaningful lncRNA-miRNA-mRNA regulatory axis involved MACC1 (COL18A1-AS1-hsa-mir-1295b-MACC1). The prediction of this axis can provide us with a clue that COL18A1-AS1 and hsa-mir-1295b may participate in the protumor process of MACC1 and these molecules may become novel diagnostic biomarkers and therapeutic targets for CCA. Researchers can carry out further fundamental studies to validate this axis. Consequently, through this study, we can not only obtain important lncRNAs, miRNAs, and mRNAs related to the occurrence and progression of CCA but also further understand its underlying mechanism, which will contribute to the early diagnosis, treatment, and prognosis of CCA.
Although several similar studies have been performed, our research still provides some novel insights. Wan et al. [20] constructed a ceRNA network based on DEmRNAs, DElncRNAs, and all miRNAs. miRNAs involved in the network did not prove to be differentially expressed. Moreover, in contrast to all other ceRNA network bioinformatic studies [20][21][22], to the best of our knowledge, all the RNA molecules that we selected in the construction of our network were differentially expressed and prognosis-related. In addition, we acquired the interactions between RNAs from the expression profiles of CCA tissues rather than RNA interaction databases. The relationship between different types of RNAs was confirmed to be accordant with their expression levels in CCA samples, which adds credibility to our network. We also built a GO terms network to further visualize the biological process and molecular functions of crucial genes. The KEGG, Reactome, and WikiPathways databases, rather than the KEGG database alone, were used to enrich the related mRNA pathways, which confers our study with more comprehensive analytic results. Finally, under our strict standard, we constructed a simple network containing fewer RNAs than other studies. However, compared with networks including hundreds of molecules, our network may provide a more reliable and precise basis for further fundamental research.
There are several limitations to our study. First, we only acquired 36 tumor samples and nine nontumor samples. This small sample number may introduce false positives and reduce the reliability of our findings. Second, all of our samples and clinical data were based on TCGA. Another independent database that includes large-scale multicenter clinical samples should be used to verify our findings. Third, a lack of biological validation is present in our study. Our future research will focus on validating these RNAs in CCA and uncovering the molecular mechanisms of ceRNAs in vivo and in vitro to validate our discovery.
In conclusion, we identified prognostic tumor-specific miRNAs, lncRNAs, and mRNAs and constructed a ceRNA network with these genes. Our study predicted a previously unknown ceRNA regulatory network and may provide novel insights into and a better understanding of the development and progression of CCA.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare no conflicts of interest for the publication of this manuscript.