Identification of the Key MicroRNAs and the miRNA-mRNA Regulatory Pathways in Prostate Cancer by Bioinformatics Methods

Objective To identify key microRNAs (miRNAs) and their regulatory networks in prostate cancer. Methods Four miRNA and three gene expression microarray datasets were downloaded for analysis from Gene Expression Omnibus database. The differentially expressed miRNA and genes were accessed by a GEO2R. Functional and pathway enrichment analyses were performed using the DAVID program. Protein-protein interaction (PPI) and miRNA-mRNA regulatory networks were constructed using the STRING and Cytoscape tool. Moreover, the results and clinical significance were validated in TCGA data. Results We identified 26 significant DEMs, 633 upregulated DEGs, and 261 downregulated DEGs. Functional enrichment analysis indicated that significant DEGs were related to TGF-beta signaling pathway and TNF signaling pathway in PCa. Key DEGs such as HSPA8, PPP2R1A, CTNNB1, ADCY5, ANXA1, and COL9A2 were found as hub genes in PPI networks. TCGA data supported our results and the miRNAs were correlated with clinical stages and overall survival. Conclusions We identified 26 miRNAs that may take part in key pathways like TGF-beta and TNF pathways in prostate cancer regulatory networks. MicroRNAs like miR-23b, miR-95, miR-143, and miR-183 can be utilized in assisting the diagnosis and prognosis of prostate cancer as biomarkers. Further experimental studies are required to validate our results.


Introduction
Prostate cancer (PCa) is a major cancer bothering men worldwide. With an estimated 19% new cancer incidence in men, PCa continues to hold the highest morbidity among all cancer types in the United States [1]. During the past decades, urologists have made great progress in the diagnosis and treatment of PCa [2]. Currently, the new types of biomarkers such as liquid biopsy and imaging biomarker start to make contributions to clinical decision making [3][4][5]. However, PCa is a heterogeneous cancer with individual disparity; therefore, genomic and molecular resources still play an important role in discriminating aggressive cases and aiding prognosis prediction [6].
MicroRNA (miRNA), also known as short noncoding RNA, is a group of single-stranded RNA molecules that act as posttranscriptional gene regulation [7]. MiRNAs are involved in many cellular biological processes such as proliferation, migration, apoptosis, invasion, and epithelial-mesenchymal transition (EMT). Accumulating evidence shows that miR-NAs can functionally act as oncogene or tumor suppressors [8]. In addition, circulating miRNA can be detected without invasion like biopsy [9]. Hence, miRNAs can be applied to help diagnosis and prognosis. Multiple studies of miRNA-PCa have been performed, but they often had the disadvantages of heterogeneous design, small sample size, and less clinical information.
To the best of our knowledge, there are few studies integrating microarray datasets to access key molecules and investigate miRNA-mRNA regulatory networks. The objective of this study is to find the key miRNAs and their potential regulatory mechanisms in PCa by bioinformatic approaches.   GSE26367  24713434 GPL11350  n=173  n=11  microRNA  GSE64333  26089375 GPL8227  n=27  n=27  microRNA  GSE21036  20579941 GPL8227  n=113  n=28  microRNA  GSE54516  25485837 GPL18234  n=51  n=48  microRNA  GSE55945  19737960 GPL570  n=13  n=8  Gene  GSE60329  27384993 GPL14450  n=54  n=14  Gene  GSE103512 29133367 GPL13158 n=60 n=7 Gene value) from the Benjamini-Hochberg method could correct the false positive results. So we chose the "adj. value<0.05" and "|logFC|>1" as a primary cut-off criteria to interpret the results. Among the DEMs from the four datasets, only those appeared in two or more datasets were considered as the significant DEMs.

2.3.
Identification of miRNA Targets. The MiRWalk 2.0 database provides a large collection of predicted and experimentally verified miRNAs-targets binding sites information [10]. We downloaded the significant DEMs-mRNA intersection data from the MiRWalk 2.0-validated-target miRNAgene retrieval system, which contains all literature-reported miRNA-target genes. However, the validated target genes were from various diseases models, so the expression of those genes in PCa might not be consistent with other diseases. We subsequently selected the genes from the intersection between the significant DEMs-mRNA intersection data and the DEGs from the three GEO datasets. These genes were defined as the significant DEGs.

Functional Enrichment Analysis.
The Database for Annotation, Visualization, and Integrated Discovery (DAVID) is an online program that offers functional annotation of enormous quantity of genes derived from various genomic resources [11]. We used the DAVID database to perform Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis on significant DEGs. The species was limited to Homo sapiens and the " value <0.05" was considered statistically significant.

Interaction and Regulatory Network Establishment.
Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) was used to construct protein-protein interaction (PPI) network. The STRING database collects and predicts interaction information from genomic context predictions, high-throughput lab experiments, coexpression, automated text-mining, and previous knowledge in databases [12]. To access more objective and reliable results, we restricted the sources as high-throughput lab experiments and previous knowledge in databases. In addition, the minimum interaction score was set at high confidence (0.700).
The miRTarBase (http://mirtarbase.mbc.nctu.edu.tw/) is a miRNA intersection database based on validation from experiments such as assay, microarray, sequencing quantitative PCR, and western blot. The miRanda algorithm from the "microRNA.org-Targets and Expression" was freely available and can be applied to the whole genome sequences using identified miRNA sequences [13]. The two programs above were used to predict target mRNAs of the significant DEMs. When two miRNAs shared a common target mRNA; they might exist in a similar regulatory pathway. The miRNA-mRNA regulatory network was visualized by Cytoscape 3.6.0 [14].

Validation and Survival Analysis. The Cancer Genome
Atlas (TCGA) database provides abundant clinical information from huge sample size. Therefore, the dataset "TCGA-PRAD" was used to find if the significant DEMs were correlated with survival outcome. The dataset "TCGA-PRAD" consists of 498 PCa samples with 52 normal prostate samples. Among the 494 clinical outcome events available, 484 patients were living and 10 were dead when the followup ended. We examined the significant DEMs expression, relationship with PCa stages, and survival. The "t-test" was applied to check expression level, while the chi-square test was used on different stages. The Kaplan-Meier plots were constructed by OncomiR [15]. The "P value <0.05" was considered statistically significant.

DEGs/DEMs Identification.
After being compared with normal prostate tissues, the results of GEO2R analysis  (Figure 1(a)) in PCa samples. Among them, 26 DEMs appeared in the results of two datasets at least, so they were identified as significant DEMs. Additionally, miR-183 was the only DEM in all four datasets simultaneously. The following DEMs were found in three datasets: miR-96, miR-375, miR-143, miR-1, and miR-145.
By MiRWalk 2.0-validated-target miRNA-gene retrieval system, we got 4475 candidate genes of the 26 significant DEMs. The intersection number of these candidate genes and the upregulated DEGs was 633. For the downregulated DEGs, 261 genes were intersected with the 4475 candidate genes. Therefore, the 633 upregulated and the 261 downregulated genes were identified as the final sets of significant DEGs.

GO and Pathway Enrichment.
The GO ontology contains three terms: cellular component (CC), molecular function (MF), and biological process (BP). The results demonstrated that the most significant GO terms for upregulated significant DEGs were "protein binding (MF)", "nucleoplasm (CC)", and "poly(A) RNA binding (MF)", whereas for the downregulated significant DEGs, "extracellular matrix (CC)", "cell surface (CC)", and "protein binding (MF)" turned to be important (Table 2).

PPI Network.
PPI networks were established separately by significant upregulated DEGs (Figure 2(a)) and significant downregulated DEGs (Figure 2(b)). In respect of the former, 294 nodes and 926 edges in total constituted the PPI network. In regard to the latter, the network was composed of 93 nodes and 153 edges.
In a PPI network, the more edges a gene has, the more important role it plays (like a seed). We used the parameter "degree" to calculate edge counts of every single gene in a PPI network. The top 5% degree genes were listed in Table 3, which were assessed as hub genes.  Figure 3, the target genes were predicted by miRanda and miRTarbase, among which two or more miRNA might target the same mRNA. For instance, miR-1, miR-145, miR-143, miR-205, and miR-31 all had interactions with cyclin dependent kinase 4 (CDK4). There were 21 mRNA nodes which might interact with more than two miRNAs.

Validation and Kaplan-Meier Curves.
After comparing the expression level of miRNAs in the TCGA dataset, we found that 23 among the 26 significant DEMs were in accordance with the results in GEO databases. However, the expression of miR-1, miR-519b, and miR-572 was not significantly different between PCa samples and normal control prostate tissues. The relationship of miRNAs between the clinical TNM stages was also tested in the TCGA dataset using chi-square analysis. The results of the significant DEMs were shown in Table 4.
We subsequently drew Kaplan-Meier plots (Figure 4) based on the TCGA survival data. Four significant DEMs, miR-95, miR-23b, miR-143, and miR-183 were found related to overall survival (OS) in PCa patients. The higher expression of these miRNAs meant poor OS in patients with PCa.

Discussion
In the present study, we identified 26 significant DEMs, 633 upregulated DEGs, and 261 downregulated DEGs. The results of functional enrichment analysis indicated that the significant DEGs were related to TGF-beta signaling pathway and TNF signaling pathway in PCa. The key DEGs such as HSPA8, PPP2R1A, CTNNB1, ADCY5, ANXA1, and COL9A2 were found as hub genes in PPI networks. Importantly, some of the DEMs were validated and found correlated with tumor stages and survival, which meant the miRNAs could not only     regulate cellular process but also be of high value in clinical practice.
In fact, prostate cancer shares a plenty of pathways with other cancers such as TNF, TGF-beta pathways. Some unique signal pathways like the androgen receptor (AR) and transmembrane protease serine 2 (TMPRSS2) relevant pathways also play a crucial role in PCa. In these complicated processes, miRNAs nearly take part in all key cellular pathways considering that one miRNA can interact with many mRNAs and one mRNA can also interact with many miRNAs. Interestingly, miR-183 was the only one significant DEM expressed in all four different datasets. We assumed that ethnics and sample size may cause this heterogeneity. In accordance with our result, miR-183 has been testified overexpression in PCa serum, tissue, and cell line [16]. Ueno reported that miR-183 could target Dkk-3 and SAMAD4 and has an oncogenic biological behavior [17]. Due to the paralogous property of miR-183, miR-96, and miR-182, these three miRNAs have been studied as miR-183 cluster, which could regulate zinc levels and carcinogenic pathways in prostate cells [18]. MiR-96 was also reported enhancing PCa cell proliferation through FOXO1 [19], which could be speculated in the PPI network (Figure 2(a)). Except for the previously reported miRNAs, there are also some miRNAs not reported in PCa before like miR-95. We found that higher miR-95 expression indicated poor survival based on 494 patients in TCGA. However, we do not know if miR-95 is related to the survival of patients from other regions or countries. The mechanisms and in-depth pathways of miR-95 should also be investigated by experiments in the future.
By constructing the PPI network, we can recognize the key genes that miRNAs may interact with. Since the genes were filtered by the 26 significant DEMs potential targets, there were still 633 upregulated and 261 downregulated. Therefore, when it comes to all the 197 DEMs, the enormous and complicated miRNA-mRNA regulatory network can be imaginable. The hub genes of a network are always important like "seeds", which connect different signal pathways. In the present study, we identified several hub genes. The regulating mechanisms HSPA8 were reported in malignancies like glioblastoma and myeloid leukaemia [20,21]. The PPP2R1A mutation in uterine cancer was reported acting through a dominant-negative mechanism to promote cancer cell growth [22]. Besides, emerging evidence points out that the histone cluster is contributed to various kinds of cancers [23]. As a phospholipid-dependent, membranebinding protein, the ANXA1 upregulation can enhance drugresistance of PCa therapy [24]. In our predicted miRNA-mRNA network, the key mRNAs like CDKN1A,B,C and CDK4 are cell-cycle related modulators [25]. To sum up, the miRNA regulatory network may provide potential targets for new drug development and treatment. By exploiting the bioinformatic analysis on miRNAs and regulatory pathways, we can discover important and novel potential targets in the tumor-genesis, diagnosis, prognosis, and key mechanisms in different cancers such as lung cancer [26], gastric cancer [27], colorectal cancer [28], and thyroid cancer [29]. Although this study firstly investigated miRNAs' regulatory role in PCa by integrating multiple microarray datasets, we still need to clarify some limitations. Firstly, the stages and aggressiveness of PCa were not restricted. We only researched the PCa versus normal prostate. Key miRNAs and genes in different periods like metastatic castration-resistant PCa call for deeper exploration. Secondly, the source of microarray is only tissues. Body fluid like serum, urine, and prostatic fluid may contain circulating miRNAs, which are more likely to be accepted for clinical application. Thirdly, though we validated our results in TCGA data, the microarray results were not validated by experiments like qRT-PCR, functional experiments in vitro. Further studies are needed to validate our results by experiments on large samples.

Conclusions
We identified 26 miRNAs that may take part in key pathways like TGF-beta, TNF pathways in prostate cancer regulatory networks. MicroRNAs like miR-23b, miR-95, miR-143, and miR-183 can be utilized in assisting the diagnosis and prognosis of prostate cancer as biomarkers. Further experimental studies are required to validate our results.

Data Availability
All raw data in this article can be obtained by emailing the corresponding author.