Clinical value of integrated‐signature miRNAs in esophageal cancer

Abstract MicroRNAs (miRNAs) are crucial regulators of gene expression in tumorigenesis and are of great interest to researchers, but miRNA profiles are often inconsistent between studies. The aim of this study was to confirm candidate miRNA biomarkers for esophageal cancer from integrated‐miRNA expression profiling data and TCGA (The Cancer Genome Atlas) data in tissues. Here, we identify five significant miRNAs by a comprehensive analysis in esophageal cancer, and two of them (hsa‐miR‐100‐5p and hsa‐miR‐133b) show better prognoses with significant difference for both 3‐year and 5‐year survival. Additionally, they participate in esophageal cancer occurrence and development according to KEGG and Panther enrichment analyses. Therefore, these five miRNAs may serve as miRNA biomarkers in esophageal cancer. Analysis of differential expression for target genes of these miRNAs may also provide new therapeutic alternatives in esophageal cancer.


Introduction
Esophageal cancers, including esophageal adenocarcinoma and esophageal squamous cell carcinoma, were the fifth highest in cancer incidence in China, accounting for 50% of all new cases with liver cancer and 49% of cancer-related deaths worldwide according to the 2014 World Cancer Report [1]. At present, the best strategy to improve the prognosis of esophageal cancer is early detection, diagnosis, and treatment. However, accurate diagnosis combined with an effective treatment strategy remains a challenge. Several miRNAs have been shown to play fundamental roles in the occurrence and development of cancer, and show altered expression in various human cancers [2], which motivated us to further explore the potential application of miRNAs in esophageal cancer diagnosis and therapy.
miRNA is a single-stranded noncoding RNA consisting of 20-24 nucleotides, which functions in suppressing target gene expression by binding to complementary sequences in the 3ʹ-untranslated region of mRNAs, leading to degradation of mRNA and inhibition of their translation [3]. Complex interactions existing among miRNAs, target genes, and phenotypes have emerged as valuable biomarkers of diagnosis and prognosis associated with various phenotypes in diseases. Since the first report of a direct link between miRNAs and human cancer, in which miR-15 and miR-16 were found to be absent or downregulated in B-cell chronic lymphocytic leukemia, many studies have evaluated their biological roles and associations with disease [4].
Owing to innovations in biotechnology, large datasets have emerged, providing a substantial amount of valuable miRNA data. However, miRNA profiling efforts are often inconsistent between studies because of small sample size, different technological platforms, and different methods for processing and analysis. To overcome these problems, a more advanced and powerful strategy is necessary to handle these comprehensive and complex high-throughput data. These specific miRNA biomarkers could be applied in early diagnosis and therapy of esophageal cancer in the future.

Data collection and processing
We search for articles in PubMed published from 1999 to 2016 (last accessed on 20 January 2016), using the combination of the following key terms: ((esophageal and (cancer* or carcinoma or tumour* or tumor*)) and (microRNA* or miRNA* or miR-*) and profil*). In total, we obtain 302 relevant articles, and 217 of them could be downloaded with full text and miRNA profiles from English-language journals. Further screening selects 48 papers in which the patient samples are cancer tissues and paired adjacent noncancerous tissues rather than serum samples. To maintain accuracy and standardize these data as much as possible, the fold change in expression of each miRNA is used as the criterion for further screening, and 12 articles are finally retained.
We download and extract the TCGA data, including miRNA expression data, gene expression data, and patients' clinical information, from the TCGA Data Portal (https://tcga-data.nci.nih.gov/tcga/; last release, March 2016) [5]. Gene expression profiles (reads per kilobase per million, RPKM) and miRNA expression profiles (reads per million, RPM) are log2-transformed and used for subsequent analysis. In total, 13 pairs of samples are used to compare solid tumors with adjacent noncancerous tissues: the expression levels of miRNAs are evaluated by paired-samples t-test and those of their target genes by the DESeq2 program [6]. The clinical records of 185 esophageal cancer samples are used for prognosis analysis.
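The paired comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the pseudocount added before the log2 transform are assumptions (the text does not say how zero counts are handled), and only the t statistic and its degrees of freedom are computed here. A P-value would then be read from a t distribution, for example with `scipy.stats.ttest_rel`.

```python
import math

def log2_transform(values, pseudocount=1.0):
    """Log2-transform expression values (e.g. RPM).
    The pseudocount is an assumption to avoid log2(0); the paper does not specify one."""
    return [math.log2(v + pseudocount) for v in values]

def paired_t_statistic(tumor, normal):
    """Paired-samples t statistic for matched tumor/normal values.
    Returns (t, degrees_of_freedom)."""
    if len(tumor) != len(normal) or len(tumor) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [t - n for t, n in zip(tumor, normal)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance of the differences
    if var == 0:
        raise ValueError("all paired differences are identical")
    return mean / math.sqrt(var / n), n - 1

# Hypothetical, already log2-transformed values for three tumor/normal pairs.
t_stat, dof = paired_t_statistic([3.0, 5.0, 7.0], [1.0, 2.0, 3.0])
```

With the 13 pairs used in the study, the statistic would have 12 degrees of freedom.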

Standardization of miRNA data
A miRNA frequently appears under various precursor names or aliases in the primary data, whereas it is the mature form of the miRNA that plays the crucial role in regulating target gene expression. To analyze these data with greater precision and efficiency, a unified standardization strategy is applied in which the precursors or aliases of each miRNA are converted to the corresponding mature form according to the miRBase database (http://www.mirbase.org/; last release, Release 21 in June 2014) [7].
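The name standardization step can be sketched as a simple lookup. The table below is a small illustrative assumption covering only the five miRNAs named in this paper; a real pipeline would build the mapping from the miRBase release files, and mapping a precursor to a single mature arm is itself a simplification. The function name is hypothetical.

```python
# Illustrative alias table (assumption): lower-cased precursor names or aliases
# mapped to the mature names used in this paper. Not a full miRBase mapping.
ALIAS_TO_MATURE = {
    "hsa-mir-100": "hsa-miR-100-5p",
    "hsa-mir-100-5p": "hsa-miR-100-5p",
    "hsa-mir-133b": "hsa-miR-133b",
    "hsa-mir-155": "hsa-miR-155-5p",
    "hsa-mir-155-5p": "hsa-miR-155-5p",
    "hsa-mir-21": "hsa-miR-21-5p",
    "hsa-mir-21-5p": "hsa-miR-21-5p",
    "hsa-mir-223": "hsa-miR-223-3p",
    "hsa-mir-223-3p": "hsa-miR-223-3p",
}

def standardize_mirna(name, table=ALIAS_TO_MATURE):
    """Return the mature-form name for a reported miRNA name, or None if unknown."""
    key = name.strip().lower()
    if not key.startswith("hsa-"):
        key = "hsa-" + key  # assume human miRNAs when the species prefix is missing
    return table.get(key)

reported = ["miR-21", "hsa-mir-223", "hsa-miR-155"]
standardized = [standardize_mirna(n) for n in reported]
```

Returning `None` for unknown names keeps unmapped entries visible so they can be checked by hand rather than silently dropped.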

RRA analysis
The robust rank aggregation (RRA) method [8], based on leave-one-out cross-validation and Bonferroni correction, assigns a P-value to each miRNA in the final aggregated list. The P-value for each miRNA indicates how much better it is ranked than expected under a null model of random ordering. With the RRA method, we analyze the miRNA lists from the 12 studies, each ranked by fold change in expression level, to obtain the meta-signature miRNAs. Here, the P-value for each miRNA indicates whether its expression in esophageal cancer tissues differs significantly from that in paired normal tissue. The RRA approach is openly available on the Comprehensive R Archive Network (http://cran.r-project.org/).
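The core scoring idea of rank aggregation can be sketched as below. This is a simplified, pure-Python illustration and not the R package itself: each item's normalized ranks across lists are compared with order statistics of uniform random ranks, the smallest resulting probability is taken, and it is multiplied by the number of lists as a Bonferroni-style correction. Assigning a normalized rank of 1.0 to items missing from a list is an assumption, the leave-one-out validation step is not shown, and all function names are hypothetical.

```python
from math import comb

def normalized_ranks(ranked_list, universe_size):
    """Map each item to rank / universe_size (best item gets the smallest value)."""
    return {item: (i + 1) / universe_size for i, item in enumerate(ranked_list)}

def rho_score(norm_ranks):
    """Minimum order-statistic probability under random ordering,
    multiplied by the number of lists and capped at 1."""
    n = len(norm_ranks)
    best = 1.0
    for k, x in enumerate(sorted(norm_ranks), start=1):
        # P(k-th smallest of n uniform ranks <= x) = P(Binomial(n, x) >= k)
        p = sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))
        best = min(best, p)
    return min(best * n, 1.0)

def aggregate(lists, universe_size):
    """Return (item, score) pairs sorted from most to least consistently top-ranked."""
    per_list = [normalized_ranks(lst, universe_size) for lst in lists]
    items = set().union(*lists)
    # Items absent from a list get rank 1.0 (assumption, see lead-in).
    scores = {it: rho_score([pl.get(it, 1.0) for pl in per_list]) for it in items}
    return sorted(scores.items(), key=lambda kv: kv[1])

# Three hypothetical study lists drawn from a universe of 10 miRNAs.
result = aggregate([["a", "b", "c"], ["a", "c", "b"], ["a", "d", "b"]], universe_size=10)
```

An item ranked first in every list receives the smallest score, while an item that appears near the top of only one list is pushed toward 1.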

Enrichment analysis
To elucidate the biological function of these miRNAs, the candidate target genes are subjected to functional enrichment analyses individually with GeneCodis3 software. GeneCodis3 (http://genecodis.cnb.csic.es/), a web-based application for singular and modular enrichment analysis, integrates information for various types of data (functional, regulatory, and structural) by searching for frequent patterns among annotations and evaluating statistical relevance [15]. This approach, superior to DAVID, Onto-Express, ProfCom, or FatiGO+ in that it overcomes the lack of term-term relationships in these analyses, profiles different sides of the same information and offers a more accurate interpretation of the data.
GeneCodis3 is applied to the functional enrichment analyses (KEGG and Panther pathways) of the significant target genes for each miRNA with a hyp-c ≤0.05. Here, the hyp-c, the P-value for the hypergeometric test with multiple hypothesis corrections, represents the significance of the association between each enrichment pathway and the input list of genes.
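The hypergeometric enrichment test behind a corrected P-value such as hyp-c can be sketched as follows. This is an illustration under stated assumptions, not the GeneCodis3 implementation: the function names are hypothetical, and the Benjamini-Hochberg procedure is used here only as one common choice of multiple-hypothesis correction.

```python
from math import comb

def hypergeom_pvalue(overlap, list_size, annotated, population):
    """P(at least `overlap` annotated genes in a random list of `list_size` genes),
    given `annotated` pathway genes among `population` genes in total."""
    upper = min(list_size, annotated)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, list_size)

def benjamini_hochberg(pvalues):
    """Adjusted P-values in the original order (assumed correction, see lead-in)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical pathway: all 5 input genes fall in a 5-gene pathway out of 10 genes.
p_single = hypergeom_pvalue(overlap=5, list_size=5, annotated=5, population=10)
p_adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

Pathways whose adjusted value is at most 0.05 would then be reported as enriched, matching the threshold stated above.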

Prognosis for differentially expressed miRNAs
The association between miRNA expression and survival for esophageal cancer patients is explored by separating the cases from each cohort into a group with a high expression level and another with a low expression level. The data-driven approach [16], a novel computational method for identifying miRNAs with a significant influence on survival and for grouping patients, estimates the optimal threshold expression level for each miRNA by maximizing the separation of the survival curves related to the risks of the disease. The log-rank test, based on Kaplan-Meier plots, determines the differences among survival curves with respect to the miRNA expression levels. The univariate hazard ratio (HR), with its 95% confidence interval (CI) from the Cox proportional hazards model, is tested for statistical significance with the Wald test.
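The idea of choosing an expression threshold that maximizes the separation of survival curves can be sketched with a two-group log-rank test and a scan over candidate cutoffs. This is a simple stand-in and not the method of reference [16]: the function names, the minimum group size, and the exhaustive scan are assumptions, and a P-value taken at the best cutoff is optimistic because many cutoffs were tried.

```python
import math

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. events are 1 (death) or 0 (censored).
    Returns (chi2, p) with one degree of freedom."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({u for u, e, _ in data if e}):
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk overall
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk in group A
        d = sum(1 for u, e, _ in data if u == t and e)           # deaths overall
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 d.f.

def best_cutoff(expression, times, events, min_group=2):
    """Scan cutoffs; return (cutoff, chi2, p) with the largest log-rank chi2, or None."""
    best = None
    for cutoff in sorted(set(expression)):
        low = [i for i, x in enumerate(expression) if x <= cutoff]
        high = [i for i, x in enumerate(expression) if x > cutoff]
        if len(low) < min_group or len(high) < min_group:
            continue
        chi2, p = logrank([times[i] for i in low], [events[i] for i in low],
                          [times[i] for i in high], [events[i] for i in high])
        if best is None or chi2 > best[1]:
            best = (cutoff, chi2, p)
    return best

# Hypothetical cohort: higher expression goes with longer survival, no censoring.
chi2_mid, p_mid = logrank([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
choice = best_cutoff([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1] * 6)
```

In practice the chosen cutoff would be checked on Kaplan-Meier curves and, as described above, complemented by a Cox model to obtain the hazard ratio and its confidence interval.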

Predicting, integrating, and verifying target genes
To further explore the function and mechanism of these five miRNAs in esophageal cancer, we identified the predicted target genes for each miRNA in six databases with the given integration strategy. The expression levels of these target genes were analyzed in 13 paired esophageal cancer and adjacent noncancerous tissue samples. In total, 1969 target genes were predicted, and 686 of these were verified as having significantly different expression between cancer and noncancerous tissues (P ≤ 0.05). Then, 384 target genes were confirmed with a further filter (hsa-miR-100-5p and hsa-miR-133b: log2[fold change] > 0; hsa-miR-155-5p, hsa-miR-21-5p, and hsa-miR-223-3p: log2[fold change] < 0) (Table 3).
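The two-step filter on target genes (significance, then direction of change) can be sketched as follows. The direction table restates the criteria given above; the record layout, function name, and gene identifiers are hypothetical.

```python
# Required sign of the target gene's log2 fold change (tumor vs. normal),
# as stated in the text: > 0 for targets of hsa-miR-100-5p and hsa-miR-133b,
# < 0 for targets of the other three miRNAs.
EXPECTED_TARGET_DIRECTION = {
    "hsa-miR-100-5p": +1,
    "hsa-miR-133b": +1,
    "hsa-miR-155-5p": -1,
    "hsa-miR-21-5p": -1,
    "hsa-miR-223-3p": -1,
}

def filter_targets(records, alpha=0.05):
    """Keep (miRNA, gene) pairs with P <= alpha and the expected fold-change sign.
    Each record is (mirna, gene, log2_fold_change, p_value)."""
    kept = []
    for mirna, gene, log2fc, pvalue in records:
        direction = EXPECTED_TARGET_DIRECTION.get(mirna)
        if direction is None or pvalue > alpha:
            continue
        if log2fc * direction > 0:
            kept.append((mirna, gene))
    return kept

# Hypothetical differential-expression results.
records = [
    ("hsa-miR-100-5p", "GENE_A", 1.2, 0.01),   # kept: significant, upregulated
    ("hsa-miR-100-5p", "GENE_B", -0.8, 0.01),  # dropped: wrong direction
    ("hsa-miR-21-5p", "GENE_C", -1.5, 0.04),   # kept: significant, downregulated
    ("hsa-miR-21-5p", "GENE_D", -2.0, 0.20),   # dropped: not significant
]
confirmed = filter_targets(records)
```

The sign rule reflects the expectation that a target gene changes in the opposite direction to the miRNA that represses it.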

Functional enrichment of target genes for miRNA
Overall, 38 KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways and 22 Panther pathways were enriched for the target genes of hsa-miR-100-5p, hsa-miR-133b, hsa-miR-155-5p, hsa-miR-21-5p, and hsa-miR-223-3p. In the KEGG pathways (Table 4), the target genes for hsa-miR-100-5p were involved in cell growth and death, the immune system, the digestive system, and various cancers. Hsa-miR-133b functions as a signaling molecule and in cell-cell interaction, cell communication, the digestive system, and cancer. Hsa-miR-155-5p functions in cell communication, cell motility, transport and catabolism, signal transduction, cancer, and the immune system. Hsa-miR-223 functions in cancer and in protein folding, sorting, and degradation.

Effects of differentially expressed miRNAs on prognosis
To evaluate the prognostic impact of these five candidate miRNAs, survival and Cox regression analyses were applied to determine the risk of death and disease progression related to miRNAs in 185 esophageal cancer patient samples (Table 6). Owing to the complexity of miRNA metabolism, we selected the stem-loop of mature miRNAs to standardize the analysis. The survival analysis results (Fig. 2) show that hsa-mir-100 and hsa-mir-133b were significantly associated with 3-year and 5-year survival, respectively. However, the other three miRNAs showed no significant difference

Discussion
In general, the individual risk grade and decision of treatment largely depend on pathological and clinical factors, which show great variation between individuals, thereby influencing the predictive accuracy in cancer. A recent study demonstrates that differential expression of miRNAs can reflect tissue-specific expression signatures through promotion or suppression of tumor development and progression [38]. The application of miRNA-based biomarkers for diagnosis thus provides a promising alternative.
In our study, we identified five differentially expressed miRNAs in esophageal cancer with comprehensive analyses of reported miRNA-microarray sequencing data, miRNA-generated sequencing data from the TCGA database, and RT-PCR data from published studies. Here, we used paired cancer and adjacent noncancerous tissue samples to test miRNA and gene expression, which is more representative of the physiological status in the body than cell or serum samples. These five miRNAs are mainly involved in regulating esophageal cancer occurrence and development, according to the KEGG and Panther enrichment analyses. Additionally, high expression of hsa-mir-100 and hsa-mir-133b and low expression of hsa-mir-155 and hsa-mir-21 tended to show a better prognosis, especially for hsa-mir-100 and hsa-mir-133b, which suggests high clinical value for esophageal cancer. Notably, hsa-mir-223 levels were increased in cancer tissue compared to normal tissue, but a good prognosis was also associated with a high expression level. In our results, hsa-miR-223-3p mainly participates in protein folding, sorting, and degradation (Tables 4 and 5), which may underlie the anti-tumor effects. Some studies have regarded the expression levels of different miRNAs as prognostic biomarkers in various cancers such as lung cancer [39], gastric cancer [40], and colorectal cancer [41]. In our work, hsa-miR-100-5p, hsa-miR-133b, hsa-miR-155-5p, and hsa-miR-21-5p were identified as differentially regulated and associated with good prognosis; therefore, they can be used as miRNA biomarkers to increase the predictive ability in esophageal cancer, which, with further study, will provide more choices in the treatment of esophageal cancer. Hsa-miR-223-3p also showed different expression in esophageal cancer tissue. Using two or more of these miRNAs as testing indices could further improve the diagnostic accuracy for esophageal cancer.
Some significant target genes in our analysis have been verified experimentally in previous studies, for example, SMARCD [42] and PTEN [43], which play important roles in esophageal cancer occurrence and development (Table 5). Study of the other genes in Tables 4 and 5 may provide more constructive suggestions for esophageal cancer prognosis and treatment.