Analysis of Transcription Factor-Related Regulatory Networks Based on Bioinformatics Analysis and Validation in Hepatocellular Carcinoma

Hepatocellular carcinoma (HCC) accounts for a significant proportion of liver cancer, which has become the second most common cause of cancer-related mortality worldwide. To investigate the potential mechanisms of invasion and progression of HCC, bioinformatics analysis and validation by qRT-PCR were performed. We found 237 differentially expressed genes (DEGs) including EGR1, FOS, and FOSB, which were three cancer-related transcription factors. Subsequently, we constructed TF-gene network and miRNA-TF-mRNA network based on data obtained from mRNA and miRNA expression profiles for analysis of HCC. We found that 42 key genes from the TF-gene network including EGR1, FOS, and FOSB were most enriched in the p53 signaling pathway. The qRT-PCR data confirmed that mRNA levels of EGR1, FOS, and FOSB all were decreased in HCC tissues. In addition, we confirmed that the mRNA levels of CCNB1, CCNB2, and CHEK1, three key markers of the p53 signaling pathway, were all increased in HCC tissues by bioinformatics analysis and qRT-PCR validation. Therefore, we speculated that miR-181a-5p, which was upregulated in HCC tissues, could regulate FOS and EGR1 to promote the invasion and progression of HCC by p53 signaling pathway. Overall, the study provides support for the possible mechanisms of progression in HCC.


Introduction
Hepatocellular carcinoma (HCC) is one of the common digestive system malignancies with high mortality, which has become the second most common cause of cancer-related mortality worldwide [1,2]. There are more than 466,100 estimated new cases and 422,100 estimated death cases every year in China [3]. Although there is extensive research about the therapies for HCC, the specific mechanisms of HCC occurrence and development were not clear. It is significant to investigate the underlying mechanisms of HCC invasion and progression to develop effective diagnostic and therapeutic strategy.
Over the last decades, bioinformatics and microarray technology have been widely used to screen the molecular mechanisms of HCC, which provide powerful research support for the diagnosis and treatment of HCC [4][5][6]. By bioinformatics analysis and microarray technology, functions of some differentially expressed genes (DEGs) in HCC have been explored, which paved the way for exploring complicated process in the occurrence and development of HCC [5,[7][8][9]. As an important part of participating in life activities, 2 BioMed Research International transcription factors (TFs) have been reported to play an important role in the development of HCC by numerous studies [10,11]. A work by S. Hua et al. found that ETS1, a cancer-related TF, could through interaction with miR-139-5p inhibit cell proliferation, migration, and invasion in HCC [12]. Another research reported that NF B/EGR1 signaling pathway induced miR-3928v to promote the progression of HCC [13]. Even so, more research is still needed to explore the specific role of different TFs in the progression of HCC.
During the present study, we analyzed 4 mRNA microarray datasets and 1 miRNA microarray dataset from Gene Expression Omnibus (GEO) to obtain DEGs and differentially expressed miRNAs (DEMs) between HCC tissues and adjacent noncancerous tissues by GEO2R. Subsequently, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, and protein-protein interaction (PPI) network analysis were performed to reveal the interaction relationship between DEGs and DEMs and to explore the underlying molecular mechanisms in the carcinogenesis and progression of HCC [14][15][16][17]. In addition, we screened differentially expressed TFs in HCC, constructed miRNA-TF-mRNA networks, and proposed a potential miRNA-TF-signaling pathway axis, which identified a systematic exposition of the relevant transcriptional regulation modes associated with invasion and progression of HCC. This study could provide insightful information and the valuable clue for biomarker discovery and novel treatment strategy in HCC.

Microarray Data Information and DEGs Identification.
NCBI-GEO is a free database of microarray/gene profile and next-generation sequencing, from which HCC and adjacent nontumor tissue gene expression profile of GSE84402, GSE33006, GSE84005, GSE12941, and GSE64632 were obtained [4,18,19,21]. The DEGs and DEMs between HCC tissues and adjacent nontumor tissues were screened using GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r), which was an interactive web tool for identifying DEGs across experimental conditions among two or more datasets in a GEO series. Identification of commonly differentially expressed mRNAs from the four cohort profile data sets (GSE84402, GSE33006, GSE84005, and GSE12941) was performed by FunRich software (available online: http://www .funrich.org/). |logFC| (fold change) >1 and P value <0.05 were considered statistically significant. The pipeline of the whole process of this study was shown in Figure 1.

Functional and Pathway Enrichment Analysis.
Omicsbean is an online biological information database that integrates biological data and analysis tools and provides a comprehensive set of functional annotation information of genes and proteins for users to extract biological information.
To analyze the function of DEGs, GO and KEGG pathway enrichment analysis were performed using Omicsbean online database (http://www.omicsbean.com:88). P value<0.05 was considered statistically significant.

Validation in TCGA Dataset and Modular Analysis of the Key Genes.
Validation of the key genes in the TCGA Dataset was performed by UALCAN (http://ualcan.path.uab.edu/ index.html) [25]. We performed Kaplan-Meier plots and boxplots to verify the differential expression of the key genes between primary tumor and normal liver and effect of key genes expression level on LIHC patients' survival. Correlation analysis was performed by Linkedomics (http://www.linkedomics.org) [26].

Validation of the Key Genes by Quantitative Real-Time RT-PCR (qRT-PCR).
A total of 20 HCC patients were recruited for tumor and adjacent nontumor tissues collection from the China-Japan Union Hospital of Jilin University, Changchun, China. This study was approved by the Ethics Committee of the Second Clinical Medical College, Jilin University, and each patient consented in a written informed consent form. The data were analyzed anonymously. All tissues were taken from the surgery room and snap-frozen and stored in liquid nitrogen within 10 min after the resection. The clinicopathological characteristics of 20 HCC patients were shown in Table S1.
Tissue RNA was isolated using Trizol (Invitrogen, CA, USA) and further purified using the MiniBEST Universal RNA Extraction Kit (TaKaRa, China) according to the manufacturer's instructions. RNA concentration was then measured using the NanoDrop 2000 spectrophotometer (Thermo Scientific, MA, USA) with A260/A280 ratio in the range 1.8∼ 2.0 and RNA concentration ranged from 100 ng/ l to 1 g/ l.
For qRT-PCR analysis, less than 5 g total RNA including microRNA was reverse transcribed to cDNA with 1st strand cDNA Synthesis Kit (Takara, China) and miRNA First Strand cDNA Synthesis (Sangon, China); the expression of mRNA and microRNA was examined by qRT-PCR with TransStart 5 Top Green qPCR SuperMix (TransGen Biotech, China), microRNAs qPCR Kit (Sangon, China), and Applied Biosystems 7500 Fast Real-Time PCR System. The relative expression of mRNA and microRNA was normalized to GAPDH/U6 expression by comparative Ct method (2 −ΔΔCt , ΔCt =Ct target -Ct GAPDH/U6 , and ΔΔCt=ΔCt tumor -ΔCt non-tumor ). All primers were designed with Primer 7.0  Software; primer sequences for amplification were listed in Table 8.

Statistical Analysis.
Data from qRT-PCR were analyzed with GraphPad Prism version 7.0, and differences between the two groups were statistically evaluated by two-tailed Student's t-test with p value <0.05 considered as significant. The Pearson correlation coefficient was used to examine the relationship between key genes.

Identification and Enrichment Analysis of DEGs in HCC.
HCC and adjacent nontumor tissue gene expression profile of GSE84402, GSE33006, GSE84005, and GSE12941 were obtained from NCBI-GEO. We used p<0.05 and |logFC| >1 as cut-off criterion. After integrated bioinformatical analysis, a total of 237 differentially expressed genes were identified from the four profile datasets, including 57 upregulated genes and 180 downregulated genes (Tables 1-3, Figure 2). The enrichment analysis of DEGs was performed. As shown in Table 4 and Figure 3, in the biological process group, the DEGs were mainly enriched in response to chemical. In the cellular component group, the DEGs were most enriched in the extracellular region. In the molecular function group, the DEGs were most enriched in protein binding. And the most enriched KEGG pathway was complement and coagulation cascades.

Identification and Functional Analysis of Key Differentially
Expressed Genes, Construction of TF-Related Networks in HCC. We intersected 237 common DEGs and curated 36 cancer-related TF families to get three differentially expressed TFs (EGR1, FOS, and FOSB) in HCC (Table 5). We searched       included these four TFs in the subsequent data of network construction. We screened the 4 TFs in the TRRUST version 2 database query to get each TF target genes and the DEGs from four profile datasets in GEO and TF-target genes for intersection analysis to obtain 42 differentially expressed target genes of the TFs (key genes) ( Table 6), which laid the foundation for our next step to construct the gene signaling regulatory network in HCC. Based on the above TRRUST version 2 database, we constructed the TF-target transcription regulatory network with the 42 key genes in HCC by Cytoscape 3.6.0 software ( Figure 4). The TF-target network complex contained 42 key genes and 7 TFs. We performed enrichment analysis for the 42 key genes; the results were followed in Figure 6.
Subsequently, the HCC and adjacent tissue gene expression profile of GSE64632 were analyzed by GEO2R. We use p<0.05 and |logFC| >1 as cut-off criterion to obtain 703 DEMs, including 452 upregulated DEMs and 251 downregulated DEMs. The interactions between TFs and microRNAs were predicted by miRTarBase. And we performed intersection analysis between the miRNA-targeting genes and DEMs in the microRNA assay data from GEO. Based on the above TRRUST version 2 database and miRTarBase, we constructed miRNA-TF-mRNA regulatory network by Cytoscape software ( Figure 5). The results showed that some genes were coregulated by the same miRNA or a few miRNAs, which suggested that miRNAs could play an important role in the progression of HCC. Using these coregulatory genes combined with regulatory networks, we constructed the miRNA-TF-mRNA regulatory network, which was shown in Figure 5. In addition, The Prediction results of differentially expressed miRNA targets revealed that EGR1 and FOS are coregulated by 2 microRNAs including hsa-miR-181a-5p and hsa-miR-192-5p, which were upregulated in HCC based on microRNA assay data. In the targeting relationship between miR-181a-5p and EGR1, FOS has been experimentally validated, especially.

Functional and Pathway Analysis of the Key Genes.
In the current study, we performed enrichment analysis for the 42 key genes which were differentially expressed in HCC tissues. And the results were shown in Figure 6 and Table 7. And the most enriched KEGG pathway of key genes was p53 pathway.   We queried all the samples from TCGA liver hepatocellular carcinoma (LIHC) with RNA-seq v2 data in our study. The boxplots showed that the expression level of EGR1, FOS, and FOSB was significantly lower in primary tumor than that in the normal liver for LIHC patients from TCGA (p<0.001) (Figure 7(a)). The overall survival analysis of the three TFs was performed using the Kaplan-Meier curve. We found that the overexpression of FOSB in HCC was associated with reduced survival in cases with three TFs high expression, compared with the remaining cases with low/medium expression (Figure 7(b)). Through correlation analysis, we found that there was a strong positive correlation The ellipses in the TF-gene network represented mRNAs with red (upregulated) and green (downregulated), and the diamonds represented TFs. The ellipses with a number were the clustered genes in the brief framework and the number of genes is shown inside. The interaction relationship between TFs and mRNAs were represented by arrows, and the direction of the arrow was from the source to the target. Different colors in the lines represented the different interaction relationship between the TFs and targets: red was for activation, green for repression, and grey for unknown. between the expression level of EGR1 and FOS mRNA. And there existed strong positive correlation toward coexpression of FOS and FOSB among target samples. The correlation between the expression level of EGR1 and FOSB mRNA was moderate (Figure 7(c)).
Moreover, CCNB1, CCNB2, and CHEK1, which were three of key genes in the TF-target network, were involved in the p53 pathway. The boxplots showed that the expression levels of CCNB1, CCNB2, and CHEK1 were significantly higher in primary tumor than those in the normal liver for  Figure 5: miRNA-TF-mRNA regulatory network for HCC. The squares in the network represented miRNAs, and the ellipses represented mRNAs, and the diamonds represented TFs. The nodes in red were upregulated, whereas the nodes in green were downregulated. The interaction relationship among TFs, mRNAs, and miRNAs was represented by arrows, and the direction of the arrow was from the source to the target. Different colors in the lines represented the different interaction relationship among TFs, mRNAs, and miRNAs: red was for activation, green for repression, and grey for unknown. LIHC patients from TCGA (p<0.001). The overall survival rates of patients with high expression of CCNB1, CCNB2, and CHEK1 were all significantly lower than those of patients with low/medium expression. The correlation analysis results showed that there existed strong positive correlation among CCNB1, CCNB2, and CHEK1 mRNA expression (Figure 7(c)).

Discussion
Although some progress has been made in the study of HCC, the exact mechanisms of occurrence and development in HCC are not yet clear. In the current study, we constructed networks related to transcription regulatory modes in HCC and performed functional and KEGG pathway analysis for the key genes. The enrichment analysis results showed that key genes in the TF regulatory network were enriched in the p53 signaling pathway. In addition, we performed qRT-PCR and verification in the TCGA Dataset to confirm the results based on bioinformatics analysis. FOS (also known as c-FOS) and FOSB were both protooncogenes, which belong to the activator protein 1(AP1) family [27]. Many studies have reported that FOS was involved in proliferation, migration, and invasion of some malignancies [28,29]. In some cases, expression of the FOS gene has also been associated with apoptotic cell death [30]. However, there still existed some contradictions about the role that c-FOS might play in the tumor progression. Some studies showed that overexpression of FOS might be associated with inhibition of tumor [31,32]. Oliveira-Ferrer L et al. reported that c-FOS overexpression increased the apoptotic potential of ovarian cancer cells and inhibited tumor growth and metastasis, which could be achieved by changing the adhesion of human ovarian cancer cell [33]. A study by Guo J et al. showed that the expression of c-FOS in the tumor tissues of pancreatic cancer seemed to be lower than that in adjacent nontumor tissues [34]. However, some other research showed that FOS could contribute to carcinogenesis [35,36]. Therefore, the above research work indicated that c-FOS is expressed differently in different histological types and is closely related to the proliferation, differentiation, invasion, and apoptosis of tumor cells, which provided a new therapeutic target for cancer by regulating the expression of c-FOS. EGR1 (early growth response 1) belongs to EGR  family of transcription factors that includes four members: EGR1, EGR2, EGR3, and EGR4. It is a nuclear protein and functions as a transcriptional regulator, which is a component of p53 signaling [37,38]. Abundant studies found that expression of EGR1 was associated with HCC metastasis and proliferation [39,40]. And a number of studies suggested that EGR1 exhibits prominent tumor-suppressive activity by activating major tumor suppressor factors, including transforming growth factor-1, p53, p73, fibronectin, and PTEN [40,41]. In the present study, the mRNA expression of FOS and EGR1 was lower in HCC tissues than that in nontumor adjacent tissues based on bioinformatics analysis and qPCR verification. And the low expression of FOS and EGR1 was negatively associated with overall survival of HCC patients based on TCGA data, although this correlation was not statistically significant. The results above suggested that EGR1, FOS, and FOSB, three cancer-related TFs, could play an important role in the progression of HCC.
In the current study, we found that miR-181a-5p was upregulated in HCC based on miRNA microarray data and qRT-PCR verification. As a member of the miR-181 family, the level of miR-181a-5p was overexpressed in many cancers including breast cancers, multiple myeloma, pancreatic and gastric cancer, and hepatocellular carcinoma [42][43][44][45]. And   some studies have reported that miR-181a was involved in the pathogenesis of HCC by inducing hepatocyte epithelialmesenchymal transition and decreasing autophagy [46,47].
A study by C. Zou et al. confirmed that miR-181a plays an important role in the progression of HCC autophagy-related modulator [48]. We found that FOS and EGR1 were both targeted by miR-181a-5p based on prediction results of key genes target, and the targeting relationship between miR-181a-5p and FOS, EGR1, has been confirmed in some studies [49,50]. And based on correlation analysis from TCGA data, there was a tendency to a positive association between FOS and EGR1. Furthermore, the expression of FOS protein downregulated by miR-181a has been reported [51]. And another study revealed that aberrant EGR1 expression could be suppressed by miR-181a-5p directly [52]. The role of the p53 pathway in the progression of HCC has been reported in the literature [53,54]. As an important component of the p53 pathway, TP53 (tumor protein p53) is the common target of EGR1 and FOS. Studies found that increased EGR1 expression could activate p53 signaling pathway to induce apoptosis in HCC cells [55,56]. Association between FOS and p53 signaling has also been reported, but the specific interaction relationship is not yet clear [57]. In the current study, we found that CCNB1, CCNB2, and CHEK1 were the target genes of TP53 based on bioinformatics analysis and literature confirmation. Cyclin B1 (CCNB1) and cyclin B2 (CCNB2) are important members of the cyclin family and are important cell cycle regulators related to G2/M detection sites [58,59]. One of its important roles is to modulate and form a complex with cyclin-dependent kinase 1 (CDK1) to phosphorylate the substrate, initiate the cell to G2/M phase from G1/S, and promote mitosis [60]. Checkpoint kinase 1 (CHEK1), as a DNA damage sensor and cell death pathway stimulator, regulates the progression of the cell cycle at the S phase, G2/M checkpoint [61]. Damaged DNA activates CHEK1, causing cell cycle arrest and repairing damaged Detection of EGR1, FOS, FOSB, CCNB1, CCNB2, and CHEK1 mRNA expression and hsa-miR-181a-5p expression in HCC versus adjacent nontumor tissues was performed using qRT-PCR. Levels of EGR1, FOS, and FOSB mRNA were 0.493±0.558-, 0.494±0.476-, and 0.500±0.551-fold downregulated in tumor tissues, respectively, compared to those in the adjacent nontumor ones. And the levels of CCNB1, CCNB2, and CHEK1 mRNA were 3.938±3.887-, 3.225±3.388-, and 3.186±3.508-fold upregulated. The relative expression of hsa-miR-181a-5p was 1.694±1.236-fold upregulated. * p<0.05, * * p<0.01, and * * * p<0.001.
DNA; if extensive damage is not repaired, it induces apoptosis, thereby maintaining genome integrity and stability [62]. Due to its anti-injury effect, CHEK1 plays an important role in tumor development and apoptosis [63][64][65]. Many studies have reported that CCNB1, CCNB2, and CHEK1 were involved in p53 signaling pathway [66][67][68]. We found that mRNA levels of three p53 markers including CCNB1, CCNB2, and CHEK1 were significantly upregulated in 20 HCC tumor tissues versus nontumor adjacent tissues by bioinformatics analysis and experimental verification, and their mRNA expression levels were all negatively correlated with HCC patients' survival rates (p<0.001). Therefore, we speculated that as an onco-microRNA, miR-181a-5p could inhibit expression of FOS and EGR1 to regulate p53 signaling pathway, which may be achieved through the upregulation of CCNB1, CCNB2, and CHEK1, thereby promoting the progression of HCC.

Conclusion
Using multiple cohorts profile datasets and integrated bioinformatical analysis, we identified commonly 237 DEGs, and finally found EGR1, FOS, and FOSB, 3 cancer-related TFs, which were downregulated in HCC. In addition, we constructed the transcription factor-related regulatory networks based on EGR1, FOS, and FOSB and identified possible miR-181a-5p →FOS/EGR1 →p53 signaling pathway axis. It should be noted that this study examined the conclusion by qRT-PCR and bioinformatics analysis; further research needs to be done to explore more specific mechanisms. Notwithstanding its limitation, these findings significantly improved the understanding of underlying molecular mechanisms in HCC, and the key genes and pathways could be used as diagnostic and therapeutic targets and diagnostic biomarkers.

Conflicts of Interest
The authors declare that they have no conflicts of interest.