Screening and Bioinformatics Analysis of Competitive Endogenous RNA Regulatory Network ––Related to Circular RNA in Breast Cancer

Purpose Circular RNA as a competitive endogenous RNA (ceRNA) plays a significant role in the pathogenesis and progression of breast cancer. In this study, a circular RNA-related ceRNA regulatory network was constructed, which provides new biomarkers and therapeutic targets for the treatment of breast cancer. Materials and methods. The expression profile datasets (GSE101123, GSE143564, GSE50428) of circRNAs, miRNAs, and mRNAs were downloaded from the GEO database, and then differentially expressed RNAs (DEcircRNAs, DEmiRNAs, DEmRNAs) were obtained through the CSCD, TargetScan, miRDB, and miRTarBase databases. CircRNA-miRNA pairs and miRNA-mRNA pairs were constructed. Finally, a ceRNA regulatory network was established. Downstream analysis of the ceRNA network included GO, KEGG analysis, survival analysis, sub-network construction, the BCIP, and qRT-PCR verification. Results In total, 144 differentially expressed (DE) DEcircRNA, 221 DEmiRNA, and 1211 DEmRNA were obtained, and 96 circRNA-miRNA pairs and 139 miRNA-mRNA pairs were constructed by prediction. The ceRNA regulatory network (circRNA-miRNA-mRNA) was constructed, which included 42 circRNA, 36miRNA, and 78 mRNA. GO function annotation showed genes were mainly enriched in receptor activity activated by transforming growth factor beta (TGF-beta) and in the regulation of epithelial cell apoptosis. KEGG analysis showed genes were mainly enriched in the TGF-beta signaling, PI3K-Akt signaling, and Wnt signaling pathways. Four genes associated with survival and prognosis of breast cancer were obtained by survival analysis, the prognostic sub-network included 4 circRNA, 4 miRNA, and 4 mRNA. BCIP analysis and qRT-PCR verification confirmed that relative mRNA expression levels were consistent with those in the GEO database. Conclusion A circRNA-related ceRNA regulatory network was constructed for breast cancer in this study and key genes affecting pathogenesis and progression were identified. These findings may help better understand and further explore the molecular mechanisms that affect the progression and pathogenesis of breast cancer.


Introduction
Breast cancer is the most common malignancy and poses a serious threat to the health of women worldwide. In 2020, new cases of breast cancer worldwide accounted for 30% of all malignant tumors, and it is also the second leading cause of cancer-related deaths among women [1]. Despite the continuous development of medical technology and the decline in mortality rates, the incidence of breast cancer is still increasing worldwide [2]. Breast cancer is a systemic disease, and its treatment methods include surgery, chemotherapy, targeted therapy, radiotherapy, immunotherapy, and neoadjuvant therapy. Although there are various treatment methods, the mortality rate of breast cancer remains high due to the high recurrence rate, distant metastasis, and drug resistance. Studies have shown that gene mutations and gene disorders are the main reasons leading to the occurrence and progression of many cancers [3]. Therefore, there is an urgent need for new biomarkers and therapeutic targets for breast cancer.
In 1976, researchers identified circular RNA genome in a virus [4]. Since then, studies have shown that circular RNA plays a crucial role in biological processes [5,6]. The rapid development of high-throughput microarray technology has allowed researchers to analyze the expression profiles of circular RNAs, which in turn has facilitated research into tumor-related genes and further exploration of the molecular mechanisms involved [7][8][9]. Circular RNA has a closed loop structure without a 5 ′ -end cap and 3 ′ -poly(A) tail [10]. When circRNA was first discovered, it had been mistakenly regarded as a byproduct of splicing, but it is now considered a key molecule in many biological processes [11]. Studies have shown that circRNA is involved in the onset and progression of many diseases, including a variety of cancers [12]. For instance, circHIPK3 as a molecular sponge of miR-558 inhibits the invasion and metastasis of bladder cancer [13]. Studies have also suggested that cir-cFBLIM1 can promote the development of hepatocellular carcinoma, and it may be used as a therapeutic target for hepatocellular carcinoma [14]. In breast cancer, certain cir-cRNAs have also been identified as oncogenes or tumor suppressor genes [15,16].
The databases and bioinformatics update provide resources and methods for studying the regulation of noncoding RNA [17][18][19]. Pandolfi et al. devised the competitive endogenous RNA hypothesis (ceRNA), which proposed that long non-coding RNA, mRNA, and other transcripts can competitively bind miRNA response elements (MREs) to mutually regulate the expression of their respective genes, revealing a novel mechanism of RNA interaction [20]. Cir-cRNA, which belongs to the non-coding RNA family can also competitively adsorb miRNAs, thereby regulating gene expression at the post-transcriptional level [5,21]. Deng et al. found that circRHOBTB3 can competitively bind miR-654-3p to inhibit the progression of gastric cancer [22]. CircGFRA1 as a ceRNA can be used to regulate GFRA1 by competitively binding miR-34a, and thereby, plays a regulatory function in triple-negative breast cancer (TNBC) [23]. Therefore, circRNA may become a potential therapeutic target and biomarker for malignant tumors. However, the biological function of most circRNA and their molecular mechanisms in breast cancer are still unclear.
In our study, three datasets, downloaded from the Gene Expression Omnibus (GEO) database, were grouped and analyzed to obtain the differentially expressed (DE) cir-cRNA, miRNA, and mRNA (DEcircRNA, DEmiRNA, and DEmRNA, respectively). The circRNA-related ceRNA regulatory network was constructed through predictions derived from the cancer-specific circRNA database (CSCD) database, miRDB, and TargetScan database. GO analysis, KEGG analysis, and survival analysis was performed on mRNA in the ceRNA network, and a prognostic sub-network was constructed. Finally, mRNA expression of prognostic-related genes was verified using the Breast Cancer Integrated Platform (BCIP) and quantitative polymerase chain reaction (qRT-PCR) assays. Cir-cRNA, miRNA, and mRNA correlating with the prognosis of breast cancer were discovered, which will help to further deepen our understanding of the molecular mechanism and biological process of breast cancer. The flowchart of this research is shown in Figure 1.   databases, the target mRNAs of intersection miRNAs are predicted, and only genes identified in the above three databases at the same time were selected as potential target mRNAs. Likewise, these target mRNAs are intersected with DEmRNAs in the GSE50428 dataset to obtain miRNA-mRNA pairs. Finally, the data of the regulatory network (cir-cRNA-miRNA-mRNA) was obtained through Perl software, and then a visual ceRNA regulatory network was established using Cytoscape 3.8.0 software.

GO Function Annotation and KEGG Pathway Analysis.
To further explore the pathway and mechanism of DEmRNA in the ceRNA network affecting breast cancer, we conducted function annotation and KEGG pathway analysis. Gene Ontology (GO) function annotation (http:// geneontology.org/) is a widely used bioinformatics analysis method, including molecular function, cellular components,   3 BioMed Research International and biological processes. The Kyoto Encyclopedia of Genes and Genomes(KEGG) database (https://www.kegg.jp/)is a widely used public resource that can be used to understand molecular-level information, biological systems, and advanced functions [24]. The http://org.Hs.eg.db package in R was used to convert gene symbols to gene IDs. The clus-terProfiler, enrichplot, colorspace, stringi, DOSE, and ggplot2 packages in R were used to complete GO analysis and KEGG pathway analysis.

Survival Analysis.
We conducted a survival analysis to explore whether DEmRNA in the ceRNA network affects the overall survival of breast cancer patients. The Cancer Genome Atlas (TCGA) (https://www.cancer.gov/about-nci/ organization/ccg/research/structural-genomics/tcga) is currently the world's largest public database of cancer genetic information. It aims to help researchers understand cancer more deeply, thereby improving cancer prevention, diagnosis, and treatment capabilities by means of highthroughput genome analysis technology. The TCGA database contains gene expression data, miRNA expression data, frequency, and standardized clinical data with large sample sizes. Perl software was used to extract the information in the TCGA database, and then survival and survminer packages are used to analyze the relationship between target genes and the survival rate of patients to obtain mRNAs related to the survival and prognosis of breast cancer. Then, using Cytoscape software, a prognostic sub-network was constructed for the obtained prognostic-related genes, and then the miRNA and circRNA exerting a regulatory relationship with them were confirmed.
2.6. Relative mRNA Expression Level by qRT-PCR. QRT-PCR was used to detect the relative expression levels of mRNA (TPD52, BTG2, CCND2, LIFR) in human breast gland cell line MCF10A and breast cancer cell line MCF-7. Primer sequence information for the PCR experiment is shown in Table 1. The above cell lines were purchased from the Shanghai Cell Bank of the Chinese Academy of Sciences. The real-time fluorescent quantitative PCR instrument (Bio-Rad iQ5) was used. Beta-actin was used as an internal reference, and the reaction procedure adopted a two-step method. The reaction conditions: pre-denaturation at 95°C for 10 min; denaturation at 95°C for 15 seconds, annealing/extension at 60°C for 60 seconds, 42 cycles. The Ct value of each sample is derived through analysis, and the relative expression level of mRNA was determined by calculating the 2 -△△Ct value.

The Expression Value of mRNAs in BCIP.
Through the BCIP database (using breast cancer tissues and adjacent normal breast tissues), we further verified the expression levels of candidate genes. The BCIP (http://www.omicsnet .org/bcancer/database) is a website that analyzes and visualizes the genes of breast cancer patients. The platform's data comes from public databases, including the TCGA database, GEO database, and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC). The BCIP database can be used for multi-omics comprehensive analysis, by which we have obtained the expression value of 4 mRNAs in adjacent normal tissues and breast cancer tissues.

Results
3.1. DEcircRNA, DEmiRNA, and DEmRNA Were Extracted. Human-derived RNA datasets (GSE101123, GSE143564, and GSE50428) containing breast cancer tissues and normal breast tissues were selected. The differential expression analysis of datasets was carried out through the limma package, |log 2 fold change| >1, and the adjusted P-value was <0.05. In total, 144 DEcircRNAs were obtained, of which 64 were downregulated and 80 were upregulated. Of the 221 DEmiR-NAs, 97 were downregulated and 124 were upregulated. There were 1211 DEmRNAs, 713 were downregulated and 498 were upregulated. We selected the top 15 downregulated and upregulated DERNAs for heat map analysis, as shown in Figure 2. Basic information of the three datasets is shown in Table 2.

Identification of Target miRNA and mRNA of CircRNA.
CircRNA can combine with miRNA response elements (MREs) to achieve a regulatory role. Through the CSCD database, we predicted a total of 2199 target miRNAs that bind to DEcircRNA. However, in the CSCD database, we did not find any information about any of these 12 circRNAs including hsa_circ_0001625 et al. The structure pattern diagrams of 6 circular RNAs, are shown in Figures 3(A)-3(F). Subsequently, using the Venn diagram method, the predicted target miRNA and DEmiRNA were intersected, and 96 intersection miRNAs were obtained, as shown in Figure 3g. Next, through the TargetScan, miRTarBase, and miRDB databases, only genes identified in the above three databases at the same time were selected as potential target mRNAs, then, the above-mentioned intersection miRNAs were predicted to obtained target 2415 mRNAs. By  3.3. Construction of ceRNA Regulatory Network. Only genes that meet the following conditions will be selected into the ceRNA network: (1) All genes must be differentially expressed; (2) cir-cRNAs and mRNAs have a binding relationship with miR-NAs at the same time; (3) RNAs (circRNA, mRNA) and miRNAs that meet the above binding relationship must be negatively regulated. Then, through Cytoscape software, a ceRNA regulatory network (circRNA-miRNA-mRNA) was

GO Function Annotation and KEGG Pathway Analysis.
In the ceRNA regulatory network, GO analysis of 78 mRNAs showed that they are mainly enriched in: negative regulation of cellular response to transforming growth factor beta stimulus, regulation of epithelial to mesenchymal transition (EMT), regulation of epithelial cell apoptotic process, positive regulation of phosphatidylinositol 3-kinase (PI3K) signaling (Biological Process BPs), transcription regulator complex (Cellular Component CCs), transforming growth factor beta-activated receptor activity, kinase regulator activity, protein tyrosine kinase binding, and transcription corepressor activity (Molecular Function MFs). The histogram is shown in Figures 6(a)-6(c). KEGG pathway analysis results show mainly enrichment in the PI3K-Akt signaling pathway, TGF-beta signaling pathway, Wnt signaling pathway, ErbB signaling pathway, p53 signaling pathway, and in microRNAs in cancer (Figure 6(d)). Most of the pathways identified above are involved in the incidence and progression of breast cancer.

Survival Analysis and Prognostic Sub-Network
Construction. The survival package in R was used for survival analysis of all 78 mRNAs in the ceRNA regulatory network, and 4 mRNAs (TPD52, BTG2, CCND2, LIFR) were identified as being significantly correlated with breast cancer survival prognosis (Figures 7(a)-7(d)). Meanwhile, based on the above four mRNAs, a circRNA-miRNA-mRNA prognostic sub-network was constructed (Figure 8(a)).
3.6. qRT-PCR Verification of mRNA Expression. We verified the relative expression levels of 4 mRNAs identified in the above analysis in cell lines via qRT-PCR assays. Compared with the MCF-10A cell line, the relative expression levels of BTG2, CCND2, and LIFR mRNA were significantly downregulated in the MCF-7 cell line, while TPD52 was significantly upregulated (P <0.01) (Figure 8(b)).

The Expression of Four mRNAs Was Evaluated in the BCIP.
Through the analysis of breast cancer data in the BCIP, the results show that the expression value of the four mRNAs in adjacent normal tissues (AdjN) and breast cancer were consistent with the GEO database ( Figure 9). The median expression values of the four mRNAs are shown in Table 3.

Discussion
In recent years, researchers from different countries have made significant progress in breast cancer research, especially in the development of molecularly targeted drugs and antibody-drug conjugates, which have brought great benefits to breast cancer patients [25][26][27][28]. The current treatment methods for breast cancer include surgery, chemotherapy, endocrine therapy, molecularly targeted therapy, and radiotherapy. Although there are various treatment methods, most breast cancers achieve advanced stages, where the overall treatment effect is not satisfactory due to factors such as drug resistance, distant metastasis, and poor subtyping. The pathogenesis and progression of cancer are closely associated with gene mutations and gene disorders. Studies have shown that dysregulation of circular RNA expression plays an important role in the pathogenesis and progression of many tumors. CircRNA-related  ceRNA regulatory networks play a vital role in the pathogenesis and progression of bladder cancer, colorectal cancer, and other cancers [29][30][31]. To date, the role and mechanism of circRNA are still unclear in breast cancer.
In this study, we used a bioinformatics analysis approach to construct a ceRNA regulatory network to identify prognostic markers for breast cancer. First, three breast cancerrelated datasets, GSE101123, GSE143564, and GSE50428, were downloaded from the GEO database and were used for analysis to obtain differentially expressed RNAs (DEc-ircRNAs, DEmiRNAs, and DEmRNAs) in breast cancer patients. The circRNA-miRNA pairs and miRNA-mRNA pairs were obtained by predictive analysis defined by the CSCD, miRDB, miRTarget, and miRTarbase databases. Next, the ceRNA regulatory network was constructed and analyzed in function of GO and KEGG pathway analysis. The results of GO functional annotation showed that the mainly enriched genes were involved in the regulation of epithelial cell apoptotic process, regulation of EMT, negative regulation of cellular response to TGF-beta stimulus, Studies have shown that the biological processes involved in EMT play a role in the progression of pancreatic cancer, malignant melanoma, and other tumors [32][33][34][35][36][37]. Further TGF-beta has an important regulatory role in the progression of multiple tumors [38][39][40]. The KEGG pathway analysis results indicated that the main enrichment of differentially expressed genes was in the ErbB, Wnt, TGFbeta, PI3K-Akt, and p53 signaling pathways, and in micro-RNAs in cancer. The involvement of these pathways is supported by previous studies. Hoxhaj et al. found the PI3K-AKT signaling pathway could directly or indirectly regulate nutrient transport and metabolic enzymes to control the metabolism of cancer cells [41]. Schmid et al. determined that the AKT inhibitor capivasertib enabled patients with TNBC to achieve longer progression-free survival and overall survival [42]. Guo B et al. found that the expression of the micropeptide protein CIP2A-BP encoded by lnc00665 was down-regulated through the TGF-beta signaling pathway, thereby inhibiting the progress of TNBC [43]. Castagnoli et al. found that WNT signaling regulated the expression of PD-L1 in the TNBC stem cell compartment [44].
We identified 4 mRNAs (TPD52, BTG2, CCND2, LIFR) involved in the prognosis of breast cancer using survival analysis. The involvement of these genes in cancer progression is supported by previous studies. Fu et al. reported that TPD52, targeted by mir-103a-3p, could affect the progression of salivary adenoid cystic carcinoma [45]. Further, Shi et al. found that the miR-185-5p/TPD52 axis was associated with radiation sensitivity of TNBC [46]. The researchers found that BTG2 is closely associated with the metastasis and progression of non-small cell lung cancer and gastric cancer [47,48]. Yu et al. found that LncRNA TUG1 could promote cisplatin resistance in bladder cancer by regulating the expression of CCND2 [49]. The dysregulation of LIFR influences the development of prostate cancer and breast cancer [50,51].
A prognostic sub-network was constructed, and we identified miRNA and circRNA combined with mRNA that having a regulatory relationship. Studies have shown that in breast and ovarian cancer, miR-503-5p can lead to drug resistance through targeted binding with other genes, and may also affect the proliferation and apoptosis of ovarian cancer [52][53][54]. Regulation of miR-93-5p expression is closely associated with the progression of breast and lung cancer [55,56]. Wang et al. found that miR-21-5p could contribute to EMT in endometrial cancer [57].
Finally, we verified the expression of mRNA in the prognostic sub-network in breast cancer tissues and breast cancer cells, respectively, using BCIP and qRT-PCR.
Our research is innovative. Our findings were based on data obtained not only from the GEO database, but from subsequent bioinformatics analyses that also combined source data from TCGA, the METABRIC database, the BCIP, and qRT-PCR. Further, TCGA comprises a large sample size and detailed information related to prognosis. Nonetheless, our research also has certain limitations. First of all, the sample size in the GEO database is not very large. Secondly, our research methods were mainly based on bioinformatics analysis, and there is a lack of follow-up experiments verifying the relevant molecular regulatory mechanisms. The results obtained were also based on currently available tools and websites. Finally, only the relationship between the mRNA and the prognosis was evaluated. However, in the end, we constructed a prognostic sub-network of mRNA and comprising dysregulated miRNA and circRNA. Our findings 10 BioMed Research International indirectly demonstrate that these dysregulated genes may affect the progression of breast cancer, and provides a rationale for future studies.

Conclusion
In this study, breast cancer datasets from the GEO database were analyzed and the findings formed the basis for the construction of a circRNA-related ceRNA regulatory network for breast cancer. The relationship between the genes involved in the ceRNA network and the signaling pathways that may be involved were studied in association with the prognosis of breast cancer survival. Our findings help to better understand the ceRNA network related to circular RNA in breast cancer. Furthermore,  we identified genes related to the prognosis, which will provide new clues and a rationale for future studies designing therapeutic targets and biomarkers of breast cancer.
Abbreviations CSCD: Cancer-specific CircRNA database GO: Gene ontology KEGG: Kyoto encyclopedia of genes and genomes GEO: Gene expression omnibus BP: Biological process CC: Cellular component MF: Molecular function.

Ethical Approval
Almost all of the data comes from public databases that are widely available internationally. Ethics committee evaluation was not necessary for this study.

Conflicts of Interest
The authors report no conflicts of interest in this work.

Authors' Contributions
Tao Wang and Yi Zhang contributed equally to this work.

Funding
There is no funding to report.