Identification of Key Genes and Pathways in Breast Cancer Based on Bioinformatic Analysis


Breast cancer (BC) is the most frequent cancer type in women. However, the pathogenesis of BC is still not well understood. Thus, we aimed to explore key genes and pathways related to BC. We used the Gene Expression Omnibus (GEO) database to identify differentially expressed genes (DEGs) in the carcinogenesis and progression of BC. Then, bioinformatic analysis was performed to determine the key gene targets and pathways associated with BC. The expression profiles of the hub genes in human tumors were displayed using GEPIA. Finally, correlations between the hub genes and miRNAs were screened via miRDB, MirTarBase and DIANA Tools. We screened 159 downregulated genes and 55 upregulated genes in BC patients among the 4 datasets. The DEGs were mainly enriched in cell proliferation, positive regulation of Akt signaling, extracellular exosome, the PPAR signaling pathway and the AMPK signaling pathway. Three hub genes were screened out by construction of a PPI network. MELK was found to be upregulated, and CLU and NTRK2 were downregulated. Further verification showed that MELK displayed higher levels in almost all tumors. We found correlations between these hub genes and miRNAs. All in all, three key genes closely related to the incidence of BC were identified, and the results could provide new potential molecular targets for the diagnosis and treatment of BC. In particular, MELK is regulated by multiple miRNAs and participates in the development of BC.


Introduction
Breast cancer (BC) is the most common cancer type and the leading cause of cancer deaths in women. In the United States, around 279,100 new diagnoses of BC are expected in women for 2020, and around 42,690 women are likely to die from this disease (Siegel et al., 2020). These estimates are higher than those for the previous year (Siegel et al., 2020). The prognosis of BC patients has improved with advances in early diagnosis, surgical techniques, neoadjuvant chemotherapy and molecular targeted gene therapy. A variety of studies have demonstrated that genetic background is involved in the development of BC.
Up to 10% of BC cases are caused by gene mutations (Childers et al., 2017). In addition to BRCA1/2, the expression of the protein kinase R (PKR) and phosphatidylinositol 4-kinase 2-alpha (PI4K2A) genes plays an important role in the prognosis of BC (Pataer et al., 2020). Inhibiting the activity of fatty acid transport protein 1 (FATP1) can impair the ability of BC cells to obtain fatty acids and reduce cell viability (Mendes et al., 2019). In vitro experiments show that combining the highly specific HER3 (human epidermal growth factor receptor 3) with siRNA can effectively inhibit the proliferation of HER3-positive BC cells (Nachreiner et al., 2019). However, BC is still the main cause of death in female cancer patients because of the lack of effective methods for early diagnosis and of effective treatment strategies in the late stage. Therefore, it is crucial to identify more potential molecular targets to provide a theoretical basis for molecular targeted treatment of BC.
Owing to the development of microarray technology and sequencing, bioinformatic analysis has been widely used to identify differentially expressed genes (DEGs) and related signaling pathways. To obtain DEGs in BC, we selected four microarray datasets from the Gene Expression Omnibus (GEO). Subsequently, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses and protein-protein interaction (PPI) network analysis were performed to help us understand the molecular mechanisms underlying carcinogenesis and to find therapeutic targets. The workflow of the detailed analysis is summarized in Fig. 1.

Methods
Analysis of gene microarray. In order to examine the potentially important genes in BC, networks. PPI networks were drawn using Cytoscape, and the most significant module in the PPI networks was identified with Molecular Complex Detection (MCODE) using the following criteria: MCODE score >5, degree cut-off=2, node score cut-off=0.2, max depth=100 and k-score=2. Genes in the most significant module were considered hub genes. Then, GO enrichment and KEGG pathway analyses of these hub genes were performed using DAVID. (Table 1). KEGG pathway analysis revealed that the downregulated DEGs were mainly enriched in the PPAR (peroxisome proliferator-activated receptor) signaling pathway and the AMPK (adenosine 5'-monophosphate-activated protein kinase) signaling pathway.
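The idea behind pathway enrichment can be illustrated with a plain hypergeometric over-representation test. The following is a minimal sketch, not the statistic reported by DAVID, whose own calculation differs in detail. The function name and all numbers in the usage lines except the 159 downregulated DEGs are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, deg_count, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    total_genes: size of the background gene set
    term_genes:  background genes annotated to the pathway or GO term
    deg_count:   number of DEGs submitted for enrichment
    overlap:     DEGs that carry the annotation
    """
    max_overlap = min(term_genes, deg_count)
    denominator = comb(total_genes, deg_count)
    # Sum the probability of every overlap at least as large as the observed one.
    numerator = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, deg_count - k)
        for k in range(overlap, max_overlap + 1)
    )
    return numerator / denominator


# Hypothetical usage: 159 downregulated DEGs (from the text) against an
# assumed background of 20,000 genes, an assumed pathway of 70 genes and
# an assumed overlap of 6 genes.
p_value = hypergeom_enrichment_p(20000, 70, 159, 6)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value indicates that the pathway contains more DEGs than expected by chance; in practice the p-values of all tested terms would also need correction for multiple testing.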

Identification of
PPI network analysis. We performed PPI network analysis for the DEGs (Fig. 2B). Cytoscape was used to obtain the most significant module (Fig. 2C). Functional analysis showed that the genes in this module were mainly enriched in exosome (Table 2).
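How a densely connected module can be separated from the rest of a PPI network may be illustrated by k-core pruning, which repeatedly removes proteins with fewer than k interaction partners. This is a simplified sketch only: it is not the MCODE algorithm, it merely echoes the degree cut-off of 2 used above, and the gene labels G1 to G5 and their edges are hypothetical placeholders rather than data from this study.

```python
def k_core(edges, k=2):
    """Return the adjacency map of the k-core of an undirected graph.

    edges: iterable of (node_a, node_b) pairs; self-loops are ignored.
    """
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Keep removing nodes whose degree is below k until none remain.
    changed = True
    while changed:
        changed = False
        for node in list(adjacency):
            if len(adjacency[node]) < k:
                for neighbor in adjacency.pop(node):
                    adjacency[neighbor].discard(node)
                changed = True
    return adjacency


# Hypothetical toy network: a triangle (G1, G2, G3) with a dangling chain.
toy_edges = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"), ("G1", "G4"), ("G4", "G5")]
core = k_core(toy_edges, k=2)
print(sorted(core))
```

In this toy network the chain G4-G5 is pruned and only the triangle survives, which mirrors how sparsely connected DEGs are excluded before a module is scored.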
Hub gene analysis. The full names and functions of these hub genes are presented in Table 3.
The biological process analysis of the hub genes is shown in Fig. 3A. Hierarchical clustering showed that the hub genes could largely differentiate the BC samples from the normal samples (Fig. 3B). Then, overall survival (OS) analysis of the hub genes was performed using Kaplan-Meier curves. BC patients with MELK (maternal embryonic leucine zipper kinase) alteration showed better OS, whereas BC patients with CLU (clusterin) and NTRK2 (neurotrophic tyrosine kinase receptor 2) alteration showed worse OS (Fig. 4A). Likewise, BC patients with CLU and NTRK2 alteration showed worse disease-free survival (Fig. 4B). However, these differences were not statistically significant (Fig. 4). The gene expression profiles of MELK, CLU and NTRK2 in human tumors were displayed using GEPIA. We found that MELK displayed higher levels in almost all tumors except LAML (acute myeloid leukemia) as compared with the matched normal tissues (Fig. 5A, B). CLU displayed higher levels in GBM (glioblastoma multiforme), LGG (brain lower grade glioma) and THCA (thyroid carcinoma) as compared with the matched normal tissues. Similarly, NTRK2 displayed higher levels in LGG and THCA as compared with the matched normal tissues.
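For reference, the Kaplan-Meier estimate of survival at time t is the product, over all event times t_i up to t, of (1 - d_i / n_i), where d_i is the number of deaths at t_i and n_i is the number of patients still at risk just before t_i. The sketch below is a minimal illustration of that formula; the follow-up times and event flags are hypothetical, and comparing two groups would additionally require a significance test such as the log-rank test, which is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times:  follow-up time of each patient
    events: True if the patient died at that time, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, died in zip(times, events) if ti == t and died)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (months) for five patients.
toy_times = [5, 8, 8, 12, 15]
toy_events = [True, True, False, True, False]
for time_point, probability in kaplan_meier(toy_times, toy_events):
    print(time_point, round(probability, 3))
```

Censored patients contribute to the risk set until their last follow-up, which is why inadequate follow-up can weaken a survival comparison.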
Identification of novel miRNAs. The miRNAs common to three online databases (miRDB, MirTarBase and DIANA Tools) were obtained by intersecting their results, as shown in the Venn diagrams (Fig. 6). These three databases contain computationally predicted miRNA-target interactions. Only one miRNA (hsa-miR-22-3p) had a nominally significant p-value in all three databases for NTRK2 (Fig. 6A). For the CLU gene, two miRNAs (hsa-miR-6817-3p and hsa-miR-7110-3p) were retrieved from all three databases (Fig. 6B).
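The intersection step behind the Venn diagrams can be sketched with ordinary set operations. In the example below only hsa-miR-22-3p is taken from the text; the other entries are labelled placeholders, and the helper name and the dictionary layout are assumptions made for illustration.

```python
def common_mirnas(predictions):
    """Return the miRNAs reported by every database.

    predictions: mapping of database name to an iterable of miRNA identifiers
    """
    sets = [set(mirnas) for mirnas in predictions.values()]
    return set.intersection(*sets) if sets else set()


# Hypothetical result lists for NTRK2; "placeholder-*" entries are not real hits.
ntrk2_predictions = {
    "miRDB": ["hsa-miR-22-3p", "placeholder-A", "placeholder-B"],
    "MirTarBase": ["hsa-miR-22-3p", "placeholder-B"],
    "DIANA Tools": ["hsa-miR-22-3p", "placeholder-A", "placeholder-C"],
}
print(common_mirnas(ntrk2_predictions))
```

Requiring agreement among all databases is a conservative choice: it reduces false positives at the cost of discarding candidates supported by only one or two sources.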

Discussion
BC is the most common malignant tumor in women (Siegel et al., 2020). BC shows familial clustering as well as individual and racial differences, which suggests that gene mutations play an important role in the carcinogenesis of BC. Therefore, targeted therapy may have great implications for the treatment of BC. However, discovering the relevant gene mutations is still a major challenge for targeted therapy (Gu et al., 2016). Microarray technology has evolved rapidly during the past decade, which helps us to explore the genetic alterations in BC.
In the present study, a total of 214 DEGs were identified from the 4 datasets in the GEO database, including 159 downregulated genes and 55 upregulated genes. GO and KEGG analyses were performed to explore interactions among the DEGs. GO enrichment analysis revealed that changes in the most significant modules were mainly enriched in cell proliferation, positive regulation of Akt signaling and extracellular exosome, while KEGG analysis showed enrichment mainly in the PPAR signaling pathway and the AMPK signaling pathway. Cell proliferation is well known to play an important role in the carcinogenesis of tumors. Previous research has indicated that regulation of the Akt signaling pathway may inhibit the growth and aggressiveness of BC cells. In our previous research, we found that exosomes may be selectively enriched with miRNA in BC (Ni et al., 2018). A recent study has shown that genes involved in the PPAR signaling pathway can serve as potential biomarkers for the early diagnosis of BC (Sultan et al., 2019). A reduction in the expression level of PPAR may prevent the tumorigenesis of BC (Pirouzpanah et al., 2019). AMPK activation-dependent mechanisms may participate in regulating the growth of BC cells (Pham et al., 2020). Taken together, these observations are consistent with our results.
We screened 3 DEGs (MELK, CLU and NTRK2) as hub genes. MELK is a member of the snf1/AMPK family of protein serine/threonine kinases and is highly associated with accelerated proliferation of cancer stem cells (CSCs) in various organs (Ganguly et al., 2014). It has been considered a potential therapeutic target in cancer since 2005 (McDonald and Graves, 2020). Previous research has shown that MELK is able to inhibit different cell cycle phases in BC cells (Li et al., 2018). The expression level of CLU was appreciably higher in cancer stem cells (Samson et al., 2019). CLU was shown to be associated with tumor size, lymph node status and clinical stage in triple-negative breast cancer (TNBC) (Zhang et al., 2012).
NTRK2 is the gene that encodes the neurotrophin receptor tyrosine kinase receptor B (Pattwell et al., 2020). It may exacerbate symptoms in BC patients receiving chemotherapy.
We assessed the expression of MELK, CLU and NTRK2 in relation to overall and disease-free survival. Alteration of MELK was associated with improved overall and disease-free survival.
However, in the present study, those observations were not statistically significant. Another bioinformatic analysis found that MELK was related to worse OS in BC (Young et al., 2017). CLU might be a predictive factor for recurrence in <T2 stage BC (Yom et al., 2009). A recent study showed that NTRK2 is related to the prognosis of invasive BC (Gao et al., 2019). We speculate that the lack of statistical significance in our analysis may be because the database focused on differential gene expression and biological function rather than on outcomes. In addition, inadequate length of follow-up in BC patients with longer OS time may be one of the reasons. Oncomine analysis showed that higher levels of MELK, CLU and NTRK2 were associated with tumor grade, indicating vital roles of MELK, CLU and NTRK2 in the carcinogenesis or progression of BC. As further support for the importance of MELK, CLU and NTRK2, we verified these genes in GEPIA. MELK appears to be an important gene in the carcinogenesis and development of almost all tumors. Similarly, CLU and NTRK2 were abnormally expressed in LGG and THCA. These results further suggest the importance of these genes in tumor carcinogenesis.
Further exploration of the pathways of these genes is required. Therefore, we searched miRDB, MirTarBase and DIANA Tools for relevant miRNA-gene interactions, which were visualized as networks. Using two databases, we identified 16 miRNAs that are associated with MELK, such as miR-205. MiR-205 inhibits the proliferation and promotes the apoptosis of BC cells, but its mechanism is not clear (Du YE et al., 2017). Another study also showed that miR-205 can induce the PI3K/Akt signaling pathway in BC by acting on HER3 (Orang et al., 2019). Across the three databases, miRDB, MirTarBase and DIANA Tools, only miR-22 was found to be related to NTRK2. Previous work shows that miR-22 has been considered a tumor suppressor or promoter in different cancers (Orang et al., 2019). Furthermore, multiple findings suggest that miR-22 is implicated in non-small-cell lung cancer (NSCLC) through gene regulation (Ding et al., 2020; He et al., 2020). However, the role of miR-22 in BC remains unclear and needs further exploration. For the CLU gene, miR-6817 and miR-7110 were retrieved from these three databases. MiR-6817 has not yet received experimental attention. At present, studies focusing on miR-7110 are still few in number; one study found that miR-7110 was a novel prognostic, diagnostic and therapeutic target for type I diabetes mellitus (Priyanka et al., 2018). These findings point to potential targets for BC therapy.
However, further studies are needed to elucidate the biological function and pathway of these genes in BC.

Conclusions
All in all, the present study was designed to identify DEGs that may be involved in the carcinogenesis or progression of BC. Three key genes closely related to the incidence of BC were identified, and the results could provide new potential molecular targets for the diagnosis and treatment of BC. In particular, MELK is regulated by multiple miRNAs and participates in the development of BC. Therefore, it would be interesting to elucidate the role of MELK and the miRNA pathway in BC in further research.