Mining immune-related genes with prognostic value in the tumor microenvironment of breast invasive ductal carcinoma

Abstract The tumor microenvironment (TME) plays an important role in the development of breast cancer. Due to limitations in experimental conditions, the molecular mechanism of TME in breast cancer has not yet been elucidated. With the development of bioinformatics, the study of TME has become convenient and reliable. Gene expression and clinical feature data were downloaded from The Cancer Genome Atlas database and the Molecular Taxonomy of Breast Cancer International Consortium database. Immune scores and stromal scores were calculated using the Estimation of Stromal and Immune Cells in Malignant Tumor Tissues Using Expression Data algorithm. The interaction of genes was examined with protein-protein interaction and co-expression analysis. The function of genes was analyzed by gene ontology enrichment analysis, Kyoto Encyclopedia of Genes and Genomes analysis and gene set enrichment analysis. The clinical significance of genes was assessed with Kaplan-Meier analysis and univariate/multivariate Cox regression analysis. Our results showed that the immune scores and stromal scores of breast invasive ductal carcinoma (IDC) were significantly lower than those of invasive lobular carcinoma. The immune scores were significantly related to overall survival of breast IDC patients and both the immune and stromal scores were significantly related to clinical features of these patients. According to the level of immune/stromal scores, 179 common differentially expressed genes and 5 hub genes with prognostic value were identified. In addition, the clinical significance of the hub genes was validated with data from the molecular taxonomy of breast cancer international consortium database, and gene set enrichment analysis analysis showed that these hub genes were mainly enriched in signaling pathways of the immune system and breast cancer. We identified five immune-related hub genes with prognostic value in the TME of breast IDC, which may partly determine the prognosis of breast cancer and provide some direction for development of targeted treatments in the future.


Introduction
Breast cancer is the most common malignant tumor in women worldwide. [1] Although comprehensive treatments such as surgery, chemotherapy, radiotherapy and molecular targeted therapy have effectively reduced the mortality rate of breast cancer, a large number of patients still die from local recurrence and distant metastasis. [2] Increasing evidence indicates that failures in treatment may be attributed to dysfunction in the tumor microenvironment (TME). [3] The TME refers to the environment in which tumors occur, grow, invade and metastasize, which is divided into tumor components and non-tumor components. [4] The latter contains cellular components (such as immune cells, fibroblasts, endothelial cells, inflammatory cells, adipocytes and other stromal cells) and non-cellular components (such as cytokines and extracellular matrix). [5] The TME is in a dynamic state and is heterogeneous between different tumors. [4] Stromal cells interact with tumor cells by secreting extracellular matrix proteins, chemokines and cytokines, resulting in activation of tumor cells that already have genetic abnormalities. [6] Studies have shown that the clinical characteristics and prognoses of patients can be predicted by analyzing the features of gene expression and cell infiltration in the stromal environment. [7][8][9] Tumor-infiltrating lymphocytes (TILs) are important prognostic indicators for triple-negative breast cancer (TNBC). [10] Tomioka et al. found that low levels of TILs and high levels of programmed death-ligand 1 (PD-L1) in the TME are unfavorable to the prognosis of TNBC patients. [11] In addition, blockade of some immune checkpoint proteins, such as CTLA-4, PD-1 and PD-L1, shows promising effects in breast cancer treatment. [12,13] Therefore, the study of TME is of great significance for the early diagnosis and targeted treatment of breast cancer.
In recent years, with the development of high-throughput genome sequencing technology, many biological information databases of high-quality have emerged, such as The Cancer Genome Atlas (TCGA; https://portal.gdc.cancer.gov), Molecular Taxonomy of Breast Cancer International Consortium (META-BRIC; http://www.cbioportal.org), Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo) and International Cancer Genome Consortium (https://icgc.org), which can be analyzed freely by researchers, providing the support of big data for tumor research. Yoshihara et al. designed the Estimation of Stromal and Immune Cells in Malignant Tumor Tissues Using Expression Data (ESTIMATE) algorithm based on genetic sequencing data of tumor samples to study the infiltration of immune cells and stromal cells in the TME and assess tumor purity. [14] The ESTIMATE algorithm generates an immune score, stromal score and estimate score through GSEA, which represent the infiltration of immune cells, the presence of stroma and tumor purity of the TME, respectively. [14] At present, the ESTIMATE algorithm has been used to analyze the TME of various tumors, such as breast cancer, glioblastoma and prostate cancer. [15][16][17] Studies on breast cancer have found that immune and stromal scores are related to the clinical characteristics and prognoses of patients, and that scores differ between races. [18] However, the detailed characteristics and molecular mechanisms of TME in breast cancer have not yet been fully elucidated.
In this study, the ESTIMATE algorithm was utilized to study the relationship between immune/stromal scores and clinical characteristics of breast invasive ductal carcinoma (IDC) patients. The R project was used to screen for hub genes with clinical significance in the TME of IDC, which may provide clues about the underlying molecular mechanisms active in breast TME.

Database
All data in this study are from public domain, no approval from the ethics committee is required. Gene expression profiles and clinical data of breast cancer patients were downloaded from TCGA (https://portal.gdc.cancer.gov) and the METABRIC database (http://www.cbioportal.org). Immune scores and stromal scores were calculated using the ESTIMATE algorithm in the R project (www.r-project.org). [14] The immune/stromal scores and gene expression levels were compared between groups according to age, molecular subtype, TNM stage, adjuvant treatment, cellularity and menopause state.

Functional and pathway enrichment analysis of DEGs
Functional and pathway enrichment analysis of DEGs was performed using R packages "clusterProfiler", "org.Hs.eg.db," "enrichplot" and "ggplot2". [22] The gene ontology analysis included biological process (BP), molecular function (MF) and cell component (CC). Kyoto Encyclopedia of Genes and Genomes analysis shows the enrichment of DEGs in signaling pathways. [23] Gene set enrichment analysis (GSEA) (https://gseamsigdb.org) was performed to study the enrichment in signaling pathways of the hub genes with data from the METABRIC database. [24] 2.4. Protein-protein interaction (PPI) network analysis The PPI network analysis of DEGs was performed with STRING (https://string-db.org). [25] The minimum required interaction score was 0.7 (high confidence). Edges and nodes of the PPI were counted in screening hub genes. Co-expression analysis was performed to identify the genes most positively or negatively related to the hub genes. [26]

Prognostic analysis
Patients were divided into low and high expression groups according to the median expression level of genes and Kaplan-Meier survival analysis was used to analyze overall survival (OS) using the R package "survival". [27] The univariate and multivariate Cox regression analyses were performed to select genes with prognostic value and ROC analysis was performed to evaluate the diagnostic capability of genes with the package "survivalROC". [27] 2.6. Statistical analysis were calculated with the log-rank test and P < 0.05 was considered statistically significant.

Immune scores and/or stromal scores correlate with overall survival and clinical characteristics of breast IDC patients
Immune scores and stromal scores, calculated with the ESTI-MATE algorithm based on genetic sequencing data of tumor samples, reflect the presence of immune cells and stromal cells, respectively, and tumor purity. Studies have shown that the immune score and stromal score are valuable for tumor diagnosis and prognostic evaluation, including breast cancer. [15][16][17] IDC and invasive lobular carcinoma (ILC) are the two most common pathological types of breast cancer. [28] To compare the immune/ stromal scores in these two types of breast cancer, the gene  [14] In order to study the relationship between immune/stromal scores and clinical characteristics of IDC, the clinical data of the above 764 IDC patients were downloaded from TCGA, including age, AJCC stage, TMN stage, molecular subtypes and survival information ( Table 1). The patients were divided into low and high immune/stromal score groups according to the median of scores, and Kaplan-Meier analysis was performed to compare the OS between the low score and high score groups. The results showed that the OS of patients with high immune scores was significantly longer than that of patients with low immune scores (Fig. 1B, p = 0.004). Next, the correlation between immune/stromal scores and clinical characteristics of IDC patients was analyzed. The results indicated that the immune scores were significantly related to the molecular subtype of breast IDC ( Fig. 1D, P < .0001), and the stromal scores were significantly related to the molecular subtype (

DEGs in the TME of breast IDC and their functions
Studies have shown that networks of gene regulation vary in different types of TMEs. [29] To study the relationship between immune/stromal scores and gene expression, the gene expression profiles of 764 breast IDC patients were analyzed and the results are displayed in the form of a heatmap ( Fig. 2A). The results showed that compared with the low immune score group, the high immune score group had 763 high expression genes and 106 low expression genes (fold change [FC]> 2, false discovery rate [FDR] < 0.05). Compared with the low stromal score group, the high stromal score group had 566 high expression genes and 106 low expression genes (FC> 2, FDR < 0.05). By intersecting these DEGs, we found that there were 164 common high-expressing genes and 15 common low-expressing genes (Fig. 2B).
To further analyze the function of the 179 common DEGs, Gene ontology function enrichment analysis was performed, including BP, MF and CC. The results showed that these genes were mainly enriched in functions of T cell activation, lymphocyte differentiation, immune response, plasma membrane and cytokine receptor activity. Figure 2C shows the top 15 enriched functions (Fig. 2C). We also performed Kyoto Encyclopedia of Genes and Genomes pathway analysis. These results showed that the 179 genes were mainly enriched in cytokine-cytokine receptor interaction, hematopoietic cell lineage and chemokine signaling pathways. Fig. 2D shows the top ten enriched pathways. Thus, these common DEGs are involved in the regulation of the immune process.

Hub genes in the TME of breast IDC
To study the interaction of DEGs in the TME of breast IDC, a protein-protein interaction (PPI) network was constructed through the STRING database. The results showed that there were 223 edges and 101 nodes in the PPI network of these 179 common DEGs (Fig. 3A). The number of nodes of each gene were counted and the top 30 genes with the most nodes are displayed in Figure 3B. In order to identify the genes with clinical significance, Kaplan-Meier survival analysis was performed for these 30 genes in patients with breast IDC. The results showed that CD5, CD3E, CD40LG, CD52, CD27, CD69 and IL7R were significantly related to the OS of patients with breast IDC (Fig. 3C-I). Then univariate and multivariate Cox regression analysis of these seven genes was performed. The results indicated that the hazard ratio (HR) values of CD3E, CD5, CD27, CD40LG and CD52 were Table 1 Clinical characteristics and immune/stromal scores of patients from The Cancer Genome Atlas database with breast invasive ductal carcinoma.

Characteristics
Number of case (%) 764 Immune score (range) Stromal score (range)  4A and B, Table 2). Thus, these five genes are likely the hub genes in the TME of breast IDC.

Validation in the METABRIC database
To verify the clinical significance of these five hub genes, another large breast cancer database, METABRIC, was analyzed. [30] The gene expression data and the complete clinical data of 384 patients with breast IDC from METABRIC were downloaded from the cBioPortal for Cancer Genomics database (http://www.cbioportal.org). The patients were divided into different groups according to their clinical features, such as age at diagnosis, cellularity, chemotherapy, Claudin subtype, ER status, and inferred menopausal status. Then the gene expression level between groups was compared, which showed that the expression levels of CD3E, CD5, CD27, CD40LG, and CD52 significantly differed between groups of clinical features (Fig. 4C-E, Table 3). Therefore, the expression levels of these five genes were related to clinical characteristics of breast IDC.
To further study the function and protein-protein correlation of these five genes, GSEA and co-expression analysis were performed with data from METABRIC. [24,26] The results of GSEA showed that the 5 genes were mainly enriched in immunerelated and breast cancer-related signaling pathways, such as T and B cell receptor signaling pathways, natural killer cellmediated cytotoxicity, MAPK, ERBB, WNT, VEGF signaling pathway, and others (Fig. 5A-B and Table 4). The results of coexpression analysis showed the genes with the strongest correlation with CD3E, CD5, CD27, CD40LG and CD52 (Fig. 5C-D and Table 5). Thus, we infer these five genes may play an important role in the TME to some extent.

Discussion
In this study, the immune scores and stromal scores of breast IDC were calculated using data from TCGA, and the relationship between the scores and clinical characteristics was analyzed. Then, DEGs were analyzed according to the immune/stromal scores and the hub genes in the TME of breast IDC were identified with PPI analysis and univariate/multivariate Cox regression analysis. Finally, the clinical significance and the correlation with expression of these hub genes were validated with data from METABRIC. Dysfunction of the TME is one of the most important causes of breast cancer recurrence and metastasis [31] . An increasing number of studies have recently focused on features of the breast TME. Chen et al. compared the characteristics of the TME of breast cancer in Asian and Western populations from GEO and TCGA databases by using ESTIMATE and CIBERSORT algorithms. Their results showed that the immune scores of Asian populations were higher than Western populations, especially for luminal B and HER2-enriched breast cancer patients. [18] Vincent et al. found that the immune/stromal scores of breast cancer cell lines were inconsistent with those of the corresponding breast cancer tissue by using data from TCGA, GEO, UCSC Cancer Genomics Browser and cBio Cancer Genomics Portal databases. [32] In a study of the relationship between DNA methylation and TME, Li et al. found that DNA methylation regulated the immune response by suppressing ACDY6 in the TME of luminal-like breast cancer. [33] However, the detailed molecular mechanism for this effect in the TME of breast cancer has not yet been elucidated.
It has been reported that immune/stromal scores vary in different molecular subtypes and different histological types of breast cancer. [34,35] In the present study, we compared the immune/stromal scores between breast IDC and ILC. The results showed that the immune scores of breast IDC were significantly lower than those of breast ILC, suggesting that different pathological types of breast cancer have different TME characteristics. According to the meaning of immune scores and stromal scores, we infer that this discrepancy is due to the difference in TME between IDC and ILC, that is, the immune and stromal components in the TME of IDC are less numerous than those of ILC. Several studies are consistent with our speculation. Tian et al. analyzed immune cell-type specific signatures of breast luminal (Lum) A IDC and ILC in TCGA and in the Genotype-Tissue Expression Project. Their results showed that LumA ILC had a higher proportion of high-immune phenotypes compared to LumA IDC. In addition, higher expression was observed for   that stromal and adipose tissue markers, such as FBN1, KRT5, APM1 and FABP4, presented mainly in the ILC samples. [38] However, there is one study that contradicts our speculation. Desmedt et al. compared the number of tumor-infiltrating lymphocytes (TILs) in ER-positive/HER2-negative ILC and IDC.
The study indicated that the TIL levels were significantly lower in ILC compared with IDC. [39] Thus, previous studies have explained our results in this study to some extent, but it is still too early to draw a conclusion and further research should be conducted to cover more subtypes of breast cancer and in a larger sample size. In addition, our results showed that LumA, LumB and HER2-enriched breast cancer have higher stromal scores and lower immune scores compared with basal-like breast cancer. Normal-like breast cancer had the highest immune scores (Fig. 1D), which is similar to previous findings. [33,40] Current studies indicated that immune scores and stromal scores reflect the purity of the tumor and the level of immune response, that is, low immune/stromal scores indicate high tumor purity, relatively weak immune response and poor prognosis to some extent. [14] In our study, we found that the stromal scores   The most enriched signaling pathways of hub genes from gene set enrichment analysis.
were significantly related to AJCC stage and T stage of breast IDC (as the stage increases, the stromal scores decrease). In addition, although not statistically significant, the immune/stromal scores were lower in patients younger than 60 years old ( Fig. 5C and I) and in patients with distant metastasis (Fig. 5H and N).
Therefore, the immune scores and stromal scores reflect the clinical characteristics of patients with breast IDC to a certain extent.
With PPI network analysis and univariate/multivariate Cox regression analysis, five hub genes in the TME of breast IDC with Table 5 The top 10 related hub genes from co-expression analysis.

Study
Breast cancer type Gene Brief summary Brummer et al. [42] Breast IDC CD40LG CCR2 signaling promoted breast cancer cell proliferation and invasion by inhibiting CD40LG while activating CCL2. In addition, high expression of CD40LG was a favorable indicator for recurrence-free survival of patients with breast IDC. Tong et al. [43] Breast cancer cell lines CD40LG Soluble recombinant CD40 ligand (CD40L) molecules effectively inhibited the growth of breast cancer in vivo by inducing apoptosis of breast cancer cells. Gomes et al. [44] Breast cancer cell lines CD40LG CD40LG inhibited the in vitro growth of CD40+ human breast cancer lines by blocking the cell cycle and inducing apoptosis of breast cancer cells. Pan et al. [45] Breast IDC CD40LG The expression levels of CD40/CD40L on B cells and T cells in breast IDC patients were significantly increased, and CD40/CD40L levels had a significant positive relationship with pathological grades. Voorzanger-Rousselot et al. [46] Breast cancer cell lines CD40LG CD40LG reduced the apoptosis of breast cancer cells induced by chemotherapeutic drugs.
Kim et al. [47] Breast cancer cell lines CD40LG The CD40-CD40L interaction promoted the proliferation of breast cancer cells (MDA-MB-231) by increasing TGF-b production and Th17 differentiation. Voorzanger-Rousselot et al. [48] Breast cancer cell lines CD40LG CD40L was expressed on a CD40-positive breast cancer cell line (T47D) and induced an antiapoptotic signal when cells were exposed to cytotoxic agents. Wang et al. [49] Breast cancer cell lines CD40LG Co-expression of Drosophila melanogaster deoxyribonucleoside kinase and CD40L decreased cell proliferation and induced cell apoptosis of MDA-MB-231 and MCF7 cells in vitro and in vivo. Yu et al. [50] Breast cancer cell lines CD40LG CD4+T cells in cytokine-induced killer cells induced Fas-dependent apoptosis of MDA-MB-231 cells through CD40/CD40L ligation by inhibiting synthesis of c-FLIP. Shousha et al. [51] Breast IDC CD5 Massive infiltration of axillary lymph nodes with CD5-positive B lymphocytes was found in a breast IDC patient. Strong staining for CD5 was also observed in tumor cells within the metastases of breast and lymph nodes. Walsh et al. [52] Breast cancer (pathological type not mentioned)

CD5
A negative correlation was found between CD5 positivity and tumor grade in breast cancer patients. Alotaibi et al. [53] Breast cancer cell lines CD5 The use of a function-blocking anti-CD5 monoclonal antibody or knockout of CD5 inhibited tumor growth in a breast cancer mouse model by enhancing the capability of CD8+ T cells. Xu et al. [54] Breast cancer (pathological type not mentioned)

CD27
The rs3136550 CT and rs2267966 AT genotypes of CD27 were associated with a decreased risk of breast cancer. In haplotype analysis, the CCGAG haplotype conferred an increased risk of breast cancer. Significant associations were shown between the SNPs of CD27 and lymph node metastasis, and ER and PR status. Han et al. [55] Breast cancer cell lines CD27 CD27 or 4-1BB-costimulated, self-enriched NKG2D CAR-redirected T cells effectively recognized and inhibited the proliferation of MDA-MB-231 cells in vitro and in vivo. Wang et al. [56] Breast cancer (pathological type not mentioned)

CD52
The expression level of CD52 was related to the prognosis and pathological stages of BRCA patients. Analysis based on RNA-seq and clinical data from TCGA datasets showed that CD52 was positively correlated with immune response-related pathways and immune metagenes. Wang et al. [57] Breast cancer (pathological type not mentioned)

CD52
The expression level of CD52 was related to the prognosis of breast cancer patients. Tumorinfiltrating immune cell analysis showed the relationship between CD52 expression and CD8 + T cells, activated memory CD4+ T cells, macrophage M1, and gamma delta T cells. Khatibi et al. [59] Breast cancer cell lines CD3E A recombinant anti-CD3E nanobody effectively suppressed angiogenesis and tumor cell proliferation in a breast cancer mouse model. Moradi-Kalbolandi et al. [58] Breast cancer cell lines CD3E A purified anti-CD3E nanobody effectively inhibited the growth of breast cancer in vivo.
He et al. Medicine (2021) 100:17 www.md-journal.com prognostic significance were identified: CD5, CD3E, CD40LG, CD52 and CD27. Receiver operating characteristic (ROC) curves were drawn with R package "survivalROC" to evaluate the diagnostic ability of these hub genes. [27] However, none of the genes were statistically significant under the curve values (data not shown). Next, the clinical significance of these genes was validated in the METABRIC database. The results showed that the expression level of these five hub genes was related to the clinical features of patients with breast IDC and were enriched in immune-related and breast cancer-related signaling pathways. Finally, genes whose expression levels were most significantly related to these five hub genes were identified. Thus, our study identified five genes with prognostic significance from the TME of breast IDC. The five hub genes, CD40LG, CD5, CD27, CD3E and CD52, are cluster of differentiation (CD) leukocyte surface antigens, which are mainly related to immune function. [41] Table 6 summarizes current studies on these five hub genes in breast cancer. [42][43][44][45][46][47][48][49][50][51][52][53][54][55][56][57][58][59] CD40LG, also called CD154 or CD40L, is mainly expressed in activated T lymphocytes and belongs to the tumor necrosis factor family. It regulates the immune response by binding to its ligand CD40. [60] Our results from PPI network analysis showed that CD40LG is closely related to many important proteins, including CD69, VCAM1 and TLR7, and its expression level is an independent prognostic factor in patients with breast IDC. Furthermore, the expression of CD40LG is closely related to cellularity and the Claudin subtype of breast IDC, and the results of GSEA indicated that CD40LG was mainly enriched in WNT, ERBB, VEGF, MAPK and JAK signaling pathways. We are mainly interested in CD40LG since several studies have already reported its function and molecular mechanism in the TME. Brummer et al. found that CCR2 signaling promoted breast cancer cell proliferation and invasion by inhibiting CD40LG while activating CCL2. In addition, high expression of CD40LG is a favorable indicator for the recurrence-free survival of patients with breast IDC. [42] Mesenchymal stem cells edited with TNF-a and CD40L enhanced the anti-tumor immune function of mice by promoting the function of Th1 cells and inhibiting the function of Th2 and Treg cells. [61] From immunotherapy research, CD8+ T cell-based adoptive cell therapy suppresses tumor development by activating dendritic cells and releasing tumor necrosis factor through the CD40/ CD40LG signaling pathway. [62] Soliman et al. found that a cell vaccine containing granulocyte macrophage-colony stimulating factor and CD40L enhanced anti-tumor ability in a mouse model of breast cancer by increasing infiltration of CD3+ lymphocytes through IL-2 and TNF-g activation. [63] Studies on non-solid tumors have shown that CD40LG and IL4 promote tumor cell proliferation and chemotherapy resistance by activating STAT3, NF-kB, ERK and AKT signaling pathways. [64,65] Thus, CD40L plays an important role in the TME of various tumors, but the functions and molecular mechanism of CD40L in the TME of breast IDC requires further research.
CD5 is a glycoprotein receptor on the surface of T lymphocytes. [66] A recent study found that the use of functionblocking anti-CD5 monoclonal antibody or the knockout of CD5 inhibits tumor growth in a breast cancer mouse model by enhancing the capability of CD8+ T cells. [67] Hosaka et al. detected the expression of CD5 in both thymic cancer tissue and lymphoid stroma. Their results showed that the expression of CD5 was significantly increased at the junction of tumor and stroma, and its expression in lymphocytes was significantly higher than in tumor cells. Furthermore, high levels of CD5 indicated elevated Ki-67 and activated tumor cell proliferation. [68] Another study on pancreatic ductal adenocarcinoma indicated that Bruton's tyrosine kinase promoted the proliferation of cancer cells by activating CD1d hi CD5 + regulatory B cells, which accumulated in the stroma of pancreatic lesions of mice. [69] CD27 is a member of the tumor necrosis factor receptor superfamily, which regulates lymph cell activation and immunoglobulin synthesis by binding to its ligand CD70. [70] Its potential in immune therapy has been recently noticed. [71] Roberts et al. demonstrated that agonistic anti-CD27 antibodies inhibited the growth and metastasis of melanoma in vivo by enhancing the function of tumor-specific CD8+ T cells. [72] In addition, clinical trials showed that an agonist of CD27, varlilumab, was effective and well tolerated in patients with advanced solid tumors by enhancing anti-tumor immunity driven by T cells. [73] For CD52 and CD3E, there are relatively fewer studies on their activity in the TME. CD52, also known as CAMPATH-1 antigen, is a glycoprotein which presents on the surface of lymphocytes. [74] Current research has mainly focused on the antitumor effects of the CD 52 antibody alemtuzumab in patients with lymphoma. [75] Together with CD3-gamma, -delta and -zeta, CD3-epsilon/CD3E forms the T cell receptor-CD3 complex, which plays a key role in antigen recognition and signal transduction. [76] Azadeh et al. found that a recombinant anti-CD3E nanobody effectively suppressed angiogenesis and tumor cell proliferation in a breast cancer mouse model. [59] In summary, the TME of breast cancer is affected by many factors, such as race, tumor type, clinical stage and molecular subtype. [31] In this study, the correlation between immune/ stromal scores and clinical characteristics was analyzed, and the hub genes in the TME of breast IDC were identified. Although similar results have been shown in other studies, they did not focus on breast IDC, and the identified genes were not further studied for their prognostic and clinical significance. [33,40] In addition, previous studies were mainly limited to one database while our study combined two databases to minimize the bias of data. [34,35,77] Nevertheless, more studies can be expected in the future, such as the analysis of infiltrated immune cells in the TME of breast IDC, and the verification of functions and molecular mechanisms of these hub genes in vitro and in vivo. We hope that our work will provide some new possibilities for the diagnosis and treatment of breast cancer.