Prognostic analyses of genes associated with anoikis in breast cancer

Breast cancer (BRCA) is the most diagnosed cancer worldwide and is responsible for the highest cancer-associated mortality among women. It is evident that anoikis resistance contributes to tumour cell metastasis, and this is the primary cause of treatment failure for BRCA. However, anoikis-related gene (ARG) expression profiles and their prognostic value in BRCA remain unclear. In this study, a prognostic model of ARGs based on The Cancer Genome Atlas (TCGA) database was established using a least absolute shrinkage and selection operator analysis to evaluate the prognostic value of ARGs in BRCA. The risk factor graph demonstrated that the low-risk group had longer survival than the high-risk group, implying that the prognostic model had a good performance. We identified 11 ARGs that exhibited differential expression between the two risk groups in TCGA and Gene Expression Omnibus databases. Through Gene Ontology and Kyoto Encyclopaedia of Genes and Genomes enrichment analyses, we revealed that the screened ARGs were associated with tumour progression and metastasis. In addition, a protein–protein interaction network showed potential interactions among these ARGs. Furthermore, gene set enrichment analysis suggested that the Notch and Wnt signalling pathways were overexpressed in the high-risk group, and gene set variation analysis revealed that 38 hallmark genes differed between the two groups. Moreover, Kaplan–Meier survival curves and receiver operating characteristic curves were used to identify five ARGs (CD24, KRT15, MIA, NDRG1, TP63), and quantitative polymerase chain reaction was employed to assess the differential expression of these ARGs. Univariate and multivariate Cox regression analyses were then performed for the key ARGs, with the best prediction of 3 year survival. In conclusion, ARGs might play a crucial role in tumour progression and serve as indicators of prognosis in BRCA.


INTRODUCTION
Breast cancer is the most commonly diagnosed cancer worldwide and has the highest mortality rate among malignant tumours in women (Sung et al., 2021). Despite comprehensive treatment, such as surgery, cytotoxic chemotherapy, and targeted therapy, improves clinical remission rates, dissemination and metastasis often lead to the failure of anticancer therapy (Loibl et al., 2021). Present reports suggest that metastasis is responsible for more than 90% of breast cancer-related deaths (Gupta & Massagué, 2006). Therefore, the efficacy of treatment for breast cancer is largely dependent on its capacity to prevent metastasis. Tumour cell metastasis involves multiple steps, such as basement membrane invasion, intravasation and transport in the vasculature, and distant organ colonisation (Valastyan & Weinberg, 2011). It has been confirmed that tumour cells must overcome various stresses, including anoikis, to achieve distant metastasis (Fanfone et al., 2022).
Anoikis is a form of apoptosis initiated when cells detach from the extracellular matrix (ECM) and has been suggested to be regulated via the extrinsic death receptor pathway and intrinsic mitochondrial pathway (Gilmore, 2005). Usually, epithelial and endothelial cells maintain contact-dependent growth by activating the cellular signalling pathway; however, their apoptotic program is initiated after the cell-matrix contact loss, which helps maintain tissue homeostasis by eliminating shed cells (Gilmore, 2005). Resistance to anoikis has been reported in malignancies, including breast cancer, and is considered a fundamental characteristic of metastatic cancer cells. Therefore, anoikis is a potential therapeutic target for inhibiting cancer metastasis (Adeshakin et al., 2021).
Hence, anoikis has recently become the focus of intense research in the field of tumour invasion, and potential molecular mechanisms of anoikis resistance have been explored in several tumours (Wang et al., 2022). A previous study indicated that 14-3-3σ induced anoikis resistance and hepatocellular carcinoma cell metastasis by epidermal growth factor receptor-dependent extracellular signal-regulated kinase 1/2 pathway activation (Song et al., 2021). In addition, anoikis-resistant gastric cancer cells exhibited stronger metastatic and proangiogenic traits based on CCAAT-enhancer-binding protein β-mediated plateletderived growth factor subunit B autocrine and paracrine signalling (Du et al., 2021). Furthermore, anoikis-resistant molecular mechanisms have been explored in breast cancer (Tajbakhsh et al., 2019). Human epidermal growth factor receptor (HER) 2 and Src-dependent lactate dehydrogenase A activation contributed to breast cancer cell anoikis resistance and metastasis (Jin et al., 2017), and HER2-dependent B cell linker down-regulation facilitated three-dimensional breast tumour growth (Liu et al., 2022b). Although research has demonstrated that several genes are involved in anoikis resistance in breast cancer (Tajbakhsh et al., 2019), research on the clinical relevance of ARGs and the systematic evaluation of the implications of anoikis in breast cancer is lacking.
In this research, we first constructed an ARG prognostic model for breast cancer based on the Cancer Genome Atlas (TCGA) Breast Invasive Carcinoma (BRCA) (TCGA-BRCA) database and subsequently screened 11 differentially expressed ARGs between high-and low-risk groups. Functional annotation enrichment and protein-protein interaction (PPI) network analysis were performed for the screened ARGs. We identified five key ARGs by assessing their potential diagnostic and prognostic values and determined the messenger ribonucleic acid (mRNA) levels of the five key ARGs by quantitative polymerase chain reaction (qPCR). Univariate and multivariate Cox regression analyses were employed to assess the ability of 2-, 3-, and 4-year survival predictions with respect to key ARGs.
Altogether, our findings demonstrate the significant roles of ARGs in breast cancer progression and provide novel research perspectives for anoikis resistance.

Data collection
RNA sequencing data of patients with breast cancer were downloaded from TCGA database (https://portal.gdc.cancer.gov/) using the R package TCGAbiolinks (Colaprico et al., 2016). After eliminating samples with incomplete clinical annotations, 1,083 cases of breast cancer projects, including 111 cases with matched adjacent tissues, were obtained (TCGA-BRCA), and the data format was level 3 HTSeq-fragments per kilobase per million. Additionally, corresponding clinical information was obtained from the UCSC Xena database (Goldman et al., 2020). The mRNA counts of TCGA-BRCA dataset were normalised by using the limma R package (Ritchie et al., 2015).
Eight gene sets of ARGs were retrieved from the Molecular Signatures Database (MSigDB) (Liberzon et al., 2015), which provides the most comprehensive information on human gene sets, resulting in 862 ARGs. Furthermore, 794 ARGs were retrieved from the GeneCards database (Stelzer et al., 2016). By taking the intersection of the two databases and omitting those genes with missing probes, we obtained 100 ARGs that are common genes in both databases (Table S1).

Prognostic model construction
The least absolute shrinkage and selection operator (LASSO) prognostic model of 100 ARGs for breast cancer was constructed using the TCGA-BRCA dataset, employing 10-fold cross-validation with seeds 2021 and a p-value of 0.05. We ran 1000 iterations to guard against overfitting. The outcomes of the LASSO analysis were visualised. All samples from the TCGA-BRCA dataset and the BRCA dataset were divided into high-and low-risk groups, respectively, according to the median risk score of the prognostic model, and the survival outcomes were displayed in spot graphs. Moreover, the ARG expression profiles for prognostic model construction were visualised using heatmaps.

Differential expression genes in the high-and low-risk groups
First, the batch effects of GSE42568, GSE20685, and GSE102484 were eliminated using the R package sva (Leek et al., 2012), and merged BRCA datasets were retrieved from the GEO database. Second, datasets from TCGA and GEO were standardised and then grouped into high-and low-risk groups according to the median risk score of the LASSO prognostic model. Differential gene expression analysis was performed between the two groups, and a volcano plot (ggplot2) and heatmap (R package pheatmap) were created to visualise up-regulated genes (log FC > 0.5 and p < 0.05) and down-regulated genes (log FC < −0.5 and p < 0.05).

Gene enrichment analysis and the PPI network
The clusterProfiler R package was used to perform Gene Ontology (GO) (Gene Ontology Consortium, 2015) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) (Kanehisa & Goto, 2000) pathway enrichment analyses (Yu et al., 2012) with p < 0.05 and a false discovery rate (FDR) value ( q-value) of < 0.2. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was accessed to provide information regarding known and predicted PPIs (Szklarczyk et al., 2019), From which information regarding ARGs was extracted from the STRING database, and a PPI network was constructed using Cytoscape (Shannon et al., 2003). Additionally, GeneMANIA was used for gene prediction of the selected ARGs, and the results were visualised (Franz et al., 2018).

Gene set enrichment analysis (GSEA) and gene set variation analysis (GSVA)
GSEA is an analytical method that determines whether predefined gene sets demonstrate statistically significant and consistent differences between two biological states (Subramanian et al., 2005). In the current study, TCGA-BRCA genes were divided into phenotype-related high-and low-risk groups based on the risk score median of the LASSO prognostic model, and GSEA was performed using the clusterProfiler R package. The procedure was repeated 1,000 times for each analysis, and c2.cp.v7.2.symbols were obtained from MSigDB to serve as the reference gene collection. Statistical significance was set at FDR < 0.25 and p < 0.05.
We downloaded ''h.all.v7.4.symbols.gmt'' from the MSigDB database to perform GSVA based on gene expression levels in the TCGA-BRCA dataset and analyse variations of functional enrichment between the high-and low-risk groups. Statistical significance was set at p < 0.05.

Analysis of the prognostic value and receiver operating characteristic (ROC) curve for ARGs
Kaplan-Meier curves were used to estimate the overall survival of patients with breast cancer based on the ARG expression derived from the TCGA-BRCA dataset. The ROC curve was used to assess the relationship between ARGs and tumourigenesis, and genes with an area under the curve (AUC) of the ROC curve >0.6 were considered key genes.

Correlation analysis between clinical characteristics and prognosis
First, we performed univariate Cox regression to evaluate the prognostic value of clinical characteristics and key ARGs in patients, and a multivariate Cox regression model was constructed from factors with a p-value of <0.1 in univariate Cox regression. Subsequently, nomograms were used to predict the 2-, 3-, and 4-year survival probabilities individually based on a multivariate Cox regression model. The calibration curves were evaluated graphically by plotting the nomogram-predicted probabilities against the observed occurrences, and the 45 • line represented the best predictive values. The nomograms and calibration curves were produced by the R package rms. Decision curve analysis (DCA) was performed using the R package ggDCA (Tataranni & Piccoli, 2019) to evaluate the predictive effect of the nomograms for survival probability.
Total RNA was extracted from cells which had been cultured in dishes using TRIzol (Life Technologies, USA). Reverse transcription was then performed using an Evo M-MLV RT Mix Tracking Kit with genomic deoxyribonucleic acid (DNA) Clean (Accurate Biology, Changsha, China), and qPCR was performed using an HS SYBR Green Premix Pro Taq qPCR Tracking Kit (Accurate Biology, Changsha, China). All steps were performed according to the manufacturer's instructions. The primer sequences were as follows: Ct values were calculated by subtracting the testing group target gene Ct values from the average Ct of the control group. Finally, relative mRNA expression was obtained by 2 − Ct .

Immunohistochemical (IHC) analysis
IHC information on key genes in breast cancer and normal breast tissues was obtained from the Human Protein Atlas (HPA) database (Colwill & Gräslund, 2011).

Statistical analysis
Clinical data handling in the study was performed using the statistical software R (version 4.1.2; R Core Team, 2021). Continuous variables were summarised using the mean ± standard deviation. The Wilcoxon-rank sum test was used to compare the two groups, and Student's t -test was used to compare the normally distributed data of the two groups. The Kruskal-Wallis test was used for comparing more than two groups, and categorical variables were compared using the chi-square or Fisher's exact tests. Survival analyses and univariate and multivariate Cox regressions were performed using the survival package in R, and LASSO analyses were performed using the R package glmnet. Furthermore, survivalROC in the R package was used for creating the ROC curve. The significance of survival difference was tested with the log-rank test. In addition, two-sided Spearman's rank correlation coefficients were employed. One-way analysis of variance was used for comparing multiple samples of experimental data. Differences with p-values <0.05 were considered significant.

LASSO prognostic models were established and 11 ARGs with possible prognostic values were screened
A flow chart of the study is provided in the Fig. 1. BRCA datasets of GEO were obtained from GSE42568, GSE20685, and GSE102484 after eliminating batch effects. Box plots and principal component analysis plots displayed the datasets before and after eliminating batch effects (Figs. S1A-S1D). The results revealed that the batch effect had been effectively eliminated. We performed LASSO analysis of ARGs based on the TCGA-BRCA dataset ( Fig. 2A) to evaluate the prognostic value of 100 ARGs from the intersection of MSigDB and the GeneCards database (Table S1), and the variable trajectories graph was used to visualise the results (Fig. 2B). The risk factor graph demonstrated that those in low-risk groups had longer survival compared with those in high-risk groups ( (Table 1). Next, we explored the 12 ARG profiles between two risk groups in the TCGA-BRCA and BRCA datasets, respectively, and visualised the results with pheatmaps (Figs. 2F and 2G). The box plot further showed a comparison of each ARG between the two risk groups, and there were 11 significant differentially expressed ARGs (BIRC3, CD24, IVL, KRT15, L1CAM, MIA, NDRG1, NOS2, PTPN3, SERPINA1, and TP63) with the same expression trends in both datasets (Figs. 2H and 2I).
Subsequently, we found 878 differentially expressed genes between the two risk groups based on the TCGA-BRCA dataset following thresholds of |log FC| > 0.5 and P.adj < 0.05. Of these, 244 genes were up-regulated, and 634 were down-regulated in the high-risk group, and the volcano plot displayed the results (Fig. 2D). In addition, 196 genes were obtained from the BRCA dataset under the same thresholds comprising 57 up-regulated genes and 139 down-regulated genes (Fig. 2E). The TCGA-BRCA and BRCA datasets were obtained from the TCGA and GEO databases, respectively. LASSO analysis was constructed using 100 ARGs based on the TCGA-BRCA dataset, and we screened 12 ARGs with possible prognostic values. Subsequently, all samples from TCGA-BRCA and BRCA datasets were divided into high-and low-risk groups based on the prognostic model's median risk score. Each ARG was compared between the two risk groups, and 11 significant differentially expressed ARGs with the same expression trends in both datasets were identified. Next, GO and KEGG enrichment analyses and PPI network construction were performed using the 11 ARGs. We further performed GSEA and GSVA based on the TCGA-BRCA dataset and obtained five key ARGs by plotting KM survival curves and ROC curves (AUC > 0.6) for each ARG based on the TCGA-BRCA dataset. Finally, univariate and multivariate Cox regression analyses were performed for key ARGs, and qPCR was used to detect gene expression in different cell lines. Results of ARG protein expression with an HR of >1 were downloaded from the HPA database. TCGA, The Cancer Genome Atlas; BRCA, breast invasive carcinoma; GEO, Gene Expression Omnibus; LASSO, least absolute shrinkage and selection operator; ARGs, anoikis-related genes; GO, Gene Ontology; KEGG, Kyoto Encyclopaedia of Genes and Genomes; PPI, protein-protein interaction; GSEA, gene set enrichment analysis; GSVA, gene set variation analysis; KM, Kaplan-Meier curve; ROC, receiver operating characteristic curve; AUC, area under the curve; qPCR, quantitative polymerase chain reaction; HR, hazard ratio; HPA, human protein atlas.

Results of the gene enrichment analysis and PPI network based on screened ARGs
We performed GO and KEGG enrichment analysis, including cellular components, biological processes (BPs), molecular functions (MFs), and signalling pathways, with the criteria of p < 0.05 and FDR (q-value) < 0.25 (Table 2)     enriched in BP-related processes (such as keratinocyte differentiation, response to hypoxia, and negative proteolysis regulation) and MF-related processes (such as endopeptidase inhibitor activity, peptidase inhibitor activity, and flavin mononucleotide [FMN] binding) (Fig. S2A). KEGG pathway analysis showed that the 11 ARGs were primarily associated with small-cell lung cancer and toxoplasmosis (Fig. S2A). Figure S2B illustrates the potential links between these items. Furthermore, the chord diagram illustrated each ARG's distribution in the GO items and KEGG pathways (Fig. S2C). CD24, NOS2, and NDRG1 were associated with the process of response to hypoxia. KRT15, TP63, and IVL were mainly enriched in the process of keratinocyte differentiation. BIRC3 and NOS2 were closely associated with small-cell lung cancer pathways. SERPINA1 and BIRC3 were associated with peptidase inhibitor activity. PPI networks were constructed using the STRING database with an interaction threshold of 0.15 to ascertain the interactions between the 11ARGs in breast cancer. Visualisations were performed using the Cytoscape software (Fig. S2D). The results indicated potential interaction between ARGs. Furthermore, genes with similar functions to ARGs were predicted, and a protein network was constructed using GeneMANIA, with ephrin type-B receptor 2, RAN binding protein 9, and activated leukocyte cell adhesion molecule being closely associated with the 11 ARGs (Fig. S2E).

Signalling pathways and hallmark gene sets associated with risk groupings
We subsequently performed GSEA and GSVA between high-and low-risk groups based on the TCGA-BRCA dataset to explore potential relationships between ARGs and gene sets related to breast cancer. The results of GSEA indicated several significantly enriched BPs (Table 3), including pre-Notch expression and processing (Fig. 3B), signalling by Notch (Fig. 3C), T cell factor-dependent signalling in response to Wnt (Fig. 3D), and signalling by Wnt (Fig. 3E), as illustrated in Fig. 3A. Other BPs, including transcription of androgen receptor-regulated genes kallikrein-related peptidase (KLK) 2 and KLK3, DNA damage telomere stress-induced senescence, and processes associated with cell cycle regulation, are presented in Table 3. We performed GSVA on all genes in TCGA-BRCA to further explore the differentially expressed hallmark gene sets between the two risk groups. Table 4 reveals that 38 hallmarks exhibited differences, which were visualised using heat maps (Fig. 3F), and the gene sets with p-values <0.001 were compared between two risk groups and visualised using box plot diagrams (Fig. 3G). The results implied that gene sets of Myc targets, E2F targets, hypoxia, p53, tumour necrosis factor (TNF)-α/nuclear factor kappa B (NF-κB), interleukin (IL)-6/Janus kinase (JAK)/signal transducer and activator of transcription 3 (STAT3), and mammalian target of rapamycin complex 1 (mTORC1) signalling pathway had marked differences in their enrichment between the two groups.

Correlation analysis between the clinical characteristics and prognosis
Clinical information of patients from TCGA-BRCA dataset was extracted to assess the LASSO prognostic model (Table 5). We initially performed univariate Cox regression to evaluate the prognostic value of crucial ARGs (CD24, KRT15, MIA, NDRG1, and TP63) in patients and then formulated a multivariate Cox regression model comprising of those factors with a p-value < 0.1 in the univariate Cox regression (Table 6). Forest plots revealed that two genes (CD24 and NDRG1) were associated with increased risk with HRs > 1, while three other genes (KRT15, MIA, and TP63) were protective genes with HRs < 1 (Fig. 5A). Subsequently, we constructed nomograms for ARGs to evaluate the predictive power of the Cox regression models (Fig. 5B). Furthermore, we performed calibration curves for 2-, 3-, and 4-year survival predictive power of Cox regression model nomograms, respectively (Figs. 5C-5E). The results indicated that the blue line corresponding to the 3-year survival was closest to the ideal diagonal 45 • line (grey line), implying that the Cox regression models yielded the most reliable prediction for the 3-year survival. Finally, DCA was used to assess the clinical applicability of the LASSO-Cox prognostic model for the 2-, 3-, and 4-year survival (Figs. 5F-5H), and the blue line in the graphs represents the model's predictive power. The range on the x-axis of the blue line was higher than that of the red line (all positive) and grey line (all negative) for the 3-year survival, indicating the best clinical applicability in terms of the 3-year survival.

mRNA levels of ARGs in MCF-10A, MDA-MB-231, HCC1954, and MCF-7 cell lines
Gene mRNA levels of key ARGs in normal mammary and breast cancer cell lines were detected using qPCR. We ascertained that CD24 expression was up-regulated in breast cancer cells compared with normal breast epithelial cells (Fig. 6A). Moreover, KRT15, MIA, and TP63 levels significantly decreased in breast cancer cells (Figs. 6B, 6C and 6E). However, mRNA levels of NDRG1 varied across the three breast cancer lines, being   up-regulated in MDA-MB-231 cells and down-regulated in HCC1954 and MCF-7 cells (Fig. 6D).

CD24 and NDRG1 protein levels in breast cancer tissues
The expression levels of CD24 and NDRG1 that had an HR of >1 in the multivariable Cox regression model were detected using IHC based on the HPA database. It was observed, in comparison to normal breast tissues, there was a significant increase in the CD24 and NDRG1 expression in breast cancer (Figs. S3A-S3D).

DISCUSSION
Breast cancer remains a serious threat to women due to its high incidence and mortality rates (Sung et al., 2021), with distant metastasis being the primary cause of breast cancerassociated mortality worldwide (Liang et al., 2020). However, treatments for metastasis remain elusive. Therefore, it is necessary to study the underlying molecular mechanisms of metastasis, while at the same time, there is an urgent need to explore novel clinical diagnostic and prognostic biomarkers and therapeutic targets for breast cancer. Anoikis resistance has been established as a hallmark of advanced malignancy due to its ability to promote tumour cell invasion and metastasis (Adeshakin et al., 2021). Recently, researchers focused more on ARGs due to their potential prognostic and therapeutic value (Liu et al., 2022a). However, a comprehensive understanding of ARGs in relation to breast cancer is lacking.
In this study, we performed a LASSO analysis based on the TCGA-BRCA dataset through ARGs and divided all samples of TCGA-BRCA and BRCA datasets into high-and low-risk groups based on the median risk score of the prognostic model. Then we screened 11 significant differentially expressed ARGs between two risk groups with the same expression patterns in both datasets, suggesting their potential prognostic value in breast cancer samples. Thus, to further investigate the potential function and associated molecular mechanisms of the screened ARGs, we performed function and pathway enrichment analysis. The results revealed that ARGs were primarily enriched in the processes of response to hypoxia, proteolysis regulation, peptidase inhibitor activity, FMN binding, and small-cell lung cancer. To the best of our knowledge, resistance to a hypoxic environment, which is one of the hallmarks of tumour cells, could enhance the anoikis resistance of tumour cells and promote cell survival and distant metastasis (Wang et al., 2021;Wobma et  al., 2018). As predicted, our results remain consistent with those previous studies that found ARGs to be enriched in hypoxia-related functional processes. Proteolysis is a fundamental metabolic process essential for life and abnormal regulation of proteolysis and peptidase activity are associated with cancer. For example, tumour cell metastasis can be facilitated by the degradation of the extracellular matrix by secreted proteinases (Osuala et al., 2019). FMN, which is a cofactor of enzymes, has been the focus of photodynamic therapy for tumours (Yang, Chang & Chen, 2017). Therefore, our data enhances our understanding of ARGs in breast cancer and guides further research.
It is now evident that activation of the Notch and WNT signalling pathway was associated with anoikis resistance in various malignancies, including breast cancer (Tan et al., 2019;Kwon et al., 2014;Leong et al., 2007). GSEA results revealed that the Notch and Wnt signalling pathways were enriched in the high-risk group, which implies that the poor prognosis of patients with breast cancer in the model might be associated with abnormal pathway activation. Notably, the results further revealed potential regulatory relationships between ARGs and the Notch and Wnt signalling pathways, which need to be further explored in the future. Growing evidence has indicated that the androgen receptor modulates breast cancer progression (Magklara, Brown & Diamandis, 2002). However, the relationship between anoikis and the expression of androgen receptor-regulated genes KLKs remains unclear. Herein, GSEA revealed that ARGs might be involved in the process of transcription of genes KLK2 and KLK3 in breast cancer, which might offer clues for a more in-depth investigation of ARGs and anoikis resistance in breast cancer. In addition, GSVA results revealed that 30 gene sets exhibited significant differences between the two risk groups, comprising Myc targets, E2F targets, hypoxia, p53, TNF α/NF-κB, IL-6/JAK/STAT3, and mTORC1 signalling pathways, implying a potential correlation between these pathways and the role of ARGs in the invasion and metastasis of breast cancer. Previous studies have demonstrated that p53 induced anoikis and decreased breast cancer metastasis (Cheng et al., 2009), and mTORC1 has been reported to be associated with anoikis in breast cancer (Ng et al., 2012). This further supported our study. However, the relationship between anoikis and Myc and E2F in breast cancer is yet to be identified. We uncovered their potential association in breast cancer in the study. Subsequently, to elucidate the prognostic value of ARGs and their correlation with breast cancer progression, we plotted survival and ROC curves individually. As a result, five AGRs (CD24, KRT15, MIA, NDRG1, and TP63), which met the criteria, were identified as key genes and were used for univariate and multivariable Cox regression analysis combined with clinical characteristics of the TCGA samples. Furthermore, calibration curves demonstrated a good prognostic performance of the Cox regression model, with the most accurate prediction being made at the 3-year survival. DCA indicated the best clinical applicability in the 3-year survival. Precise prognosis prediction is essential for treating patients with breast cancer. This new information in our study might provide more information for developing prognostic tools. Therefore, the five ARGs might play significant roles in breast cancer progression and aggressiveness. We confirmed the strong expression of CD24 in breast cancer through qPCR and IHC, which is consistent with previous reports (Barkal et al., 2019). CD24 is a mucin-like glycosylated glycophosphatidylinositol-anchored molecule, which could help resist anoikis in various tumours (Li, Sun & Wang, 2015). It has been suggested recently that CD24 is an immune checkpoint promoting immune evasion through its ability to bind to sialic acid-binding immunoglobulin-like lectin 10 ( Barkal et al., 2019). Furthermore, it has been confirmed that CD24 might be a potential therapeutic target, according to preclinical studies (Ni, Zhao & Wang, 2020). Nevertheless, there are some conflicting reports on the role of CD24 in cancer (Ni, Zhao & Wang, 2020). Our study provided clear evidence of significant CD24 up-regulation at the protein and mRNA levels in breast cancer, and it has been suggested to be a high-risk factor resulting in poor prognosis in patients with breast cancer.
KRT15, a type I keratin, has been shown to be highly expressed in oesophageal carcinoma, colorectal cancer, and gastric cancer (Chen & Miao, 2022), whereas it is down-regulated in breast and prostate cancers (Chong et al., 2012). Nevertheless, its precise functional role and associated molecular mechanisms in breast cancer remain unclear. Our data suggests that KRT15 might be a tumour suppressor gene and could be associated with longer overall survival of patients with breast cancer. Furthermore, enrichment analysis proposed further avenues of exploration into the role of KRT15 in anoikis.
MIA is secreted by malignant melanoma cells and is associated with cell detachment from the ECM and the promotion of melanoma cell invasion and metastasis (Schmidt & Bosserhoff, 2009). To date, the relationship between MIA and breast cancer has not been reported. Our study is the first to reveal that MIA acts in a protective manner in patients with breast cancer. We also used qPCR to confirm that the MIA mRNA levels were lower in breast cancer cell lines compared with normal breast cells, which gave insights into breast cancer progression and metastasis.
Currently, the exact role of NDRG1in breast cancer remains controversial (Joshi, Lakhani & Reed, 2022). Herein, survival analysis demonstrated that NDRG1 might be an oncogene associated with a poor prognosis. Furthermore, IHC results from the HPA database demonstrated that NDRG1 protein expression was higher in breast cancer tissues. However, NDRG1 mRNA levels were increased in MDA-MB-231 cells and decreased in HCC 1954 cells and MCF-7 cells compared with normal mammary cells, indicating that NDRG1 might play distinct roles in different breast cancer subtypes. Consequently, we shall further examine the protein and mRNA levels of NDRG1 in additional clinical samples. Simultaneously, a study of the underlying molecular mechanisms of NDRG1 is required to determine its role in breast cancer.
Protein p63 encoded by the TP63 gene belongs to the p53 transcription factor family, and various isoforms are generated due to alternative splicing (Gatti et al., 2019). Two major subtypes of p63 (TAp63 and Np63) have contrasting roles in breast cancer. TAp63 acts as a tumour suppressor, while Np63 might promote tumourigenesis and progression in breast cancer (Gatti et al., 2019). Our study demonstrated that decreased TP63 expression is a negative and independent prognostic factor in breast cancer, and the TAp63 mRNA level was down-regulated in MDA-MB-231, HCC1954, and MCF-7 cells in laboratory tests, consistent with previous reports. In conclusion, five key ARGs were closely associated with breast cancer progression.
In this study, we initially examined the prognostic values and expression profiles of ARGs in patients with breast cancer. However, our study has some limitations. We did not perform a prognostic analysis of breast cancer subtypes, and the laboratory data did not comprise the key ARG protein levels in clinical samples and cell lines. We leave these questions to future studies and further investigate the molecular mechanisms of key ARGs in breast cancer.

CONCLUSIONS
This study primarily aimed to determine the critical role of ARGs in breast cancer progression and search for valuable diagnostic and prognostic markers. The findings revealed that the ARG-based predictive model could accurately predict the survival of patients with breast cancer. We also investigated the possible BPs of ARGs and elucidated expression features and potential prognostic value for each ARG. Our data emphasize the significant roles of ARGs in breast cancer progression and provide novel research ideas for anoikis resistance.