Exosomal hsa-miR-21-5p is a biomarker for breast cancer diagnosis

Purpose Breast cancer (BC) is characterized by concealed onset, delayed diagnosis, and high fatality rates, making it particularly dangerous to patients’ health. The purpose of this study was to use comprehensive bioinformatics analysis and experimental verification to find a new biomarker for BC diagnosis.
Methods We comprehensively analyzed microRNA (miRNA) and mRNA expression profiles from the Gene Expression Omnibus (GEO) and screened out differentially-expressed (DE) miRNAs and mRNAs. We used the miRNet website to predict potential DE-miRNA target genes. Using the Database for Annotation, Visualization and Integrated Discovery (DAVID), we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses on the genes shared between the potential target genes and the DE-mRNAs. A protein-protein interaction (PPI) network was then established. The miRNA-mRNA regulatory network was constructed using Cytoscape and the analysis results were visualized. We verified the expression of the most up-regulated DE-miRNA in BC tissue using reverse transcription and quantitative polymerase chain reaction (RT-qPCR). The diagnostic value of the most up-regulated DE-miRNA was further explored across three levels: plasma-derived exosomes, cells, and cell exosomes.
Results Our comprehensive bioinformatics analysis and experimental results showed that hsa-miR-21-5p was significantly up-regulated in BC tissue, cells, and exosomes. Our results also revealed that tumor-derived hsa-miR-21-5p could be packaged in exosomes and released into peripheral blood. Additionally, when evaluating the diagnostic value of plasma exosomal hsa-miR-21-5p, we found that it was significantly up-regulated in BC patients. Receiver operating characteristic (ROC) analysis also confirmed that hsa-miR-21-5p could effectively distinguish healthy people from BC patients. The sensitivity and specificity were 86.7% and 93.3%, respectively.
Conclusion This study’s results showed that plasma exosomal hsa-miR-21-5p could be used as a biomarker for BC diagnosis.


INTRODUCTION
Breast cancer (BC) is a malignant tumor that originates from the glandular epithelium of the breast. Because typical and specific clinical symptoms and signs are lacking in the early stage, most patients are diagnosed only in the middle and late stages, making BC particularly dangerous. Although BC is the leading cause of cancer-related death in women (Bray et al., 2018), the detection methods currently used to diagnose BC clinically are inadequate to a certain extent. Specifically, pathologic examinations are invasive and imaging examinations are not accurate. The sensitivity and/or specificity of existing biomarkers requires further analysis, which limits their application in BC diagnosis (Bevers et al., 2018; Holloway et al., 2010; Lian et al., 2019). Therefore, new biomarkers are needed to promptly detect and diagnose BC in order to improve the survival rate of BC patients (De Santis et al., 2016).
MicroRNAs (miRNAs) are small non-coding RNAs in eukaryotes that are approximately 21 to 23 nucleotides long. MiRNAs have been found to be heavily dysregulated in many types of malignant tumors and to participate in a series of important biological activities, such as tumor cell proliferation (Chen et al., 2020; Roy et al., 2018; Wei & Gao, 2019), migration (Chen et al., 2020; Wei & Gao, 2019; Xu, Yu & Liu, 2020; Yu et al., 2019), and apoptosis (Kasinski & Slack, 2011). It has been reported that miR-1246 targets CaV1 regulation and acts on the PDGFβ receptor in ovarian cancer cells, thereby inhibiting cell proliferation (Kanlikilicer et al., 2018). Hong et al. (2019) found that miR-204-5p directly regulates PIK3Cβ expression and the downstream PI3K/Akt signaling pathway in BC, thereby affecting BC growth and metastasis. Therefore, understanding tumor-related miRNAs might help reveal the molecular mechanisms underlying tumor formation and development, and provide new insights for clinical tumor diagnosis and treatment.
Exosomes are lipid bilayer vesicles with a diameter of 30-150 nm (Deng & Miller, 2019). They are present in plasma, urine, saliva, and cell culture supernatant, and are released into the extracellular space by a variety of cells through exocytosis (Hesari et al., 2018). They carry receptors on their lipid bilayer membrane and contain proteins, lipids, mRNAs, miRNAs, and long non-coding RNAs derived from their cells of origin, which the membrane protects from degradation (De Veirman et al., 2016; Di Pace et al., 2020; Wang, Zheng & Zhao, 2016). A growing body of evidence shows that exosomal nucleic acids can act as novel biomarkers for diagnosing many types of diseases (Deng, Magee & Zhang, 2017). For example, the exosomal markers examined by Momen-Heravi et al. (2015) showed excellent diagnostic value in identifying alcoholic hepatitis (AH), indicating that they are promising biomarkers for AH diagnosis. Chen et al. (2015) found that plasma exosomal miR-214 is a potential biomarker for liver fibrosis.
High-throughput microarray technology for expression profiling has become widely used in the study of cancer genes and the identification of new biomarkers (Huang & Gao, 2018). In this study, we comprehensively analyzed expression profiles from the Gene Expression Omnibus (GEO) database to obtain differentially-expressed (DE) miRNAs and mRNAs. We used the miRNet website to predict potential DE-miRNA target genes. Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed on the intersection of the potential target genes and the DE-mRNAs using the Database for Annotation, Visualization and Integrated Discovery (DAVID), and the protein-protein interaction (PPI) network was constructed using the STRING database. Additionally, the miRNA-mRNA regulatory network was constructed using Cytoscape and the results were visualized. Based on this analysis, we explored the expression and diagnostic value of the most significantly altered miRNA across three specimen types (tissue, plasma, and cellular exosomes). This provided a basis for studying the molecular mechanisms underlying BC progression and identifying reliable biomarkers for diagnosis (Fig. 1).

Microarray dataset acquisition
The miRNA expression profile of GSE97811 and the mRNA expression profile of GSE29044 were downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo/). GSE97811 was detected using the GPL21263 3D-Gene Human miRNA V21_1.0.0 platform, which included 16 normal controls and 45 BC tissues. GSE29044 was detected using the GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133Plus 2.0 Array platform, which included 36 normal controls and 73 BC tissues.

Predicting DE-miRNAs targeting mRNAs and acquisition of overlapping mRNAs
We used miRNet (https://www.mirnet.ca/) to predict the target mRNAs of the DE-miRNAs (Fan et al., 2016). The predicted target mRNAs were matched with the DE-mRNAs obtained by microarray analysis, and the overlapping mRNAs were obtained using the Venn online tool (http://bioinformatics.psb.ugent.be/webtools/Venn/).
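The matching step amounts to a set intersection of two gene lists. The following is a minimal sketch in Python, assuming gene symbols are compared after trimming whitespace and upper-casing; the function name and the input lists are illustrative placeholders, not data or code from this study.

```python
def overlapping_genes(predicted_targets, de_mrnas):
    """Return the sorted gene symbols present in both collections."""
    def normalize(symbol):
        # Assumption: symbols differ only in case or stray whitespace.
        return symbol.strip().upper()

    predicted = {normalize(s) for s in predicted_targets}
    observed = {normalize(s) for s in de_mrnas}
    return sorted(predicted & observed)


# Placeholder inputs for illustration only.
predicted_targets = ["EGFR", "TGFBR3", "GENE_X"]
de_mrnas = ["tgfbr3", "EGFR ", "GENE_Y"]
overlap = overlapping_genes(predicted_targets, de_mrnas)
print(overlap)
```

Normalizing the symbols before intersecting avoids silently losing matches when the two sources format gene names differently.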

Functional enrichment analysis
GO and KEGG pathway analyses were performed using the DAVID Bioinformatics Resource (version 6.8, https://david.ncifcrf.gov/), a database for annotation, visualization, and integrated discovery (Huang, Sherman & Lempicki, 2009a; Huang, Sherman & Lempicki, 2009b). DAVID was used to characterize the biological functions and possible pathways of the overlapping target genes. The results were visualized using the R packages ggplot2 and Cairo. P < 0.05 was considered statistically significant. Using a survival analysis tool, we explored the relationship between the target genes and the BC patients' prognosis.
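Enrichment of a gene list in a GO term or KEGG pathway is usually assessed with an over-representation test. The sketch below computes a plain one-sided hypergeometric P-value in Python. It is an illustration of the general idea only: DAVID reports a modified Fisher exact P-value (the EASE score), which is slightly more conservative and is not reproduced here. The function name and the term and background sizes in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(hits, list_size, term_size, background_size):
    """One-sided hypergeometric P-value: P(X >= hits).

    hits            -- genes in the list that are annotated to the term
    list_size       -- number of genes in the list being tested
    term_size       -- genes in the background annotated to the term
    background_size -- total number of genes in the background
    """
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 6 of 52 genes fall in a 300-gene term,
# against an assumed background of 20,000 genes.
p = enrichment_p_value(hits=6, list_size=52, term_size=300, background_size=20000)
print(f"P = {p:.3g}")
```

A term would be reported as enriched when this P-value falls below the chosen threshold (P < 0.05 in this study), ideally after correction for the number of terms tested.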

Cell culture
The human non-tumorigenic epithelial cell line MCF10A and the BC cell lines MDA-MB-231, MCF7, and BT474 were purchased from the American Type Culture Collection (ATCC). MCF10A cells were cultured using an MEGM kit (CC3150; Lonza) supplemented with 100 ng/mL cholera toxin (C8052; Sigma). MDA-MB-231 and MCF7 cells were cultured in DMEM (C11995500BT; Gibco), and BT474 cells were cultured in RPMI-1640 medium (C11875500BT; Gibco). The media were supplemented with 10% exosome-free fetal bovine plasma and antibiotics (100 U/mL of penicillin and 100 U/mL of streptomycin), and the cells were cultured at 37 °C in a 5% CO2 incubator.

Isolating and identifying exosomes
Peripheral blood was centrifuged at 1,200 × g for 10 min at 4 °C to obtain plasma. Plasma-derived exosomes were isolated by differential ultracentrifugation (Théry et al., 2006). Exosome morphology was analyzed by transmission electron microscopy (TEM). A 10 µL aliquot of exosomes was pipetted onto a formvar- and carbon-coated grid and left for 10 min at room temperature; excess fluid was then removed, and the sample was negatively stained with 3% phosphotungstic acid (pH 6.8) for 5 min before being examined by TEM. The exosome concentration and size range were measured using a NanoSight NS300 system (NanoSight, Salisbury, United Kingdom) equipped with fast video capture and Nanoparticle Tracking Analysis (NTA) software. Before the experiments, the instrument was calibrated with 100 nm polystyrene beads (Thermo Scientific, Waltham, MA, USA). Samples were recorded for 60 s at room temperature. The NTA software processed the video captures and measured the particle concentration (particles/mL) and size distribution (in nanometers). Each specimen was measured three times.

Western blotting
Total exosomal proteins were extracted, separated by SDS-PAGE, and transferred to a PVDF membrane. The membrane was blocked with 5% skimmed milk at room temperature for 2 h, washed three times with TBST, incubated overnight at 4 °C with antibodies against TSG101 (ab125011; Abcam) and Calnexin (ab133615; Abcam), and then incubated for 1 h at room temperature with an HRP-labeled secondary antibody (SA00001-2; Proteintech). After three further washes, exosomal protein marker expression was detected using ECL luminescence, with Calnexin serving as the negative control.

Reverse transcription and quantitative polymerase chain reaction (RT-qPCR) analysis
Total RNA was extracted from tissues, cells, and exosomes using TRIzol™ Reagent (Invitrogen, 11596026) according to the manufacturer's instructions. The primers used in the study (Table S1) were synthesized by Sangon Biotech (Shanghai, China). Reverse transcription was performed with the PrimeScript™ RT Reagent Kit (RR037A; TaKaRa) in accordance with the manufacturer's instructions. We performed DE-miRNA RT-qPCR using TB Green® Premix Ex Taq™ II (RR820A; TaKaRa) with the corresponding forward and reverse primers. U6 snRNA was used for normalization, and DE-miRNA relative expression was calculated using the 2^(−ΔΔCt) method.
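Relative quantification against a reference gene is commonly computed with the 2^(−ΔΔCt) formula. The sketch below shows that calculation in Python, assuming this convention, with U6 as the reference and a paired control sample as the calibrator. The function name and the Ct values are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_control, ct_reference_control):
    """Fold change of a target relative to a control sample (2^-ddCt)."""
    # Normalize the target to the reference gene within each sample.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_control - ct_reference_control
    # Compare the test sample against the control sample.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: miRNA and U6 in tumor vs. adjacent tissue.
fold_change = relative_expression(
    ct_target=24.0, ct_reference=18.0,
    ct_target_control=26.0, ct_reference_control=18.0,
)
print(fold_change)  # 4.0: the target is four-fold higher in the tumor sample
```

Because Ct values are on a log2 scale, a ΔΔCt of −2 corresponds to a four-fold increase; the formula assumes near-100% amplification efficiency for both target and reference.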

ROC curve analysis of the most significant difference in miRNA performance in BC diagnosis
RT-qPCR was used to determine the expression of DE-miRNAs in BC and adjacent tissue, and the miRNAs with the most significant differences were selected. We used a receiver operating characteristic (ROC) curve to evaluate the miRNA values as a diagnostic marker for BC. The Youden index was used to determine the cut-off value of the most significant difference in the relative expression of DE-miRNA. At the same time, we randomly selected samples from six healthy people, six BC patients, six liver cancer patients, six lung cancer patients, six cervical cancer patients, and six ovarian cancer patients to detect the miRNA expression with the most significant difference in plasma-derived exosomes.
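The Youden index J = sensitivity + specificity − 1 selects the cut-off that best separates the two groups. A minimal sketch in Python follows, assuming that higher relative expression indicates disease and that a sample is called positive when its value is at or above the cut-off. The function name and the expression values are made up for illustration.

```python
def youden_cutoff(control_values, case_values):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    A sample is classified as positive when its value is >= cutoff.
    Ties in the index are resolved in favor of the lowest cutoff.
    """
    best = None
    for cutoff in sorted(set(control_values) | set(case_values)):
        sensitivity = sum(v >= cutoff for v in case_values) / len(case_values)
        specificity = sum(v < cutoff for v in control_values) / len(control_values)
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1:]


# Hypothetical relative expression values.
controls = [0.8, 1.0, 1.1, 1.3, 2.2]
cases = [1.6, 2.0, 2.6, 3.1, 3.5]
cutoff, sensitivity, specificity = youden_cutoff(controls, cases)
print(cutoff, sensitivity, specificity)
```

Only observed values are tried as candidate cut-offs, which is sufficient because sensitivity and specificity change only at those points.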

Statistical analysis
Statistical analysis was performed using SPSS 24.0 software. A paired-samples t-test was used to compare hsa-miR-21-5p expression in BC and adjacent tissue, and in plasma-derived exosomes of BC patients before and after surgery. The Pearson correlation was used to analyze the correlation between hsa-miR-21-5p in plasma-derived exosomes and cancer tissues. P < 0.05 was considered statistically significant.
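For readers who want to see what these two tests compute, the sketch below implements the paired t statistic and the Pearson correlation coefficient with the Python standard library. It is an illustration rather than a reproduction of the SPSS analysis: it returns the test statistic and degrees of freedom but not the P-value, and the function names and input values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired samples, using after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical expression values before and after surgery.
t, df = paired_t_statistic([5.0, 6.0, 7.0, 8.0], [4.0, 4.0, 6.0, 6.0])
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(round(t, 3), df, round(r, 3))
```

The paired test works on within-patient differences, which is why it suits tumor versus adjacent tissue and pre- versus post-operative comparisons from the same patients.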

Identifying DE-miRNAs and DE-mRNAs
The GEO2R online tool was used to analyze the GEO datasets. Of the 18 DE-miRNAs screened from the miRNA expression profile dataset GSE97811, six miRNAs were up-regulated and 12 miRNAs were down-regulated (Fig. 2A). After analyzing the mRNA

Predicting target genes and overlapping mRNA acquisition
The 2,064 target mRNAs of the DE-miRNAs predicted by miRNet (Table 1) were matched with the 479 DE-mRNAs obtained by GSE29044, and 52 overlapping mRNAs were identified for further analysis (Fig. 3).

Functional enrichment analysis of overlapping mRNAs
Fifty-two overlapping mRNAs were analyzed. Biological process (BP) included the positive regulation of transcription from RNA polymerase II promoter, cell division, and cell proliferation (Fig. 4A). Molecular function (MF) mainly included ATP and chromatin binding (Fig. 4B). The cellular components (CC) were mainly receptor complexes and proteinaceous extracellular matrix (Fig. 4C). Additionally, the KEGG pathway was mainly enriched in cancer-related pathways (Fig. 4D), such as miRNAs in cancer (hsa05206), proteoglycans in cancer (hsa05205), and the p53 signaling pathway (hsa04115).

hsa-miR-21-5p expression was significantly up-regulated in BC tissue and cells
Based on our analysis, we selected hsa-miR-21-5p for follow-up study because it had the greatest up-regulated expression. The RT-qPCR results showed that hsa-miR-21-5p expression was up-regulated in BC tissue when compared with adjacent tissue (Fig. 6A),  which was consistent with our results from the GEO database (Fig. 6B). Additionally, our results showed that hsa-miR-21-5p expression in BC cells and the derived exosomes was up-regulated, especially in MDA-MB-231 cells (Fig. 6C).

Exosome identification
After the exosomes were isolated from plasma, we used TEM to examine the exosomes and found that they were typically dish-shaped and contained low electron density substances (Fig. 7A). NTA showed that the exosome particle size was between 100 and 300 nm and was relatively uniform, and the diameter was approximately 128 nm (Fig. 7B). Western blotting detected the exosome protein marker TSG101, whereas Calnexin was not expressed (Fig. 7C). The results indicated that the exosomes were successfully extracted.

Elevated plasma hsa-miR-21-5p was tumor-derived and packaged into exosomes
RT-qPCR was used to determine hsa-miR-21-5p expression in the plasma-derived exosomes of BC patients before and after tumor resection. The expression was significantly down-regulated after tumor resection (Fig. 8A). Additionally, we found that plasma exosomal hsa-miR-21-5p expression in BC patients was positively correlated with hsa-miR-21-5p expression in BC tissue (Fig. 8B). We then cultured MDA-MB-231 and MCF7 cells, collected exosomes from the culture medium at different time points, and noted that exosomal hsa-miR-21-5p expression in the culture media of both cell lines increased over time (Fig. 8C).

Exosomal hsa-miR-21-5p was significantly up-regulated in BC patient plasma and had remarkable diagnostic value
When evaluating the diagnostic value of plasma exosomal hsa-miR-21-5p, we found that its expression was most significantly up-regulated in BC patients, but there was no significant difference in plasma-derived exosomes across lung, liver, cervical, and ovarian cancer patients (Fig. 9A). Additionally, ROC curve analysis showed that plasma exosomal hsa-miR-21-5p could distinguish healthy people from BC patients, with an area under the curve of 0.961 (95% CI [0.920-1.00], P < 0.05), and a sensitivity and specificity of 86.7% and 93.3%, respectively (Fig. 9B).
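The area under the ROC curve can be read as the probability that a randomly chosen patient has a higher marker value than a randomly chosen healthy control. The sketch below computes the AUC directly from that definition in Python, counting ties as one half. The function name and the values are hypothetical and do not reproduce the study data or its confidence interval.

```python
def roc_auc(control_values, case_values):
    """AUC as P(case > control) + 0.5 * P(case == control)."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical relative expression values.
controls = [0.8, 1.0, 1.1, 1.3, 2.2]
cases = [1.6, 2.0, 2.6, 3.1, 3.5]
auc = roc_auc(controls, cases)
print(auc)  # 0.92: 23 of the 25 case-control pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.961 should be read.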

Down-regulated expression of target genes TGFβR3 and EGFR in BC
The UALCAN analysis results showed that the expression of TGFβR3 and EGFR, the potential hsa-miR-21-5p target genes, was lower in BC tissue than in normal tissue. There was no significant difference in TGFβR3 expression across different BC stages (Figs. 10A-10B). The results of our HPA database analysis showed that, at the protein level, the expression of these potential target genes (TGFβR3 and EGFR) was also lower in BC tissue than in normal tissue (Fig. 10C).

TGFβR3 is associated with overall survival (OS) in BC patients
The Kaplan-Meier curve showed that the expression of hsa-miR-21-5p's potential target gene, TGFβR3, was significantly correlated with the OS of BC patients. Based on the log-rank test, EGFR expression was not significantly associated with OS in BC patients (Figs. 11A-11B).
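A Kaplan-Meier curve is built by multiplying, at each observed death time, the fraction of at-risk patients who survive that time. The sketch below implements the estimator in Python for a single group. It is an illustration under simplifying assumptions: the function name and follow-up data are invented, and the log-rank comparison between high- and low-expression groups is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    events[i] is True for a death and False for a censored observation.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Fraction of patients still at risk who survive past time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
for time, probability in curve:
    print(time, round(probability, 3))
```

Censored patients contribute to the risk set up to their last follow-up but do not lower the curve, which is what distinguishes this estimator from a simple proportion of survivors.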

DISCUSSION
BC is one of the most common malignant tumors in women. Its morbidity and mortality rates increase every year (Bray et al., 2018), making it the number one threat to women's health. BC diagnosis, treatment, and prognosis have therefore become a focus of current research. With continuing advances in gene detection, microarray and high-throughput sequencing technologies play an increasingly important role in investigating biomarkers related to tumor diagnosis, treatment, and prognosis. Bioinformatic analysis of BC miRNA and mRNA expression profiles can help identify biomarkers for BC diagnosis quickly and effectively. In this study, we analyzed two datasets from the GEO database and identified 18 DE-miRNAs and 479 DE-mRNAs. After constructing the miRNA-mRNA regulatory network, we selected hsa-miR-21-5p, which lay at the core of the network and was the most up-regulated miRNA, as the subject of follow-up research. Our experimental verification was consistent with the trends in the bioinformatics analysis, indicating that hsa-miR-21-5p expression was up-regulated in the tissues and plasma-derived exosomes of BC patients, as well as in BC cells and the exosomes in their culture media. Previous studies on miRNA-21 mainly concentrated on peripheral blood (Zhou et al., 2018) and tissue (Gao et al., 2012). The results of this study were also consistent with those of Wu, Zhu & Mo (2009). We further verified hsa-miR-21-5p expression in exosomes. We found that hsa-miR-21-5p expression in BC tissues was positively correlated with its expression in plasma-derived exosomes, and that its expression in plasma-derived exosomes decreased post-operatively, indicating that the tumor was the source of the increase in plasma-derived exosomal hsa-miR-21-5p. This extends the study of miRNA-21 in BC to plasma-derived exosomes.
Moreover, our results showed that hsa-miR-21-5p's sensitivity and specificity in BC diagnosis were 86.7% and 93.3%, respectively. Additionally, there was no significant difference in the hsa-miR-21-5p expression in plasma-derived exosomes in lung, liver, cervical, and ovarian cancer patients. This finding further confirms that plasma-derived exosomal hsa-miR-21-5p has the potential to serve as a biomarker for BC diagnosis.
To further explore the mechanism of hsa-miR-21-5p in BC, we used bioinformatics prediction to obtain two potential target genes: EGFR and TGFβR3. Our results show that, at both the protein and gene levels, EGFR and TGFβR3 expression was down-regulated in BC patient tissue. We speculate that their low expression may have been caused by negative regulation by hsa-miR-21-5p. TGFβR3 often acts as a tumor suppressor gene in various cancers, including lung cancer, pancreatic cancer (Hou et al., 2021), and head and neck cancer (Fang et al., 2020). As a tumor progresses, TGFβR3 expression decreases, which is associated with a poor patient prognosis (Hou et al., 2021). TGFβR3, also known as betaglycan, is the largest kind of transmembrane glycoprotein distributed on the cell surface. The gene is located on human chromosome 1p33-p32 and encodes the most abundant receptor in transforming growth factor-β (TGF-β) signal transduction (Hou et al., 2021). It plays a biological role by binding to specific receptors on the cell membrane. We analyzed TGFβR3 expression in 1,065 BC patients in the UALCAN database. Our results showed no significant relationship between TGFβR3 expression and the patients' BC stage (P > 0.05). Dong et al.'s (2007) study of BC samples showed that the loss of TGFβR3 expression was related to BC progression and that its expression was significantly correlated with the BC stage (P < 0.05). Our results differed from those of Dong et al. (2007), and there are a number of potential explanations for this. For example, Dong et al. (2007) did not include the BC patients' clinical information, such as age and race. Their data therefore could not be analyzed alongside the subjects included in the UALCAN database, which may have led to the inconsistent results. Our initial analysis showed that BC is a highly heterogeneous disease. Russnes et al. (2011) found clear heterogeneity across estrogen receptor-positive (ER+) tumors using microarray analysis at the genome, transcriptome, and epigenetic levels. Baretta, Olopade & Huo (2015) evaluated the prognostic impact of heterogeneity in hormone-receptor status in bilateral synchronous (SBC) and metachronous breast cancer (MBC) patients. In both patient cohorts they found that heterogeneity in hormone-receptor status could be used to predict OS and BC-specific survival. ER status had a greater prognostic value compared to progesterone receptor (PR) status.
It is possible that low TGFβR3 expression may reduce the binding rate of the ligand TGF-β, which in turn reduces the active signal transmitted to downstream genes, leading to a decrease in the tumor suppressor effect of the TGF-β pathway and ultimately to a poor prognosis for BC patients.

CONCLUSION
In conclusion, our study demonstrated that plasma-derived exosomal hsa-miR-21-5p could be used as a biomarker for BC diagnosis.

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author Contributions
• Yun He performed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
• Jiaoyan Yan and Ye Yang performed the experiments, authored or reviewed drafts of the paper, and approved the final draft.
• Jian Huang and Shu Zhang conceived and designed the experiments, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

Human Ethics
The following information was supplied relating to ethical approvals (i.e., approving body and any reference numbers): The Ethics Committee of the Affiliated Hospital of Guizhou Medical University granted approval to carry out the study within its facilities (Ethical Application NO. 2019032K and NO. 2019033K).

Data Availability
The following information was supplied regarding data availability: