Primary breast cancer biomarkers based on glycosylation and extracellular vesicles detected from human serum

Abstract
Background: Breast cancer is a very common cancer that can be severe if not discovered early. The current tools to detect breast cancer need improvement. Cancer has a universal tendency to affect glycosylation. The glycosylation of circulating extracellular vesicle‐associated glycoproteins and mucins may offer targets for detection methods and has been explored only in a limited capacity.
Aim: Our aim was to develop an approach to detect the aberrant glycosylation of mucins and extracellular vesicle‐associated glycoproteins from human sera using fluorescent nanoparticles, and to preliminarily evaluate this approach for the differential diagnosis of breast cancer.
Methods and results: The assay involved immobilizing glycosylated antigens using monoclonal antibodies and then probing their glycosylation by using lectins and glycan‐specific antibodies coated on Eu+3‐doped nanoparticles. Detection of mucin 1 and mucin 16 glycosylation with wheat germ agglutinin, and detection of the extracellular vesicle‐associated CD63, were found to have better diagnostic ability for localized breast cancer than the conventional assays for mucin 1 and mucin 16 based tumor markers when the receiver operating characteristics were compared.
Conclusions: These results indicate that successful differential diagnosis of primary breast cancer may be aided by detecting cancer‐associated glycosylation of mucin 1 and mucin 16, and total concentration of CD63, in human serum.


| INTRODUCTION
Breast cancer (BrCa) is the second most diagnosed cancer and the leading cause of cancer death in women, with an estimated 627 000 deaths in 2018. The incidence of BrCa is higher in countries with a high human development index, but incidence rates are on the rise in other countries as well. The differences in incidence rates that correlate with the human development index are most likely due to many social and economic factors. 1 Small primary tumors are treatable with surgery and have a high relative 5-year survival rate, as high as 100% for tumors ≤1 cm. 2 That is why early detection is important and screening for BrCa can reduce mortality. Detecting cancer early can also reduce healthcare costs, as early stage cancer does not require continued expensive treatment.
Mammography-based screening is the main modality of screening used for BrCa, but it has its limitations. It has been reported that overdiagnosis and consequent overtreatment, false-positive biopsies and radiation-induced BrCa are the main harms of mammography-based screening. 3 Dense breast tissue also decreases the sensitivity and accuracy of mammographic screening. 4 The breasts of premenopausal women are generally denser than those of older women. 5 On average, women from industrialized countries experience menopause between 50 and 52 years of age, with differences related to ethnicity, demographics, and lifestyle. 6 About 31% of women diagnosed with BrCa are under the age of 50. 1 Although BrCa is generally less common in premenopausal women, the tumors are on average larger and at a more advanced stage than those diagnosed in postmenopausal women, making the prognosis generally worse and the survival rate lower. 7 This could also to some extent be due to the fact that women are commonly recommended to start undergoing mammographic screening after the age of 45. 8 Estrogen and progesterone receptor negative (ER−, PR−) and human epidermal growth factor receptor 2 negative (HER2−), that is, triple negative, BrCa has a higher incidence in premenopausal women than in older women. 9 Premenopausal BrCa is suggested to have distinct biological characteristics that differentiate it from postmenopausal BrCa and contribute to its poor prognosis. 10 These differences highlight the importance of using a cohort of BrCa samples from the same age group (premenopausal/postmenopausal) in research for early diagnostics and new screening methods.
Currently, no serological biomarkers are used for BrCa diagnosis or screening due to the lack of sensitivity and specificity. The only FDA-approved serum markers for BrCa are based on mucin 1 (MUC1) epitopes (CA15-3, CA27.29) and are used for monitoring the disease. 11 However, other glycoprotein markers, such as cancer antigen 125 (CA125/mucin 16), cancer antigen 19-9 (CA19-9), and carcinoembryonic antigen (CEA), may also be elevated in some BrCa patients. 12 Presently, serological biomarkers have very limited utility for aiding clinical decisions concerning BrCa diagnosis or treatment.
Protein glycosylation is a co- and post-translational process that is readily influenced by the conditions of the cell. It has been extensively reported that glycosylation is affected by malignant transformation, and the microevolutionary processes of the cancer microenvironment guide the resulting glycovariant proteins toward a malignant phenotype. These glycovariants may contribute to all hallmarks of cancer and be found in bodily fluids, such as blood and urine. 13 These kinds of molecules may offer sensitive and minimally invasive ways to help diagnose early stage cancers if suitable methodology for detecting them is introduced. Cancer-associated glycoproteins may also be attached to the surfaces of extracellular vesicles (EVs), such as exosomes. 14 In particular, tetraspanin-30 (CD63) and its aberrant glycosylation have been associated with BrCa cell malignancy. 15 The CD63 glycoprotein is abundantly expressed on different subtypes of EVs and can be detected directly from human blood. 16 Currently, methods for early detection of BrCa are lacking, but there are several leads for new tumor markers, including CA125, CA15-3, and CD63 glycovariants. We used a cost-effective and high-sensitivity immunoassay-type approach to screen for glycovariants of these glycoproteins. We found that glycovariants of CA15-3 defined by wheat germ agglutinin (WGA), anti-T antibody, and anti-Tn antibody, glycovariants of CD63 defined by Ulex europaeus agglutinin (UEA), and glycovariants of CA125 defined by WGA and anti-T antibody were elevated in BrCa cell lines compared to a non-malignant breast cell line. Assays measuring the glycovariant markers and the total CD63 in serum performed significantly better in differentiating BrCa patients from healthy individuals than the conventional CA125 and CA15-3 assays.
The cancerous samples were mostly early stage BrCa samples, but no histopathological analyses were performed on them, so connecting breast cancer subtypes to certain glycovariants was not possible. The results imply that there is room for improvement upon the conventional tumor markers for differential diagnosis, although the results should be verified on a larger cohort.

| Clinical samples
Serum samples (N = 18) from healthy female volunteers were purchased from TUAS Lab (Turku, Finland) and serum samples (N = 16) from premenopausal BrCa patients were purchased from Discovery Life Sciences, Inc. (Huntsville, Alabama). The age of the healthy volunteers ranged from 31 to 49 years, and the age of the cancer patients from 33 to 50 years. The median age was 38.5 years for the controls and 42 years for the cases. All the cancer samples had been taken before the start of cancer treatment, all were from Caucasian females, and none of the patients had a history of benign disease. Two of the samples were from stage I, seven from stage II, five from stage III, and two from stage IV BrCa patients.

| Cell cultures
The

| Glycan-binding reporter molecules
The molecules used for coating the Fluoro-Max™ carboxylate-modified Eu+3-doped nanoparticles (Eu+3-NPs) are presented in Table 1. The coating and usage of Eu+3-NPs as bioconjugated reporter molecules has been described previously. 17 The conjugation of the GBPs was per-

| Screening for glycovariant biomarkers
All incubations were performed at room temperature and all washes were done using Kaivogen wash buffer diluted to 1× concentration. All assays were done in triplicate. The antibodies against CA125,

| Assays on human serum
The assays on human serum samples were performed essentially as in the screening for glycovariant tumor markers. The exceptions were that 80 ng of each of the biotinylated antibodies was used for the immobilization to the streptavidin plate, that 1.5 μL of serum was added into 28.5 μL of RED assay buffer for the CA15-3 and CA125 glycovariant assays, and that 6 μL of serum was added into 24 μL of RED assay buffer for the CD63 glycovariant assays. TSA-BSA was used as a blank. The CD63 glycovariant assay was also performed using Eu+3-NPs coated with anti-CD63 antibody. The concentrations of CA125 and CA15-3 were also measured using Fujirebio Diagnostics enzyme immunoassay kits according to the manufacturer's instructions and using all optional steps.
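The serum and buffer volumes stated above imply fixed dilution factors for each assay type. The following is a minimal sketch of that arithmetic; the function name `dilution_factor` is introduced here for illustration only and is not part of the original protocol.

```python
from fractions import Fraction


def dilution_factor(sample_ul, buffer_ul):
    """Return the dilution factor: total volume divided by sample volume."""
    sample = Fraction(str(sample_ul))
    buffer = Fraction(str(buffer_ul))
    return (sample + buffer) / sample


# Volumes taken from the text: CA15-3 and CA125 glycovariant assays
mucin_factor = dilution_factor(1.5, 28.5)
# Volumes taken from the text: CD63 glycovariant assays
cd63_factor = dilution_factor(6, 24)

print(f"CA15-3/CA125 assays: 1:{mucin_factor} dilution")
print(f"CD63 assays: 1:{cd63_factor} dilution")
```

Both mixtures have the same 30 μL total volume; only the serum fraction differs between the mucin and CD63 assays.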

| Statistical analyses
The glycovariant tumor marker screening results were plotted using RStudio 19 software with ggpubr 20 and ggplot2 21 R packages. The concentrations from serum sample measurements were calculated and the box plots were plotted using Origin 2016 22 software. The statistical differences (α = 0.05) between measured human serum concentrations of controls and cases were evaluated using the Wilcoxon rank sum test in RStudio. 19 Logistic regression probabilities for BrCa using the assay measurements as classifiers were calculated using RStudio. 19 Based on the probabilities, Receiver Operating Characteristics (ROC) and the ROC area under curve (AUC) were calculated and plotted with the pROC package 23 in RStudio 19 and the AUCs were compared with the same package using the bootstrap method. All p-values were corrected for multiple testing using the Benjamini-Hochberg method 24 in RStudio. 19

FIGURE 1 Schematic representation of the glycovariant assay principle. The biotinylated capture antibodies are immobilized on the surface of streptavidin-coated yellow microtiter wells. The target antigen is then recognized by the capture antibody. In the final step, Eu+3-doped nanoparticles coated with glycan-binding proteins (lectins or antibodies) are added to the wells: the proteins coated on their surface recognize altered glycans on the cancer-related target antigen. Between each step, the wells are washed to remove the unbound fraction of assay components. The measurement of time-resolved fluorescence is performed on a spot on the bottom of each well.

3 | RESULTS

| Glycovariant biomarker screening
Antigens CA125, CA15-3, and CD63 from three different BrCa cell lines, MCF7, SKBR3, and T47D, and the breast epithelial cell line MCF10A were immobilized from their concentrated spent media and their binding with glycan-specific reporter molecules was determined using the assay depicted in Figure 1.

| Glycovariant assays on clinical samples
The glycan-binding proteins (GBPs) conjugated on the reporter molecules that displayed significant binding with an immobilized antigen were used to assay clinical samples. These assays were denoted as antigen GBP, for example, CA125 WGA. In these cases, anti-T antibody and anti-Tn antibody are abbreviated as T and Tn, respectively. An assay using MGL was also performed due to previous results from a metastatic BrCa patient cohort. 18 The results from these assays are depicted in Figure 3. The assays which produced significant differences between the control and case groups, and the conventional assays, were analyzed for Receiver Operating Characteristics (ROC). The area under the ROC curve (AUC) was used to determine the overall clinical performance (Figure 4).
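The AUC used above has a simple rank-based interpretation: it is the fraction of case-control pairs in which the case has the higher assay value, which equals the Mann-Whitney U statistic divided by the number of pairs. The sketch below illustrates this under stated assumptions: the signal values are hypothetical and not data from this study, and the helper name `auc_from_scores` is introduced here for illustration.

```python
def auc_from_scores(controls, cases):
    """AUC as the fraction of case-control pairs where the case is higher.

    Ties count as half a win, matching the Mann-Whitney U convention.
    """
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical assay signals in arbitrary units (not study data)
control_signals = [1.0, 2.0, 3.0, 4.0]
case_signals = [2.5, 5.0, 6.0, 7.0]

auc_value = auc_from_scores(control_signals, case_signals)
print(f"AUC = {auc_value:.3f}")
```

An AUC of 0.5 corresponds to no discrimination between the groups and 1.0 to complete separation.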
The AUCs of the experimental assays were compared to the AUCs of the conventional assays using the bootstrap method with 10 000 replicates. Significant differences (α = 0.05) were found when comparing CA125 IA with CA15-3 WGA (p = .0089), and CA15-3 IA with CA125 WGA (p = .0204) and CA15-3 WGA (p = .0004). The comparisons remained significant after correction for multiple testing.
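A bootstrap comparison of two AUCs and a Benjamini-Hochberg adjustment can be sketched as follows. This is a simplified illustration, not the pROC implementation used in the study: it assumes both assays were measured on the same subjects, resamples controls and cases separately, and uses a normal approximation for the p-value. The measurement values and function names are hypothetical. The adjustment at the end is applied only to the three p-values reported above, which is an assumption about the family of tests; the study corrected all of its p-values together.

```python
import math
import random
import statistics


def auc(controls, cases):
    """Fraction of case-control pairs where the case is higher (ties count 0.5)."""
    wins = sum(1.0 if k > c else 0.5 if k == c else 0.0
               for k in cases for c in controls)
    return wins / (len(cases) * len(controls))


def bootstrap_auc_test(ctrl_a, case_a, ctrl_b, case_b, n_boot=2000, seed=1):
    """Return (AUC difference, two-sided p-value) for two paired assays."""
    rng = random.Random(seed)
    observed = auc(ctrl_a, case_a) - auc(ctrl_b, case_b)
    diffs = []
    for _ in range(n_boot):
        # Resample controls and cases separately; reuse the same subject
        # indices for both assays because the measurements are paired.
        ci = [rng.randrange(len(ctrl_a)) for _ in ctrl_a]
        ki = [rng.randrange(len(case_a)) for _ in case_a]
        diffs.append(auc([ctrl_a[i] for i in ci], [case_a[i] for i in ki])
                     - auc([ctrl_b[i] for i in ci], [case_b[i] for i in ki]))
    sd = statistics.stdev(diffs)
    if sd == 0:
        return observed, (1.0 if observed == 0 else 0.0)
    z = observed / sd
    return observed, math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical paired measurements in arbitrary units (not study data)
ctrl_a, case_a = [1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]
ctrl_b, case_b = [1, 5, 3, 8, 2, 9], [4, 6, 7, 2.5, 10, 3.5]
auc_diff, p_boot = bootstrap_auc_test(ctrl_a, case_a, ctrl_b, case_b)

# The three p-values reported in the text (assumed here to form one family)
reported = [0.0089, 0.0204, 0.0004]
adjusted = benjamini_hochberg(reported)
print(auc_diff, p_boot, adjusted)
```

Under the stated assumption about the family of tests, all three adjusted values stay below α = 0.05, in line with the statement that the comparisons remained significant.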
Logistic regression models for the combination of the experimental assays were also generated and compared in Figure 5. The combination of CA15-3 WGA and CD63 IA yielded the best AUC of the two-assay combinations, with an AUC of 0.965, which is very close to the AUC of 0.969 produced by the combination of all experimental assays.
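Combining two assays through logistic regression can be sketched as below. This is an illustrative example under stated assumptions, not the R analysis used in the study: the two-feature values are hypothetical, the model is fitted with plain batch gradient descent rather than a statistical package, and the function names are introduced here for illustration. The 0.5 probability cut-off follows the one highlighted in Figure 5.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Return the modelled disease probability for each sample."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


# Hypothetical (assay 1, assay 2) signals in arbitrary units, not study data
controls = [(0.2, 0.1), (0.5, 0.4), (0.1, 0.6), (0.7, 0.2), (0.4, 0.3)]
cases = [(0.9, 0.8), (0.6, 1.0), (1.1, 0.5), (0.8, 0.9), (0.3, 1.2)]
X = controls + cases
y = [0] * len(controls) + [1] * len(cases)

weights, intercept = fit_logistic(X, y)
probabilities = predict_proba(X, weights, intercept)

# Classify with the 0.5 probability cut-off
predicted = [1 if p >= 0.5 else 0 for p in probabilities]
n_correct = sum(int(p == t) for p, t in zip(predicted, y))
print(f"{n_correct} of {len(y)} samples classified correctly")
```

In this toy data neither feature alone separates every case from every control, while their combination does, which is the kind of gain a two-assay model can offer.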

| DISCUSSION
BrCa is the leading cause of cancer-related deaths in women worldwide. Its incidence rates are constantly rising and its aggressiveness is greater in younger, premenopausal women. Diagnosing aggressive BrCa at an early stage would greatly benefit the individuals affected and lower the treatment costs. Screening methods, mainly mammography, have been employed to aid early detection, but they come with downsides such as overdiagnosis and overtreatment, and costs that may exceed the financial benefits achieved. Circulating biomarkers have great potential for early detection of cancer but that potential is yet to be realized. Circulating CA15-3 levels are used for monitoring BrCa patients and CA125 concentrations have been previously found to be elevated in BrCa patients' blood. These proteins are known to be highly glycosylated and their glycosylation to be aberrant in BrCa.
EVs, such as exosomes, and EV-related molecules such as CD63 offer new targets for early cancer diagnostics due to the EVs being aberrantly modified by cancer cells. EVs display and carry a multitude of different molecules to be analyzed for the development of new analytical methods sometimes referred to as liquid biopsies. The tetraspanin CD63 is linked to EVs and is overexpressed in BrCa in which its aberrant glycosylation has been reported to be mediated by ribophorin II. 15 In this study, we found that CD63 and its possible [...] Although the correlation between cell malignancy and the secretion rate of EVs is not clear, our results indicate that the overall quantity of CD63+ EVs is elevated in BrCa. 25
The tumor marker CA125 measured from the mucin 16 glycoprotein is considered a valuable marker for ovarian cancer and its use is recommended for ovarian cancer screening in women with high risk. 26 It has also been reported to be elevated in BrCa patients. 12 The aberrant cancer-related O-glycosylation of mucins is generally considered to be based on core 1 glycosylation, meaning the T and Tn antigens 27 and their sialylated derivatives. 28 The antibodies against T and Tn antigens were tested for binding to CA125 derived from BrCa cell [...] GlcNAc is not present in the core 1 derived BrCa-associated glycans but it is present in the core 2 derived glycans which have been associated with estrogen receptor negative (ER−) BrCas. 28 The CA125 T and CA125 WGA assays were tested on serum samples and the CA125 T levels did not differ significantly between the controls and cases but the CA125 WGA levels did. The patient samples did not have the hormone receptor status available, so it is difficult to speculate whether the WGA bound to sialylated core 1 glycans or to the core 2 glycans that are commonly present in ER− BrCas. The ROC analysis showed that the diagnostic ability of the CA125 WGA was good with an AUC of 0.814. The AUC of CA125 WGA was significantly better than the AUC of CA15-3 IA which is currently the only FDA-approved serological BrCa marker.

FIGURE 5 Comparisons of disease probability generated using logistic regression. The probabilities were generated using individual or multiple assays' results from the entire cohort as the classifying variables. The cut-off of 0.5 is highlighted with dashed red line and the data points are colored according to the stage of the disease with stage 0 representing the healthy controls. The data points have transparency, which distorts the color of individual data points.
Multiple GBPs bound to BrCa-associated CA15-3 in the screening with cell line materials. The Concanavalin A (Con-A) lectin reactive CA15-3 levels have been found to distinguish between benign breast disease and BrCa in a different assay setup. 29 Con-A binds mannose structures that are generally present when N-glycosylation is initiated so it was speculated that truncated N-glycans of CA15-3 might be the BrCa-associated glycans that were detected in the assay. 29 In our screening, Con-A did not significantly bind to the CA15-3 secreted by cultured BrCa cells. This might be due to the specific glycosylation machinery of the cells that were used in this study or even the culture conditions, which can affect the sensitive glycosylation process. 30 Antibodies against the T and Tn antigens and the WGA bound most effectively, but MGL was also tested with clinical samples due to our previous promising results on metastatic BrCa samples. 18 The GBPs most likely bound CA15-3 for the same reasons speculated above for CA125, that is, the mucin-related core 1 and/or core 2 O-glycans. The GBPs that bound BrCa-associated CA15-3 in the screening were tested with the clinical samples. The CA15-3 WGA levels were found to be statistically different between the controls and cases. The ROC analysis revealed that the AUC achieved by using the CA15-3 WGA levels as a classifier was the highest of any of the tested assays with the value of 0.915. The AUC of CA15-3 WGA was significantly higher than either of the conventional assays' AUC. The CA15-3 WGA also distinguished metastatic BrCa patients from healthy individuals in our previous study, 18 so it is probably not limited to detecting localized BrCa.
Logistic regression models were also generated from the combination of the best-performing glycovariant assays and the probabilities were plotted in Figure 5. The combination of CA15-3 WGA and CD63 IA correctly classified several BrCa samples that were incorrectly classified as negatives (probability <0.5) with individual assays. The CD63 UEA was elevated only in MCF7, which is the least aggressive of the cell lines used. MCF7 was derived from a postmenopausal patient, which may explain the assay's poor performance in the differential diagnosis of the samples from premenopausal breast cancer patients, who often have aggressive tumors. The CA125 WGA was elevated only in SKBR3, which implies that this marker is elevated in poorly differentiated, hormone receptor negative adenocarcinomas.
The assay that performed best for the differential diagnosis of the samples, CA15-3 WGA, was elevated most in the MCF7 and T47D cell lines and not much in the SKBR3 cell line. It seems that this glycovariant could be elevated in hormone receptor positive, less aggressive adenocarcinomas. Even though two of the glycovariants found performed well in the differential diagnosis of the samples used, in the future a screening with an increased number of different breast cancer cell lines combined with a larger cohort of patients with more information on the breast cancer subtypes would best demonstrate the connections between the glycovariants and histopathological characteristics.

| CONCLUSIONS
The major limitation of this study was the cohort size and the inability to correlate breast cancer subtypes with certain markers, and in the future these assays will be evaluated on larger cohorts and using rele-

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon request.

ETHICAL STATEMENT
The collection of the clinical samples was approved by Advarra IRB and another ethics committee. The other ethics committee's location information was redacted by Discovery Life Sciences.

CONSENT TO PARTICIPATE
Written informed consent was obtained from all patients.

CONSENT FOR PUBLICATION
Not applicable.