Recombinant MUC1 probe authentically reflects cell-specific O-glycosylation profiles of endogenous breast cancer mucin: High-density and prevalent core2-based glycosylation

Knowledge on the O-linked glycan chains of tumor-associated MUC1 is primarily based on enzymatic and immunochemical evidence. To obtain structural information and to overcome limitations imposed by the scarcity of endogenous mucin, we expressed a recombinant glycosylation probe corresponding to six MUC1 tandem repeats in four breast cancer cell lines. Comparative analyses of the O-glycan profiles were performed after hydrazinolysis and normal-phase chromatography of 2-aminobenzamide labeled glycans. Except for a general reduction in the O-glycan chain lengths and a high density glycosylation, no common structural pattern was revealed. T47D fusion protein exhibits an almost complete shift from core2 to core1 expression with a preponderance of sialylated glycans. By contrast, MCF-7, MDA-MB231, and ZR75-1 cells glycosylate the MUC1 repeat peptide preferentially with core2-based glycans terminating mostly with a 3-linked sialic acid (MDA-MB231, ZR75-1) or a 2/3-linked fucose (MCF-7). Endogenous MUC1 from T47D and MCF-7 cell supernatants revealed O-glycosylation profiles almost identical to those of the respective recombinant probes, indicating that the fusion proteins reflected the authentic O-glycan profiles of the cells. The structural patterns in the majority of cells under study are in conflict with biosynthetic models of MUC1 O-glycosylation in breast cancer, which claim that the truncation of normal core2-based polylactosamine structures to short sialylated core1-based glycans is due to the reduced activity of core2-forming β6-N-acetylglucosaminyltransferases and/or to overexpression of competitive α3-sialyltransferase. Evidence on the profiles and densities of O-linked glycans on MUC1 is presented using a truncated recombinant probe as reporter protein of endogenous mucin glycosylation.
Utility of the approach using an artificial protein to probe in vivo O-glycosylation in tumor cell lines was validated by comparison with the endogenous mucin and the demonstration that the natural, cell-specific profiles of O-linked glycans are authentically reflected on the fusion protein. The results obtained for four cell lines reveal additional insight into the process of O-glycosylation in breast cancer cells.

The cell membrane-associated human mucin MUC1 is expressed by most epithelia (1) and some subsets of lymphocytes (2, 3). It is overexpressed by many carcinomas and an altered glycosylation pattern results in tumor-specific exposure of peptide epitopes (4), making MUC1 a promising tumor antigen with diagnostic, as well as therapeutic potential in the treatment of cancer (5-7).
Mature MUC1 consists of two subunits that are proteolytically derived from a common precursor peptide and form a stable heterodimeric complex. The smaller subunit contains a C-terminal cytoplasmic domain, the membrane spanning domain and a short extracellular sequence that is non-covalently linked to the larger, extracellular subunit, which contains the extensively O-glycosylated mucin domain (8). After its first appearance on the cell membrane the complex is internalized and recycled several times for follow-up sialylation in the trans-Golgi (9). Mature glycoforms of the mucin can remain on the cell surface or become shed by still unknown mechanisms (9). MUC1 contains five potential N-glycosylation sites, three of which are located in the membrane-associated subunit and two near the C-terminus of the extracellular subunit.
However, the bulk of glycosylation, which can make up between 50 and 80 % of the total mass, is O-linked to numerous threonine and serine residues in the mucin domain. This domain comprises a variable number of tandemly repeated 20 amino acid sequences, each containing five potential O-glycosylation sites (10, 11). Although each of these sites is an O-glycosylation target in vivo, the average density of glycans may vary considerably among MUC1 glycoforms. As an instructive example, tandem repeat glycopeptides that were prepared from MUC1 expressed in the lactating breast (12) or from the breast cancer cell line T47D (13) have been demonstrated to contain an average of 2.6 and 4.8 glycans, respectively. The pool of O-glycan structures produced by a single cell is the product of a complex biosynthetic process, which is not template guided and requires the ordered action of multiple glycosyltransferases. Accordingly, O-glycan patterns are often cell and tissue specific and may differ substantially from one cell type to another. The O-glycan profiles of MUC1 glycoforms from breast milk, urine and two breast carcinoma cell lines have been investigated so far. The O-glycan profile of lactation-associated MUC1 is dominated by core2-based linear or branched polylactosamine chains, which are substituted with up to three fucose residues (14). Mono- and disialylated structures were also present but accounted for less than 25 % of total glycans (15). Urinary MUC1, in contrast, exhibits an O-glycan profile with significantly shorter neutral and acidic glycans, which are based on core1 as well as core2 structures (16). According to three reports, tumor-associated glycoforms, which had been isolated from T47D (17) and BT 20 (18) breast carcinoma cell line supernatants or from cell lysates (19), were demonstrated to contain truncated precursor structures like core-GalNAc or the core1 disaccharide Gal(β1-3)GalNAc as well as its mono- and disialylated derivatives.
It has been proposed that breast cancer-associated changes in MUC1 O-glycosylation reflect a general switch from core2 to core1 expression, which leads to a reduction of O-glycan chain length (18) and may be the result of decreased or lacking expression of core2-specific β6-N-acetylglucosaminyltransferases (20). In addition, an increased α2,3-sialylation of core1 structures has been claimed to represent a biosynthetic stop signal and to inhibit core2 formation (21). However, due to the limited availability of chemical data, these hypotheses rely mainly on indirect enzymatic and immunochemical evidence, including the analysis of glycosyltransferase expression patterns by in situ hybridisation (22), northern blot analysis or enzyme activity assays (20). Immunochemical analysis of O-glycosylation inhibited cells (23)
To enable the design of efficient tumor vaccines, the cancer-associated glycosylation profiles on MUC1 repeat peptides need to be analysed in more detail and with state-of-the-art methodologies. Chemical studies on MUC1 O-glycosylation in cancer that have been reported so far suffered from severe limitations of sample amounts. To overcome the analytical problems that are associated with low sample amounts and to extend the data on MUC1 O-glycosylation in mammary carcinomas by chemical evidence, we have expressed a MUC1 fusion protein in the breast cancer cell lines ZR75-1, MDA-MB231, MCF-7 and T47D. O-glycan pools were released from the purified fusion proteins by hydrazinolysis and the 2-aminobenzamide labeled glycans were profiled by normal-phase HPLC. Complementary information on the O-glycan profiles was revealed by an independent approach using non-reductive β-elimination and methylation of the oligosaccharides followed by their mass spectrometric analysis.
To confirm that the recombinant probes reflect the authentic glycosylation profiles of the individual cell lines, we also analysed the endogenous mucin in two cases by mass spectrometric profiling.

Construction of the MUC1 fusion protein (MFP6)
A MUC1 tandem repeat sequence containing terminal restriction sites was constructed by annealing and ligating four 5´-phosphorylated 60mer oligonucleotides. 5´ overhangs were filled in with Klenow enzyme and the construct was cloned into the BamH I and Hind III sites of pBluescript SK (Stratagene, Heidelberg, Germany), resulting in pBS TR1. Since there is a single Sma I site in each tandem repeat sequence, limited Sma I digestion of a 3.4 kb Hind III / ApaL I cleaved MUC1 cDNA resulted in a series of fragments containing 2-40 tandem repeat sequences and migrating in the 0.2-3 kb range on a 1.6 % agarose gel.
Fragments ranging from 0.8 to 1.8 kb were eluted from the gel and ligated into Sma I / Hind III cleaved pBS TR1. Out of several clones containing 10-30 tandem repeats and the 3´ sequence of the original construct, a single clone with approximately 25 tandem repeats was chosen. BamH I digestion followed by limited Sma I digestion removed the 5´ sequence of the MUC1 cDNA and a random number of tandem repeats and resulted in several fragments, migrating between 3.5 and 5.5 kb on 1.2 % agarose gels. Fragments in the 4 kb range were eluted from the gel and ligated to a 300 bp Sma I / Not I fragment that was excised from pBS TR1 and represented the 5´ sequence of pBS TR1. The procedure resulted in several clones that contained 2-16 tandem repeats flanked by the 3´ and 5´ sequences of the original pBS TR1. A clone containing 6 tandem repeats was chosen and subcloned into the pCEP-PU expression vector (30) using the 5´ Nhe I and the 3´ Not I sites. pCEP-PU already contained the signal peptide of the BM40 extracellular matrix protein, followed by a hexa-histidine sequence, a myc tag and the Nhe I and Not I restriction sites that were used for in-frame insertion of the MUC1 tandem repeat construct. Restriction enzymes and all other DNA modifying enzymes were obtained from New England Biolabs, Frankfurt am Main, Germany.

Purification of MFP6
Conditioned supernatants from confluent cell layers were collected, centrifuged at 1000 x g at 4 °C for 10 min and dialyzed against several changes of demineralized water. Dialyzed supernatants were adjusted to 50 mM sodium phosphate pH 8.0, 200 mM sodium chloride, 1 mM imidazole, 5 mM 2-mercaptoethanol and 10 % ethanol and centrifuged at 20,000 x g (4 °C) for 45 min. Up to 1 L of the adjusted supernatant was loaded onto a column of 2 ml Ni-NTA Superflow (Qiagen, Hilden, Germany). The column was washed with 30 ml 50 mM sodium phosphate pH 6.5, 500 mM sodium chloride, 10 mM imidazole, 10 mM 2-mercaptoethanol and 10 % ethanol. After equilibrating the column with 5 ml 20 mM sodium phosphate, pH 6.5, proteins were eluted with 8 x 1 ml 0.2 M trifluoroacetic acid (TFA) in 10 % acetonitrile. Fractions 2-5 appeared to contain the fusion proteins and were subjected to HPLC purification on a reversed-phase column (Vydac 208TP1015, MZ Analysentechnik, Mainz, Germany). Samples were injected in 500 µl aliquots and the column was eluted with a gradient of 2 to 80 % acetonitrile in 0.1 % TFA over 30 min. A flow rate of 1 ml/min was used and the chromatogram was registered spectrophotometrically at 214 nm. Peaks were collected manually and the quality of the preparations was analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis.
The gels were stained with silver or blotted onto nitrocellulose membranes and the fusion proteins were detected with an anti-myc monoclonal antibody (Santa Cruz Biotechnology, Heidelberg, Germany). Purified MFP6 was quantified by its extinction at 280 nm.

Purification of endogenous mucin
Secretory MUC1 was isolated from the supernatants of T47D and MCF-7 cells by affinity chromatography on anti-MUC1 antibody columns (1 mg of a mixture of repeat peptide-specific monoclonal antibodies B27.29 and BW835 coupled covalently to NHS-activated sulfoxide. After a 2 h incubation at 60 °C the samples were spotted onto chromatography paper (Schleicher & Schüll, Dassel, Germany) and excess reagents were removed by ascending paper chromatography in n-butanol : ethanol : water (4:1:1). The application points were cut out and eluted with 500 µl of water in order to recover the labeled glycans, which do not migrate under the chromatographic conditions described above. Eluted fractions were filtered through 0.2 µm membranes in centrifugal microfiltration devices and stored at -20 °C.

Anion-exchange HPLC of 2-AB labeled O-glycans
Anion-exchange HPLC was performed according to (33). The HPLC system described above was used with a polymer-based anion exchange column (Q HyperD 10, 10 µm, 4.6 x 100 mm, Beckman Instruments, Muenchen, Germany) at a flow rate of 1 ml/min. Buffer A was water, buffer B was 500 mM ammonium formate pH 9.0. A gradient of 0 % B for 1 min, 0-5 % B over 12 min, 5-21 % B over 13 min, 21-80 % B over 25 min, followed by 80-100 % B over 4 min was used. Samples were dissolved in water and 20 µl were injected.
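The gradient program can be written out as a piecewise-linear function of run time. The following is a minimal sketch, not part of the published method: it assumes that the five segments run consecutively from injection (giving cumulative breakpoints at 1, 13, 26, 51 and 55 min) and that buffer B contains 500 mM ammonium formate. The names `GRADIENT`, `percent_b` and `formate_mm` are illustrative only.

```python
# Breakpoints (minutes, % buffer B), assuming the segments run back to back:
# 0 % B for 1 min, 0-5 % over 12 min, 5-21 % over 13 min,
# 21-80 % over 25 min, 80-100 % over 4 min.
GRADIENT = [(0.0, 0.0), (1.0, 0.0), (13.0, 5.0),
            (26.0, 21.0), (51.0, 80.0), (55.0, 100.0)]

BUFFER_B_FORMATE_MM = 500.0  # ammonium formate concentration of buffer B


def percent_b(t_min):
    """Return % buffer B at run time t_min by linear interpolation."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")


def formate_mm(t_min):
    """Return the ammonium formate concentration (mM) delivered at t_min."""
    return BUFFER_B_FORMATE_MM * percent_b(t_min) / 100.0


if __name__ == "__main__":
    for t in (0.5, 7, 13, 26, 51, 55):
        print(f"{t:5.1f} min: {percent_b(t):6.2f} % B, {formate_mm(t):6.1f} mM formate")
```

Under these assumptions the complete program takes 55 min, and the salt concentration stays below 25 mM for the first 13 min.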

Preparation of partially deglycosylated tandem repeat peptides
10 µg of MFP6 was partially deglycosylated by sequential treatment with neuraminidase

Generation of a fusion protein containing six MUC1 tandem repeats (MFP6)
Fusion proteins containing the secretory signal peptide of the BM40 extracellular matrix protein and 2-16 tandem repeats of a MUC1 cDNA were constructed. A hexa-histidine and a myc tag were introduced to allow glycosylation-independent affinity purification and immunodetection. MFP6 is a clone with six tandem repeats and was chosen as a glycosylation probe (Fig. 1A), since we noticed that fusion proteins with smaller repeat numbers were expressed poorly in the breast cancer cell lines used in this study (data not shown) and substantially larger ones were expected to be difficult to chromatograph in reversed-phase HPLC. DNA sequencing of the MFP6 construct confirmed the expected sequence given in figure 1A. Several tandem repeat sequences in this construct deviate from the ´conserved´ sequence in two positions, namely the PDTR (PESR) and PPAH (PAAH; PQAH) motifs. These sequence variants originate from the MUC1 cDNA and represent a general sequence polymorphism that has been described previously (11,13,34).

Expression and purification of MFP6
The breast carcinoma cell lines T47D, MCF-7, MDA-MB231 and ZR75-1 were transfected with the episomal expression vector pCEP-PU containing the MFP6 construct. Western blot analysis using an anti-myc monoclonal antibody revealed that MFP6 was expressed and secreted into the supernatant in the four cell lines (data not shown). Expression remained constant over several weeks and passages, as long as selective pressure was applied by culturing the cells in the presence of puromycin. Since serum reduction or depletion might have resulted in an artificial glycosylation pattern, 10 % fetal calf serum was always included in the media. MFP6 was isolated from the conditioned supernatants by affinity chromatography on immobilized Ni2+ ions and subsequent reversed-phase chromatography on a C8-silica column. Prior to further analysis, the MFP6 preparations were rechromatographed on the same column in order to confirm the expected quantities and the homogeneity of the sample, which was checked by gel electrophoresis and silver staining combined with Western blot analysis (Fig. 1B). According to photometric quantification of pure MFP6, the yields ranged from 0.5 to 2.0 mg protein per liter of conditioned supernatant.

Purification of endogenous MUC1
Endogenous MUC1 was eluted from the antibody affinity column at low pH and appeared in the first 3-6 fractions according to enzyme immunoassay detection with anti-MUC1 antibody B27.29. After dialysis and concentration by vacuum centrifugation the collected fractions were applied onto a gel permeation FPLC column with an exclusion limit of >1000 kDa and the mucin was effectively separated from low molecular weight proteins and peptides eluting at V_E > 2.5 V_0 according to the UV profiles at 280 nm. MUC1 positive fractions were identified by Western blot analysis with B27.29 antibody (Fig. 1C), and the absence of contaminating glycoproteins was verified by digoxigenin labeling of protein-bound glycans (Fig. 1C).

HPLC analysis of 2-AB labeled glycans reveals cell-specific O-glycan profiles
The  Fig. 2A). Under these conditions the glycan pools were resolved into more than eleven different species. Digestion with neuraminidase (C. perfringens), which cleaves α2-3 as well as α2-6 linked sialic acids, identified the acidic glycans and revealed the underlying neutral structures, which were assigned as GalNAc, Galβ1-3GalNAc (peak 1) and Galβ1-4GlcNAcβ1-6(Galβ1-3)GalNAc (peak 2; refer to figures 2A and 3A) according to external standards that were prepared from synthetic glycopeptides or commercially available glycoproteins. Peaks 3 and 5 appeared to be resistant to treatment with α2-3 sialidase (S. typhimurium), but digestion with neuraminidase (C. perfringens) resulted in the formation of GalNAc and Galβ1-3GalNAc, respectively. Accordingly, peak 3 was identified as NeuAcα2-6GalNAc and peak 5 as NeuAcα2-6(Galβ1-3)GalNAc, in accordance with the retention times of authentic, mass spectrometrically confirmed standard compounds.
Downloaded from http://www.jbc.org/ by guest on March 24, 2020
To analyze minor neutral constituents in the MCF-7 profile after enzymatic desialylation, we used an alternative low salt buffer system (50 mM ammonium formate in buffer B). The use of this system allowed the analysis of fucosylated structures, since residual monosialylated glycans were shifted to higher retention times (refer to peaks 7 and 8 in figure 3B). Peak 10 was sensitive to α1,3/4 fucosidase (X. manihotis) and digestion yielded a peak co-eluting with the core2 tetrasaccharide, implying that peak 10 corresponds to the core2-based Lewis X structure Galβ1-4(Fucα1-3)GlcNAcβ1-6(Galβ1-3)GalNAc. α1,2 fucosidase (X. manihotis) digestion of peak 9 revealed that this structure is also derived from the core2 tetrasaccharide, with a single α-fucose residue linked to position C2 of one of the two terminal galactose residues. Peak 11 represents a difucosylated core2 structure corresponding to Lewis Y, since α1,2 fucosidase (X. manihotis) cleaved off one residue and generated the Lewis X pentasaccharide (peak 10).
Several minor peaks eluted in the 45 to 55 min range and were also sensitive to fucosidase treatment. Concerted digestion with α1,2 and α1,3/4 fucosidase (X. manihotis) removed these minor signals and led to a corresponding increase of peak 12, which was assigned as a core2-based structure with two lactosamine units on the β1-6 branch. The spiked shape of this peak implies a heterogeneity, which is probably due to the presence of both type I and type II lactosamine units. Although it was not possible to identify the fucosylated derivatives of this structure in detail, it is evident that highly fucosylated polylactosamine-type glycans are present and account for approximately 5 % of the O-glycans that were derived from MFP6 expressed in MCF-7 cells. Note that fucosidase treatment did not affect any of the sialylated peaks that elute at 60 and 70 min, indicating that sialylated Lewis structures are not present or below the limit of detection. Moreover, fucosidase treatment did not affect the profiles of any of the other fusion proteins, implying that MCF-7 exhibits an unusual O-glycosylation pattern among the breast carcinoma cell lines analyzed in this study.
Summarizing the structural data in table 1, which were derived from the corresponding profiles in figure 2A, the O-glycans on MUC1 fusion protein revealed cell-specific expression patterns. Core2-based structures accounted for less than 5 % of the T47D derived glycans and the most prominent species of this cell line were represented by the α2,3 (to Gal) and α2,6 (to GalNAc) sialylated trisaccharides. Neutral glycans were of minor abundance and no fucosylated species were detected. A completely opposite pattern was revealed for MDA-MB231 and ZR75-1 cells, which primarily expressed sialylated glycans based on core2. The structures were generally larger and more complex and terminated preferentially with α3 (to Gal) linked sialic acid. While the profiles were qualitatively similar, the two cell lines exhibited quantitative differences with respect to the contribution of each core type and the relative amount of sialylated species. It is noteworthy that high proportions of core2-based glycans are associated with a high degree of α2,3 sialylation in the ZR75-1 profile. A distinct and unique profile was registered for MCF-7 fusion protein, which carried primarily core2-based, more extended neutral glycans, a considerable portion of which terminated with α2/3-linked fucose.
In case of T47D-MFP6 the results were in agreement with our previous profiling studies on the O-linked glycans analysed as alditols after reductive cleavage from endogenous MUC1 isolated from cell supernatants (17). Attempts to liberate O-glycans from endogenous mucin by applying hydrazinolysis failed despite extensive desalting and drying of the samples. These findings are in accordance with experience from other laboratories (31), in particular when small amounts of mucin in the low microgram range are treated.

MALDI-TOF analysis of permethylated O-glycan pools
Independent information on the profiles of O-linked chains and confirmation of structural assignments made on the basis of chromatographic criteria (see table 1) came from mass spectrometric analysis of the permethylated glycans, which were liberated by nonreductive β-elimination (Tab. 2 and Fig. 4). Each glycan species was represented in the MALDI mass spectra by its pseudomolecular ion M+Na and mass increment calculation revealed the composition of the oligosaccharides in terms of monosaccharide constituents. The results obtained for the endogenous mucin demonstrate that absence of core2 expression in T47D and prevalence of core2 expression in MCF-7 cells are authentically reflected in the profiles measured for the recombinant probes (Fig. 4, Tab. 2).
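The mass increment calculation mentioned above can be illustrated with a short sketch. It is not the authors' software. It assumes that each glycan is detected as a fully permethylated species with a free (non-reduced) reducing end, so that the sodium adduct mass is the sum of the permethylated residue increments plus the two terminal groups (CH3 and OCH3) plus the sodium cation. The residue values are standard monoisotopic increments computed from the elemental compositions given in the comments, and the function name `sodium_adduct_mz` is illustrative.

```python
# Monoisotopic increments (Da) of permethylated residues within a glycan chain.
RESIDUE_MASS = {
    "Hex": 204.09978,     # C9H16O5, e.g. Gal
    "HexNAc": 245.12632,  # C11H19NO5, e.g. GalNAc, GlcNAc
    "dHex": 174.08921,    # C8H14O4, e.g. Fuc
    "NeuAc": 361.17367,   # C16H27NO8, carboxyl group as methyl ester
}
END_GROUPS = 46.04187  # CH3 + OCH3 at the two chain termini (assumed free reducing end)
SODIUM_ION = 22.98922  # Na+ (atomic mass minus one electron)


def sodium_adduct_mz(composition):
    """Return the monoisotopic m/z of [M+Na]+ for a permethylated glycan.

    composition maps residue names to counts, e.g. {"Hex": 1, "HexNAc": 1}.
    """
    mass = END_GROUPS + SODIUM_ION
    for residue, count in composition.items():
        if count < 0:
            raise ValueError("residue counts must be non-negative")
        mass += RESIDUE_MASS[residue] * count
    return mass


if __name__ == "__main__":
    examples = {
        "core1 disaccharide (Hex1HexNAc1)": {"Hex": 1, "HexNAc": 1},
        "sialylated core1 (NeuAc1Hex1HexNAc1)": {"NeuAc": 1, "Hex": 1, "HexNAc": 1},
        "core2 tetrasaccharide (Hex2HexNAc2)": {"Hex": 2, "HexNAc": 2},
    }
    for name, comp in examples.items():
        print(f"{name}: m/z {sodium_adduct_mz(comp):.2f}")
```

Read in the other direction, the difference between two observed signals identifies the added residue: under these assumptions a shift of about 361.17 indicates one NeuAc and a shift of about 174.09 one fucose.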

O-glycan density on partially deglycosylated tandem repeats
In addition to the structural O-glycan profiles, the density of O-linked chains in the peptide sequence is a second important parameter in the characterisation of mucin O-glycosylation. To address this issue we used an array of exoglycosidases to degrade the O-glycans to the level of the peptide-bound N-acetylgalactosamine. Subsequent proteolytic degradation with clostripain, which cleaves once in each tandem repeat, resulted in a mixture of tandem repeat glycopeptides containing two to five N-acetylgalactosaminyl residues. The relative proportion of each glycoform was revealed on a semi-quantitative basis by reflectron MALDI-TOF mass spectrometry (Fig. 5).
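An average glycan density per repeat can be estimated from such a glycoform distribution as an intensity-weighted mean. The sketch below only illustrates the arithmetic: the intensity values are hypothetical and are not taken from Fig. 5, the function name `mean_glycan_density` is illustrative, and treating MALDI signal intensities as proportional to abundance is itself an assumption, consistent with the semi-quantitative character of the measurement.

```python
SITES_PER_REPEAT = 5  # potential O-glycosylation sites per 20 amino acid repeat


def mean_glycan_density(intensities):
    """Intensity-weighted mean number of GalNAc residues per tandem repeat.

    intensities maps the number of GalNAc residues on a repeat glycopeptide
    to the relative signal intensity of that glycoform.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return sum(n * i for n, i in intensities.items()) / total


if __name__ == "__main__":
    # Hypothetical distribution: three to five GalNAc, five being most intense.
    hypothetical = {3: 20.0, 4: 30.0, 5: 50.0}
    density = mean_glycan_density(hypothetical)
    occupancy = density / SITES_PER_REPEAT
    print(f"mean density: {density:.2f} glycans per repeat")
    print(f"site occupancy: {occupancy:.0%}")
```

With these made-up intensities the mean is 4.3 glycans per repeat, that is 86 % of the five potential sites.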
Tandem repeat glycopeptides derived from MFP6 expressed in T47D, ZR75-1 or MDA-MB231 contained three to five GalNAc residues, with the five GalNAc peptide giving the most intense signal in the mass spectra. The glycopeptide mixture that was derived from MFP6 expressed in MCF-7 exhibited two to five GalNAc per tandem repeat peptide.
In breast cancer, information on MUC1 O-glycosylation was primarily based on enzymatic and immunochemical studies (20,21,22,23), while structural evidence was limited (17,18,19). In summary, the latter studies had revealed that the tumor-associated profiles were characterized by reduced or deficient core2 expression. Studies on the enzymatic mechanisms underlying the aberrant O-glycosylation in breast cancer cells had revealed two important common features: 1) low activities or deletion of the core2-specific β6-N-acetylglucosaminyltransferase (C2GnT), which converts core1-disaccharide (Galβ1-3GalNAc) into the corresponding core2-trisaccharide (20), and 2) over-expression of the core1-specific α3-sialyltransferase introducing NeuAc at Gal of core1 (21,22). Since sialic acid represents a potent biosynthetic stop signal, the trisaccharide NeuAcα2-3Galβ1- by α3-sialylation catalyzed by the over-expressed enzyme (21). Hence, if the biosynthetic model was correct, α2-3 mono- and in particular α2-3 disialylated core2-based glycans (structures 6a, 6b, 7 in table 1) would not be expected to represent the major structures. A competition between C2GnT and α3ST would suggest an at least partial co-localization of the enzymes in one of the Golgi subcompartments. Expectedly, the core-specific enzyme should be found in the cis-Golgi and act on its substrate before sialyltransferases in the trans-Golgi come into play. However, the actual subcellular distribution of the two enzymes has not yet been published.
In a previous contribution we demonstrated that breast cancer cells exhibit an unexpected increase in the density of MUC1 O-glycosylation rather than a decrease (13).