Altered Expression of Sialylated Glycoproteins in Breast Cancer Using Hydrazide Chemistry and Mass Spectrometry

Sialylation is one of the altered protein glycosylations associated with cancer development. The sialoglycoproteins in cancer cells, however, remain largely unidentified because of the lack of a method for quantitative analysis of sialoglycoproteins. This manuscript presents a high throughput method for quantitative analysis of N-linked sialoglycoproteins using conditional hydrazide chemistry, liquid chromatography, and tandem mass spectrometry. We further applied the sialoglycoproteomic method to the profiling of breast cancer tissues and compared the findings with the results from total glycoproteomic analysis using the original hydrazide chemistry method. We identified altered expression of sialoglycoproteins, as well as total glycoprotein changes, associated with breast cancer. Using lectin and Western blot analysis, we characterized one of the sialoglycoproteins, versican, and confirmed that versican was mostly sialylated and elevated in breast cancer. Furthermore, we showed by immunohistochemistry that versican was detected in both cancer epithelial cells and peritumoral stromal cells. Tissue microarray analysis revealed that epithelial expression of versican was significantly associated with lymph node metastasis and pathological stage. This is the first quantitative sialoglycoproteomic and glycoproteomic analysis of breast cancer and noncancerous tissues. These findings show that the method is a significant addition to the tools for identifying altered expression of sialylated glycoproteins associated with breast cancer development.

Aberrant protein glycosylations are known to be associated with tumorigenesis and cancer progression steps including oncogenic transformation, tumor invasion, and metastasis (1). In breast cancer, elevated concentrations of highly glycosylated proteins, such as mucins, are associated with increased tumor burden and poor prognosis (2).
Several glycosylation changes including sialylation, fucosylation, increased branching of N-glycans, and incomplete biosynthesis resulting in truncated glycans are commonly found in cancer (3). By existing as terminal sugars on glycans attached to proteins and lipid moieties, sialic acids play critical roles in intermolecular interactions and the formation of cellular characteristics (4). Altered expression of sialylated glycoproteins has been discovered in many cancers such as colon, acute myeloid leukemia, cervix, and brain tumors (5)(6)(7)(8)(9)(10)(11)(12)(13)(14). Furthermore, partial removal of sialic acids on the cell surface has also been shown to increase both cell adhesion and aggregation in pancreatic cancer cells (15), which confirmed that sialylation indeed contributes to cell metastasis.
Recent developments in MS technology have fueled high throughput analyses of glycoproteins (16,17). Current strategies in studying glycoproteins generally start with enrichment of glycoproteins or glycopeptides from complex mixtures using different physicochemical methods followed by identification and quantification using mass spectrometry (18-22). We developed a chemical immobilization method of solid phase extraction of glycopeptides (SPEG) using hydrazide chemistry (19,23). After being captured on the solid support, formerly N-linked glycopeptides were released using peptide-N-glycosidase F (PNGase F) and were analyzed by MS for identification and quantification. Using this robust glycoproteomic approach, we identified and quantified thousands of new formerly N-linked glycopeptides from different tissues, cells, and bodily fluids (24,25).
To identify sialylated glycoproteins associated with breast cancer, we used a modified SPEG method for specific isolation of sialoglycopeptides and analyzed the sialoglycopeptides by LC-MS and tandem MS (MS/MS). Comparison of the glycopeptides isolated by the modified and original SPEG methods from human serum with sialidase treatment showed that the modified SPEG method specifically enriched N-sialoglycopeptides. We applied the modified and original SPEG methods to breast cancer tissues and paired noncancerous tissues and identified changes in expression of sialoglycoproteins and total glycoproteins associated with breast cancer. We further verified the overexpression of one of the sialoglycoproteins, versican, in cancer tissues by lectin precipitation and Western blot. The immunohistochemistry study using breast cancer tissue microarrays revealed that epithelial expression of versican was significantly associated with lymph node metastasis and pathological stage. This is the first unbiased sialoglycoproteomic and glycoproteomic profiling of breast cancer tissues to identify changes in expression of sialoglycoproteins associated with breast cancer. These findings present a new proteomic method to identify altered expression of sialoglycoproteins associated with cancer.

Frozen Tissue Specimens for Glycoproteomic Analyses and Western Blots
Samples and clinical information were obtained as part of a Johns Hopkins Medicine institutional review board-approved study. The surgically removed breast cancer tissues and adjacent normal breast tissues were obtained from the Breast Cancer Tumor Bank at The University of Texas M. D. Anderson Cancer Center (Houston, TX). Tissues were frozen immediately after surgery and stored at −80°C. The patients were staged according to the American Joint Committee on Cancer Staging System for Breast Cancer (26).

Formalin-fixed and Paraffin-embedded Tissue Sections for Immunohistochemistry Analysis
Whole tissue slides (5-μm sections) including 10 matched cases of breast cancer tissues and adjacent noncancerous tissues were obtained from the Department of Pathology at Johns Hopkins University under the Johns Hopkins Medicine institutional review board protocol. Tissue microarrays including 15 breast cancer tissues and case-matched adjacent noncancerous tissues were purchased from IMGENEX (San Diego, CA). Available tumor clinicopathological characteristics of tissue sections include age, stage, histological type, lymph node status, steroid hormone receptors (estrogen receptor/progesterone receptor), and p53 staining status.

Solubilization of Frozen Tissues and Extraction of Peptides
Frozen tissue blocks (100 mg each) were sliced into pieces of 1-3 mm³ and were incubated in 100 μl of 5 mM phosphate buffer followed by vortexing for 2-3 min. The samples were then sonicated for 5 min in an ice water bath. 100 μl of trifluoroethanol was added to the samples, which were incubated at 60°C for 2 h followed by sonication for 2 min. The proteins were reduced with 5 mM tributylphosphine at 60°C for 30 min and then alkylated with 10 mM iodoacetamide in the dark at room temperature for 30 min. The samples were diluted 5-fold with 50 mM NH4HCO3 (pH 7.8) to reduce the trifluoroethanol concentration to 10% prior to the addition of trypsin at a ratio of 1:50 (w/w, enzyme:protein). The samples were digested at 37°C overnight with gentle shaking. The precipitate was removed by centrifugation. Silver staining was used to test the efficiency of tryptic digestion. An average of 4 mg of total tryptic peptides was recovered from each tissue sample, as determined by the BCA assay.
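The dilution and enzyme arithmetic in this step can be checked with a short sketch. The volumes (100 μl of buffer plus 100 μl of trifluoroethanol, then a 5-fold dilution) and the 1:50 ratio come from the text; the function names, the neglect of the tissue volume itself, and the 1000 μg protein amount are illustrative assumptions.

```python
def tfe_fraction(buffer_ul, tfe_ul, dilution_fold=1.0):
    """Volume fraction of trifluoroethanol (TFE) after an optional dilution."""
    undiluted = tfe_ul / (buffer_ul + tfe_ul)
    return undiluted / dilution_fold


def trypsin_mass_ug(protein_ug, protein_per_enzyme=50.0):
    """Trypsin mass for a 1:protein_per_enzyme (w/w, enzyme:protein) ratio."""
    return protein_ug / protein_per_enzyme


# Volumes from the text: 100 ul of buffer plus 100 ul of TFE, then a 5-fold dilution.
print(f"TFE before dilution: {tfe_fraction(100, 100):.0%}")
print(f"TFE after dilution:  {tfe_fraction(100, 100, 5):.0%}")

# Hypothetical 1000 ug of protein (an assumed amount, not a value from the text).
print(f"Trypsin needed: {trypsin_mass_ug(1000):.0f} ug")
```

Under these assumptions the 1:1 mixture is 50% trifluoroethanol, and the 5-fold dilution brings it to the 10% stated above.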

Desialylation of Human Serum Peptides
The peptides were extracted from 20 μl of human serum as described previously (27). Then 400 μg of human serum peptides were resuspended in 100 μl of 50 mM ammonium acetate (pH 6.0) with 0.15 unit of neuraminidase (Roche Applied Science) followed by incubation at 37°C for 24 h.

N-Linked Glycopeptide Capture
N-Glycopeptides were isolated from the tryptic peptides using SPEG, which was developed and reported by our group (19,23). Briefly, 1 mg of peptides extracted from tissues was oxidized with 10 mM sodium periodate at room temperature for 1 h. Glycopeptides were covalently conjugated to a solid support via hydrazide chemistry, whereas nonglycopeptides were removed by washing with 1.5 M NaCl, methanol, and water prior to release of formerly N-linked glycopeptides from the solid support by PNGase F. The formerly N-linked glycopeptides were purified on C18 columns and resuspended in 40 μl of 0.4% acetic acid.

Sialylated N-Linked Glycopeptide Capture
The SPEG method previously developed in our laboratory was used to isolate sialylated N-glycopeptides by tuning the oxidation conditions (19,23). First, the sodium periodate solution and 1 mg of tryptic peptides from the tissues were cooled in an ice bath for 15 min. The peptides were then oxidized with 1 mM sodium periodate at 0°C for 15 min for specific oxidation of sialic acid. N-Linked sialoglycopeptides were covalently conjugated to a solid support via hydrazide chemistry, whereas nonsialylated glycopeptides were removed by washing before N-sialoglycopeptides were released from the solid support by PNGase F. The enriched formerly N-linked sialoglycopeptides were concentrated on C18 columns, dried down, and resuspended in 40 μl of 0.4% acetic acid.

Isotope Labeling of Peptides from Serum and MS Analysis
Two mg/ml d0 or d4 (four deuterium) or d4C4 (four deuterium and four 13C)-succinic anhydride was prepared in 100% dimethylformamide. After the peptides spiked with 1 μl of 1 nM angiotensin peptide were dried and resuspended in 20 μl of dimethylformamide/pyridine/H2O (50/10/40, v/v/v), 5 μl of succinic anhydride solution was added to the sample and incubated at room temperature for 1 h. Labeled peptides were mixed and cleaned up on a C18 cartridge prior to MS analysis using a MALDI 4800-TOF/TOF.

Mass Spectrometry Analysis
The peptides and proteins were identified by MS/MS analysis on an LTQ ion trap mass spectrometer. Of the 40 μl of glycopeptides extracted from 1 mg of tissues, 3 μl was injected into a 10-cm × 75-μm inner diameter microcapillary HPLC (μLC) column packed with C18 resin. A linear gradient of acetonitrile from 5 to 32% of solution B over 100 min at a flow rate of ~300 nl/min was applied. The HPLC mobile phases A and B were 0.2% formic acid in HPLC grade water and 0.2% formic acid in HPLC grade acetonitrile, respectively. In LC-MS mode, data were acquired in profile mode over the m/z range 400 to 2000 with a 3.0-s scan duration and a 0.1-s interscan delay. MS/MS was also turned on to collect collision-induced dissociation (CID) spectra in data-dependent mode. Each sample was analyzed three times to increase the accuracy of quantification.

Data Analyses
Peptide Identifications-The data files collected on the mass spectrometer (.raw) were converted to the mzXML format using the Trans-Proteomic Pipeline (TPP, version 4.3). MS/MS spectra were searched with X! Tandem (released January 1, 2010) against a human protein database (ipi.human.v2.28.fasta) containing 40,110 entries. The peptide mass tolerance was 2.0 Da, whereas the MS/MS tolerance was 0.6 Da. Other database search parameters were set as follows: oxidation of methionine (add methionine 16), a (PNGase F-catalyzed) conversion of Asn to Asp, and cysteine modification (add cysteine 57). One tryptic end and a maximum of two missed cleavage sites were permitted. The output files were evaluated by Protein Prophet (28). The Protein Prophet criterion was set to a probability score higher than 0.9 (error rate of 0.025), so that low probability protein identifications could be filtered out. For each identified peptide, the peptide sequence, protein name, precursor m/z value, peptide mass, charge state, retention time at which the MS/MS was acquired, and probability of the peptide identification being correct were recorded and output using INTERACT (29). The data associated with this manuscript are available for download on ProteomeCommons.org Tranche using the following hash code: QRdWgaeLJollB1Li85WCVW889kZpBPlhr3ajEw416lxxyJZ45-B0WodbsR3zIz7TFl/4RLQLoaKZlHeVEAYxE4Xaza58AAAAAAAAY-uA==. For single peptide identifications, the matched spectra can be found in supplemental Table 2.
Spectral Counting-The peptides identified on the LTQ with a probability score ≥0.9 were included in the spectral counting for each unique glycosylation site. Semitryptic or shared peptides were excluded from spectral counting. The peptides with ≥2 spectral counts were used for quantitation.
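The filtering rules described for spectral counting can be sketched as follows. This is a minimal illustration rather than the software used in the study: the record layout (the keys `site`, `probability`, `semitryptic`, and `shared`), the function name, and the example records are all assumptions made for the sketch.

```python
from collections import Counter


def spectral_counts(psms, min_probability=0.9, min_count=2):
    """Count spectra per glycosylation site and keep sites usable for quantitation.

    Each peptide-spectrum match (PSM) is a dict with the hypothetical keys
    'site', 'probability', 'semitryptic', and 'shared'.
    """
    counts = Counter(
        psm["site"]
        for psm in psms
        if psm["probability"] >= min_probability
        and not psm["semitryptic"]
        and not psm["shared"]
    )
    # Only sites with at least min_count qualifying spectra are quantified.
    return {site: n for site, n in counts.items() if n >= min_count}


# Invented example records, not data from the study.
example_psms = [
    {"site": "P1:N57", "probability": 0.99, "semitryptic": False, "shared": False},
    {"site": "P1:N57", "probability": 0.95, "semitryptic": False, "shared": False},
    {"site": "P1:N57", "probability": 0.50, "semitryptic": False, "shared": False},
    {"site": "P2:N330", "probability": 0.97, "semitryptic": True, "shared": False},
    {"site": "P3:N12", "probability": 0.92, "semitryptic": False, "shared": False},
]
print(spectral_counts(example_psms))
```

In this invented example only the first site survives: its low-probability spectrum is dropped, the semitryptic match is excluded, and the last site has a single spectrum.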
Lectin Precipitation-Western Blot of Human Versican-Agarose-bound SNA (Vector Laboratories) (50 μl of 50% slurry) was prewashed five times with 500 μl of binding buffer containing 10 mM HEPES, 0.15 M NaCl, and 0.1 mM CaCl2 (pH 7.0). Tissue proteins (30 μg) were diluted in 100 μl of binding buffer and then incubated with SNA lectin at room temperature for 1 h. The SNA-bound and unbound fractions were collected separately by centrifugation and analyzed by Western blot using rabbit anti-human versican polyclonal antibody (Abcam).
Western Blot Analysis of Human Versican-Protein extracts from tissues were resolved by SDS-PAGE and transferred electrophoretically onto a nitrocellulose membrane. The membrane was blocked with 5% nonfat milk, 0.1% TBS-Tween 20 at room temperature for 2 h. The membrane was then probed with rabbit anti-human versican antibody (Abcam) (1:2000) at 4°C overnight, followed by three 15-min washes with 0.1% TBS-Tween 20. Horseradish peroxidase-conjugated goat anti-rabbit IgG secondary antibody (Thermo Scientific) was added at 1:2000 dilution, followed by a 1-h incubation at room temperature. After three washes with 0.1% TBS-Tween 20, the signals were visualized with SuperSignal West Femto maximum sensitivity substrate (Thermo Scientific).
Immunohistochemistry Analysis of Human Versican-Tissue sections were first baked at 60°C for 1 h. The slides were then deparaffinized in xylene and rehydrated through serial ethanol. Antigen retrieval was performed by heating the slides in 0.05% Tween, 0.01 mol/liter sodium citrate buffer (pH 6.0) for 20 min in a steamer. After blocking the endogenous peroxidase activity with 0.03% hydrogen peroxide for 10 min (Envision kit; Dako), the slides were incubated with monoclonal antibody 2B1 against versican (Seikagaku) (1:50) at room temperature for 1 h; the monoclonal antibody 2B1 recognizes the hemagglutinin-binding domain of versican. Other procedures, such as washing and incubation with the chromogen substrate, were performed according to the manufacturer's instructions (Dako).
The slides were examined for staining intensity and immunoreactivity location. The intensity of staining was graded visually as no staining (0), weak (+), moderate (++), and strong (+++) when ≥10% of cells were stained. The chi-squared test was used to analyze the relationship between versican expression and clinicopathologic features. p < 0.05 was considered statistically significant.
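The chi-squared test named above can be illustrated for the simplest case, a 2 × 2 contingency table. The sketch below is not the analysis performed in the study: it assumes staining grades have been collapsed into two groups, applies no continuity correction, and uses invented counts. For one degree of freedom the p value follows from the complementary error function, so only the standard library is needed.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p value (1 degree of freedom) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented counts: rows are two clinical groups, columns are weak vs. strong staining.
statistic, p_value = chi_squared_2x2([[10, 20], [20, 10]])
print(f"chi-squared = {statistic:.3f}, p = {p_value:.4f}")
print("significant at p < 0.05" if p_value < 0.05 else "not significant")
```

Tables with more than two rows or columns (for example, all four staining grades) have more degrees of freedom and need the general chi-squared distribution, which this sketch does not cover.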

A Proof of Concept Study of Sialylated Glycopeptide Isolation Using a Modified SPEG Method-To identify proteins preferentially sialylated in breast cancer tissues, the sialylated glycopeptides were specifically isolated. We used a low concentration of sodium periodate to selectively oxidize the terminal sialic acids of sialoglycopeptides to aldehydes, followed by immobilization of the oxidized glycopeptides on a solid support (30), and released the formerly N-linked sialoglycopeptides with PNGase F. This method is modified from our previously reported solid phase extraction of N-linked glycopeptides method, which isolates the total N-linked glycopeptides (23).
To determine whether the modified SPEG method was capable of specifically capturing sialylated glycoproteins, human serum was used to evaluate the modified procedure. Equal amounts of serum peptides were used in the following three analyses. For isolation I, peptides were desialylated with neuraminidase and were captured by modified SPEG, which was specific to N-linked sialylated glycopeptides. For isolation II, peptides were desialylated with neuraminidase and captured by original SPEG, which isolated the total N-linked glycopeptides. For isolation III, peptides without desialylation were captured by modified SPEG, which was specific to N-linked sialylated glycopeptides. The products from the three groups were labeled with d0, d4 (four deuterium), or d4C4 (four deuterium and four 13C)-succinic anhydride, respectively. Angiotensin peptide was spiked into each sample and was used as an internal control to monitor the labeling efficiency and completeness.
Human α-1-acid glycoprotein is one of the glycoproteins identified in this proof of principle study (Table I). Human α-1-acid glycoprotein is known to have five potential N-linked glycosylation sites. The peptide of 1916 Da, QDQCIYN2 TTYLNVQR, was one of the N-linked glycopeptides that fell within the detectable mass range of MALDI-TOF/TOF (800-4000 Da). After succinic anhydride labeling, the peptide mass of 1916 Da was shifted to 2016 Da with d0 labeling (adding 100 Da), 2020 Da with d4 labeling (adding 104 Da), and 2024 Da with d4C4 labeling (adding 108 Da). As a control, the intensity of each peak from the labeled angiotensin peptide was detected at a similar level, with no free angiotensin peak detected, indicating that the labeling was complete and that the labeling efficiency was equivalent for each specimen (Fig. 1A). The desialylated glycopeptide of human α-1-acid glycoprotein was barely detectable in isolation I by sialylation-specific capture (Fig. 1B, d0) but was detectable in isolation II by total glycopeptide capture (Fig. 1B, d4). The glycopeptide without desialylation was detected with lower intensity in isolation III by the sialylation-specific method compared with isolation II (Fig. 1B, d4C4). The quantification of other glycoproteins from the serum was similar (Table I). However, one peptide from hemopexin was detected in isolation I in almost the same amount as in isolation III, which could have been caused by incomplete desialylation or by nonspecific sialoglycopeptide capture for this protein. Taken together, these results indicate that the modified SPEG method can be used for enrichment of formerly N-linked sialoglycopeptides.
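The mass shifts quoted for the three isotopic forms of the label can be reproduced with a few lines. The nominal shifts (100, 104, and 108 Da) and the 1916 Da peptide mass are taken from the text; the function name and the `n_labels` parameter are illustrative, and a single label per peptide is an assumption that matches the shifts reported above.

```python
# Nominal mass added per succinyl group, as quoted in the text.
LABEL_SHIFT_DA = {"d0": 100, "d4": 104, "d4C4": 108}


def labeled_mass(peptide_mass_da, label, n_labels=1):
    """Nominal mass of a peptide carrying n_labels succinyl groups of one isotopic form."""
    return peptide_mass_da + n_labels * LABEL_SHIFT_DA[label]


# The 1916 Da glycopeptide of alpha-1-acid glycoprotein in each of the three isolations.
for label in LABEL_SHIFT_DA:
    print(label, labeled_mass(1916, label))
```

The 4 Da spacing between the three forms is what allows the same peptide from isolations I, II, and III to be compared within a single MALDI spectrum.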
Identification of Formerly N-Linked Sialylated Glycoproteins Associated with Breast Cancer-To identify the altered expression of sialoglycoproteins in breast cancer tissues, the sialoglycopeptides isolated from three breast cancer and paired noncancer tissues using sialylation-specific capture were analyzed on the LTQ using MS/MS. The MS/MS spectra were searched by X! Tandem against the human protein database to assign the peptide sequences and were subsequently analyzed by Peptide Prophet (28) to determine the error rate of peptide assignment. With a minimum Peptide Prophet score of 0.9 (error rate of 0.025), a total of 205 unique formerly N-linked sialoglycopeptides with consensus N-linked glycosylation sites were identified, representing 148 N-linked sialoglycoproteins (supplemental Table 1).
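A check for the consensus N-linked glycosylation site can be sketched as below. The sketch assumes the commonly used N-X-S/T sequon (X being any residue except proline); the text does not spell out the exact motif definition applied in the study. The peptide is the α-1-acid glycoprotein sequence quoted earlier, written without its glycosylation-site marker, and the mass constant is the standard monoisotopic difference between Asp and Asn that results from PNGase F release.

```python
import re

# N-X-S/T with X not proline; the lookahead lets overlapping sequons be found.
SEQUON = re.compile(r"N(?=[^P][ST])")

# Monoisotopic mass gain when PNGase F converts the glycosylated Asn to Asp.
ASN_TO_ASP_SHIFT_DA = 0.984016


def sequon_positions(peptide):
    """1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide)]


peptide = "QDQCIYNTTYLNVQR"
print(sequon_positions(peptide))
print(f"Expected mass gain per converted site: {ASN_TO_ASP_SHIFT_DA} Da")
```

Only the first Asn of this peptide lies in a sequon (N-T-T); the second is followed by V-Q and is not reported.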
The sialoglycopeptides with spectral counts of at least 2 were used for quantification. The spectral counts of each peptide were summed over the three samples of each type. According to spectral counting, 90 N-linked sialoglycopeptides were identified with at least 2-fold changes in breast cancer tissues compared with their matched noncancerous tissues. To investigate whether the changes were in protein sialylation level or in total glycoprotein level, total glycopeptides were also isolated from the same three pairs of breast cancer and noncancerous tissues using the original SPEG method. Using the same criteria as for quantitation of sialoglycopeptides, 43 of the 90 glycopeptides with sialylation changes in breast cancer were also identified and quantified in total glycopeptides (Table II). Of these 43 glycopeptides, 27 also had total glycopeptide changes (Table II), such as versican core protein; 13 did not change in total glycosylation level; and 3 showed the opposite alteration in total glycosylation level.
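The comparison between sialylation-specific and total glycopeptide changes can be sketched as a small classifier. This is an illustrative reading of the 2-fold criterion, not the authors' code: the function names, the handling of zero counts, and the example counts are assumptions. The final line simply checks that the three categories reported in the text add up to the 43 glycopeptides quantified by both methods.

```python
def direction(cancer, normal, fold=2.0):
    """Return 'up', 'down', or 'unchanged' for summed spectral counts."""
    if cancer > normal and cancer >= fold * normal:
        return "up"
    if normal > cancer and normal >= fold * cancer:
        return "down"
    return "unchanged"


def compare(sialo_counts, total_counts):
    """Classify one glycopeptide from (cancer, normal) counts in both isolations."""
    sialo = direction(*sialo_counts)
    total = direction(*total_counts)
    if sialo == "unchanged":
        return "no sialylation change"
    if total == sialo:
        return "both changed"
    if total == "unchanged":
        return "sialylation only"
    return "opposite"


# Invented (cancer, normal) spectral counts for three hypothetical glycopeptides.
print(compare((10, 2), (8, 3)))   # sialylation and total both increased
print(compare((10, 2), (5, 4)))   # only the sialylated form increased
print(compare((10, 2), (2, 9)))   # sialylated form up, total down

# Category sizes reported in the text.
print(27 + 13 + 3)
```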
Verification of One of the Identified Sialoglycoproteins, Human Versican, in Breast Cancer Development-Although human versican has been identified as one of the extracellular space matrix proteins with increased abundance in breast cancer tissues, only a few studies have analyzed the function of versican. It may play a role in intercellular signaling, connecting cellular reaction with the extracellular matrix, and regulation of cell motility, growth, and differentiation (32). Furthermore, the C-terminal G3 domain of versican was found to influence local and systemic tumor invasiveness in preclinical murine models (33). Most recently, Du et al. (33) found that EGFR signaling was an important pathway in the invasiveness and metastasis of versican G3-mediated breast cancer tumor.
FIG. 1. Evaluation of sialylation specific isolation of human α-1-acid glycoprotein. Equal amounts of serum peptides were used in three isolations. For isolation I, peptides were desialylated and captured by modified SPEG. For isolation II, peptides were desialylated and captured by original SPEG. For isolation III, peptides without desialylation were captured by modified SPEG. A, labeling control of equal amounts of angiotensin with d0, d4, and d4C4. B, peptide of human α-1-acid glycoprotein from isolations I, II, and III was labeled with d0, d4, and d4C4.
To verify the sialylation of the proteins identified in this study, SNA, a lectin that preferentially binds to sialic acid attached to terminal galactose in α-2,6 linkage (34), was used to determine the protein sialylation state. SNA was incubated with a pool of proteins extracted from five breast cancer tissues to capture the sialylated glycoproteins. The five cases were patients diagnosed at stage I (1 case), stage II (3 cases), and stage III (1 case) of the cancer. The SNA-bound fraction and SNA-unbound fraction were used in Western blot analysis with anti-versican antibody. The anti-versican antibody can detect a 70-kDa N-terminal fragment of versican V1 that was cleaved by ADAMTS-1/4 (35,36). Fig. 2A shows that most, if not all, versican was α-2,6-sialylated in the breast cancer tissues because it was mainly detected in the SNA elution fraction but was almost undetectable in the SNA-unbound fraction (SNA flow-through fraction).
To further validate the quantitative difference of versican in cancer and noncancerous tissues, versican was examined in the pooled proteins of the five breast cancer tissues and the pooled proteins of patient-matched noncancerous tissues. The same amount (40 μg) of protein from each pooled sample was analyzed by Western blot using the anti-versican antibody (Fig. 2B). The results showed that the total protein level of versican was also elevated in breast cancer tissues compared with the patient-matched noncancerous tissues, which was consistent with the proteomic results (Table II). The overall SNA analysis results show that versican is mostly sialylated and that the total protein level of versican may be associated with breast cancer.
The protein expression level of versican was further investigated in 25 paired cases of breast cancer and noncancer tissue sections using immunohistochemistry. Versican antibody staining was observed in cancerous epithelial cells and peritumoral stromal cells (Fig. 2C, panels I-III) but not in noncancerous epithelial cells or their adjacent stromal cells (Fig. 2C, panels I, II, and IV). Furthermore, the cancer epithelial cells showed stronger staining of versican than the peritumoral stromal cells. The correlation between epithelial versican expression and clinical factors of breast cancer was then analyzed. The epithelial versican staining of cancer cells showed significant correlation with lymph node metastasis (p = 0.0102) and pathological stage (p = 0.0254) (Table III).

DISCUSSION
Aberrant protein glycosylation has been shown to be correlated with cancer development and progression (37)(38)(39)(40). Much recent progress has been made in identifying glycoproteins and their glycosylation patterns as potential biomarkers for various cancers. Yang et al. (41) used dual-lectin affinity chromatography and LC-MS to identify potential urinary glycoprotein biomarkers for bladder cancer. A label-free identification method has also been developed by Chen et al. (42) to quantify glycoproteins in hepatocellular carcinoma using nonglycopeptides derived from glycoproteins, which ultimately allowed the quantification of proteins at nanograms per milliliter. Li et al. have developed a multiplexed bead assay for profiling glycosylation patterns on some known serum protein biomarkers of pancreatic cancer (61). Using this method, they found that certain lectin responses on α-1-β glycoprotein and serum amyloid P could significantly distinguish pancreatic cancer from normal controls. A strategy for the discovery of cancer glycoprotein biomarkers in serum was recently described by Narimatsu et al. (43). To identify low abundance glycoprotein biomarkers, this strategy combines a quantitative real time PCR array for glycogene analysis, a lectin microarray for glycan structure analysis, and isotope-coded glycosylation site-specific tagging for glycopeptide analysis using mass spectrometry (43).
In the present study, we profiled N-sialoglycopeptides in breast cancer using a modified SPEG method and LC-MS and MS/MS analysis. First, we evaluated the specificity of the modified SPEG method for isolating formerly N-linked sialylated glycopeptides. Second, we applied the sialoglycopeptide profiling method to the analysis of human breast cancer tissues and paired noncancerous tissues to identify the changes in sialoglycopeptides associated with breast cancer. Third, when the altered sialoglycopeptides in breast cancer were compared with the changes in total glycopeptides, we found that most of the altered sialoglycopeptides were also changed in total glycopeptides (27 of 43). Thirteen glycopeptides were identified with changes in sialylation but not in total glycopeptides. Three glycopeptides were identified with opposite changes in sialylation compared with total glycosylation. Fourth, sialylation of one identified sialoglycoprotein, versican, was further confirmed using SNA precipitation and Western blot analysis. Finally, the immunohistochemistry analysis of a breast cancer tissue microarray revealed that versican may play a role in breast cancer metastasis. Overall, our profiling of formerly N-sialoglycopeptides identified some interesting candidates; thus, the modified SPEG method shows considerable potential for studying formerly N-sialylated glycopeptides.
Two major methods are used for capturing glycoproteins: lectin affinity (44) and hydrazide chemistry (SPEG) (19). Whereas the lectin capture method is based on affinity capture mediated by a lectin column, the SPEG method chemically immobilizes glycoproteins or glycopeptides on the hydrazide support through either N-linked or O-linked oligosaccharides. Because the glycopeptides are released from the hydrazide beads by peptide-N-glycosidase, the glycosylation sites can be identified. However, analysis of glycopeptides isolated by the SPEG method cannot provide information about the glycan structures because glycans are removed from the peptides before mass spectrometry analysis. The specificity of glycoprotein capture using lectins may not be as high as that of chemical immobilization. One possible reason is the decreased efficiency in removing nonglycopeptides with the mild washing used on lectin affinity columns. Furthermore, the glycosylation sites of glycoproteins are not readily identifiable by lectin-based methods (45). In conclusion, the two capture methods are complementary to each other.
In the present study, we used the modified SPEG approach to target sialylated proteins by selective oxidation of sialic acid on glycans (19,23). Proteins were first digested into peptides that contained both glycosylated peptides and nonglycosylated peptides. Because the cis-diol groups of carbohydrates in glycopeptides could be oxidized to aldehydes by sodium periodate (10 mM), the aldehydes from carbohydrates then formed covalent hydrazone bonds with hydrazide groups that were immobilized on a solid support. Nonglycosylated peptides were washed away, whereas the glycosylated peptides remained on the solid support. As a result, the formerly N-linked glycosylated peptides were released from the solid phase using PNGase F, and the isolated peptides were identified using LC-MS/MS. Therefore, in a single analysis, our method identified the N-linked glycosylated proteins, the site(s) of N-linked glycosylation, and the relative quantity of the identified glycopeptides. Moreover, the chemical reaction can potentially be used to selectively oxidize and capture sialic acid-containing glycopeptides using low concentration of sodium periodate (1 mM) and low temperature (0°C) (30). This method could oxidize sialic acid with all kinds of linkage without preference.
To evaluate the specificity of the sialoglycopeptide capture, serum proteins with and without desialylation were captured using the sialylation-specific capture. Several serum proteins showed reduced or no peptide captured with sialylation-specific capture after desialylation. However, glycoproteins such as hemopexin did not show any reduction in the capture after desialylation. The possible reasons could be the following: 1) the desialylation prior to sialylated glycopeptide capture might not have been complete for some glycoproteins such as hemopexin, and 2) the sialoglycopeptide isolation method may not be specific to all proteins. However, the majority of identified proteins were greatly enriched for the formerly N-linked sialylated glycopeptides using this method.
Other approaches for identifying sialylated glycoproteins have been reported. Using lectin selection coupled with LC-MS/MS, Zhao et al. (47) identified 130 sialylated glycoproteins. Similar to the phosphopeptide enrichment based on the binding of titanium dioxide to negatively charged phosphopeptides (48), Larsen et al. (49) also enriched sialylated glycopeptides with titanium dioxide. Exploiting the negative charge of these peptides, Sickmann and co-workers (21) collected the unbound fraction from strong cation exchange columns to enrich sialylated glycopeptides. Ghesquière et al. (50) also reported use of the diagonal chromatographic technology and neuraminidase treatments to enrich sialylated glycopeptides. All of these approaches enrich sialylated glycopeptides to some degree, with differing specificity for sialoglycopeptides. Recently, the modified SPEG method has been applied to the analysis of cell surface N-linked glycoproteins; however, the specificity for sialylated glycopeptides has not yet been determined (51,52). Kurogochi et al. (46) reported a strategy for the analysis of sialic acid-containing glycopeptides by specific oxidation of sialoglycopeptides and conjugation to a hydrazide solid support. The conjugated sialoglycopeptides were released from the solid support by hydrolysis of the hydrazone linkage between glycopeptides and hydrazide beads under acidic conditions. The released glycopeptides with regenerated aldehydes were labeled with 2-aminopyridine (2-AP) for MS analysis. The 2-AP-labeled peptides could be further deglycosylated and analyzed by mass spectrometry (46). This method is based on the reversible hydrazone linkage for conjugation and release of sialoglycopeptides. However, any molecules that carry aldehyde groups or acquire them during the oxidation step, such as peptides with N-terminal Ser/Thr, could be conjugated and recovered using this method, resulting in nonspecific glycopeptide isolation.
In this study, we addressed these issues by determining the conditions for the modified SPEG method and evaluating its specificity for sialylation-specific capture. Sialoglycopeptides conjugated to the hydrazide support were released by direct enzymatic reaction using PNGase F to recover the formerly N-linked sialoglycopeptides with high specificity. The recovered glycopeptides were quantitatively analyzed by LC-MS/MS. The method is simple and straightforward, and it is suited to high throughput quantitative analysis of formerly N-linked sialoglycopeptides. One limitation is that the glycans linked to the formerly N-linked sialoglycopeptides are removed during the analysis.
Using the modified SPEG method, we profiled sialoglycoproteins from breast cancer and paired noncancerous tissues and identified altered sialylated glycopeptides associated with breast cancer. To investigate whether these sialoglycopeptide changes were due to changes in sialylation or in total glycosylation, total glycopeptides from the same breast cancer and noncancerous tissues were also analyzed using the original SPEG method. The results showed that the majority of proteins (27 of 43) were changed at both the sialylation and total glycoprotein levels. This may be because sialylation is required for glycoprotein synthesis or glycoprotein stability. Alternatively, the expression of these glycoproteins may be increased while most of the glycoproteins are sialylated. It is not clear whether the sialylation of these proteins or their expression causes the total glycoprotein and sialoglycoprotein changes. However, the results clearly show a correlation between sialylation and total glycosylation. Interestingly, 13 glycopeptides were identified with changes only in sialylation level but not in total glycoprotein level, and three glycopeptides were identified with changes in opposite directions for sialylation and total glycosylation. These inconsistencies may result from preferential sialylation of certain glycoproteins. Studies to investigate the glycan structural changes are currently underway.
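The grouping described above (changed in both measurements, changed in sialylation only, or changed in opposite directions) can be illustrated with a minimal sketch. The fold-change cutoff, the function names, and the example ratios are hypothetical and are not taken from this study; the sketch only shows how a cancer/noncancer ratio from the sialylation-specific capture might be compared with the corresponding ratio from the total glycopeptide capture.

```python
from collections import Counter

# Hypothetical cutoff (assumption, not from the study): a cancer/noncancer
# ratio counts as "changed" when it is at least this fold up or down.
FOLD_CUTOFF = 2.0


def direction(ratio, cutoff=FOLD_CUTOFF):
    """Return +1 (increased in cancer), -1 (decreased), or 0 (unchanged)."""
    if ratio >= cutoff:
        return 1
    if ratio <= 1.0 / cutoff:
        return -1
    return 0


def classify(sialo_ratio, total_ratio, cutoff=FOLD_CUTOFF):
    """Compare the sialoglycopeptide ratio with the total glycopeptide ratio."""
    s = direction(sialo_ratio, cutoff)
    t = direction(total_ratio, cutoff)
    if s == 0:
        return "sialo_unchanged"
    if t == 0:
        return "sialylation_only"
    return "concordant" if s == t else "opposite"


# Made-up ratios for illustration only: (sialo ratio, total ratio).
examples = {
    "peptide_A": (3.0, 2.5),  # increased in both captures
    "peptide_B": (2.4, 1.1),  # increased in sialylation capture only
    "peptide_C": (0.4, 2.2),  # opposite directions
}

counts = Counter(classify(s, t) for s, t in examples.values())
for category, n in sorted(counts.items()):
    print(category, n)
```

Under this kind of scheme, the 27, 13, and 3 glycopeptides reported above would correspond to the concordant, sialylation-only, and opposite groups, respectively, which together account for all 43.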
As mentioned earlier in the manuscript, one of the identified sialylated glycoproteins, human versican, is an extracellular matrix protein that may be involved in cell adhesion and proliferation. The significant correlation of versican epithelial staining with lymph node metastasis (p = 0.0102) and with the stage of breast cancer patients (p = 0.0254) indicates that versican may play a significant role in breast cancer metastasis. Further studies are needed to understand the biological function of sialylation on versican and to investigate the clinical utility of versican as a prognostic biomarker for breast cancer.
Previous studies of versican in breast cancer have reported peritumoral stromal staining of versican in node-negative breast cancer and pointed out the association of increased versican expression in the peritumoral stromal matrix with breast cancer relapse (53,54). However, versican immunoreactivity has not been reported in the epithelial cells of breast cancer in previous studies. There are several possible reasons for the different pattern of versican immunolocalization in the present study: 1) although versican is an extracellular protein and is probably synthesized in tumor stroma by fibroblasts, malignant cells may also synthesize versican (55-57); therefore versican was also detected in epithelial cells of breast cancer in this study; and 2) the specimens investigated in previous studies were node-negative breast cancers, whereas the specimens analyzed in this study included both node-positive and node-negative breast cancers. Interestingly, we found a statistically significant correlation between epithelial expression of versican and lymph node metastasis. Similar observations of versican epithelial expression were reported in human cervical cancer and endometrial cancer (58,59). The epithelial expression of versican could be due to changes in the production, storage, degradation, or cellular uptake of versican proteins in cancer cells (60). Further studies are required to address the molecular mechanisms of versican overexpression and its potential role as a prognostic marker and therapeutic target in breast cancer.