Glycomic and Proteomic Profiling of Pancreatic Cyst Fluids Identifies Hyperfucosylated Lactosamines on the N-linked Glycans of Overexpressed Glycoproteins

Pancreatic cancer is now the fourth leading cause of cancer deaths in the United States, and it is associated with an alarmingly low 5-year survival rate of 5%. However, a patient's prognosis is considerably improved when the malignant lesions are identified at an early stage of the disease and removed by surgical resection. Unfortunately, the absence of a practical screening strategy and clinical diagnostic test for identifying premalignant lesions within the pancreas often prevents early detection of pancreatic cancer. To aid in the development of a molecular screening system for early detection of the disease, we have performed glycomic and glycoproteomic profiling experiments on 21 pancreatic cyst fluid samples, including fluids from mucinous cystic neoplasms and intraductal papillary mucinous neoplasms, two types of mucinous cysts that are considered high risk to undergo malignant transformation. A total of 80 asparagine-linked (N-linked) glycans, including high mannose and complex structures, were identified. Of special interest was a series of complex N-linked glycans containing two to six fucose residues, located predominantly as substituents on β-lactosamine extensions. Following the observation of these “hyperfucosylated” glycans, bottom-up proteomics experiments utilizing a label-free quantitative approach were applied to the investigation of two sets of tryptically digested proteins derived from the cyst fluids: 1) all soluble proteins in the raw samples and 2) a subproteome of the soluble cyst fluid proteins that were selectively enriched for fucosylation through the use of surface-immobilized Aleuria aurantia lectin. A comparative analysis of these two proteomic data sets identified glycoproteins that were significantly enriched by lectin affinity. Several candidate glycoproteins that appear hyperfucosylated were identified, including triacylglycerol lipase and pancreatic α-amylase, which were 20- and 22-fold more abundant, respectively, following A. aurantia lectin enrichment.

Pancreatic cancer is associated with a poor long term outcome with an actual 5-year survival rate of less than 5%. The absence of a practical screening strategy and clinical diagnostic test for identifying premalignant lesions within the pancreas prevents early detection of pancreatic cancer. Detection of early stage pancreatic cancer will likely result from novel clinical methods aimed at early tumor detection and molecular profiling of individuals at high risk for pancreatic carcinogenesis.
High-resolution imaging techniques have increased the detection of small, organ-based lesions in the intra-abdominal cavity. Incidental cysts within the pancreas are detected in ~13% of patients undergoing cross-sectional imaging for nonspecific abdominal symptoms (1). Greater than 80% of these incidental cysts are non-neoplastic, so few carry malignant potential. Pancreatic cysts are divided into three broad pathologic entities: serous cystadenomas (SCA), mucinous cystic neoplasms (MCN), and intraductal papillary mucinous neoplasms (IPMN). In general, SCA are benign lesions; however, both MCN and IPMN are considered high risk cysts that can easily undergo malignant transformation (2).
Radiologic imaging analysis with computed tomography, magnetic resonance imaging, and endoscopic ultrasound can already determine the nature of many pancreatic cysts (3); however, discriminating benign neoplastic cysts from the ones that represent precursor lesions for invasive cancer remains a challenge. Moreover, serial surveillance imaging techniques are costly and consume health care resources over time.
Gene and protein biomarker systems for pancreatic cancer have been extensively investigated, as is evidenced by the list of potential biomarkers asserted in the review by Harsha et al. (4), but most candidates that have been tested clinically have not proved to be reliable for distinguishing between benign and malignant lesions of the pancreas (5,6). In the further pursuit of alternative molecular markers, the patterns of glycosylation have been studied in sera of patients with pancreatic cancer using antibody- and lectin-based sensors (7,8). A similar approach has been used to study pancreatic cyst fluids, targeting representative glycans on prominent candidate glycoprotein markers, including several of the mucins and carcinoembryonic antigen-related cell adhesion molecules. The study by Haab et al. (9) demonstrated that a combination of lectin-based detection using wheat germ agglutinin in tandem with antibody detection of carbohydrate antigen 19-9 (CA19-9) on the putative biomarker mucin-5AC could distinguish mucin-producing cysts (i.e. MCN and IPMN) from nonmucinous cysts with an 87% sensitivity at 86% specificity. Although the array strategies could prove suitable for discriminating benign from malignant pancreatic cysts in a clinical setting, the development of a tumor-specific biomarker system will likely depend on a somewhat clearer understanding of the molecular events associated with pancreatic carcinogenesis.
To date, only a few proteomic studies of pancreatic cyst fluids have been performed (10-12). Furthermore, to our knowledge, there are no studies in the literature that describe the glycomic profiles of pancreatic cyst fluids. In the reported work, glycomic and proteomic profiles have been mapped for four different types of cysts, including pancreatic pseudocysts (PC), SCA, MCN, and IPMN. The data collected from this highly sensitive profiling have provided detailed insight into the diversity of glycans found in pancreatic cyst fluids. A subgroup of the cysts, including several IPMN, was observed here to contain hyperfucosylated N-linked glycans. Increased fucosylation has continued to be implicated in the progression of cancer (13-16), including specifically pancreatic cancer (16,17). As such, hyperfucosylated N-glycans and glycoproteins have been targeted for quantitative label-free bottom-up glycoproteomic analysis.

EXPERIMENTAL PROCEDURES
Patients and Pancreatic Cyst Fluid Collection-Twenty-one patients with pancreatic cysts identified by high-resolution cross-sectional imaging, including computed tomography or magnetic resonance imaging, were included in this study. The indications for surgical extirpation were determined by clinical, radiologic, and biochemical parameters. The final histopathologic diagnosis for each pancreatic cyst, namely SCA, MCN, IPMN, or PC, was recorded from the final pathology report of the surgical specimen contained within the patient's electronic medical record. IPMNs were graded according to World Health Organization criteria with current terminology (e.g. low, moderate, and high grade dysplasia).
Pancreatic cyst fluids were collected intraoperatively by direct fine needle aspiration of the cyst within the surgical specimen. Cyst fluids were placed on ice immediately and aliquoted for storage at -80°C. All cyst fluid collections were processed according to a standardized protocol. This study was approved by the Indiana University Institutional Review Board.
Removal of Particulates and Heavily Cross-linked Proteins from Cyst Fluids-Prior to proteomics and glycoproteomics analysis, a 45-µl aliquot of each cyst fluid was diluted with 45 µl of 10 mM phosphate buffer, and it was then centrifuged at 13.2 krpm for 15 min to stratify particulate matter, heavily cross-linked mucin-like proteins, and the supernatant containing solubilized proteins. The supernatant was then decanted and split: 70 µl for lectin enrichment followed by proteomic analysis and 20 µl for immediate proteomic analysis with no enrichment. The aliquots of the cyst fluids that were subjected to N-glycan analysis were stratified in the same way to remove the particulates and heavily cross-linked proteins, but the volumes were not equal for all samples. Rather, the volumes used for analyses were normalized to contain the same amount of protein as described below.
Permethylation and MALDI-TOF-MS of N-Linked Glycans-Pancreatic cyst fluid proteins were measured using a bicinchoninic acid protein concentration assay. The amount of cyst fluid subjected to glycomic profiling was chosen such that 200 µg of protein was contained therein. In this way, between 50 and 300 µl were used for each sample. The samples were denatured using 0.5% (w/v) SDS, 40 mM DTT in PBS, and incubated at 65°C for 45 min. After incubation, the samples were cooled, and Nonidet P-40 was added to a final concentration of 1%. An aliquot containing 0.5 units peptide N-glycosidase F enzyme was then added to each sample and incubated at 37°C for 24 h. The vendor (Northstar Bioproducts, East Falmouth, MA) defines 1 unit as the amount of enzyme required to catalyze the release of N-linked oligosaccharides from 1 nmol of denatured ribonuclease B in 1 min at 37°C, pH 7.5.
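The protein normalization described above amounts to dividing the target protein mass by the measured concentration. The following is a minimal sketch of that arithmetic; the function name and the example concentrations are hypothetical and are not taken from the study.

```python
def aliquot_volume_ul(protein_conc_ug_per_ul, target_ug=200.0):
    """Volume of cyst fluid (in microliters) that contains `target_ug` of protein.

    `protein_conc_ug_per_ul` is the concentration from the protein assay.
    The 200-ug target is the value stated in the text; everything else
    here is illustrative.
    """
    if protein_conc_ug_per_ul <= 0:
        raise ValueError("protein concentration must be positive")
    return target_ug / protein_conc_ug_per_ul


# A fluid at 4 ug/ul needs 50 ul; a more dilute fluid at 1 ug/ul needs 200 ul.
print(aliquot_volume_ul(4.0))
print(aliquot_volume_ul(1.0))
```

Under this reading, the reported range of 50 to 300 µl would correspond to protein concentrations between roughly 4 and 0.67 µg/µl.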
N-Linked oligosaccharides were isolated from the other cyst fluid components using two separate solid phase extraction steps. Micro-Spin C18 columns (Harvard Apparatus, Holliston, MA) were washed and equilibrated with 85% acetonitrile/H2O (v/v) solution containing 0.1% trifluoroacetic acid (TFAA) and 5% acetonitrile/H2O (v/v) solution containing 0.1% TFAA, respectively. All of the samples were diluted to 0.2 ml with water and passed through the spin column five times. The columns were then washed with a 0.2-ml aliquot of 5% acetonitrile/H2O (v/v) solution containing 0.1% TFAA. The eluents containing the free N-glycans were then subjected to a solid phase extraction using MicroSpin charcoal columns (Harvard Apparatus, Holliston, MA).
Activated charcoal columns were equilibrated using 85% acetonitrile/H2O (v/v) solution containing 0.1% TFAA, followed by 5% acetonitrile/H2O (v/v) solution containing 0.1% TFAA. Samples in aqueous solution were then applied and washed first using H2O and then 5% acetonitrile/H2O (v/v) solution containing 0.1% TFAA. Salts and other by-products were eluted during the washing step, whereas the N-glycans retained on the activated charcoal microspin columns were recovered using a 0.2-ml aliquot of 50% acetonitrile/H2O (v/v) containing 0.1% TFAA. This wash was repeated four times, and all of the eluents were collected and evaporated using an Eppendorf Vacufuge.
Solid phase permethylation was performed using the spin column technique developed in our laboratory (18,19). The samples were first dissolved in 70 µl of dimethylformamide, 5 µl of water, and 25 µl of iodomethane. The samples were then passed over a spin column packed with sodium hydroxide mesh beads a total of eight times. The columns were washed once with acetonitrile. A 400-µl aliquot of chloroform was then added to the samples followed by 1 ml of 500 mM NaCl for liquid-liquid extraction of impurities in the organic layer. Twice more, 1-ml aliquots of 500 mM NaCl were applied. The tubes were agitated and centrifuged, and the 500 mM NaCl was removed. The chloroform layer was saved and dried using a SpeedVac, and the extracted material was resolubilized using 4 µl of a 50/50 water/methanol mix.
All mass spectrometric measurements were performed on an Applied Biosystems (Framingham, MA) 4800 MALDI-TOF/TOF mass spectrometer. The glycan samples were resolubilized in 2.5 µl of 20/80% methanol/water solution, and a 0.5-µl aliquot was spotted on a stainless steel MALDI target and dried. The matrix, 2,5-dihydroxybenzoic acid, was prepared at a concentration of 10 mg/ml in a 50/50 methanol/water (v/v) solution with 1 mM sodium acetate to ensure complete cationization, and a 0.5-µl aliquot of 2,5-dihydroxybenzoic acid matrix was added to the sample spot and dried under vacuum to promote uniform crystallization. The instrument was operated in the positive ion reflector mode, and the m/z range from 1500 to 5100 was monitored. A total of 2000 laser shots were applied to each sample.
Lectin Enrichment of Glycoproteins in Pancreatic Cyst Fluid Samples-A 40-µl aliquot of agarose-bound Aleuria aurantia lectin (AAL) slurry (50/50, v/v) from Vector Laboratories (Burlingame, CA) was added to a 1.5-ml tube so that a settled bed volume of 20 µl of agarose-bound AAL was provided. The gel was washed in triplicate with 200 µl of lectin binding buffer (Tris, pH 7.5, 0.15 M NaCl, 1 mM Ca2+, 0.08% NaN3) for a total wash of 30 bed volumes. Next, a 70-µl aliquot of 50% pancreatic cyst fluid in 10 mM phosphate, pH 7.4, was applied to the gel bed, and the mixture was gently vortexed. The enrichment was allowed to proceed for 18 h at 4°C with gentle agitation. The mixture was centrifuged for 1 min at 2 krpm, and the supernatant was removed. The gel bed was washed twice with 200 µl of deionized water. Bound proteins were eluted by the addition of 200 µl of 0.1 M acetic acid followed by vortexing. The mixture was centrifuged for 1 min at 2 krpm, and the eluted proteins were pipetted into a 1.5-ml tube. The elution procedure was repeated, and the two 200-µl aliquots were combined. This procedure was performed on each of the 21 pancreatic cyst fluids.
Protein Denaturation, Reduction, Alkylation, and Trypsin Digestion-For both lectin-enriched and nonenriched cyst fluids, the following desalting procedure was performed prior to trypsin digestion. Cyst fluids were desalted with 3-kDa molecular mass cutoff filters (Millipore, Billerica, MA). The filter membranes were prepared prior to the addition of cyst fluid by washing with 500 µl of deionized water, followed by centrifugation for 15 min at 10 krpm. Remaining water was removed by inverting the molecular mass cutoff cartridge and centrifuging for 1 min at 1 krpm. The cyst fluids were then applied to the membranes. For the nonenriched cyst fluids, 380 µl of 50 mM ammonium bicarbonate was added to the initial volume of 20 µl to bring the total volume to 400 µl, whereas the lectin-enriched glycoprotein samples were already in 400 µl. The samples were centrifuged for 15 min at 10 krpm and then washed in triplicate with 400 µl of 50 mM ammonium bicarbonate. Filtration cartridges were then inverted into new collection tubes and centrifuged for 2 min at 1 krpm. The collection volume for each sample was ~20 µl.
The 20-µl aliquots were dried in a vacuum centrifuge. Each sample was resuspended in 6 M guanidine HCl to denature proteins. Next, a 0.5-µl aliquot of 200 mM DTT was added, and the samples were incubated for 30 min at 60°C to reduce disulfide bridges between cysteine side chains. Cysteine thiols were then alkylated by the addition of a 2-µl aliquot of 200 mM iodoacetamide, and the samples were incubated for 30 min at room temperature. To quench alkylation, an additional 0.5-µl aliquot of 200 mM DTT was added, and the sample was incubated for 30 min at room temperature. The guanidine HCl concentration was then diluted to <2 M by the addition of 40 µl of 50 mM ammonium bicarbonate buffer prior to trypsin digestion. Finally, a 4-µl aliquot of trypsin (1 µg/µl) was added to ensure a minimum enzyme:substrate ratio of 1:20. In keeping with its specificity, trypsin cleaves the peptide backbone on the C-terminal side of lysine and arginine residues, except where proline is the next amino acid.
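The trypsin cleavage rule stated above (cleave after lysine or arginine unless proline follows) can be sketched as an in silico digestion. This is an illustrative sketch only; the function name and the example sequence are hypothetical, and the optional missed-cleavage handling is a common convention rather than something described in this section.

```python
import re
from typing import List


def tryptic_peptides(sequence: str, missed_cleavages: int = 0) -> List[str]:
    """Digest a protein sequence in silico with the K/R-not-before-P rule."""
    # Zero-width split points: preceded by K or R, not followed by P.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# The R-P bond is not cleaved, so "RPGK" stays intact.
print(tryptic_peptides("AKRPGKLM"))
print(tryptic_peptides("AKRPGKLM", missed_cleavages=1))
```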
Liquid Chromatography-Mass Spectrometry Analysis of Proteins and Glycoproteins-Trypsin-digested protein samples were analyzed by reversed phase liquid chromatography interfaced through nanoelectrospray to an LTQ-Orbitrap mass spectrometer or to an LTQ-FTMS Ultra (Thermo Scientific, Waltham, MA). Reversed phase liquid chromatography was performed on a Dionex Ultimate 3000 system (Sunnyvale, CA). The analytical column was 75-µm inner diameter × 15 cm, packed in-house with Magic C18AQ (200 Å pores, 3-µm particles) from Michrom Bioresources, Inc. (Auburn, CA). A gradient separation was performed with 0.1% formic acid, 3% ACN in water, hereafter called solvent A, and 0.1% formic acid in ACN, which will be referred to as solvent B. The peptides were separated over a multistep gradient, beginning with a 10-min equilibration period in which the column was washed with 97% solvent A, followed by a 45-min linear increase from 3 to 55% solvent B and then a second linear step from 55 to 85% solvent B over 5 min. The column was washed in 85% solvent B for 5 min and then rapidly switched to 97% solvent A over 5 min to re-establish the initial condition for the next sample. For each MS scan, the five highest intensity precursor ions were subjected to collision-induced dissociation for peptide identification downstream. An exclusion window of 30 s was used to prevent a precursor m/z from being selected for fragmentation more than once within said time frame, allowing for fragmentation of lower abundance precursors.
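The multistep gradient described above can be written as a piecewise-linear program. The sketch below assumes a binary system (97% solvent A corresponds to 3% solvent B) and cumulative breakpoints inferred from the stated step durations; the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % solvent B) breakpoints inferred from the text:
# 10-min hold at 3% B, 45 min to 55% B, 5 min to 85% B,
# 5-min hold at 85% B, 5 min back to 3% B.
GRADIENT = [(0.0, 3.0), (10.0, 3.0), (55.0, 55.0),
            (60.0, 85.0), (65.0, 85.0), (70.0, 3.0)]


def percent_b(t_min):
    """Linearly interpolate the solvent B percentage at time `t_min`."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (5.0, 32.5, 57.5, 62.0, 70.0):
    print(t, percent_b(t))
```

With these breakpoints a full cycle, including re-equilibration, takes 70 min per sample.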
Peptide Identification and Label-free Quantification-MS/MS fragmentation spectra were compiled into a candidate peaklist using the TurboRAW2MGF utility developed in-house (20) and searched against the UniProt database release 14.0 (20,328 sequences), Homo sapiens taxonomy, using the MASCOT v2.3 search engine. The following criteria were used: trypsin selected as the enzyme, one missed cleavage allowed from trypsin digestion, ±0.02 Da tolerance for precursors, ±0.8 Da for fragment peaks, +2 and +3 charges, carbamidomethylation of cysteine (fixed modification), oxidation of methionine (variable modification), ion score ≥ 30, expect ≤ 0.1, accept only bold red queries, rank 1 identifications, and a minimum peptide mass of 600.00 Da. A randomized UniProt database was queried with the same specifications, and the results were used to estimate the false discovery rate as previously described (21). The false discovery rate was conservatively estimated at ≤0.6%.
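The acceptance criteria and the decoy-based error estimate can be illustrated as follows. This is a sketch under stated assumptions: the exact estimator of reference 21 is not given in this section, so the common ratio of decoy hits to target hits from separate searches is assumed, and the function names and counts are hypothetical.

```python
def passes_filters(ion_score, expect, rank, peptide_mass_da):
    """Apply the peptide-level acceptance thresholds listed in the text."""
    return (ion_score >= 30 and expect <= 0.1
            and rank == 1 and peptide_mass_da >= 600.0)


def estimate_fdr(n_target_hits, n_decoy_hits):
    """Assumed estimator: accepted decoy hits divided by accepted target hits."""
    if n_target_hits == 0:
        return 0.0
    return n_decoy_hits / n_target_hits


# Hypothetical counts: 6 accepted decoy matches against 1000 accepted target matches.
print(passes_filters(ion_score=42.0, expect=0.01, rank=1, peptide_mass_da=1203.6))
print(estimate_fdr(1000, 6))
```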
A label-free quantitative approach was used to measure the relative abundance of identified proteins as a function of the confidently identified peptides as described previously (20). Briefly, ProteinQuant Suite, an in-house developed software tool, was utilized to reconstruct extracted ion chromatograms for precursor m/z values (±0.03 Da) that were confidently identified as peptides through fragmentation spectral searches of the UniProt database. To ensure that extracted ion chromatograms for the same peptides were integrated to generate relative quantitative data, proteins in each cyst fluid were quantified using a compiled master file of all identified peptides. MS files in the RAW format were converted to the universal mzXML format. For each peptide, the peak apex was then automatically set by searching for the maximum signal intensity within a 5-min window of the retention time for the precursor scan that was stored by MASCOT. Finally, peak edges were defined by the points where the signal intensity dropped to less than three times the base-line intensity, with a further constraint that the peak width could not exceed 1 min. Peak areas were then calculated by numeric integration. For each protein reported, the protein area is a sum of the peptide areas.
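The peak-picking rules above can be sketched in a few lines. This is not the ProteinQuant implementation; it is a minimal illustration that assumes the 5-min window is centered on the stored retention time, that the 1-min width limit is applied as at most 0.5 min on each side of the apex, and that integration uses the trapezoid rule. The function name and the example trace are hypothetical.

```python
def integrate_peak(times, intensities, rt, baseline,
                   window=5.0, edge_factor=3.0, max_width=1.0):
    """Integrate one extracted ion chromatogram peak (times in minutes)."""
    # Apex: most intense point within the retention-time window.
    candidates = [i for i, t in enumerate(times) if abs(t - rt) <= window / 2]
    if not candidates:
        return 0.0
    apex = max(candidates, key=lambda i: intensities[i])
    threshold = edge_factor * baseline
    half = max_width / 2
    # Walk outward until the signal falls below 3x baseline or the width limit.
    left = apex
    while (left > 0 and intensities[left - 1] >= threshold
           and times[apex] - times[left - 1] <= half):
        left -= 1
    right = apex
    while (right < len(times) - 1 and intensities[right + 1] >= threshold
           and times[right + 1] - times[apex] <= half):
        right += 1
    # Trapezoid rule between the two edges.
    return sum((times[i + 1] - times[i]) * (intensities[i] + intensities[i + 1]) / 2
               for i in range(left, right))


times = [10.0, 10.1, 10.2, 10.3, 10.4]
signal = [1.0, 10.0, 20.0, 10.0, 1.0]
print(integrate_peak(times, signal, rt=10.2, baseline=1.0))
```

A protein-level value would then be the sum of such peptide areas, as stated in the text.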
Statistical Analyses-Single-factor analysis of variance was performed with the Data Analysis Toolpak in Microsoft Excel 2007 and was used to generate the p values for differential glycoprotein expression described under "Results" and in Table III. Principal component analysis of covariance was performed with the MiniTab 16.1 program using the relative intensities of the individual N-glycans that were tabulated and organized with Peak Calc (in-house software tool). The score plot from principal component analysis is depicted in Fig. 1. An additional multivariate clustering analysis was performed using the glycoprotein intensity data generated with ProteinQuant (20) to group the samples based on a series of pairwise comparisons of their relative similarities. This analysis was also performed with MiniTab 16.1 and was used to generate the dendrogram in Fig. 4.
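For readers unfamiliar with single-factor analysis of variance, the F statistic behind the reported p values can be computed as sketched below. The study used Excel; this standalone version is only an illustration, the function name and data are hypothetical, and the p value itself (the upper tail of the F distribution with the returned degrees of freedom) is left to a statistics package.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for two or more groups of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical relative abundances of one protein in two sample groups.
print(one_way_anova_f([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```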

RESULTS
Glycomic Profiles Generated by MALDI-TOF-MS-The clinicopathologic details of the pancreatic cysts are included in Table I. A total of 20 pancreatic cyst fluids were analyzed by glycomic profiling. One of the IPMN cyst fluids, P221, did not contain sufficient glycoprotein content for N-glycan profiling following the removal of mucinous cross-linked material.
One hundred and three possible carbohydrate compositions were searched, and eighty structures were identified. A table containing the mean relative abundance for each structure in the four pathological groups of cysts was compiled (supplemental Table 1 and other supplemental materials). Fragmentation data (not shown) indicate that the list includes mostly high-mannose and complex glycans, although some hybrid structures (part high-mannose, part complex) were detected at low abundance. As a consequence of the low numbers for the individual diagnoses (N_SCA = 3, N_PC = 4, N_MCN = 6, and N_IPMN = 7), the relative abundances included in the table are only rough approximations, and analysis of many more cysts would be necessary to clearly define the ranges of relative abundance for each of these structures in the pathologically different fluids. However, from a qualitative review of the mass spectra, it was clear that the profiles from pathologically different fluids were considerably different from one another in terms of the structures present and their expression levels. To statistically describe the overall character of the cyst fluids, a principal component analysis was performed based on the relative intensities of identified glycans (Fig. 1). The PC clustered tightly in both principal components 1 and 2, excluding sample P103; the SCA, MCN, and IPMN samples did not group significantly according to their respective diagnoses. This observation could be a result of the range of glycoprotein concentrations in the different fluids. Interestingly, samples to the right of the principal component 1 axis were dominated by fucosylated glycans. Samples P14, P91, P334, P47, P357, and P103 all contained several structures with two or more fucoses. The masses and associated structures for the glycans containing two or more fucoses are listed in Table II. Some of these glycans, such as the [M + Na]+ ions at m/z 1693.9 and 1939.0, defy the orthodox biosynthesis of N-glycans and are detected at trace amounts. These structures may in fact be free glycans within the cyst fluid or potentially degraded polylactosamine chains.
TABLE II
N-Glycan structures and permethylated masses that contain two or more fucose residues
Red triangles represent fucose, yellow circles represent galactose, blue rectangles represent N-acetylglucosamine, and green circles represent mannose.

Hyperfucosylated Glycans-In six of the cyst fluids (described in the previous section), a number of complex glycans were identified that contained multiple fucoses, ranging from two to six fucose residues attached to a single structure. Of the six cyst fluids containing multiply fucosylated glycans, there were four IPMNs, one MCN, and one PC. These glycans also contained a high number of N-acetylglucosamine and galactose residues (Fig. 2A). A representative postsource decay fragmentation spectrum for the precursor ion at m/z 3460.6 in Fig. 2A revealed a peak at 660.1, corresponding to a glycan b-ion fragment mass (using the fragment nomenclature from Domon and Costello (23)) for permethylated galactose, fucose, and N-acetylglucosamine. This is the trisaccharide composition of the Lewis a and Lewis x antigens. Fragmentation further indicated, through the characteristic peaks at m/z 864.1, 1046.6, and 1283.4 (annotated in Fig. 2B), that multiple Lewis antigens were directly linked in an extended contiguous structure. For a more precise structural elucidation, using exoglycosidases as analytical reagents (24) would have provided no additional information, because the hyperfucosylated structures are attached to the extended lactosamines, a modification that inhibits the digestion with β1-4,6-galactosidase. This "limitation" of galactosidase was exploited for positional determination of core versus outer arm attachment of fucose of singly fucosylated glycans in our recent investigation of serum glycans in ovarian cancer (24). Notably, N-acetylneuraminic acid was not attached to any of the measured highly fucosylated structures. The small amounts of materials available for analysis preclude the use of NMR-based structural elucidation methods.
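The b-ion assignment at m/z 660.1 can be checked against theoretical masses of permethylated, sodiated glycans. The sketch below uses standard monoisotopic masses of permethylated residues; the function names are hypothetical, and the composition passed in the second call (Hex6HexNAc6Fuc4) is an illustrative guess for the m/z 3460.6 precursor, not a composition taken from Table II.

```python
# Monoisotopic masses (Da) of permethylated monosaccharide residues.
RESIDUE = {"Hex": 204.0998, "HexNAc": 245.1263, "Fuc": 174.0892}
NA = 22.9898       # sodium adduct
CH2 = 14.0157      # extra methyl at the non-reducing end of a B-type fragment
TERMINI = 46.0419  # CH3 (non-reducing end) plus OCH3 (reducing end) of an intact glycan


def residue_sum(composition):
    return sum(RESIDUE[name] * count for name, count in composition.items())


def sodiated_glycan_mz(composition):
    """[M + Na]+ of an intact, non-reduced permethylated glycan."""
    return residue_sum(composition) + TERMINI + NA


def sodiated_b_ion_mz(composition):
    """Sodiated B-type fragment from the non-reducing end."""
    return residue_sum(composition) + CH2 + NA


# Galactose + fucose + N-acetylglucosamine (Lewis-type trisaccharide).
print(round(sodiated_b_ion_mz({"Hex": 1, "HexNAc": 1, "Fuc": 1}), 2))
# Illustrative composition only (assumption, see lead-in).
print(round(sodiated_glycan_mz({"Hex": 6, "HexNAc": 6, "Fuc": 4}), 2))
```

The first value, about 660.3, agrees with the observed fragment at m/z 660.1 to within the mass accuracy that is typical of postsource decay measurements.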
Proteomic Analysis of Pancreatic Cyst Fluids-A total of 247 proteins were identified through the bottom-up proteomic analysis. Although there were significant overlaps between the different groups (i.e. 137 proteins were identified in more than one type of cyst), many proteins were unique to one of the groups: 22 in PC, 10 in SCA, 23 in MCN, and 55 in IPMN. A table of all identified proteins is included in supplemental Table 2. For proteins identified in only one type of cyst, the pathological condition is also noted. It is expected that the majority of the proteins that were identified in only a single fluid are in fact present in the other types as well, only at lower relative abundance, and thus masked or entirely suppressed by coeluting peptides from highly abundant proteins during the electrospray ionization event. A more extensive separation strategy prior to LC-MS such as two-dimensional gel electrophoresis or multidimensional liquid chromatography is expected to provide many more protein identifications, particularly in the SCA, which contain several highly abundant serum proteins. A recently published proteomic study that utilized immunoaffinity depletion and SDS-PAGE for a qualitative proteomic analysis of cyst fluids prior to LC-MS demonstrated this principle (12).
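The split between shared and group-specific identifications is simple set bookkeeping, sketched here with hypothetical protein labels. Only the reported totals (137 shared; 22, 10, 23, and 55 unique; 247 overall) come from the text; the function name and toy data are assumptions.

```python
from collections import Counter


def unique_and_shared(groups):
    """groups: cyst type -> set of protein IDs identified in that type."""
    counts = Counter(p for members in groups.values() for p in set(members))
    unique = {g: {p for p in members if counts[p] == 1}
              for g, members in groups.items()}
    shared = {p for p, c in counts.items() if c > 1}
    return unique, shared


toy = {"PC": {"A", "B"}, "SCA": {"B", "C"}, "MCN": {"D"}, "IPMN": {"B", "D", "E"}}
unique, shared = unique_and_shared(toy)
print(unique, shared)

# Consistency check on the counts reported in the text.
print(137 + 22 + 10 + 23 + 55)
```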
A further qualitative characterization of the identified proteins was performed based on the gene ontology annotations (GOA) for the proteins identified in each type of cyst. The molecular function GOA were compiled using STRAP (Software Tool for Researching Annotations of Proteins) (25). The data have been visualized in Fig. 3. An overall similarity was observed between the four pathologic groups with regard to GOA, although there were a few differences as well. SCA fluids were associated with only 13 GOA for catalytic activity, whereas there were 36, 36, and 50 for PC, MCN, and IPMN, respectively. Furthermore, IPMN had a moderately higher occurrence of GOA for the structural molecule activity with 17 compared with 7, 3, and 4 for MCN, PC, and SCA, respectively.
A label-free comparison of the differential abundance of 186 of the identified proteins is listed in supplemental Table 3. The remaining 61 identified proteins were not present at a sufficient signal-to-noise threshold for quantification. The analysis showed that, after removal of large mucinous crosslinked molecules along with coprecipitates by centrifugal sedimentation, the major blood proteins (e.g. serum albumin, hemoglobin subunits α and β, transferrin, and immunoglobulin G) were dominant in all types of cysts. Approximately 90% of quantified proteins were present at a relative abundance value of less than 0.01 (i.e. <1%) of the total integrated peak area for each LC-MS experiment. Two of the abundant pancreatic enzymes, α-amylase and triacylglycerol lipase, were of similar relative intensity in each type of cyst, with the exception that the relative abundance of triacylglycerol lipase in SCA was approximately half that of the other groups; α-amylase normalized abundances were 0.021 (IPMN), 0.019 (MCN), 0.022 (SCA), and 0.018 (PC); triacylglycerol lipase normalized abundances were 0.022 (IPMN), 0.019 (MCN), 0.009 (SCA), and 0.020 (PC). Although all of the relative abundances are listed in supplemental Table 3, the values of these two enzymes have been included here for their relevance in the context of the following results and discussion.
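The normalized abundances quoted above are fractions of the total integrated peak area in one LC-MS experiment. A minimal sketch of that normalization follows; the function names and the protein areas are hypothetical.

```python
def relative_abundance(protein_areas):
    """Divide each protein's summed peak area by the total area of the run."""
    total = sum(protein_areas.values())
    return {name: area / total for name, area in protein_areas.items()}


def fraction_below(relative, cutoff=0.01):
    """Share of quantified proteins whose relative abundance is under `cutoff`."""
    return sum(1 for v in relative.values() if v < cutoff) / len(relative)


# Hypothetical areas in arbitrary units.
areas = {"ALB": 970.0, "AMY": 21.0, "X": 9.0}
rel = relative_abundance(areas)
print(rel["AMY"], fraction_below(rel))
```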
Glycoproteomic Analysis of Lectin-enriched Cyst Fluids-Following the identification of several multiply fucosylated glycans in six of the cyst fluid samples during glycomic profiling, a semitargeted glycoproteomic investigation was performed in which AAL-agarose was utilized to enrich fucosylated glycoproteins. After LC-MS/MS experiments, the label-free approach was used to generate relative quantitative data for 122 proteins. A hierarchical comparison of statistical similarity was performed to characterize the quantitative lectin-enriched profiles for all samples, and the data were illustrated by the dendrogram shown in Fig. 4. A striking feature of this analysis was the isolated "branch" on the far left that included P221, P91, P357, P47, and P103. The clinical diagnoses for these samples were IPMN, IPMN, IPMN, IPMN, and PC, respectively. Furthermore, four of these five samples, P91, P357, P47, and P103, were also observed to contain a high abundance of multiply fucosylated glycans in the glycomic profiling experiments.
The initial observation of multiply fucosylated glycans in the glycomic profiles and the confirmed differences in fucosylation-enriched proteomic profiles led to a statistical analysis of the individual proteins that were up-regulated in these cyst fluids. The fluids were divided into two groups; those that contained an abundance of highly fucosylated glycans were termed Group 1: P91, P30, P47, P103, P334, and P357, and those that did not were termed Group 2: P14, P21, P25, P27, P28, P39, P50, P106, P116, P221, P316, P342, P345, P128, and P133. The relative abundance ratios between the two groups were determined for all quantified proteins (supplemental Table 3 and other supplemental materials). Single-factor analysis of variance was performed for the proteins with relative abundance ratios ≥ 2, and p values were determined. A total of 21 proteins were identified to be more than 2-fold increased in relative abundance and of these, nine were found significant (p ≤ 0.05), as listed in Table III. Moreover, the statistically significant proteins ranged approximately 2 orders of magnitude in their relative signal intensity (data not shown).
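The two-step selection described above (ratio of at least 2, then p of at most 0.05) can be sketched as follows. The group memberships are taken from the text; the assumption that the ratio is the quotient of the group means, the function names, and the example abundances are hypothetical.

```python
from statistics import mean

GROUP_1 = ["P91", "P30", "P47", "P103", "P334", "P357"]
GROUP_2 = ["P14", "P21", "P25", "P27", "P28", "P39", "P50", "P106",
           "P116", "P221", "P316", "P342", "P345", "P128", "P133"]


def group_ratio(abundance, group_1=GROUP_1, group_2=GROUP_2):
    """abundance: sample ID -> relative abundance of one protein (0.0 if absent)."""
    m1 = mean(abundance.get(s, 0.0) for s in group_1)
    m2 = mean(abundance.get(s, 0.0) for s in group_2)
    return float("inf") if m2 == 0 else m1 / m2


def is_candidate(ratio, p_value, min_ratio=2.0, alpha=0.05):
    """Keep proteins that are at least 2-fold higher and statistically significant."""
    return ratio >= min_ratio and p_value <= alpha


# Hypothetical protein: 4.0 units in every Group 1 fluid, 1.0 in every Group 2 fluid.
toy = {s: 4.0 for s in GROUP_1}
toy.update({s: 1.0 for s in GROUP_2})
print(group_ratio(toy), is_candidate(group_ratio(toy), p_value=0.01))
```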
The enzymes ␣-amylase and triacylglycerol lipase, which were abundant in similar amounts in each of the different pathologic groups (i.e. IPMN, MCN, SCA, and PC), were observed statistically higher, with ratios of 1.7 and 2.9, re-spectively, in Group 1 compared with Group 2 using the nonenriched proteomic data. Furthermore, they were dramatically increased in relative abundance following the AAL-enriched proteomic experiments: ratios of 22.4 and 20.2, respectively. A similar trend of increased abundance ratio following AAL enrichment can be seen for seven of the nine proteins in the table. DISCUSSION The role of glycosylation in cancer and the potential to utilize specific glycans as molecular markers of the disease continue to be high profile topics in biomedical research. The biological role of the glycan biomarker, CA19-9, has been studied in pancreatic cancer extensively (26 -28). Its value as a diagnostic (29 -31) and prognostic (32)(33)(34) tool has been tested repeatedly. Reports of the overall sensitivity and specificity of CA19-9 for prediction of pancreatic malignancy have varied widely and have been considered inconclusive. Recent studies have reported that serum CA19-9 levels can detect pancreatic cancer independently (35) or as a component of a multiple marker panel (36), although other reports have concluded that it is not specific for the diagnosis of cancer (12,37). In addition to CA19-9, increases in core fucosylation of serum proteins such as ribonuclease 1 (38) and both core and outer arm fucosylation of haptoglobin (16,17) have been reported as putative biomarkers. A considerable amount of research has been conducted to investigate glycan biomarkers as a potential molecular marker system for pancreatic cancer; however, very little information is available with regard to the variety and abundance of glycans (or glycoproteins) that are present in the tissues and fluids of the pancreas. 
The recent publication of a qualitative proteomic investigation, in which eight cyst fluids were analyzed by a standard bottom-up proteomics approach (12), highlights the further need for in-depth "omics-based" profiling of these pertinent biological materials using state-of-the-art instrumentation. Our current report contributes significantly to such efforts.
In the present study, glycomic profiles of N-linked glycans have been generated by mass spectrometry for 20 pancreatic cyst fluids, which marks the first study of this kind to date. A number of the cyst fluids have been identified to contain hyperfucosylated glycans. The structures of these oligosaccharides are uncommon, and to our knowledge, they have not been reported previously in pancreatic cyst fluid. We have identified them more frequently and in a higher abundance in IPMN and MCN than in SCA and PC. Our MS fragmentation data indicate that these glycans contain multiples of the Lewis antigen, often connected in series on the antennae extended from the chitobiose core. The presence of di- and trifucosylated lactosamine extensions on glycans has been previously reported in the tissue (39) and serum (40) of lung cancer patients using monoclonal antibodies with specificity toward these two antigens (41). Nonetheless, these types of glycans are not routinely observed in sera of different cancer patients during glycomic profiling (42,43). It is important to note that none of the multiply fucosylated glycans described herein are sialylated, and thus, they do not contribute to the concentration of CA19-9. Finally, whereas the data from this study strongly suggest that these multiply fucosylated glycans are more common and abundant in the mucinous cysts, a more comprehensive investigation for each pathological group would be needed to properly evaluate their diagnostic potential.
The search for molecular markers of disease has continued to challenge researchers. The enormous complexity of biological materials and the large dynamic range of protein concentrations have often been critical barriers to discovering specific biomarkers for individual diseases. The inherent problem of detecting tumor-specific molecules at an early stage of the disease is that the tumors themselves are small, and proteins leaked into accessible biological fluids that are readily available for medical testing (e.g. saliva, blood, urine) are consequently present in low abundance, typically believed to be in the low ng/ml range or lower. Understanding the prevalence of the glycans and glycoproteins in the premalignant cysts (where they are present at much higher concentration) will aid future studies that seek to develop marker systems for more easily accessed materials such as blood serum.
Enrichment with AAL provided an important means for targeting fucosylated glycoproteins for proteomic analysis. A label-free quantitative approach was further used to identify a number of proteins that were up-regulated in the cyst fluids that also contained multiply fucosylated glycans. Pancreatic α-amylase and triacylglycerol lipase were two of the most abundant proteins in the hyperfucosylated fluids. Both have been used clinically in the past as markers of pancreatic disease, and both are glycoproteins with known sites of N-linked glycosylation. Clinically, assays have been routinely used to estimate their concentrations in serum (44), but little is known about the exact nature of their glycosylation and how it is affected by pancreatic cancer. Unlike many other putative markers of pancreatic cancer that originate in the diseased tissue, both enzymes are present at appreciable levels in the serum (ng/ml) and could thus be targeted for antibody capture followed by lectin detection. Moreover, with high sensitivity immunoaffinity chromatography, it could be possible to isolate these glycoproteins for the individual glycomic profiling of each.
Following the lectin enrichment, elastase 2A and elastase 3A were observed in ~5-fold and ~11-fold higher abundance, respectively, in the multiply fucosylated fluids. Although elastase 3A has one site for N-linked glycosylation, elastase 2A does not contain a canonical N-glycan motif, nor has it been identified to be O-glycosylated. However, a previous study of the elastase proteins by Wendorf et al. (45) identified glycans containing one to three fucose residues. Several of these N-glycans do not fit the typical pattern in human glycan biosynthesis, but we identified many of the same structures. These glycans did not contain N-acetyllactosamine extensions, however, which provide the "scaffold" for the structures with four to six fucoses on the individual glycans identified in this study. It is notable that two of the proteins that were significantly higher in relative abundance following AAL enrichment, namely pancreatic secretory trypsin inhibitor and trypsin-2, have not been identified as glycoproteins, nor do they exhibit the tripeptide motif (NX(S/T), where X is not P) for N-linked glycosylation. Because we lack a thorough understanding of the protein interactome within pancreatic cyst fluid, it is not possible to conclusively explain the presence of these two nonglycosylated proteins following lectin enrichment. Nonetheless, their presence may be attributable in part to interactions with the glycoprotein chymotrypsin-C (caldecrin), in the cases of both proteins, and with several of the trypsins, chymotrypsins, and elastases, in the case of the pancreatic secretory trypsin inhibitor. Regarding trypsin-2 (anionic trypsin), it has recently been described how chymotrypsin-C deactivates all forms of trypsin, and although the reaction kinetics have not been thoroughly characterized, data presented by Szmola and Sahin-Tóth (46) suggest that it is a relatively slow process.
Because AAL enrichment was performed in nondenaturing conditions, additional protein-protein interactions such as this could persist simultaneously. Similarly, the protein pancreatic secretory trypsin inhibitor has been shown to bind strongly to several pancreatic enzymes, including trypsin, chymotrypsin, and elastase (47). The strong relative expression of pancreatic secretory trypsin inhibitor in the multiply fucosylated samples, although probably not related to the enrichment itself, was intriguing because this protein has previously been linked with chronic pancreatitis (48) and a high incidence of pancreatic cancer (22,49).
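The tripeptide motif referred to above (NX(S/T), where X is not P) is simple to check computationally. The following is a minimal sketch of such a scan, written for illustration only; the function name `find_sequons` and the example sequence are hypothetical, and the sequence is not taken from any protein discussed here. A lookahead is used so that overlapping motifs are all reported.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead keeps the match zero-width after N, so overlapping motifs are found.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-(S/T) motifs, X != P."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Hypothetical peptide: NGS and NKT match; NPT is excluded because X is proline.
print(find_sequons("MNGSAANPTQNKT"))  # [2, 11]
```

An empty result indicates only that the canonical motif is absent; presence of the motif does not by itself show that a site is occupied.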
In summary, 21 cyst fluids were collected by fine needle aspiration directly from the pancreatic cysts during surgical resection. A total of 20 fluids from four clinical diagnoses (SCA, PC, MCN, and IPMN) were glycomically profiled by MALDI-TOF-MS, and 21 were proteomically profiled by LC-MS/MS. Six of the cyst fluids exhibited high levels of multiple fucosylation. A lectin enrichment investigation in which AAL was used to target fucosylated glycoproteins revealed several proteins that were significantly increased in abundance in the hyperfucosylated cyst fluids. Among the proteins that were significantly overexpressed in the multiply fucosylated samples were several abundant glycoproteins, including pancreatic α-amylase, triacylglycerol lipase, elastase 2A, elastase 3A, and bile salt-stimulated lipase. These glycoproteins are strong candidates to bear the multiply fucosylated N-linked glycans, although a targeted study in which each protein would be purified through immunoaffinity chromatography, or a comparable technique, would be necessary to determine the exact nature of their glycosylation pattern.