Simultaneous Enrichment of Plasma Soluble and Extracellular Vesicular Glycoproteins Using Prolonged Ultracentrifugation-Electrostatic Repulsion-hydrophilic Interaction Chromatography (PUC-ERLIC) Approach

Plasma glycoproteins and extracellular vesicles represent excellent sources of disease biomarkers, but laboratory detection of these circulating structures is limited by their relatively low abundance in complex biological fluids. Although intensive research has led to the development of effective methods for the enrichment and isolation of either plasma glycoproteins or extracellular vesicles from clinical materials, at present it is not possible to enrich both structures simultaneously from an individual patient sample. A method that affords the identification of biomarker combinations from both entities for the prediction of clinical outcomes would be clinically useful. We have therefore developed an enrichment method for use in mass spectrometry-based proteomic profiling that couples prolonged ultracentrifugation with electrostatic repulsion-hydrophilic interaction chromatography, to facilitate the recovery of both glycoproteins and extracellular vesicles from nondepleted human plasma. Following prolonged ultracentrifugation, plasma glycoproteins and extracellular vesicles were concentrated as a yellow suspension, and simultaneous analysis of low-abundance secretory and vesicular glycoproteins was achieved in a single LC-MS/MS run. Using this systematic prolonged ultracentrifugation-electrostatic repulsion-hydrophilic interaction chromatography approach, we identified a total of 127 plasma glycoproteins at a high level of confidence (FDR ≤ 1%), including 48 glycoproteins with concentrations ranging from pg/ml to ng/ml. The novel enrichment method we report should facilitate future human plasma-based proteome and glycoproteome studies that will identify novel biomarkers, or combinations of secreted and vesicle-derived biomarkers, that can be used to predict clinical outcomes in human patients.

Human plasma contains proteins released from multiple organs and tissue compartments, including both secreted molecules and extracellular vesicles that reflect the physiological or pathological status of the cell of origin (1). In addition to ease of access, plasma represents one of the most comprehensive sources of potential biomarkers of disease processes, hence proteomic profiling of human plasma can provide clinically useful information on a wide range of human pathologies. The plasma proteome contains asparagine-linked (N-linked) glycosylated proteins from the secreted proteome, as well as proteins enclosed in extracellular vesicles or shed from plasma membranes, and these low-abundance glycoproteins play key roles in cell-cell interactions, signal transduction, protein function modulation, developmental pathways, immune defense, and disease pathogenesis (2,3). Given that glycosylation is the most prevalent post-translational modification of human proteins (2,4), methods that can enrich these low-abundance molecules for characterization of aberrant glycosylation should enable the identification of novel biomarkers of clinical outcomes.
The most clinically useful protein biomarkers in cancer medicine are glycosylated molecules, including HER2 in breast cancer, PSA in prostate cancer, CEA in colorectal cancer, CA-125 in ovarian cancer, and α-fetoprotein in hepatocellular carcinoma (5,6). To date, MS coupled with high-resolution liquid chromatography (LC) has been the preferred tool for the identification of trypsin-digested N-glycopeptides and glycosylation sites (7,8). However, study of the human plasma glycoproteome remains challenging because of the vast dynamic range of protein abundance, which can span up to 12 orders of magnitude (8). In addition, the low ionization efficiency of plasma glycopeptides, combined with the relatively high quantities of interfering nonglycosylated counterparts in the sample, can often result in signal suppression when subjected to MS analyses (7). This complexity exceeds the analytical capabilities of MS techniques (up to five orders of magnitude) for detection of low-abundance glycopeptides; hence detection of low-abundance glycoproteins from the whole tryptic digest can be an extremely arduous task. Advances in MS technology, combined with the development of effective fractionation and enrichment strategies to simplify plasma prior to MS analyses, have facilitated the identification of lower abundant species in complex biofluids (9,10). In efforts to enhance the detection of the N-glycoproteome, selective enrichment of N-glycopeptides has been attempted using diverse strategies including lectin affinity chromatography (11,12), solid-phase extraction of N-linked glycoproteins (13), size exclusion chromatography (14), boronic acid affinity chromatography (15), titanium dioxide chromatography (16), hydrophilic interaction chromatography (17,18), and electrostatic repulsion-hydrophilic interaction chromatography (ERLIC) (19,20).
Although these methodologies are effective in achieving higher coverage of the glycoproteome, removal of important albumin and/or solid support-bound proteins may result in inadvertent loss of potential protein markers (21).
Circulating extracellular vesicles and exosomes are released by a multitude of different cell types under both physiological and pathological conditions (22). Although extracellular vesicles can be isolated using methods such as filtration concentration, flotation density gradients and immunocapture beads, the most widely used method is successive centrifugation followed by ultracentrifugation (UC) (23,24). Extracellular vesicles are known to be involved in myriad cellular processes including coagulation (25), inflammation (26), tumor progression (27), immune response regulation (28), ecto-domain shedding of membrane proteins (29), antigen presentation (30), intracellular communication (31), transfer of RNA and proteins (32) and transfer of infectious cargo (prions and retroviruses) (33,34). In addition, extracellular vesicles have been implicated in the pathogenesis of vascular disorders (35,36), cancers (37,38), diabetes (39) and infectious diseases (40,41). Collectively, these data highlight the immense potential of using these blood-based biomarkers to predict diverse clinical outcomes.
Given that both plasma glycoproteins and extracellular vesicles represent excellent sources of potential disease biomarkers, a method that supports simultaneous enrichment of both entities from an individual bodily fluid sample would enable the identification of novel biomarkers, or combinations of biomarkers, that can predict clinical outcomes in human patients. We therefore developed a systematic enrichment method that couples prolonged ultracentrifugation (PUC) with ERLIC [PUC-ERLIC], for the simultaneous isolation of secretory and extracellular vesicle-enriched glycoproteins from nondepleted human plasma. We show that application of this method facilitates the successful recovery of both high- and low-abundance plasma glycoproteins with concentrations ranging from pg/ml to mg/ml. Using this combined PUC-ERLIC approach, we achieved high-confidence identification (FDR ≤ 1%) of a total of 599 unique N-glycopeptides and 361 unique N-glycosylation sites which were assigned to 127 glycoproteins in nondepleted plasma, including 48 glycoproteins that displayed concentration ranges as low as pg/ml to ng/ml.

EXPERIMENTAL PROCEDURES
All water used in the experiments was prepared using a Milli-Q system (Millipore, Bedford, MA). All chemicals were purchased from Sigma-Aldrich (St Louis, MO) unless stated otherwise.
Human Plasma-The study was approved by the institutional review board of National University Hospital Singapore (NUHS). Blood was obtained from heart disease patients (n = 20) prior to coronary artery bypass graft surgery (CABG) and then stored on ice until plasma isolation (conducted within 4 h of collection into lithium-heparin vacutainers). Plasma was separated from whole blood by centrifugation at 2500 × g for 30 min at 4°C, and then frozen at −150°C in the NUHS AtheroExpress Repository until subsequent proteomic processing. Two biological replicates were performed in this study, and plasma samples from all 20 patients were combined in equal proportions to obtain a total volume of 5 ml prior to analysis (in order to minimize biological variation). Written informed consent was obtained from all study participants.
PUC-Based Enrichment of Secretory Glycoprotein and Extracellular Vesicles-A total of 5 ml pooled plasma was diluted with 25 ml of 1× phosphate-buffered saline (PBS) and then differentially centrifuged at 200 × g (30 min), 2000 × g (30 min) and 12,000 × g (60 min) at 4°C to exclude intact cells and cellular debris. The resultant supernatant was aliquoted into a 25 × 89 mm polycarbonate tube (Type 50.2 Ti rotor, Beckman Coulter, Brea, CA) and enriched plasma glycoproteins and extracellular vesicles were pelleted at 200,000 × g (18 h, 4°C) using a Beckman L100-XP Ultracentrifuge (Beckman Coulter, Brea, CA). The enriched glycoprotein and extracellular vesicular fraction was then resuspended in 1× PBS and pelleted again at 200,000 × g (18 h, 4°C) to remove residual contaminants.
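The centrifugation steps above are stated as relative centrifugal force (× g), whereas many instruments are programmed in rpm. The following minimal sketch applies the standard conversion RCF = 1.118 × 10^-5 · r · N² (r in cm, N in rpm). The rotor radius in the example is an illustrative assumption, not a value taken from this study, and should be replaced with the figure given in the rotor manual.

```python
import math

# Standard conversion factor for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Return the rotor speed (rpm) that yields `rcf` (x g) at `radius_cm`."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Return the relative centrifugal force (x g) at `rpm` and `radius_cm`."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius_cm positive")
    return RCF_CONSTANT * radius_cm * rpm ** 2


# Hypothetical radius used only for illustration; check the rotor manual.
ASSUMED_RADIUS_CM = 10.8

if __name__ == "__main__":
    rpm = rcf_to_rpm(200_000, ASSUMED_RADIUS_CM)
    print(f"200,000 x g at r = {ASSUMED_RADIUS_CM} cm needs about {rpm:.0f} rpm")
```

Because the force depends on the radius, the same × g value corresponds to different rpm settings on the low-speed and ultracentrifuge rotors.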
Cryo-Electron Microscopy (Cryo-EM)-Electron microscope grids coated with holey carbon film (R2/2 Quantifoil) were glow discharged, and a 4 µl droplet of extracellular vesicle-enriched suspension was deposited onto the grid at 99% humidity. Excess liquid was blotted with filter paper and the grid was plunged into liquid ethane (Vitrobot, FEI Company, Hillsboro, OR). Cryo grids were imaged using a field emission gun transmission electron microscope operated at 80 kV (Arctica, FEI Company, Hillsboro, OR), and equipped with a direct electron detector (Falcon II, FEI Company, Hillsboro, OR). Images were recorded at a nominal magnification of 23,500×.
In-Solution Tryptic Digestion-Proteomic sample preparation was performed according to previously described methods designed to minimize experimentally-induced deamidation (42,43), except for minor modifications. Briefly, the secretory and extracellular vesicle-enriched glycoproteins were resuspended in 2 ml lysis buffer containing 8 M urea and 50 mM ammonium acetate (pH 6.0). Plasma proteins were quantified in a 96-well plate using a bicinchoninic acid (BCA) assay according to the manufacturer's protocol. Disulfide bonds were reduced by incubating 300 µg protein in 20 mM dithiothreitol (DTT) for 3 h at 30°C, and were then alkylated in the dark using 55 mM iodoacetamide (IAA) for 1 h at room temperature. Prior to tryptic digestion, the urea concentration was diluted to less than 1 M using 50 mM ammonium acetate buffer (pH 6.0) to ensure optimal trypsin activity. Proteins were enzymatically digested overnight at 37°C using sequencing-grade trypsin (Promega, Madison, WI) at a 1:100 ratio (w/w, trypsin/protein). The tryptic peptides were desalted using a Sep-Pak C18 cartridge (Waters, Milford, MA) and the eluted peptides were dried in a vacuum concentrator.
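The dilution and enzyme quantities in the digestion step follow from simple arithmetic (C1·V1 = C2·V2 and a 1:100 w/w ratio). The sketch below illustrates this. The 0.1 ml sample volume in the example is a hypothetical value chosen for illustration, as the volume of the 300 µg aliquot is not stated in the text.

```python
def buffer_to_add_ml(volume_ml: float, start_molar: float, target_molar: float) -> float:
    """Volume of urea-free buffer needed to dilute a sample from
    `start_molar` to `target_molar` urea (C1*V1 = C2*V2)."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    if not 0 < target_molar <= start_molar:
        raise ValueError("target must be positive and not exceed the start concentration")
    return volume_ml * (start_molar / target_molar - 1.0)


def trypsin_mass_ug(protein_ug: float, protein_per_trypsin: float = 100.0) -> float:
    """Trypsin mass for a 1:`protein_per_trypsin` (w/w, trypsin/protein) ratio."""
    if protein_ug < 0 or protein_per_trypsin <= 0:
        raise ValueError("invalid protein mass or ratio")
    return protein_ug / protein_per_trypsin


if __name__ == "__main__":
    sample_ml = 0.1  # hypothetical aliquot volume, for illustration only
    # Diluting exactly to 1 M is the boundary case; add more buffer to stay below 1 M.
    print(f"Add at least {buffer_to_add_ml(sample_ml, 8.0, 1.0):.2f} ml buffer")
    print(f"Trypsin for 300 ug protein: {trypsin_mass_ug(300.0):.1f} ug")
```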
ERLIC Fractionation-Selective enrichment of glycosylated peptides was performed according to a previously described method (44), with minor modifications. Briefly, vacuum-dried peptides were reconstituted in 200 µl mobile phase A (80% acetonitrile [ACN] containing 0.1% formic acid [FA]) and fractionated using a PolyWAX LP weak anion-exchange column (4.6 × 200 mm, 5 µm, 300 Å; PolyLC, Columbia, MD) on a Prominence UFLC system (Shimadzu, Kyoto, Japan). The UV spectra of the peptides were collected at 280 nm. Mobile phase A and mobile phase B (10% ACN, 2% FA) were used with a 90 min gradient of 0-5% B over 8 min and 5-28% B over 37 min followed by 45 min at 100% B (constant flow rate of 1 ml/min). Forty-five separate fractions were collected, combined into 14 pooled fractions, and then vacuum-dried (illustration available in supplementary Fig. S1).
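The ERLIC gradient described above can be written as a piecewise function of time. The sketch below is one reading of it, assuming linear ramps for the first two segments and an immediate step to 100% B after 45 min. The step transition is an assumption, since the text does not specify a ramp time.

```python
# (time in min, % mobile phase B) breakpoints of the two linear ramps.
RAMP = [(0.0, 0.0), (8.0, 5.0), (45.0, 28.0)]
HOLD_PERCENT_B = 100.0  # held from 45 to 90 min
TOTAL_MIN = 90.0


def percent_b(t_min: float) -> float:
    """Return % B at time `t_min`, assuming linear ramps and a step to the hold."""
    if not 0.0 <= t_min <= TOTAL_MIN:
        raise ValueError("time outside the 0-90 min gradient")
    if t_min > RAMP[-1][0]:
        return HOLD_PERCENT_B  # assumed instantaneous step after the ramps
    for (t0, b0), (t1, b1) in zip(RAMP, RAMP[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover 0-45 min")


if __name__ == "__main__":
    for t in (0, 4, 8, 26.5, 45, 60, 90):
        print(f"{t:>5} min: {percent_b(t):6.2f}% B")
```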
LC-MS/MS-The fractionated peptides were separated and analyzed using an LC-MS/MS system that comprised an Ultimate 3000 RSLC nano-HPLC system (Dionex, Amsterdam, NL) coupled to an online LTQ-FT Ultra linear ion trap mass spectrometer (Thermo Scientific Inc., Bremen, Germany). Approximately 3 µg of peptides from each fraction were injected into a Zorbax peptide trap column (Agilent Technologies, Santa Clara, CA) via the auto-sampler of the Dionex system, and were subsequently resolved in a capillary column (75 µm × 10 cm) which was packed with C18 AQ (5 µm, 300 Å; Bruker-Michrom, Auburn, CA), and run at a flow rate of 300 nL/min. Buffer A (0.1% FA in HPLC water) and buffer B (0.1% FA in ACN) were used to establish the 60 min gradient, starting with 1 min of 5-8% B, 44 min of 8-32% B, 7 min of 32-55% B, 1 min of 55-90% B and 2 min of 90% B, followed by re-equilibration in 5% B for 5 min. The samples were ionized in an ADVANCE™ CaptiveSpray™ Source (Bruker Michrom, Billerica, MA) with an electrospray potential of 1.5 kV. The LTQ-FT Ultra was set to perform data acquisition in the positive ion mode. A full MS scan (350-1600 m/z range) was acquired in the FT-ICR cell at a resolution of 100,000 and a maximum ion accumulation time of 1000 ms. The automatic gain control target for FT was set at 1 × 10^6, and precursor ion charge state screening was activated. The linear ion trap was used to collect peptides and measure the peptide fragments generated by CID. The default automatic gain control setting was used in the linear ion trap (full MS target 3.0 × 10^4, MSn 1 × 10^4). The 10 most intense ions above a 500 count threshold were selected for fragmentation in CID (MS2), which was performed concurrently with a maximum ion accumulation time of 200 ms. Dynamic exclusion was activated for the process, with a repeat count of one and exclusion duration of 60 s. For CID, the activation Q was set at 0.25, isolation width (m/z) was 2.0, activation time was 30 ms, and normalized collision energy was 35%.
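The data-dependent acquisition settings above (the 10 most intense ions above 500 counts, with a repeat count of one and 60 s dynamic exclusion) can be illustrated with a simplified selection routine. This is a conceptual sketch only and does not reproduce the instrument control software. The m/z window used to match excluded precursors is a hypothetical parameter that is not given in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

TOP_N = 10
MIN_INTENSITY = 500.0       # counts; ions must exceed this threshold
EXCLUSION_S = 60.0          # dynamic exclusion duration
EXCLUSION_WINDOW_MZ = 0.01  # hypothetical matching window, not from the text


@dataclass
class PrecursorSelector:
    # Maps an excluded precursor m/z to the time (s) at which exclusion ends.
    excluded: Dict[float, float] = field(default_factory=dict)

    def _is_excluded(self, mz: float) -> bool:
        return any(abs(mz - ex) <= EXCLUSION_WINDOW_MZ for ex in self.excluded)

    def select(self, peaks: List[Tuple[float, float]], now: float) -> List[float]:
        """Pick up to TOP_N precursor m/z values from one full MS scan.

        `peaks` holds (m/z, intensity) pairs; `now` is the scan time in s.
        """
        # Forget exclusions that have expired.
        self.excluded = {mz: end for mz, end in self.excluded.items() if end > now}
        candidates = [p for p in peaks
                      if p[1] > MIN_INTENSITY and not self._is_excluded(p[0])]
        chosen = sorted(candidates, key=lambda p: p[1], reverse=True)[:TOP_N]
        # Repeat count of one: exclude each ion as soon as it has been selected once.
        for mz, _ in chosen:
            self.excluded[mz] = now + EXCLUSION_S
        return [mz for mz, _ in chosen]
```

Calling `select` on successive scans shows how exclusion lets less intense precursors be sampled once the dominant ions have been fragmented.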
Data Analyses-The extract_msn program (version 4.0) as found in Bioworks Browser 3.3 (Thermo Electron, Bremen, Germany) was used to extract tandem MS spectra in dta format from the raw data of the LTQ-FT Ultra. Protein identification was performed by querying against the extracted UniProt Human database (released on 11/29/2013; 176,946 sequences, 70,141,034 residues), by means of an in-house Mascot server (version 2.4.1, Matrix Science, Boston, MA). A target-decoy search strategy was employed for the estimation of false positive identification. The search was limited to a maximum of two missed trypsin cleavages; #13C of 2; peptide precursor mass tolerances of 5.1 ppm; and 0.8 Da mass tolerance for fragment ions. Carbamidomethylation of cysteine residues was set as a fixed modification, whereas oxidation of methionine residues and deamidation of asparagine residues were set as variable modifications. Mascot results were exported to csv file format and then further processed in Microsoft Excel. Identified peptides were sorted from smallest to largest Mascot peptide expect value (a measure of random match probability), and largest to smallest ion score. The resultant peptide list was used for calculation of false discovery rate [FDR = 2 × (decoy hits/total hits) × 100%], using an in-house script. The FDR associated with the searched data set was 1% at the peptide level. In order to reduce the presence of outliers in our data set, only peptides with ion scores greater than homology or identity scores were selected for further analysis.
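The FDR formula quoted above can be applied to a ranked peptide list as follows. The in-house script itself is not described in the text, so this is only a minimal sketch of the stated formula. It assumes that "total hits" counts target and decoy matches together, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def fdr_percent(decoy_hits: int, total_hits: int) -> float:
    """FDR = 2 * (decoy hits / total hits) * 100%."""
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    return 2.0 * decoy_hits / total_hits * 100.0


def hits_within_fdr(psms: Iterable[Tuple[float, bool]], max_fdr_percent: float = 1.0) -> int:
    """Return how many top-ranked hits can be kept at or below the FDR limit.

    `psms` holds (expect value, is_decoy) pairs; hits are ranked from the
    smallest to the largest expect value, as described in the text.
    """
    ranked = sorted(psms, key=lambda p: p[0])
    decoys = 0
    best = 0
    for n, (_, is_decoy) in enumerate(ranked, start=1):
        decoys += int(is_decoy)
        if fdr_percent(decoys, n) <= max_fdr_percent:
            best = n  # longest prefix of the ranked list that meets the limit
    return best


if __name__ == "__main__":
    print(f"5 decoys in 1000 hits: {fdr_percent(5, 1000):.2f}% FDR")
```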

RESULTS AND DISCUSSION
Development of a Novel PUC-ERLIC Strategy for Glycoproteome Enrichment-Plasma glycoproteins and extracellular vesicles represent excellent sources of disease biomarkers, but technical challenges have so far prevented the isolation of both structures in parallel from individual patient samples. In order to develop an efficient new method for simultaneous isolation of glycoproteins and extracellular vesicles from human plasma, we first sought to simplify the composition of this highly complex biological fluid using ultracentrifugation (UC). We therefore prediluted human plasma fivefold and increased our standard UC speed and duration (from 100,000 × g for 2 h to 200,000 × g for 18 h) in order to reduce plasma viscosity and improve sedimentation efficiency (Fig. 1). We recovered a visible yellow suspension that was found to be highly enriched in soluble glycoproteins and extracellular vesicles when extracted and digested into peptides using trypsin under denaturing conditions. The glycopeptides from soluble glycoproteins and extracellular vesicles were then further enriched using ERLIC. Finally, the glycopeptide N-linked glycans were cleaved using PNGase F prior to analysis by reverse-phase LC-MS/MS.
Isolation of Extracellular Vesicles from Human Plasma-UC is the most common method used for extracellular vesicle isolation, but this approach often results in copelleting of protein aggregates and other membrane fragments (23). We recognize that trace contaminants may inadvertently be present in our extracellular vesicle fraction; therefore we define our preparation as "extracellular vesicles-enriched" in place of "pure extracellular vesicles" in this study.
The enrichment of extracellular vesicles using PUC was ascertained by Cryo-EM (Fig. 2A and 2B), Western blotting (Fig. 2C) and proteomic analyses (detailed searched information available in supplemental Data S1, worksheet 18hMarkers01). When viewed by EM, the harvested extracellular vesicles displayed circular morphology, were membrane-bound, and exhibited a size range comparable to that of exosomes (50-100 nm diameter) (32). Western blot analyses of widely-used exosomal markers including Alix, CD9, and CD81 (52) confirmed the presence of extracellular vesicles in both of the biological replicates obtained by PUC, and numerous exosome-specific membrane proteins were detected in our proteomic data set, including FLOT1, TSG101, HSP70, ALIX/PDCD6IP, CD9, CD81, and CD63 (52). Together, the data obtained from our EM, Western blot, and proteomic analyses provided strong evidence of extracellular vesicle enrichment using PUC.
PUC Facilitates Simultaneous Enrichment of Secretory Glycoproteins and Extracellular Vesicles-Standard protocols for the isolation of extracellular vesicles typically require 2 h of UC. In order to assess whether extended UC duration favors the recovery of extracellular vesicles and secretory glycoproteins, we compared the data obtained via PUC-ERLIC with a standard UC-ERLIC approach (detailed experimental protocols available in supplemental Procedures). As depicted in Fig. 3A, a translucent yellow pellet was recovered from plasma after 2 h of standard UC, but after 18 h of UC (PUC) the same volume of plasma gave rise to a pellet that was significantly larger, darker in shade, and surrounded by a dense yellow suspension. This observation suggested that PUC may be superior to standard UC in terms of pellet recovery.
We next examined the glycoproteome of the extracellular vesicle-enriched suspensions obtained by standard UC or PUC. As shown in Fig. 3B, the glycoproteome obtained using our novel PUC-ERLIC approach exhibited a fourfold higher representation of asparagine-linked (N-linked) glycosylated proteins compared with the standard UC-ERLIC-generated glycoproteome (detailed searched information available in supplemental Data S2, worksheet PNGaseAll01), indicating enrichment of secretory glycoproteins by PUC. Consistent with these data, 99 of 127 glycoproteins identified after PUC-ERLIC were annotated as secreted proteins in DAVID v6.7 (48,49) (detailed searched information available in supplemental Data S2, worksheet PNGaseNMotif02), indicating that the yellow suspension generated by PUC was highly enriched in soluble glycoproteins. Based on the dramatic visual difference in pellet recovery and glycoproteomic composition, our data show that prolonged UC time facilitates the simultaneous sedimentation of secretory glycoproteins and extracellular vesicles from human plasma.
Despite growing interest in the field of extracellular vesicle biology and the extensive use of UC for their isolation, our report is the first to describe the recovery of a yellow extracellular vesicle fraction enriched with soluble glycoproteins from human plasma. The exact mechanism by which PUC enables enrichment of glycoproteins from plasma is currently unclear. However, we speculate that the larger and extensively branched secretory N-glycoproteins form denser components than their unmodified counterparts, leading to pelleting alongside extracellular vesicles during extended UC. Based on these findings, it appears that PUC-ERLIC is more effective than standard UC-ERLIC for the enrichment of both secretory and low abundant extracellular vesicle-enriched N-glycosylated plasma proteins.

Evaluation of Glycoproteome Enrichment by PUC-ERLIC-In order to assess the contributions of PUC and ERLIC to plasma glycoproteome enrichment when using our novel strategy, we next compared the protein/peptide composition of samples subjected to PUC alone, ERLIC alone, or PUC-ERLIC combined (detailed experimental protocols available in supplemental Procedures). As shown in Fig. 4A, the percentage of unique plasma N-glycoproteins recovered was significantly higher when using PUC-ERLIC (~62%) compared with ERLIC alone (~34%) or PUC alone (~24%) (detailed searched information available in supplemental Data S2, worksheet PNGaseAll01). The reduced recovery of N-glycoproteome components observed when using ERLIC alone or PUC alone confirmed the known analytical limitations of MS for the detection of low abundance glycoproteins in crude plasma. Approximately 24% of the total unique proteins identified in the PUC sample fractions were glycosylated, indicating that glycoproteins were substantially enriched by extended hours of UC. When comparing the profile of glycoproteins obtained by PUC-ERLIC with those obtained when using PUC alone or ERLIC alone, we observed that a significant number of proteins overlapped, indicating high commonality and complementarity between the three approaches (Fig. 4B). Of a total of 137 glycosylated proteins identified, 59 were detected using more than one approach, corresponding to an overlap of ~43% between the different methods. However, PUC-ERLIC recovered a substantially higher number of unique glycosylated proteins from nondepleted human plasma than either PUC or ERLIC alone. This significant improvement in detection of unique N-glycoproteins (~twofold), N-glycopeptides (~fivefold) and N-glycosites (~fourfold) using PUC-ERLIC indicated synergistic effects of this combined approach for glycoproteome enrichment that exceeded the performance of the individual component methods.
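The overlap figure reported above (59 of 137 glycoproteins, ~43%) is the fraction of all identified proteins that were found by at least two of the three approaches. A minimal sketch of that calculation is given below. The protein identifiers in the example are placeholders, not data from this study.

```python
from collections import Counter
from typing import Iterable, Tuple


def multi_method_overlap(*protein_sets: Iterable[str]) -> Tuple[int, int, float]:
    """Return (shared, total, percent), where `shared` counts proteins
    identified by two or more methods and `total` counts all unique proteins."""
    counts = Counter(p for s in protein_sets for p in set(s))
    total = len(counts)
    if total == 0:
        raise ValueError("no proteins supplied")
    shared = sum(1 for c in counts.values() if c >= 2)
    return shared, total, 100.0 * shared / total


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    puc_erlic = {"A", "B", "C"}
    puc_only = {"B", "D"}
    erlic_only = {"B", "C", "E"}
    print(multi_method_overlap(puc_erlic, puc_only, erlic_only))
    # The reported counts give the same percentage as stated in the text.
    print(f"{100.0 * 59 / 137:.0f}%")
```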
Selective Glycopeptide Enrichment Using ERLIC-We next assessed the efficiency of glycopeptide enrichment by ERLIC when used in our combined PUC-ERLIC method. The mixed-mode chromatography method ERLIC was originally developed for the enrichment of phosphopeptides and separation of charged biomolecules (53), and it has been optimized for the efficient separation of retained glycopeptides from background unmodified peptides based on differences in charge and polarity (19,20). This separation exploits the electrostatic attraction of negatively charged sialylated glycopeptides, and the hydrophilic interaction of carbohydrate groups with the weak anion exchange resin under acidic conditions. When applying this approach to the analysis of human plasma samples, the majority of nonglycosylated peptides elute in the flow-through (fractions 0-3), which is collected during the first 8 min of the gradient MS run (range 70-66% ACN). Using this method, we observed that negatively charged and hydrophilic glycopeptides were eluted after the unmodified peptides and were collected in fractions 4-44 (illustration available in supplemental Fig. S1A). We then combined the individual glycopeptide-enriched fractions into 14 pooled fractions, and detected a gradual increase in the proportions of glycopeptides present in the later fractions (illustration available in supplemental Fig. S1B), indicating that ERLIC facilitates significant enrichment of plasma glycopeptides after PUC.
Assessment of False Positive N-linked Glycosylation Sites-The use of standard alkaline reaction buffers to conduct N-linked glycoproteomic experiments promotes nonenzymatic deamidation of Asn to Asp (42,43), which may hamper the accurate assignment of N-glycosylation sites (54). It has been shown that, without affecting protein and peptide identification, chemically induced deamidation was significantly minimized under acidic digestion and deglycosylation conditions at pH 6.0 (42,43). Here, we evaluate the accuracy with which N-glycosylation sites could be assigned after plasma protein enrichment by PUC-ERLIC. The carbohydrate moiety in N-linked glycosylation is attached to an asparagine residue (Asn/N) followed by a nonproline amino acid (X) and then a serine (S)/threonine (T)/cysteine (C), resulting in a consensus N-X-S/T/C sequence motif that is used in the assignment of N-glycopeptides. Typically, PNGase F treatment is used to deglycosylate the N-glycopeptides via deamidation of Asn to aspartic acid (Asp/D) (+0.984 Da) at the site of glycan attachment, and the corresponding change in peptide mass can be detected using tandem mass spectrometry (MS/MS). Unfortunately, MS/MS is unable to distinguish PNGase F-mediated deamidation from spontaneous or artificially-induced Asn deamidation (54). In order to determine the extent of true-positive versus false-positive assignment of N-glycosylation sites, we next performed an independent PUC-ERLIC experiment without PNGase F treatment and identified a total of just four unique Asn deamidated peptides (detailed searched information available in supplemental Data S3, worksheet NoPNGasAll01) that matched the consensus N-X-S/T/C motif (equivalent to 0.68% FDR for all N-glycopeptides in our study). These falsely assigned N-glycosylated peptides, as shown in Table I, were excluded from further analyses.
In addition, the PUC-ERLIC experiment without PNGase F treatment enabled the identification of 252 unique deamidated peptides that lacked the N-linked sequence and instead contained the common deamidation N-G or N-S motif (55) at the +1 position (detailed searched information available in supplemental Data S3, worksheet NoPNGaseDeamid02), which likely arose from deamidation in vivo. These 252 peptides represented ~12% of the total 2171 unique plasma peptides identified in our study. Given the precautions taken to reduce artificial deamidation when conducting these analyses and the low frequency of chemical deamidation observed, we believe that PUC-ERLIC facilitates robust assignment of N-glycosylation sites based on PNGase F-mediated deamidation.
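The consensus-motif check used for site assignment can be expressed compactly as a pattern match. The following sketch locates N-X-S/T/C sequons (X being any residue except proline) in a peptide sequence and tests whether a deamidated position coincides with one. It is an illustration of the motif rule only, not the authors' processing pipeline, and the function names are hypothetical.

```python
import re
from typing import List

# N followed by any residue except P, then S, T or C. The lookahead keeps
# matches zero-width after N so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][STC])")


def sequon_positions(peptide: str) -> List[int]:
    """Return 1-based positions of Asn residues within an N-X-S/T/C motif."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide.upper())]


def is_candidate_glycosite(peptide: str, deamidated_position: int) -> bool:
    """True if the deamidated Asn (1-based position) lies in a sequon."""
    return deamidated_position in sequon_positions(peptide)


if __name__ == "__main__":
    for seq in ("ALLTNVSSVALGSR", "LVIQNANVSAMYK", "DSCKASCNCSNSIY", "ANPSK"):
        print(seq, sequon_positions(seq))
```

A deamidated Asn outside such a motif, for example in an N-G context, would be treated as a non-glycosylation deamidation event rather than a candidate glycosite.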
Efficacy of PUC-ERLIC for the Enrichment of Low-Abundance Plasma Glycoproteins-Analysis of our PUC-ERLIC glycoproteomic data by querying against the Mascot search engine returned a total of 599 unique N-glycopeptides and 361 unique N-glycosylation sites assigned to 127 distinct glycoproteins in undepleted human plasma (overall confidence level ≥99%; detailed searched information available in supplemental Data S2, worksheet PNGaseAll01; MS/MS spectra of all glycosylated peptides available in supplemental Information S1). In this study, 94-98% of the total detected N-glycosylated sites matched the stringent canonical motif N-X-S/T, with the N-X-T motif being more frequent than the N-X-S motif. A minor proportion of N-glycosites (2-6%) matched the rare N-X-C motif (detailed information available in supplemental Data S2, worksheet PNGaseNMotif02). We next compared the N-glycosylated plasma proteins identified in our study with data sets obtained from previous studies that used common protocols for N-glycoprotein enrichment. Using a similar LC-MS/MS platform to that employed in the current report, Yang et al. (56) used solid phase extraction of N-linked glycoproteins (SPEG) to enable the identification of 462 glycopeptides from 185 N-glycoproteins in nondepleted human plasma. In contrast, Drake et al. (57) used affinity capture workflows to facilitate the detection of 227 N-glycosylation sites covering 119 N-glycoproteins recovered from MARS-depleted plasma. As illustrated in Fig. 5, our PUC-ERLIC approach enabled the detection of 109 N-glycosylated proteins among the total 229 N-glycosylated proteins reported by the different studies (detailed information available in supplemental Data S4, worksheet Compare-Study), which accounts for at least 48% of glycosylated proteins identified. As established methods such as SPEG and lectin affinity have proved to be useful in the enrichment of the glycoproteome, the overlap of ~57% of glycoproteins among the three different methodologies shows that enrichment by PUC-ERLIC is at least as efficient as the current alternative methods. Importantly, there is no other method at present that offers the enrichment of both extracellular vesicles and glycoproteins in a single experiment.
We next determined the presence of known extracellular vesicle proteins in our data set by conducting literature surveys in Exocarta, a web-based central repository that catalogs exosomal RNA, lipids, and proteins (http://www.exocarta.org/) (58). As summarized in Table II, among the total 127 glycoproteins identified in our data set, 58 have previously been cataloged as extracellular vesicle proteins in Exocarta (human plasma data bank) and in consolidated reference studies (59-61). A total of 46% of the extracellular vesicle proteins identified in our study were glycosylated, thus providing further evidence that PUC-ERLIC enables substantial enrichment of vesicle-derived glycoproteins. To our knowledge, our report is the first to describe the glycoprotein composition of extracellular vesicles enriched from human plasma, and accordingly, the remaining 69 glycosylated proteins identified by our analyses were undocumented in the Exocarta human plasma data bank.
Next, to evaluate the dynamic range of detection using our method, we conducted literature surveys to assess the approximate concentrations of the plasma glycoproteins identified in our study. A total of 62% of the 127 glycoproteins identified in our report were present in the μg/ml-mg/ml concentration range, and were annotated among the top 150 "high abundant" plasma proteins (Table III). The 12 most abundant components of plasma comprise ~95% of the total protein content, and albumin alone accounts for around 50% of all plasma protein (62,63). Importantly, these high abundant plasma components did not hinder our detection of 48 low abundant proteins that were identified in the range of pg/ml-ng/ml, including vascular endothelial growth factor receptor 3 (FLT4; 4.1 ng/ml) (64), multiple epidermal growth factor-like domains protein 8 (MEGF8; 4.3 pg/ml) (64), and cysteine-rich secretory protein 3 (CRISP3; 6.3 pg/ml) (65). In order to establish unambiguous identification, a precursor ion mass tolerance of 5 ppm and a peptide FDR ≤ 1% were used in this study. The use of a narrow mass tolerance of 5 ppm eliminates the misassignment of the 13C peak of the native peptide as a deamidated peptide (66,67). A total of 137 unique glycosylated peptides assigned to 48 low abundant proteins were statistically matched (p < 0.05), and most deamidated peptides were identified with high Mascot scores and low expect values (e.g. MEGF8 (ALLTNVSSVALGSR), Peptide Score: 119.99, Expect value: 1.60E-10; FLT4 (LVIQNANVSAMYK), Peptide Score: 53.85, Expect value: 0.00093; CRISP3 (DSCKASCNCSNSIY), Peptide Score: 47.72, Expect value: 5.50E-05), which indicates unambiguous identification of the homologous proteins (detailed information available in supplemental Data S2, worksheet PNGaseLowAbund03). The pertinent MS/MS spectra and fragment ion assignments of all 137 low abundant peptides have been manually curated and documented in supplemental Information S2.
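The reasoning behind the 5 ppm precursor tolerance can be made concrete with a short calculation. Deamidation of asparagine adds about 0.984016 Da, whereas the spacing between the monoisotopic and first 13C isotope peaks is about 1.003355 Da, so the two candidate masses differ by roughly 0.019 Da. The sketch below converts that gap to ppm for a given peptide mass. The constants are standard monoisotopic values, the function names are hypothetical, and the 1500 Da peptide is an illustrative assumption rather than a value taken from the data set.

```python
# Minimal sketch: can a precursor tolerance separate a deamidated peptide
# from the 13C isotope peak of its native (non-deamidated) form?

DEAMIDATION_SHIFT_DA = 0.984016  # Asn -> Asp mass shift
C13_SPACING_DA = 1.003355        # mass difference between 13C and 12C
GAP_DA = C13_SPACING_DA - DEAMIDATION_SHIFT_DA  # about 0.0193 Da


def ppm_gap(peptide_mass_da: float) -> float:
    """Gap between the 13C peak and the deamidated mass, in ppm of the peptide mass."""
    return GAP_DA / peptide_mass_da * 1e6


def max_resolvable_mass(tolerance_ppm: float) -> float:
    """Largest peptide mass (Da) for which the gap still exceeds the tolerance."""
    return GAP_DA / tolerance_ppm * 1e6


if __name__ == "__main__":
    mass = 1500.0  # hypothetical peptide mass in Da
    print(f"gap at {mass:.0f} Da: {ppm_gap(mass):.2f} ppm")
    print(f"5 ppm separates the two up to ~{max_resolvable_mass(5.0):.0f} Da")
```

Under these assumptions the gap stays above 5 ppm for peptides up to roughly 3.9 kDa, which covers most tryptic peptides. A wider tolerance such as 20 ppm would let the two assignments overlap for peptides above about 1 kDa.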
In addition, all individual low abundant glycosylated peptide sequences (deamidated, with a D residue) and their nonmodified counterparts (native, with an N residue) were submitted to NCBI BLASTP for a complementary sequence homology search (50,51). The protein homologue hits obtained from the BLASTP (50,51) search were consistent with the protein hits obtained from the Mascot search (detailed information available in supplemental Information S3). The overall sequence identity when examined among species was considerably high (> 90%), and none of these peptides shared sequence homology with another protein in the NCBI (72) nr database. Thus, we conclude that the possibility of low abundant protein misassignment by Mascot is low in our data set. Because our enrichment method enabled the efficient detection of low abundant glycoproteins in plasma, it may be possible to use this technique to identify N-glycosylated biomarkers and potential therapeutic targets in human patient samples. Indeed, our analyses of human plasma from heart disease patients revealed the presence of lymphatic vessel endothelial hyaluronic acid receptor 1 (LYVE1), a lymphangiogenic glycoprotein that is reportedly involved in myocardial remodeling after infarction (68,69). Our novel PUC-ERLIC approach may therefore enable the detection of cardiac glycoproteins involved in myocardial healing, and could potentially be used to identify biomarkers of distinct clinical outcomes in patients undergoing heart surgery. The fact that we achieved a dynamic range of detection greater than 10^8 without the use of any immunodepletion strategies also reflects the effectiveness of PUC-ERLIC for reducing plasma complexity.
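The dynamic range figure follows from the ratio between the highest and lowest concentrations detected. As a rough check, the sketch below converts two concentrations to a common unit and reports the span in orders of magnitude. The 4.3 pg/ml value is the MEGF8 concentration cited above; the 1 mg/ml upper value is an illustrative assumption standing in for the lower edge of the mg/ml range, not a measured concentration from this study, and the function name is hypothetical.

```python
import math

# Minimal sketch: span between two plasma concentrations in orders of magnitude.
# Conversion factors to g/ml.
UNIT_TO_G_PER_ML = {"pg/ml": 1e-12, "ng/ml": 1e-9, "ug/ml": 1e-6, "mg/ml": 1e-3}


def orders_of_magnitude(low: float, low_unit: str, high: float, high_unit: str) -> float:
    """log10 of the ratio between a high and a low concentration."""
    low_g = low * UNIT_TO_G_PER_ML[low_unit]
    high_g = high * UNIT_TO_G_PER_ML[high_unit]
    return math.log10(high_g / low_g)


if __name__ == "__main__":
    # MEGF8 at 4.3 pg/ml versus an assumed 1 mg/ml high abundant protein.
    span = orders_of_magnitude(4.3, "pg/ml", 1.0, "mg/ml")
    print(f"dynamic range spans about {span:.1f} orders of magnitude")
```

Even with this conservative upper value the span exceeds eight orders of magnitude, in line with the range stated in the text.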
These data show that provided the samples are prepared well prior to analysis, it is possible to increase the depth of glycoproteomic data obtained from complex biological samples by developing sensitive detection technologies, and that these technologies have significant potential to uncover novel biomarkers of diseases.

Enrichment of Plasma Soluble and Vesicular Glycoproteins
Functional Annotation of Plasma Glycoproteins-In order to determine the utility of PUC-ERLIC for the identification of glycoproteomic biomarkers for possible use in the clinic, we next assessed the molecular functions, biological processes, and subcellular localization of the secretory and extracellular vesicle-derived molecules identified in our study. To do this, we conducted Gene Ontology (GO) annotation using tools including STRAP1.5 (45), AmiGO2 (46,47), and DAVID v6.7 (48,49) (detailed annotation available in supplemental Data S5, worksheet GOAnn). GO molecular function categorization (Fig. 6A) suggested that binding activity (50.6%) was overrepresented among the glycoproteins we identified, followed by catalytic activity (18.3%) and regulation of enzyme activity (15.2%), all of which are known molecular roles of N-glycoproteins. The principal biological functions associated with these molecules were regulation (28.0%), cellular processes (16%), and metabolic processes or responses to stimuli (12%) (Fig. 6B). Glycoproteins involved in immune responses and interaction with cells and organisms were equally well represented in our data set (both 9%). Other known physiological functions of glycoproteins such as localization (9%), developmental processes (6%) and cell growth (1%) were also annotated.
Because of the location of glycosyltransferases in the endoplasmic reticulum (ER) and Golgi apparatus, N-glycosylation is typically restricted to secreted proteins and extracellular membrane proteins (2). Some of the N-glycosylated proteins in our combined data set (9.9%) exhibited annotations in atypical locations (cytoplasmic, nuclear, or mitochondrial), but further analysis revealed that these annotations were nonexclusive. In contrast, GO annotation of subcellular localization for the remaining glycoproteins predicted that the majority (69.4%) were either extracellular or expressed at the plasma membrane (Fig. 6C). Consistent with these data, we also detected two extracellular vesicle-specific proteins (GO:0070062): Cadherin-related family member 5 (CDHR5) and Lysosome-associated membrane glycoprotein 2 (LAMP2).
Extraction of membrane proteins, which comprise 20-30% of the total proteome, has until now been extremely challenging because of their low abundance in accessible biological fluids and the inherent hydrophobicity of cellular membranes (70). Strikingly, our results show that substantial recovery of glycosylated plasma membrane proteins (20.3%) can be achieved by PUC-ERLIC, in parallel with the enrichment of secreted N-glycosylated proteins, thus enabling a wide range of proteomic coverage that could be applied to the identification of candidate biomarkers (or combinations of biomarkers), as well as the discovery of potential therapeutic targets in human diseases.

CONCLUSIONS
Circulating glycoproteins and extracellular vesicles secreted from damaged/infected host cells and tissues represent excellent sources of potential biomarkers of human pathologies. In this study, we showed that glycoprotein enrichment by PUC-ERLIC facilitates the simultaneous recovery of both secretory and extracellular vesicle-derived glycoproteins from nondepleted human plasma, which could in part be attributed to the synergistic effects of using PUC and ERLIC in combination. When using PUC, we were able to recover a yellow, extracellular vesicle-containing fraction (confirmed by cryo-EM, Western blot and proteomic analyses), and this fraction was found to be highly enriched in secretory glycoproteins that could potentially be probed for prognostic/diagnostic biomarkers. Indeed, the low levels of chemical deamidation observed in our study and the correspondingly low FDR (0.68%) indicate that our approach enabled both significant glycoprotein enrichment and robust assignment of N-glycosylation sites, thus achieving the level of performance and accuracy required to enable these data to be translated into real clinical applications.
In sum, in a single LC-MS/MS run, we confidently identified a total of 127 glycoproteins in human plasma samples, represented by 599 unique glycopeptides and 361 unique glycosylation sites (FDR ≤ 1%). Our novel PUC-ERLIC method enabled the detection of 48 low-abundant glycoproteins (pg/ml-ng/ml) in human plasma without the need for prior depletion of high-abundance molecules; hence, this approach may offer a new way of overcoming the technical challenges associated with delineating the complex protein composition of human plasma. Using plasma obtained from heart disease patients, we identified LYVE1, a cardiac-specific remodeling glycoprotein that has been reported to be involved in myocardium healing. The proteomics data set obtained in this study not only directly addresses the current paucity of knowledge on the glycoproteomic composition of extracellular vesicles in human plasma, but also shows potential to assist the future development of candidate biomarkers for use in a wide variety of clinical diseases.