Advancing a High Throughput Glycotope-centric Glycomics Workflow Based on nanoLC-MS2-product Dependent-MS3 Analysis of Permethylated Glycans

The intrinsic nature of glycosylation, namely nontemplate encoded, stepwise elongation and termination with a diverse range of isomeric glyco-epitopes (glycotopes), translates into ambiguity in most cases of mass spectrometry (MS)-based glycomic mapping. It is arguable whether one needs to delineate every single glycomic entity, an effort that may be counterproductive. Instead, one should focus on identifying as many structural features as possible that would collectively define the glycomic characteristics of a cell or tissue, and how these may change in response to self-programmed development, immuno-activation, and malignant transformation. We have been pursuing an analytical strategy that homes in on identifying the terminal sulfo-, sialyl, and/or fucosylated glycotopes by comprehensive nanoLC-MS2-product dependent MS3 analysis of permethylated glycans, in conjunction with development of a data mining computational tool, GlyPick, to enable an automated, high throughput, semi-quantitative glycotope-centric glycomic mapping amenable to even nonexperts. We demonstrate in this work that diagnostic MS2 ions can be relied on to inform the presence of specific glycotopes, whereas their possible isomeric identities can be resolved at MS3 level. Both MS2 and associated MS3 data can be acquired exhaustively and processed automatically by GlyPick. The high acquisition speed, resolution, and mass accuracy afforded by the top-notch Orbitrap Fusion MS system now allow a sensible spectral count and/or summed ion intensity-based glycome-wide glycotope quantification. We report here the technical aspects, reproducibility and optimization of such an analytical approach that uses the same acidic reverse phase C18 nanoLC conditions fully compatible with proteomic analysis to allow rapid hassle-free switching. We further show how this workflow is particularly effective when applied to larger, multiply sialylated and fucosylated N-glycans derived from mouse brain. The complexity of their terminal glycotopes, including variants of fucosylated and disialylated type 1 and 2 chains, would otherwise not be adequately delineated by any conventional LC-MS/MS analysis.

Molecular & Cellular Proteomics 16: 10.1074/mcp.TIR117.000156, 2268-2280, 2017.
One of the most conspicuous hallmarks of protein glycosylation is its extreme structural heterogeneity driven by nontemplate encoded stepwise elongation and branching from a few invariant core structures (1). Terminal or peripheral sialylation, fucosylation and/or sulfation then generate a plethora of terminal glyco-epitopes, or glycotopes, distributed over myriad carrier glycans and proteins. MS-based glycomics (2,3), despite being highly sensitive in affording high precision mass measurement at high throughput, often fails to reveal the least abundant glycotopes. A glycotope can be a very minor constituent of the glycome and yet be highly relevant when carried and presented properly at specific sites of receptors. Any change in abundance or additional modifications imparted on this glycotope may well be undetectable at the glycomic level and yet will have profound impact on the functions of its carriers. This calls into question whether current glycomics is of sufficient analytical depth to address the most relevant glycobiology issues. Indeed, there is a considerable gap between positive detection of a glycotope by monoclonal antibody or lectin and its identification by MS. Although the most notable pitfalls of probing by antibody are its poorly defined cross-reactivity and its failure to inform on the carrier glycans, that of MS is the need to further resolve the isomeric or isobaric constituents of a glycotope defined by a unique mass.
There will always be practical limitations in resolving each of the structural and stereoisomeric glycans chromatographically, particularly as the glycan size gets bigger along with increasing permutation of isomeric arrangements. MS2 and often MS3, if not higher orders, are needed to define linkage and substituent positions (4,5). In that respect, the most reliable MS-based glycan sequencing is based not on analyzing native but permethylated glycans (5,6). In fact, the advantages of permethylation extend beyond that. We have previously shown that by converting all hydroxyl groups, including the carboxylic groups of sialic acids, into O-methyl, permethylation allows simple enrichment and identification of a wealth of sulfated glycans occurring at low abundance that would otherwise not be detected (7,8). We further demonstrated that negative ion mode nanoLC-MS/MS can be productively applied in the acidic buffer commonly used in proteomics without much compromising the stability and detection sensitivity (9). Low mass MS2 diagnostic ions for locating the sulfate onto the common Galβ1-3/4GlcNAc unit, with and without additional sialylation, were identified, and we suggested that ambiguity in assignment can conceivably be addressed by an additional MS3 mode that will not suffer from low mass cutoff the way an ion trap based fragmentation would (9).
The advent of new generations of the Orbitrap series, with increasing data acquisition speed and flexibility in combining multiple stages of higher energy collision dissociation (HCD) with ion trap collision induced dissociation (CID), invites innovative multimode MS2/MS3 data acquisition and processing methods that would best address the most relevant glycomic features. An important concept to be advanced here is that any structural feature that can be defined by diagnostic ions at MS2 and/or MS3 level can thus be identified and relatively quantified based on its MS2/MS3 ion intensity and the frequency with which it was produced in an automated LC-MS2/MS3 analysis. We have experimented with a glycotope-centric glycomic workflow, which aims foremost to delineate the various isomeric glycotopes by well-established diagnostic ions. Using the same acidic solvent system for both positive and negative ion mode data acquisition allows for rapid switching within or in between runs to probe for occurrence of target (sulfo)-glycotopes and to derive quantification indices for rapid comparison across many biological sources. This allows us to ask how these sialylated, fucosylated and/or sulfated glycotopes were affected on pathophysiological activation, and/or genetically or chemically manipulated in vivo and in vitro, in a way never before possible. Aided by our own in-house developed data mining tool, tens of thousands of MS2/MS3 spectra can now be meaningfully interrogated by nonexperts and relevant glycotope information extracted in a fully automated fashion to make glycomics more commonplace.
Mouse Striatum Tissues-Male C57BL/6J mice (8-12 weeks old) were bred and maintained in the animal core of the Institute of Biomedical Sciences at Academia Sinica following the protocol approved by the Institutional Animal Care and Utilization Committee of Academia Sinica. Brain striatum tissues were carefully removed, minced into small pieces, transferred into a 2-ml grinder and homogenized on ice using a homogenization buffer (1 mM EGTA, 1 mM MgCl2, 10 nM okadaic acid, 100 μM phenylmethylsulfonyl fluoride, 40 μM leupeptin, 25 mM Tris HCl buffer, pH 8.0) containing 1× cOmplete™ Protease Inhibitor (Roche, Switzerland) and 1× phosStop (Roche, Switzerland). The homogenate was first centrifuged at 500 × g for 10 min at 4°C to remove debris. The supernatant was collected and centrifuged at 50,000 × g for 1 h at 4°C to collect the membrane fractions in the pellets. Pellets were suspended in an ice-cold lysis buffer (0.2 mM EGTA, 0.2 mM MgCl2, 30 nM okadaic acid, 40 μM phenylmethylsulfonyl fluoride, 0.1 mM leupeptin, 0.2 mM sodium orthovanadate, and 20 mM HEPES, pH 8.0 plus 1× cOmplete™ Protease Inhibitor and 1× phosStop) and stored at −80°C until used.
Glycan Release and Permethylation-Harvested 1 × 10^7 AGS or Colo205 cells, and the membrane fraction from one mouse striatum, were extracted with lysis buffer containing 1% Triton X-100 and centrifuged to collect the supernatant. The supernatants were subjected to reduction by 10 mM DTT at 37°C for 1 h, alkylation by 50 mM iodoacetamide at 37°C for 1 h in the dark, and then precipitation by TCA at a final concentration of 10%. The remaining detergents were further removed by cold acetone precipitation and the recovered proteins were digested overnight by trypsin (250 μg for cells or 50 μg for extracts from 1 mouse striatum) in 50 mM ammonium bicarbonate at 37°C, followed by the same amount of chymotrypsin in the same buffer at 37°C for 6 h. N-glycans were released by treatment with 3 U of PNGase F, and O-glycans by alkaline β-elimination from the de-N-glycosylated glycopeptides, as described (7). Both the released N- and O-glycans were permethylated by the sodium hydroxide/DMSO slurry method (8) at 4°C for 3 h and separated into nonsulfated, monosulfated, and multiply sulfated glycans by loading the neutralized reaction mixtures onto a primed Oasis® MAX solid phase extraction (SPE) cartridge (Waters, Milford, MA) (10) and eluting with 95% acetonitrile, 1 mM ammonium acetate in 80% acetonitrile, and 100 mM ammonium acetate in 60% acetonitrile/20% methanol, respectively. Before MS analysis, aliquots from all fractions were additionally cleaned up by applying to ZipTip C18 in 0.1% formic acid and eluting with 75% acetonitrile/0.1% formic acid. The permethylated glycan standards and samples were dissolved in 10 μl of 10% acetonitrile in 0.1% formic acid.
Tribrid™ Mass Spectrometer (ThermoFisher Scientific) via a PicoView nanosprayer (New Objective, Woburn, MA) for nanoLC separation at 50°C using a 25 cm × 75 μm C18 column (Acclaim PepMap® RSLC, ThermoFisher Scientific) at a constant flow rate of 500 nL/min. The solvent system comprised 100% H2O with 0.1% formic acid (FA) for mobile phase A, and 100% ACN with 0.1% FA for mobile phase B. A 60 min linear gradient of 25 to 60% B for O-glycans, and 30 to 80% B for N-glycans and mono-sulfated O-glycans, was used for eluting the permethylated glycans. Another EASY-nLC™ 1200 system was interfaced to an Orbitrap Fusion™ Lumos™ Tribrid™ Mass Spectrometer (ThermoFisher Scientific) via a Nanospray Flex™ Ion Source (ThermoFisher Scientific) for nanoLC separation at 50°C using the same 25 cm × 75 μm C18 column (Acclaim PepMap® RSLC, ThermoFisher Scientific) at a constant flow rate of 300 nL/min. The solvent system comprised 100% H2O with 0.1% FA for mobile phase A and 80% ACN with 0.1% FA for mobile phase B. A linear gradient of 40 to 95% B in 70 min was used for analysis of the permethylated mono-sulfated O-glycans.
The Orbitrap Fusion™ Tribrid™ mass spectrometer was operated in positive mode for analyses of nonsulfated N-glycans (m/z mass range 800-2000, charge state 2 to 4) and O-glycans (m/z mass range 500-1700, charge state 1 to 3). Top speed mode was used for data dependent acquisition at 3 s duty cycle. Full-scan MS spectrum was acquired in the Orbitrap at 120,000 resolution with automatic gain control (AGC) target value of 4 × 10^5, followed by quadrupole isolation of precursors at 2 Th width for higher energy collisional dissociation (HCD)-MS2 at 15% normalized collision energy (NCE) ± 5% stepped collision, with 10 s dynamic exclusion applied. HCD MS2 fragment ions were detected in the Orbitrap analyzer at 30,000 resolution at an AGC target value of 5 × 10^4. Target HCD-MS2 ions detected at high resolution and accurate mass (HR/AM) within 10 ppm were selected automatically for product-dependent MS3 (pd-MS3) acquisition in the ion trap using CID at an AGC target value of 1 × 10^4 and 30% NCE, followed by ion trap detection. The MS system was operated in negative ion mode for analyses of mono-sulfated O-glycans (m/z mass range 700-2000, charge state 1). The same Orbitrap resolution and AGC target value were applied for MS1 but a 5 s top speed mode was used instead for parallel HCD/CID MS2 data dependent acquisition (9). The AGC target value and NCE for CID MS2 in the ion trap were set at 1 × 10^4 and 40%, respectively. For HCD MS2 to be detected in the Orbitrap analyzer at 30,000 resolution, the AGC target value was set at 5 × 10^4 and NCE at 50% with ± 10% stepped collision energy. For additional pd-HCD MS3 data acquisition, the AGC target value and NCE were set at 1 × 10^4 and 45%, respectively. All data were processed by in-house developed software GlyPick v1.0 and Xcalibur software v2.2.
GlyPick Software Development-GlyPick was developed using Microsoft Visual Studio 2013 professional edition in conjunction with Thermo Raw API (Version 3.0 sp3) for data extraction from Thermo raw files. Its parameter settings can be fine-tuned by user inputs via a graphic user interface (GUI) according to the specific data set to be analyzed, and are output along with the result files. GlyPick starts by picking out only those MS2 spectra containing at least 1 or 2 user-specified glycan fragment MS2 ions and then tracks each of the selected MS2 scans back to its preceding MS1 scan and associated pd-MS3 scans. The m/z value of the monoisotopic precursor is then determined from the accurately mass measured MS1 scan and listed alongside that of the triggered precursor, its peak intensity, elution time and scan numbers. The generated output file in Excel format also tabulates as many of the user-defined MS2 ions as were detected above the specified intensity threshold and within the mass accuracy tolerance, along with the signal intensity of each. For pd-MS3 scans, predefined diagnostic MS3 ions, if detected, are used to identify specific glycotopes, in addition to a listing of all detected MS3 ions and their signal intensities. The total number of spectra in which each of the specified MS2 and MS3 ions diagnostic of particular glycosylation features or glycotopes was detected is counted and their respective intensities summed for the purpose of relative quantification. Because multiple pd-MS3 events (normally ≥ 3) were specified, as many MS2 events were actually triggered for the same MS1 precursor to isolate the respective MS2 ion for each pd-MS3 without acquiring the additional MS2 data. For quantification based on spectral counting and summing of ion intensity, the MS1 and MS2 scans were each duplicated for every additional pd-MS3 scan to compensate for the time spent on the same precursor. This extrapolation may end up giving extra quantification weighting to MS2 scans containing more MS2 ions targeted for pd-MS3, but it was deemed justifiable and to give a better approximation for the glycotope-centric analysis.
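The filtering, counting and weighting steps described above can be illustrated with a minimal Python sketch. It is not GlyPick's actual code (which is built on the Thermo Raw API); the `Ms2Scan` container, the function names, the example target m/z values and the reading of the duplication rule as a weight of `max(1, number of pd-MS3 scans)` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ms2Scan:
    """Hypothetical container for one high-resolution MS2 scan."""
    scan_number: int
    peaks: list          # [(m/z, intensity), ...]
    n_pd_ms3: int = 0    # number of pd-MS3 scans triggered from this MS2 scan


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_ions(scan, targets, tol_ppm=5.0, min_intensity=0.0):
    """Return {label: intensity} for every target ion found in the scan."""
    hits = {}
    for label, target_mz in targets.items():
        matched = [inten for mz, inten in scan.peaks
                   if inten > min_intensity
                   and abs(ppm_error(mz, target_mz)) <= tol_ppm]
        if matched:
            hits[label] = max(matched)
    return hits


def tally(scans, targets, min_hits=1, tol_ppm=5.0, min_intensity=0.0):
    """Spectral count and summed intensity per diagnostic MS2 ion."""
    counts = {label: 0 for label in targets}
    sums = {label: 0.0 for label in targets}
    for scan in scans:
        hits = match_ions(scan, targets, tol_ppm, min_intensity)
        if len(hits) < min_hits:
            continue  # not accepted as a glycan MS2 spectrum
        # Assumed reading of the duplication rule: an MS2 scan that
        # triggered n pd-MS3 scans is counted n times (at least once).
        weight = max(1, scan.n_pd_ms3)
        for label, inten in hits.items():
            counts[label] += weight
            sums[label] += weight * inten
    return counts, sums


# Illustrative target values only (not taken from the GlyPick settings).
targets = {"Hex1HexNAc": 464.2490, "Fuc1Hex1HexNAc": 638.3382}
scans = [
    Ms2Scan(1, [(464.2491, 100.0), (638.3384, 50.0)], n_pd_ms3=3),
    Ms2Scan(2, [(464.2600, 80.0)]),   # more than 5 ppm off, rejected
    Ms2Scan(3, [(464.2489, 10.0)]),
]
counts, sums = tally(scans, targets)
print(counts, sums)
```

Taking the most intense peak inside the tolerance window is one possible choice; summing all peaks in the window would be an equally defensible alternative.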
GlyPick can also mass fit the m/z values of identified monoisotopic precursors to glycosyl compositions, with or without considering additional user-defined adducts, modifications, maximum and minimum number of allowed glycosyl residues, and simple rules of permissible combination derived from biosynthetic constraints. Either a single round or iterative rounds of fitting can be executed. The latter involves initial fitting without considering the extra permutations generated by cation adducts and under-methylation. MS2 scans thus fitted are then removed and the remaining ones fitted again by considering the extras. All monoisotopic precursors thus assigned to the same glycosyl composition can be optionally grouped and their inferred MS1 peak intensities summed to provide a quantitative measure of relative abundance.
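A brute-force version of the two-round composition fitting might look like the following sketch. The residue masses are standard monoisotopic increments for permethylated residues computed from their elemental compositions, and `TERMINI` assumes a non-reduced permethylated glycan; none of these constants, names or limits come from GlyPick itself, and the `rule` hook merely stands in for the biosynthetic constraints mentioned above.

```python
from itertools import product

# Assumed monoisotopic increments (Da) for permethylated residues.
RESIDUE = {"Hex": 204.09978, "HexNAc": 245.12632,
           "Fuc": 174.08921, "NeuAc": 361.17367}
TERMINI = 46.04187   # CH3 + OCH3 ends of a non-reduced permethylated glycan
PROTON = 1.00728
# "Extras" considered only in the second round (mass shifts in Da).
EXTRAS = {"Na_for_H": 21.98194, "undermethylated": -14.01565}


def neutral_mass(mz, z):
    """Neutral mass assuming all z charges are carried by protons."""
    return (mz - PROTON) * z


def fit(mass, max_counts, tol_ppm=5.0, extra=0.0, rule=lambda comp: True):
    """Enumerate compositions whose theoretical mass matches within tol_ppm."""
    names = list(RESIDUE)
    hits = []
    for counts in product(*(range(max_counts[n] + 1) for n in names)):
        comp = dict(zip(names, counts))
        if sum(counts) == 0 or not rule(comp):
            continue
        theo = TERMINI + extra + sum(RESIDUE[n] * c for n, c in comp.items())
        if abs(mass - theo) / theo * 1e6 <= tol_ppm:
            hits.append(comp)
    return hits


def iterative_fit(precursors, max_counts, tol_ppm=5.0):
    """Round 1 without extras; round 2 retries the leftovers with extras."""
    assigned, remaining = {}, []
    for mz, z in precursors:
        hits = fit(neutral_mass(mz, z), max_counts, tol_ppm)
        if hits:
            assigned[(mz, z)] = ("none", hits)
        else:
            remaining.append((mz, z))
    for mz, z in remaining:
        for label, delta in EXTRAS.items():
            hits = fit(neutral_mass(mz, z), max_counts, tol_ppm, extra=delta)
            if hits:
                assigned[(mz, z)] = (label, hits)
                break
    return assigned


# Usage: protonated and sodiated Hex5HexNAc2 as two hypothetical precursors.
limits = {"Hex": 9, "HexNAc": 6, "Fuc": 2, "NeuAc": 2}
m = TERMINI + 5 * RESIDUE["Hex"] + 2 * RESIDUE["HexNAc"]
p1 = (m + PROTON, 1)
p2 = (m + PROTON + EXTRAS["Na_for_H"], 1)
out = iterative_fit([p1, p2], limits)
print(out[p1], out[p2])
```

Deferring the adduct and under-methylation permutations to a second round keeps the first-round search space small and reduces the chance that a fully methylated, protonated glycan is mis-assigned to an "extra" variant of a different composition.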
GlyPick version 1.0 containing all features described in this work is available on request for testing while new features are being added and performance continuously optimized.
Identification of Glycotopes and Reported Glycan Structures-All glycotopes reported were based on detecting their respective MS2 and MS3 ions, as extracted from the MS/MS data set by GlyPick. The critical glycosyl linkages for discriminating the various fucosylated glycotopes and the location of sulfates can thus be unambiguously defined, but the Gal/GlcNAc identification was inferred directly from Hex/HexNAc without further verification. The anomeric configuration was likewise based on presumptive glycobiology knowledge and not further defined. MS1 precursors fitted with glycosyl compositions by GlyPick were not meant to imply glycan structural identification. Only select glycans with their MS/MS spectra manually interpreted and shown annotated with cartoon drawings are considered as structurally assigned and singled out for discussion. Their monosaccharide stereochemistry and anomeric configurations were similarly not further verified.

RESULTS
Most of the current offline nanospray MS/MS or online nanoLC-MS/MS analyses of permethylated glycans were conducted in positive ion mode doped with sodium to promote sodiated molecular ions (11-13), which would afford more complete sets of fragment ions. In contrast, under acidic conditions, the protonated molecular ions would mostly yield highly abundant nonreducing terminal oxonium ions via cleavage at HexNAc, concomitant with characteristic elimination of substituents at its C3 position (14,15). The advantages offered are twofold: (1) full compatibility with the conventional proteomic data acquisition workflow to allow convenient scheduling, because there is no need to change solvent and nanoLC set-up, and no unwarranted introduction of sodium into a sophisticated high-end MS instrument system; (2) abundant MS2 product ions to be selected for MS3, with reliable, characteristic MS3 ions informative of the linkage. The programmed data acquisition mode is called product-dependent (pd)-MS3, which would automatically isolate any of the targeted MS2 ions for a further stage of HCD/CID fragmentation. As demonstrated against a panel of authentic standards (supplemental Fig. S1), each of the fucosylated glycotopes can thus be unambiguously defined based on which glycosyl residue is eliminated from the C3-position of the GlcNAc+.
Characteristic Features of RP nanoLC-MS/MS of Permethylated Glycans-For the smaller O-glycans, the RP C18 nanoLC-separation afforded at elevated temperature is reasonably satisfactory. To take a simple case example, the O-glycan with a Fuc1Hex2HexNAc2-itol composition from AGS cells was found to comprise both simple core 2 and extended core 1 structures with Fuc on either Gal or GlcNAc giving rise to H or LeX, respectively. These were resolved into at least 4 distinct major peaks, with structures carrying H eluting earlier than LeX, and the branched core 2 earlier than extended core 1 (Fig. 1). Additional structural isomers carrying LeA would elute slightly later than the LeX counterparts, as demonstrated with a similarly prepared and analyzed O-glycan sample from Colo205 cells. This is further corroborated by the elution order of type 1 versus type 2 chain carried on a nonfucosylated extended core 1 structure. A label free quantification based on extracted ion chromatogram (XIC) of the resolved peaks for these smaller O-glycans including Tn, sialyl Tn and T is thus possible (supplemental Fig. S2). However, as the size increases, so does the number of possible structural isomers, concomitant with the increase in the number of nonfully resolved peaks for each XIC. A similar pattern also extends to sulfated O-glycans analyzed in negative ion mode, as described previously (supplemental Fig. S2, S3) (9). For the N-glycans, each of the Man5-9GlcNAc2 can be clearly separated but not the individual isomers, whereas multiantennary complex type structures with various degrees of fucosylation and sialylation were not efficiently resolved (Fig. 2) (16). Nevertheless, the terminal fucosylated epitopes as detected by MS2 could be similarly targeted for MS3 to define the occurrence of Lewis versus H glycotopes.
Peaks are labeled #1-9, the MS2 and target MS3 (small inset) spectra of which are shown in 9 small panels, with the diagnostic oxonium ions annotated in red. The Z1 ion for the reducing end GalNAcitol (annotated in blue) could be readily detected at m/z 294 or, after further loss of 6-arm substituents, at m/z 280, which is informative of core 1 versus core 2 structures. A Z2 ion at m/z 498 is also commonly observed for all core structures that are extended at either the 6 or 3-arm but not both. Extended core 1 structure further afforded a B ion at the Gal of the Gal-GalNAcitol, e.g. m/z 668 and 842, which is very useful to define the entire moiety extending from the 3-arm. Under low collision energy, protonated permethylated glycans do not normally yield cross-ring cleavage ions.
Typically with our Orbitrap Fusion Tribrid instrument settings and acquisition parameters, a duty cycle of 3 s (Top Speed mode) applied to a sample of relatively high complexity could fit in an average total of 25 MS2+MS3 scans for a positive mode nanoLC-HCD-MS2-pd-CID-MS3 analysis, with the MS1 and HCD-MS2 scans being mass measured in the Orbitrap at 120 K and 30 K resolution, respectively. This amounts to an average total of > 20,000 MS2/MS3 spectra acquired within the effective elution range of glycans. In negative ion mode for the sulfated glycans, we opted instead for nanoLC-parallel CID/HCD-MS2-pd-HCD-MS3 at a duty cycle of 5 s. Both CID and HCD MS2 were similarly mass detected in the Orbitrap for high resolution and mass accuracy but the HCD-MS3 spectra were acquired in the ion trap after fragmentation in the HCD cell to increase sensitivity. A typical run also produced an average total of MS2+MS3 scans similar to that in positive ion mode for nonsulfated glycans. As with all proteomics and glycoproteomics data, the sheer number of spectra defies laborious manual interrogation and expert interpretation. To facilitate the data mining and analysis process, we have developed a computational tool named GlyPick, which aims to automate the data analysis process for nonexperts.
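As a rough consistency check on the scan numbers quoted above, the scan budget can be worked out from the duty cycle. The short Python sketch below uses the 3 s cycle, the 25 MS2+MS3 scans per cycle and the 60 min gradient given in the text; treating the "effective elution range" as a contiguous time window is an assumption made here for illustration.

```python
def scans_in_window(window_min, cycle_s, scans_per_cycle):
    """Number of MS2+MS3 scans acquired over a window of given length."""
    return window_min * 60 / cycle_s * scans_per_cycle


def window_for_scans(n_scans, cycle_s, scans_per_cycle):
    """Minutes of acquisition needed to accumulate n_scans."""
    return n_scans / scans_per_cycle * cycle_s / 60


full_gradient = scans_in_window(60, 3, 25)      # upper bound over 60 min
effective = window_for_scans(20000, 3, 25)      # minutes to reach 20,000
print(full_gradient, effective)
```

Under these assumptions, 20,000 spectra correspond to about 40 min of fully productive cycles out of a 60 min gradient that could accommodate at most 30,000.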
Identification and Relative Quantification of Target Glycotopes by GlyPick-GlyPick implements different levels of data mining starting from MS2 data. At the first level, the raw data are processed and nonglycan and/or poor quality MS2 data are filtered away. This step relies on high resolution/accurate mass MS2 data and the choice of diagnostic glycan ions that can be input by the user (supplemental Fig. S4 for GUI), with options available for specifying at least how many of these common glycan MS2 ions must be detected at 5 ppm or less above a predefined intensity threshold for a spectrum to qualify as a bona fide glycan MS2 spectrum. The advantage of analyzing protonated molecular ions is obvious here because the oxonium ions corresponding to terminal glycotopes including simple NeuAc+, HexNAc+ and Hex-HexNAc+ are usually abundant. By tracking the preceding MS1 survey scan, GlyPick will attempt to determine the correct monoisotopic precursor and all relevant information will be collated, which includes (1) the scan number (and elution time) for MS1, MS2 and associated MS3 if triggered; (2) the peak intensity for the precursor and each of the characteristic MS2 and MS3 ions; (3) the m/z for the inferred monoisotopic peak versus the experimentally triggered precursor, and the value of z.
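The accurate m/z of such oxonium ions can be computed from their glycosyl composition, which is how a list of diagnostic ions for the filtering step could be prepared. The Python sketch below is illustrative only: the residue masses are standard monoisotopic values for permethylated residues derived from elemental compositions, not values taken from GlyPick, and the function name is hypothetical. With these values, the truncated (not rounded) results reproduce the nominal m/z quoted in this article, for example 638 and 812 for the mono- and difucosylated Hex-HexNAc+ ions.

```python
import math

# Assumed monoisotopic increments (Da) for permethylated residues.
RESIDUE = {"Hex": 204.09978, "HexNAc": 245.12632,
           "Fuc": 174.08921, "NeuAc": 361.17367}
CH3 = 15.02348       # methyl capping the nonreducing end
CH2 = 14.01565       # mass lost per free (unmethylated) hydroxyl
ELECTRON = 0.00055   # removed to form the cation


def oxonium_mz(composition, free_hydroxyls=0):
    """Singly charged m/z of a nonreducing terminal oxonium (B-type) ion."""
    residues = sum(RESIDUE[name] * n for name, n in composition.items())
    return residues + CH3 - ELECTRON - free_hydroxyls * CH2


examples = {
    "NeuAc+": {"NeuAc": 1},
    "Hex1HexNAc1+": {"Hex": 1, "HexNAc": 1},
    "Fuc1Hex1HexNAc1+": {"Fuc": 1, "Hex": 1, "HexNAc": 1},
    "Fuc2Hex1HexNAc1+": {"Fuc": 2, "Hex": 1, "HexNAc": 1},
    "NeuAc2Hex1HexNAc1+": {"NeuAc": 2, "Hex": 1, "HexNAc": 1},
}
for label, comp in examples.items():
    mz = oxonium_mz(comp)
    print(f"{label}: {mz:.4f} (nominal {math.floor(mz)})")
```

The `free_hydroxyls` argument covers ions such as (HO)1Hex1HexNAc+, in which one position carries a free hydroxyl instead of an O-methyl group.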
FIG. 2. RP C18 nanoLC separation of permethylated high mannose type and biantennary complex type N-glycans. Overlay plots of extracted ion chromatograms for the high mannose structures indicated that the Man5-9GlcNAc2itol N-glycans were well resolved from one another but not for their individual isomeric constituents. Larger complex type N-glycans, with and without sulfates, produced more complicated and not well resolved ion chromatograms. The unambiguous identification of individual chromatographic peaks based on MS2 of the protonated molecular ions alone is not feasible, but glycotopes carried on each can still be efficiently determined by MS2-pd-MS3 analysis.
As first proposed and implemented in proteomics, the spectral count or its equivalents can be taken as a rough indication of abundance (17,18). In our case, the number of times a unique glycan precursor defined by its mass is selected for MS2 can conceivably be used to indicate its relative abundance under nonsaturating conditions. However, this level of counting is less useful for glycomics because each precursor often comprises many different isomers and many different glycans share the same glycotopes. To map the glycomic occurrence and relative abundance of a glycotope, we have resorted instead to counting the number of glycan MS2 spectra in which a diagnostic MS2 ion is detected and summing its peak intensity. For isomeric glycotopes that can only be resolved at MS3 level, we further sum the intensity of each diagnostic MS3 ion detected in all productive MS3 spectra for a given target glycotope. This novel idea of identifying and relatively quantifying target glycotopes at the glycomic level was first tested out using the nonsulfated AGS O-glycan sample as a case study (Fig. 3). To ascertain reproducibility from essentially a random data dependent acquisition (DDA) sampling event, we varied the concentration of injected analytes across a 10× dilution range (supplemental Table S1) and additionally performed technical triplicate analyses for the 4-fold diluted O-glycan sample (supplemental Table S2).
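Once the diagnostic ion intensities have been summed, expressing each glycotope as a percentage of the total makes runs at different loading amounts directly comparable. The following Python sketch shows that normalization; the glycotope labels and intensity values are invented for illustration and are not data from this study.

```python
def percent_of_total(summed):
    """Convert summed diagnostic ion intensities into % of their total."""
    total = sum(summed.values())
    if total <= 0:
        return {label: 0.0 for label in summed}
    return {label: 100.0 * value / total for label, value in summed.items()}


# Hypothetical summed MS3 intensities for isomers sharing one MS2 ion,
# from an undiluted injection and a 10-fold dilution of the same sample.
undiluted = {"LeX": 600.0, "H2": 300.0, "LeA": 100.0}
diluted = {"LeX": 60.0, "H2": 30.0, "LeA": 10.0}

print(percent_of_total(undiluted))
print(percent_of_total(diluted))
```

In this idealized case both injections give the same percentages, which is the behavior a reproducible method should approach across a dilution series despite stochastic DDA sampling.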
As expected, a better linearity over the 10× concentration range was observed when the ion intensity and not just the spectral count for all the selected MS2 ions was totaled (Fig 3A). This indicated that for normalization purposes, the sum of all selected MS2 ions, either by count or intensity, is a better indicator of the total glycan amount injected than the total nonredundant MS2 spectral count itself, within the loading capacity defined by the C18 Zip-Tip used for pre-LC-MS/MS clean-up. For the nondiluted sample, a total of over 13,000 MS2 scans containing at least 2 glycan-specific MS2 ion de- [...] (HO)1Hex1HexNAc+, Hex1HexNAc+, and Fuc1Hex1HexNAc+, respectively, were each detected in > 10,000 MS2 spectra, whereas a few ions such as m/z 505, 679, and 999, representing HexNAc2+, Fuc1HexNAc2+, and NeuAc1Fuc1Hex1HexNAc+, respectively, were each detected in fewer than 600 spectra (supplemental Table S1.2), or less than 0.2% of the total after being given a summed ion intensity weighting (supplemental Table S1.3). This MS2 spectral counting and/or ion intensity summation for the selected MS2 ions allows a quick assessment of the kind of terminal glycotopes or glycosylation features that characterize a glycome and their relative abundance, with good reproducibility demonstrated across the 10× concentration range analyzed (Fig 3B-3D, supplemental Fig. S5A). To further distinguish the isomeric fucosylated glycotopes identified by a common MS2 ion requires detecting their respective diagnostic MS3 ions in as many productive pd-MS3 scans as were afforded (supplemental Figs. S5B, S5C). A comparable linear response for their summed MS3 ion intensities over the 10× concentration range was similarly observed, allowing consistent estimation of their relative abundance (supplemental Table S1.4, supplemental Fig. S5D). Additional triplicate analyses on the AGS O- and N-glycans (supplemental Table S2) indicated a good reproducibility for the identified glycotopes and their relative abundance (Fig 3E). The N-glycans expressed relatively more terminal HexNAc and less sialyl LacNAc than the O-glycans. Both carried comparable amounts of mono- and di-fucosylated glycotopes predominantly on type 2 chain, resulting in mostly LeX, H2, and LeY.
The relative amount of these constituent glycotopes expressed as % total in the 100% stacked column chart (right panel) stays remarkably consistent across the 10× concentration range despite the inherently stochastic nature of MS2 data acquisition. E, Good reproducibility was also obtained on triplicate profiling of the terminal glycotopes based on summing the total ion intensities of each of the selected MS2 ions. For the mono- and difucosylated Gal-GlcNAc glycotopes represented by MS2 ions at m/z 638 and 812 respectively, analogous summing of the diagnostic MS3 ion intensities (see supplemental Fig. S5) allows a reproducible means to delineate their isomeric constituents for comparison across different samples. The full experimental data set used to generate the various charts here and in supplemental Fig. S5 can be found in supplemental Tables S1 and S2.
MS2 Data Mining for Sulfated Glycotopes by GlyPick-Applying the same concept described above for mining the data sets of nonsulfated glycans, GlyPick can similarly pick out true sulfated glycan MS2 spectra acquired in negative ion mode based on their containing at least one of the diagnostic MS2 ions for known sulfated glycotopes (9), and quantify their relative abundance by summed ion intensities. The MS2 data sets of 2 independently prepared AGS sulfated O-glycan samples acquired using 2 different nanoLC conditions on 2 different Orbitrap Fusion systems revealed a minimal amount of m/z 167 and 195, which account for terminal Gal-6-O-sulfate and internal GlcNAc-6-O-sulfate, in both samples. Their sulfated glycotopes were dominated instead by Gal-3-O-sulfate (Gal3S) defined by m/z 153 and 181 (Fig 4C). Interestingly, a slight increase in the relative abundance of Gal3S was detected in the more recently prepared sample analyzed on the Orbitrap Fusion Lumos, accompanied by a more conspicuous increase in m/z 253 and 283, which define a terminal sulfated Gal, concomitant with a decrease in m/z 371 and 528. This suggests that the Gal3S carried on a LacNAc was reduced relative to Gal3S on its own, which is in accord with the significant increase in the 2 peaks eluting at 39.68 (m/z 1199) and 39.82 (m/z 1373) min (supplemental Fig. S2 left panel, Fig 4A), identified as carrying a Gal3S on the 3-arm coupled with either a LeX or LeY on the 6-arm (supplemental Fig. S3, Fig. 4B).
By fitting the inferred monoisotopic precursor mass of each MS2 scan to glycosyl composition within the user-defined constraints and outputting the resulting entries in csv format that can be further filtered, edited, color coded and sorted at will using the built-in Excel functions (see supplemental Table S3), GlyPick greatly facilitates manual verification of the nanoLC-MS2-pd-MS3 data set. It enables users to home in directly on relevant MS2 scans of the more abundant glycans, or those carrying the targeted glycotopes, for structural assignment. As shown here (Fig 4B) for the 4 isomeric peaks of m/z 1199 (Fuc1Hex2HexNAc2S), only the second and third isomers in elution order produced the diagnostic MS2 ion at m/z 702 for a sulfated fucosylated LacNAc carried on the 6-arm of a core 2 structure and the 3-arm of an extended core 1 structure, respectively. Because the sulfate is determined to be on terminal Gal3S, and m/z 371 defined a type 2 chain, the sulfated fucosylated LacNAc can thus be assigned as sulfo LeX. The first and fourth isomers, on the other hand, carried a sulfo LacNAc instead, as revealed by the presence of m/z 528. The 4th peak comprised 2 isomers, the major one being a core 2 structure with a Gal3S on the 3-arm coupled with a LeX on the 6-arm, whereas the minor one appears to be a di-LacNAc fragment derived from larger glycans that was produced during sample preparation and carried a sulfo LacNAc at its nonreducing end. Going through the GlyPick-collated data set by MS1 peak intensity, the only unanticipated major component revealed is the one at m/z 1240, which could be manually assigned by referring to its MS2 scans as a core 2 structure carrying a rare fucosylated HexNAc2 glycotope on the 6-arm with a Gal3S on the 3-arm (Fig 4B). Notably, all O-glycans with a single Gal3S on the 3-arm were found to produce a diagnostic ion at m/z 150.97 that is more abundant than m/z 152.99.
Identifying Rare Disialylated LacNAc Glycotopes Facilitated by GlyPick - Although GlyPick allows retrospective, post-acquisition querying of any glycotope or structural feature defined by a diagnostic MS2 ion, it can also be used in a more targeted manner in conjunction with the MS2-pd-MS3 scan function. This is demonstrated here within an integrated glycomic workflow aimed at identifying the full ensemble of glycotopes carried on the N-glycans of mouse brain striatum, including the possible occurrence of a terminal disialyl capping unit implicated by the striatal-enriched expression of the α2-8-sialyltransferase, ST8Sia3 (19-21). Our first step involved a rapid MALDI-MS screening of the permethylated N-glycans, which informs on the quality, quantity and complexity of the N-glycomic sample to guide subsequent nanoLC-MS/MS analysis. As expected, the brain N-glycans are dominated by complex type structures that can be tentatively assigned as bi- to tetra-antennary, core fucosylated, with nonextended GlcNAc, or fucosylated and/or sialylated LacNAc termini (Fig. 5A). Although a few major peaks can be directly subjected to MALDI MS/MS to verify the structural assignment, it is not practically feasible to attempt a comprehensive MS2 analysis, particularly for the very low abundance peaks above m/z 4500, which are the most likely candidates to carry disialylated antennae. Instead, a single nanoLC-MS2-pd-MS3 run followed by data mining using GlyPick would more efficiently accomplish the glycotope mapping at much higher sensitivity. It revealed that the mouse brain striatum is mostly decorated with terminal GlcNAc, sialyl LacNAc and LeX (Fig. 5B).
More importantly, we could pick up low levels of diagnostic ions for disialylated glycotopes based on m/z 737 (NeuAc2+) and 1186 (NeuAc2-Hex1HexNAc+), whose confident identification is secured by the programmed, targeted pd-MS3 (Fig. 5C). In the case of m/z 737, MS3 should produce a terminal NeuAc+ at m/z 376 to confirm that it is indeed a NeuAc-NeuAc unit. Because m/z 737 was selected in the first place at the MS2 level by accurate mass, failure to yield a productive MS3 does not necessarily discredit its identification, but a positive MS3 would additionally imply that it was of significant ion intensity. In the case of m/z 1186, we identified not only, for the first time by glycomic analysis, a NeuAc-NeuAc-capped LacNAc based on the MS3 ion at m/z 737, but also a small percentage of a NeuAc-Hex-3(NeuAc)HexNAc disialylated glycotope based on m/z 589. This MS3 ion can only be derived from NeuAc-6HexNAc+ after elimination of the NeuAc-Hex unit attached to the C3 position of the HexNAc, consistent with a type 1 chain, Gal-3GlcNAc, sialylated at both the terminal Gal and the internal GlcNAc. The MS3 data also confirmed that, in the case of fucosylated Hex-HexNAc, the majority was based on a type 2 chain comprising both H2 and LeX, with only a very small percentage contributed by LeA. Other glycotopes defined by specific MS 2 supplemental Table S3.
FIG. 5. Glycotope-centric glycomic analysis of N-glycans from mouse brain striatum. A, MALDI-MS mapping of the permethylated N-glycans, with major peaks annotated using the standard symbol nomenclature system ((35) and http://www.ncbi.nlm.nih.gov/books/NBK310273/), based primarily on glycosyl composition. Several isomeric permutations for the actual glycotopes carried on each structure are possible and indeed verified by subsequent LC-MS2/MS3 analyses.
Data analysis by GlyPick allows rapid identification and assessment of the relative abundance of expressed glycotopes at the glycomic level based on the summed intensity of diagnostic MS2 ions (B), and of MS3 ions in the case of isomeric fucosylated and disialylated Hex1HexNAc1 (C). N-glycans carrying the disialylated glycotopes can be inferred from the compiled data (supplemental Table S4) and verified by manual interpretation of their respective MS2 scans averaged over their elution time, exemplified here for a disialylated mono-antennary N-glycan (D), and 2 tetra-sialylated tetra-antennary N-glycans carrying either one (E) or two (F) fucoses. For the latter two, more than one isomeric structure is clearly present based on the complement of oxonium ions detected. The MS2 data will not inform which specific glycotopes are carried on which of the 4 antennae. All MS2 ions are singly charged except those annotated with 2+ or 3+.
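The diagnostic ions and MS3 confirmation logic described above for the disialylated glycotopes can be checked with a short Python sketch. This is an illustration under stated assumptions, not the authors' code: the residue masses are computed here from the elemental formulae of permethylated residues, the oxonium ion is taken as the residue sum plus a methyl cation, the C3 substituent is assumed to leave as a neutral methyl-capped alcohol, nominal m/z values are obtained by truncation, and all function names are hypothetical.

```python
# Monoisotopic masses (Da) of permethylated residues, from elemental formulae.
RESIDUE = {
    "Fuc": 174.08921,     # C8H14O4
    "Hex": 204.09977,     # C9H16O5
    "HexNAc": 245.12632,  # C11H19NO5
    "NeuAc": 361.17367,   # C16H27NO8
}
METHYL_CATION = 15.02293  # CH3 minus one electron
METHANOL = 32.02621       # CH3OH


def residue_sum(composition):
    return sum(RESIDUE[name] * n for name, n in composition.items())


def oxonium_mz(composition):
    """Singly charged, permethylated nonreducing-end oxonium ion (assumed model)."""
    return METHYL_CATION + residue_sum(composition)


def after_c3_elimination(parent_mz, substituent):
    """m/z after the C3 substituent leaves as a neutral methyl-capped alcohol."""
    return parent_mz - residue_sum(substituent) - METHANOL


# MS3 product ions and their interpretation, as described in the text.
MS3_RULES = {
    737: {376: "NeuAc-NeuAc unit"},
    1186: {
        737: "NeuAc-NeuAc-capped LacNAc",
        589: "NeuAc-Hex-3(NeuAc)HexNAc (type 1 chain)",
    },
}


def interpret_ms3(ms2_nominal, ms3_peaks):
    """Return the glycotope calls supported by the MS3 peaks (m/z, intensity)."""
    observed = {int(mz) for mz, _ in ms3_peaks}
    rules = MS3_RULES.get(ms2_nominal, {})
    return [label for nominal, label in rules.items() if nominal in observed]


if __name__ == "__main__":
    disialyl_lacnac = oxonium_mz({"NeuAc": 2, "Hex": 1, "HexNAc": 1})
    print(f"NeuAc+              {oxonium_mz({'NeuAc': 1}):.3f}")
    print(f"NeuAc2+             {oxonium_mz({'NeuAc': 2}):.3f}")
    print(f"NeuAc2-Hex-HexNAc+  {disialyl_lacnac:.3f}")
    eliminated = after_c3_elimination(disialyl_lacnac, {"NeuAc": 1, "Hex": 1})
    print(f"after C3 elimination {eliminated:.3f}")
```

An empty list from `interpret_ms3` only means that no confirming MS3 ion was found; as noted in the text, this does not by itself discredit an MS2 ion that was selected by accurate mass.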
From Glycotopes to Comprehensive Listing of Glycans with Supportive MS2/MS3 - The next level of data analysis is to locate the identified glycotopes on specific N-glycan structures and, ideally, to provide supporting MS2 data for each of the major MS1 peaks detected by MALDI-MS and/or nanoLC-MS/MS. This requires tracking each of the filtered glycan MS2 scans back to its preceding MS1 scan, and mass fitting the inferred monoisotopic precursor to a glycosyl composition. In the first iteration, only fully permethylated and protonated precursors were considered, to restrict the mathematical permutations. In general, many other precursors carrying NH4+, Na+ and/or K+ were also detected at lower abundance and triggered for MS2, along with those carrying 1-2 degrees of under-methylation, and those corresponding to nonreduced fragments (derived from sample processing) or fragment ions (derived from in-source fragmentation). Therefore, in addition to those MS2 scans that fit exact glycosyl compositions within the applied constraints, there are many more that do not but still contain the targeted diagnostic ions. A significant proportion of these arise from misidentified monoisotopic peaks, which is a genuine problem intrinsic to analysis of this nature. As the N-glycan size increases, so does the number of possible permutations, particularly for samples carrying multiple Fuc and NeuAc. We noted that the glycosyl composition Fuc2HexNAc3 differs from NeuAc3 by a mere 0.036 Da, NeuAc2Fuc1 differs from Hex2HexNAc2 by 2.016 Da, and Fuc3HexNAc1 differs from NeuAc1Hex2 by 1.979 Da. The 2 Da differences complicate the process of picking the correct monoisotopic precursor from an overlapping isotopic cluster. This is exemplified by two overlapping triply charged (M+3H)3+ precursors with theoretical monoisotopic m/z values of 1763.217 versus 1763.889 (supplemental Fig. S6).
Even with the high resolution afforded by the Orbitrap, the third and most abundant isotopic peak of the former, at m/z 1763.886, could not be resolved from the monoisotopic peak of the latter. Further issues include low ion statistics for weak signals and the inclusion of more than 1 precursor within the 2 Th isolation window, which produces multiplexed MS2 spectra. None of these is an easily solved problem given the extreme heterogeneity in coeluting isobaric and isomeric structures, which imparts a high level of uncertainty and error in retrofitting the actual precursors with the correct glycosyl composition.
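The near-isobaric compositions and the isotopic overlap discussed above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the permethylated residue masses are computed here from elemental formulae, the 13C spacing is used as a simple approximation for isotopic peak spacing, and `separation_ratio` is a crude m/Δm estimate rather than a formal resolving power definition. All names are hypothetical.

```python
# Monoisotopic masses (Da) of permethylated residues, from elemental formulae.
RESIDUE = {
    "Fuc": 174.08921,     # C8H14O4
    "Hex": 204.09977,     # C9H16O5
    "HexNAc": 245.12632,  # C11H19NO5
    "NeuAc": 361.17367,   # C16H27NO8
}
C13_SPACING = 1.003355  # mass difference between 13C and 12C, Da


def residue_sum(composition):
    return sum(RESIDUE[name] * n for name, n in composition.items())


def mass_gap(comp_a, comp_b):
    """Absolute mass difference between two residue combinations."""
    return abs(residue_sum(comp_a) - residue_sum(comp_b))


def isotope_mz(mono_mz, k, charge):
    """Approximate m/z of the k-th isotopic peak above the monoisotopic peak."""
    return mono_mz + k * C13_SPACING / charge


def separation_ratio(mz_a, mz_b):
    """Crude m / delta-m needed to tell two peaks apart."""
    return ((mz_a + mz_b) / 2) / abs(mz_a - mz_b)


PAIRS = [
    ({"Fuc": 2, "HexNAc": 3}, {"NeuAc": 3}),
    ({"NeuAc": 2, "Fuc": 1}, {"Hex": 2, "HexNAc": 2}),
    ({"Fuc": 3, "HexNAc": 1}, {"NeuAc": 1, "Hex": 2}),
]

if __name__ == "__main__":
    for a, b in PAIRS:
        print(a, "vs", b, f"{mass_gap(a, b):.3f} Da")
    third_peak = isotope_mz(1763.217, 2, 3)
    print(f"third isotopic peak of the 3+ precursor: {third_peak:.3f}")
    print(f"m/dm to separate it from 1763.889: {separation_ratio(third_peak, 1763.889):.0f}")
```

Because the two peaks lie only about 0.003 Th apart, the estimated ratio is very sensitive to the rounding of the quoted m/z values and should be read as an order of magnitude only.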
Despite these inherent shortcomings, GlyPick will provide the best fit (< 5 ppm) for inferred monoisotopic precursors, compute the theoretical m/z values of the singly charged (M+Na)+ equivalents for ease of referring back to the MALDI-MS data, group all MS2/MS3 scans tracking back to the same precursors, sum the MS1 precursor and selected MS2 ion intensities for each unique glycan precursor thus identified, and output the results in Excel format (supplemental Table S4). This output can be further edited and mined by users in a manner similar to that described above for the simpler case of mono-sulfated O-glycans. A final "A-list" of 454 grouped entries containing 3860 MS2 spectral counts was registered after further editing out those entries fitted to unreasonable glycosyl compositions, usually the ones in which the number of Hex exceeds that of HexNAc by ≥ 2 residues, or those carrying excessive numbers of Fuc or Sia relative to Hex and HexNAc. All major MALDI-MS peaks annotated (Fig. 5A) are contained within this list, and they matched well with the grouped entries of highest summed MS1 ion intensity. MS2 scans that led to productive MS3 and that matched the more abundant precursor intensities could then be manually examined to verify as many N-glycan structures carrying the target glycotope as possible (see supplemental Table S5 and supplemental Fig. S7 for a filtered subset of MS2 scans containing the MS3-validated diagnostic MS2 ion at m/z 1186 for the disialyl LacNAc glycotope). For simplicity, a listing of only the more abundant grouped entries with summed MS1 and MS2 intensities could also be output to support the assignment of major peaks detected by MALDI-MS.
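The bookkeeping steps listed above (ppm fitting, conversion to the singly charged sodiated equivalent, grouping of scans by precursor, and removal of implausible compositions) can be sketched as follows. This is a simplified illustration and not GlyPick's algorithm: the scan records are hypothetical dictionaries, grouping reuses the 5 ppm tolerance as an assumption, and only the Hex versus HexNAc rule is coded because the text gives no numeric threshold for excessive Fuc or Sia.

```python
PROTON = 1.007276          # mass of a proton, Da
SODIUM_CATION = 22.989221  # mass of Na+, Da


def ppm_error(observed, theoretical):
    return (observed - theoretical) / theoretical * 1e6


def neutral_mass(mz, charge):
    """Neutral mass of a protonated (M+zH)z+ precursor."""
    return charge * (mz - PROTON)


def sodiated_mz(mz, charge):
    """Singly charged (M+Na)+ equivalent, for comparison with MALDI-MS data."""
    return neutral_mass(mz, charge) + SODIUM_CATION


def is_plausible(composition, max_hex_excess=1):
    """Reject compositions in which Hex exceeds HexNAc by 2 or more residues."""
    excess = composition.get("Hex", 0) - composition.get("HexNAc", 0)
    return excess <= max_hex_excess


def group_by_precursor(scans, tol_ppm=5.0):
    """Group scans whose neutral precursor masses agree within tol_ppm.

    Each scan is a dict with hypothetical keys: scan, mz, z, ms1.
    """
    groups = []
    for s in sorted(scans, key=lambda s: neutral_mass(s["mz"], s["z"])):
        mass = neutral_mass(s["mz"], s["z"])
        if groups and abs(ppm_error(mass, groups[-1]["mass"])) <= tol_ppm:
            groups[-1]["scans"].append(s["scan"])
            groups[-1]["ms1_sum"] += s["ms1"]
        else:
            groups.append({"mass": mass, "scans": [s["scan"]], "ms1_sum": s["ms1"]})
    return groups
```

Sorting by neutral mass before grouping keeps the comparison local, so each scan is only tested against the most recently opened group.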
Of note, we showed that many N-glycans, including the smallest one with only a single antenna (Fig. 5D) as well as the larger tetra-sialylated tetra-antennary structures (Fig. 5E, 5F), could carry a disialylated antenna but would not normally be annotated as such based on composition alone. The MS2 ion at m/z 1186 from each of these precursors was successfully auto-selected for pd-MS3 to yield the diagnostic NeuAc2+ ion at m/z 737, which was otherwise not obvious in their MS2 spectra. Interestingly, an MS2 ion at m/z 999 indicative of sialyl LeX could be detected for the difucosylated structure, suggesting the presence of alternative isomeric structures. In fact, our comprehensive MS2/MS3 analyses revealed all possible permutations of sialylation and fucosylation, as well as low levels of type 1 chain-based glycotopes, which collectively contributed to the vast glycomic heterogeneity. However, it is also clear from this work which are the major glycotopes carried on the brain striatum N-glycans.
DISCUSSION
A glycotope-centric glycomic approach and its enabling computational tool do not aim to identify as many glycans as confidently as possible by matching the individual MS2 spectra to a spectral library or to predicted MS2 ion sets derived from a predefined glycan database. The glyco-analysis community is still far from this goal of glycomics, with many technical problems to overcome. In fact, it is valid to question the feasibility, cost-effectiveness and wisdom of such an undertaking, given the vast isomeric and isobaric heterogeneity intrinsic to glycosylation. Particularly in cases of large, highly complex, multisialylated, multifucosylated N-glycans such as those of the brain N-glycome analyzed here, it is unlikely that any form of chromatography, including the widely used porous graphitized carbon (PGC) nanoLC capable of resolving isomeric glycans in their native, reduced forms (22)(23)(24), will ever achieve resolution at the single glycan entity level.
Data dependent MS2 acquisition invariably generates multiplexed spectra, even with the narrowest precursor isolation width. Assigning a positive spectrum match to a single dominant structure is misleading and comes only at the expense of minor isomeric or isobaric constituents. Moreover, many isomeric glycan structures have almost identical MS2 spectra, whereby further stages of MSn analysis will at best qualitatively inform the presence of the individual constituents. All of these must be considered in the context of achievable precision and sensitivity at reasonable throughput and accessibility to nonexperts.
We argue that a more productive approach that is also more biologically relevant is to first and foremost quantitatively map all the expressed glycotopes at the glycomic level. GlyPick enables the requisite automation and high throughput, making the proposed platform amenable to all and well suited for large scale comparative glycomic analysis of biological samples. As demonstrated in this work, the well-established elimination of substituents from the C3 position of an oxonium ion (14) is sufficient to allow unambiguous assignment of the isomeric constituents of a glycotope. On the plus side, any structural feature that can be defined by a unique MS2 ion, and preferably also confirmed by an MS3 ion, can thus be identified and relatively quantified. This includes the precise location of sulfate on terminal glycotopes, as determined in negative ion mode. The approach is also fully compatible with mining the LC-MS/MS data of native glycans, such as those separated on a PGC column, because GlyPick allows input of any user-specified MS2 ions. On the other hand, structural features that will not directly yield a diagnostic ion can potentially be defined by incorporating additional chemo-enzymatic manipulation steps. For example, recent work by Jiang et al. demonstrated that α2-3- and α2-6-linked sialic acids can be differentially derivatized by dimethylamidation followed by permethylation (25), resulting in a difference of 13 mass units between a methyl esterified and a dimethylamidated sialic acid, depending on whether it is α2-3- or α2-6-linked to Gal, respectively. This produces distinctive NeuAc+ and NeuAc-containing oxonium ions, allowing a variety of α2-3- and α2-6-linked sialylated glycotopes to be distinguished and relatively quantified using our workflow. The use of an α2-3-specific sialidase, followed by quantifying the reduced intensity of sialylated glycotopes relative to the nontreated sample, is another viable approach.
Granted, not all features can thus be resolved at present, but neither can they be by any other single method at sufficiently high sensitivity.
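The 13 mass unit difference cited above for linkage-specific sialic acid derivatization can be rationalized with simple arithmetic. The sketch below assumes that the shift arises solely from replacing the methoxy group of the methyl ester with a dimethylamino group, and that the permethylated NeuAc+ oxonium ion of the methyl ester form lies at m/z 376.197; both are assumptions made here for illustration, and the function name is hypothetical.

```python
METHOXY = 31.01839        # OCH3, monoisotopic mass in Da
DIMETHYLAMINO = 44.05002  # N(CH3)2, monoisotopic mass in Da
NEUAC_OXONIUM = 376.19660  # permethylated NeuAc+ carrying a methyl ester

# Mass shift per sialic acid when the methyl ester becomes a dimethylamide.
AMIDATION_SHIFT = DIMETHYLAMINO - METHOXY


def shifted_mz(base_mz, n_dimethylamidated):
    """m/z of a singly charged ion in which n sialic acids carry a dimethylamide."""
    return base_mz + n_dimethylamidated * AMIDATION_SHIFT


if __name__ == "__main__":
    print(f"shift per sialic acid: {AMIDATION_SHIFT:.3f} Da")
    print(f"NeuAc+ (methyl ester): {shifted_mz(NEUAC_OXONIUM, 0):.3f}")
    print(f"NeuAc+ (dimethylamide): {shifted_mz(NEUAC_OXONIUM, 1):.3f}")
```

Under these assumptions, any NeuAc-containing diagnostic ion would appear as a pair separated by about 13 Th per differentially derivatized sialic acid, which is what would allow the two linkage types to be entered as separate user-specified MS2 ions.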
In contrast to the rather straightforward mapping of glycotopes based on the summed intensity of MS2 ions, assigning the glycosyl compositions for each of the precursors that yield the respective diagnostic fragment ions is more problematic. Some of the technical challenges and pitfalls have been discussed above in reference to the brain N-glycomic data set. It should be emphasized that GlyPick does not attempt to identify glycans and thus there is no associated probabilistic scoring or false discovery rate issue. The most common source of error is failure to correctly infer the monoisotopic precursor, resulting in a mass value that either fails to fit or incorrectly fits a permissible glycosyl composition. This is, however, less of an issue for MS1 signals of higher intensity. As reflected by the grouping of all precursors with the same mass values and the summing of their MS1 ion intensities (supplemental Table S4), the major components that stand out from the list correlate well with the major assigned MALDI-MS peaks. GlyPick faithfully collates all MS2 and MS3 spectra acquired, sums their ion intensities, and computes the fitting glycosyl compositions for each of the precursors, without attempting to identify them by a sophisticated search and scoring algorithm. It facilitates manual interpretation, especially in plowing through thousands of spectra, but does not substitute for the eventual structural assignment effort.
Much of what we now know of the brain N-glycome stems from early comprehensive studies in the 1990s that used a large amount (~200 g) of whole brains from adult rats as starting material to isolate, fractionate, and characterize in detail the released N-glycans (26,27). In addition to about 15% of high mannose type structures, it was shown that the neutral N-glycans were dominated by core fucosylated biantennary structures terminating with Galβ1-4GlcNAc (LacNAc), α3-fucosylated LacNAc (LeX), or nongalactosylated β-GlcNAc. A substantial portion of the nonfucosylated LacNAc in these and most of the larger tri- and tetra-antennary N-glycans was NeuAcα2-3-sialylated, along with a significant amount of disialylated antennae in the form of NeuAcα2-3Galβ1-3(NeuAcα2-6)GlcNAc, as well as nonsialylated type 1 chain Galβ1-3GlcNAc and a small amount of sialyl LeX. This "brain-type" N-glycosylation pattern and the glycotopes identified are well conserved between rat and mouse brains and are reproducibly observed in all subsequent modern day MALDI-MS-based mapping of the brain N-glycome using much less material and without the extensive prefractionation (28-30). However, conspicuously absent from these single dimensional MS profiles was any evidence that would support the presence of disialylated LacNAc antennae beyond simple inference by molecular mass and hence overall assigned glycosyl compositions. More recently, a PGC LC-MS/MS-based mapping was applied to native mouse brain N-glycans (31), but structural details of the terminal glycotopes were likewise not pinned down because MS2 alone cannot distinguish the isomeric fucosylated or disialylated glycotopes.
In contrast, we have been able to provide conclusive supporting evidence for all the major glycotopes in a single nanoLC-MS2-pd-MS3 run, including resolving the 2 different disialylated termini and the LeX-dominant fucosylated glycotopes, using less than 5% of the permethylated N-glycans derived from the striatum of a single mouse brain. We were also able to map a full range of the sulfated glycotopes in a different fraction collected from the same sample after permethylation (unpublished data), an advantage not afforded by MS analysis of native glycans.
The current emphasis for a majority of mammalian glycomic analyses is on high throughput and sensitivity, to reliably detect glycosylation changes or differences using smaller amounts of a larger number of biological samples (32)(33)(34). This is needed to achieve the requisite statistical significance in the face of high individual variation. Decades of more conventional complete structural analysis have laid a strong foundation for understanding the range of mammalian glycans that can be made by a restricted set of glyco-genes. Although the exact number of glycan entities may be large and indefinite because of the unlimited permutations of branching and extension before terminal capping, the glycotopes and structural features that matter are not. In fact, a survey of the current glycomic literature seldom turns up reports of novel glycotopes. This is likely because most efforts being undertaken are not conducive to delineating among the known isomeric glycotopes, let alone to discovering new ones. Our workflow, in contrast, allows for comprehensive MS2/MS3 data acquisition and semi-quantitative analysis, homing in on distinguishing the relative amounts of isomeric glycotopes. It strikes the optimal balance among high throughput, sensitivity and precision. It similarly relies much on presumptive structural knowledge of the stereochemistry of Hex and HexNAc residues, but does validate the critical linkages that identify the glycotopes. It is particularly powerful for rapid survey, comparative or confirmatory glycomic mapping, and can be equally applied to any glycomic sample when searching for known or novel glycotopes.