Glycoproteomic Analysis of Seven Major Allergenic Proteins Reveals Novel Post-translational Modifications

Allergenic proteins such as grass pollen and house dust mite (HDM) proteins are known to trigger hypersensitivity reactions of the immune system, leading to what is commonly known as allergy. Key allergenic proteins, including sequence variants, have been identified, but characterization of their post-translational modifications (PTMs) is still limited. Here, we present a detailed PTM characterization of a series of the main and clinically relevant allergens used in allergy tests and vaccines. We employ Orbitrap-based mass spectrometry with complementary fragmentation techniques (HCD/ETD) for site-specific PTM characterization by bottom-up analysis. In addition, top-down mass spectrometry is utilized for targeted analysis of individual proteins, revealing hitherto unknown PTMs of HDM allergens. We demonstrate the presence of lysine-linked polyhexose glycans and asparagine-linked N-acetylhexosamine glycans on HDM allergens. Moreover, we identified more complex glycan structures than previously reported on the major grass pollen group 1 and 5 allergens, implicating important roles for carbohydrates in allergen recognition and response by the immune system. The new findings are important for understanding basic disease-causing mechanisms at the cellular level, which ultimately may pave the way for instigating novel approaches for targeted desensitization strategies and improved allergy vaccines.

Allergic respiratory disease is a global health problem, and current clinical guidelines recommend a combination of allergen avoidance, pharmacotherapy, and allergen-specific immunotherapy for treatment (1-4). At present, allergy testing and vaccines are based on isolated crude antigen preparations from natural sources (e.g. HDM and pollens), but a move toward recombinant allergen design is ongoing (5,6). This could have important functional implications because the production host will determine the repertoire of post-translational modifications (PTMs) and, in particular, the glycan modifications presented on allergens.

The carbohydrate structures found on allergens are in most cases not found in mammals and therefore frequently induce IgE antibodies directed against these so-called cross-reactive carbohydrate determinants (CCDs) (7-11). Moreover, glycans may be directly involved in allergen uptake by targeting allergens to carbohydrate-binding lectin receptors on antigen-presenting cells (APCs) (12-14). Therefore, a full structural characterization of the glycans on the natural allergens is a prerequisite for understanding both antibody reactivity and lectin receptor-mediated allergen recognition and modulation of the immune response (15,16). Furthermore, a detailed characterization of the PTMs of allergens is important for standardization of allergen products for diagnostic purposes as well as for vaccine use (17,18). Although many major allergens and their etiology have been characterized in some detail, structural information on, for example, their immunologically important PTM status is still incomplete (19-21).
Mass spectrometry-based technologies offer sensitive and accurate analyses for identification and characterization of proteins. The common proteomics workflow typically adopts the bottom-up approach, i.e. in vitro proteolytic digestion of proteins followed by nanoflow liquid chromatography-tandem mass spectrometry (nLC-MS/MS) for protein identification and PTM characterization. Electron- or collision-driven fragmentation techniques, e.g. electron transfer dissociation (ETD) (22) or higher energy collisional dissociation (HCD) (23), have enabled accurate identification of peptides of purified proteins, e.g. allergens (21,24), or complex biological samples (25-27) with concurrent characterization of their PTMs. One advantage of bottom-up mass spectrometry is the ability to resolve modified peptides within a narrow chromatographic time frame, thereby enabling in-depth characterization of site-specific features, e.g. glycoforms, on peptides. This peptide-level information is subsequently used to generate a protein-level view of the PTM status for a given protein. Importantly, the PTM connectivity of the protein (28) is lost upon proteolytic digestion, and alternative approaches are often required for comprehensive characterization of all proteoforms (29). Top-down mass spectrometry has emerged as an alternative approach to bottom-up proteomics, offering complementary MS and MS/MS information that may be used for protein identification and characterization (30,31). With top-down MS, intact proteins are typically analyzed by high-resolution FTMS and characterized at the MS/MS level by CID, HCD, ECD, or ETD. This technique provides instant protein-level information on analytes, e.g. sequence variants, amino acid substitutions, and PTMs, which can be verified at the MS/MS level by different fragmentation modes. The combination of bottom-up and top-down mass spectrometry is therefore a powerful tool for the identification and characterization of proteins.
Here, we combine top-down and bottom-up mass spectrometry for comprehensive characterization of seven major allergens as a first step toward unraveling the molecular mode of action of allergens with complex PTMs. By these methods, we demonstrate hitherto unknown PTMs of HDM allergens and identify more complex glycan structures than previously reported on the major grass pollen group 1 and 5 allergens. The new findings implicate important roles for carbohydrates in allergen recognition and response by the immune system.

Purification of HDM Dermatophagoides pteronyssinus (Der p) and Dermatophagoides farinae (Der f) Major Allergens Group 1 and 2-Spent house dust mite (HDM) cultures of either species were extracted 1:10 (w/v) in 44% (v/v) acetone and 56% (v/v) 0.125 M NH4HCO3, pH 8.3, for one hour at 5°C. After centrifugation, the supernatant was mixed with an equal volume of acetone, and the precipitate was collected by centrifugation and dried in a ventilated hood overnight at room temperature. The dried precipitate was dissolved in 0.125 M NH4HCO3, pH 8.3, dialyzed overnight against Milli-Q water at 5°C, and lyophilized.
The purification procedures for the Der p and Der f allergens are almost identical. Three to four hundred mg of culture extract was dissolved in 10 ml 0.05 M Na Acetate, pH 5.0, and mixed with 20 ml 2.0 M (NH4)2SO4, and the mixture was subjected to hydrophobic interaction chromatography (HIC) on a Butyl-S FF column (25 ml) equilibrated with 0.05 M Na Acetate and 1.5 M (NH4)2SO4, pH 5 (buffer A). The flow-through fractions were collected and pooled (~100 ml). This pool contains primarily the group 2 allergen and other HDM extract proteins, whereas the group 1 allergens bind to the column. The group 1 allergens were eluted from the column by a stepwise decrease of the (NH4)2SO4 molarity, changing the buffers from A to B (0.05 M Na Acetate and 1.05 M (NH4)2SO4, pH 5.0), to C (0.05 M Na Acetate and 0.98 M (NH4)2SO4, pH 5.0), and finally to D (0.05 M Na Acetate, pH 5.0). The fractions eluted in buffer D (~5 column volumes) were pooled and dialyzed against buffer E (0.05 M Na Phosphate and 0.5 M NaCl, pH 7.5). This sample was applied to a HiTrap Chelating Sepharose HP column (5.0 ml) charged with 0.1 M CuSO4 and equilibrated with buffer E. The proteins were eluted by a stepwise change in buffer composition and pH from E to F (0.05 M Na Phosphate and 0.5 M NH4Cl, pH 7.5), to G (0.65 M Na Acetate, pH 4.5), and finally to H (0.05 M Na Phosphate, 0.5 M NaCl, and 0.1 M EDTA, pH 7.5). The HDM group 1 allergens were eluted in buffer G (approx. 100 ml), dialyzed against buffer E, and applied to a HiTrap Chelating Sepharose HP column (5 ml) charged with 0.1 M ZnSO4 and equilibrated in buffer E. The proteins were eluted from the column using the same stepwise procedure as for the Cu chelating column, and Der p 1 was eluted in buffer G (~20 ml), dialyzed against 0.005 M NH4HCO3, pH 8.3, and lyophilized. The yield of purified Der p 1 was ~3% (w/w) of the starting material. Der f 1 was purified by the same procedure, excluding the Cu chelating chromatography, with a final yield of approx. 4% (w/w) of the starting material.
The pooled fraction containing HDM group 2 in buffer A from the above HIC (100 ml) was dialyzed against buffer E and subjected to Cu chelate chromatography as described for the HDM group 1 allergen purification. The HDM group 2 allergens eluted in buffer G (approx. 80 ml) and were concentrated in an Amicon Ultra 15, 3K centrifugation tube to 2-3 ml. This sample was applied to a size exclusion column, HiLoad Superdex 75 16/60, equilibrated with 0.15 M NH4HCO3, pH 8.3. The fractions were analyzed by SDS-PAGE, and the fractions containing HDM group 2 allergens were pooled (approx. 50 ml) and concentrated as above to 3 ml. The concentration step involved three dilutions with water, each followed by concentration, in order to decrease the salt content of the sample to less than 0.01 M. The final sample was subjected to preparative flatbed isoelectric focusing (IEF) on an Agilent 3100 OffGel fractionator, using precast immobilized pH gradient polyacrylamide strips, pH 3.5 to 10, in a 24-well setup. The IEF was run overnight, and the fractions containing HDM group 2 allergens (identified by SDS-PAGE and/or fused rocket immunoelectrophoresis) were dialyzed against 0.025 M NaHCO3, pH 8.5, and stored in aliquots at −20°C.
Purification of Pollen Major Allergens-Pollens from birch trees (Betula verrucosa) and timothy grass (Phleum pratense) were obtained from Greer, Nuway Circle, NE, and extracted 1:10 (w/v) in 0.125 M NH4HCO3, pH 8.3, overnight. After removal of particulate matter by centrifugation, the supernatants were dialyzed against 0.005 M NH4HCO3, with frequent changes of dialysis buffer, and lyophilized.
Five hundred thirty mg of birch pollen extract was dissolved in 30 ml 1.8 M (NH4)2SO4 and applied to two serially connected HiTrap Butyl S FF columns equilibrated in buffer A1 (0.05 M Na Phosphate and 1.8 M NH4HCO3, pH 7.0). The proteins were eluted from the column by a stepwise change in buffer to A2 (0.05 M Na Phosphate, pH 7.0) and finally to A3 (0.05 M Na2CO3 and 50% (v/v) ethylene glycol, pH 10). Fractions eluted with A2 were pooled (50 ml) and concentrated in an Amicon Filter Cell using a PLBC filter from Millipore. The sample (concentrated six times) was subjected to size exclusion chromatography on a HiLoad Superdex 75 16/60 120 ml column equilibrated in 0.125 M NH4HCO3, pH 8.3. The fractions containing the Bet v 1 allergen (identified by SDS-PAGE of the individual fractions) were pooled and dialyzed against 0.02 M TRIS HCl, pH 8.0. The sample was subjected to ion exchange chromatography on a HiTrap Q HP column equilibrated in the same buffer. Bet v 1 was eluted with 0.02 M BIS-TRIS and 1.0 M NaCl, pH 6.5. The fractions were pooled, dialyzed against 0.05 M Na Phosphate and 0.5 M NaCl, and subjected to Cu chelate chromatography on a HiTrap Chelating HP column charged with 0.1 M CuSO4 and equilibrated with the same buffer. The column was eluted by stepwise changes in buffer and pH. Bet v 1 was primarily eluted with 0.65 M Na Acetate at pH 5.0 and pH 4.8. These fractions were pooled and concentrated on an Amicon Filter Cell (see above). Finally, the concentrated sample was polished on a HiLoad Superdex 75 16/60 SEC column equilibrated in 0.150 M NH4HCO3, pH 8.3. The fraction containing Bet v 1 was dialyzed against 0.05 M NH4HCO3, pH 8.3, mixed with glycerol (final concentration 50% (v/v)), and stored at −20°C. The yield was 3.7% (w/w).
The grass pollen major allergens Phl p 1 and Phl p 5 were purified from pollen extracts by the following procedures. The extract (450 mg), dissolved in 0.15 M NH4HCO3, pH 8.3 (150 ml), was precipitated by addition of solid ammonium sulfate to a concentration of 3.0 M. After centrifugation, the supernatant containing Phl p 1 was diluted to 1.0 M ammonium sulfate and subjected to HIC on an Octyl FF 16/30 20 ml column equilibrated in 0.05 M Na Acetate and 1.0 M (NH4)2SO4, pH 5.0. The flow-through fractions (containing Phl p 1) were concentrated (Amicon Ultra spin tubes) to 10 ml and further fractionated on two consecutive SEC columns (HiLoad Superdex 75 16/60) equilibrated in 0.159 M NH4HCO3, pH 8.3. The fractions containing Phl p 1 (identified by SDS-PAGE) from the first SEC were pooled, concentrated (Amicon Ultra spin tubes) to 5 ml, and run a second time on the same SEC column, yielding 10 mg purified Phl p 1 after lyophilization.
Phl p 5 was purified from the above 3.0 M ammonium sulfate precipitate. The precipitate was dissolved in 0.15 M NH4HCO3, pH 8.3, and subjected to SEC (see above). The fractions containing Phl p 5 (identified by SDS-PAGE) were pooled and dialyzed against 0.1 M Na Acetate, pH 5.0. After addition of 2.0 M (NH4)2SO4 to a final concentration of 1.0 M, the solution was subjected to HIC on an Octyl FF column. The column was eluted with a stepwise change in buffer and pH, and Phl p 5 eluted in 0.05 M Na2CO3, pH 10, and 50% (v/v) ethylene glycol. The fractions were pooled and fractionated further on two consecutive SECs (see Phl p 1 purification), yielding ~10 mg lyophilized Phl p 5.
SDS-PAGE and MALDI-TOF Analysis-Purified allergens (5 µg) were mixed with an equal volume of 2× Novex LDS sample buffer (Life Technologies) supplemented with 100 mM dithiothreitol and heated at 95°C for 5 min. Two µg of each allergen was separated (200 V, 35 min) using MES buffer and Novex Bis-Tris 4-12% gradient gels (Life Technologies, Naerum, Denmark). Bands were visualized by staining with Novex SimplyBlue SafeStain (Life Technologies). Allergens (1 µg) were analyzed on a MALDI-TOF (Autoflex Speed, Bruker Daltonics, Bremen, Germany) instrument operated in the positive mode using sinapinic acid as matrix. Mass spectra were acquired in the linear mode (mass range m/z 10,000-30,000 or m/z 10,000-40,000) using 2000 laser shots/spot.
Bottom-up Mass Spectrometry-Ten µg of each allergen was reduced (5 mM dithiothreitol, 60°C, 30 min) and alkylated (10 mM iodoacetamide, RT, 30 min) before overnight treatment at 37°C with 1 µg trypsin (Roche, Hvidovre, Denmark). The tryptic digest of each allergen was purified using in-house packed Stagetips (Empore disk-C18, 3M) as described, with minor modifications (32). The C18 material was packed to a height of ~2 mm in a standard 200 µl pipette tip and washed consecutively with methanol, 50% methanol in 0.1% formic acid, and 0.1% trifluoroacetic acid. The tryptic digest was acidified by adding 2 µl trifluoroacetic acid and loaded on the Stagetip. The Stagetip was washed once again with 0.1% formic acid, and tryptic peptides were eluted with 50% methanol in 0.1% formic acid. The tryptic digests were analyzed separately using a setup composed of an EASY-nLC 1000 UHPLC (Thermo Scientific) interfaced via a nanoSpray Flex ion source to an LTQ-Orbitrap Velos Pro hybrid mass spectrometer. The EASY-nLC 1000 was operated using a single analytical column setup (PicoFrit Emitters, New Objective, 75 µm inner diameter) packed in-house with Reprosil-Pure-AQ C18 phase (Dr. Maisch, 1.9 µm particle size). Peptides were separated using a 90 min LC gradient operated at 200 nL/min. The mobile phases were composed of solvent A (H2O) and solvent B (acetonitrile), both containing 0.1% formic acid (v/v). The gradient was 5-30% B for 65 min, followed by 30-80% B for 10 min, and finally 80% B for 15 min. Mass spectra were acquired essentially as previously described (27) with minor modifications. Briefly, MS1 precursor scan (m/z 350-1700) acquisition was performed in the Orbitrap using a nominal resolution of 30,000, followed by HCD-MS2 fragmentation of the five most abundant multiply charged precursor ions. A minimum signal threshold of 50,000 ions was used for triggering HCD-MS2 and ETD-MS2 fragmentation. MS2 scans (m/z 100-2000) were acquired in the Orbitrap mass analyzer using a resolution setting of 15,000. The tryptic digest of Bet v 1 was analyzed on an LTQ-Orbitrap XL (Thermo Scientific) instrument interfaced with an EASY-nLC II system (Thermo Scientific). The analytical column was packed in-house using the same materials as above, with the exception of the particle size of the C18 material, which was 3 µm. Peptides were separated using a 60 min gradient and characterized only by HCD fragmentation of the three most abundant multiply charged precursor ions.
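The 90 min gradient described above can be written down as a piecewise-linear program. The following is a minimal sketch, assuming linear ramps between the stated breakpoints; the names `GRADIENT` and `percent_b` are illustrative and do not come from the instrument software.

```python
# Breakpoints (time in min, % solvent B) taken from the text:
# 5-30% B over 65 min, 30-80% B over 10 min, then 80% B held for 15 min.
GRADIENT = [(0.0, 5.0), (65.0, 30.0), (75.0, 80.0), (90.0, 80.0)]


def percent_b(t_min, program=GRADIENT):
    """Return % solvent B at time t_min, assuming linear ramps between breakpoints."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time outside gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 32.5, 65, 70, 75, 90):
        print(f"{t:5.1f} min: {percent_b(t):.1f}% B")
```

The segment durations sum to 65 + 10 + 15 = 90 min, which matches the stated gradient length.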
Top-down Mass Spectrometry-Each allergen (1 pmol/µl in 50% methanol and 0.1% formic acid, v/v) was analyzed on an LTQ-Orbitrap Velos Pro hybrid mass spectrometer (Thermo Fisher Scientific) by direct infusion. The samples were introduced via a TriVersa NanoMate ESI-Chip interface (Advion BioSystems, Ithaca, NY); direct infusion was controlled by Chipsoft 8.1.0 (Advion Biosciences). The MS1 precursor scan was acquired in the Orbitrap with 100,000 resolving power at m/z 400. MS2 fragmentation events utilizing HCD, CID, and ETD modes were performed, and product ions were detected in the Orbitrap with 30,000-60,000 resolving power at m/z 400.
Quantification of Carbohydrates-The C-terminally amidated IRPTMTIPGYVEPTAV peptide standard, modified by a single O-linked mannose on Thr4, was used to generate a standard curve. Five hundred pmol of the peptide standard and of the Der p 2 and Der f 2 allergens were each dissolved in 40 µl 50 mM sodium acetate, pH 4.8. To each sample, 20 mM sodium periodate was added, and the reaction was allowed to proceed at RT for 60 min before being quenched by the addition of 1 µl 96% glycerol. Biotin-hydrazide was added to a final concentration of 500 µM, and the mixture was left to react in darkness for 16 h at RT. Control reactions for the glycopeptide standard and allergens were performed as described above, except that sodium periodate was omitted from the reaction. The products were desalted by reversed-phase chromatography (C18 Stagetip microcolumns). ELISA plates (Nunc-Immuno MaxiSorp F96 plates, Nunc, Roskilde) were coated with a dilution series (1:2 starting at 100 ng/ml) of glycopeptide, the indicated allergens, and controls in carbonate-bicarbonate buffer (pH 9.6) overnight at 4°C. Plates were washed and blocked with BSA/Triton X-100 buffer (1% BSA, 1% Triton X-100, 3 mM KCl, 0.5 M NaCl, and 8 mM phosphate buffer, pH 7.4) for 1 h at RT, followed by washing and incubation with Streptavidin-HRP (Thermo Scientific) for 1 h at RT. Plates were developed with the TMB+ one-step substrate system (Dako), reactions were stopped with 0.5 M H2SO4, and absorbance was read at 450 nm. The amount of biotinylated protein was calculated from the peptide standard curve.
Reductive Beta Elimination-Chemical release of O-linked glycans was done as previously described (33) with modifications. Briefly, 50 µg of bovine fetuin (Sigma Aldrich), Der f 2, and Der p 2 allergens were separately incubated in 100 µl 0.1 M NaOH and 1 M NaBH4. The reductive beta elimination was allowed to proceed for 16 h at 50°C, and the reaction was terminated by addition of 8 µl glacial acetic acid. The glycans were separated from protein components on Sep-Pak C18 columns and desalted with Dowex AG 50W X8 cation exchange resin. Finally, glycans were dried under a stream of nitrogen gas, and BH4− was removed as volatile methyl borate by repeated evaporation (5×) in 500 µl methanol containing 1% acetic acid. Released glycans were permethylated according to the procedure of Ciucanu and Costello (34) and analyzed by positive mode MALDI-TOF (Autoflex Speed, Bruker Daltonics, Bremen, Germany) operated in the reflector mode (mass range m/z 500-3000; 2000 laser shots/spot) using 2,5-dihydroxybenzoic acid as matrix.
Gas Chromatography-Mass Spectrometry-For GC-MS analysis, 50 µg of dust mite allergens were spiked with myo-inositol (internal standard) and depolymerized by incubation in 0.5 N methanolic HCl (Supelco, Bellefonte, PA) at 80°C for 16 h. The reaction mixtures were dried under a stream of nitrogen gas and re-N-acetylated (RT, 30 min) by the addition of 500 µl methanol, 50 µl pyridine, and 50 µl acetic anhydride. The reaction mixtures were dried down once again and per-O-trimethylsilylated with Tri-Sil reagent (Thermo Scientific) at 80°C for 30 min. The samples were dried down as above and dissolved in 1 ml n-hexane (Merck) prior to injection. GC-MS analysis was performed using a TRACE GC Ultra gas chromatograph coupled to a PolarisQ ion trap mass spectrometer (Thermo Scientific). Samples were injected (splitless mode) at 40°C (1 min); the oven temperature was ramped to 150°C (25°C/min), followed by an increase to 200°C (2°C/min) before a final ramp to 260°C (10°C/min), where it was held for 5 min. Monosaccharides were identified by comparison of retention times and mass spectra to N-acetylglucosamine, N-acetylgalactosamine, mannose, or lactose standards (depolymerized, re-N-acetylated, and per-O-trimethylsilylated). All retention time comparisons were relative to the myo-inositol internal standard.
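As a quick check on the oven program above, the total run time follows from the hold times and ramp rates. This is a small sketch under the assumption that each ramp is linear and starts immediately after the previous step; the function name `oven_program_minutes` is illustrative only.

```python
def oven_program_minutes(start_temp, initial_hold, ramps, final_hold):
    """Total run time (min) for a hold, a series of linear ramps, and a final hold.

    ramps is a list of (target temperature in deg C, rate in deg C per min).
    """
    total = initial_hold
    temp = start_temp
    for target, rate in ramps:
        total += (target - temp) / rate  # time needed to reach the next target
        temp = target
    return total + final_hold


# Values from the text: 40 deg C for 1 min, to 150 at 25/min,
# to 200 at 2/min, to 260 at 10/min, then held for 5 min.
RAMPS = [(150.0, 25.0), (200.0, 2.0), (260.0, 10.0)]
RUN_TIME = oven_program_minutes(40.0, 1.0, RAMPS, 5.0)

if __name__ == "__main__":
    print(f"Total GC run time: {RUN_TIME:.1f} min")
```

Under these assumptions the program takes 1 + 4.4 + 25 + 6 + 5 = 41.4 min, most of it in the slow 2°C/min segment.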
LC-MS of Barium Hydroxide Mediated Peptide Bond Hydrolysates-Briefly, 180 µg of Phl p 1 and Phl p 5 were subjected to Ba(OH)2 (0.22 M final concentration) mediated peptide bond hydrolysis in a total volume of 0.5 ml, which was incubated overnight at 108°C. After cooling to RT, HCOOH was added to 5 mM, and the samples were carefully neutralized with small aliquots of 4 M H2SO4. The resulting precipitate was pelleted and the supernatant collected.
LC-MS (3 µl supernatant injected) was carried out on an Agilent 1100 Series LC (www.agilent.com) coupled to a Bruker HCT Ultra ion trap mass spectrometer (www.bruker.com). The column was a Phenomenex Luna C8(2) (3 µm, 100 Å, 150 × 2.0 mm; www.phenomenex.com) preceded by a Phenomenex Gemini C18 SecurityGuard (4 × 2 mm). The oven temperature was maintained at 35°C. The mobile phases were A, water, and B, acetonitrile, both with 0.1% (v/v) HCOOH, and the flow rate was 0.2 ml min−1. The gradient was: 0 to 2 min, isocratic 1% B; 2 to 8.5 min, linear gradient 1 to 3% B; 8.6 to 9.6 min, isocratic 99% B; and 9.7 to 17 min, isocratic 1% B. The mass spectrometer was run in positive electrospray mode. An extensin-enriched fraction from roots of wild type Arabidopsis thaliana (ecotype Columbia) was obtained as described (35) and subjected to Ba(OH)2 mediated peptide bond hydrolysis, yielding the extensin-derived Hyp-Araf1-4 species as earlier reported (36). This preparation served as a Hyp-Araf1-4 positive reference. Semiquantitative integration of peaks in the extracted ion chromatograms was done using the Bruker Compass DataAnalysis version 1.1 program.
Data Analysis-Analysis and processing of bottom-up mass spectra was performed essentially as previously described (27). Briefly, .raw files were processed using the Sequest HT node of Proteome Discoverer 1.4 software (Thermo Fisher Scientific) and searched individually against allergen sequences (n = 95, supplemental Table S2) downloaded from the Uniprot database (April, 2014). The enzyme setting was full cleavage specificity by trypsin with one missed cleavage; peptide mass tolerance was 10 ppm; fragment ion mass tolerance was 0.05 amu; carbamidomethyl was set as a fixed modification for cysteine residues; and methionine oxidation and asparagine deamidation were used as variable modifications. In addition, attachment of hexose at Ser/Thr/Tyr/Lys and N-acetylhexosamine at Ser/Thr/Asn residues was used as a dynamic modification for dust mite allergens. Dynamic hexose modification (at Ser/Thr/Tyr/Lys), hydroxylation of proline residues, and variable pentose modification (at hydroxyproline) were used for grass pollen allergens (including Bet v 1). All spectra at the medium confidence level (p > 0.01) and below were filtered and resubmitted to a second Sequest HT node as described above, except that semi-specific trypsin cleavage was used for the enzyme setting. The final results were filtered for only high confidence (p < 0.01) identifications. Spectra matched to peptides containing PTMs were inspected manually to verify the accuracy of the assignments, and only fragment ions of 0.01 amu mass accuracy or better were considered. The Xtract tool (Xcalibur Qual Browser software, Thermo Fisher Scientific) was used to deconvolute MS and MS/MS spectra using a fit factor setting of 40% and a remainder setting of 25%. Masses calculated by Xtract were used to assign the correct monoisotopic mass of precursor and product ions.
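The mass tolerances and modification settings above translate into simple numeric checks. The sketch below illustrates them under stated assumptions: the monoisotopic mass shifts are standard values that are not listed in the text, and the function names are illustrative rather than part of Proteome Discoverer or Sequest HT.

```python
# Standard monoisotopic mass shifts (Da) for the modifications named in the text.
# These numbers are an assumption of this sketch, not values quoted from the paper.
MOD_DELTAS = {
    "carbamidomethyl (Cys, fixed)": 57.02146,
    "oxidation (Met)": 15.99491,
    "deamidation (Asn)": 0.98402,
    "hexose (Hex)": 162.05282,
    "N-acetylhexosamine (HexNAc)": 203.07937,
    "pentose (Pen)": 132.04226,
    "hydroxylation (Pro)": 15.99491,
}

PRECURSOR_TOL_PPM = 10.0  # peptide mass tolerance from the text
FRAGMENT_TOL_DA = 0.05    # fragment ion mass tolerance from the text


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def precursor_matches(observed, theoretical, tol_ppm=PRECURSOR_TOL_PPM):
    """True if the precursor mass lies within the relative (ppm) tolerance."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


def fragment_matches(observed, theoretical, tol_da=FRAGMENT_TOL_DA):
    """True if the fragment mass lies within the absolute (Da) tolerance."""
    return abs(observed - theoretical) <= tol_da


if __name__ == "__main__":
    # At m/z 1000, a 10 ppm window corresponds to +/- 0.01.
    print(precursor_matches(1000.009, 1000.0), precursor_matches(1000.011, 1000.0))
```

A relative tolerance suits the high-resolution precursor scan, whereas the fragment tolerance is expressed as an absolute mass window, as in the search settings above.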

RESULTS
The natural major allergens Der p 1, Der p 2, Der f 1, Der f 2, Bet v 1, Phl p 1, and Phl p 5 were purified to more than 95% purity as revealed by SDS-PAGE analysis (supplemental Fig. S1). The total molecular mass for the individual allergen proteins was obtained by MALDI-TOF analysis (supplemental Figs. S2 and S3). The characterization of PTMs of Der p 1, Der p 2, Der f 1, Der f 2, Bet v 1, Phl p 1, and Phl p 5 is summarized in Fig. 1 together with Table I, and the detailed structural analysis of the individual allergens is provided below.
Phl p 1 Allergen-The common timothy allergen Phl p 1 was observed at m/z 27,532, which is ~1400 Da above the predicted molecular weight. The tryptic digest of the natural Phl p 1 was analyzed by bottom-up FTMS and identified with 85% sequence coverage (supplemental Data). For identification of N-glycan modifications of Phl p 1, the ion chromatogram was filtered for the HCD-MS2 m/z 204.08-204.09 ion trace (herein referred to as m/z 204), indicative of the presence of N-acetylhexosamine (HexNAc), and inspected manually. Precursor ions of full scan MS1 spectra, found at specific retention times that coincide with HCD-MS2 m/z 204 ion traces, were summed and deconvoluted into their monoisotopic [M+H]+ masses (Fig. 2A). These precursor ions were matched with good mass accuracy (<5 ppm) to several N-glycoforms of the tryptic (V27PPGPNITATYGDK40) and semitryptic (I24PKVPPGPNITATYGDK40) peptides (supplemental Table S1) of Phl p 1. These glycopeptides additionally contained 1-2 hydroxyproline (Hyp) modifications together with 1-4 pentose (Pen) residues, one of which was the single xylose (Xyl) linked to the N-glycan. The glycan modifications were confirmed for all precursor ions by HCD fragmentation (supplemental Fig. S4). For example, the precursor ions at m/z 2454.098, z = 1 and m/z 2616.151, z = 1 were identified as the tryptic V27PPGPNITATYGDK40 peptide with the N-linked glycans Pen1dHex1Hex2-3HexNAc2, as earlier reported (21,24). In addition, we identified semitryptic I24PKVPPGPNITATYGDK40 peptides with the N-linked glycans Pen1dHex1Hex2-3HexNAc2, two Hyps, and 1-3 additional Pen residues. Wicklein et al. previously identified a single Hyp-linked arabinose (Ara) on Phl p 1, but no reports were made on additional Ara extensions (21). No other modifications were identified on Phl p 1.
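The two steps described above, screening HCD-MS2 spectra for the m/z 204 HexNAc oxonium ion and relating glycoforms by their mass difference, can be sketched as follows. This is a hedged illustration only: the monosaccharide residue masses are standard monoisotopic values, the fragment lists in the example are hypothetical, and the function names are not taken from any vendor software.

```python
# Standard monoisotopic residue masses (Da); assumed values, not quoted from the text.
MONOSACCHARIDES = {
    "Pen": 132.04226,
    "dHex": 146.05791,
    "Hex": 162.05282,
    "HexNAc": 203.07937,
}
PROTON = 1.007276
# The HexNAc oxonium ion (HexNAc residue + proton) is expected near m/z 204.0866.
OXONIUM_WINDOW = (204.08, 204.09)  # filter window stated in the text


def has_hexnac_oxonium(fragment_mzs, window=OXONIUM_WINDOW):
    """True if any fragment m/z falls inside the HexNAc oxonium ion window."""
    low, high = window
    return any(low <= mz <= high for mz in fragment_mzs)


def assign_delta(mass_a, mass_b, tol=0.01):
    """Name the monosaccharide(s) whose residue mass explains a mass difference."""
    delta = abs(mass_b - mass_a)
    return [name for name, mass in MONOSACCHARIDES.items() if abs(delta - mass) <= tol]


if __name__ == "__main__":
    # Hypothetical fragment lists, for illustration only.
    print(has_hexnac_oxonium([138.055, 204.0866, 366.139]))
    print(has_hexnac_oxonium([175.119, 204.10]))
    # The two [M+H]+ values quoted in the text differ by one hexose.
    print(assign_delta(2454.098, 2616.151))
```

The difference between the two quoted precursors, 2616.151 − 2454.098 = 162.053, corresponds to one Hex residue, consistent with the Hex2 and Hex3 glycoforms of the same peptide.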
HCD-MS2 of Phl p 1 glycopeptides revealed sequential loss of individual monosaccharides with prominent B- and Y-type ions, corresponding to fragmentations at glycosidic bonds (supplemental Fig. S4). The peptide ion with a single HexNAc and Pen residue was typically observed as the most abundant fragment ion for arabinosylated glycopeptides. This suggests that the Hyp-linked Pen residue is relatively stable during HCD fragmentation and not eliminated as easily as other forms of O-glycosylation (e.g. GalNAc-type). However, the Pen residue was not stable enough for glycosite assignment and was usually eliminated from b- and y-type fragments, thus precluding confident identification of Pen-modified Hyp residues by HCD-MS2.
The HCD spectra were also found to contain fragment ions that were useful for mapping the location of the Hyp residues (supplemental Fig. S4B-4C). The b4 fragment ion (m/z = 438.31, z = 1), present in all I24PKVPPGPNITATYGDK40 glycopeptide spectra, showed that P25 is not modified by hydroxylation. For glycopeptides with single Hyp modifications, we observed the y10+HexNAc ion (P31NITATYGDK40) at m/z 1298.61, z = 1, which shows that the first hydroxylation occurs on P31. The y10 ion, with Hyp on P31, was also observed without HexNAc modification (m/z = 1095.53, z = 1), but typically at low abundance and not always isotopically resolved. Importantly, y10 ions without hydroxylation on P31 were not observed. For Phl p 1 glycopeptides with two Hyp residues, we observed y10+HexNAc and y11+HexNAc ions (G30PNITATYGDK40) with hydroxylation on P31 at m/z = 1298.61, z = 1 and m/z = 1355.63, z = 1, respectively. In addition, the y12+HexNAc fragment (P29GPNITATYGDK40, m/z = 1452.69, z = 1) was also observed with one Hyp. This fragment includes two proline residues, P29 and P31, but is modified by only one hydroxylation, on P31. Again, the b4 fragment was observed at m/z = 438.31, corresponding to I24PKV27 without hydroxylation, which rules out Hyp at P25. However, the b5 fragment ion was observed at m/z = 551.35, z = 1, which corresponds to the I24PKVP28 fragment with one hydroxylation, thereby showing that the second Hyp residue is on P28. Taken together, the HCD results show that the first and second hydroxylations occur on P31 and P28, respectively. These results are in agreement with previous reports in which Hyp residues were localized through peptide sequencing using Edman degradation (21).
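The fragment ion assignments above can be checked by summing monoisotopic residue masses. The following is a minimal sketch, assuming standard residue masses and singly protonated b- and y-type ions; the helper names `b_ion` and `y_ion` are illustrative and the mass constants are not quoted from the text.

```python
# Standard monoisotopic residue masses (Da) for the residues in I24PKVPPGPNITATYGDK40.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "I": 113.08406, "N": 114.04293, "D": 115.02694,
    "K": 128.09496, "Y": 163.06333,
}
WATER = 18.010565
PROTON = 1.007276
HYDROXYLATION = 15.994915  # Pro -> Hyp
HEXNAC = 203.079373


def b_ion(seq, extra=0.0):
    """Singly charged b-ion m/z: residue sum + proton + any modification mass."""
    return sum(RESIDUE[aa] for aa in seq) + PROTON + extra


def y_ion(seq, extra=0.0):
    """Singly charged y-ion m/z: residue sum + water + proton + any modification mass."""
    return sum(RESIDUE[aa] for aa in seq) + WATER + PROTON + extra


B4 = b_ion("IPKV")                                          # no hydroxylation on P25
B5_HYP = b_ion("IPKVP", HYDROXYLATION)                      # Hyp on P28
Y10_HYP = y_ion("PNITATYGDK", HYDROXYLATION)                # Hyp on P31
Y10_HYP_HEXNAC = y_ion("PNITATYGDK", HYDROXYLATION + HEXNAC)
Y11_HYP_HEXNAC = y_ion("GPNITATYGDK", HYDROXYLATION + HEXNAC)
Y12_HYP_HEXNAC = y_ion("PGPNITATYGDK", HYDROXYLATION + HEXNAC)

if __name__ == "__main__":
    for name, mz in [("b4", B4), ("b5+Hyp", B5_HYP), ("y10+Hyp", Y10_HYP),
                     ("y10+Hyp+HexNAc", Y10_HYP_HEXNAC),
                     ("y11+Hyp+HexNAc", Y11_HYP_HEXNAC),
                     ("y12+Hyp+HexNAc", Y12_HYP_HEXNAC)]:
        print(f"{name}: {mz:.3f}")
```

Under these assumptions the calculated values (438.307, 551.355, 1095.532, 1298.611, 1355.632, and 1452.685) agree with the reported fragment masses to within 0.01, and the y10+HexNAc and y11+HexNAc ions differ by exactly one glycine residue.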
To investigate the number of Pen substitutions on Hyps, we performed barium hydroxide mediated peptide bond hydrolysis of Phl p 1 followed by LC-ESI-MS analysis. With this strategy we identified ions at m/z 264, 396, and 528/550 (H+/Na+ forms) corresponding to Hyp-Pen1, Hyp-Pen2, and Hyp-Pen3, with Hyp-Pen1 and Hyp-Pen3 being most abundant as evidenced by visual inspection and integration of peaks in the extracted ion chromatograms (supplemental Fig. S5). The Hyp-Pen1-3 species eluted with retention times identical to those of barium hydroxide hydrolyzed, extensin-derived Hyp-arabinofuranose1-3 (Hyp-Araf1-3) on an HPLC column. Structural confirmation of the Hyp-Pen1, Hyp-Pen2, and Hyp-Pen3 species (m/z 264, 396, and 528/550) was obtained by MS2 analysis (supplemental Fig. S5). Other Hyp-substituted species, such as the extensin-specific Hyp-Araf4, were not identified. Our results show that Phl p 1, besides the previously reported single Hyp-linked Ara (24), also harbors two or three Hyp-linked Pen residues (Fig. 2A). Taken together with the HCD-MS2 analysis (supplemental Fig. S4), these results further suggest that only one site is occupied by the Hyp-Pen1-3 modifications.
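The nominal m/z values quoted for the hydrolysis products follow from the mass of free hydroxyproline plus one pentose residue per arabinose. The sketch below assumes standard monoisotopic masses (hydroxyproline C5H9NO3, pentose residue C5H8O4) and singly charged protonated or sodiated ions; the function name is illustrative.

```python
HYP = 131.058243        # free hydroxyproline, C5H9NO3 (assumed standard value)
PENTOSE = 132.042259    # anhydro-pentose residue, C5H8O4 (assumed standard value)
PROTON = 1.007276
SODIUM_ION = 22.989218  # mass of Na+


def hyp_pen_mz(n_pen, adduct=PROTON):
    """Singly charged m/z of hydroxyproline carrying n_pen pentose residues."""
    return HYP + n_pen * PENTOSE + adduct


PROTONATED = {n: hyp_pen_mz(n) for n in range(1, 5)}
SODIATED_PEN3 = hyp_pen_mz(3, SODIUM_ION)

if __name__ == "__main__":
    for n, mz in PROTONATED.items():
        print(f"Hyp-Pen{n} [M+H]+: {mz:.3f}")
    print(f"Hyp-Pen3 [M+Na]+: {SODIATED_PEN3:.3f}")
```

With these masses, Hyp-Pen1 to Hyp-Pen3 fall at nominal m/z 264, 396, and 528, and the sodiated Hyp-Pen3 at m/z 550, matching the ions reported above. A Hyp-Pen4 species, which was not identified, would be expected near m/z 660.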
Phl p 5 Allergen-The MALDI-TOF analysis of the natural Phl p 5 allergen revealed two peaks at m/z 26,373 and m/z 28,746 (supplemental Fig. S2), which deviate by ~10 kDa in mass compared with the apparent molecular weight observed on the SDS-PAGE gel (supplemental Fig. S1). The two molecular species correspond to two of the main Phl p 5 isoform groups: Phl p 5.010i (MW approx. 27.7 to 28.5 kDa) and Phl p 5.020i (MW approx. 26.1 kDa), which differ in their primary amino acid sequences (www.allergen.org). Although both Phl p 5 isoforms were identified with >86% sequence coverage (supplemental data) using bottom-up MS, we did not detect an m/z 204 ion trace in any of the HCD-MS2 spectra for these pollen allergens, indicating that Phl p 5 is not modified by any N- or O-linked HexNAc residues. Data mining using Proteome Discoverer 1.4 did, however, identify 0-7 Hyp residues on Phl p 5.0103 (Uniprot accession O81341) found in the N-terminal A26 (supplemental Fig. S6A). By allowing variable Pen modifications, up to seven Pen residues were also found on this peptide (Fig. 2B, supplemental Fig. S7). Additional Phl p 5 peptides modified by Hyp and Pen are presented in supplemental Fig. S6C-E. No other modifications were identified on Phl p 5. ETD experiments on Phl p 5 glycopeptides were not successful; that is, ETD fragmentation did not produce fragment ions useful for localization of arabinosylated Hyp residues. However, HCD-MS2 demonstrated that all seven prolines (P32, P35, P38, P44, P47, P50, and P55) are hydroxylated (supplemental Fig. S6A). Barium hydroxide mediated peptide bond hydrolysis of Phl p 5 and subsequent ESI-MS analysis identified ions at m/z 396 and 528/550, corresponding to Hyp-Pen2 and Hyp-Pen3, which co-eluted with barium hydroxide hydrolyzed, extensin-derived Hyp-arabinofuranose2 (Hyp-Araf2) and Hyp-Araf3 (supplemental Fig. S5). Structural confirmation of the m/z 396 and 528/550 species was obtained by MS/MS analysis (supplemental Fig. S5). Other Hyp-substituted species were not identified. Our results show that Phl p 5 harbors Hyp substitutions of one to three Pen in length (Fig. 1), which has not been reported previously.
Der p 2 and Der f 2 Allergens-MALDI-TOF analysis of Der p 2 and Der f 2 revealed a major peak at m/z 14,116 (theoretical mass: 14,121 Da) and m/z 14,088 (theoretical mass: 14,076 Da), respectively (supplemental Fig. S2). However, both Der p 2 and Der f 2 also displayed a second peak with a 162 amu increment at m/z 14,278 and m/z 14,250 (supplemental Fig. S3). For precise mass measurements and identification, we subsequently analyzed Der p 2 and Der f 2 by top-down FTMS. A cluster of multiply charged ions (z = 8-14) was observed for Der f 2 (Fig. 3A) and deconvoluted into monoisotopic masses (Fig. 3B). The most abundant ion (m/z 14062.092, z = 1) was found to deviate by -2.6 ppm from the theoretical mass of the natural Der f 2 protein (Uniprot accession Q00855) containing three intramolecular disulfide bonds. The precursor ion at m/z 1564.35, z = 9 (Fig. 3A) was subjected to HCD fragmentation and revealed several b- and y-type fragments (Fig. 3C) that verified the protein identity. The mass increments were now accurately measured and found to be 162.05 amu, suggesting the presence of Hex residues (Fig. 3B). The precursor ion modified by four potential Hex residues (m/z 1636.37, z = 9, Fig. 3A) was subsequently subjected to CID fragmentation (Fig. 3D). This revealed a prominent product ion at m/z 1564.35, z = 9 resulting from the loss of all four Hex residues. In addition, a second product ion at m/z 1600.25, z = 9 resulting from the loss of two Hex residues was also evident (Fig. 3D). Although both allergens were identified with >85% sequence coverage using bottom-up MS (supplemental data), we failed to detect m/z 204 ion traces in the HCD-MS2 spectra, indicating that these allergens are not modified by HexNAc residues. However, data mining using Proteome Discoverer 1.4 confirmed the presence of Hex modifications (supplemental Table S1) for both Der f 2, shown in Fig. 4, and Der p 2, shown in Fig. 5. Additional spectra are presented in supplemental Figs. S8 and S9.
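The Hex counts inferred from the CID experiment follow directly from the m/z shifts of the z = 9 ions. The sketch below is illustrative only: it assumes that the product ions retain the precursor charge and uses the standard monoisotopic mass of an anhydro-hexose residue; the function names are hypothetical.

```python
# Monoisotopic mass of a hexose residue (hexose minus water, C6H10O5) in Da.
# Standard reference value, assumed here rather than taken from the paper.
HEX_RESIDUE = 162.05282


def neutral_loss_mass(precursor_mz: float, product_mz: float, z: int) -> float:
    """Mass lost between precursor and product ion, assuming both carry charge z."""
    return (precursor_mz - product_mz) * z


def hex_units_lost(precursor_mz: float, product_mz: float, z: int) -> float:
    """Neutral loss expressed as a (possibly fractional) number of Hex residues."""
    return neutral_loss_mass(precursor_mz, product_mz, z) / HEX_RESIDUE


# m/z values quoted in the text for Der f 2 (z = 9).
print(round(hex_units_lost(1636.37, 1564.35, 9), 2))  # loss to the unmodified protein ion
print(round(hex_units_lost(1636.37, 1600.25, 9), 2))  # loss to the intermediate product ion
```

Both ratios come out close to whole numbers (four and two Hex residues, respectively), which is the basis for the interpretation given above.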
In Fig. 4A,   modified with 0-9 Hex residues. Although the most extensive modification with nine Hex residues occurs with relatively low stoichiometry, these results show the presence of a polyhexose glycan, which could be organized as a linear or branched structure. The relatively low abundance of precursor ions representing peptides modified by more than two Hex residues precluded further MS/MS characterization of the polyhexose sequence (Fig. 4). Next, Der p 2 and Der f 2 allergens were subjected to reductive β-elimination in order to release O-linked glycans for further structural characterization. Although a typical pattern of sialyl-T, disialyl-T, and disialyl core-2 O-glycans was observed for the bovine fetuin standard (not shown), we did not detect the expected glycans from Der p 2 or Der f 2. Depolymerization of the polyhexose sequence using methanolic HCl followed by trimethylsilyl derivatization and GC-MS did not reveal the epimeric identity of the Hex residues attached to Der p 2 and Der f 2 allergens (not shown). Finally, the attachment sites of the PTMs were unambiguously identified through bottom-up ETD analysis of the tryptic digest followed by data mining allowing variable Hex modifications at lysine residues (supplemental Figs. S8 and S9). The ETD-MS2 spectrum of the Der p 2 G 49 KPFTLEALFDANQNTK 68 peptide, with a Hex modification at K 50, is shown in Fig. 4C  both T 53 and T 67, was observed without Hex modification at m/z 1595.79. In addition, the c4 fragment, which excludes both T 53 and T 67 but includes K 50, was observed with a 162.05 amu mass increment, corresponding to a Hex residue, at m/z 609.32. These results clearly demonstrate that K 50, and not T 53 or T 67, is modified by Hex.
Similar results were obtained for the Der f 2 allergen. The C 95 PLVKGQQYDIK 106 peptide was identified with a single Hex although it lacks serine or threonine residues, thus leaving only tyrosine as a potential site of O-linked glycosylation. ETD fragmentation produced multiple c- and z-fragments, which confirmed the C 95 PLVKGQQYDIK 106 peptide identity (supplemental Fig. 9A). However, fragment ion masses in support of tyrosine O-glycosylation were not found. Instead, a continuous series of both c- and z-fragments in support of Hex modification at K 99 was observed in the ETD spectrum, which unambiguously identified the lysine residue as the site of modification. These results demonstrate that Der p 2 and Der f 2 are modified by mono- or polyhexoses at multiple lysine residues (Fig. 1, supplemental Table S1).
ELISA Quantification of Carbohydrate Content-The O-mannosylated peptide standard was specifically biotinylated on the hexose residue through periodate oxidation followed by aldehyde-hydrazide chemistry and subsequently used in an ELISA format to determine the carbohydrate amount of the allergens. The molar ratio of carbohydrate containing molecules was determined to be 32% for Der p 2 and 42% for Der f 2, as quantitated by ELISA (supplemental Fig. S10).

Der p 1 and Der f 1 Allergens-The masses m/z 25,554 and m/z 25,365 obtained by MALDI-TOF (supplemental Fig. S2) of the homologous dust mite allergens Der p 1 and Der f 1, respectively, are relatively close to the predicted molecular masses (Der p 1: 25,019 amu and Der f 1: 25,192 amu) of the mature and processed forms of both allergens. Although this indicated that some PTMs might be present, the MALDI-TOF measurement was unable to provide any details other than the mass increments of ~535 amu and 173 amu for Der p 1 and Der f 1, respectively. The tryptic digests of Der p 1 and Der f 1 were analyzed by bottom-up MS, resulting in >74% sequence coverage for both HDM allergens (supplemental Data). A distinct m/z 204 ion trace was observed at ~48 min in the ion chromatogram of the Der p 1 allergen (not shown) and the HCD-MS2 spectrum of the precursor ion at m/z 814.12, z = 4, shown in Fig. 6, was matched to the tryptic N 150 QSLDLAEQELVDCASQHGCHGDTIPR 176 peptide sequence with a single HexNAc modification (supplemental Fig. S11A). The single HexNAc moiety was mapped by ETD fragmentation (supplemental Fig. S11B) to Asn 150, which is located in an N-glycosylation consensus sequence. Furthermore, a second precursor ion (m/z 865.39, z = 4) with a mass increment corresponding to an additional HexNAc residue was also observed and matched to the same peptide sequence (supplemental Table S1, supplemental Fig. S11C). Taken together, these data show that natural Der p 1 is N-glycosylated with one or two HexNAc residues at Asn 150. Similar results were obtained for the natural Der f 1 allergen. A prominent m/z 204 trace was found in the tryptic digest of Der f 1 and the HCD-MS2 spectrum of the precursor ion at m/z 811.36, z = 4 (Fig. 6), was matched to the HexNAc-modified N 151 TSLDLSEQELVDCASQHGCHGDTIPR 177 peptide of Der f 1 (supplemental Fig. S12A).
An additional precursor ion (m/z 862.39, z = 4), corresponding to the same peptide modified by two HexNAc residues, was also characterized (supplemental Table S1, supplemental Fig. S12B). Glycans elongated beyond the HexNAc 2 structure were not observed for the natural Der p 1 or Der f 1 allergens, and peptides harboring these modifications were not observed in their nonglycosylated forms. Both Der p 1 and Der f 1 were also found to be modified by Hex monosaccharides at lysine residues (supplemental Table S1, supplemental Fig. S13). Compositional monosaccharide analysis by GC-MS did not reveal the identity of Hex and HexNAc residues on Der p 1 and Der f 1 allergens. In conclusion, our results show that Der p 1 and Der f 1 allergens are modified by HexNAc glycans at Asn residues and Hex glycans at Lys residues (Fig. 1, supplemental Table S1).
Bet v 1 Allergen-The white birch allergen Bet v 1 was mass measured by MALDI (supplemental Fig. S2) and observed as a single peak at m/z 17,474 (theoretical average mass: 17,440 Da), supporting the notion that major modifications are not present on this allergen (37). Bottom-up MS of the tryptic digest identified Bet v 1 with 97% sequence coverage (supplemental data), and did not reveal an m/z 204 ion trace or any other glycan modifications, in accordance with the lack of PTMs on the natural Bet v 1 allergen. Top-down FTMS analysis (supplemental Fig. S14) revealed several multiply charged precursor ions (z = 12-22) that were deconvoluted into two prominent peaks at m/z 17429.134, z = 1 and m/z 17429.840, z = 1. The latter mass was matched (-5.5 ppm) to the major isoform of Bet v 1 lacking the initiator methionine. Top-down HCD and ETD analysis verified the Bet v 1 sequence identity (supplemental Fig. S14). The precursor ion at m/z 17429.134, z = 1 was found to deviate from the monoisotopic mass of Bet v 1 by -46 ppm. This could be attributed to a single amino acid substitution, for example, Asp to Asn, but has not been confirmed by MS2 analysis. In conclusion, our results demonstrate that Bet v 1 is not subjected to post-translational modifications (Fig. 1).
DISCUSSION
Here we demonstrate that a number of major allergens important in human allergic diseases carry carbohydrate modifications with a hitherto unknown level of complexity that may have implications for pathogenesis, clinical diagnostics, and immune-based treatment strategies (19, 38-40). We have adopted a proteocentric approach for mass spectrometric characterization, which we have applied to key grass pollen, tree pollen, and HDM allergens. The data presented provide an in-depth analysis of the carbohydrate structure of Der p 1, Der p 2, Der f 1, Der f 2, Bet v 1, Phl p 1, and Phl p 5, exceeding previous investigations in the level of detail obtained (21,24,41).
We identified novel PTMs of HDM allergens as well as more complex glycan structures than previously reported on the major grass pollen group 1 and 5 allergens. Our findings may have implications for the theoretical understanding of how specific glycans may affect antibody (IgE) reactivity to allergens, antigen uptake, and immune presentation.
By employing bottom-up analysis of trypsin-digested Phl p 1, the previously published number of structural variants for Phl p 1 was increased from three (21,24) to eight glycoforms (Table I). The N-terminal part of the mature Phl p 1 has previously been reported to contain the N-linked Pen 1 dHex 1 Hex 2-3 HexNAc 2 core structures with the additional presence of a single Hyp-linked furanosidic arabinose in its close vicinity (21,24). Here, we confirmed the presence of the Pen 1 dHex 1 Hex 2-3 HexNAc 2 core structures and further identified the presence of one to three Pen (supplemental Fig. S4) on Phl p 1. Barium hydroxide-treated Phl p 1 analyzed by LC-ESI-MS demonstrated the presence of Hyp substituted with one to three pentose residues (Hyp-Pen 1-3), with Hyp-Pen 1 and Hyp-Pen 3 being most abundant. The presence of Hyp substitutions with more than a single Pen on Phl p 1 has not previously been reported. Interestingly, it has earlier been demonstrated that a large proportion of patients with grass allergy have enhanced IgE reactivity to epitopes containing the carbohydrate structure and/or hydroxylation of the proline residues (42). It is therefore plausible that the two N-linked glycoforms (Pen 1 dHex 1 Hex 2-3 HexNAc 2) together with the heterogeneous Hyp-linked arabinosylations contribute to the reported high IgE binding and cross-reactivity potential of Phl p 1.
The second grass pollen allergen that we examined, Phl p 5, does not have an N-glycosylation consensus motif and is not reported to carry any N-linked glycans. Here we report the presence of 0-7 Hyp residues and up to seven Pen residues on a defined N-terminal region of Phl p 5 (supplemental Table S1, supplemental Fig. S7). Barium hydroxide-treated Phl p 5 analyzed by LC-ESI-MS revealed the presence of Hyp residues substituted with one to three Pen residues, with Hyp-Pen 3 being most abundant. This is the first report of Hyp formation and Hyp substitutions with confirmed Pen 1-3 glycans, which may confer significant immunogenic properties to this major allergen. In a recent study, deletion or substitution of selected proline residues in E. coli-expressed Phl p 5 yielded a hypoallergenic variant that displayed highly diminished IgE binding while maintaining T-cell reactivity, suggesting involvement of prolines and probably hydroxyprolines in IgE binding to Phl p 5 (43).
N-linked heterogeneous glycosylation of the major allergens of Dermatophagoides pteronyssinus HDM 1 and 2 (Der p 1 and Der p 2) has been demonstrated to influence structure, processing, function, and immunogenicity of these allergens (44-46), although the precise glycan structures remain unresolved (41,47). With bottom-up mass spectrometry, we demonstrated that the natural Der p 1 is N-glycosylated with one or two HexNAc residues at Asn 150 (supplemental Fig. S11). A similar glycosylation was observed for the natural Der f 1 allergen (supplemental Fig. S12). Glycans elongated beyond the HexNAc 2 structure were not observed for the natural Der p 1 or Der f 1 allergens. This is, to our knowledge, the first report of confirmed N-linked truncated HexNAc glycans in mite group 1 allergens.
In both Der p 2 and Der f 2, top down and bottom up FTMS revealed hexose modifications at multiple sites. The combination of top down and bottom up mass spectrometry was of central importance for the successful identification of the hexose modifications for HDM allergens. Initial profiling of total molecular weight using low resolution MALDI-TOF mass spectrometry hinted toward potential hexose modifications, which were verified only when high resolution FTMS was employed followed by top down characterization utilizing collisional activation at the MS2 level. The main advantage of the FTMS top down approach was thus to provide a reference point from which bottom up experiments and, perhaps more importantly, data mining and analysis could be based on. Once the hexose modifications were identified, bottom up FTMS analysis of proteolytic digest was employed not only for detailed mapping of the site-specific locations but also for characterization of glycan heterogeneities. In other words, to achieve complete characterization it was important to first understand the total molecular composition (top down) followed by detailed analysis of PTMs at the peptide level (bottom up).
Initially, the CID- and HCD-induced loss of hexose units, a typical feature of labile O-linked glycans, misled us into believing that the mono- and polysaccharide glycans of Der p 2/f 2 were O-linked to Ser, Thr and, for the C 95 PLVKGQQYDIK 106 peptide (supplemental Table S1), also to Tyr residues (supplemental Fig. S15A). Surprisingly, reductive β-elimination, a procedure that specifically releases O-linked sugars from the peptide backbone, failed to liberate the expected glycans of Der p 2 and Der f 2 allergens. The alkaline-stable nature of these PTMs thus indicated that the polyhexose glycans of these allergens are linked by means other than the classical O-linkage to Ser/Thr residues. In addition, the Hex-modified G 49 KPFQLEALFEANQNSKTAK 68 peptide of one Der p 2 isoform (Uniprot accession I2CMD6) was identified with an almost complete series of c- and z-ions, mapping the modification to S 64 (supplemental Fig. S16A). However, one of the few ions missing in the series was the c16/z4 ion pair that separates S 64 and K 65. Notably, both the C 95 PLVKGQQYDIK 106 and G 49 KPFQLEALFEANQNSKTAK 68 peptides contained a missed trypsin cleavage site at a lysine residue. Taken together, these results raised suspicions that lysines, and not Ser/Thr/Tyr, might be the modified residues. Attachment of sugars at lysine residues would account for the missed tryptic cleavages and the alkaline-stable nature of the modifications. We therefore re-analyzed our data by allowing variable modifications also at lysine residues. This approach resulted in the identification of all but one of the c- and z-ions between K 99 and Y 103 of the C 95 PLVKGQQYDIK 106 sequence, thereby leading to the unambiguous identification of K 99 as the modified residue (supplemental Fig. S15B). For the G 49 KPFQLEALFEANQNSKTAK 68 peptide, the c16 ion was now also identified, which mapped the Hex modification to K 65 (supplemental Fig. S16B).
The attachment of Hex to lysine, which occurs through N-linkage, also explains the β-elimination-resistant properties of these modifications. In an attempt to identify the monosaccharide identities of the attached Hex residues, GC-MS was employed but did not reveal the epimeric configurations of the attached sugars. The Hex 1 modification was observed on ~30% (molar ratio) of Der p 2 but, because of the N-linkage, this lysine-linked monosaccharide is not expected to be released by acid methanolysis (48). Although significant amounts of Der p 2 are modified by this initial Hex, this allergen pool does not contribute any free Hex residues to the final sample subjected to GC-MS analysis. The O-glycosidic linkages connecting individual monosaccharides in the polyhexose sequence are cleaved by methanolic HCl, but the Hex 2-9 polyhexose sequence is ≤50 times less abundant relative to the Hex 1 form. In other words, only 5 ng or less Hex is expected to be released (from 50 µg Der p 2) for TMS derivatization and GC-MS analysis. The same principle applies to the Der p 1 and Der f 1 allergens; that is, the N-linkage of HexNAc at asparagine is stable to methanolysis and ≤15 ng HexNAc is expected to be released for TMS derivatization. These amounts are not sufficient for successful GC-MS analysis.
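The order of magnitude of the expected Hex release can be reproduced with a rough estimate. The sketch below is not the authors' calculation: it assumes that the starting amount is 50 micrograms of Der p 2, that the protein mass is the theoretical 14,121 Da quoted in the Results, that the polyhexose forms are 50-fold less abundant than the Hex 1 form, and that one O-linked Hex (180.156 g/mol as free hexose) is released per polyhexose chain. The function name and its parameters are hypothetical.

```python
DER_P_2_MW = 14121.0  # g/mol, theoretical Der p 2 mass quoted in the Results
HEXOSE_MW = 180.156   # g/mol, free hexose C6H12O6 (assumed reporting form)


def released_hex_ng(protein_ug: float,
                    hex1_fraction: float,
                    fold_less_abundant: float,
                    hex_released_per_chain: int = 1) -> float:
    """Rough mass (ng) of Hex liberated by methanolysis of polyhexose chains.

    Lysine-linked (N-linked) Hex is assumed to stay on the protein; only
    O-glycosidically linked Hex in the Hex 2-9 chains counts as released.
    """
    protein_mol = protein_ug * 1e-6 / DER_P_2_MW
    polyhexose_mol = protein_mol * hex1_fraction / fold_less_abundant
    return polyhexose_mol * hex_released_per_chain * HEXOSE_MW * 1e9


# 50 micrograms protein, ~30% carrying Hex 1, polyhexose forms 50-fold rarer.
print(round(released_hex_ng(50, 0.30, 50), 1))
```

With these assumptions the estimate is a few nanograms, which is compatible with the stated bound of 5 ng or less; a different number of released residues per chain would scale the result proportionally.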
Given that the relative ion abundances are not reliable for determining the stoichiometry of these modifications, we also designed an assay for absolute quantification of carbohydrate content in Der p 2 and Der f 2 allergens. Periodate oxidation of cis-diols is a well-known procedure for introduction of reactive aldehyde functionalities specifically on glycans (49). This property may thus be exploited to tag oxidized glycans by hydrazide-containing molecules of choice, e.g. biotin-hydrazide, which could then be used in conventional ELISA format to quantify the carbohydrate content of specific glycoproteins. Using this approach, we were able to conclude that 30 and 40% (molar ratio) of Der p 2 and Der f 2, respectively, are modified by glycans, thereby confirming that a significant amount of these allergens are subjected to Hex modifications. Of note, the ELISA values are in agreement with the ion abundances observed for modified Der p 2 and Der f 2 as measured by MALDI-TOF and top down ESI-MS (Fig. 2, supplemental Fig. S3).
Although an enzymatic mechanism for the attachment of Hex to lysine residues in HDM allergens cannot be completely excluded, it is likely that the modifications are a result of nonenzymatic glycation reactions. We do not believe that these modifications are introduced during purification procedures because all purification steps of HDM allergens were performed at 4-12°C using buffers that did not contain free-end mono- or polysaccharides. The results therefore strongly suggest that these glycans on the allergens originate from the mites. HDM are believed to have a repertoire of up to 30 distinct allergens (50,51) and it is likely that the identified glycans are present on many other allergens from mites. Glycations have previously been described on peanut allergens Ara h 1 and Ara h 3 as well as on ovalbumin (52,53). The modification has been implicated in interactions with APCs and might have caused increased immunogenicity (52,53). Furthermore, the receptor for advanced glycation end products (RAGE) has been shown to be an important mediator in allergic airway sensitization after HDM challenge (54), although the exact mechanisms and role of glycation in immune recognition and response are still poorly understood.
Glycosylation on allergens may play other roles in allergy than simply presenting antigen epitopes. APCs are equipped with an array of glycan receptors, and targeting such receptors has been shown to induce tolerance (15, 55-62). The idea that carbohydrates are able to boost and shape the immune response has been tested previously in numerous studies, systematically reviewed in (63). In a food allergy model it has, for example, been shown that oral delivery of bovine serum albumin (BSA) carrying mannoside residues significantly reduces the BSA-induced anaphylactic response in the prophylactic setting (64). Furthermore, in the study by Thunberg et al., the allergen rFel d 1 bound to carbohydrate-based particles promoted uptake, induced strong antibody and cytokine responses, and prevented induction of allergic immune responses (65). In addition, a tolerogenic activity has been shown for the carbohydrate receptor DC-SIGN in humans (66,67). Taken together, these and other studies support the importance of the CLR-ligand axis in the development of targeted therapies aimed at inducing immune modulation (12,68). Thus, it will be interesting to test the potential immunogenic properties of the individual PTMs and their potential influence on the allergenicity of major allergens. Furthermore, the detailed characterization of glycans could also be important for the development of well-defined allergens for diagnostic purposes (8-11).
In conclusion, our findings demonstrate that most major allergens harbor extensive and diverse glycans, and that detailed analysis of individual allergens is necessary to obtain full characterization. The presence of glycans on allergens may have profound importance for the immunogenicity of allergens as well as their presentation to the immune system. The detailed characterization presented here now opens the way for direct analysis of the implications of these glycans for designing allergens for future vaccine formulations and diagnostic purposes.