Diversity in the Protein N-Glycosylation Pathways Within the Campylobacter Genus

The foodborne bacterial pathogen Campylobacter jejuni possesses an N-linked protein glycosylation (pgl) pathway that adds conserved heptasaccharides to asparagine-containing motifs of >60 proteins and releases the same glycan into its periplasm as free oligosaccharides. In this study, comparative genomics of all 30 fully sequenced Campylobacter taxa revealed conserved pgl gene clusters in all but one species. Structural, phylogenetic and immunological studies showed that the N-glycosylation systems can be divided into two major groups. Group I includes all thermotolerant taxa, which are capable of growth at the higher body temperatures of birds and produce the C. jejuni-like glycans. Within group I, the niche-adapted C. lari subgroup contains the smallest genomes among the epsilonproteobacteria and is unable to glucosylate its pgl pathway glycans, potentially reminiscent of the glucosyltransferase regression observed in the O-glycosylation system of Neisseria species. The nonthermotolerant campylobacters, which inhabit a variety of hosts and niches, comprise group II and produce an unexpected diversity of N-glycan structures varying in length and composition. This includes the human gut commensal C. hominis, which produces at least four different N-glycan structures, akin to the surface carbohydrate diversity observed in the well-studied commensal Bacteroides. Both group I and group II glycans are immunogenic and cell surface exposed, making these structures attractive targets for vaccine design and diagnostics.

In eukaryotes, glycosylated proteins are ubiquitous components of extracellular matrices and cellular surfaces. Their oligosaccharide moieties are implicated in a wide variety of essential cell-cell and cell-matrix processes ranging from immune recognition to cancer development. The first general protein glycosylation (pgl) pathway was discovered in the epsilonproteobacterium Campylobacter jejuni (1). The organism transfers a conserved heptasaccharide en bloc to asparagine residues within the sequon D/E-X1-N-X2-S/T (X1, X2 ≠ P) of >60 glycoproteins (2-4). Furthermore, the pathway can be functionally transferred into Escherichia coli, and the oligosaccharyltransferase (OTase), PglB, is capable of adding foreign sugars to acceptor proteins (5-7). C. jejuni PglB also possesses hydrolase activity, influenced by the cellular growth phase and osmotic environment, releasing free oligosaccharides (fOS) into the periplasmic space in a 10:1 ratio relative to the amount of heptasaccharide N-linked to protein (8,9).
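As a minimal sketch, the sequon stated above can be written as a regular expression and used to locate candidate acceptor asparagines in a peptide. The function name `find_sequons` and its `first_residue` parameter are illustrative assumptions and not part of the methods used in this study; the two test peptides are glycopeptides reported later in the text.

```python
import re

# D/E-X1-N-X2-S/T, where X1 and X2 are any residue except proline.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")

def find_sequons(peptide: str, first_residue: int = 1) -> list[tuple[int, str]]:
    """Return (position of the Asn, matched five-residue sequon) pairs.

    first_residue is the protein numbering of the first residue of the
    peptide, so positions are reported in protein coordinates.
    """
    return [
        (first_residue + m.start() + 2, m.group(1))
        for m in SEQUON.finditer(peptide.upper())
    ]

if __name__ == "__main__":
    print(find_sequons("KAEENLTK", first_residue=438))
    print(find_sequons("AAVDANASGSEK", first_residue=58))
```

A match only marks a potential site; whether the asparagine is actually occupied has to be shown experimentally, as is done by glycopeptide enrichment and MS/MS in this work.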
In this study, we used phylogenetic, immunological, structural and glycoproteomic analyses to compare the N-glycosylation systems of 29 Campylobacter species and identified unexpected variations. Thus, although the pathway is a common feature within this genus, variability in the N-glycans and fOS at the species level suggests that each species possesses a unique array of glycosyltransferases, which correlates with its phylogenetic relatedness.

EXPERIMENTAL PROCEDURES
Plasmids, Bacterial Strains, and Growth Conditions-All bacterial strains and plasmids used in this study are listed in Table I. Campylobacter strains were grown on Mueller Hinton or Brain Heart Infusion supplemented with 5% horse blood under microaerobic (85% N2, 10% CO2, and 5% O2) conditions. Alternatively, anaerobic strains were grown for 18-72 h at 37°C in the following medium: 30 g/l Tryptic Soy Broth, 2 g/l yeast extract, 3.0 g/l Bacto-Peptone, 2.0 g/l NaCl, 3.0 g/l sodium formate, 4.0 g/l formic acid, 4.0 g/l asparagine monohydrate, 1.0 g/l cysteine hydrochloride, pH 7.8; supplemented with 5% (v/v) horse blood. For large scale anaerobic cultures, bacteria were grown in anaerobic medium (without horse blood) at 37°C in ventilated 1 liter bottles in anaerobic jars using the AnaeroGen gas pack system (Oxoid). E. coli strains were routinely grown in 2× YT medium at 37°C. Growth media were supplemented with chloramphenicol (Cm, 25 µg/ml) and/or kanamycin (Kan, 30 µg/ml), where appropriate.
Phylogenetic Analysis-The AtpA (ATP synthase α subunit, housekeeping gene) protein sequence was used as the phylogenetic marker for trees constructed with the neighbor-joining algorithm of the MEGA3.1 software package (20), using multiple alignment parameters of a gap opening penalty of 8, a gap extension penalty of 2, and the PAM protein weight matrix. To identify putative pgl genes, the predicted proteomes of 29 Campylobacter genomes, representing all validly described taxa, were compared with the predicted proteome of C. jejuni strain NCTC 11168 using BLASTP. Proteins demonstrating >35% amino acid identity to Pgl proteins from 11168 were scored as positive matches. Putative gene products within the 29 proposed pgl loci were identified by BLASTP comparison to proteins present within (21) and the CAZy database (22) to identify additional putative glycosyltransferases.
fOS Purification and NMR Analysis-Campylobacter cells from 6 liter cultures were harvested (10 min, 10,000 × g, 4°C) and digested with proteinase K at pH 8 at 37°C for 48 h. Digests were separated on a Sephadex G-15 column (1.5 × 60 cm). Fractions were dried, resuspended in D2O and analyzed by 1H NMR. Fractions containing fOS were separated by anion exchange chromatography on a HiTrap Q column (5 ml size, Amersham Biosciences). Glycans were eluted with a linear gradient of NaCl (0-1 M, 1 h). Desalting was performed on Sephadex G-15 prior to glycan analysis on a Varian INOVA 500 MHz (1H) NMR spectrometer with a 3 mm gradient probe at 25°C with acetone as the internal reference (2.225 ppm for 1H, 31.45 ppm for 13C) using standard pulse sequences: DQCOSY, TOCSY (mixing time 120 ms), ROESY (mixing time 500 ms), HSQC and HMBC (100 ms long range transfer delay). Acquisition (AQ) time was kept at 0.8-1 s for H-H correlations and 0.25 s for HSQC; 256 increments were acquired for t1. Sugars and sugar linkages were identified by characteristic coupling constants and chemical shift patterns. For fOS of C. concisus, the absolute configuration of the glucolactilic acid carboxyethyl group was determined as described (23).
HPAEC-PAD-Cff fOS were separated by high performance anion exchange chromatography with pulsed amperometric detection (HPAEC-PAD) using a semipreparative CarboPac® PA100 column (9 × 250 mm, equipped with a guard column, 3 × 50 mm) and a fraction collector (DIONEX UltiMate 3000) under the following conditions: flow rate, 0.5 ml/min; eluent system, 50 mM sodium acetate in 100 mM sodium hydroxide; detection mode, pulsed amperometry, quadruple waveform, Au electrode, AgCl reference electrode; column temperature, ~30°C. Elution fractions containing the single fOS forms were neutralized with equimolar amounts of 0.2 M HCl, and 1/10 of each elution fraction was applied to an analytical column CarboPac® PA200 (3 × 250 mm CarboPac PA100 equipped with a guard column, 3 × 50 mm) under the same conditions to confirm proper separation of individual fOS forms. Individual fOS-containing fractions were combined, lyophilized and stored at −20°C until further use.
fOS Analysis Using Mass Spectrometry-fOS were identified and analyzed as described (8,24), with the difference that precursor ion scanning was carried out for the detection of the diagnostic oxonium ion fragments at m/z 204 (for HexNAc) and m/z 407 (for (HexNAc)2).
Generation of fOS-BSA Glycoconjugates-Individual Cff fOS preparations were conjugated to BSA by reductive amination as described (26). Formation of BSA-glycoconjugates was confirmed by Western blotting with alkaline phosphatase (AP)-conjugated WGA.
Generation of Group I and Group II C. fetus Subgroup Glycan-specific Antibodies-New Zealand White rabbits were immunized with individual group II C. fetus subgroup fOS-BSA conjugates using a 10 week immunization protocol (approved Animal Care Committee protocol No. 717). Blood sera were analyzed for the production of pgl glycan-specific antibodies by Western blotting against whole cell lysates from various Campylobacter species. The antiserum against HexNAc-[Hex]-HexNAc3-diNAcBac-BSA was named GRPII-1, and the antiserum against HexNAc-[HexNAc]-HexNAc3-diNAcBac-BSA was named GRPII-2. To generate group I N-glycan-specific (R1) antiserum, rabbits were immunized with formaldehyde-fixed (according to (27)) cells of E. coli K12 wzy::Kan (pACYC(pglmut)) expressing the C. jejuni N-glycan on their surface (17).
Digestion of Whole-cell Lysates for ZIC-HILIC Enrichment-One mg of protein, determined using the Qubit™ kit (Invitrogen, Carlsbad, CA), was resuspended in 6 M urea, 2 M thiourea, and 40 mM NH4HCO3. The solubilized sample was then reduced for 1 h with 10 mM dithiothreitol, subsequently alkylated using 20 mM iodoacetamide for 1 h in the dark, and quenched using 10 mM dithiothreitol. Then, 1/200 (w/w) of endoproteinase Lys-C (Sigma-Aldrich) was added and digestion was allowed to proceed for 4 h at 25°C. Samples were diluted 1:4 with 100 mM NH4HCO3 and digested with 1/50 (w/w) of porcine sequencing grade trypsin (Promega, Madison, WI) for 18 h at 25°C. Samples were then dialyzed against ultra-pure water overnight using a Mini Dialysis Kit with a molecular mass cut off of 1000 Da (GE Healthcare) and lyophilized.
Enrichment of Glycopeptides by Zwitterionic Hydrophilic Interaction Chromatography (ZIC-HILIC)-ZIC-HILIC enrichment was performed according to (4) with minor modifications. Micro-columns composed of 10 µm ZIC-HILIC resin (Sequant, Umeå, Sweden) packed into p10 tips containing a 1 mm2 excised C8 Empore™ disc (Sigma) were packed to a bed length of 0.5 cm. Before use, the columns were washed with ultra-pure water, followed by 95% acetonitrile (ACN), and then equilibrated with 80% ACN and 5% formic acid (FA). Samples were resuspended in 80% ACN, 5% FA, and insoluble material was removed by centrifugation at 20,000 × g for 5 min at 4°C. Samples were adjusted to a concentration of 2 µg/µl, and 100 µg of peptide material was loaded onto a column and washed with 10 load volumes of 80% ACN, 5% FA. Peptides were eluted with three load volumes of ultra-pure water into low-bind tubes and concentrated using vacuum centrifugation. Triplicate enrichments were generated for each glycopeptide analysis.
Identification of Glycopeptides Using Reversed Phase LC-MS and HCD MS/MS-ZIC-HILIC enriched glycopeptide samples were resuspended in 0.1% FA and loaded onto an Acclaim PepMap 100 µm C18 Nano-Trap Column (Dionex Corporation, Sunnyvale, CA) for 10 min using an UltiMate 3000 intelligent LC system (Dionex Corporation). Peptides were eluted and separated on a 20 cm, 100 µm id, ReproSil-Pur C18 AQ 3 µm column using a 120 min gradient in which the proportions of phase A (0.1% FA) and phase B (0.1% FA, 80% ACN) were altered from 100% phase A to 40% phase B at 200 nl/min into an LTQ-Orbitrap Velos mass spectrometer (Thermo Scientific, San Jose, CA). The LTQ-Orbitrap Velos was operated using Xcalibur v2.2 with a capillary temperature of 275°C in a data-dependent mode, automatically switching between MS and MS/MS with CID followed by HCD. For each MS scan, the three most abundant precursor ions were selected for fragmentation with CID (normalized collision energy 35, activation time 30 ms) followed by HCD (normalized collision energy 45, activation time 30 ms) to obtain MS/MS spectra. MS resolution was set to 60,000 with an AGC of 1e6, maximum fill time of 500 ms and a mass window of 600 to 2000 m/z. CID fragmentation was carried out with an AGC of 2e4 and maximum fill time of 100 ms, whereas HCD fragmentation was carried out with an AGC of 2e5, maximum fill time of 500 ms, resolution of 7500 and mass window of 170 to 2000 m/z.
Glycopeptide Data Processing-The .raw files were processed within Proteome Discoverer version 1.0 Build 43 (Thermo Scientific) to generate mgf files and searched using Sequest against the specific strain FASTA database (obtained from NCBI on 08/06/2011). Scan events that did not result in peptide identification from Sequest searches were exported to Excel (Microsoft, Redmond, WA). To identify possible glycopeptides within this list, a new feature within the MS2 module of GPMAW 8.2 called "mgf graph" was used, which identified all scan events within the generated mgf files containing the diagnostic oxonium ion at m/z 204.086. Using Excel, all scan events that were not matched by Sequest and contained a predicted marker of glycosylation were identified. These events were manually inspected and identified as possible glycopeptides based on the presence of the glycan fragment within the CID scan. To facilitate glycopeptide assignments from HCD scans, the ions below the mass of the predicted deglycosylated peptides were extracted with Xcalibur v2.2 using the Spectrum list function. Ions with a deconvoluted mass above the deglycosylated peptide and ions corresponding to known carbohydrate oxoniums were removed in a similar approach to post-spectral processing of ETD data (28,29). MASCOT v2.2 searches were conducted via the Australasian Proteomics Computational Facility (www.apcf.edu.au) against the proteobacteria taxonomy of the LudwigNR database (database release LudwigNR_Q410, composed of 13112897 entries available via ftp://ftp.ch.embnet.org/pub/databases/nr_prot/). Searches were carried out with a parent ion mass accuracy of 20 ppm and a product ion accuracy of 0.02 Da with no protease specificity, as well as the fixed modification carbamidomethyl (C) and the variable modifications oxidation (M) and deamidation (N).
The instrument setting of MALDI-QIT-TOF was chosen because of previous studies showing quadrupole-like fragmentation within HCD spectra (30) (generating a, b, and y ions) and our observation of internal cleavage products that are all included in this setting. All spectra were searched with the decoy option enabled and no matches to this database were detected (FDR 0%). To further validate glycopeptide matches, all spectra were manually annotated to ensure all major peaks were matched, providing further confidence of identity and localization. All annotated spectra are provided in supplemental MS-Figures.
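The screening of mgf files for the diagnostic HexNAc oxonium ion can be illustrated with a short script. This is only a sketch of the idea and not the GPMAW "mgf graph" feature used here: the function name `scans_with_marker` is hypothetical, and the 0.02 Da tolerance is an assumption borrowed from the product ion accuracy of the MASCOT searches.

```python
def scans_with_marker(mgf_text: str, marker_mz: float = 204.086,
                      tol: float = 0.02) -> list[str]:
    """Return the TITLE of every MGF spectrum with a peak near marker_mz.

    mgf_text is the content of an MGF file; tol is the allowed absolute
    deviation in m/z units (assumed value, see lead-in).
    """
    hits: list[str] = []
    title, found, in_ions = "", False, False
    for raw in mgf_text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            title, found, in_ions = "", False, True
        elif line == "END IONS":
            if in_ions and found:
                hits.append(title)
            in_ions = False
        elif in_ions and line.startswith("TITLE="):
            title = line[len("TITLE="):]
        elif in_ions and line and line[0].isdigit():
            # Peak lines are "m/z intensity"; header lines start with a letter.
            mz = float(line.split()[0])
            if abs(mz - marker_mz) <= tol:
                found = True
    return hits
```

Spectra returned by such a filter are only candidates; as described above, each one still requires manual inspection of the CID and HCD scans.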

RESULTS
Campylobacters Divide into Two Major Groups Based on Phylogeny and Antigenicity-All Campylobacter species, except C. canadensis, possess a complete repertoire of pgl genes (Fig. 1). Phylogenetic comparisons showed that campylobacters divide into two major groups that comprise the thermotolerant (group I) and the nonthermotolerant species (group II), with C. lari and related species forming a subgroup (the C. lari subgroup) within group I (Fig. 1). When whole cell lysates of all 29 Campylobacter pgl-containing species were screened with C. jejuni N-glycan specific antisera, all group I organisms were recognized (Figs. 2A, 2B). No reactivity against proteins from the group II species was observed, suggesting that these organisms are either incapable of synthesizing N-glycans or produce different oligosaccharide structures compared with C. jejuni.
The Characterization of Group I Campylobacter fOS Structures-Because the C. jejuni fOS to N-glycan ratio is 10:1 under standard laboratory growth conditions (9), we screened species from each group for fOS by mass spectrometry using precursor ion scanning. In group I organisms, except the C. lari subgroup, a mass of m/z 1425.0 strongly suggested the presence of a heptasaccharide structure similar to that of C. jejuni (supplemental Fig. S1) (9). For fOS isolated from the C. lari subgroup, masses of m/z 1263.0, m/z 1244.6, and m/z 1343.0 were detected (supplemental Fig. S1).
Because multiple glycoforms were identified from the C. lari subsp. lari fOS analysis, we used NMR to determine the phosphate position on the fOS and investigate the authenticity of the nonphosphorylated form (Fig. 6, supplemental Table S2). For the C. lari subsp. lari fOS, the NMR H-1 signal of diNAcBac showed correlation to a phosphate signal at 1 ppm and had an additional coupling of 7 Hz because of phosphorylation. The C-1 signal was at 95.2 ppm and no beta anomer was present. Therefore the phosphorylation was at the O-1 of diNAcBac, and the linear [GalNAc]5-diNAcBac-P structure was confirmed for this species with no evidence of dephosphorylated glycoforms detected. As a control, structural analysis of fOS from C. jejuni (Fig. 6, supplemental Table S3) confirmed that the previously established N-linked glycan structure is identical to the fOS glycan.
The Characterization of Group II Campylobacter fOS Structures-As representatives of the group II species, we analyzed Cff, Cfv, C. hyointestinalis subsp. hyointestinalis (Chyo) and C. hyointestinalis subsp. lawsonii (further referred to as the C. fetus subgroup). Precursor ion scanning for the diagnostic oxonium ion m/z 204 (for HexNAc) identified two parent ions with m/z 1263.0 and m/z 1222.0 (supplemental Fig. S1). MS2 analyses of these ions for Cff identified the monosaccharide compositions and sequence of both fOS species as HexNAc-[Hex]-HexNAc3-diNAcBac and HexNAc-[HexNAc]-HexNAc3-diNAcBac (Figs. 3D, 3E). NMR analysis of the fOS of Cff, Cfv and Chyo confirmed that these species produce two oligosaccharides, as suggested by MS analysis (Fig. 6, supplemental Table S1). Within these glycans the three sugars at the reducing end were identical to those of C. jejuni whereas the three sugars at the nonreducing end were:
The presence of two oligosaccharide variants in the C. fetus subgroup raised the question of their expression and antigenicity. Quantitative and preparative HPAEC-PAD analyses on purified C. fetus subgroup II glycans (Fig. 4) not only allowed the separation of the two Cff-fOS species, but also demonstrated that the ratio of peak 1 (major)/peak 2 (minor) group II fOS was ~3.6:1 (Fig. 4). Immunofluorescence staining of intact cells of Cff, Cfv and Chyo with GRPII-1 and GRPII-2 antisera, individually raised in rabbits against each of the two structures, showed that both glycans are antigenic and expressed on the bacterial cell surface (Fig. 4F). In contrast, the GRPII antisera did not react with C. jejuni (group I) cells (Fig. 4F). Similarly, group II species could not be labeled with group I antiserum whereas C. jejuni cells showed strong reactivity.
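The compositions assigned to the fOS precursor ions can be cross-checked by summing residue masses. The sketch below uses standard monoisotopic residue masses, which are not values taken from this study, and assumes singly protonated free oligosaccharides with a free reducing end; `fos_mh` is a hypothetical helper name. Under these assumptions the calculated [M+H]+ values fall within about 0.5 m/z units of the reported precursor masses.

```python
# Monoisotopic residue masses in Da (standard values, assumed here).
RESIDUE = {"Hex": 162.05282, "HexNAc": 203.07937, "diNAcBac": 228.11101}
WATER = 18.010565   # added once for the free reducing-end oligosaccharide
PROTON = 1.007276   # charge carrier for [M+H]+

def fos_mh(composition: dict[str, int]) -> float:
    """Calculated [M+H]+ of a free oligosaccharide from its residue counts."""
    return sum(RESIDUE[r] * n for r, n in composition.items()) + WATER + PROTON

# Composition and reported precursor m/z for each fOS discussed in the text.
FOS = {
    "C. jejuni heptasaccharide": ({"HexNAc": 5, "Hex": 1, "diNAcBac": 1}, 1425.0),
    "HexNAc-[HexNAc]-HexNAc3-diNAcBac": ({"HexNAc": 5, "diNAcBac": 1}, 1263.0),
    "HexNAc-[Hex]-HexNAc3-diNAcBac": ({"HexNAc": 4, "Hex": 1, "diNAcBac": 1}, 1222.0),
}

if __name__ == "__main__":
    for name, (composition, reported) in FOS.items():
        print(f"{name}: calculated {fos_mh(composition):.4f}, reported {reported}")
```

The 41 Da spacing between the two C. fetus subgroup ions corresponds to the mass difference between a HexNAc and a Hex residue, which is consistent with the two branch variants.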
To confirm that these glycoforms are indeed N-linked to protein, Cff whole cell lysates were subjected to ZIC-HILIC glycopeptide enrichment and MS/MS analysis. A total of 65 unique glycopeptides were identified within this species (supplemental Table S7) with both glycans detected on protein substrates. Forty-six peptides contained the HexNAc-[HexNAc]-HexNAc3-diNAcBac glycan whereas 19 contained the HexNAc-[Hex]-HexNAc3-diNAcBac glycan, with 14 glycosylation sites confidently assigned independent of sequon alone (supplemental Table S9), including examples of both glycans occupying the same glycosylation site. The identification of glycopeptides containing both glycans, which co-eluted, enabled the comparison of the ratio of the HexNAc-[HexNAc]-HexNAc3-diNAcBac and HexNAc-[Hex]-HexNAc3-diNAcBac glycoforms of the 19 glycopeptide pairs detected (supplemental Table S8). An example of this is the glycopeptide 135DTNETELSAQK145 of the putative protein A0RNV0_CAMFF, which demonstrates a ratio of ~3.5:1 for the two glycoforms (Fig. 5) that is representative of the average, ~3.7:1, observed for all confirmed glycopeptide pairs (supplemental Table S8). This ratio is in good agreement with the observed fOS ratio.
Group II Campylobacter Species Show Extensive Diversity and Split into Eight Subgroups-The generation of C. fetus glycan-specific antibodies provided a rapid screening tool to investigate the attachment of similar glycans to protein substrates. Whole-cell lysates of all 29 Campylobacter species were probed with GRPII-1/GRPII-2 N-glycan specific antisera (Figs. 2C, 2D). Similar to the immunofluorescence results, proteins of group I species did not react, whereas proteins from C. fetus sp. and C. hyointestinalis sp. strongly reacted with GRPII sera. The GRPII sera showed variable reactivity with other group II Campylobacter species, which ranged from moderate with C. gracilis proteins, to weak with C. sputorum bv. sputorum and C. lanienae, to no reactivity against C. hominis, C. rectus, C. showae, C. mucosalis, C. concisus, C. curvus, C. ureolyticus, C. sputorum bv. fecalis and C. sputorum bv. paraureolyticus (Figs. 2C, 2D). Interestingly, Campylobacter species with a closer phylogenetic relationship demonstrated similar patterns of reactivity toward GRPII-1/GRPII-2 antisera, suggesting that group II may contain subgroups that differ in their glycan structures. Because of the observed variations in antibody reactivity, we analyzed the fOS composition of the remaining group II strains by MS and MS2. In the C. sputorum subgroup (including C. lanienae and Table S5). Interestingly, these fOS structures only differ by the configuration of the HexNAc branch. Although multiple attempts were made to analyze fOS from all representative subgroups, we were not able to obtain the amount of fOS that is required for NMR to elucidate the glycan structure of the remaining Campylobacter species.
Characterization of Group II Glycans by Glycopeptide Enrichment-Because we could not fully elucidate the fOS structure of certain Campylobacter group II species such as C. rectus and C. hominis, by MS or NMR, we employed ZIC-HILIC glycopeptide enrichment to further characterize these glycans and their composition. Enrichment of C. hominis glycopeptides confirmed that the fOS composition of  Fig. 8D). In total 18 unique glycopeptides corresponding to 10 unique peptide sequences and five localized glycosylation sites were identified within C. hominis (supplemental Table S10) with this being the first documented case of isobaric N-linked glycans within the Campylobacter genus.
Analysis of C. rectus glycopeptides confirmed the fOS mass of 1560.0 Da and also enabled the complete assignment of the oligosaccharide as Hex-HexNAc-[Hex-Hex]-HexNAc-217-HexNAc-diNAcBac (Fig. 9A). Within the generated HCD spectra, the unusual 217 Da carbohydrate moiety led to the formation of an oxonium ion of mass MH+ = 218.06398 that strongly suggested an elemental composition of C6H10O5N4 (−2.710 ppm), with the presence of the oxonium ion being diagnostic for C. rectus glycopeptides. A total of 25 unique glycopeptides were identified, enabling the localization of 11 glycosylation sites across multiple proteins (supplemental Table S11), such as the M23 peptidase domain protein, 438KAEENLTK445 (Fig. 9B). The C. rectus N-glycan represents the first example of an octasaccharide within the Campylobacter genus.
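The mass error quoted for the oxonium ion can be reproduced with a few lines of arithmetic. The sketch below assumes that the stated elemental composition is that of the singly charged ion itself, so that only one electron mass is subtracted; this is an interpretation of how the value was obtained rather than something stated in the text. The function names are hypothetical, and the atomic masses are standard monoisotopic values.

```python
# Standard monoisotopic atomic masses in Da (assumed, not from the paper).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858

def cation_mz(formula: dict[str, int]) -> float:
    """m/z of a singly charged cation whose own elemental formula is given."""
    return sum(MONO[element] * n for element, n in formula.items()) - ELECTRON

def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error of an observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

if __name__ == "__main__":
    theoretical = cation_mz({"C": 6, "H": 10, "O": 5, "N": 4})
    print(f"theoretical m/z {theoretical:.5f}")
    print(f"error {ppm_error(218.06398, theoretical):.2f} ppm")
```

With these inputs the error evaluates to about −2.71 ppm, matching the reported figure.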
Confirmation of a Noncarbohydrate Moiety in the C. gracilis Glycans by Glycopeptide Enrichment-Because our initial analysis of C. gracilis fOS led to the identification of a glycan composition of 123-HexNAc-[234]-HexNAc3-diNAcBac with the identity of the 123 Da moiety being unassigned, we further characterized this moiety and verified the attachment of this unusual glycan to protein substrates by ZIC-HILIC glycopeptide enrichment. Multiple glycoforms were identified in C. gracilis that were characterized by the presence or absence of the noncarbohydrate modification PEtN. In total, eight glycopeptides from seven glycoproteins were identified, enabling the localization of four glycosylation sites (supplemental Table S12). Within this dataset, two proteins and three glycopeptides were demonstrated to contain PEtN on the glycan (supplemental Table S12). Examples of the identified C. gracilis glycopeptides were the PEtN-HexNAc-[234]-HexNAc3-diNAcBac glycopeptide, 84VKIEANATK92 of preprotein translocase SecG subunit (Figs. 9C, 9D) and the HexNAc-[234]-HexNAc3-diNAcBac glycopeptide, 117TGADLNGTLAEQR129 of the periplasmic nitrate reductase (Figs. 9E, 9F). Within this dataset, the use of CID and HCD fragmentation led to the generation of internal PEtN fragments, including the loss of the diagnostic ethanolamine group at 43.04 Da, which aided in the assignment of this noncarbohydrate modification.
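The mass bookkeeping behind the PEtN assignment can be sketched as follows. The elemental formulas used (C2H6NO3P for the PEtN substituent, C2H5N for the ethanolamine-derived neutral loss, HPO3 for the phosphate left on the sugar) are assumptions chosen because they reproduce the 123 Da and 43.04 Da values given above and the 284.05 HexNAc-P ion noted in the legend of Fig. 9; they are not formulas stated in the text, and `mass` is a hypothetical helper.

```python
# Standard monoisotopic atomic masses in Da (assumed, not from the paper).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "P": 30.97376200}
HEXNAC_RESIDUE = 203.07937  # monoisotopic HexNAc residue mass
PROTON = 1.007276

def mass(formula: dict[str, int]) -> float:
    """Monoisotopic mass of a neutral elemental formula."""
    return sum(MONO[element] * n for element, n in formula.items())

PETN = mass({"C": 2, "H": 6, "N": 1, "O": 3, "P": 1})   # substituent on the glycan
ETHANOLAMINE_LOSS = mass({"C": 2, "H": 5, "N": 1})      # neutral loss
PHOSPHATE = PETN - ETHANOLAMINE_LOSS                    # HPO3 remaining on the sugar
HEXNAC_P_OXONIUM = HEXNAC_RESIDUE + PHOSPHATE + PROTON  # HexNAc-P fragment ion

if __name__ == "__main__":
    print(f"PEtN {PETN:.4f} Da, loss {ETHANOLAMINE_LOSS:.4f} Da, "
          f"HexNAc-P ion m/z {HEXNAC_P_OXONIUM:.4f}")
```

Under these assumptions, the 123 Da moiety, the 43.04 Da loss and the HexNAc-P fragment are mutually consistent, which is the logic used to support the PEtN assignment.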
The N-linked Glycan of C. lari subsp. lari Differs from Its fOS Counterpart-To determine whether the N-glycan of C. lari, like its fOS counterpart, is modified with a phosphate residue at the reducing end, ZIC-HILIC enrichment and a combination of MS HCD and ETD fragmentation were used to identify glycopeptides in C. lari subsp. lari (supplemental Table S13). Within the identified glycopeptides, such as 58AAVDANASGSEK69 from ubiquinol cytochrome c oxidoreductase, only [HexNAc]5-diNAcBac glycans were detected (Figs. 9G, 9H) and there was no evidence of phosphate-containing N-glycan, suggesting that the C. lari subsp. lari N-linked glycan is [HexNAc]5-diNAcBac.

DISCUSSION
Phylogenetic analysis coupled with antibody screening divided the Campylobacter genus into two major groups: group I includes all thermotolerant species capable of living at the high body temperatures found within avian hosts, whereas group II contains the nonthermotolerant campylobacters. Interestingly, a similar grouping was also observed when the sequence of the oligosaccharyltransferase PglB, necessary for N-glycosylation, was used as the phylogenetic marker (12). The C. jejuni heptasaccharide is conserved in all group I species (excluding the C. lari subgroup), and identical fOS and N-glycan structures are produced. The C. lari subgroup lacks the glucosyltransferase (PglI) responsible for the addition of the glucose branch (Fig. 1) and instead produces a hexasaccharide that is modified with a phosphate on the fOS, but not on the N-glycan. This is the only Campylobacter species that produces fOS structures that are different from the N-glycans. This linear hexasaccharide was also observed when the C. lari pgl cluster was transferred into the heterologous E. coli host (25).
Remarkably, although Neisseria species O-glycosylate their proteins, they too possess enzymes for the synthesis of diNAcBac at the reducing end of the O-glycan and have recently been described to display regression of their glucosyltransferase enzyme (32). In the eukaryotic N-glycosylation system, a single glucose is recognized by lectins involved in the quality control of glycoprotein folding leading to transport to the Golgi or retrotranslocation to the cytoplasm; whereas in C. jejuni, we have previously demonstrated that mutation of the glucosyltransferase does not have a phenotype (2). It is interesting to speculate that the general O-glycoprotein pathway of Neisseria and the N-glycoprotein pathway of Campylobacter are exhibiting parallel bacterial evolution. Also, the fact that C. lari has lost several key metabolic enzymes and likely relies on metabolites provided by its restricted marine  (33), yet has retained the pgl pathway, emphasizes the importance of this pathway for campylobacters.
We are currently trying to understand why the fOS in C. lari is phosphorylated. If C. lari fOS is involved in osmoregulation similar to C. jejuni (9), the negatively charged phosphate may assist in the retention of periplasmic fOS between the membranes, and similar modifications have been observed in other Gram-negative bacterial membrane-derived oligosaccharides (34). Alternatively, C. lari may have developed a more efficient method for recycling undecaprenylphosphate (UndP) by possessing an alternate enzyme capable of releasing P-fOS through cleavage of the pyrophosphate linkage of the lipid carrier (because the phosphate residue is alpha-linked to diNAcBac) or coupling UndP recycling with phosphorylation of PglB-released fOS.
Within group II, we identified several novel glycan structures. The C. fetus subgroup produces two different fOS and N-glycans that are both present in an ~4:1 ratio (with GlcNAc or Glc branches). The reason the GlcNAc-containing hexasaccharide is preferred over the Glc-hexasaccharide is unknown, but may be caused by transferase efficiencies or nucleotide-donor availabilities. In the C. fetus subgroup, the narrower host range (35) may have resulted in the formation of two glycan structures as a potential mechanism to evade the host immune system by modulating the amount of N-glycan populations on their surface. A similar mechanism was found for the general O-glycosylation pathway in Neisseria species that was speculated to be driven by selection imposed by host adaptive and innate immunity (32,36).

Figure legend fragment (beginning missing): increase in elution time associated with substitution of the 245 Da moiety for a HexNAc. Inserts show high mass accuracy MS spectra of each glycoform demonstrating an increase of ~42.02 Da associated with the β glycoform compared with the α glycoform, and the γ glycoform compared with the β glycoform. D, HCD fragmentation of the β glycoform containing glycopeptide 93EDQNKSEPFVPLVPDAK109 of the preprotein translocase, SecG subunit (UniProt accession number A7I0C3_CAMHC; Mascot ion score: 43).
Because both C. fetus subgroup glycans are immunogenic, and the generated antisera do not cross react with group I species, this indicates that the terminal sugars are the immunodominant N-glycan epitopes on the bacterial cell surface. The split between terminal GalNAc residues in group I and predominantly terminal GlcNAc residues in group II is quite striking (Fig. 1). This is particularly noteworthy because it has been demonstrated that the human C-type lectin, MGL, binds terminal GalNAc residues on the C. jejuni N-glycan (17), but an MGL homolog in chickens has not been identified. Instead, the only characterized C-type lectin isolated from chicken hepatocytes was shown to recognize glycans terminating in GlcNAc residues (37), which are not present in any of the thermotolerant campylobacters. We are examining the C-type lectin repertoire on chicken dendritic cells to characterize the immune recognition of the pgl-derived glycans and to better understand the vaccine potential of these structures (38,39).
Initial MS and MS2 fOS analysis of the C. sputorum biovars and C. lanienae identified multiple unique glycans. At the MS level, two glycans were observed, one with a mass identical to that observed for group I whereas the other corresponded to an additional Hex-HexNAc-[Hex]-HexNAc3-diNAcBac glycan. MS2 analysis of the identical group I mass fOS suggested further variation with potentially two glycans differing in the position of the hexose residue able to be assigned. Using NMR, only the identical group I mass fOS was observed, but interestingly corresponded to a single glycan, β-Glc-4-β-GlcNAc-3-[α-GlcNAc-6]-α-GlcNAc-4-α-GalNAc-4-α-GalNAc-3-α,β-diNAcBac for C. sputorum bv. sputorum and β-Glc-4-β-GlcNAc-3-[α-GalNAc-6]-α-GlcNAc-4-α-GalNAc-4-α-GalNAc-3-α,β-diNAcBac for C. sputorum bv. fecalis. These assignments suggest that the observed variation in the hexose branch in the initial MS2 analysis may be the result of terminal sugar re-arrangement during MS, as previously described (40). Although this suggests the identical group I mass glycan to be less variable than originally thought, the inability of NMR to detect the Hex-HexNAc-[Hex]-HexNAc3-diNAcBac glycan suggests low abundance glycan variability may still exist and is currently under investigation. Interestingly, the GlcNAc branch within the fOS of C. sputorum bv. sputorum, and likely in C. lanienae, resembles the configuration of the nonreducing end GlcNAc within the major form of the C. fetus subgroup fOS, and explains the weak reactivity with C. fetus subgroup-specific antisera for these species, which also primarily exist in farm animals (41,42).
Additional variations in fOS composition were observed in group II. One striking feature was the presence of sugars with masses of 217 Da, 234 Da and 245 Da. We identified the 234 Da component as glucolactilic acid, a sugar that was described as a component of the Shigella boydii type 17 (23) and E. coli O124 (43) lipopolysaccharides (LPS). However, the nature of the 217 Da and 245 Da sugars remains obscure. A 217 Da monosaccharide is present in the N-glycan of H. pullorum (44) and was identified within the capsule and LPS of Aeromonas salmonicida strain 80204-1 as 2-acetamido-2-deoxy-D-galacturonic acid (GalNAcA) (45). Furthermore, high mass accuracy measurements of the 217 oxonium ion derived from C. rectus (m/z 218.06398) and that of the 217 oxonium ion of H. pullorum glycopeptides (m/z 218.06503) (data not shown) confirm that these carbohydrates possess identical mass. This suggests the presence of acidic sugar residues in pgl glycans of specific group II species, which are, with the exception of C. hominis, predominantly found in the human oral cavity (18,46). The 245 Da component is unique to C. hominis, which is the only known human commensal among the taxa (47). C. hominis also exhibits the most extensive repertoire of N-glycan structures, with three glycans of differing mass detected, of which one corresponds to two isobaric glycan forms. The 245 Da component may, based on its mass measurements, correspond to an O-acetylated HexNAc. Currently it is unknown whether the observation of these multiple glycans provides insight into the true nonstoichiometric heterogeneity in the glycan or is a result of chemical loss of acetylation during glycopeptide preparation. It is interesting that commensal Bacteroides species also have an extensive repertoire of glycan structures on their surface that enable the organism to persist in the gut by modulating the host immune responses (48,49).
The reason why the group I species express a single, conserved N-glycan structure whereas group II species express a range of N-glycan structures will be exciting to explore.

FIG. 9. Novel N-linked glycopeptides in C. rectus, C. gracilis and C. lari subsp. lari. An N-linked glycopeptide from the M23 peptidase domain of the protein B9D2T1_WOLRE of C. rectus is shown. A, CID-based fragmentation for the assignment of Hex-HexNAc-[Hex-Hex]-HexNAc-217-HexNAc-diNAcBac from the C. rectus-derived glycopeptide. B, HCD fragmentation enabled the identification of the glycopeptide as 438 KAEENLTK 445 , Mascot ion score 46, showing a single possible glycosylation site, N442. PEtN-modified N-glycans and nonmodified N-glycans were identified in C. gracilis. The PEtN-modified glycopeptide, 84 VKIEANATK 92 , of the preprotein translocase SecG subunit was identified using (C) CID fragmentation and (D) HCD fragmentation. CID confirmed the presence of a hexasaccharide decorated at the terminal end by PEtN. HCD confirmed the peptide component as 84 VKIEANATK 92 , Mascot ion score 48, with further support for the assignment of the PEtN modification provided by the presence of the 284.05 ion corresponding to a HexNAc-P generated by the loss of ethanolamine. Conversely, the nonmodified glycopeptide 117 TGADLNGTLAEQR 129 , Mascot ion score 49, of the periplasmic nitrate reductase was identified with (E) CID and (F) HCD but lacked the terminal PEtN modification of the glycan. An N-linked glycopeptide from the ubiquinol cytochrome c oxidoreductase protein (B9KCX8_CAMLR) of C. lari subsp. lari is also shown. G, CID-based fragmentation enabled the assignment of HexNAc5-diNAcBac within the C. lari subsp. lari-derived glycopeptide. H, ETD enabled both the identification of the glycopeptide, 58 AAVDANASGSEK 69 , Mascot ion score 21, and the confirmation of the site of attachment, N63; * = neutral loss ions.
The terminal phosphoethanolamine (PEtN) modification identified on the C. gracilis fOS and glycoproteins was also unique and requires further study. PEtN modifications present on the lipid A and FlgG protein of C. jejuni 81-176 are important for resistance against polymyxin B and for motility (50). In other bacteria, PEtN modification of lipid A or LOS core sugars acts as a mechanism to increase antibiotic resistance (51,52). Interestingly, isolates of C. gracilis show striking resistance against penicillin and various cephalosporins compared with other Campylobacter species (53). A notable feature of the C. gracilis fOS analysis by MS was the dominance of B-ions, in contrast to the Y-ions that dominate the analysis of other fOS structures. This shift in the dominant ion series is attributable to proton sequestration by the basic ethanolamine group, akin to the shift seen in LysN peptides or when C-terminal chemical derivatization is used to increase basicity (54,55). This observation further supports the assignment of the PEtN residue.
This work provides a comprehensive analysis of the N-glycosylation pathways of all known Campylobacter taxa through genome analysis, phylogenetic comparisons, antibody screening, NMR structural characterization and MS-based direct examination of both the N-glycosylated proteins and free oligosaccharides. These studies enabled us to confirm that this pathway is functional in all but one of these species and expand on recent studies examining N-glycosylation in H. pullorum and a subset of Campylobacter species examined in vitro (25,44,56). There are some discrepancies between the structures observed in our study and those described by Jervis et al. (56). The difference in the N-glycan structures in C. concisus might be a result of intraspecies diversity at the genome level (57), whereas the differences for C. lanienae are likely because of the methods used for analysis. In the latter study, the authors elucidated the N-glycan structures by glycosylating biotin-tagged peptides with membranes from the selected Campylobacter species, but obtained yields too low to generate reliable mass spectrometry data and therefore supplemented the reactions with the heterologous OTase from C. jejuni (56). In the case of C. fetus subsp. fetus, Jervis et al. observed one rather than the two glycan structures that we observed by NMR and MS at both the glycopeptide and fOS levels, but again this is likely because the system was not analyzed directly in the host (56).
Our studies suggest that protein N-glycosylation plays a fundamental role in Campylobacter biology, given the observed evolutionary pressure to conserve this pathway and the demonstration that glycosylation of specific proteins appears to be conserved across multiple species. One example is the general secretory pathway protein, SecG, which we identified as glycosylated in both C. hominis and C. gracilis and which was previously shown to be glycosylated in C. jejuni (58). Comparative studies with C. canadensis, the only Campylobacter species lacking this pathway, may help to elucidate the key function of protein glycosylation in campylobacters. The diversity in glycan structures among the species parallels the phylogenetic relatedness of the organisms and could potentially be associated with the natural hosts and niches that each species inhabits. Because the nonreducing sugar residues have been shown to be the determinants for adaptive and innate immune detection, host recognition mechanisms may in part be responsible for the pressures influencing N-glycan diversity within this genus, and this diversity could be exploited to generate novel glycoconjugate vaccines and diagnostics.