Type IX Secretion System Cargo Proteins Are Glycosylated at the C Terminus with a Novel Linking Sugar of the Wbp/Vim Pathway

Porphyromonas gingivalis and Tannerella forsythia, two pathogens associated with severe gum disease, use the type IX secretion system (T9SS) to secrete and attach toxic arrays of virulence factor proteins to their cell surfaces. The proteins are tethered to the outer membrane via glycolipid anchors that have remained unidentified for more than 2 decades. In this study, the first sugar molecules (linking sugars) in these anchors are identified and found to be novel compounds. The novel biosynthetic pathway of these linking sugars is also elucidated. A diverse range of bacteria that do not have the T9SS were found to have the genes for this pathway, suggesting that they may synthesize similar linking sugars for utilization in different systems. Since the cell surface attachment of virulence factors is essential for virulence, these findings reveal new targets for the development of novel therapies.

with this, PorU has been implicated as a novel Gram-negative sortase that cleaves the T9SS signal at a moderately conserved site and conjugates the new C terminus to A-LPS (35,36). Mutant strains lacking genes that are essential for either A-LPS biosynthesis or the T9SS are nonpigmented. In P. gingivalis, mass spectrometry (MS) analyses of the A-LPS-modified proteins after deglycosylation with trifluoromethanesulfonic acid (TFMS) demonstrated that the mature C terminus of each protein was linked to an A-LPS fragment of 648 Da comprised of three units with masses of 104 Da, 198 Da, and 346 Da (36). The 104-Da unit was shown to be linked to the protein via a peptide bond.
In this study, we conducted extensive MS analyses of these LPS fragments isolated from both P. gingivalis and T. forsythia and report the putative structures of the linking sugars and their biosynthesis via the novel Wbp/Vim pathway.

RESULTS
In this study, we present detailed MS analyses of the LPS fragments isolated from modified cargo proteins of P. gingivalis and T. forsythia. The proposed structures of these fragments combined with the structure of their major TFMS-cleaved products are shown in Fig. 1. The elucidation of these structures is described below.
Determining the accurate mass and molecular formula. Previously, P. gingivalis T9SS cargo proteins were found to be modified at their matured C termini with A-LPS. A fragment of A-LPS that remained bonded to these proteins after deglycosylation with TFMS was shown to be composed of three components, I, II, and III, with masses of 104, 198, and 346 Da, respectively, with component I suggested to be a serinamide that was amide linked to the new C terminus of the cargo proteins (36). In this study, the A-LPS fragment was released by proteinase K cleavage; detailed MSn analysis was conducted using a linear ion trap quadrupole Fourier-transform ion cyclotron resonance (LTQ-FTICR) mass spectrometer (see analysis of components I and II below), and high-resolution spectra were acquired for the determination of accurate mass. The A-LPS fragment exhibited an m/z of 649.3070 (z = 1+), and collision-induced dissociation (CID) fragmentation of this peak produced a major peak at m/z 303.1298, which matched best to the formula C11H19N4O6 with an error of 0.37 ppm. Taking this into consideration, the best match to the precursor ion (649 Da) was C31H45N4O11 with an error of 1.4 ppm. The difference between these molecular formulae is C20H26O5 (346 Da), corresponding to component III (Fig. 1B). MS3 of the m/z 303 ion produced a major peak at m/z 199.0713, corresponding to component II, which, after the proton was subtracted, uniquely matched to C8H10N2O4 with an error of 0.17 ppm (Fig. 1B). Subtracting this from the formula of the m/z 303 ion gives the molecular formula of component I, C3H8N2O2 (Fig. 1B). Interestingly, MS3 of the dehydration product (649 → 631) exhibited a peak at m/z 209.1325, which uniquely matched to C16H17 with an error of 0.11 ppm. Within the scope of the whole study, this peak was observed only when ions that included component III were fragmented, indicating that the C16H17 fragment derived from component III.
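The formula arithmetic in this paragraph can be checked with a short calculation. The following is a minimal sketch, not the software used in the study: the helper names (`mass`, `ppm_error`, `subtract`) are hypothetical, the monoisotopic masses are standard reference values, and the ion formulas are assumed to include the ionizing hydrogen, so that one electron mass is removed to obtain the cation m/z.

```python
# Monoisotopic masses in Da (standard values for the lightest stable isotopes).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858
PROTON = 1.00727647


def mass(formula):
    """Monoisotopic mass of a formula given as an element -> count dict."""
    return sum(MONO[el] * n for el, n in formula.items())


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def subtract(a, b):
    """Element-wise formula difference a - b, dropping zero counts."""
    diff = {el: a.get(el, 0) - b.get(el, 0) for el in set(a) | set(b)}
    return {el: n for el, n in diff.items() if n}


# Formulas quoted in the text (ions include the ionizing H).
precursor_ion = {"C": 31, "H": 45, "N": 4, "O": 11}  # m/z 649.3070
fragment_ion = {"C": 11, "H": 19, "N": 4, "O": 6}    # m/z 303.1298
component_ii = {"C": 8, "H": 10, "N": 2, "O": 4}     # neutral, MH+ at m/z 199.0713

err_649 = ppm_error(649.3070, mass(precursor_ion) - ELECTRON)
err_303 = ppm_error(303.1298, mass(fragment_ion) - ELECTRON)
err_199 = ppm_error(199.0713, mass(component_ii) + PROTON)

# Component III is the difference between the two ions; component I is what
# remains of the neutral m/z 303 species after removing component II.
component_iii = subtract(precursor_ion, fragment_ion)
component_i = subtract(subtract(fragment_ion, {"H": 1}), component_ii)

print(component_iii, component_i)
print(round(err_649, 2), round(err_303, 2), round(err_199, 2))
```

The signed errors come out negative because the observed values lie slightly below the theoretical ones; their magnitudes agree with the reported 1.4, 0.37, and 0.17 ppm.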
TFMS deglycosylation in the presence of ethylbenzene reveals the underlying chemistry. The identification of the C16H17 fragment was interesting as it suggested that this group may be a novel hydrophobic anchor that is inserted into the OM. An alternative explanation, however, was the possibility that this group was derived from the toluene (C7H8) used as a free radical scavenger in the deglycosylation reaction. To test this, new P. gingivalis samples were prepared with the deglycosylation step being conducted in the presence of ethylbenzene (C8H10), abbreviated below as EB, rather than toluene (Tol). Liquid chromatography tandem MS (LC-MS/MS) analyses of the trypsin-digested samples failed to detect any of the previously identified peptides with the mass difference of 630 Da for the modification. Instead, modified C-terminal peptides were found to have an increased mass difference of 658 Da, indicating that the presence of scavenger (toluene or EB) was causing an artifactual modification. The mass difference between toluene and EB is 14 Da; however, the observed mass shift was 28 Da, suggesting that the artifact included two molecules of toluene or EB. As expected, precursor ions now lost 374 Da rather than 346 Da, demonstrating that the artifact was part of component III (Fig. 2). Subtracting Tol2 (184 Da) or EB2 (212 Da) from the corresponding component III mass (346 Da or 374 Da) leaves 162 Da, the mass of a hexose residue (Fig. 1B and 2).
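The scavenger-swap reasoning reduces to nominal-mass bookkeeping, sketched below. The function name `arene_count` is hypothetical; the numbers are those quoted in the text, with the nominal masses of toluene (C7H8, 92 Da) and ethylbenzene (C8H10, 106 Da).

```python
TOLUENE = 92        # C7H8, nominal mass in Da
ETHYLBENZENE = 106  # C8H10, nominal mass in Da


def arene_count(observed_shift, arene_a, arene_b):
    """Number of arene molecules implied by the shift seen on swapping a for b."""
    per_molecule = arene_b - arene_a
    count, remainder = divmod(observed_shift, per_molecule)
    if remainder:
        raise ValueError("shift is not a whole multiple of the arene difference")
    return count


# Modification delta mass: 630 Da with toluene, 658 Da with ethylbenzene.
n_arenes = arene_count(658 - 630, TOLUENE, ETHYLBENZENE)

# Removing the arenes from component III (346 Da with Tol, 374 Da with EB)
# should leave the same sugar residue in both experiments.
residue_from_tol = 346 - n_arenes * TOLUENE
residue_from_eb = 374 - n_arenes * ETHYLBENZENE

print(n_arenes, residue_from_tol, residue_from_eb)
```

Both experiments converge on the same residual mass, which is the point of the control: the arene contribution changes with the scavenger while the underlying sugar does not.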
To identify further potential sugars, TFMS deglycosylation of P. gingivalis proteins purified from outer membrane vesicles (OMVs) was conducted in a time series in the presence of EB. Trypsin-digested samples were again analyzed by LC-MS/MS (Orbitrap). The C-terminal peptides of RgpB, P27, and PG0553 produced the most comprehensive data. The RgpB C-terminal peptide (VEGT) was analyzed first. The first time point of ~25 min provided the most intense peaks for all the modified peptides that were identified, as well as proportionally higher intensities for the larger species, with the peaks at m/z 1,209 and m/z 1,309 exhibiting intensities of 4.6% and 1.9%, respectively, relative to the most abundant form at m/z 1,063 (Table 1). Presumably, the longer cleavage times caused a greater degree of cleavage as well as a greater diversity of unwanted reactions. Therefore, an ~25-min reaction time was chosen for all subsequent experiments. The MS/MS data for the m/z 1,209 peak indicated the presence of a deoxyhexose (dHex) while, in addition, the m/z 1,309 peak also contained C4H4O3 (Table 1). The same results were found for the C-terminal peptides of P27 and PG0553, and the MS/MS spectra for P27 are shown in Fig. 3. The MS/MS spectrum of the most intense peak of m/z 991 corresponds to the P27 C-terminal peptide KGE linked to components I, II, and III (KGE-I-II-HexEB2) (Fig. 3C). A weaker peak at m/z 617 was found to correspond to just KGE-I-II (Fig. 3A). Since this peptide was observed at the same retention time but was missing the hydrophobic HexEB2, it was concluded to be an in-source decay fragment. Helpfully, the m/z 779 form consisting of components I, II, and the hexose residue (162 Da) without the EB2 artifact was also observed (Fig. 3B). This peak was also concluded to be an in-source decay fragment as it had low intensity and the same retention time as the compound shown in Fig. 3D.
This peak at m/z 1,137 was of higher mass than the major form by 146.057 Da, which accurately matches the mass of a deoxyhexose residue. MS/MS of this form suggested that a deoxyhexose is the next sugar in the chain (Fig. 3D). An additional form at m/z 1,237 was 100.016 Da higher, matching uniquely to C4H4O3, which may represent a residue of an organic acid such as succinate. MS/MS of this form first lost 358 Da (dHexEB2) and then 100 Da, suggesting that the C4H4O3 group was bonded to the hexose residue (Fig. 3E). Consistent with this, an MS peak at m/z 1,091 corresponding to KGE-I-II-Hex(C4H4O3)EB2 was also observed (spectrum not shown). In this case, the MS/MS data did not support a separate loss of 100 Da; rather, the 100 Da was lost with the HexEB2. Analogous results were found for the C-terminal peptides of all three proteins (data not shown), confirming that the A-LPS fragment includes components I, II, hexose, deoxyhexose, and C4H4O3. Despite extensive manual analysis of the data set, further sugars could not be reliably assigned.
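The assignment of the 146.057-Da and 100.016-Da increments, and the resulting series of P27 peaks, can be illustrated as follows. This is a sketch under stated assumptions: the residue table, the 0.005-Da tolerance, and the function names are hypothetical choices made for illustration, and the peak series uses the nominal m/z values quoted in the text.

```python
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.9949146}

# Candidate residues (formula as incorporated into the chain).
RESIDUES = {
    "Hex": {"C": 6, "H": 10, "O": 5},
    "dHex": {"C": 6, "H": 10, "O": 4},
    "C4H4O3": {"C": 4, "H": 4, "O": 3},
}
EB2 = 212  # nominal mass added by two ethylbenzene molecules (from the text)


def mass(formula):
    return sum(MONO[el] * n for el, n in formula.items())


def match_residue(delta, tol=0.005):
    """Names of residues whose monoisotopic mass lies within tol Da of delta."""
    return [name for name, f in RESIDUES.items() if abs(mass(f) - delta) <= tol]


def ladder(start, increments):
    """Cumulative nominal m/z values obtained by adding each increment in turn."""
    peaks = [start]
    for inc in increments:
        peaks.append(peaks[-1] + inc)
    return peaks


nominal = {name: round(mass(f)) for name, f in RESIDUES.items()}

# P27 series: KGE-I-II (m/z 617), then +Hex, +EB2, +dHex, +C4H4O3.
p27_series = ladder(
    617, [nominal["Hex"], EB2, nominal["dHex"], nominal["C4H4O3"]]
)

print(match_residue(146.057), match_residue(100.016))
print(p27_series)
```

The series reproduces the observed peaks at m/z 779, 991, 1,137, and 1,237, and the 358-Da loss from the m/z 1,237 form equals the dHex and EB2 increments combined.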
Modification of T. forsythia T9SS substrates. Since T. forsythia is closely related to Porphyromonas species, we next determined whether a similar modification might occur in the cargo proteins of this species. Since the OMVs of this organism are enriched with these cargo proteins (37), we used OMVs as the starting material. Deglycosylation of OMVs with TFMS resulted in a reduction in the molecular weight (MW) of the major high-MW bands, known to correspond to TfsA, TfsB, and Tanf_02425 (37), to values more consistent with their calculated MWs, suggesting that the deglycosylation was successful (Fig. 4). These deglycosylated bands were then digested with trypsin and analyzed by LC-MS/MS. Initially, the data were searched by Mascot using the same delta mass of +630.3 Da used for P. gingivalis; however, no C-terminal peptides were positively identified, suggesting that the modification might be different. We therefore plotted neutral-loss chromatograms of -346 Da for each band, corresponding to the loss of component III (data not shown). The neutral-loss chromatogram for band 1 exhibited a strong peak at 33.4 min corresponding to the fragmentation of a compound of m/z 637.2. Detailed inspection of the MS/MS spectrum of this compound resulted in the identification of the expected C-terminal peptide of Tanf_02425 with the sequence FGPDHV and a modification delta mass of +601.2 Da, 29 Da less than the P. gingivalis modification. As an addendum to the b-ion series, major ions were observed at further mass differences of 74 and 199 Da (Fig. 5A). The same approach was employed for bands 2 and 3 (Fig. 4B), resulting in the identification of the modified C-terminal peptides of TfsB and TfsA, respectively (Fig. 5B and C).
When the data were automatically searched by Mascot using +601.2 Da as an optional C-terminal modification, modified C-terminal peptides were identified for an additional two cargo proteins, Tanf_11855 and Tanf_06020 (37; also data not shown). In each case, the same pattern of peaks was observed, suggesting that the residual modification for these T. forsythia cargo proteins is composed of three components, I, II, and III, with masses of 74 Da, 199 Da, and 346 Da, respectively (Fig. 1D).
With P. gingivalis modified C-terminal peptides, the modification could be released by proteinase K cleavage, indicating a peptide bond between the protein C terminus and the 104-Da component of the modification (36). Therefore, we also treated the deglycosylated T. forsythia proteins with proteinase K and analyzed the digestion products with LC-MS/MS. Through searching for the characteristic neutral loss of 346 Da, we observed a singly charged peak at m/z 620.3 that produced major product ions at m/z 274 (-346 Da) and m/z 200 (component II, MH+), confirming its identity as the residual modification, which, as in P. gingivalis, appears to be connected to the cargo protein via a peptide (amide) bond (data not shown).
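The relationship between the component masses, the released fragment, and the peptide delta mass can be summarized in a few lines, together with the neutral-loss test used to locate the modification. This is an illustrative sketch only: the function names are hypothetical, nominal masses are used, the loss of one water on amide bond formation is our reading of the 18-Da difference between the fragment and the delta mass, and the neutral-loss check assumes singly charged precursor and product ions.

```python
WATER = 18   # nominal mass lost when the amide bond to the peptide forms
PROTON = 1   # nominal mass of the ionizing proton


def modification_masses(components):
    """Nominal MH+ of the released fragment and delta mass on the peptide."""
    total = sum(components)
    return {"released_mh": total + PROTON, "peptide_delta": total - WATER}


def has_neutral_loss(precursor_mz, fragment_mzs, loss=346.0, tol=0.5):
    """True if any fragment lies `loss` below the precursor (singly charged)."""
    return any(abs((precursor_mz - f) - loss) <= tol for f in fragment_mzs)


pg = modification_masses([104, 198, 346])  # P. gingivalis components I, II, III
tf = modification_masses([74, 199, 346])   # T. forsythia components I, II, III

# Released T. forsythia fragment: precursor m/z 620.3; the product-ion values
# below are illustrative decimals for the reported m/z 274 and m/z 200 ions.
found = has_neutral_loss(620.3, [274.3, 200.1])

print(pg, tf, found)
```

With these inputs the P. gingivalis values come out as 649 and 630 and the T. forsythia values as 620 and 601, matching the masses reported for the two species.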
Analysis of the T. forsythia C-terminal peptides by Orbitrap LC-MS/MS was conducted using higher resolution (240,000) in the MS/MS scans to allow assignment of molecular formulae. For these spectra, inspection of the peptide a-ions and b-ions across the relevant mass range gave a maximum error of <1 ppm, and therefore these data were searched with a tolerance of 1 ppm. Component I returned C2H6N2O as the only match; component II returned C8H9NO5 as the only match, and component III returned C20H26O5 as the simplest of four conceivable matches (Fig. 1D and Table 2), consistent with the hexose-toluene adduct identified for P. gingivalis.
Component II. The major data collected for components I and II were the detailed MSn analyses of the purified LPS fragments from both species. The tentative structures for component II were N-acetyl glucuronamide and N-acetyl glucuronic acid for P. gingivalis and T. forsythia, respectively. This was deduced from a large number of MS spectra, mostly at the MS4 level. These structures are supported by the following data. The MS4 data of dehydrated component II at m/z 181 (P. gingivalis) and m/z 182 (T. forsythia) proved particularly insightful (Fig. 6A and B), and these data were compared to the fragmentation patterns obtained for synthetic dehydrated glucuronamide and N-acetylglucosamine (NAG) (see Fig. S1 in the supplemental material). The precursor ions shown in Fig. 6A and B were assigned to disubstituted pyrylium ions and were found to lose major modules of NH3 (-17 Da), CO (-28 Da), C2H2O (-42 Da), and CHNO (-43 Da) for P. gingivalis and H2O (-18 Da), CO (-28 Da), C2H2O (-42 Da), and CO2 (-44 Da) for T. forsythia. The loss of 59 Da shown in Fig. 6A was deemed to correspond to consecutive losses of 42 Da and 17 Da rather than to the loss of acetamide (59 Da).
This was supported by MSn analyses of glucuronamide and NAG as the loss of NH3 was observed only for glucuronamide, the loss of ketene (-42 Da) was observed only for NAG, and the loss of 59 Da was not observed in either (Fig. S1). This interpretation is also consistent with data shown in Fig. 6B where, instead of a loss of 59 Da, a loss of 60 Da was observed corresponding to consecutive losses of 42 Da and 18 Da.
The peak at m/z 138 in both spectra (Fig. 6A and B) is proposed to be an N-acetyl pyrylium ion resulting from the loss of cyanic acid (-43 Da) in P. gingivalis or the loss (Fig. S1B), while the loss of cyanic acid (-43 Da) could be replicated from dehydrated glucuronamide (Fig. S1D). The loss of CO (-28 Da) observed in both P. gingivalis and T. forsythia (Fig. 6A and B) was deduced to be from the ring and was supported by the same loss from both NAG and glucuronamide (Fig. S1). The suspected loss of CO from the ring implies a nonsubstituted carbon adjacent to the ring oxygen, presumably C-1. The loss of ketene (-42 Da) was the most favorable loss in both spectra (Fig. 6A and B) and theoretically could be from either the N-acetyl group or from ring cleavage. It was concluded that most of the loss was from ring cleavage because the loss of ketene was much less for NAG (Fig. S1B), and ketene was not lost when serinamide or glycinamide was present in the structure (Fig. 7). MS4 analyses of the nondehydrated component II ions were dominated by the loss of water (Fig. 6C and D). The next major loss was again ketene (-42 Da), supporting the position of the hydroxyl group at C-4 rather than at C-1.
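The interpretation of the 59-Da and 60-Da losses as two consecutive steps can be checked against the neutral losses listed above for each species. The sketch below uses nominal masses only; the dictionary and function names are hypothetical, and considering only pairs of distinct losses is a simplifying assumption.

```python
from itertools import combinations

# Neutral losses reported for dehydrated component II (nominal masses, Da).
LOSSES_PG = {"NH3": 17, "CO": 28, "C2H2O": 42, "CHNO": 43}  # precursor m/z 181
LOSSES_TF = {"H2O": 18, "CO": 28, "C2H2O": 42, "CO2": 44}   # precursor m/z 182


def decompose(total, losses):
    """Pairs of distinct named losses whose nominal masses sum to `total`."""
    return [
        tuple(sorted((a, b)))
        for a, b in combinations(losses, 2)
        if losses[a] + losses[b] == total
    ]


pairs_59 = decompose(59, LOSSES_PG)
pairs_60 = decompose(60, LOSSES_TF)

# Both species converge on the same m/z 138 product ion.
product_pg = 181 - LOSSES_PG["CHNO"]
product_tf = 182 - LOSSES_TF["CO2"]

print(pairs_59, pairs_60, product_pg, product_tf)
```

In each species only one pair of the listed losses accounts for the combined loss, ketene plus ammonia for P. gingivalis and ketene plus water for T. forsythia, which parallels the amide versus acid difference between the two structures.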
Component I. For both P. gingivalis and T. forsythia, the molecular formula of the first component was unequivocal. Since component I is linked directly to the protein C terminus via an amide (peptide) bond, it is deduced to have a free amine. Furthermore, while major fragment ions were observed for C-terminal peptide +104 Da (P. gingivalis) or +74 Da (T. forsythia), ions were also observed for C-terminal peptide +87 Da (P. gingivalis) or +57 Da (T. forsythia), which matches to serine or glycine, respectively (36) (Fig. 5). The additional 17 Da is inferred to correspond to an additional amine that connects components I and II. For the 74.0480-Da entity, the only metabolite in the Metlin database having two free amines is 2-aminoacetamide (glycinamide). Therefore, since the component I compounds from P. gingivalis and T. forsythia are likely related, we hypothesized that they are serinamide (P. gingivalis) and glycinamide (T. forsythia).
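A formula search of the kind described for the T. forsythia components, with a 1-ppm tolerance, can be sketched as a brute-force enumeration. This is not the search tool used in the study: the function name, the restriction to C, H, N, and O, and the upper bounds on atom counts are assumptions made for illustration.

```python
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def formula_string(c, h, n, o):
    """Compact formula text, omitting absent elements and counts of one."""
    parts = []
    for symbol, count in (("C", c), ("H", h), ("N", n), ("O", o)):
        if count:
            parts.append(symbol if count == 1 else f"{symbol}{count}")
    return "".join(parts)


def candidate_formulas(target, tol_ppm=1.0, max_c=10, max_h=30, max_n=6, max_o=8):
    """All CHNO formulas within tol_ppm of a neutral monoisotopic mass."""
    hits = []
    for c in range(max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MONO["C"] + n * MONO["N"] + o * MONO["O"]
                if heavy > target + 1.0:
                    continue  # already too heavy before adding hydrogens
                for h in range(max_h + 1):
                    m = heavy + h * MONO["H"]
                    if abs(m - target) / target * 1e6 <= tol_ppm:
                        hits.append(formula_string(c, h, n, o))
    return hits


# Component I of T. forsythia: the 74.0480-Da entity.
matches = candidate_formulas(74.0480)
print(matches)
```

Within these bounds the enumeration returns a single formula, C2H6N2O, in line with the statement that component I gave only one match. Heavier targets such as component III admit more combinations, which is why the text describes four conceivable matches there.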
To confirm the assignment of serinamide, the MS4 data of the component I ion (m/z 105) from the purified P. gingivalis LPS fragment were analyzed and compared to the MS2 spectrum of synthetic serinamide (Fig. 8). The spectra produced indistinguishable profiles, with observed fragments at m/z 60 corresponding to the loss of formamide and at m/z 87 corresponding to the loss of H2O. An unexpected ion at m/z 77 was present in both spectra, presumably representing the loss of CO concomitant with molecular rearrangement. Table 3. The loss of 59 Da was assigned to the serine component of the P. gingivalis modification (Fig. 7A). This preferred cleavage between the carbonyl and alpha carbons is the same as that of serinamide (Fig. 8). The loss of NH3 to give a peak at m/z 286 could be from either component I or II. MS4 of the m/z 286 ion gave rise to major losses of H2O, further NH3, -87 Da, and new major losses at -30 Da and -47 Da while the losses of -59 Da and -104 Da were minimal (Fig. S2). The data are most consistent with the m/z 286 ion mostly lacking NH3 from serine, with a smaller proportion of molecules having lost NH3 from component II. With NH3 removed, the losses of 30 Da, 42 Da, and 87 Da are readily matched to cleavages involving the residual serinamide component. The large loss of H2O from the m/z 286 ion may also be at least partly from the serine side chain. The same serinamide fragmentation was observed in the MS4 spectra of the dehydrated ions of m/z 285 and m/z 267 (Fig. S2). These losses (besides H2O) were not observed in the equivalent spectra for T. forsythia (Fig. 7B), further supporting their assignment to the serine moiety in P. gingivalis. For T. forsythia, component I did not appear to fragment, apart from the loss of NH3, glycine, and glycinamide (Fig. 7B).
Partial structural confirmation using hydrogen/deuterium exchange. As noted above, the purified P. gingivalis A-LPS fragment has a protonated mass of 649 Da. The assigned structure theoretically has 13 exchangeable hydrogens which, upon exchange with deuterium, would increase the mass to 662 Da as shown in Fig. S3A. After hydrogen/deuterium exchange and MS analysis, the spectrum exhibited a distribution of peaks ranging from +6 to +13 Da due to incomplete exchange (Fig. S3A). Fragmentation of the +12-Da peak at m/z 661 (which is substantially more abundant than the +13) produced a major cluster of ions between m/z 309 and m/z 312 which were assigned to the linking sugar comprising components I and II (Fig. S3B). The nondeuterated form of this fragment has an m/z of 303. MS3 fragmentation of the m/z 311 peak produced component I fragments at m/z 109 and m/z 110, suggesting five exchangeable hydrogens relative to the nondeuterated form of m/z 105, in agreement with the proposed structure (Fig. S3C). The component II fragments were observed at m/z 202 and m/z 203, suggesting four exchangeable hydrogens relative to the nondeuterated form of m/z 199, also in agreement with the proposed structure (Fig. S3C).
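The hydrogen/deuterium exchange counts follow from nominal mass shifts of about 1 Da per exchanged hydrogen. The sketch below restates that bookkeeping; the function names are hypothetical, and relating the upper edge of the m/z 309 to 312 cluster to the sum of the component I and II counts is our reading rather than a statement from the text.

```python
def fully_exchanged_mz(native_mz, exchangeable_h):
    """Nominal m/z after every exchangeable H is replaced by D (+1 Da each)."""
    return native_mz + exchangeable_h


def exchanged_count(native_mz, deuterated_mz):
    """Number of exchanged hydrogens implied by a nominal m/z shift."""
    return deuterated_mz - native_mz


# Whole fragment: 13 exchangeable hydrogens on the m/z 649 ion.
whole = fully_exchanged_mz(649, 13)

# Highest deuterated forms observed for each component in MS3 of m/z 311.
component_i_h = exchanged_count(105, 110)
component_ii_h = exchanged_count(199, 203)

# Upper edge of the linking-sugar cluster expected from the two counts.
linking_sugar_max = fully_exchanged_mz(303, component_i_h + component_ii_h)

print(whole, component_i_h, component_ii_h, linking_sugar_max)
```

The combined count of nine places the fully exchanged linking sugar at m/z 312, the top of the observed cluster.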
The linking sugar is partly a product of the Wbp pathway. In P. gingivalis, the Wbp pathway was described to include four enzymes, WbpA, WbpB, WbpE, and WbpD, and shown to be essential for A-LPS synthesis (18). The Wbp product is expected to be a di-N-acetylated glucuronic acid [Fig. 9, UDP-GlcNAc(3NAc)A] similar to the linking sugars identified in this study. The only differences in P. gingivalis are seryl instead of acetyl and amide instead of acid. In T. forsythia there was only the one difference, glycyl instead of acetyl (Fig. 9, compare the Wbp product with final products). To find the genes that may be responsible for the differences in this pathway, the arrangements of the wbp genes in P. gingivalis and T. forsythia were compared with those of other species (Fig. S4). It was observed that a putative asparagine synthase gene (asnB, PGN_1234) was immediately downstream of wbpE (porR) and wzx (porS) in P. gingivalis and that a similar arrangement of genes was observed in several other species, suggesting that PGN_1234 may have a function in the Wbp pathway. Since AsnB is an amidotransferase, converting carboxylic acids to amides, we predicted that PGN_1234 may convert the glucuronic acid into glucuronamide. T. forsythia did not appear to have an asnB homolog, consistent with the presence of the uronic acid form. The PGN_1234 mutant, KDP1101, was therefore constructed and characterized for its role in A-LPS modification of T9SS cargo proteins. The mutant was found to have a phenotype similar to that of the wild type (WT) in exhibiting black pigmentation and normal levels of gingipain and hemagglutination activities, consistent with normal secretion and attachment of these virulence factors to the cell surface (Fig. S5). Western blots of whole-cell lysates showed the presence of highly modified Arg gingipains and HBP35 consistent with wild-type A-LPS modification (Fig. 10A and B).
However, although levels of total LPS appeared normal (Fig. 10C), A-LPS was barely detected with MAb 1B5 (Fig. 10D), suggesting that PGN_1234 was required for the creation of the MAb 1B5 epitope but not essential for linkage of LPS to cargo proteins.
To determine if PGN_1234 was responsible for glucuronamide formation, modified cargo proteins were isolated from the PGN_1234 mutant. The fractions containing modified RgpB were subjected to deglycosylation, trypsin digestion, and analysis by LC-MS/MS. The MS/MS data revealed that component II was now 199 Da (rather than 198 Da), consistent with the inability of the mutant to convert the glucuronic acid into glucuronamide (Fig. 11A and B).
VimA transfers the serine or glycine to complete biosynthesis of the linking sugar. Next, we considered how the expected product of the Wbp pathway (di-N-acetylglucuronic acid/amide) might be converted to the observed linking sugar through the incorporation of serine or glycine. The only known proteins required for A-LPS biosynthesis that are potentially involved in sugar modifications other than those involved in the Wbp pathway are VimA and VimE. A search of the Conserved Domain database revealed that VimE is a putative member of the carbohydrate esterase 4 family which includes deacetylases (E value of 1E-83), and VimA is a putative N-acetyltransferase (E value of 2.9E-05). Therefore, we hypothesized that VimE may remove an acetyl group from the Wbp pathway product and that VimA may then transfer the serine (P. gingivalis) or glycine (T. forsythia) to the newly exposed amine (Fig. 9). To test this, we expressed the wild-type vimA gene from T. forsythia (vimATf+) in the P. gingivalis vimA mutant (vimAPg- vimATf+) with the expectation that T. forsythia VimA would incorporate glycine into the linker of the modified P. gingivalis cargo proteins. This strain was successfully complemented by the wild-type vimATf gene, as indicated by the restoration of black pigmentation, MAb 1B5 reactivity, and the modification of cargo proteins with A-LPS (Fig. S6). The modified cargo proteins were purified from the OMVs of this complemented strain, and the RgpB fraction was subjected to deglycosylation, trypsin digestion, and analysis by LC-MS/MS, as previously described. The MS/MS data revealed that component I was now 74 Da (rather than 104 Da), consistent with the incorporation of glycine instead of serine (Fig. 11C and D). The positive MAb 1B5 reactivity to the cargo proteins despite the exchange of glycine for serine suggests that the serine side chain may not be part of the MAb 1B5 epitope.
These data indicate that VimA is responsible for the specificity toward serine (P. gingivalis) or glycine (T. forsythia), and combined with the predicted N-acetyltransferase function, we suggest that VimA directly catalyzes this amino acid transfer.
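The component masses expected for each strain follow from combining the amino acid specified by VimA with the acid or amide state determined by PGN_1234. The sketch below tabulates this using the molecular formulae reported for components I and II. The strain table and the names used are hypothetical, and two entries are assumptions rather than reported measurements: that component I remains the serine form in the PGN_1234 mutant, and that component II remains the amide form in the vimA-complemented strain.

```python
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

COMPONENT_I = {
    "Ser": {"C": 3, "H": 8, "N": 2, "O": 2},  # serinamide-derived, 104 Da
    "Gly": {"C": 2, "H": 6, "N": 2, "O": 1},  # glycinamide-derived, 74 Da
}
COMPONENT_II = {
    "amide": {"C": 8, "H": 10, "N": 2, "O": 4},  # uronamide form, 198 Da
    "acid": {"C": 8, "H": 9, "N": 1, "O": 5},    # uronic acid form, 199 Da
}

# (amino acid set by VimA, acid/amide state set by PGN_1234)
STRAINS = {
    "P. gingivalis wild type": ("Ser", "amide"),
    "P. gingivalis PGN_1234 mutant": ("Ser", "acid"),      # component I assumed
    "P. gingivalis vimAPg- vimATf+": ("Gly", "amide"),     # component II assumed
    "T. forsythia wild type": ("Gly", "acid"),
}


def mass(formula):
    return sum(MONO[el] * n for el, n in formula.items())


expected = {
    strain: (round(mass(COMPONENT_I[aa])), round(mass(COMPONENT_II[state])))
    for strain, (aa, state) in STRAINS.items()
}

# Exact differences underlying the 30-Da and 1-Da shifts.
ser_minus_gly = mass(COMPONENT_I["Ser"]) - mass(COMPONENT_I["Gly"])
acid_minus_amide = mass(COMPONENT_II["acid"]) - mass(COMPONENT_II["amide"])

for strain, masses in expected.items():
    print(strain, masses)
print(round(ser_minus_gly, 4), round(acid_minus_amide, 4))
```

The serine to glycine swap corresponds to CH2O (about 30.011 Da), and the amide to acid change corresponds to replacing NH with O (about 0.984 Da), which appears as a 1-Da shift at nominal mass.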

DISCUSSION
Structure summary. Through an extensive MS-based exploration, the structure of the linking sugar was concluded to be 2-N-seryl, 3-N-acetylglucuronamide in P. gingivalis and 2-N-glycyl, 3-N-acetylmannuronic acid in T. forsythia. The amino acid portions of these molecules were well supported by the MS data and by their cleavage with proteinase K. The N-acetyl and glucuronamide/glucuronic acid portions of the structures were also well supported, but the positions of substituents around the ring were not definitively determined. Further support from the literature for these groups is presented below. The linking sugar was bonded to a hexose residue in both species. In P. gingivalis, this hexose was further linked to both a deoxyhexose residue and C4H4O3 (Fig. 1).
Novel method. The use of TFMS is standard for the removal of glycan chains from glycoproteins leaving the protein intact (38). Generally, the cleaved glycans are not analyzed; however, a small number of studies have detected N-acetylated sugars after TFMS treatment (38). To our knowledge, this is the first study to show that simple hexoses can also be detected after TFMS treatment. As seen by other investigators, the N-acetylated sugar (component II) could be analyzed since it was not extensively cleaved and modified during the deglycosylation procedure (39). In contrast, the hexose and deoxyhexose identified in this study both reacted with the arene (toluene or EB) that was used together with TFMS. An extensive search of the literature to investigate whether this kind of chemistry had been described previously revealed that reducing sugars will react with two arenes in the presence of an AlCl3 catalyst under anhydrous conditions to produce 1,1-diaryl-1-deoxyalditols (40). Anhydrous AlCl3 is a strong Lewis acid, and TFMS is likely to be functioning in the same way. The mechanism involving AlCl3 proceeds through an intermediate modified by a single arene. A search of our LC-MS/MS data for this intermediate was successful (Fig. 3F), supporting the contention that the same chemistry applies to the TFMS-catalyzed reaction. The MS intensities of the peaks corresponding to the final bis-arylated product were at least 100-fold greater than the peak matching the single arylated intermediate, indicating that the reaction went almost to completion. TFMS deglycosylation can therefore be used to analyze glycan structure more broadly than previously thought.
Relationship with published LPS structures. The challenge these data represent is to determine how they relate to the reported A-LPS structure and therefore complete the picture of how the T9SS substrates are attached to the cell surface in P. gingivalis and T. forsythia. In P. gingivalis, A-LPS is composed of lipid A, a core oligosaccharide common to O-LPS, and a specific polysaccharide that is uniquely recognized by MAb 1B5. The outer core is composed of mannose with only some phosphoethanolamine while the inner core includes glycerol, allosamine, and KDO (3-deoxy-D-mannooctulosonic acid) (11). Lipid A is composed of glucosamine with N-linked and O-linked fatty acids (29). The polysaccharide of A-LPS was originally reported to be a branched phosphomannan (13). Based on these reports the structure of the A-LPS fragment reported here cannot be assigned to any known part of A-LPS. In contrast, the polysaccharide of O-LPS of P. gingivalis has been shown to have a tetrasaccharide repeating unit comprising GalNAc, Rha, Glc, and Gal (10), which represents the closest fit to our data since the Rha may be the same as the deoxyhexose identified in the unique A-LPS fragment. Consistent with this, Shoji et al. have suggested that A-LPS may contain a tetrasaccharide repeating unit that is similar to the O-polysaccharide (14). Key to this finding is the identification of the A-LPS-specific glycosyl transferases WbaP and GtfC that appear to be responsible for the addition of the first and second sugars of the A-LPS tetrasaccharide, while the glycosyltransferases GtfE and GtfB catalyze the addition of the third and fourth sugars in both O-LPS and A-LPS (14). Two further glycosyltransferases specific to A-LPS, GtfF and VimF (14,41), potentially transfer additional sugars to form an A-LPS-specific branch. 
We speculate that the A-LPS fragment identified in this study corresponds to this branch; however, further work is required to determine whether the identified deoxyhexose corresponds to the Rha within the tetrasaccharide or whether the branching point is elsewhere.
In T. forsythia, to date only rough LPS (which lacks the polysaccharide component) has been isolated. The structure of the core oligosaccharide is comprised of KDO, mannose, and glucosamine (42). Since the linking sugar was not identified, the form of LPS which is bonded to cargo proteins in this species has not yet been found. Interestingly, T. forsythia has orthologs for at least three of the A-LPS-specific glycosyltransferases, including two (WbaP and GtfC) implicated in the synthesis of the P. gingivalis tetrasaccharide repeating unit (Table 4). Further work is required to elucidate the exact roles of these transferases in both species and to identify the exact attachment point of the linker to the LPS anchor.
The linking sugar may be the product of the novel Wbp/Vim pathway. Our data indicate that the linking sugar is likely to be a product of Wbp and Vim enzymes for the following reasons. Since the linking sugar connects cargo proteins to A-LPS, its biosynthesis is expected to be essential for T9SS cargo protein modification which has an absolute requirement for the gene products WbpA, WbpB, WbpD, and WbpE (18) as well as for VimA and VimE (19). Second, the elucidated structure of the linking sugar fits exactly the predicted activities of these Wbp and Vim enzymes, with WbpS being an additional nonessential enzyme utilized by P. gingivalis (Fig. 9). Third, the demonstration that PGN_1234 (WbpS) is involved in the biosynthesis of the linking sugar in P. gingivalis by converting the uronic acid into the uronamide form supports both the proposed structures and the contention that the Wbp pathway is involved in the synthesis. Finally, the proposed biosynthesis of the elucidated structures is consistent with the genetic data by accounting for all the genes known to be specific to A-LPS biosynthesis besides the glycosyltransferases (WbaP, GtfC, GtfF, and VimF). We have designated this novel linking sugar biosynthetic pathway the Wbp/Vim pathway (Fig. 9).
It follows that the linking sugar may be the only A-LPS-specific sugar required both for recognition by MAb 1B5 and for cargo modification. Indeed, the significant loss of reactivity toward MAb 1B5 observed in the PGN_1234 mutant suggests that the uronamide of the linking sugar forms part of the MAb 1B5 epitope and is consistent with the nonreactivity of T. forsythia LPS since this species does not produce the uronamide (43). The phosphomannan was also reported to have an epitope for MAb 1B5 (13), which suggests that the phosphomannan is in close association with the linking sugar. However, both findings need to be confirmed with direct evidence. The phosphomannan may not be essential for cargo modification and the development of black pigmentation on blood agar since, despite extensive screening for the genes required for black pigmentation (27), none have been found that could be specifically assigned to its biosynthesis. The phosphomannan consists of one phosphate and eight different mannoses and presumably requires several different mannosyltransferases for its synthesis, which have not been identified to date.
Position of N-Ser and N-Gly groups. The results presented here do not definitively show the position of any of the substituents in the linking sugar. The structure shown is the most consistent with and strongly indicated by the data, but consideration of its biosynthetic provenance strongly influenced the decision of where to place the substituents. The Wbp pathway uses a common precursor in GlcNAc, with the N-acetyl group in the usual C-2 position. In the second and third steps, it places an amine in the C-3 position, and then in the fourth step WbpD transfers an acetyl group to the amine (Fig. 9). The function of WbpD was successfully complemented by WbpD from Pseudomonas aeruginosa, confirming that the di-N-acetylglucuronic acid is formed in P. gingivalis (18,22). It then appears that VimE removes one of the acetyl groups and that VimA transfers the amino acid. But if VimE removes the acetyl group from C-3, the product will be the same as the WbpE (PorR) product, which would seem to be an unnecessary step. Moreover, if this were the case, WbpD would not be essential for A-LPS biosynthesis. It is therefore proposed that VimE must instead target removal of the original acetyl group at C-2, consistent with our interpretation of the MS data.
Association of vim genes with wbp genes and sortase-encoding genes in bacterial genomes. A conserved domain search of VimA matches not only to N-acetyl transferases but also to the pep-cterm_femAB family (TIGR03019). This family is defined in part by its codistribution within bacterial genomes with other genes previously found to be associated with each other. These associated genes include secreted proteins having the PEP-CTERM signal, an exosortase, and genes involved in the production of exopolysaccharide (EPS) (44). The PEP-CTERM signal includes the PEP motif, a putative transmembrane helix, and a positively charged C terminus.
After translocation across the inner membrane (IM), the exosortase which is embedded in the IM is proposed to cleave near the PEP motif, leaving the transmembrane helix in the IM and conjugating the new C terminus to an unidentified compound. The protein is then expected to be secreted across the OM by an unknown pathway and become associated with the EPS (44). It is therefore interesting to speculate that the VimA-related proteins in these bacteria may be involved in producing the unidentified compound which may also be an amino acid-modified sugar. To investigate this further, some of the genetic loci containing these VimA-related proteins were examined (see Fig. S7 in the supplemental material). In Nitrosomonas eutropha, the vimA-related gene was adjacent to a polysaccharide deacetylase, glycosyltransferases, exosortase (EpsH), and a gene annotated as asparagine synthase, which matched to PGN_1234 by protein PSI-BLAST. A similar gene arrangement was observed in Nitrosococcus oceani and Desulfovibrio vulgaris (Fig. S7). Conserved domain searches of the two asparagine synthases produced top matches to the eps_aminotran_1 family (TIGR03108), which represents another protein associated with the PEP-CTERM system. To further understand the gene arrangement of vimA, vimE, and PGN_1234 homologs, PSI-BLAST searches were conducted for the three proteins, and the gene arrangements in some of the most prominently matching species were briefly examined. In Pseudomonas mosselii, Aeromonas veronii, and Vibrio cholerae, vimA homologs were found adjacent to vimE homologs, and genes belonging to the complete Wbp pathway were also found in the same cluster. These species do not have exosortases to our knowledge, but V. cholerae and A. veronii both have a rhombosortase (Fig. S7). Rhombosortases are associated with secreted proteins with the GlyGly-CTERM motif (45). Rhombosortase from V. 
cholerae has been shown experimentally for one protein to cleave at the GG motif and conjugate the C terminus to a moiety deduced to contain glycerophosphoethanolamine, which is required for cell surface anchorage after secretion through a type II secretion system (T2SS) (46). Other CTERM systems have also been identified in other bacteria (47). Further work is required to determine if the Wbp/Vim pathway is used for protein modification and cell surface attachment in any of these CTERM systems.
The PSI-BLAST searches of PGN_1234 revealed that it was more closely related to WbpS in P. aeruginosa and its homolog WbqG in Escherichia coli O121, both members of the eps_aminotran_1 family, than to other AsnB-related proteins in the same species (Table S3). WbpS has long been proposed to convert N-acetyl galacturonic acid into the uronamide form (48), and its homolog, WbqG, was demonstrated to be required for this activity (49). We therefore propose the name WbpS for PGN_1234.

MATERIALS AND METHODS
Bacterial strains, plasmids, and bacterial growth. Bacterial strains and plasmids used in this study are listed in Table S1 in the supplemental material. P. gingivalis strains were grown on solid medium
Construction of the PGN_1234 mutant. All DNA primer sequences used in this study are listed in Table S2. The P. gingivalis PGN_1234 mutant was constructed by removal of the coding region corresponding to E71-L542 by making a suicide plasmid with PGN_1234 upstream and downstream regions on either side of an ermF antibiotic resistance cassette (pKD1401). This was done by exchange of the upstream and downstream regions of the plasmid that was used for deletion of the hbp35 gene (pKD740, hbp35::ermF in pGEM-T Easy plasmid) (50). All PCRs used PrimeSTAR Max DNA polymerase (TaKaRa, Japan). When amplicons were cloned into pUC118, a HincII/BAP-treated vector (where BAP is bacterial alkaline phosphatase) and Mighty Cloning Reagent Set from TaKaRa were used. The PGN_1234 upstream and downstream regions were separately amplified by PCR using the primer pair PGN1234upFw and PGN1234upRev and the pair PGN1234dwFw and PGN1234dwRev, respectively, and P. gingivalis ATCC 33277 genomic DNA as the template. The upstream amplicon was cloned into pUC118 to produce pUC118-PGN_1234up. The downstream amplicon was cloned into pUC118 to produce pUC118-PGN_1234dw. The PGN_1234 upstream region was cleaved and purified from pUC118-PGN_1234up using SphI-BamHI and ligated into the SphI-BamHI-cleaved hbp35 deletion plasmid pKD740 for upstream region exchange, producing pPGN_1234up-ermF-hbp35dw. The PGN_1234 downstream region was cleaved and purified from pUC118-PGN_1234dw using PstI-SacI and ligated into PstI-SacI-cleaved pPGN_1234up-ermF-hbp35dw for downstream region exchange, producing the PGN_1234 deletion plasmid, pKD1401 (pPGN_1234up-ermF-PGN_1234dw). Finally, pKD1401 was linearized with SphI and introduced into P. gingivalis ATCC 33277 by electroporation and selection on blood agar plates containing 10 μg/ml erythromycin to obtain the Em-resistant transformant KDP1101.
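As a rough aid to reading the deletion boundaries, the residue range E71-L542 can be converted to nucleotide coordinates within the open reading frame. The sketch below is illustrative only: it assumes that residue 1 corresponds to the first codon of the annotated PGN_1234 coding sequence and that coordinates are 1-based and inclusive. The function name is hypothetical and is not part of the original methods.

```python
def codon_span_to_nt(first_codon: int, last_codon: int) -> tuple[int, int]:
    """Return 1-based, inclusive nucleotide coordinates (relative to the ORF start)
    covered by an inclusive range of codons.

    Assumption: codon 1 is the first codon of the annotated coding sequence.
    """
    if first_codon < 1 or last_codon < first_codon:
        raise ValueError("codon range must satisfy 1 <= first <= last")
    start_nt = 3 * (first_codon - 1) + 1  # first base of the first codon
    end_nt = 3 * last_codon               # last base of the last codon
    return start_nt, end_nt


# Residues E71 through L542, as stated for the PGN_1234 deletion.
start, end = codon_span_to_nt(71, 542)
deleted_nt = end - start + 1
print(f"ORF nucleotides {start}-{end} ({deleted_nt} nt, {deleted_nt // 3} codons)")
```

Under these assumptions the removed segment spans 472 codons. The actual junctions in pKD1401 are set by the primers listed in Table S2, which this calculation does not reproduce.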
Interspecies vimA complementation. A complementation plasmid that replaces the P. gingivalis C-terminal region of mfa1 and N-terminal region of mfa2 genes with the vimA gene (BFO_2568) of Tannerella forsythia was constructed. First, the N-terminal coding region of the mfa1 gene (mfa1N) was amplified by PCR using the primer pair mfa1-F and mfa1-R and P. gingivalis ATCC 33277 genomic DNA as the template. The amplicon was cloned into pUC118, producing pUC118-mfa1N, and was confirmed by DNA sequencing. The C-terminal coding region of the mfa2 gene (mfa2C) was amplified by PCR using the primer pair mfa2-F and mfa2-R and P. gingivalis ATCC 33277 genomic DNA as the template. The amplicon was cloned into pUC118, and the orientation with the primer BamHI site closest to the vector SphI site was confirmed by DNA sequencing, producing pUC118-mfa2C. The pUC118-mfa2C was digested by SacI and self-ligated to eliminate the pUC118 BamHI site, producing pUC118-mfa2C_2. The SphI-BamHI-cleaved mfa1N DNA fragment from pUC118-mfa1N was then ligated into the SphI-BamHI sites of pUC118-mfa2C_2, producing pUC118-mfa1N-mfa2C. The BamHI-PstI DNA fragment encoding the ermF gene from pAL30 was inserted into the BamHI-PstI sites of pUC118-mfa1N-mfa2C, producing pmfa1N-ermF-mfa2C. The PCR fragment containing the P. gulae catalase promoter fused to the T. forsythia vimA coding sequence (BFO_2568) was generated as follows. The promoter region of the P. gulae catalase gene (p) from pKD954 (28) was amplified by PCR with the primers Pcat-F and Pcat-R. The T. forsythia vimA gene was amplified by PCR with the primers TfvimA-F and TfvimA-R and genomic DNA from T. forsythia ATCC 43037. The promoter and vimA amplicons (with a 16-nucleotide overlap between them) were purified and used as the PCR template with the primer pair Pcat-F and TfvimA-R, producing the p-vimATf+ gene fusion.
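The fusion of two amplicons through a shared overlap can be modeled in silico to check that a designed junction is in frame and free of duplicated bases. The following is a minimal sketch, not the procedure used in this study. It considers only the top strand, and the sequences shown are hypothetical placeholders rather than the P. gulae promoter or T. forsythia vimA sequences. Only the 16-nucleotide overlap length is taken from the text.

```python
def fuse_by_overlap(upstream: str, downstream: str, overlap_len: int) -> str:
    """Join two top-strand amplicon sequences that share a terminal overlap.

    The last `overlap_len` bases of `upstream` must equal the first
    `overlap_len` bases of `downstream`; the overlap is kept once in the product.
    """
    up, down = upstream.upper(), downstream.upper()
    if overlap_len <= 0 or overlap_len > min(len(up), len(down)):
        raise ValueError("overlap length must fit within both amplicons")
    if up[-overlap_len:] != down[:overlap_len]:
        raise ValueError("amplicons do not share the expected overlap")
    return up + down[overlap_len:]


# Hypothetical placeholder sequences (NOT the real promoter or vimA sequences).
OVERLAP = "AGGAGGTAACATATGA"                      # 16 nt, as stated in the text
promoter_amplicon = "TTGACATTTATCCCTT" + OVERLAP  # stands in for the promoter product
vima_amplicon = OVERLAP + "CCGGTTAACGGTAAA"       # stands in for the vimA product

fusion = fuse_by_overlap(promoter_amplicon, vima_amplicon, overlap_len=16)
print(len(promoter_amplicon), len(vima_amplicon), len(fusion))
```

In this model the fusion product is shorter than the sum of the two amplicons by exactly the overlap length, which is a quick sanity check when designing the inner primers.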
This amplicon was cloned into pUC118, producing pUC118-p-vimATf+, and confirmed by DNA sequencing. The p-vimATf+-containing KpnI-NotI DNA fragment from pUC118-p-vimATf+ was ligated into KpnI-NotI-cleaved pmfa1N-ermF-mfa2C, producing the vimATf+ complementation plasmid, pvimATf+ (pUC118-mfa1N-ermF-p-vimATf+-mfa2C). This plasmid was linearized with SphI, purified, and introduced into the P. gingivalis ATCC 33277 vimA mutant, KDP202 (vimA::tetQ) (53) by electroporation. The complemented recombinant (KDP1103, ATCC 33277 vimA::tetQ, mfa1N mfa2C::ermF-