Revised Model for the Type A Glycan Biosynthetic Pathway in Clostridioides difficile Strain 630Δerm Based on Quantitative Proteomics of cd0241–cd0244 Mutant Strains

The bacterial flagellum is involved in a variety of processes including motility, adherence, and immunomodulation. In the Clostridioides difficile strain 630Δerm, the main filamentous component, FliC, is post-translationally modified with an O-linked Type A glycan structure. This modification is essential for flagellar function, since motility is seriously impaired in gene mutants with improper biosynthesis of the Type A glycan. The cd0240–cd0244 gene cluster encodes the Type A biosynthetic proteins, but the role of each gene, and the corresponding enzymatic activity, have not been fully elucidated. Using quantitative mass spectrometry-based proteomics analyses, we determined the relative abundance of the observed glycan variations of the Type A structure in cd0241, cd0242, cd0243, and cd0244 mutant strains. Our data not only confirm the importance of CD0241, CD0242, and CD0243 but, in contrast to previous data, also show that CD0244 is essential for the biosynthesis of the Type A modification. Combined with additional bioinformatic analyses, we propose a revised model for Type A glycan biosynthesis.

Many bacteria are flagellated; i.e., they have at least one flagellum. Rotation of the flagellar filament allows directed motility toward beneficial conditions (e.g., nutrient-rich) and away from noxious environments. 1,2 In addition, flagella mediate processes like adherence 3 and immunomodulation. 4 The flagellar filament is composed of repeating units of flagellin C (FliC). 5,6 FliC O-glycosylation is essential for flagellar assembly and/or function in many species, e.g., Helicobacter pylori and Campylobacter jejuni. 7,8 Often, the glycan structures are unique and dependent on biosynthetic pathways with unusual enzyme activities. 9,10
In the major human gut pathogen Clostridioides difficile, FliC is also modified with glycan structures. In C. difficile, FliC glycosylation is pivotal for flagellar function because motility is seriously impaired in gene mutants with improper biosynthesis of the flagellar glycan. 11,12 So far, two different strain-dependent glycan structures have been described, Type A and Type B, 11,13 which only have in common the core monosaccharide that is O-linked to multiple serine and threonine residues of FliC. The Type A glycan, which is found in the C. difficile strain 630Δerm, consists of an O-linked N-acetylglucosamine (GlcNAc) that is linked to N-methyl-L-threonine through a phosphodiester bond (Figure 1A). This structure was fully characterized by a combination of mass spectrometry (MS) 14 and nuclear magnetic resonance spectroscopy (NMR). 11
In C. difficile 630Δerm, a cluster of five genes (encoding CD0240−CD0244, Figure 1B) is linked to the biosynthesis of the Type A glycan. 11,14 This cluster is found downstream of the fliC gene (cd0239) as part of the larger flagellar gene cluster. CD0240 is a glycosyltransferase, and disruption of this gene led to non-glycosylated FliC. 14 The role of the other genes within the cluster is less clear, but one study looked at alterations in the Type A glycan structure in mutants with insertions in individual genes using MS analyses of FliC glycopeptides from purified flagella. 11 In two of the mutants (cd0241::CT and cd0242::CT), flagellin was modified with only the core GlcNAc (i.e., lacking the N-methyl-phosphothreonine moiety). In contrast, in the cd0243::CT mutant strain, the Type A glycan structures lacked the N-methyl group on the threonine (only GlcNAc modifications were also observed), which was in line with the putative methyltransferase activity of CD0243 (Figure 1B). Surprisingly, no clear alterations in the Type A glycan structure were observed in the cd0244::CT strain (a mix of the full Type A glycan and GlcNAc on FliC was found), suggesting that CD0244 is redundant for Type A glycan biosynthesis. However, in the same study, it was observed that the bacterial motility in the cd0244::CT strain was highly impaired. The reason for this apparent inconsistency has hitherto remained elusive. Nonetheless, a model for the biosynthesis of the Type A glycan structure in C. difficile was proposed, 11 in which no role for CD0244 was defined.
Interestingly, in addition to C. difficile (a Gram-positive bacterium), a Type A-like glycan is also found in the Gram-negative bacterium Pseudomonas aeruginosa, for example, in the reference strain PAO1. In this structure, the monosaccharide is a deoxyhexose which is linked to an unknown moiety through a phosphodiester bond. 15 The similarity between the structures is also apparent from the gene cluster observed in P. aeruginosa (Supplemental Figure S1A). However, this cluster consists only of four genes (pa1088−pa1091, homologs of cd0240−cd0243) and lacks a gene similar to C. difficile cd0244. 15 This supported the absence of an essential role for CD0244 in the model for the Type A glycan biosynthetic pathway in C. difficile, as described above. However, bioinformatic analyses show that pa1091 (fgtA) encodes a protein with both putative glycosyltransferase activity (similar to CD0240) and phosphotransferase activity (similar to CD0244). When mapping the predicted structures of CD0240 and CD0244 to the predicted structure of PA1091 (FgtA), the enzymatic domains of these proteins align with the predicted glycosyltransferase and phosphotransferase domains of PA1091, respectively (Supplemental Figure S1B,C). Hence, this also challenges the current model for the Type A glycan biosynthesis in C. difficile and led us to reinvestigate the alterations of the Type A glycan on FliC in the individual C. difficile mutant strains. In contrast to the previous study that used qualitative analyses of FliC glycopeptides from purified flagella, we used an overall quantitative MS-based proteomics approach. Importantly, and in contrast to the previous data, we show that CD0244 is essential for full Type A glycan biosynthesis in C. difficile. Based on our data, we propose a revised model for the Type A glycan biosynthesis, providing testable hypotheses on the activity of individual enzymes encoded in the gene cluster.

Relative Abundance of CD0241−CD0244 in C. difficile Wild-Type, Mutant, and Complemented Strains. To determine the relative abundance of the Type A biosynthetic proteins, we performed an MS-based quantitative proteomics analysis of the C. difficile strains from the previous study as listed in Table 1 (i.e., wild-type (WT), cd0241::CT, cd0242::CT, cd0243::CT, cd0244::CT, cd0241::CT comp., cd0242::CT comp., cd0244::CT comp., in duplicate) using TMTpro 16plex labeling (no complemented strain for cd0243::CT was available). 11 Overall, 2187 C. difficile proteins with at least two peptides were identified (Supplemental Table S1). To the best of our knowledge, this represents one of the most in-depth proteomics analyses of C. difficile cells. Given the aim of our study, we focused on the proteins encoded by the genes in the Type A glycan biosynthesis cluster (CD0240−CD0244), and all of them were readily identified with a high number of peptides. The data clearly showed that the levels of CD0241, CD0242, and CD0244 in the respective complemented strains were much higher than in the WT strain (Supplemental Figure S2), likely as a result of the plasmid-mediated expression under the control of a constitutive promoter from the fdx gene of Clostridium pasteurianum. 21 Unexpectedly, the relative protein abundance of CD0241 and especially CD0244 in the respective insertion mutants did not reflect a knockout phenotype (Supplemental Figure S2 and Table S1). To rule out any unexpected issues with the strains, we performed whole genome sequencing of all C. difficile 630Δerm strains in Table 1, which confirmed that the strains were isogenic and that the ClosTron insertions were as reported previously 11 (data not shown). We argue that the seemingly high levels of CD0241 and especially CD0244 in their knockout strains result from the unusually high expression of these proteins in their respective complemented strains, thereby compromising the correct quantification of these proteins using TMT labels (i.e., the levels of reporter ions are outside the dynamic range for accurate TMT quantification). 22 This was supported by the data for CD0243 in the cd0243::CT strain, for which no complemented strain was available, and which reflected a knockout phenotype (Supplemental Figure S2).
To increase the accuracy of the quantification of the proteins involved in Type A biosynthesis, a second quantitative proteomics analysis was performed in which the complemented strains were excluded. The data from this experiment confirmed the knockout phenotype of the individual insertion mutants (Figure 2). The minor residual signals can be explained by either co-isolation or impurities in the TMTpro labels; i.e., each label contains a small percentage of different isotopologues (TMT Reporter Ion Isotope Distributions for TMTpro 16plex batch WK334339, Thermo Fisher Scientific). However, Figure 2 also clearly shows an effect of the gene disruption by ClosTron mutagenesis 23 on the downstream genes. For example, a ClosTron insertion in cd0242 influenced protein expression from the downstream cd0243 and cd0244 genes yet did not influence the upstream cd0241 gene to a similar extent. Obviously, these downstream polar effects could not be restored by complementation (Supplemental Figure S2). It is unsurprising that the ClosTron insertion caused the polar effects, given the fact that cd0241−cd0244 are part of a single operon in which transcription occurs from cd0241 throughout the rest of the genes. Since cd0244 is the last gene of the operon, this knockout did not show disruptive polar effects on the upstream genes in the operon (Figure 2).
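The ratio calculation behind a plot such as Figure 2 (average abundance per strain, expressed relative to the WT) can be sketched as follows. This is a minimal illustration only: the function name `relative_levels` and all abundance values are hypothetical and are not taken from the data set.

```python
from statistics import mean

def relative_levels(abundances, reference="WT"):
    """Return each strain's mean abundance divided by the reference strain's mean."""
    ref_mean = mean(abundances[reference])
    if ref_mean == 0:
        raise ValueError("reference abundance is zero")
    return {strain: mean(values) / ref_mean for strain, values in abundances.items()}

# Hypothetical reporter-ion abundances for one protein, three replicates per strain.
example = {
    "WT": [100.0, 110.0, 90.0],
    "cd0242::CT": [4.0, 5.0, 3.0],
}
ratios = relative_levels(example)
print(ratios)
```

A small but nonzero ratio in a knockout strain, as in this made-up example, is the kind of residual signal that co-isolation or isotopic impurities of the labels could produce.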

Alterations of the Type A Glycan in the cd0241−cd0244 Mutant Strains.
To study the role of the individual genes in the cd0241−cd0244 cluster on the Type A glycan biosynthesis, we explored our data from the TMTpro 16plex experiment, including all strains, for the presence of Type A glycan-modified tryptic peptides from C. difficile FliC (UniProt ID: Q18CX7). We focused on four different tryptic peptides of FliC that were modified with a Type A glycan structure (LLDGTSSTIR, aa 135−144; AGGTTGTDAAK, aa 191−201; TMVSSLDAALK, aa 202−212; LQVGASYGTNVSGTSNNNNEIK, aa 145−166). For each of these peptides, we concentrated on three scenarios, i.e., modification with the full Type A modification, a GlcNAc, or a Type A lacking the methyl group. MS/MS spectra corresponding to these peptides were observed in the proteomics data described above (Supplemental Table S1). However, to provide the best quantitative information, we performed additional targeted HCD MS/MS analyses of these peptides, which allowed us to sum the intensities of the TMT signals over the full peak, instead of using the TMT signals from a single MS/MS scan. In addition, this generated good-quality fragmentation spectra of our peptides of interest and their (altered) respective Type A structures.
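Summing reporter-ion signals over all MS/MS scans of one chromatographic peak, rather than reading them from a single scan, can be sketched as below. The scan dictionaries, channel names, and intensities are hypothetical; the code illustrates the idea and is not the processing pipeline used in this study.

```python
from collections import defaultdict

def sum_reporters(scans):
    """Sum reporter-ion intensities per TMT channel across all scans of one peak."""
    totals = defaultdict(float)
    for scan in scans:
        for channel, intensity in scan.items():
            totals[channel] += intensity
    return dict(totals)

def channel_fractions(totals):
    """Express each channel as a fraction of the summed signal."""
    grand_total = sum(totals.values())
    return {channel: value / grand_total for channel, value in totals.items()}

# Hypothetical targeted MS/MS scans acquired across one elution peak.
scans = [
    {"126": 10.0, "127N": 1.0},
    {"126": 30.0, "127N": 2.0},
    {"126": 20.0, "127N": 3.0},
]
totals = sum_reporters(scans)
fractions = channel_fractions(totals)
print(totals, fractions)
```

Summing before taking ratios weights each scan by its signal, so low-intensity scans at the peak edges contribute little noise to the final ratio.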
The MS/MS spectrum of Type A-modified tryptic peptide LLDGTSSTIR is shown in Figure 3A. In this spectrum, Type A glycan-specific fragments at m/z 214.048 (N-methylthreonine-phosphate, [M+H]+) and 284.053 (phospho-GlcNAc, [C8H15NO8P]+) are apparent. Moreover, the major peptide fragments have lost the Type A glycan modification. For the other three peptides containing a full Type A modification, similar fragmentation characteristics were observed (Supplemental Figures S3−S5).
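The two diagnostic m/z values can be reproduced from elemental compositions, as in the sketch below. The composition C8H15NO8P is given in the text; the neutral composition C5H12NO6P for N-methylthreonine-phosphate is an assumption derived here from the structure in Figure 1A. The helper names are illustrative.

```python
import re

# Monoisotopic masses (Da) of the elements needed here.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "P": 30.9737620}
ELECTRON = 0.00054858
PROTON = 1.00727647

def mono_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C8H15NO8P'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * (int(count) if count else 1)
    return total

# Phospho-GlcNAc: the formula already describes the cation, so remove one electron.
phospho_glcnac_mz = mono_mass("C8H15NO8P") - ELECTRON

# N-methylthreonine-phosphate: assumed neutral formula, observed as [M+H]+.
nme_thr_phosphate_mz = mono_mass("C5H12NO6P") + PROTON

print(round(phospho_glcnac_mz, 3), round(nme_thr_phosphate_mz, 3))
```

Both computed values agree with the reported fragments at m/z 284.053 and 214.048 to within 0.001.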
The MS/MS spectrum of the tryptic peptide LLDGTSSTIR modified with a single GlcNAc is shown in Figure 3B, and Supplemental Figures S3−S5 show the spectra for the other peptides. The MS/MS spectra of these species more clearly showed the GlcNAc oxonium ions, e.g., at m/z 204.087, as compared to Type A glycan-modified peptides (Figure 3A and Supplemental Figures S3−S5). The ratio of the oxonium ions at m/z 138.055 and 144.066 is consistent with a GlcNAc. 24,25 Of note, a signal at m/z 126.055 was observed that corresponds to a GlcNAc oxonium ion, which is distinct from the 126C TMT reporter ion (m/z 126.128).
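How easily the m/z 126.055 oxonium ion can be told apart from the reporter ion at m/z 126.128 can be estimated with a short calculation. The ion compositions below are the commonly cited HexNAc oxonium ion formulas and are an assumption here, since the text lists only the m/z values; the resolving power is estimated with the simple m/Δm definition.

```python
# Monoisotopic masses (Da); cation m/z = summed masses minus one electron.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858

# Assumed compositions of the singly charged HexNAc oxonium ions.
OXONIUM = {
    "204": {"C": 8, "H": 14, "N": 1, "O": 5},
    "144": {"C": 6, "H": 10, "N": 1, "O": 3},
    "138": {"C": 7, "H": 8, "N": 1, "O": 2},
    "126": {"C": 6, "H": 8, "N": 1, "O": 2},
}

def cation_mz(composition):
    """m/z of a singly charged cation with the given elemental composition."""
    return sum(MONO[el] * n for el, n in composition.items()) - ELECTRON

oxonium_mz = {name: cation_mz(comp) for name, comp in OXONIUM.items()}

REPORTER_126 = 126.128  # reporter ion m/z as quoted in the text
delta = abs(REPORTER_126 - oxonium_mz["126"])
required_resolving_power = ((REPORTER_126 + oxonium_mz["126"]) / 2) / delta

print({k: round(v, 3) for k, v in oxonium_mz.items()})
print(round(required_resolving_power))
```

The required resolving power comes out on the order of 2,000, far below the Orbitrap settings listed in the Methods, so the two signals should be cleanly separated.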
Interestingly, when the N-methyl group on the threonine of the Type A structure is absent, our experimental setup allows for TMT labeling of the resulting free amine group. Indeed, such FliC tryptic peptides containing the Type A glycan lacking the methyl group but with an additional TMT label were observed (Figure 3C).
Next, we determined the relative abundance of the differently modified FliC peptides in each of the strains from Table 1. In Figure 4, the TMT signals from the MS/MS spectra of these modified FliC tryptic peptides are depicted. In line with what was shown previously, 11 the Type A glycan-modified FliC peptides were absent in the cd0241::CT, cd0242::CT, and cd0243::CT strains. However, in contrast to what was shown previously, Type A glycan-modified peptides were also absent in the cd0244::CT strain (Figure 4). As described above, the minor TMT signals observed for cd0241::CT and cd0244::CT in Figure 4 can be explained by impurities in the TMT labels. As expected, in addition to the WT strain, Type A glycan-modified peptides were also detected in the complemented strains, although the level of complementation varied per strain and peptide.
In line with previous data, 11 FliC tryptic peptides with a single GlcNAc were observed in the cd0241::CT and cd0242::CT strains (Figure 4). Importantly, we clearly show that also in the cd0244::CT strain FliC tryptic peptides with a single GlcNAc are highly abundant, again demonstrating that the modification of FliC in this strain is radically different from that in the WT strain. FliC peptides with only GlcNAc moieties were also detected in the cd0243::CT strain, which is in line with what has been observed before. 11 We propose that this is due to the polar effects on cd0244 expression in the cd0243::CT strain (Figure 2).
As expected, peptides containing the Type A modification lacking the methyl group but with an additional TMT label were predominantly observed in the cd0243::CT strain (Figure 4). However, TMT reporter ions that could not be explained by impurities in the TMT labels were also observed in the cd0241::CT comp. and cd0242::CT comp. strains. We find it likely that this reflects a decreased efficiency in Type A biosynthesis caused by the polar effects of the ClosTron insertion on cd0243 in the cd0241::CT and cd0242::CT strains (Figure 2).
Overall, our new data not only confirm the importance of CD0241−CD0243 for full Type A glycan biosynthesis in C. difficile but also demonstrate that CD0244 is pivotal for full Type A glycan biosynthesis. In the cd0244::CT strain, the loss of the Type A glycan structure coincides with the appearance of peptides displaying only a GlcNAc.
Revised Model for the Type A Glycan Biosynthetic Pathway in C. difficile. Our results are not compatible with the current model for Type A glycan biosynthesis, which did not include a role for CD0244. In the previous model, it was proposed that CD0241 catalyzes the addition of phosphate to threonine, followed by CD0242 mediating the transfer of the phosphothreonine to the GlcNAc. 11 Finally, CD0243 catalyzes the N-methylation of the threonine, although it is unclear during which step this occurs. In addition to the lack of a role for CD0244, the previous model also did not predict how the phosphothreonine is activated as a biosynthetic intermediate that can act as a donor. Hence, the above information prompted us to formulate new hypotheses about the activities of the different enzymes in this important biosynthetic pathway.
Bioinformatic analyses show that CD0242 belongs to the family of nucleotidyl transferases, which transfer a nucleoside monophosphate moiety to an accepting molecule. For example, a Phyre2 homology search models 97% of the sequence with 99.8% confidence to GDP-mannose pyrophosphorylase (a nucleotidyl transferase) from Leishmania donovani (PDB: 7whs, 21% i.d.). Indeed, the C. difficile reference genome (strain 630) from UniProt (Taxon ID: 272563) annotates CD0242 as a nucleoside triphosphate transferase (Figure 1, ID: Q18CY2). One of the proteins that is similar to CD0242, and was mentioned in the previous study, 11 is CTP:phosphocholine cytidylyltransferase. This cytidylyltransferase is a key enzyme in the route of phosphatidylcholine synthesis referred to as the Kennedy pathway. 26 Based on this amino acid similarity, we hypothesize that CD0242 is a CTP:phosphothreonine cytidylyltransferase that transfers CMP to phosphothreonine. The end product of the reaction is expected to be CDP-L-threonine.
CD0244, for which no role has previously been predicted, shows similarity to the CDP-glycerol:poly(glycerophosphate) glycerophosphotransferase TagF from Staphylococcus epidermidis (Phyre2 models 77% of the sequence with 100% confidence, PDB: 3l7m, 16% i.d.), which has enzymatic activity similar to that of the phosphotransferase in the Kennedy pathway. 26 In line with this and our new data for the cd0244::CT strain, we hypothesize that CD0244 is a CDP-threonine:GlcNAc threoninephosphotransferase that transfers the phosphothreonine moiety from CDP-L-threonine to the core GlcNAc on FliC.
The most challenging prediction is the role of C. difficile CD0241. In the previous model, it was thought to be involved in the synthesis of phosphothreonine. However, bioinformatic analyses showed the homology of CD0241 with a phosphoserine phosphatase (PSP), not a kinase. A Phyre2 homology search models 96% of the sequence with 100% confidence to the PSP from Methanocaldococcus jannaschii (PDB: 1j97, 29% i.d.). Also, the counterpart of CD0241 in P. aeruginosa, PA1089, is predicted to exhibit similar activity. However, in that same organism, a different PSP-like enzyme is present that shows not only phosphatase activity but also phosphotransferase activity. 27 This enzyme, ThrH, is a phosphoserine:homoserine phosphotransferase. Interestingly, both CD0241 and PA1089 are ThrH homologs and are predicted to adopt a fold similar to that of ThrH (Supplemental Figure S6), while many other similar PSP-like proteins display more differences in size and/or fold. In addition, homoserine is an isomer of threonine, indicating that there are only minor differences in substrates. Based on the above, we hypothesize that CD0241 (and PA1089) possesses a phosphoserine:threonine phosphotransferase activity.
Based on our proteomics data and bioinformatic analyses, we propose a revised model for the Type A glycan biosynthesis on FliC, as shown in Figure 5. Here, CD0241 transfers the phosphate group from a phosphoserine to a threonine, forming phosphothreonine. Next, CD0242 transfers the phosphothreonine to CTP, thereby forming CDP-threonine, while releasing an inorganic pyrophosphate (PPi). Then, CD0244 transfers the phosphothreonine to the GlcNAc moiety on FliC, which is attached by the glycosyltransferase CD0240, and this causes the release of CMP. At an unknown point during these steps, CD0243 mediates the N-methylation of threonine to form the complete Type A glycan modification. A likely donor is S-adenosyl methionine, which is converted to S-adenosyl homocysteine when donating its methyl group.
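The revised model can be written down as a small lookup that predicts the FliC modification expected when a given enzyme is missing. This is a deliberately simplified sketch: the function name and the returned labels are illustrative, each enzyme is treated as either fully present or fully absent, and the polar effects of the ClosTron insertions described above are ignored.

```python
# Steps of the proposed pathway: (enzyme, substrates, products).
PATHWAY = [
    ("CD0241", "phosphoserine + threonine", "phosphothreonine"),
    ("CD0242", "phosphothreonine + CTP", "CDP-threonine + PPi"),
    ("CD0244", "CDP-threonine + GlcNAc-FliC", "phosphothreonine-GlcNAc-FliC + CMP"),
]

def predicted_glycan(missing):
    """Predict the FliC modification when the enzymes in `missing` are absent."""
    missing = set(missing)
    if "CD0240" in missing:
        return "none"  # no core GlcNAc is attached to FliC
    if missing & {enzyme for enzyme, _, _ in PATHWAY}:
        return "GlcNAc"  # phosphothreonine cannot be made, activated, or transferred
    if "CD0243" in missing:
        return "Type A minus methyl"  # threonine is attached but not N-methylated
    return "Type A"

for enzyme in ["CD0240", "CD0241", "CD0242", "CD0243", "CD0244"]:
    print(enzyme, "->", predicted_glycan([enzyme]))
```

Under these assumptions, loss of CD0241, CD0242, or CD0244 gives GlcNAc-only FliC, which matches the peptides observed in the corresponding mutants; the GlcNAc-only peptides seen in the cd0243::CT strain would require adding the polar effect on cd0244 to the sketch.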

■ DISCUSSION
Flagella and their roles in motility, adherence, and other host−pathogen interactions vary across different C. difficile lineages. −30 However, flagellated strains display an increased fitness in vivo 30 and greater cecal adherence 30 and induce a more intense inflammation 31 than their non-flagellated counterparts. On the other hand, strains that are impaired in FliC production, the major component of the flagellar filament, have been shown to produce more exotoxins and are more virulent. 29,30
Previous studies have shown that post-translational modification of FliC is important for flagellar function. 7,8 Also in C. difficile 630Δerm, disruption of several genes involved in the biosynthesis of the Type A glycan structure that is present on FliC, i.e., cd0241, cd0242, and cd0244, resulted in impaired motility. 11 Moreover, a strain that was only able to modify FliC with the core GlcNAc moiety of the Type A glycan (cd0241::CT) showed attenuated initial colonization and recurrence in mice. 11 However, the proposed model for the biosynthesis of the Type A glycan defined no role for CD0244 and lacked detailed prediction on enzymatic activities and biosynthetic intermediates. 11
Our results demonstrate a clear role for CD0244 in the biosynthesis of the Type A glycan. In the cd0244::CT strain, the loss of Type A coincided with the appearance of the core GlcNAc of the Type A structure. Previously, a mixed population of both structures was observed. 11 We currently have no explanation for the discrepancy between our results and the previously reported data, especially since we used the same set of strains. However, the quantitative nature of the current study, as compared to the earlier qualitative analyses, may partially explain this. Nevertheless, the current data would explain the apparent inconsistency in the earlier study between the impaired motility of the cd0244::CT strain and the absence of clear alterations in its Type A glycan structure.
Based on our bioinformatic analyses, we hypothesize that CD0242 mediates the synthesis of CDP-L-threonine, which would be a key biosynthetic intermediate of the Type A biosynthesis. To the best of our knowledge, CDP-L-threonine would be a previously undescribed cellular metabolite. However, several studies have shown the existence of amino acid residues linked to CDP in other prokaryotes, namely CDP-L-glutamine and CDP-L-serine. 32 Furthermore, we predict CD0244 to be a CDP-threonine:GlcNAc threonine phosphotransferase. However, CD0244 also shows a similarity to UDP-N-acetylglucosamine 2-epimerase. This enzyme catalyzes the reversible epimerization of UDP-GlcNAc into UDP-ManNAc, the activated donor of ManNAc. Yet, this function is not supported by the data. First of all, the Type A modification has been shown to contain a GlcNAc and not a ManNAc. 11 Second, the lack of CD0244 in the cd0244::CT strain does not prevent the glycosylation of FliC.
Recently, we showed that a phosphoproteomics workflow could be used to enrich Type A-modified peptides, 33 probably due to the phospho moiety of the Type A glycan. For the current study, such an approach was not suitable because we would lose the GlcNAc-modified peptides. In our previous phosphoproteomics data, we also observed a fraction of FliC tryptic peptides that were modified with a phospho-GlcNAc. 34 However, we have not observed the accumulation of such peptides in any of our knockout strains. Therefore, we find it likely that these peptides were the result of breakdown processes. This is supported by the fact that phospho-GlcNAc peptides could be identified in our database searches, but they all co-eluted with the full Type A-modified peptides, indicating in-source decay of the Type A peptides (data not shown). Moreover, these species originated only from the WT and the complemented strains that produce the full Type A glycan, further supporting the idea that the phospho-GlcNAc moiety is not an intermediate in the biosynthesis of the Type A glycan.
Figure 5. First, CD0241 transfers the phosphate group from a phosphoserine to a threonine, forming phosphothreonine. Next, CD0242 transfers the phosphothreonine to CTP, thereby forming CDP-threonine, while releasing inorganic pyrophosphate. Then, CD0244 transfers the phosphothreonine to the GlcNAc moiety on FliC, which is attached by the glycosyltransferase CD0240, and this causes the release of CMP. The GlcNAc moiety on FliC is most likely donated by a nucleoside-diphosphate-GlcNAc (NDP-GlcNAc). At an unknown point during these steps, CD0243 mediates the N-methylation of threonine to form the complete Type A glycan modification. A likely donor is S-adenosyl methionine, which is converted to S-adenosyl homocysteine when donating its methyl group.
Disruption of any of the genes in the cd0241−cd0244 cluster using the ClosTron method completely prevents the formation of the Type A glycan. Although the levels of CD0241, CD0242, and CD0244 are unnaturally high in their respective complemented strains, this overexpression did not restore the levels of Type A-containing peptides to the WT level. For the cd0241::CT complemented and cd0242::CT complemented strains, we argue that this is due to the polar effects on the downstream genes caused by the gene insertions, which appeared to be quite strong. Nonetheless, the fact that partial complementation was possible shows that, despite the strong polar effects, active enzymes from the affected genes are still present. For the cd0244::CT strain, no disruptive polar effects on the upstream genes in the cluster were observed, which was also apparent from the lack of peptides with a single GlcNAc in the cd0244::CT complemented strain. Still, we found lower levels of Type A-modified peptides in the cd0244::CT complemented strain as compared to the WT strain. This, however, might be explained by low levels of FliC itself in the cd0244::CT complemented strain (Supplemental Table S1). FliC levels in the cd0244::CT complemented strain appeared to be around 6 times lower compared to the WT, and if we corrected for these differences in FliC levels, the levels of Type A-modified peptides in the cd0244::CT complemented strain would approach the WT levels. The nature of the lower levels of FliC in this strain remains unclear. Possibly, a feedback loop is present that responds to the overexpression of cd0244 in the complemented strain.
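The correction for FliC levels mentioned above amounts to dividing the glycopeptide ratio by the ratio of the carrier protein. The sketch below uses a hypothetical glycopeptide ratio of 0.15 together with the roughly 6-fold lower FliC level stated in the text; the function name and the 0.15 value are assumptions for illustration only.

```python
def normalize_to_protein(peptide_ratio, protein_ratio):
    """Glycopeptide level relative to WT, corrected for the level of the carrier protein."""
    if protein_ratio <= 0:
        raise ValueError("protein ratio must be positive")
    return peptide_ratio / protein_ratio

# Hypothetical: Type A peptide signal at 15% of WT, FliC at about one sixth of WT.
corrected = normalize_to_protein(peptide_ratio=0.15, protein_ratio=1 / 6)
print(round(corrected, 2))
```

With these made-up inputs the corrected value lies close to 1, which is the sense in which the Type A levels "would approach the WT levels" after normalization.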

■ CONCLUSION
Based on quantitative proteomics and bioinformatic analyses, we propose a revised model for the biosynthesis of the Type A glycan modification on FliC in C. difficile and predict enzymatic activities for each of the involved proteins. Further experiments using these enzymes should shed more light on their activities. Our findings and model for post-translational glycan modification of flagellin in C. difficile will be relevant to the similar locus in P. aeruginosa PAO1 and other bacterial species with similar flagellin modifications.
■ METHODS
Bacterial Strains and Growth Conditions. The C. difficile strains used in this study are listed in Table 1 11 and were cultured at 37 °C in a Don Whitley A55 HEPA anaerobic workstation. The cells were grown in brain heart infusion (BHI, Oxoid) broth supplemented with 5 g/L yeast extract (BHIY) or on BHIY agar plates. When appropriate, 15 μg/mL of thiamphenicol was added.
Sample Preparation for the Quantitative Proteomics Analysis of C. difficile Strains. Single colonies of C. difficile were picked and precultured for 24 h in 5 mL of prereduced BHIY. Next, the precultures were used to inoculate 5 mL of prereduced BHIY broth at a starting OD600 of 0.05, and cells were grown for 16 h. Then, cells were pelleted by centrifugation (3220g, 20 min, 4 °C). Pellets were resuspended in 1 mL of ice-cold PBS and washed three times (8000g, 5 min, 4 °C). After the last wash, pellets were resuspended in 1 mL of ST lysis buffer (5% SDS, 0.1 M Tris-HCl pH 7.5). Tubes were incubated for 20 min on ice, and cells were subsequently lysed by sonication (five bursts of 10 s, with cooling on ice between bursts). After lysis, tubes were centrifuged (15 min, 15000g, RT). Supernatants were transferred to new tubes and stored at −20 °C until further use.
For each strain, 100 μg of protein in 100 μL of ST buffer was used as the starting material. Proteins were reduced using 5 mM TCEP for 30 min, alkylated with 10 mM iodoacetamide for 30 min, and quenched with 10 mM DTT for 15 min, all at room temperature. Proteins were precipitated by chloroform−methanol precipitation. For this, 400 μL methanol, 100 μL chloroform, and 300 μL dH2O were added with vortexing in between each step. Following centrifugation (21130g, 2 min, RT), the pellet was washed two times with 500 μL methanol. The protein pellet was subsequently resuspended in 100 μL of 40 mM HEPES pH 8.4 containing 4 μg trypsin and incubated overnight at 37 °C. Again, 4 μg trypsin was added and incubated for 3 h.
LC-MS/MS Analysis. TMT-labeled peptides were dissolved in 0.1% formic acid and subsequently analyzed by online C18 nano-HPLC MS/MS with a system consisting of an Easy nLC 1200 gradient HPLC system (Thermo, Bremen, Germany) and an Orbitrap Fusion LUMOS mass spectrometer (Thermo). Fractions were injected onto a homemade precolumn (100 μm × 15 mm; Reprosil-Pur C18-AQ 3 μm, Dr Maisch, Ammerbuch, Germany) and eluted via a homemade analytical nano-HPLC column (30 cm × 75 μm; Reprosil-Pur C18-AQ 1.9 μm). The analytical column temperature was maintained at 50 °C with a PRSO-V2 column oven (Sonation, Biberach, Germany). The gradient was run from 2% to 40% solvent B (20/80/0.1 water/acetonitrile/formic acid (FA) v/v) in 120 min. The nano-HPLC column was drawn to a tip of ∼5 μm and acted as the electrospray needle of the MS source. The LUMOS mass spectrometer (Thermo) was set to use the MultiNotch MS3-based TMT method. 16 The MS spectrum was recorded in the Orbitrap (resolution of 120,000; m/z range of 400−1500; automatic gain control (AGC) target set to 50%; maximum injection time of 50 ms). Dynamic exclusion was applied after n = 1 with an exclusion duration of 45 s and a mass tolerance of 10 ppm. Charge states 2−5 were included. Precursors for MS2/MS3 analysis were selected using "TopSpeed" with a cycle time of 3 s. MS2 analysis consisted of collision-induced dissociation (quadrupole ion trap analysis; AGC was set to "standard"; normalized collision energy (NCE) 35; maximum injection time 50 ms). The isolation window for MS/MS was 1.2 Da. Following the acquisition of each MS2 spectrum, the MultiNotch MS3 spectrum was recorded using an isolation window for MS3 of 2 Da. Ten MS2 fragments were simultaneously selected for MS3 and fragmented by higher-energy collisional dissociation (HCD) at 65% at a custom AGC of 200% and analyzed using the Orbitrap from m/z 120 to 500 at a maximum injection time of 105 ms at a resolution of 60,000.
To obtain more accurate ratios for selected species, a separate targeted MS2 (tMS2) run was recorded for the following peptides and their selected m/z: LLDGTSSTIR with Type A, 588. ). tMS2 spectra were recorded with a precursor isolation width of 0.7 Da at an HCD collision energy of 36% at resolution 30,000 and an AGC target "standard". The maximum injection time was set to 54 ms. MS2 spectra of each selected species were summed.
LC-MS/MS Data Analysis. In a post-analysis process, raw data were converted to peak lists using Proteome Discoverer version 2.5.0.400 (Thermo Electron) and searched against the UniProt C. difficile 630Δerm database (3752 entries) (Taxon ID: 272563) using Mascot v. 2.2.07 (www.matrixscience.com) for peptide identification. Mascot searches were performed with 10 ppm and 0.5 Da deviation for precursor and fragment mass, respectively, and trypsin was selected as enzyme specificity with a maximum of two missed cleavages. The variable modifications included Type A (ST), Type A minus methyl plus TMT (ST), HexNAc (ST), Oxidation (M), and Acetyl (protein N-term). For the TMTpro 15plex search, phosphoHexNAc was also included. The static modifications included TMTpro (N-term, K) and Carbamidomethyl (C). Peptides with an FDR < 1% based on Percolator 17 were accepted. Quantification of peptides was performed on MS3 spectra with an SPS Mass Match threshold of 100%.
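The mass shifts implied by the glycan-related variable modifications, and a 10 ppm precursor tolerance check, can be sketched as follows. The elemental compositions are assumptions derived here from the structure in Figure 1A (they are not copied from the search configuration), and the TMTpro shift of 304.2071 Da is the commonly quoted monoisotopic value for that label.

```python
# Monoisotopic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "P": 30.9737620}
TMTPRO = 304.2071  # commonly quoted monoisotopic mass shift of the TMTpro label

def mass(composition):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[el] * n for el, n in composition.items())

HEXNAC = {"C": 8, "H": 13, "N": 1, "O": 5}            # GlcNAc residue
TYPE_A = {"C": 13, "H": 23, "N": 2, "O": 10, "P": 1}  # GlcNAc + phospho-N-methylthreonine
CH2 = {"C": 1, "H": 2}                                # methyl group replaced by hydrogen

deltas = {
    "HexNAc": mass(HEXNAC),
    "Type A": mass(TYPE_A),
    "Type A minus methyl plus TMT": mass(TYPE_A) - mass(CH2) + TMTPRO,
}

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def within_tolerance(observed, theoretical, tol_ppm=10.0):
    """True if the observed mass lies within the given ppm tolerance."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm

for name, value in deltas.items():
    print(f"{name}: {value:.4f} Da")
```

For a precursor of 1000 Da, a 10 ppm window corresponds to ±0.01 Da, so `within_tolerance(1000.005, 1000.0)` is accepted while `within_tolerance(1000.02, 1000.0)` is rejected.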
Whole Genome Sequencing. For identity confirmation, mutant strains were subjected to whole genome sequencing according to standard procedures. 18 In short, total genomic DNA was isolated from a single colony resuspended in PBS on a QiaSymphony platform (Qiagen). Purified DNA was sequenced on the Illumina NovaSeq 6000 platform with a read length of 150 bp in the paired-end mode. The resultant FASTQ files were used in a reference assembly against the C. difficile 630 reference genome (GenBank AM180355) in Geneious software (Biomatters Ltd.); ClosTron insertions were confirmed by visual identification of clusters of nucleotide polymorphisms and computational identification of high-quality single nucleotide polymorphisms using the Find Variations/SNPs algorithm in Geneious (minimum coverage 10, minimum variant frequency 0.8).
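The two thresholds quoted for variant calling (minimum coverage 10, minimum variant frequency 0.8) act as a simple filter on each candidate position. The sketch below mimics only these two thresholds with a hypothetical `Variant` record; it does not reproduce the Geneious algorithm, and the example positions and read counts are made up.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    position: int       # reference coordinate (hypothetical)
    coverage: int       # total reads covering the position
    variant_reads: int  # reads supporting the variant allele

def passes_filter(variant, min_coverage=10, min_frequency=0.8):
    """Keep a variant only if coverage and variant frequency meet both thresholds."""
    if variant.coverage < min_coverage:
        return False
    return variant.variant_reads / variant.coverage >= min_frequency

candidates = [
    Variant(position=1200, coverage=50, variant_reads=48),  # high coverage, high frequency
    Variant(position=3400, coverage=8, variant_reads=8),    # coverage too low
    Variant(position=5600, coverage=40, variant_reads=20),  # frequency too low
]
kept = [v.position for v in candidates if passes_filter(v)]
print(kept)
```

A high frequency threshold of this kind suits clonal bacterial isolates, where a true variant is expected in nearly all reads.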
Bioinformatic Analyses. To search for protein homologs and predict functions, the Phyre2 Web portal 19 and the InterPro Web site for classification of protein families (www.ebi.ac.uk/interpro/search/sequence) were used. Predicted protein structures were retrieved from the AlphaFold database (alphafold.ebi.ac.uk/) or modeled using the AlphaFold2 program.20 Analyses of protein structures were performed in PyMOL 2.5.5.

■ ASSOCIATED CONTENT
Data Availability Statement
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium 35 via the PRIDE 36 partner repository with the dataset identifier PXD045152.

Figure 1. Reported structure of the Type A glycan modification and the gene cluster responsible for its biosynthesis. (A) Structure of the FliC Type A glycan. The structure consists of an O-linked GlcNAc that is linked to N-methyl-L-threonine through a phosphodiester bond. (B) The gene cluster responsible for the Type A glycan modification and the functions of the protein products as annotated in the UniProt C. difficile 630Δerm reference proteome (Taxon ID: 272563).

Figure 2. Relative levels of the Type A biosynthetic proteins in mutants with ClosTron insertions in the individual genes. A quantitative proteomics experiment was performed using TMTpro 15plex labeling (each strain in triplicate) and analyzed using LC-MS/MS on an Orbitrap Fusion Lumos Tribrid mass spectrometer. The protein levels of the Type A biosynthetic proteins in each of the individual gene mutant strains relative to the WT are shown. Ratios are calculated based on the average absolute abundance of a protein from three replicates per strain.
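The ratio calculation described in the Figure 2 caption (mean abundance over three replicates per strain, expressed relative to the WT) can be sketched as follows. This is an illustrative Python example and not the authors' analysis script; the function name, the strain labels, and all abundance values are hypothetical.

```python
from statistics import mean


def ratios_to_wt(abundances: dict[str, list[float]],
                 wt_key: str = "WT") -> dict[str, float]:
    """Mean replicate abundance per strain divided by the WT mean."""
    wt_mean = mean(abundances[wt_key])
    return {strain: mean(values) / wt_mean
            for strain, values in abundances.items()}


# Hypothetical reporter-ion abundances for one protein, three replicates each.
example = {
    "WT": [100.0, 110.0, 90.0],
    "mutant": [10.0, 20.0, 30.0],
}
ratios = ratios_to_wt(example)
```

In this made-up example, the WT mean is 100 and the mutant mean is 20, giving a mutant/WT ratio of 0.2.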

Figure 3. Summed MS/MS spectra of the LLDGTSSTIR peptide displaying the Type A glycan and variants thereof. Targeted HCD MS/MS analysis of the TMTpro 16plex labeled strains was performed. MS/MS spectra were summed over the full peak corresponding to the LLDGTSSTIR peptides displaying the complete Type A (A), only the GlcNAc (B), or Type A minus the methyl group having an extra TMT label (C). The theoretical precursor masses and the experimental masses of important fragment ions are shown on the right. All indicated b- and y-ions are from the unmodified TMT-labeled peptide.

Figure 4. Relative levels of peptides containing different Type A variants in individual gene mutant and complemented strains. Targeted HCD MS/MS analysis of the TMTpro 16plex labeled strains was performed. MS/MS spectra were summed over the full peak corresponding to the peptides displaying the complete Type A, only the GlcNAc, or Type A minus the methyl group with an extra TMT label. The bars represent the absolute intensities of the TMT reporter labels for each strain, which were analyzed in duplicate. The complemented strains are indicated with a "C".

Figure 5. Schematic overview of the revised model for the Type A glycan biosynthetic pathway. First, CD0241 transfers the phosphate group from a phosphoserine to a threonine, forming phosphothreonine. Next, CD0242 transfers the phosphothreonine to CTP, thereby forming CDP-threonine and releasing inorganic pyrophosphate. Then, CD0244 transfers the phosphothreonine to the GlcNAc moiety that the glycosyltransferase CD0240 attaches to FliC, releasing CMP. The GlcNAc moiety on FliC is most likely donated by a nucleoside-diphosphate-GlcNAc (NDP-GlcNAc). At an unknown point during these steps, CD0243 mediates the N-methylation of threonine to form the complete Type A glycan modification. A likely donor is S-adenosyl methionine, which is converted to S-adenosyl homocysteine when donating its methyl group.

Table 1. Overview of the C. difficile Strains Used in This Study^a
^a Data taken from ref 11.