Elucidation of the O-antigen structure of Escherichia coli O93 and characterization of its biosynthetic genes

Abstract
The structure of the O-antigen from the international reference strain Escherichia coli O93:−:H16 has been determined. A nonrandom modal chain-length distribution was observed for the lipopolysaccharide, a pattern which is typical when long O-specific polysaccharides are expressed. By a combination of (i) bioinformatics information on the gene cluster related to O-antigen synthesis, including putative functions of glycosyl transferases, (ii) the magnitude of NMR coupling constants of anomeric protons, and (iii) unassigned 2D ¹H,¹³C-HSQC and ¹H,¹H-TOCSY NMR spectra, it was possible to efficiently elucidate the structure of the carbohydrate polymer in an automated fashion using the computer program CASPER. The polysaccharide also carries O-acetyl groups and their locations were determined by 2D NMR experiments showing that ~½ of the population was 2,6-di-O-acetylated, ~¼ was 2-O-acetylated, whereas ~¼ did not carry O-acetyl group(s) in the 3-O-substituted mannosyl residue of the repeating unit. The structure of the tetrasaccharide repeating unit of the O-antigen is given by: →2)-β-d-Manp-(1→3)-β-d-Manp2Ac6Ac-(1→4)-β-d-GlcpA-(1→3)-α-d-GlcpNAc-(1→, which should also be the biological repeating unit, and it shares structural elements with capsular polysaccharides from E. coli K84 and K50. The structure of the acidic O-specific polysaccharide from Cellulophaga baltica strain NN015840T differs from that of the O-antigen from E. coli O93 by lacking the O-acetyl group at O6 of the O-acetylated mannosyl residue.


Introduction
Escherichia coli is a bacterium that belongs to the family Enterobacteriaceae and is part of the colonic flora of humans, where the bacterium and the host mutually benefit from each other. However, clinical syndromes that result from pathogenic strains include enteric/diarrheal disease, urinary tract infection, and sepsis/meningitis (Stenutz et al. 2006; Vila et al. 2016; Liu et al. 2020).
Some E. coli make Shiga toxin(s) known as Stx1, one of the most potent bacterial toxins described to date (Melton-Celsa 2014), though a second type referred to as Stx2 is present in certain E. coli strains. Collectively these bacteria are referred to as Shiga toxin-producing E. coli (STEC), and this pathogen is a potential health risk as it may cause, in particular, hemolytic uremic syndrome (HUS) resulting in kidney failure. The prevalence of STEC in healthy adults in Japan was investigated by collecting close to 3 million samples during a 2-year period, and a large number of serogroups carrying stx genes were identified, among them E. coli O93 (Morita-Ishihara et al. 2016). Monitoring of food-producing animals during handling and preparation of meat is therefore important in order to avoid STEC-based outbreaks and illness caused by these bacteria. The use of venison for human consumption called for a closer investigation of potential contamination by STEC, e.g. from wild deer in Japan where 4 serogroups were isolated, one of them being E. coli O93 (Asakura et al. 1998). In a more recent 3-year study of 30 sika deer in Japan, the dominant serogroup was O93, present in almost 40% of the isolated samples (Kabeya et al. 2017). In bovine ground meat from Brazil, E. coli O93 was identified not only as STEC but also as an enteroaggregative E. coli (EAEC) pathotype (Pelayo et al. 2019). Furthermore, E. coli isolates from cattle in Australia revealed that serogroup O93 was common and had an antibiotic resistance phenotype, though the specific antibiotics to which the E. coli O93 bacteria were resistant varied (Bettelheim et al. 2005). Avian pathogenic E. coli (APEC) were isolated from ducks in China during an 8-year period, where E. coli O93 was the predominant serogroup, whereas in ducks from Vietnam it was the least common of the isolated serogroups (Thu et al. 2019). A study of virulence factors in E. coli isolates from bovine mastitis (Fernandes et al. 2011) identified serogroups in a few cases, one of them being E. coli O93, for which the somatic antigen, i.e. the O-antigen, could also be identified. Furthermore, this E. coli O93 isolate was resistant to the antibiotics ampicillin and trimethoprim sulfa.
The potential severity of E. coli O93 as a human pathogen has prompted us to investigate the structure of its O-antigen polysaccharide, as knowledge thereof may give insight into how the bacterium can evade the immune system of the host, and it may also form the basis for future vaccine candidates. To this end we have used a combination of bioinformatics information, NMR spectroscopy and the structure prediction tool CASPER to unravel its primary structure. Subsequently, 3-dimensional sub-structure models of the polysaccharide could be constructed, revealing polymer extension and epitope accessibility of structural entities.

Results and discussion
Escherichia coli O93 was grown in a lysogeny broth medium and the LPS was extracted according to the hot phenol/water method using previously described procedures (Furevi et al. 2020). The LPS was analyzed by SDS-PAGE (Fig. 1, lane 3). This analysis revealed a typical ladder pattern for the LPS from E. coli O93 owing to the O-antigen having a number of repeating units (RU) expressed with a nonrandom modal chain-length distribution. For comparison, the rough LPS from E. coli CWG303 waaG (lane 1; Yethon et al. 2000) and E. coli W3110 (lane 2; Klein et al. 2013) were included. Escherichia coli CWG303 waaG is truncated at the heptose region of the inner core whereas E. coli W3110 contains the full K12 core. An initial analysis of the LPS by ¹H NMR spectroscopy revealed signals in the spectral regions 2.00-2.05 and 2.17-2.22 ppm, where resonances from methyl groups of N-acetyl and O-acetyl groups of sugar residues reside, respectively.
The part of the genome of E. coli O93 related to O-antigen synthesis and expression, from galF to HisI, has been deposited at NCBI (GenBank AB812041.1) and analyzed, and putative functions of the gene products have been proposed and annotated (Fig. 2; Iguchi et al. 2015; DebRoy et al. 2016; Liu et al. 2020). The presence of the genes wzx, wzy, and wzz indicates that the O-antigen is synthesized via the Wzx/Wzy biosynthetic pathway (Islam and Lam 2014; Liu et al. 2019; Ståhle and Widmalm 2019; Whitfield et al. 2020). The gene cluster also contained 2 NDP-sugar precursor-encoding genes, manB and manC, involved in the synthesis of the donor GDP-d-Manp, which has an α-anomeric configuration. Furthermore, a putative O-acetyl transferase, wexC, was proposed and by identification of conserved regions it can be shown to be maltose O-acyltransferase (MAT)-like (Lo Leggio et al. 2003). A gene annotated as ugd was also present. This gene most likely encodes a UDP-glucose 6-dehydrogenase that converts UDP-glucose to UDP-glucuronic acid (UDP-d-GlcpA with an α-anomeric configuration; Stevenson et al. 1996). Two genes coding for glycosyltransferases (GTs) were identified, viz., wfdV and wfdW, located after galF and wexC, respectively. Analysis and identification of conserved regions revealed that wfdV belongs to GT family 2, which acts through an inverting mechanism, and a corresponding analysis showed that wfdW belongs to GT family 1, which also acts through an inverting mechanism. The function of the product encoded by the gene denoted orf5 has not yet been identified.
The Wzx/Wzy-dependent pathway employs WecA to make Und-PP-d-GlcNAc (Al-Dabbagh et al. 2016) as the initial sugar of the biological repeating unit (BRU). Since both gnu, coding for an epimerase that interconverts N-acetyl-d-glucosamine-PP-Und and N-acetyl-d-galactosamine-PP-Und, and gne, coding for an epimerase that isomerizes UDP-d-GlcpNAc to UDP-d-GalpNAc (Cunneen et al. 2013), are absent in the genome as judged from the analysis performed hitherto, we conclude that the first sugar residue of the BRU is d-GlcpNAc and that d-GalpNAc is not present in the O-antigen (unless orf5 corresponds to one of the two genes). As there are not any other genes in the sequence between galF and gnd the sugar residues in the RU should contain [...]
Fig. 1. Lane 1: LPS from E. coli CWG303 waaG, Lane 2: LPS from E. coli W3110, and Lane 3: LPS from E. coli O93. A 14% Tricine SDS-PAGE gel was used to separate LPS samples, which subsequently were stained with AgNO₃. A pre-stained protein marker (Thermo Fisher Scientific) was utilized for calibration of molecular weights.
Databases storing information on polysaccharide structures facilitate comparison among entries, which can be highly beneficial when unknown structures are to be elucidated. ECODAB stores, for each serogroup, available information from the literature on E. coli O-antigen structures, their NMR chemical shifts and the function of GTs used in the biosynthesis of the O-polysaccharides (Lundborg et al. 2010; Rojas-Macias 2015). Importantly, local BLAST+ searches have been performed within the database to compare amino acid sequences of putative GTs, and the results from the sequence similarity searches are stored and ranked according to the e-value (the lower the value, the more significant the agreement) based on the similarity to other GTs within the database for which the function is known or has been predicted.
In this way structural information can be gained, resulting in a disaccharide structural element for which the two constituent monosaccharides, the anomeric configuration at the glycosidic linkage and the substitution position of the acceptor sugar can be proposed. For E. coli O93 O-antigen biosynthesis, the first gene in the sequence coding for a putative GT, wfdV, is closely similar, with an e-value of 3 × 10⁻¹⁰¹, to weiG from E. coli O161 (Li et al. 2010). weiG codes for a GT that makes the disaccharide structural element β-d-GlcpA-(1→3)-d-GlcpNAc, which is consistent with the fact that WfdV is proposed to operate via an inverting mechanism and that UDP-d-GlcpA with an α-anomeric configuration should then be the donor molecule as the product has the β-anomeric configuration.
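The stereochemical argument above (an inverting GT flips the anomeric configuration of the activated donor, whereas a retaining GT preserves it) can be sketched as a small lookup. This is an illustrative Python sketch only; the function name `product_anomer` and the string labels are assumptions introduced for this example and are not part of ECODAB or CASPER.

```python
def product_anomer(donor_anomer: str, mechanism: str) -> str:
    """Anomeric configuration of the glycosidic linkage formed by a GT.

    donor_anomer: "alpha" or "beta" (configuration in the nucleotide sugar).
    mechanism: "inverting" or "retaining" (hypothetical labels for this sketch).
    """
    if donor_anomer not in ("alpha", "beta"):
        raise ValueError("donor_anomer must be 'alpha' or 'beta'")
    if mechanism == "retaining":
        return donor_anomer
    if mechanism == "inverting":
        return "beta" if donor_anomer == "alpha" else "alpha"
    raise ValueError("mechanism must be 'inverting' or 'retaining'")


# UDP-d-GlcpA (alpha) used by an inverting GT such as the proposed WfdV
print(product_anomer("alpha", "inverting"))  # beta
```

Applied to GDP-d-Manp (α) and an inverting GT, the same function returns a β-linked mannosyl residue.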
Polysaccharide material was prepared by treatment of the LPS with dilute acid, dialysis, and gel-permeation chromatography as well as by treatment under dilute basic conditions to yield material referred to as PS and PS-OH, respectively. The ¹H NMR spectrum of the PS-OH sample showed four resonances from anomeric protons and the sugar residues corresponding to these were denoted A-D by order of descending proton chemical shift (Fig. 3a). Notably, the ³J(H1,H2) couplings of the anomeric protons of residues B and C were small and estimated to be < 2 Hz, which is characteristic of manno-configured monosaccharides. The anomeric proton of D showed ³J(H1,H2) of ∼8 Hz, whereas H1 of A was significantly broader in comparison to those from B and C and was therefore estimated to have a medium-sized ³J(H1,H2) ≈ 3-4 Hz, which is characteristic of pyranose sugar residues having the gluco/galacto-configuration.
A ¹H,¹³C-meHSQC (multiplicity-edited) NMR spectrum of the O93 PS-OH showed a total of 27 cross-peaks consisting of 4 cross-peaks in the spectral region for anomeric atoms (Fig. 4a), 15 cross-peaks with ¹³C chemical shifts in the spectral range 65-82 ppm, 3 pairs of hydroxymethyl-related resonances at δC 60-64, one cross-peak from a nitrogen-bearing carbon at δC 53.70 (Fig. 4b) and a resonance at δH/δC 2.04/22.87 in the N-acetyl spectral region. The bioinformatics analysis suggested 3 different sugar residues and together with the NMR data we conclude that the fourth sugar residue in the O-antigen should be a hexose. Furthermore, the ¹H NMR data of the resonances from anomeric protons are consistent with two mannosyl residues being present as part of the RU of the O-antigen polysaccharide. That the absolute configuration is d for the sugar residues in the O-antigen polysaccharide is deduced from biosynthetic considerations. A sugar analysis of the polysaccharide, showing that the major hexose component was mannose, supports the above conclusion based on the bioinformatics analysis. In addition, a minor hexose component was detected by the sugar analysis, viz., glucose, and it is assumed to derive from the core region of the LPS.
To determine the structure of the PS-OH, spectral data from a 1D ¹H NMR spectrum and the magnitude of ³J(H1,H2) coupling constants, 2D ¹H,¹³C-meHSQC and F₂-coupled ¹H,¹³C-meHSQC spectra, where the latter reveals the magnitude of ¹J(C1,H1) coupling constants (1 with ¹J(C1,H1) > 170 Hz and 3 with ¹J(C1,H1) ≈ 160 Hz), and correlations from ¹H,¹H-TOCSY NMR spectra were used. The computer program CASPER can determine polysaccharide structure using NMR data in conjunction with some additional information such as sugar components (Jansson et al. 1987b; Lundborg and Widmalm 2011). The information obtained from the bioinformatics analysis and the sugar analysis, including the fact that the O-antigen is supposed to be synthesized via the Wzx/Wzy-dependent pathway (referred to as the biological WecA rule in CASPER), was given as input to CASPER, i.e. the sugar residues d-GlcNAc, d-GlcpA, and d-Man twice as well as the structural element β-d-GlcpA-(1→3)-d-GlcpNAc, in conjunction with the above described unassigned NMR data (Supporting information).
Fig. 5. Top 3 structures using the "Determine Glycan Structure" entry in CASPER for E. coli O93 PS-OH with NMR data obtained from ¹H,¹³C-HSQC and ¹H,¹H-TOCSY experiments and magnitudes of coupling constants for input to CASPER, viz., those from ¹J(C1,H1) and ³J(H1,H2) obtained from coupled ¹H,¹³C-HSQC and ¹H NMR spectra, respectively.
The highest ranked O93 PS-OH structure proposed by CASPER has a linear tetrasaccharide RU in which both of the mannose residues have the β-anomeric configuration (Fig. 5), whereas the second and third ranked structures both have branched RUs. The proposed structure was confirmed by NMR resonance assignments of the PS-OH. The four sugar residues should have the pyranoid ring-form, as indicated by ¹³C NMR chemical shift data (Jansson et al. 1989, 1991) and by the absence of characteristic signals for furanosides (Ritchie et al. 1975; Beier et al. 1980) in the ¹H,¹³C-meHSQC spectrum. Residue A having ¹J(C1,H1) > 170 Hz shows that it has the α-anomeric configuration, whereas residues B-D have ¹J(C1,H1) ≈ 160 Hz, revealing that they all have the β-anomeric configuration (Table 1; Bundle and Lemieux 1976).
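The assignment of anomeric configuration from the one-bond coupling constant can be illustrated with a short Python sketch. The 165 Hz cutoff is an assumption chosen here as a midpoint between the two ranges quoted in the text (> 170 Hz and ≈ 160 Hz), and the numerical inputs are placeholders that merely satisfy those ranges; they are not the values listed in Table 1.

```python
def anomer_from_1j_ch(j_hz: float, cutoff_hz: float = 165.0) -> str:
    """Classify a pyranosyl residue as alpha or beta from 1J(C1,H1) in Hz.

    The cutoff is a hypothetical midpoint between ~170 Hz (alpha)
    and ~160 Hz (beta); it is not taken from the paper.
    """
    return "alpha" if j_hz > cutoff_hz else "beta"


# Placeholder couplings consistent with "> 170 Hz" for A and "~160 Hz" for B-D
illustrative_couplings = {"A": 171.0, "B": 160.0, "C": 160.0, "D": 160.0}

for residue, coupling in illustrative_couplings.items():
    print(residue, anomer_from_1j_ch(coupling))
```

With these placeholder inputs the sketch labels A as alpha and B-D as beta, mirroring the conclusion drawn in the text.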
Residue A, whose anomeric proton resonates at δH 5.25, showed at a short mixing time a correlation to δH 4.06 in the ¹H,¹H-TOCSY spectrum and the latter correlated to the single ¹³C resonance at δC 53.70 in the ¹H,¹³C-meHSQC NMR spectrum; this sugar residue was therefore assigned to α-d-GlcpNAc. At longer mixing times the intensity of the cross-peak increased further and it was not until long mixing times were used that the full spin-system could be identified (Table 1). Thus, H3 also resonates at δH 4.06 and correlates to a glycosyloxylated carbon-13 nucleus resonating at 81.15 ppm, consistent with a large downfield chemical shift displacement, ΔδC3 9.41, due to substitution at O3 in →3)-α-d-GlcNAc-(1→. The manno-configured sugars B and C had spin-systems that showed correlations between their anomeric protons and their H2 protons at δH 4.16 and δH 4.17, respectively, only when the longest mixing time of 120 ms was employed in the ¹H,¹H-TOCSY experiments. The remaining parts of their spin-systems were assigned in a stepwise manner from the H2 autopeak using ¹H,¹H-TOCSY NMR spectra with incremented mixing times. In addition, strong intra-residue correlations were observed from H1 to H2, H3, and H5 in the ¹H,¹H-NOESY spectrum (Fig. 3b), which further confirms that both sugar residues have the β-anomeric configuration. The glycosylation shifts of residues B and C, ΔδC2 5.20 and ΔδC3 6.08, reveal that they correspond to →2)-β-d-Man-(1→ and →3)-β-d-Man-(1→, respectively. ¹H,¹H-TOCSY NMR spectra with increasing duration of the spin-lock times were used to assign ¹H NMR chemical shifts of residue D, the anomeric proton of which resides at δH 4.59, up to H5 having a chemical shift of 3.81 ppm. In the ¹H,¹³C-meHSQC NMR spectrum its H4, δH 3.74, correlates to a signal at 81.26 ppm and the ΔδC4 8.57 for the glucuronic acid then defines the position of substitution; the residue thus corresponds to →4)-β-d-GlcA-(1→.
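The glycosylation-shift reasoning used above amounts to subtracting the chemical shift of the free monosaccharide from that observed in the polymer and reading the substitution position from the carbon with the large downfield displacement. A minimal Python sketch follows; the function names are hypothetical, and the monosaccharide reference shifts are back-calculated here from the polymer shifts and displacements quoted in the text rather than taken from the reference tables of Jansson et al. (1989).

```python
def reference_shift(delta_polymer: float, displacement: float) -> float:
    """Back-calculate the monosaccharide 13C shift implied by a reported displacement."""
    return round(delta_polymer - displacement, 2)


def glycosylation_shift(delta_polymer: float, delta_monosaccharide: float) -> float:
    """Chemical shift displacement (ppm) of a carbon upon glycosylation."""
    return round(delta_polymer - delta_monosaccharide, 2)


def linkage_label(sugar: str, position: int) -> str:
    """Format a substituted residue in the notation used in the text."""
    return f"→{position})-{sugar}-(1→"


# (residue, sugar, substituted position, reported displacement in ppm)
reported = [
    ("A", "α-d-GlcNAc", 3, 9.41),
    ("B", "β-d-Man", 2, 5.20),
    ("C", "β-d-Man", 3, 6.08),
    ("D", "β-d-GlcA", 4, 8.57),
]

for residue, sugar, position, displacement in reported:
    print(residue, linkage_label(sugar, position), displacement)

# Polymer shifts of 81.15 ppm (A, C3) and 81.26 ppm (D, C4) are quoted in the text
ref_a_c3 = reference_shift(81.15, 9.41)
ref_d_c4 = reference_shift(81.26, 8.57)
print(ref_a_c3, ref_d_c4)
```

The back-calculated references are only a consistency check on the quoted numbers; an actual analysis would start from tabulated monosaccharide shifts.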
The ¹H and ¹³C chemical shifts of the PS-OH were correlated using the ¹H,¹³C-meHSQC NMR spectrum and the sequence between residues, [-B-C-D-A-], suggested by CASPER, was confirmed by the ¹H,¹H-NOESY spectrum (Fig. 3b, Table 2). The ¹H and ¹³C NMR chemical shifts predicted by CASPER were in good agreement with those assigned by the 1D and 2D NMR experiments (Fig. 6). The structure of the PS-OH is also consistent with the fact that WfdW was proposed to act via an inverting mechanism and consequently the O-antigen should contain at least one β-d-Manp residue, since the donor molecule used would be GDP-d-Manp with an α-anomeric configuration.
The PS of E. coli O93 showed ¹H NMR resonances at 2.17 and 2.22 ppm and ¹³C NMR resonances at 20.99 and 21.32 ppm for a minor population (denoted by a prime) and a major population (denoted by a double-prime), respectively, which were not present in the material after treatment under dilute basic conditions, fully consistent with the presence of O-acetyl groups as substituents on the sugar residue(s) of the O-antigen polysaccharide. The ¹H,¹³C-meHSQC NMR spectrum showed characteristic chemical shift displacements as a result of O-acetylation, inter alia, for a 6-O-acetylation resulting in cross-peaks between δC 63.88 and the two protons at δH 4.37 and 4.46 (Fig. 4c), i.e. ΔδC6 ∼2 and ΔδH6 ∼0.5 (Jansson et al. 1987a). Furthermore, cross-peaks at δH/δC 5.53/69.97 and δH/δC 5.55/69.87 (Fig. 4d) could be assigned to the respective protons and carbons of C2′ and C2″, resulting in a →3)-β-d-Man2Ac-(1→ residue. The two carbonyl carbons at δC 174.02 and 173.95 had correlations to δH 5.53 and δH 5.55, respectively, as deduced from a ¹H,¹³C-BS-CT-HMBC experiment (Claridge and Pérez-Victoria 2003). Interestingly, cross-peaks from δH 5.55 (H2 of C″) to δH 4.37 and δH 4.46 (H6 of C″) were observed in the ¹H,¹H-TOCSY spectrum acquired with a mixing time of 120 ms (Fig. 4e and f), showing that two O-acetyl groups substitute positions 2 and 6 in C″, i.e. a →3)-β-d-Man2Ac6Ac-(1→ residue. The ¹H chemical shift displacements due to the O-acetylation at position 2 follow the pattern observed for α-l-Rha2Ac-OMe, where downfield shifting by >1 ppm takes place for H2 and smaller effects in the same direction occur for H1 and H3; for the ¹³C chemical shifts the magnitude of the displacements is in the range of ∼1.5-2.5 ppm, downfield for C2 and upfield for C1 and C3 (Table 1; Rönnols et al. 2013). For the PS material the anomeric configuration of, in particular, the 2 mannosyl residues was further confirmed by a ¹H,¹³C-CT-CE-HSQC NMR experiment (Tian et al. 2001) in which 2 × ¹J(C1,H1) is determined along the F₁ dimension of the spectrum (Fig. 4g, Table 1) as a way to resolve spectral overlap that may occur in an F₂-coupled HSQC NMR spectrum. The degree of O-acetylation of residue C was determined by integration of the anomeric signals in the ¹H,¹³C-HSQC NMR spectrum and showed that ∼½ of the population was 2,6-di-O-acetylated, ∼¼ was 2-O-acetylated, whereas ∼¼ of the population did not carry O-acetyl groups. Whether the observed di-O-acetylation of a specific mannosyl residue in the RU results from the action of two different O-acetyl transferases, from O-acetyl group migration (Lassfolk et al. 2019, 2022; Oh et al. 2022), or from the conditions of growth and purification of polysaccharide material (Ilg et al. 2013; Knirel et al. 2015), remains an open question. The sequential arrangement of the sugar residues in the O93 PS was determined from ¹H,¹³C-HMBC and ¹H,¹H-NOESY NMR spectra (Table 2) and is summarized in the "sequence plot" (Fig. 7), in which the correlations "from" the anomeric atoms in residues A-D "to" adjacent sugar residues are shown.
Table 1 footnotes (partially recovered): [...] given in hertz in square brackets and obtained from a coupled ¹H,¹³C-HSQC NMR spectrum. d: ¹J(C1,H1) values given in hertz in curly brackets and calculated from a ¹H,¹³C-CT-CE-HSQC NMR spectrum.
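As a worked illustration of the population analysis, the approximate fractions reported in the text (∼½, ∼¼, ∼¼) can be combined into an average number of O-acetyl groups per repeating unit and into the occupancy of each acetylation site. The Python sketch below treats the approximate fractions as exact for the purpose of the arithmetic, which is an assumption; the variable and function names are introduced here for illustration only.

```python
from fractions import Fraction

# population name -> (fraction of repeating units, O-acetyl groups on residue C)
populations = {
    "2,6-di-O-acetylated": (Fraction(1, 2), 2),
    "2-O-acetylated": (Fraction(1, 4), 1),
    "non-acetylated": (Fraction(1, 4), 0),
}


def mean_acetyl_groups(pops):
    """Average number of O-acetyl groups per repeating unit."""
    return sum(fraction * count for fraction, count in pops.values())


# Site occupancies: O2 is acetylated in both acetylated populations, O6 only in one
o2_occupancy = populations["2,6-di-O-acetylated"][0] + populations["2-O-acetylated"][0]
o6_occupancy = populations["2,6-di-O-acetylated"][0]

assert sum(fraction for fraction, _ in populations.values()) == 1
print(mean_acetyl_groups(populations), o2_occupancy, o6_occupancy)
```

Under these assumptions the average is 5/4 O-acetyl groups per repeating unit, with about three quarters of the repeating units acetylated at O2 and about half at O6.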
To further validate the NMR resonance assignments of the PS, the suggested structure was submitted, along with the assigned ¹H and ¹³C NMR chemical shifts, using the "Predict NMR Chemical Shifts" entry in CASPER. The experimentally assigned and predicted NMR chemical shifts of the PS having a →3)-β-d-Man2Ac6Ac-(1→ residue as part of its RU were compared (Fig. 6) and found to be in good agreement, except for a conspicuous ¹H outlier for B2″. Significant upfield ¹H NMR chemical shift displacements of 0.33 and 0.35 ppm were observed for B2′ and B2″, respectively, compared to B2 in the PS-OH. Intriguingly, whereas the ¹H NMR chemical shift of B2 was predicted well for the PS-OH by CASPER, it deviated when 2-O-acetyl substitution was present in residue C. Moreover, in residue A, which is not directly linked to residue C, the anomeric proton also showed a significant upfield chemical shift displacement of 0.14 and 0.17 ppm for A1′ and A1″, respectively.
We hypothesize that these upfield ¹H NMR chemical shift displacements may be due to shielding by the carbonyl group of the O-acetyl group in residue C, which has a strong magnetic anisotropy (Field et al. 2020). To analyze this further, we set out to investigate spatial proximity of protons between sugar residues and acquired 1D ¹H,¹H-NOESY experiments with mixing times of 300 and 600 ms where the selective irradiation was set on the methyl group of the 2-O-acetyl substituent (C2′Ac/C2″Ac; Fig. 8). Interestingly, strong inter-residue correlations to A1″, A5″, A6″, and B2″ (and the corresponding mono-O-acetylated sugar residue, which has degenerate or closely similar NMR chemical shifts) were observed (Fig. 8). This allowed us to construct a 3D model where C2″Ac was oriented in such a way that it was proximate to the protons to which it had NOEs. In this model torsion angles of φH,ψH ≈ −35°, −45° for the glycosidic linkage of α-d-GlcpNAc-(1→2)-β-d-Manp and φH,ψH ≈ +25°, +60° for that of β-d-Manp-(1→3)-β-d-Manp2Ac6Ac were adopted. The orientation of the carbonyl group is then such that the anisotropic shielding can affect the B2″ and A1″ protons, consistent with the significant upfield chemical shift displacement of these (Fig. 8).
The analysis of the function of gene products, from galF to HisI (vide supra), in relation to the determined structure of the BRU of the O-antigen from E. coli O93 calls for the presence of a third GT (or that WfdW has a dual activity). The function of the gene product from orf5 was still uncharacterized and the information deposited by Iguchi et al. (2015) translated into a putative protein sequence containing 334 amino acids. A protein BLAST resulted in the identifier WP_000697476.1, which was entered into UniProt to give A0A0A8J5U3_ECOLX, followed by searching the AlphaFold Protein Structure Database (Jumper et al. 2021; Varadi et al. 2022), resulting in a hit for an uncharacterized protein, the structure of which is shown in Fig. 10. The protein has two Rossmann-like domains with βαβ-segments (Hanukoglu 2015) and displays a GT-B fold (Chang et al. 2011). Interestingly, Orf5 has a recurrent C-terminal α-helix that extends back onto the N-terminal domain, like in the GT WaaG that also has the GT-B fold (Martinez-Fleites et al. 2006; Riu et al. 2022). Moreover, whether Orf5 has a membrane-interacting region (MIR), which has been shown to be present in WaaG (Liebau et al. 2015), requires further investigation. Based on the information acquired we propose that Orf5 is a GT responsible for the third sugar transfer starting from the initial GlcNAc-PP-undecaprenyl acceptor to result in a tetrasaccharide-PP-undecaprenyl entity, which subsequently is translocated from the cytoplasmic side of the inner membrane to the periplasmic side by Wzx, where the oligosaccharide will be polymerized by the Wzy-Wzz complex to give the O-antigen polysaccharide.

Conclusions
The initial analysis of the gene cluster related to the biosynthesis of the O-antigen polysaccharide from E. coli O93 revealed putative functions of proteins that were encoded in this part of the genome (Iguchi et al. 2015; DebRoy et al. 2016; Liu et al. 2020). This information gave clues about the biosynthesis of the O-polysaccharide, such as the presence of the Wzx/Wzy biosynthetic pathway, possible constituent sugar components as well as probable O-acetyl decoration on the polysaccharide. Further information was obtained from ECODAB, in which the function of one of the GTs had been proposed with high fidelity, viz., formation of the structural element β-d-GlcpA-(1→3)-d-GlcpNAc, and was consistent with the presence of UDP-glucuronic acid as the donor molecule and Und-PP-d-GlcNAc as the acceptor molecule in conjunction with the proposed GT family acting by an inverting mechanism.
To gain deeper insight into the structure of the O-antigen polysaccharide the bioinformatics information was coupled with an NMR analysis. Using the computer program CASPER, structure proposals were generated in a highly efficient way without the need to carry out NMR chemical shift assignments; the top-ranked proposal was subsequently confirmed to be correct by means of additional NMR experiments. Detailed NMR analysis was employed to identify the O-acetylation pattern of the O-antigen polysaccharide, which has tetrasaccharide RUs, →2)-β-d-Manp-(1→3)-β-d-Manp2Ac6Ac-(1→4)-β-d-GlcpA-(1→3)-α-d-GlcpNAc-(1→, consisting of three populations, viz., ∼½ of them are 2,6-di-O-acetylated, ∼¼ are 2-O-acetylated, and ∼¼ do not carry O-acetyl groups on the penultimate sugar (a mannosyl residue) of the BRU. From 3D models of the different O-antigen populations it is evident that the O-acetyl groups can form exposed epitopes on the polysaccharide backbone, but whether the dual substitution pattern on a single sugar residue in the RU renders the bacterium any advantages as a pathogen during infection remains to be established.

Bacterial strains and polysaccharide preparations
The Escherichia coli O93 reference strain (E. coli O93:−:H16) was obtained from SSI Diagnostica A/S (Hillerød, Denmark). Bacterial culture, LPS extraction and preparation of lipid-free polysaccharides (PS) were carried out as previously described (Furevi et al. 2020). O-deacetylated polysaccharide (PS-OH) was prepared by treatment of the PS with aq. NaOH (0.5 M) at 20 °C for 18 h followed by extensive dialysis against deionized water using Spectra/Por 4 Membrane Tubing with a MWCO of 12-14 kDa (Fisher Scientific).

Sugar analysis by GLC
Hydrolysis of the PS was carried out using 2 M TFA at 120 °C for 30 min followed by reduction with NaBH₄ in aq. ammonia (1 M) at ambient temperature for 30 min and acetylation at 100 °C for 30 min using Ac₂O and pyridine (1:1). The resulting mixture of alditol acetates was analyzed by gas-liquid chromatography (GLC), whereby components were separated on a PerkinElmer Elite-5 column using a temperature program of 150 °C for 2 min, 1 °C·min⁻¹ up to 173 °C, followed by 8 °C·min⁻¹ to 220 °C with hydrogen as the carrier gas (25 psi). The column was fitted to a PerkinElmer Clarus 400 gas chromatograph equipped with flame ionization detectors. The retention times of the derivatives were compared with those of authentic standards as references.
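The duration of the GLC temperature program follows directly from the hold time and the two ramps given above. The Python sketch below computes it under the assumption that there is no final hold at 220 °C, since none is stated; the function name and argument layout are introduced for this example only.

```python
def program_duration(start_temp_c, initial_hold_min, ramps):
    """Total run time in minutes for a GC oven temperature program.

    ramps: sequence of (rate in degC per min, target temperature in degC).
    No final hold is included (assumption: none is stated in the text).
    """
    total_min = initial_hold_min
    current_temp = start_temp_c
    for rate_c_per_min, target_temp_c in ramps:
        total_min += (target_temp_c - current_temp) / rate_c_per_min
        current_temp = target_temp_c
    return total_min


# 150 degC for 2 min, 1 degC/min to 173 degC, then 8 degC/min to 220 degC
duration = program_duration(150, 2, [(1, 173), (8, 220)])
print(duration)  # 2 + 23 + 5.875 = 30.875 min
```

The slow first ramp accounts for 23 of the roughly 31 min, which is the part of the program where the alditol acetates are separated.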

NMR spectroscopy
The PS sample was deuterium-exchanged by freeze-drying from 99.9% D₂O and examined by NMR spectroscopy at 70 °C as a solution in 99.96% D₂O (10 mg in 0.55 mL) with a trace amount of NaN₃; it was used for 1D ¹³C NMR, 2D ¹H,¹³C-HMBC, ¹H,¹³C-H2BC and ¹H,¹³C-BS-CT-HMBC, ¹H,¹³C-HSQC, and ¹H,¹³C-CT-CE-HSQC experiments. A PS-OH sample purified by gel-permeation chromatography was used for further analysis employing 1D ¹H and diffusion-filtered ¹H NMR experiments, F₂-coupled ¹H,¹³C-HSQC, ¹H,¹³C-HSQC, ¹H,¹H-TOCSY, and ¹H,¹H-NOESY NMR experiments (<1 mg in 0.55 mL) acquired at 60 °C. NMR spectra were recorded using a Bruker Avance III 700-MHz spectrometer equipped with a 5-mm TCI (¹H/¹³C/¹⁵N) Z-gradient (53.0 G cm⁻¹) CryoProbe or a Bruker Avance III 600-MHz NMR spectrometer equipped with a 5-mm TXI (¹H/¹³C/¹⁵N) Z-gradient (55.7 G cm⁻¹) inverse probe. Chemical shifts are reported in ppm using internal sodium 3-trimethylsilylpropanoate-2,2,3,3-d₄ (TSP, δH 0.00) for ¹H NMR and external 1,4-dioxane 10% in D₂O (δC 67.40) for ¹³C NMR as references. Chemical shift differences were obtained by comparison with NMR data of the corresponding monosaccharides (Jansson et al. 1989). NMR experiments suitable for resonance assignments of carbohydrates (Widmalm 2021) were recorded essentially as previously described (Furevi et al. 2020); specific additional experimental conditions are given below. ¹H,¹H-TOCSY experiments were acquired with mixing times of 20, 30, 60, 90, and 120 ms for the PS-OH and 30, 60, 90, 120, and 200 ms for the PS using DIPSI-2 spin-locks. For the PS-OH, a phase-sensitive ¹H,¹H-NOESY experiment at 700 MHz was obtained with a mixing time of 150 ms employing homospoil gradient pulses (1 ms) during the mixing time with a strength of 40% of the maximum; both types of experiments used 2,048 × 256 data points in the F₂ and F₁ dimensions, respectively.
For the PS material, a ¹H,¹H-NOESY experiment, with suppression of effects from zero-quantum coherence (Thrippleton and Keeler 2003), was carried out at 600 MHz with mixing times of 60 and 150 ms employing 14 k × 256 data points in F₂ and F₁, respectively, an acquisition time of 1.7 s and a relaxation delay of 5 s.
Multiplicity-edited ¹H,¹³C-HSQC experiments were acquired for PS and PS-OH at a ¹H frequency of 700 MHz with 1,176 × 512 data points in F₂ and F₁, respectively, covering 100 ppm in the indirect dimension for PS and covering 105 ppm in F₁ for the PS-OH. Both experiments were acquired with non-uniform sampling of 50% sparsity and an exponential weighting set to 100 ms for the T₂ relaxation time. The F₂-coupled ¹H,¹³C-HSQC experiment applied to the PS-OH was recorded with 1,886 × 128 data points in the F₂ and F₁ dimensions, respectively, using an evolution time corresponding to ¹J(C,H) of 145 Hz. A ¹H,¹³C-CT-CE-HSQC experiment (Tian et al. 2001) was acquired for the PS with 1,176 × 512 data points in F₂ and F₁, respectively, and an evolution time corresponding to ¹J(C,H) of 165 Hz.
Figure caption (beginning missing): [...] (Kuttel et al. 2016) and visualized with VMD (Humphrey et al. 1996). Sugar residues are colored with d-GlcNAc in blue, d-Man in green, and d-GlcA in cyan. The O-acetyl groups present in (PS′) and (PS″) are depicted in red.
A ¹H,¹³C-HMBC experiment for the PS was acquired using a 3-fold low-pass J-filter, ¹J(C,H)min and ¹J(C,H)max set to 120 and 170 Hz, respectively, and a delay for the evolution of the long-range couplings corresponding to 8 Hz. The band-selective constant-time ¹H,¹³C-HMBC experiment (Claridge and Pérez-Victoria 2003) for the PS was acquired with a delay for long-range couplings corresponding to 10 Hz, spectral widths of 8 × 6 ppm in the F₂ and F₁ dimensions, respectively, with the transmitter frequency at 175 ppm in the F₁ dimension and a Q3 Gaussian cascade pulse for selective ¹³C excitation in the spectral region for carbonyl groups. A ¹H,¹³C-H2BC experiment (Nyberg et al. 2005) for the PS material was recorded with a 3-fold low-pass J-filter, ¹J(C,H)min and ¹J(C,H)max set to 145 and 170 Hz, respectively, and a constant-time delay of 22 ms.
Fig. 10. Protein structure obtained from the AlphaFold Protein Structure Database of a putative glycosyltransferase (orf5 in the O-antigen gene cluster) from E. coli O93 carrying out the last glycosylation reaction in formation of the oligosaccharyl-PP-undecaprenyl entity. The molecular model was made using UCSF Chimera (Pettersen et al. 2004).