Streptococcus pneumoniae serotype 33G: genetic, serological, and structural analysis of a new capsule type

ABSTRACT Streptococcus pneumoniae (the pneumococcus) is a human pathogen responsible for a spectrum of diseases such as pneumonia, sepsis, and meningitis. The capsule is the major pneumococcal virulence factor and is encoded by the capsular polysaccharide (cps) locus, a recombination hotspot that has resulted in over 100 distinct capsular polysaccharide types (serotypes) identified to date. Recently, 33X (also known as 10X) was proposed as a putative novel serotype, but the capsule structure had not been elucidated. Here, we provide an in-depth investigation of 33X, demonstrating it is a new pneumococcal capsular serotype. In this study, we screened 12,850 nasopharyngeal swabs from both healthy children and pneumonia patients (adults and children) in Mongolia collected between 2015 and 2022. We identified 20 pneumococcal 33X isolates. Using whole genome sequencing, we found that the 33X cps locus is a chimera of genes from pneumococcal serogroups 35, 10, and 33, as well as other Streptococcal species. Serotyping of 33X pneumococci by the Quellung reaction revealed a unique serological profile, typing as both 10B and 33B. Competitive ELISAs confirmed that antibodies that were generated in mice directed against 33X were inhibited by 33X pneumococci but not 10B or 33B. Lastly, the elucidation of the 33X capsule structure revealed that the polysaccharide is distinct from other serotypes, consisting of an O-acetylated hexasaccharide repeat unit of →5)-β-Galf-(1→3)-β-Glcp-(1→5)-β-Galf 2Ac-(1→3)-β-GalpNAc-(1→3)-α-Galp-(1→4)-Rib-ol-(5→P→. Therefore, 33X meets the requisite genetic, serological, and biochemical criteria to be designated as a new serotype, which we have named 33G.
IMPORTANCE Streptococcus pneumoniae (the pneumococcus) is a bacterial pathogen with the greatest burden of disease in Asia and Africa. The pneumococcal capsular polysaccharide has biological relevance as a major virulence factor as well as public health importance as it is the target for currently licensed vaccines. These vaccines have limited valency, covering up to 23 of the >100 known capsular types (serotypes) with higher valency vaccines in development. Here, we have characterized a new pneumococcal serotype, which we have named 33G. We detected serotype 33G in nasopharyngeal swabs (n = 20) from children and adults hospitalized with pneumonia, as well as healthy children in Mongolia. We show that the genetic, serological, and biochemical properties of 33G differ from existing serotypes, satisfying the criteria to be designated as a new serotype. Future studies should focus on the geographical distribution of 33G and any changes in prevalence following vaccine introduction.

Streptococcus pneumoniae (the pneumococcus) is a common member of the upper respiratory tract microbiota of children. The pneumococcus is also an important human pathogen responsible for a range of diseases such as otitis media, pneumonia, sepsis, and meningitis (1). The polysaccharide capsule is a major pneumococcal virulence determinant. The capsule promotes survival within the host by facilitating the evasion of phagocytosis, complement, and mucus entrapment (2). Pneumococci exhibit a high level of diversity in capsular polysaccharide structure, with over 100 capsular types (referred to as "serotypes") identified to date (3). Serotyping is the basis for the classification of pneumococci. The pneumococcal capsule also has important public health implications as it is the antigenic target for licensed vaccines. Accurate serotyping data are essential in evaluating vaccine impact in carriage and disease, as well as for disease surveillance programs more broadly.
To date, the genomes of over 50,000 pneumococci have been sequenced (4, 5). Sequence analysis has led to the identification of genetic variants of existing serotypes (6-10) as well as new serotypes (3, 11-15). Recently, serotype variants (6, 16) and new serotypes (15) have been identified within serogroup 33. The putative new serotype 33X [also known as 10X (17)], relevant to this study, was identified in Thailand (n = 5 isolates) (18) and South Africa (n = 2 isolates) (16). Although 33X has been reported previously, little is known about its capsule, including whether it is a new serotype or a molecular variant of an existing serotype, because the capsule structure has not been investigated. In this study, we identify 33X pneumococci in Mongolia and show the 33X capsule possesses genetic, serological, and structural properties that differ from known serotypes. We name this new serotype 33G.
The cps locus of the 33X isolates from Mongolia has over 99.7% DNA sequence identity to the 33X cps sequences first reported by van Tonder et al. (18) as well as by Mostowy et al. (17), where it was referred to as 10X. These cps sequences were previously identified in isolates from Thailand (ST5123 and GPSC293) (18) and South Africa (ST5178 and GPSC241) (16). We examined the 33X cps locus of the 20 isolates from Mongolia using DNA sequence alignment and genome annotation by RAST (29), finding they have the same mosaic structure as the previously described 33X cps locus, composed of genes with high DNA identity to sequences from serogroups 35 (wzg, wzh, wzd, wze, wchA, and wciB; Fig. 1A), 10 (partial wzy sequences, wcrB, wcrC, wcrD, and wciF), and 33 (wzx, wciG, and glf; Fig. 1B).
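The identity figure above can be illustrated with a minimal sketch of how percent identity is commonly computed from a pairwise alignment. The text does not state which tool or definition was used, so the function name `percent_identity`, the treatment of gap columns, and the toy sequences are all assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (not from the paper): columns that are gaps in both
    sequences are skipped; a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real cps sequence data).
toy_reference = "ATGGCT-AAGCTT"
toy_query = "ATGGCTTAAGCAT"
toy_identity = percent_identity(toy_reference, toy_query)
print(f"{toy_identity:.1f}% identity over the aligned columns")
```

Different tools handle gaps and alignment ends differently, so a value computed this way may not match a published figure exactly.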
In addition to the genes described above, the 33X cps locus contains another four putative open reading frames. Orf1 exhibits partial similarity to part of the wcrG and wzx genes from the serotype 10B reference cps locus (Genbank accession CR931650), possibly representing 10B remnant sequences following a recombination event at this site. Interestingly, orf2, orf3, and orf4 do not show sequence similarity to any pneumococcal cps genes. Rather, they show similarity to a region in the cps locus from Streptococcus oralis subsp. dentisani strain Y5914-11 (Fig. 1C), a blood isolate from a patient with infective endocarditis from Denmark (31). Between wciB and wcrB are two sequences with similarity to part of the wzy gene, which encodes the polymerase from the serotype 10B reference cps locus. However, the deletions apparent in these sequences suggest these would not encode a functional enzyme. Orf2 is predicted to encode a Wzy homolog, with the predicted amino acid sequence exhibiting high identity to O-antigen ligase family proteins from S. oralis (97.2%, Genbank accession WP_204956359) and Streptococcus intermedius (97.6%, Genbank accession MBF1713530). Among pneumococcal Wzy proteins, ORF2 exhibits the highest identity to Wzy from serotype 37 (41.7%, Genbank accession CAI34388). Lastly, ORF3 and ORF4 are partial amino acid sequences that match portions of the S. oralis acyltransferase (86.5% and 74.7%, to S. oralis acyltransferase Genbank accession WP_084971822 for ORF3 and ORF4, respectively). However, the orf3-orf4 region in the 33X cps locus contains multiple frameshift mutations, suggesting it is unlikely to encode a functional acyltransferase (Fig. S1).
To determine the implications of this genetic variation for molecular approaches for serotyping, we tested the 33X isolates in five different whole genome sequencing-based serotyping tools [PneumoCaT (33), seroBA (34), SeroCall (35), PneumoKITy (36), and PfaSTer (37)] as well as DNA microarray (as described above) and the TaqMan Array Card, a qPCR-based assay (38). Most tools were unable to designate a serotype to 33X pneumococci (Table S2). Tools that report the closest "hit" (PneumoCaT, PneumoKITy, and PfaSTer) matched serotypes in serogroup 35. As described earlier, DNA microarray designates 33X as 35A/10B-like due to the detection of cps genes from serotypes 35A and 10B. The TaqMan Array Card mistyped these isolates as 10B. The seroBA tool was the only one to detect 33X (designated as 10X in this tool), given that the cps locus from the isolates previously identified in Thailand and South Africa is included in the seroBA database.
FIG 1 (partial caption) [...] (30) and Streptococcus oralis subsp. dentisani strain Y5914-11 cps locus (Genbank accession no. NZ_NCUW01000028) (31). Images were generated using Easyfig version 2.2.5 (32).

Serological properties of 33X pneumococci
To investigate the serological properties of the 33X capsule, the isolates from Mongolia were examined using the Quellung reaction, the gold standard for pneumococcal serotyping (39, 40), using commercially available typing sera from the Statens Serum Institut (SSI). All isolates reacted with both group 10 and 33, but not group 35, sera. When tested against the relevant factor sera, the 33X pneumococci reacted with 10b, 10d, and 33f factor sera. Thus, the 33X pneumococcal isolates serotyped as both 10B and 33B by Quellung, despite being confirmed as pure cultures by whole genome sequencing and the 10B and 33B control strains reacting as expected (Table 2).
To investigate the immunological properties of 33X isolates, we conducted competition ELISAs. To obtain sera, we infected 5-day-old C57BL/6 mice with a 33X pneumococcal strain (PMP1486) to establish nasopharyngeal colonization and collected serum 7 weeks post-infection. The serum was pre-incubated with pneumococci expressing serotype 33X, 10B, 33B, or 14 capsules as competitors before conducting the 33X ELISA. Serotype 14 was used as a negative control, as it was not expected to be recognized by capsular antibodies raised against other serotypes based on previous competition ELISA work (41). This allowed us to establish background inhibition levels in the assay. Pre-incubation of serum with 33X pneumococci resulted in greater inhibition (76.6%) compared with pre-incubating with 10B (36.4%), 33B (33.9%), or 14 (negative control, 28%), suggesting that the immunological properties of 33X differ from those of 10B and 33B (Fig. 2).

TABLE 2
Quellung reaction profile of 33X pneumococci (all strains) compared with 10B (PMP738) and 33B (PMP790) SSI strains, with "+" and "−" denoting a positive and negative reaction with the typing sera, respectively
a Reaction is weak with one 1 µL loopful of 10d factor sera. A stronger reaction was obtained using two 1 µL loopfuls.
Taken together, these data show that 33X pneumococci have genetic and immunological properties that are distinct from previously characterized pneumococcal serotypes.

Elucidation of 33X capsule repeat unit structure
We next determined the repeat unit structure of the polysaccharide purified from 33X isolates by the use of chemical analysis and nuclear magnetic resonance (NMR) spectroscopy. Acid hydrolysis of the serotype 33X polysaccharide followed by derivatization and gas chromatography-mass spectrometric (GLC-MS) analysis showed the presence of ribitol (Rib-ol), glucose (Glc), galactose (Gal), and N-acetylgalactosamine (GalNAc) in the molar ratio of 1.6:1.0:2.2:0.8. Performing the reduction step with NaBD4 and the absence of a deuterium label in the ribitol derivative confirmed that ribitol was not derived from ribose. Composition analysis following methanolysis gave comparable results and confirmed the absence of uronic acid in the repeat unit. The determination of the glycosidic linkages was achieved by GLC-MS analysis of the partially methylated alditol acetates, which identified 4,5-Rib-ol, 3-Glcp, 4-Galp or 5-Galf, 3-Galp, and 3-GalNAcp in the molar ratio of 0.4:1.6:2.1:1.0:0.7. These data are consistent with a linear hexasaccharide repeat unit without branch points and side chains, unlike the repeat units reported for related serotypes 10B (42) and 33B (43).
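As a rough illustration of how the composition data relate to the repeat unit, the sketch below counts the monosaccharide types in the hexasaccharide reported in the abstract and places them next to the measured molar ratios. The variable and function names are hypothetical, and the side-by-side comparison is only a reading aid, not part of the authors' analysis. Measured ratios from hydrolysis and derivatization are generally not expected to be exact integers.

```python
from collections import Counter

# Residue types in the repeat unit given in the abstract:
# beta-Galf, beta-Glcp, beta-Galf2Ac, beta-GalpNAc, alpha-Galp, Rib-ol.
repeat_unit_residues = ["Gal", "Glc", "Gal", "GalNAc", "Gal", "Rib-ol"]
expected_counts = Counter(repeat_unit_residues)

# Molar ratio reported in the text (Rib-ol:Glc:Gal:GalNAc = 1.6:1.0:2.2:0.8).
observed_ratio = {"Rib-ol": 1.6, "Glc": 1.0, "Gal": 2.2, "GalNAc": 0.8}


def normalize(ratios: dict, reference: str) -> dict:
    """Rescale molar ratios so the chosen reference sugar equals 1.0."""
    ref_value = ratios[reference]
    if ref_value <= 0:
        raise ValueError("reference amount must be positive")
    return {sugar: value / ref_value for sugar, value in ratios.items()}


observed_vs_glc = normalize(observed_ratio, "Glc")
for sugar in ["Rib-ol", "Glc", "Gal", "GalNAc"]:
    print(f"{sugar:7s} expected {expected_counts[sugar]}  observed {observed_vs_glc[sugar]:.1f}")
```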
The 1H NMR spectrum of the serotype 33X polysaccharide (Fig. 3A) at 25°C (500 MHz) showed five signals and an H-COAc peak at 4.93 ppm in the anomeric region, ring protons, and peaks from low molecular weight components (glycerol, buffer, and amino acids).
The full spectrum (Fig. S2A) contains an O-acetyl signal at 2.14 ppm and an N-acetyl signal at 2.03 ppm, indicating that serotype 33X polysaccharide has an O-acetylated hexasaccharide repeating unit containing an N-acetyl sugar. The peaks from HOD (water), process residuals, and amino acids were readily removed in the diffusion-ordered spectroscopy (DOSY) experiment in Fig. 3B (and Fig. S2B) to give the polysaccharide signals together with peaks from residual cell wall polysaccharide (CWPS) with the characteristic choline resonance at 3.23 ppm. Further NMR studies were performed at 600 MHz and at higher temperatures that resulted in gradual de-O-acetylation (Fig. 3C), leading to the NMR spectrum of the polysaccharide backbone shown in Fig. 3D (and Fig. S2D). The structure of the 33X hexasaccharide repeat unit was determined on the de-O-acetylated sample by the use of an array of 1H-1H homonuclear experiments (COSY, TOCSY, and NOESY) and 1H-13C heteronuclear correlation experiments (HSQC and HMBC) described in Fig. S3 to S5. The 1H-31P HMBC experiment (Fig. S6) showed that the phosphodiester signal at 0.34 ppm was correlated to the H5 of ribitol and the broad proton signal at 4.35 ppm, assigned to H5 of β-Gal I f. The proton/carbon pairs are labeled in the HSQC spectrum (Fig. 4); the NMR data are collected in Table 3.
Finally, the linkages and sequence of sugar residues followed from the 1H-13C HMBC inter-residue correlations labeled in Fig. 5. Thus, NMR analysis established the structure of the hexasaccharide repeat unit of serotype 33X as →5)-β-

Identification of O-acetylation at C2 of β-Gal II f
NMR experiments performed on the fully and partially O-acetylated polysaccharide (Fig. 3A and C) confirmed the presence of 5-linked β-Gal II f 2Ac with diagnostic signals for H1 at 5.20, H2 at 4.93, and H3 at 4.48 ppm, and the O-acetyl singlet at 2.14 ppm. TOCSY of the partially O-acetylated polysaccharide elucidated the proton spin systems of both 5-linked β-Gal II f 2Ac and the de-O-acetylated 5-linked β-Gal II f residues (Fig. S7). Finally, HSQC-DEPT and HMBC of the fully O-acetylated polysaccharide, displayed as an overlay in Fig. 6, identified the carbon resonances for this residue. In addition, the inter-residue correlations confirmed the sequence and linkages of the serotype 33X O-acetylated hexasaccharide repeat unit. The NMR assignments for the 5-linked β-Gal II f 2Ac residue are provided in Table 3, lower panel. Overall, the repeat unit of 33X is an O-acetylated hexasaccharide with components in common with both the 10B and 33B repeat units but not identical to either. Based on their known enzymatic activities (44), we were able to match all the enzymes encoded in the 33X cps locus to the linkages in the repeat unit, further supporting the proposed structure and its designation as a new pneumococcal serotype, hereby referred to as 33G (Fig. 7).

DISCUSSION
Traditionally, pneumococci are classified into serotypes based on differences in serological reactions using the Quellung reaction (46-48). Different serotypes produce a distinct capsular polysaccharide (48-51) encoded by a unique cps locus (30). Previously, seven isolates from Thailand (17, 18) and South Africa (16) with the serotype designation 33X (also known as 10X) were identified. No investigation had been conducted into the structural properties of the capsule, and it was therefore not known whether 33X is a novel pneumococcal serotype. In this study, we provide a comprehensive investigation of this putative new serotype using 20 isolates that we identified in Mongolia.
Consistent with the previous reports (17, 18), the 33X isolates from Mongolia possess a unique cps locus compared with known serotypes. Using the gold standard Quellung reaction, we show that 33X pneumococci exhibit a unique serological profile, typing as both 10B and 33B. Importantly, pre-incubation of serum from mice colonized with 33X pneumococci resulted in greater ELISA inhibition than pre-incubation with either 10B or 33B pneumococci. Serotype 35A was not explored further given the negative Quellung reaction with group 35 sera. Together, these data indicate that serological responses to 33X are distinct from the related serotypes 10B and 33B. Using 1D and 2D NMR and sugar composition analyses, we interrogated the 33X capsule, demonstrating a structure that possessed features of both 10B and 33B, yet was not identical to either nor any other known pneumococcal serotype. Taken together, these data provide the requisite genetic, serological, and biochemical evidence that 33X is a new pneumococcal serotype, hereby referred to as 33G.
We have proposed placing this new serotype in serogroup 33 rather than 10 since five of the residues in the 33G hexasaccharide repeat unit are conserved in 33B compared with four for 10B. Like 33B, 33G and other serotypes in serogroup 33 (33C and 33D) also possess a hexasaccharide backbone containing ribitol-5-P. In addition, 33G has an O-acetylated β-Galf residue, found in all members of serogroup 33 and lacking in serogroup 10 capsules (42, 43). The presence of O-acetyl groups in the 33G capsule is likely to have important biological relevance given that O-acetylation impacts the structure of the polysaccharide and is an epitope in pneumococcal capsules readily recognized by the immune system (48, 52). Such immune pressure has led to the emergence of new serotypes that lack the O-acetyl modification (15, 53-55). Lastly, a characteristic feature found in all serotypes in serogroup 10 is a terminal side chain β-Galf (11, 42), which is lacking in 33G.
The 33G cps locus is a mosaic of genes from multiple sources including three pneumococcal serogroups (35, 10, and 33) as well as other streptococci. The emergence of such a complex cps is the result of several recombination events. The pneumococcal cps locus is a known recombination hotspot, and examples of recombination with the cps locus of other pneumococci have been reported previously (6, 17). Similar recombination events with non-pneumococcal streptococci also occur, as exemplified by the recently described serotype 10D, which contains capsular genes from serotypes 6C, 39, and oral streptococci (11). The earliest detection of 33G was in an invasive disease isolate from South Africa in 2007 (16). However, with limited epidemiological data and a small number of isolates (n = 27 in total across three countries including the present study), it is difficult to determine when and where 33G emerged.
The chemical and NMR studies performed on the serotype 33G polysaccharide elucidated the O-acetylated repeat unit as →5)-β-Gal [...] 43), not only 3-linked as in 33G. Additionally, the large deletions in the wzy gene in the 33G cps locus suggest that it is unlikely to encode a functional polymerase. However, orf2 is predicted to encode a Wzy homolog, suggesting that ORF2, acquired from another streptococcal species, has replaced the ancestral Wzy as the polymerase responsible for linking the 33G repeat units together.
Our findings have implications for surveillance using both phenotypic and genotypic serotyping methods. For researchers relying on serological typing methods, like the Quellung reaction, 33G will yield two serotype results (10B and 33B). Therefore, there is potential for 33G to be mistyped as 10B or 33B (or even 10F in some cases if the operator misses the weak reaction with 10d factor sera). Interestingly, 33G pneumococci did not react with any of the sera recognizing serogroup 35 despite a large portion of the 33G cps locus having high homology to serogroup 35. This likely occurred because the serogroup 35-derived genes include the transcriptional regulatory region (wzg-wze), the initial transferase (wchA), and one glycosyltransferase (wciB), which either have no impact on the capsule structure or are not responsible for antigenic sites recognized by group 35 typing sera. Not surprisingly, most molecular tools using whole genome sequence data are not yet able to accurately serotype 33G (Table S2). We have made the 33G cps sequences from this study publicly available (Genbank accession numbers OR509570-OR509589) so that cps databases can be updated. However, the users of seroBA will not need to update the database as the 33G cps locus is already detected (called 10X). The US Centers for Disease Control and Prevention (CDC) method for qPCR serotyping (56) does not include primers for 10B or 33B and would therefore be unlikely to detect serotype 33G. DNA microarray designates 33G as 35A/10B-like due to sequences in the 33G cps locus matching probes for some but not all genes for the 35A and 10B cps loci. The TaqMan Array Card for pneumococcal serotyping (38) designates 33G as 10B due to the detection of wcrD in 10B, whereas the targets for other relevant serotypes such as 33B (wciN) and 35A (wcrK) are absent from the 33G cps locus. Serotypes 10B and 33B are not included in any licensed vaccines, so it is unlikely 33G would be misidentified as a vaccine serotype in surveillance studies.
Serotype 33G has been found in healthy carriers [reference (57) and this study], in the nasopharynx of pneumonia patients (this study), as well as in invasive disease (16). These data indicate that 33G can not only colonize but also cause disease. A total of 27 serotype 33G isolates have now been detected, including in Thailand (n = 5) (18), South Africa (n = 2) (16), and Mongolia (this study, n = 20). We have tested over 18,000 swabs from across Asia and have not detected 33G in any of our previous pneumococcal carriage studies in healthy children or children with acute respiratory infection, including in Lao PDR (58), Fiji (59), Vietnam (60), Indonesia (61), or Papua New Guinea (62). It will be interesting to see whether 33G will be identified in other countries in the future, especially as sequence-based analysis becomes more accessible.
The discovery of serotype 33G has raised multiple questions that should form the basis of future research. Firstly, similar to the recently described serotype 33E (15), 33G lacks a branching monosaccharide; such branches are typically highly immunogenic (63). Thus, future research should focus on identifying whether the production of a 33G capsule provides enhanced immune evasion capabilities or other biological advantages. Similarly, the function of orf2 requires experimental investigation. Based on the bioinformatic analysis, we propose orf2 encodes a Wzy homolog replacing the ancestral wzy gene, which contains deletions and is likely non-functional. However, the role of ORF2 as the polymerase needs to be experimentally validated. Lastly, serotype 33G was only detected in Mongolia from 2018 onward, following the phased introduction of PCV13 from 2016. However, there are insufficient data to determine whether the emergence of 33G in Mongolia is related to vaccine introduction or not. Future studies should focus on gaining a better understanding of the geographic distribution of 33G, the prevalence of this serotype following vaccine introduction, as well as the potential of this serotype to cause pneumococcal disease, particularly in the post-PCV era.

Pneumococcal identification and DNA microarray serotyping from nasopharyngeal swabs
Nasopharyngeal swabs were collected, stored, transported, and tested following World Health Organization guidelines (39). Screening of swabs for pneumococci was conducted by lytA qPCR (24, 58). Culture of pneumococci from swabs was conducted on selective horse blood agar plates supplemented with 5 µg/mL gentamicin (Thermo Fisher Scientific). DNA was then extracted from these cultures (QIAamp 96 DNA QIAcube HT kit) and tested by DNA microarray (Senti-SPv1.5 slides) as described previously (25) to determine serotype(s). Pneumococcal isolates of interest were purified and confirmed as pneumococci using standard identification tests (39), including optochin sensitivity and genotyping by MLST and GPSC.

Competitive ELISA
Serum was generated by administering 2 × 10^3 CFU of 33X pneumococci (strain PMP1486) to 5-day-old C57BL/6 mice intranasally without anesthesia as described previously (66). At 7 weeks post-infection, blood was collected by cardiac puncture, allowed to clot overnight at 4°C, and subjected to centrifugation to collect serum. Animal work was approved by the Murdoch Children's Research Institute Animal Ethics Committee (protocol A945) in accordance with the Australian code for the care and use of animals for scientific purposes (67). To prepare bacterial stocks for the ELISA, a pneumococcal strain of serotype 33X (PMP1486), as well as SSI strains of 10B, 33B, and 14 (hereby named PMP738, PMP790, and PMP829, respectively), were grown to OD 0.85-0.95 in THY broth [3% (wt/vol) Todd-Hewitt broth, 0.5% (wt/vol) yeast extract] at 37°C and 5% CO2 and stored in 80% (vol/vol) glycerol until required. Wells of high-binding Nunc Maxisorp plates were coated with PBS-washed 33X pneumococci (100 µL per well) and incubated overnight at 4°C. Competition ELISAs were conducted using a method adapted from previously established approaches (41, 68). Briefly, serum was incubated with PBS-washed preparations of pneumococci of different serotypes (33X, 10B, 33B, or 14) in 10% (vol/vol) fetal calf serum (FCS) overnight at 4°C. Serum was diluted to 1:1,000 in these reactions, which was determined to be the optimal dilution for high inhibition. The 96-well plates were blocked by adding 10% FCS for 1 h at 37°C. After washing with PBS containing 0.05% (vol/vol) Tween, 50 µL of sera was added to the well and incubated for 1 h at 37°C and washed again. Biotinylated goat anti-mouse IgG (diluted 1:500 in 10% FCS; Sigma-Aldrich) was added to the wells and incubated for 1 h at 37°C. Wells were washed, and streptavidin-horseradish peroxidase conjugate (diluted 1:500 in 10% FCS) was added and incubated for 1 h at 37°C. A 1:1 mixture of KPL TMB Peroxidase Substrate and KPL Peroxidase Substrate Solution B (Sera Care) was added to each well (50 µL) and incubated in the dark for ~12 min. The reaction was then stopped with the addition of 1 M H3PO4, and optical density was read at 450 nm (and 630 nm as a reference) to calculate the percentage of inhibition.
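The text does not give the formula used to convert optical density readings into percentage inhibition. A minimal sketch of one common convention is shown below: the reference wavelength reading is subtracted from the 450 nm reading, and the signal with a competitor is compared with the signal from serum incubated without competitor. Both the formula and the function names (`corrected_od`, `percent_inhibition`) are assumptions, and the example readings are hypothetical values, not data from the study.

```python
def corrected_od(od450: float, od630: float) -> float:
    """Reference-corrected optical density (assumed convention: OD450 minus OD630)."""
    return od450 - od630


def percent_inhibition(od_with_competitor: float, od_without_competitor: float) -> float:
    """Percentage inhibition relative to the uninhibited serum signal.

    Assumed formula: 100 * (1 - OD_competitor / OD_uninhibited).
    """
    if od_without_competitor <= 0:
        raise ValueError("uninhibited signal must be positive")
    return 100.0 * (1.0 - od_with_competitor / od_without_competitor)


# Hypothetical plate readings (OD450, OD630), for illustration only.
uninhibited = corrected_od(1.05, 0.05)
with_homologous_competitor = corrected_od(0.30, 0.05)
example_inhibition = percent_inhibition(with_homologous_competitor, uninhibited)
print(f"inhibition: {example_inhibition:.1f}%")
```

Under this convention, a negative control competitor such as serotype 14 sets the background level against which inhibition by the other serotypes is judged.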

Extraction of 33X capsular polysaccharide
Pneumococcal strains (PMP1486, PMP1488, PMP1489, and PMP1521) were streaked from frozen stocks onto Columbia sheep blood agar plates and incubated overnight at 37°C and 5% CO2. Resultant colonies were then inoculated into modified Lacks medium (69) supplemented with 20 mM glucose and grown to an optical density at 600 nm of 0.5. Cultures were subjected to centrifugation at 3,200 × g in a Heraeus centrifuge (Thermo Fisher Scientific, USA) and washed with ice-cold ultrapure water (in-house product). After another centrifugation, the supernatants were removed, and the pellets were resuspended again in ultrapure water. Buffer-saturated phenol (Thermo Fisher Scientific, USA) with a final concentration of 1% (vol/vol) was added to the suspension and incubated at room temperature overnight. Incubated samples were subjected to centrifugation for 30 min at 3,200 × g at 4°C, and supernatants were collected. The collected supernatants were incubated with 3 µL benzonase nuclease (Sigma-Aldrich, USA) overnight at 37°C. Then 20 mg/mL Proteinase K solution (Roche, Switzerland) was added and again incubated overnight at 37°C. Next, the solution was transferred to a Millipore Amicon Ultra 30 kDa cut off membrane centrifugal filter unit (Merck Millipore, USA; prewashed twice to remove glycerol) and subjected to centrifugation for 25 min at 3,200 × g. The remaining solution above the filter was collected and evaporated under reduced pressure (25 mbar). Dried polysaccharide samples were submitted for chemical and NMR analysis.

Sugar composition and linkage analysis
Composition analysis was carried out by chemical derivatization of the sample both as alditol acetates and trimethylsilyl methyl glycosides. Alditol acetates were obtained after hydrolysis with 2 M trifluoroacetic acid (TFA) for 1 h at 125°C, followed by reduction with sodium borodeuteride (NaBD4) and peracetylation with acetic anhydride (70). Trimethylsilyl methyl glycosides were obtained by derivatization with the reagent Sylon HTP (Merck) after methanolysis of the polysaccharide with 3 M HCl in methanol at 85°C for 16 h (71). To determine the position of the glycosidic linkages, the polysaccharide 33X was permethylated following the protocol developed by Harris (72), hydrolyzed with 2 M TFA for 1 h at 125°C, reduced with sodium borodeuteride (NaBD4), and peracetylated with acetic anhydride to give a mixture of partially methylated alditol acetates.
The derivatized samples were analyzed by GLC using an Agilent Technologies 6850 gas chromatograph equipped with a flame ionization detector, using He as the carrier gas and a Zebron ZB-5 MSi capillary column (Phenomenex, 30 m × 250 μm × 0.25 μm). The following temperature programs were used: for alditol acetates, 3 min at 150°C, 150°C-270°C at 3°C/min, and 2 min at 270°C; for trimethylsilyl methyl glycosides, 1 min at 150°C, 150°C-280°C at 5°C/min, and 2 min at 280°C; for partially methylated alditol acetates, 1 min at 90°C, 90°C-140°C at 25°C/min, 140°C-200°C at 5°C/min, 200°C-280°C at 10°C/min, and 10 min at 280°C. GLC-MS analyses were carried out on an Agilent Technologies 7890A gas chromatograph coupled to an Agilent Technologies 5975C VL MSD using the same column and the temperature programs of the GLC analyses. Values of the integrated area of the partially methylated alditol acetates were corrected by the effective carbon response factors (73).
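The three oven programs above can be written down as data and their total run times derived from the hold and ramp segments. The sketch below is only a convenience for checking the arithmetic; the tuple encoding and the names `OVEN_PROGRAMS` and `run_time_minutes` are hypothetical and are not taken from any instrument software.

```python
# Each step is ("hold", minutes) or ("ramp", start_C, end_C, rate_C_per_min),
# transcribed from the temperature programs described in the text.
OVEN_PROGRAMS = {
    "alditol acetates": [
        ("hold", 3), ("ramp", 150, 270, 3), ("hold", 2),
    ],
    "trimethylsilyl methyl glycosides": [
        ("hold", 1), ("ramp", 150, 280, 5), ("hold", 2),
    ],
    "partially methylated alditol acetates": [
        ("hold", 1), ("ramp", 90, 140, 25), ("ramp", 140, 200, 5),
        ("ramp", 200, 280, 10), ("hold", 10),
    ],
}


def run_time_minutes(program) -> float:
    """Total run time: hold times plus (temperature span / rate) for each ramp."""
    total = 0.0
    for step in program:
        if step[0] == "hold":
            total += step[1]
        elif step[0] == "ramp":
            _, start_c, end_c, rate = step
            total += (end_c - start_c) / rate
        else:
            raise ValueError(f"unknown step type: {step[0]}")
    return total


for name, program in OVEN_PROGRAMS.items():
    print(f"{name}: {run_time_minutes(program):.0f} min")
```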

NMR analysis
A sample of polysaccharide 33X (1-2 mg from PMP1486, PMP1488, PMP1489, and PMP1521) was dissolved in deuterium oxide (D2O; Sigma-Aldrich, USA), subjected to centrifugation for 5 min at 16,800 × g, transferred to 1.7 mm NMR tubes (Bruker, USA), and submitted for NMR analysis. Initial studies were conducted on a Bruker Neo 500 MHz NMR spectrometer equipped with a 1.7-mm triple resonance (TXI) probehead; the probe temperature was set at 25°C. Further NMR experiments were performed after recovery of the sample, cycles of D2O exchange, and transfer to a 5-mm tube, on a Bruker Avance III 600 MHz NMR spectrometer equipped with a BBO Prodigy cryoprobe and processed using standard Bruker software (Topspin 3.2). The probe temperature was set at 60°C or 70°C. 1D (1H, 31P, and 13C) and 2D COSY, TOCSY, NOESY, HSQC, and HMBC NMR experiments were performed. 2D COSY and NOESY experiments were recorded with pre-saturation of HOD, whereas the TOCSY experiments were performed using DOSY to remove signals from low molecular weight components (ledbpgpml2s2d). 2D TOCSY experiments were recorded using a mixing time of 180 ms and the 1D variants using 180 or 200 ms. 2D NOESY experiments were recorded using a mixing time of 300 ms and the 1D variants using 300 or 500 ms. The HSQC experiment was optimized for J = 145 Hz (for directly attached 1H-13C correlations), and the HMBC experiments were optimized for a coupling constant of 6 Hz (for long-range 1H-13C correlations) and 10 Hz (for 1H-31P correlations). To improve sensitivity by performing many scans, the 2D experiments were recorded using non-uniform sampling: 40% for homonuclear and 20%-30% for heteronuclear experiments. Spectra were referenced to residual CWPS: 1H signal at 3.23 ppm, 13C signal at 54.5 ppm, and the shielded 31P signal at 1.30 ppm (74).

FIG 3 Expansion of the 1D proton NMR spectra of pneumococcal serotype 33X polysaccharide; some diagnostic anomeric and ring protons are labeled. (A) 1D

FIG 4 The expansion of the HSQC spectrum of polysaccharide 33X recorded at 600 MHz (60°C); the cross peaks from the methyl region of the spectrum are shown in the inset. All the hexasaccharide repeat unit (including ribitol-P) proton/carbon cross peaks have been labeled according to the carbon atom of the corresponding residue (βG I f = β-Gal I f, αG = α-Galp, βG II f = β-Gal II f, βGN = β-GalNAc, β-G = β-Glc, and rib = ribitol). Additional peaks are due to buffer (B) and glycerol (G).

TABLE 1
Participants from which 33X pneumococci were isolated from nasopharyngeal swabs
Columns: Participant status; Participant age; No. PCV13 doses (a); District; Year; Isolate name
a Adults were not eligible for PCV13 vaccination.

TABLE 3
NMR chemical shift for serotype 33X polysaccharide (de-O-acetylated) and O-acetylated Gal II f (lower panel)
a 31P at 0.34 ppm.
b N-acetyl of β-D-GalpNAc: C=O at 174.2 ppm and methyl 1H at 2.05 ppm; 13C at 22.7 ppm.
c O-acetyl of β-D-Galf 2Ac: C=O at 173.3 ppm and methyl 1H at 2.14 ppm; 13C at 20.8 ppm.
d Linkage carbons are underlined.