Genomic insights into the diversity, virulence, and antimicrobial resistance of group B Streptococcus clinical isolates from Saudi Arabia

Introduction Detailed assessment of the population structure of group B Streptococcus (GBS) among adults is still lacking in Saudi Arabia. Here we characterized a representative collection of isolates from colonized and infected adults. Methods GBS isolates (n=89) were sequenced by Illumina and screened for virulence and antimicrobial resistance determinants. Genetic diversity was assessed by single nucleotide polymorphism and core-genome MLST analyses. Results Genome sequences revealed 28 sequence types (STs) and nine distinct serotypes, including the uncommon serotypes VII and VIII. The majority of isolates (n=76) belonged to the human-associated clonal complexes (CCs) CC1 (33.71%), CC19 (25.84%), CC17 (11.24%), CC10/CC12 (7.87%), and CC452 (6.74%). Major CCs exhibited intra-lineage serotype diversity, except for the hypervirulent CC17, which exclusively expressed serotype III. Virulence profiling revealed that nearly all isolates (94.38%) carried at least one of the four alpha family protein genes (i.e., alphaC, alp1, alp2/3, and rib), and 92.13% expressed one of the two serine-rich repeat surface proteins, Srr1 or Srr2. In addition, most isolates harbored the pilus island (PI)-2a alone (15.73%) or in combination with PI-1 (62.92%), and those carrying PI-2b alone (10.11%) belonged to CC17. Phylogenetic analysis grouped the sequenced isolates according to CCs and further subdivided them by serotype. Overall, isolates across all CC1 phylogenetic clusters expressed Srr1 and carried the PI-1 and PI-2a loci, but differed in the genes encoding the alpha-like proteins. CC19 clusters were dominated by the III/rib/srr1/PI-1+PI-2a (43.48%, 10/23) and V/alp1/srr1/PI-1+PI-2a (34.78%, 8/23) lineages, whereas most CC17 isolates (90%, 9/10) had the same III/rib/srr2/PI-2b genetic background. Interestingly, genes encoding the CC17-specific adhesins HvgA and Srr2 were detected in phylogenetically distant isolates belonging to ST1212, suggesting that other highly virulent strains might be circulating within the species. Resistance to macrolides and/or lincosamides across all major CCs (n=48) was associated with the acquisition of the erm(B) (62.5%, 30/48), erm(A) (27.1%, 13/48), lsa(C) (8.3%, 4/48), and mef(A) (2.1%, 1/48) genes, whereas resistance to tetracycline was mainly mediated by the presence of tet(M) (64.18%, 43/67) and tet(O) (20.9%, 14/67), alone or in combination (13.43%, 9/67). Discussion These findings underscore the necessity for more rigorous characterization of GBS isolates causing infections.


Introduction
Group B Streptococcus (GBS) is primarily known as a commensal that resides asymptomatically in the gastrointestinal and genitourinary tracts of over 30% of healthy adults (Raabe and Shane, 2019; Chen et al., 2023). However, the species can sometimes cause serious, potentially fatal illness in susceptible hosts, including neonates, pregnant women, and non-pregnant adults, especially the elderly and those with underlying health conditions (Furfaro et al., 2018). The incidence of GBS infections has been increasingly reported over the last two decades, with an estimated rate ranging from 3.6 to 7.3 cases per 100,000 inhabitants and a case-fatality rate of over 15% (Skoff et al., 2009). Penicillin remains the first-line drug for the treatment of GBS infections. However, reduced susceptibility to penicillin associated with mutations in penicillin-binding proteins (PBPs) has been reported (Kimura et al., 2008; McGee et al., 2021). Erythromycin, clindamycin, and levofloxacin are recommended as alternative agents for patients with β-lactam allergies, but nonsusceptibility to these antibiotics has also been documented globally, thus limiting their use in the treatment of GBS infections (Castor et al., 2008; American College of Obstetricians and Gynecologists, 2020). Resistance to macrolides and lincosamides in GBS is mainly associated with the acquisition of methyltransferases encoded by the erm genes or efflux pumps encoded by the mef, msr, or lsa genes, whereas resistance to fluoroquinolones is caused by alterations in the targets GyrA and ParC (Leclercq, 2002; Metcalf et al., 2017; Hayes et al., 2020).
The transition from commensal to pathogenic GBS involves an array of multifunctional virulence factors. Among these, the capsular polysaccharide (CPS), which is the most studied factor, defines GBS serotypes and plays a critical role in immune evasion (Shabayek and Spellerberg, 2018). To date, ten distinct serotypes (i.e., Ia, Ib, and II to IX) have been identified in the species, of which serotypes Ia, III, and V are the most frequent in adult carriage and disease (Bianchi-Jassir et al., 2020). GBS also possesses a pilus-like structure that mediates colonization and adhesion to host cells. Two different pilus islands (PI), namely PI-1 and PI-2, with the latter divided into the PI-2a and PI-2b variants, have been identified. All GBS strains harbor one or two PIs, with PI-2a and PI-2b being mutually exclusive (Springman et al., 2014). Other surface proteins, including the alpha-like protein family (Alpha C, Alp1, Alp2/3, and Rib), serine-rich repeat protein (Srr), C5a peptidase (ScpB), laminin-binding surface protein (Lmb), hypervirulent adhesin (HvgA), and fibrinogen-binding protein (Fbs), have also been reported to contribute to host cell adherence and invasion (Tazi et al., 2010; Furfaro et al., 2018; Shabayek and Spellerberg, 2018). The population structure of GBS has been studied using various techniques, including multilocus sequence typing (MLST). Six major clonal complexes (CCs), namely CC1, CC10/CC12, CC17, CC19, CC23, and CC26, have been largely associated with asymptomatic colonization and GBS infections (Skov Sørensen et al., 2010; Da Cunha et al., 2014). Of these, CC17 is strongly linked to CPS III and has been globally associated with invasive diseases and meningitis in infants, while CC1, CC10/CC12, CC19, and CC23 are frequent colonizers and can present with multiple serotypes (Da Cunha et al., 2014).
In recent years, genome-based studies on GBS isolates have provided a more comprehensive understanding of the epidemiology of this species (Kapatai et al., 2017; Meehan et al., 2021). Although whole-genome sequencing (WGS) has been widely used for the characterization of GBS isolates, no published genome sequences from Saudi Arabia are available. Here, we used WGS to characterize the population structure of colonizing and infecting GBS isolates recovered from pregnant women and non-pregnant adults. In particular, this study aimed to comprehensively analyze the distribution of genetic features associated with virulence and antibiotic resistance and their relationship with the GBS genomic lineages currently circulating in the kingdom. Ultimately, the generated data will enrich public databases by contributing WGS data from Saudi isolates.

Phylogenetic analysis
Phylogenetic analysis using a SNP-based approach grouped the isolates according to their CCs and STs (Figure 2). Overall, isolates within the same cluster tended to share similar combinations of virulence factors and PI variants, suggesting that these phylogroups represented distinct genetic lineages. Isolates belonging to the most common CC, CC1, were further subdivided into separate subgroups according to serotype (i.e., Ib, II, V, VI, and VII), with no evidence of recent serotype switching. Indeed, all CC1 clusters exhibited the PI-1 and PI-2a loci and the surface protein Srr1, but harbored genes encoding different alpha-like proteins (Alpha C, Alp2/3, and Rib). Furthermore, CC19 clusters were dominated by the III/rib/srr1/PI-1+PI-2a (43.48%, 10/23) and V/alp1/srr1/PI-1+PI-2a (34.78%, 8/23) lineages, while most CC17 isolates (90%, 9/10) shared the same III/rib/srr2/PI-2b genetic background. The profiles of antimicrobial resistance determinants were more diverse within the phylogenetic groups, suggesting that these genes might have been acquired through separate genetic events. The phylogeny of the sequenced isolates showed that those belonging to ST1212 and ST934, which also carried the CC17-specific adhesin gene hvgA, were genetically distinct, suggesting that they evolved and acquired the hypervirulence gene independently (Figure 2). Two pairs of sequenced isolates were found to be three and five SNPs apart from each other; they included isolates obtained from the same patients from different sites at different times. Relationships between all isolates were also reconstructed based on the GBS core-genome MLST (cg-MLST) scheme using the built-in pipeline of EnteroBase. Here also, cg-MLST separated the isolates by STs and their respective CCs (Figure 3; Supplementary Table 1).

[Table footnote fragment: (cylX, cylD, cylG, acpC, cylZ, cylA, cylB, cylE, cylF, cylI, cylJ, and cylK); alp: gene encoding alpha-like protein variants (i.e., alphaC, alp1, alp2/3, and rib).]
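To make the notion of isolates being "three and five SNPs apart" concrete, the following is a minimal sketch of a pairwise SNP distance computed from an alignment of variable sites. It is not the pipeline used in this study: the isolate names and sequences are toy values, and the choice to ignore gaps and ambiguous bases (N) is an assumption made for illustration.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where two sequences differ.

    Positions with a gap ('-') or an ambiguous base ('N') in either
    sequence are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ignored = {"N", "-"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignored and b not in ignored
    )


def pairwise_snp_distances(alignment: dict) -> dict:
    """Return the SNP distance for every unordered pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Toy alignment of variable sites (hypothetical isolates, not study data).
toy_alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "TCGAACGNAT",
}

distances = pairwise_snp_distances(toy_alignment)
for pair, dist in distances.items():
    print(pair, dist)
```

In practice, isolates separated by only a handful of SNPs, such as the same-patient pairs described above, would appear as near-identical rows in such a distance matrix.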
Overall, isolates belonging to separate CCs differed by up to 336 loci, confirming high genetic diversity within the species (Supplementary Table 2). In particular, the isolates belonging to ST1212, ST934, and CC17, which all harbored the hypervirulence gene hvgA, clustered in separate groups and differed from each other by 306 to 324 loci (Figure 3; Supplementary Table 1). In addition, the two pairs of isolates from the same patients that clustered together in the SNP-based approach belonged to the same cg-MLST types (i.e., cgST74217 and cgST74216) (Figure 3).
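The locus differences reported above are allelic distances between cg-MLST profiles. As a minimal sketch, assuming each profile is a mapping from locus name to allele number, the distance between two isolates can be counted as follows. The locus names and allele numbers are hypothetical, and skipping loci with a missing call is an assumption of this sketch; EnteroBase and GrapeTree may handle missing data differently.

```python
def allelic_distance(profile_a: dict, profile_b: dict) -> int:
    """Count loci at which two cg-MLST profiles carry different alleles.

    Loci absent from either profile, or with a missing call (None),
    are skipped rather than counted as differences.
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


# Hypothetical four-locus profiles; the scheme used in the study has 1,918 loci.
profile_1 = {"locus_1": 1, "locus_2": 4, "locus_3": 2, "locus_4": 7}
profile_2 = {"locus_1": 1, "locus_2": 5, "locus_3": None, "locus_4": 9}
profile_3 = dict(profile_1)  # identical profile, i.e., same cgST

d12 = allelic_distance(profile_1, profile_2)
d13 = allelic_distance(profile_1, profile_3)
print(d12, d13)
```

Two isolates sharing the same cgST, like the same-patient pairs, have a distance of zero under this definition.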

Discussion
The current study evaluated several important features of GBS isolates that are currently circulating in Saudi Arabia, including serotypes, sequence types, virulence factors, and molecular mechanisms of resistance to clinically important antibiotics, using whole-genome sequencing. Indeed, epidemiological studies characterizing the population structure of GBS isolates in the country and their resistance to antibiotics are important for the prevention and treatment of GBS infections and eventually for future vaccine implementation.

FIGURE 2
Whole-genome single nucleotide polymorphism (SNP)-derived mid-point rooted phylogenetic tree of 89 GBS isolates and their molecular characteristics. The phylogenetic tree was linked to the MLST type (first column) and the corresponding CC for each isolate (second column). The tip shapes show the infection type, whereas the color indicates the serotypes, as denoted by the legend bar on the bottom right. The filled and unfilled circles colored in pink indicate the presence or absence of the major surface protein genes (alphaC, alp1, alp2/3, rib, srr1, srr2, and hvgA) and pilus islands (PI-1, PI-2a, PI-1b, and PI-2b). Circles colored in blue indicate the presence or absence of acquired antimicrobial resistance genes or mutations in gyrA and parC.
Sequence analysis inferred the serotypes of all isolates, including those that were non-typeable by standard PCR. Overall, the comparison of PCR-based serotyping with serotypes inferred from whole-genome sequencing using the GBS-SBG database showed good concordance (i.e., 92.13%, n = 82/89), with only two isolates typed differently. In particular, whole-genome sequencing identified a handful of isolates belonging to serotypes VII and VIII, which were not previously reported in Saudi Arabia.
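The concordance figure above is a simple agreement rate between paired serotype calls. The sketch below shows one way such a rate could be computed; the paired calls are toy values, and counting a PCR non-typeable result ("NT") as discordant is an assumption of this sketch rather than the study's stated rule.

```python
def concordance(pcr_calls: list, wgs_calls: list) -> float:
    """Percentage of isolates with identical PCR and WGS serotype calls."""
    if len(pcr_calls) != len(wgs_calls):
        raise ValueError("call lists must be paired one-to-one")
    matches = sum(1 for p, w in zip(pcr_calls, wgs_calls) if p == w)
    return round(100 * matches / len(pcr_calls), 2)


# Toy paired calls (hypothetical); "NT" marks a PCR non-typeable isolate.
pcr = ["III", "V", "NT", "Ia", "II"]
wgs = ["III", "V", "VII", "Ia", "Ib"]
toy_rate = concordance(pcr, wgs)

# Reproducing the reported figure from its counts: 82 concordant of 89.
reported_rate = round(100 * 82 / 89, 2)
print(toy_rate, reported_rate)
```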
The sequenced isolates were checked for the presence of determinants associated with antibiotic resistance. Of particular concern was the high proportion of isolates with reduced susceptibility to macrolides and lincosamides, which are the antibiotics of choice for the treatment of GBS infections in patients with penicillin allergies. Here also, the molecular mechanisms of resistance to these antibiotics were consistent with those previously described for this species (Bergal et al., 2015; Jin et al., 2022; Rodgus et al., 2022; Khan et al., 2023). Phylogenetic analysis showed that resistance to macrolides and lincosamides was not necessarily linked to the expansion of certain CCs but was distributed across all detected CCs. This might explain the high rate of resistance to these antibiotics in the species, which is likely to impact their use for the prevention and treatment of GBS infections. Isolates showing the cMLSB phenotype were distributed across all major CCs and carried the methyltransferase-encoding gene erm(B), while the isolates exhibiting the iMLSB phenotype belonged to CC1 and CC19 and carried the erm(A) gene (Lopes et al., 2018; Martins et al., 2022; Rodgus et al., 2022; Khan et al., 2023). The few isolates showing the L phenotype consistently harbored the lsa(C) gene and belonged exclusively to CC19 expressing serotype III-1. On the other hand, the high rate of resistance to tetracycline was largely mediated by tet(M) and to a lesser extent by tet(O), in agreement with previously published studies (Bergal et al., 2015; Jin et al., 2022; Rodgus et al., 2022; Khan et al., 2023). Sequence analysis linked gentamicin resistance with the presence of the gene encoding the bifunctional aminoglycoside-modifying enzyme AAC(6')-APH(2''), and also identified other genes associated with resistance to kanamycin and streptomycin (i.e., ant(6)-Ia, aph(3')-III, and aadE) in a small proportion of isolates (16.85%, n = 15/89), mainly belonging to CC17 and CC12. The coexistence of erm(B) and tet(O) with the aminoglycoside resistance genes ant(6)-Ia, aph(3')-III, and aadE in sequenced isolates was linked to the acquisition of the integrative conjugative element ICESag37, which was recently described in the species (Khan et al., 2023). Interestingly, multidrug resistance in a handful of isolates exhibiting resistance to macrolides, tetracycline, ciprofloxacin, and gentamicin was predominantly linked to ST19.

FIGURE 3
GrapeTree minimum-spanning tree showing cg-MLST of sequenced isolates (n = 89). The tree was constructed using the EnteroBase pipeline based on the Streptococcus scheme, comprising 1,918 target loci.
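As a worked check of the resistance gene proportions reported in the abstract, the counts given there can be converted to percentages as sketched below. The counts and denominators are taken from the abstract; the variable names and the rounding precision (one decimal for the macrolide/lincosamide genes, two for the tetracycline genes) are choices made here to match the figures as printed.

```python
def percent(count: int, total: int, digits: int) -> float:
    """Return count/total as a percentage rounded to the given digits."""
    return round(100 * count / total, digits)


# Macrolide- and/or lincosamide-resistant isolates (n = 48), from the abstract.
ml_total = 48
ml_counts = {"erm(B)": 30, "erm(A)": 13, "lsa(C)": 4, "mef(A)": 1}

# Tetracycline-resistant isolates (n = 67), from the abstract.
tet_total = 67
tet_counts = {"tet(M) alone": 43, "tet(O) alone": 14, "tet(M) + tet(O)": 9}

ml_percentages = {gene: percent(n, ml_total, 1) for gene, n in ml_counts.items()}
tet_percentages = {gene: percent(n, tet_total, 2) for gene, n in tet_counts.items()}

# Number of tetracycline-resistant isolates not covered by the listed profiles.
tet_unlisted = tet_total - sum(tet_counts.values())

print(ml_percentages)
print(tet_percentages)
print(tet_unlisted)
```

The four macrolide/lincosamide determinants account for all 48 resistant isolates, whereas the three listed tetracycline profiles account for 66 of the 67 resistant isolates.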
GBS pathogenicity is mediated by a set of virulence factors that may confer a selective advantage to the bacteria in terms of enhanced colonization, invasiveness, and virulence within the host cell. This study confirmed that PI-2b alone was exclusively associated with highly pathogenic CC17/III-2 isolates (Lu et al., 2018). In addition, all CC17 isolates expressed the hypervirulent adhesin HvgA, as expected. Surprisingly, this CC17-distinguishing gene was also detected in isolates belonging to ST1212 and ST934, which were phylogenetically distinct from CC17 isolates, suggesting that other highly virulent strains may be circulating within the species. Similarly, three hvgA-positive isolates of ST934 have been recently reported in Ethiopia and Egypt (Ali et al., 2020; Shabayek et al., 2023). Most sequenced isolates expressed Srr1 (77.53%, n = 69/89), whereas the gene encoding the variant Srr2 was identified in CC17 isolates and, interestingly, in all ST1212 isolates. Srr2 has been reported to have a greater binding affinity for fibrinogen and plasminogen than Srr1, which further enhances adherence to epithelial and endothelial cells in invasive niches and is therefore likely to contribute to the virulence of ST1212 isolates (Seo et al., 2013). In contrast to CC17, all ST1212 isolates expressed serotype III-3, had the PI-1 locus in addition to PI-2b, and lacked the rib gene. More than half of the isolates (62.92%, n = 56/89) carried the PI-1 and PI-2a variants and belonged to the main CC1, CC10/CC12, CC19, CC23, and CC459 (Martins et al., 2017; Del Carmen Palacios-Saucedo et al., 2022). In 2017, a study identified a novel PI-1 variant, named PI-1b, among isolates of serotypes Ia, Ib, II, III, VI, and VIII, although the significance of its presence remained unknown (Teatero et al., 2017). In our study, PI-1b was detected in combination with PI-2a in 18 isolates of various serotypes (i.e., Ia, Ib, II, VI, and VIII), of which more than half (55.56%, n = 10/18) belonged to CC1 and expressed serotype VI.
Screening showed that genes for alpha family proteins were present in 94.38% (n = 84/89) of the isolates, suggesting that protein-based vaccines targeting this family of proteins would offer high coverage. The Alpha C protein-encoding gene was present among various CCs (i.e., CC1, CC10, CC12, and CC452) and in various serotypes, including Ib, V, and VI. In contrast, the gene encoding Alp2/3 was confined to CC1 isolates presenting serotypes Ib, II, V, and VII, whereas the gene encoding Alp1 was mainly present in CC19 isolates exhibiting serotypes V and VIII. In addition, one-third of the isolates carried the rib gene and mainly belonged to CC17 and CC19, presenting serotypes III-2 and III-1, respectively. However, the isolates lacking all four alpha family members included those belonging to ST1212, which might be highly virulent, and the selective pressure exerted by a vaccine based on the alpha-like surface proteins is likely to affect their prevalence. Other major virulence factors that mediate adhesion and invasion, including laminin-binding protein (lmb) and C5a peptidase (scpB), and those associated with the production of β-hemolysin/cytolysin (cylE), hyaluronidase (hylB), and the CAMP factor pore-forming toxin (cfb), were nearly ubiquitous in the sequenced isolates (Udo et al., 2013). Interestingly, fbsB, which encodes a fibrinogen-binding protein thought to be important for GBS spread by promoting the invasion of host epithelial cells, was detected predominantly in isolates belonging to CC17, CC23, and CC452 (Del Carmen Palacios-Saucedo et al., 2022). Finally, although in some cases the serotype and surface protein genes were predictive of CC, there was no clear association between GBS infections and the presence of these genetic features. Indeed, the dominant CC1, CC17, and CC19 isolates were almost evenly distributed among colonizing and infecting GBS isolates.

Conclusions
Although the number of sequenced isolates was limited, the study provided, for the first time, important insights into the genetic diversity of GBS isolates that are currently associated with human colonization and infections in the country. The data regarding the distribution of genetic lineages and the prevalence of genes associated with antibiotic resistance and virulence lay the foundation for future GBS surveillance studies in the country. The decreased susceptibility to macrolides and lincosamides and its distribution across all common human-associated clonal complexes are worrisome, corroborating the need for continued surveillance programs in our country to prevent the dissemination of disease-causing GBS. In addition, the identification of the hypervirulent adhesin gene hvgA in non-CC17 isolates that were phylogenetically distinct highlights the dynamic nature of this pathogen and underscores the need for more rigorous characterization of the genetic lineages causing infections.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ebi.ac.uk/ena/browser/view/PRJEB70279.

FIGURE 1
Distribution of serotypes and genetic lineages among GBS isolates. Segments were scaled according to the number of isolates belonging to each genotype. The inner circle represents the CCs, the middle circle represents the STs in relation to each CC, and the outer circle represents the capsular serotypes in relation to each ST.

TABLE 1
Distribution of antimicrobial resistance profiles among sequenced GBS isolates (n = 89).

MLSB, resistance to macrolides, lincosamides, and streptogramin B, with the prefix letter referring to the constitutive (cMLSB) or inducible (iMLSB) expression phenotype; M, resistance to macrolides; L, resistance to lincosamides; CCs, clonal complexes; STs, MLST sequence types.

TABLE 2
Distribution of pili, surface protein, and virulence factor profiles among GBS serotypes and clonal complexes.