Genomic analysis of Streptococcus pneumoniae serogroup 20 isolates in Alberta, Canada from 1993–2019

In the province of Alberta, Canada, invasive disease caused by Streptococcus pneumoniae serogroup 20 (serotypes 20A/20B) has been increasing in incidence. Here, we characterize provincial invasive serogroup 20 isolates collected from 1993 to 2019 alongside invasive and non-invasive serogroup 20 isolates from the Global Pneumococcal Sequencing (GPS) Project collected from 1998 to 2015. Trends in clinical metadata and geographic location were evaluated, and serogroup 20 isolate genomes were subjected to molecular sequence typing, virulence and antimicrobial resistance factor mining, phylogenetic analysis and pangenome calculation. Two hundred and seventy-four serogroup 20 isolates from Alberta were sequenced, and analysed along with 95 GPS Project genomes. The majority of invasive Alberta serogroup 20 isolates were identified after 2007 in primarily middle-aged adults and typed predominantly as ST235, a sequence type that was rare among GPS Project isolates. Most Alberta isolates carried a full-length whaF capsular gene, suggestive of serotype 20B. All Alberta and GPS Project genomes carried molecular resistance determinants implicated in fluoroquinolone and macrolide resistance, with a few Alberta isolates exhibiting phenotypic resistance to azithromycin, clindamycin, erythromycin, tetracycline and trimethoprim-sulfamethoxazole, as well as non-susceptibility to tigecycline. All isolates carried multiple virulence factors including those involved in adherence, immune modulation and nutrient uptake, as well as exotoxins and exoenzymes. Phylogenetically, Alberta serogroup 20 isolates clustered with predominantly invasive GPS Project isolates from the USA, Israel, Brazil and Nepal. Overall, this study highlights the increasing incidence of invasive S. pneumoniae serogroup 20 disease in Alberta, Canada, and provides insights into the genetic and clinical characteristics of these isolates within a global context.


INTRODUCTION
Streptococcus pneumoniae is a Gram-positive bacterium that is a global healthcare concern. The natural reservoir of S. pneumoniae is the nasopharynx of asymptomatic carriers, and the organism is spread between individuals via respiratory droplets [1]. Pneumococci are a frequent cause of community-acquired pneumonia (CAP), as well as severe invasive pneumococcal disease (IPD). IPD manifests most often as bacteremia, meningitis or sepsis and is problematic primarily for children, the elderly, and those with comorbidities such as asthma, chronic lung/heart/kidney infections, alcohol abuse, cigarette smoking and diabetes [2][3][4]. Unfortunately, reduced susceptibility to antibiotics has emerged among clinical isolates, including to β-lactams, macrolides, lincosamides, tetracyclines, trimethoprim-sulfamethoxazole and fluoroquinolones, presenting treatment challenges [5].
S. pneumoniae isolates can be classified by their polysaccharide capsule, of which there are currently over 100 recognized serotypes [6][7][8]. The invasiveness and type of disease caused by pneumococci vary between capsule types, with only a subset of serotypes having historically caused the bulk of IPD [9]. Among serotypes of clinical relevance is the relatively rare serotype 20, recently proposed to be renamed serogroup 20 [10,11]. In comparison to more dominant serotypes, little is known about the epidemiological and clinical characteristics of serogroup 20; however, serogroup 20 isolates have been identified as both colonizers and agents of disease in children and adults, being associated with invasiveness and mortality [12][13][14][15][16]. In 2012, a renaming of serotype 20 to serogroup 20 was suggested following the discovery of two serotype 20 subgroups, which the authors designated as 20A and 20B [10,11]. While indistinguishable by current serotyping antibodies used in diagnostics, serotypes 20A and 20B are structurally and genetically diverse, with truncation and loss of function of the capsular whaF gene being the hallmark genetic differentiator [10]. However, the discovering authors indicate that additional molecular work is required to confirm the connection between whaF truncation and the phenotypic differences between serotypes 20A and 20B [10].
Whole-genome sequencing (WGS) has become a valuable tool in the field of epidemiology, providing detailed genetic information about the organisms responsible for outbreaks [36]. By sequencing the entire genome of a pathogen, evolution and transmission patterns can be detected, as well as the carriage of virulence and antimicrobial resistance genes and the identification of sequence types (STs) of concern. The GPS Project (https://www.pneumogen.net/gps/index.html) is an example of a successful whole-genome surveillance effort [37]. The GPS Project, in collaboration with several contributors, has sequenced over 21 000 genomes of Streptococcus pneumoniae from various parts of the world (www.pneumogen.net/gps/gps-database-overview/; accessed 15 Sept 2023). The database contains metadata information for each isolate, which minimally includes date of collection and geography, but also allows for input of clinical data (e.g. gender, age, syndrome, source, HIV status, underlying conditions), serotype, sequence type and antimicrobial susceptibility testing results. Whole-genome analysis provides valuable information that can be used for tracking and preventing outbreaks, characterizing isolates of concern, and the development of therapies and vaccines.
Here, we report a cluster of IPD caused by serogroup 20 that began in 2007 and spiked in subsequent years in the province of Alberta, Canada. The objective of this work was to characterize 274 invasive serogroup 20 isolates in Alberta from 1993 to 2019 using genomics approaches and compare them to 95 invasive and non-invasive serogroup 20 isolates from the GPS Project from 1998 to 2015. Characterization of this minimally explored serogroup will foster a better understanding of the emerging invasive pathogen within Alberta, as well as provide a genomic overview of global serogroup 20.

Impact Statement
This study presents a comprehensive genomic characterization of invasive S. pneumoniae serogroup 20 identified in Alberta, Canada, which has been increasing in incidence since 2007. Included is a comparison of these isolates to invasive and non-invasive serogroup 20 isolates from the Global Pneumococcal Sequencing (GPS) Project, providing insights into the genetic diversity, virulence and antimicrobial resistance of serogroup 20 isolates from a global perspective. This study is aimed primarily at those engaged in infectious disease surveillance, epidemiology and genomics, and its findings may inform vaccine development and administration programmes. The study underscores the need for continuous monitoring and characterization of serogroup 20 isolates, given the increasing incidence and potential to cause severe disease.

Isolate and clinical data collection
Cases of IPD were defined as per the national case definition and are notifiable to Public Health in Alberta (population 4.36 million in 2019) [38]. Pneumococcal isolates from sterile sites are submitted to the Provincial Public Health Laboratory (PPHL) located in Edmonton, Alberta for serotyping and antimicrobial resistance profiling. Only one isolate was counted per case within a 30-day period unless the second isolate was a different serotype. Serogroup 20 isolates were excluded from our analysis if they were duplicates collected from the same patient in the same year, the isolate did not grow for sequencing, the isolate had poor coverage as determined by Gubbins (see below), or the isolate had poor sequencing/assembly quality as determined by Quast (see below). Maps were obtained from https://open.canada.ca, which are licensed under the Open Government License - Canada, and generated with Tableau Desktop v2021.2 and Inkscape v0.92.3.
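The one-isolate-per-case rule described above can be sketched as follows. This is a minimal illustration and not the PPHL's actual procedure: the record fields (patient_id, serotype, collected) and the function name are hypothetical, and measuring the 30 days from the last retained isolate of the same patient and serotype is only one possible reading of the rule.

```python
from datetime import date

def count_cases(isolates, window_days=30):
    """Keep one isolate per patient and serotype within the window (assumed rule)."""
    last_kept = {}  # (patient_id, serotype) -> collection date of last retained isolate
    kept = []
    for iso in sorted(isolates, key=lambda r: r["collected"]):
        key = (iso["patient_id"], iso["serotype"])
        previous = last_kept.get(key)
        if previous is not None and (iso["collected"] - previous).days <= window_days:
            continue  # same patient and serotype inside the window: treated as a duplicate
        last_kept[key] = iso["collected"]
        kept.append(iso)
    return kept

# Toy records, invented for illustration only.
records = [
    {"patient_id": "P1", "serotype": "20", "collected": date(2015, 3, 1)},
    {"patient_id": "P1", "serotype": "20", "collected": date(2015, 3, 10)},  # duplicate
    {"patient_id": "P1", "serotype": "3", "collected": date(2015, 3, 12)},   # different serotype
    {"patient_id": "P1", "serotype": "20", "collected": date(2015, 6, 1)},   # later episode
]
kept = count_cases(records)
print(len(kept))
```

The additional exclusion criteria (same-year duplicates, growth failure, coverage and assembly quality) would be applied as separate filters on the retained records.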

Laboratory identification, serotyping and antimicrobial susceptibility
S. pneumoniae isolates received at the PPHL were confirmed as S. pneumoniae based on characteristic morphology and optochin susceptibility [39]. All pneumococcal isolates that exhibited a positive Quellung reaction using commercial type-specific antisera obtained from Statens Serum Institut, Copenhagen, Denmark were assigned a serotype designation [40]. Antibiotic susceptibility was determined using the reference broth micro-dilution and disc diffusion methods as described by the Clinical and Laboratory Standards Institute (CLSI) [41]. The following antimicrobial agents were assayed: amoxicillin, azithromycin, cefepime, cefotaxime, ceftriaxone, cefuroxime, chloramphenicol, clindamycin, ertapenem, erythromycin, induced clindamycin resistance, levofloxacin, linezolid, meropenem, moxifloxacin, penicillin, tetracycline, tigecycline, trimethoprim/sulfamethoxazole and vancomycin. All antibiotic powders were purchased from Sigma-Aldrich Canada (Oakville, Ontario). Interpretation of the MIC or disc diffusion (DD) tests was based on CLSI Performance Standards M100-Ed32 [42] and FDA breakpoints for tigecycline [43].

Limited phenotypic antibiotic resistance was observed among Alberta serogroup 20 isolates
Table 1 summarizes the antibiotic susceptibilities of Alberta isolates, as determined by MIC and/or DD assays (data for individual isolates can be found in Table S4). Due to changes in susceptibility testing at the PPHL in Alberta over the years included in this study, not all isolates were tested with the same antibiotic panels or with the same method (i.e. MIC vs. DD). Among isolates tested, all were found to be susceptible to amoxicillin, cefepime, cefotaxime, ceftriaxone, cefuroxime, chloramphenicol, ertapenem, levofloxacin, linezolid, meropenem, penicillin/penicillin PO and vancomycin (Table 1). In addition, all Alberta isolates tested were negative for inducible clindamycin resistance. Limited antibiotic resistance was observed, with a few isolates exhibiting azithromycin (n=4/193 isolates tested), clindamycin (n=4/269), erythromycin (n=5/270), tetracycline (n=3/254) and trimethoprim/sulfamethoxazole (n=1/270) resistance, as well as four isolates exhibiting non-susceptibility to tigecycline (n=4/254; Tables 1 and S4). An intermediate phenotype was observed for three isolates to moxifloxacin and four isolates to trimethoprim/sulfamethoxazole. Interestingly, 10 isolates had conflicting results between MIC and DD assays for trimethoprim/sulfamethoxazole, with one result indicating susceptibility and the other indicating an intermediate phenotype (Table 1).
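Because the number of isolates tested differs by antibiotic, each resistance proportion has its own denominator. The short sketch below recomputes the percentages from the counts reported above; the variable and function names are illustrative only.

```python
# (resistant isolates, isolates tested) as reported in the text above
resistant = {
    "azithromycin": (4, 193),
    "clindamycin": (4, 269),
    "erythromycin": (5, 270),
    "tetracycline": (3, 254),
    "trimethoprim/sulfamethoxazole": (1, 270),
}

def percent_resistant(n_resistant, n_tested):
    """Percentage of tested isolates that were resistant."""
    return 100.0 * n_resistant / n_tested

rates = {drug: round(percent_resistant(r, n), 1) for drug, (r, n) in resistant.items()}
for drug, pct in rates.items():
    print(f"{drug}: {pct} % resistant")
```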

Phylogenetic analysis of Alberta and GPS Project serogroup 20 isolates
To examine the phylogenetic relationships between Alberta and GPS Project serogroup 20 isolates, a maximum-likelihood tree was generated from a core SNP alignment masked for recombination (Fig. 3). All isolates grouped primarily by ST, and most Alberta isolates clustered together (highlighted in grey in Fig. 3). Several GPS Project isolates grouped closely with Alberta isolates, which are indicated in Fig. 3. These included two ST235 isolates, one from the USA and one from Israel, that clustered with the Alberta ST235 isolates, five Brazilian ST1030 isolates that formed a sister group to the ST11843/ST6805 clade of Alberta isolates, four USA ST1257 isolates and one ST13913 isolate from Israel that clustered with Alberta ST1257 isolates, and four ST4745 isolates from Nepal that clustered within the Alberta ST4745/ST7828 clade. All the aforementioned GPS Project isolates that grouped with Alberta genomes were from cases of invasive disease with the exception of three isolates from Nepal, which were isolated via nasopharyngeal swab. One ST1794 Alberta isolate did not cluster with the other Alberta isolates but instead formed a sister group to a clade of ST5726/ST10625/ST12800 isolates from The Gambia, among which the majority were carriage isolates (n=21/23 of The Gambia isolates in clade; Fig. 3). The single Alberta isolate that was not assigned an ST grouped within the Alberta ST235 clade.

Pangenome analysis of Alberta and GPS isolates
The pangenome of Alberta and GPS Project serogroup 20 isolates (n=369) comprised a total of 3185 genes (Fig. 4). The core genome contained 1396 genes present in all genomes (including paralogues), representing 44 % of the pangenome. The accessory genome (present in more than one but fewer than 369 isolates) contained 1353 genes (42 % of pangenome). Singletons (present in only one isolate) made up 14 % of the pangenome (436 genes). Singleton gene counts ranged from 0 to 132 genes, and most were carried by GPS Project isolates, with an ST10625 isolate from The Gambia carrying the highest number of singletons (ERR1192063; n=132 singletons). Among Alberta isolates, singleton genes ranged from 0 to 11 genes, with the highest numbers of singletons carried by isolates SRR3486098 (ST1794; n=11), SC21-2825-P (ST235; n=10), and SC21-2761-P (ST235; n=8). The total length of all genomes ranged from 1908068 to 2251137 bp, GC-content ranged from 39.4 to 40.6 %, and percent completion ranged from 97.2 to 100 %, as calculated by Anvi'o. These calculations were similar to Quast/BUSCO results, which determined genome length to range from 1889737 to 2227598 bp and percent completion to range from 95.27 to 97.97 % (Table S1). All isolates exhibited high average nucleotide identity (ANI) with a minimum ANI of 98.3 % among all Alberta and GPS Project isolates combined and a minimum ANI of 98.6 % among Alberta isolates alone.
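The core/accessory/singleton partition used above can be illustrated with a small sketch. It assumes a gene presence/absence mapping from gene cluster to the set of genomes carrying it; the function name and toy data are hypothetical and this is not the Anvi'o workflow itself. The final lines recompute the reported pangenome shares from the gene counts given in the text.

```python
def partition_pangenome(presence, n_genomes):
    """Count core, accessory and singleton gene clusters.

    presence maps a gene cluster name to the set of genome ids that carry it.
    """
    counts = {"core": 0, "accessory": 0, "singleton": 0}
    for genomes in presence.values():
        n = len(genomes)
        if n == 0:
            continue  # cluster absent from every genome: ignore
        if n == n_genomes:
            counts["core"] += 1       # present in all genomes
        elif n == 1:
            counts["singleton"] += 1  # present in exactly one genome
        else:
            counts["accessory"] += 1  # present in more than one but not all
    return counts

# Toy presence/absence data for four genomes (invented for illustration).
toy = {
    "g1": {"A", "B", "C", "D"},
    "g2": {"A", "B"},
    "g3": {"C"},
    "g4": {"A", "B", "C", "D"},
    "g5": {"D"},
}
toy_counts = partition_pangenome(toy, n_genomes=4)

# Shares recomputed from the gene counts reported in the text.
reported = {"core": 1396, "accessory": 1353, "singleton": 436}
total = sum(reported.values())
shares = {k: round(100 * v / total) for k, v in reported.items()}
print(toy_counts, total, shares)
```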

Antibiotic resistance genes carried by Alberta and GPS Project serogroup 20 isolates
To determine the presence of antimicrobial resistance genes among both Alberta and GPS Project isolates, genomes were mined for antibiotic resistance ontologies (AROs) using the Comprehensive Antibiotic Resistance Database, a curated collection of peer-reviewed resistance determinants [50]. AROs were mined and categorized by the Resistance Gene Identifier (RGI) as 'perfect' or 'strict', signifying a perfect match to curated reference sequences or imperfect matches that met curated blastp bit score cut-offs and represent likely functional AMR gene variants, respectively [50]. All isolates carried genes similar to patA, patB and pmrA, which are transporters implicated in fluoroquinolone resistance, as well as a gene for RlmA(II), a methyltransferase involved in macrolide/lincosamide resistance (Table 2). Some resistance factors were carried by only a few Alberta and GPS Project isolates, which included fluoroquinolone-resistant parC and ribosomal protection protein tetM, which provides tetracycline resistance (Table 2). Antimicrobial resistance genes carried by only Alberta isolates included fosfomycin inactivation factor FosA4 and antibiotic target modifier ErmB, which confers resistance to macrolide, lincosamide and streptogramin antibiotics (Table 2). Antibiotic resistance genes present among only GPS Project isolates included an acetyltransferase involved in phenicol antibiotic inactivation, copies of pbp1a and pbp2x genes with mutations conferring β-lactam antibiotic resistance, and tet(W/N/W), which protects against tetracycline (Table 2). The RGI also identified 'loose' hits, which are hits that fall outside of the detection model cut-off values. Loose hits can provide detection of new AMR genes or distant homologues, but may also identify homologues and spurious matches not truly involved in antibiotic resistance. Included in the loose hits for this collection of genomes were 2015 resistance determinants that fell into a wide range of resistance mechanisms with percent identities below 88 %. Loose hits have been included separately in Table S5.
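The separation of perfect/strict hits from loose hits can be sketched as below. This is an illustration under stated assumptions rather than the analysis pipeline used here: the column names Cut_Off and Best_Hit_ARO are modelled on RGI tabular output and should be checked against the RGI version in use, the isolate column is a hypothetical addition for pooling results across genomes, and the rows are invented placeholders.

```python
from collections import defaultdict

CONFIDENT = {"Perfect", "Strict"}

def split_hits(rows):
    """Return (confident, loose): gene -> set of isolates, split by cut-off category."""
    confident, loose = defaultdict(set), defaultdict(set)
    for row in rows:
        target = confident if row["Cut_Off"] in CONFIDENT else loose
        target[row["Best_Hit_ARO"]].add(row["isolate"])
    return dict(confident), dict(loose)

# Placeholder rows; gene and isolate names are invented for illustration.
rows = [
    {"isolate": "iso1", "Best_Hit_ARO": "geneA", "Cut_Off": "Strict"},
    {"isolate": "iso2", "Best_Hit_ARO": "geneA", "Cut_Off": "Perfect"},
    {"isolate": "iso1", "Best_Hit_ARO": "geneB", "Cut_Off": "Loose"},
    {"isolate": "iso2", "Best_Hit_ARO": "geneC", "Cut_Off": "Strict"},
]
confident, loose = split_hits(rows)
for gene, isolates in sorted(confident.items()):
    print(gene, len(isolates))
```

Keeping loose hits in a separate structure mirrors the reporting choice above, where they are tabulated apart from perfect and strict hits.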

Virulence factors carried by serogroup 20 isolates
Both Alberta and GPS Project serogroup 20 genomes were surveyed for virulence factors via comparison to the Virulence Factor Database (VFDB) core gene set, which contains representative genes for experimentally verified virulence factors [51].

Alberta and GPS Project isolates are primarily serotype 20B
The serotypes of Alberta and GPS Project serogroup 20 isolates (n=369 total) were confirmed in silico by analysing capsular polysaccharide (CPS) gene clusters using PneumoCaT [47]. Among all isolates analysed, 304/369 were clearly serogroup 20; the remaining isolates were most similar to serogroup 20 but did not meet PneumoCaT confidence cut-offs. To determine if a particular serotype was dominant among serogroup 20 isolates (i.e. serotype 20A or 20B), BLAST was used to identify isolates with a full-length whaF gene, which would be suggestive of serotype 20B [10]. There were 253/369 isolates (69 %) that carried full-length whaF genes suggestive of serotype 20B, which included 235/274 (86 %) of the Alberta isolates. The remaining Alberta and GPS Project isolates had truncated whaF genes, including 31 isolates where the gene sequence was on a contig end, preventing determination of the true length. The whaF gene was not identified in six isolates.
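The whaF categories reported above (full-length, truncated, truncated at a contig end, not identified) could be assigned from BLAST hits along the following lines. The criteria are assumptions made for illustration, since the exact thresholds are not stated: a hit counts as full-length when its aligned length covers the whole reference gene, contig coordinates are taken as 1-based, and the field names, function name and reference length are hypothetical.

```python
def classify_whaf(hit, ref_len, min_fraction=1.0):
    """Classify a best whaF hit; hit is None when no match was found."""
    if hit is None:
        return "not identified"
    if hit["aligned_len"] >= min_fraction * ref_len:
        return "full-length"
    start, end = sorted((hit["contig_start"], hit["contig_end"]))
    if start == 1 or end == hit["contig_len"]:
        # The alignment runs into a contig boundary, so the true gene
        # length cannot be determined from this assembly.
        return "truncated (contig end)"
    return "truncated"

REF_LEN = 1000  # placeholder value, not the real whaF length

examples = {
    "full": {"aligned_len": 1000, "contig_start": 500, "contig_end": 1499, "contig_len": 5000},
    "internal": {"aligned_len": 600, "contig_start": 2000, "contig_end": 2599, "contig_len": 5000},
    "edge": {"aligned_len": 600, "contig_start": 4401, "contig_end": 5000, "contig_len": 5000},
    "missing": None,
}
calls = {name: classify_whaf(hit, REF_LEN) for name, hit in examples.items()}
print(calls)
```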

DISCUSSION
In this study we characterized 274 invasive S. pneumoniae serogroup 20 isolates identified in Alberta from 1993 to 2019 and compared these to 95 invasive and non-invasive serogroup 20 GPS Project isolates collected from 1998 to 2015. The goal was to characterize this increasingly prevalent serogroup within Alberta primarily through genomic analyses, which has not been conducted previously for this serogroup on as large a scale, to the best of our knowledge. We evaluated basic demographics, STs, in silico serotyping, antimicrobial resistance, virulence factors, and phylogenetic and pangenomic relationships between isolates.
In this study, the majority of Alberta serogroup 20 isolates belonged to ST235, which was first identified in the province in 2007. In PubMLST, 503 isolates were labelled as serotype 20, with representative isolates from Africa, Asia, Europe, North America, South America and Oceania, including 35 countries not represented in our dataset (https://pubmlst.org/; accessed 3 May 2023). Among the serogroup 20 isolates, 45 were designated ST235 (https://pubmlst.org/; accessed 3 May 2023). Source data was only available for 10/45 serogroup 20 ST235 isolates, with the majority (8/10, 80 %) being considered invasive. The second most common ST in the Alberta serogroup 20 dataset, ST6805, was represented by only a single serogroup 20 isolate in PubMLST, which was an invasive isolate from New Zealand (PubMLST id:13791). Among all ST235 in PubMLST (n=52), the majority were serogroup 20, with one isolate being serotype 7C and six with inconclusive serotyping. Interestingly, ST235 was limited among the GPS Project genomes included in this study, with single isolates in Israel and the USA that clustered phylogenetically with Alberta ST235 (Figs 2a and 3). Identification of invasive serogroup 20 ST235 in the literature is also scarce. A study from Brazil surveyed serotypes and genotypes of invasive S. pneumoniae before and after PCV10 implementation, finding that serogroup 20 had increased post-vaccine introduction [64]. Interestingly, these isolates were all from ST8889, which is a single locus variant of ST235 [64]. Similar to our study, the majority of isolates were from males with average and median ages of 61.6 and 62 years, respectively, and were recovered primarily from blood [16]. Clinical data was collected for the presented patients and all were found to have at least one underlying condition, with the most common being chronic obstructive pulmonary disease, alcoholism, systemic arterial hypertension and smoking, which are well-understood to facilitate pneumococcal disease [16].
Isolate SC21-2726-P was not assigned an ST, indicating that perhaps this isolate represents a novel ST or one that is not present in the current MLST scheme. Alternatively, a lack of ST assignment may be the result of missing relevant sequences due to limitations in sequencing/assembly of the genome.
The limited number of serogroup 20 ST235 isolates identified outside of Alberta may be due to a variety of factors. First, if ST235 is associated with invasive disease, representation within the GPS Project may be limited since most isolates were not invasive. Second, the GPS Project sample size was small, isolates were collected over a slightly different time frame than Alberta isolates (1993-2019 for Alberta vs. 1998-2015 for GPS), and isolates had a limited global representation of only 12 countries. If ST235 has had a more recent or globally restricted emergence, it is possible that inclusion of a larger number of global genomes over an extended time frame would reveal additional ST235 isolates and shed more light on the profile of this ST globally. Finally, limited distribution may be due to the overall low incidence of ST235 among S. pneumoniae isolates as suggested by the relatively small number of ST235 in PubMLST. As we do not have representative global data and are limited by sample size, additional study and surveillance would be required to ascertain the true global distribution of ST235.
Rates of antibiotic resistance are increasing among S. pneumoniae isolates and vary depending on geographic location [65]. Therefore, tracking antimicrobial resistance can provide insight into local, national and international rates and types of resistance. In general, resistance has been observed for multiple antibiotics, including β-lactams, macrolides, fluoroquinolones, trimethoprim-sulfamethoxazole (TMP-SMX) and tetracyclines. Antibiotic resistance was limited among the Alberta S. pneumoniae serogroup 20 isolates in this study, with ≤2 % resistance among isolates challenged with azithromycin, clindamycin, erythromycin, tetracycline and TMP-SMX (Table 1). In 2020, a Canadian Communicable Disease Report publication on invasive pneumococcal disease across Canada reported 45 serogroup 20 isolates, among which the following antibiotic resistance was observed: 4.4 % to clarithromycin, 2.2 % to clindamycin, 6.7 % to doxycycline, 6.7 % to TMP-SMX and 2.2 % multi-drug resistance [66]. These percentages were below the already low national percentages for all serotypes combined for 2020. No resistance was reported for serogroup 20 to penicillin, ceftriaxone, imipenem, meropenem, levofloxacin and chloramphenicol in the national study. Among the antibiotics both tested in our study and included in the national survey, antibiotic resistance rates were similar, with the exception of clindamycin and TMP-SMX, which had higher percent resistance in the national study (Table 1).
Serogroup 20 isolates carried several antibiotic resistance genes (Table 2). All isolates included in our study carried patA, patB and pmrA efflux pumps, which can confer fluoroquinolone resistance, as well as RlmA(II), a methyltransferase that can confer resistance to tylosin and other mycinosylated macrolides (Table 2) [67]. The efflux pump pmrA has been shown to impart low-level resistance to norfloxacin but does not appear to play a major role in fluoroquinolone resistance in S. pneumoniae [68,69].
Resistance to hydrophilic fluoroquinolones (e.g. ciprofloxacin, norfloxacin) appears to be linked to overexpression of patA/patB, and the presence of these genes within wild-type strains confers only low-level intrinsic resistance [70]. The remaining antibiotic resistance genes identified had very limited distribution, including among Alberta serogroup 20 isolates, which is not surprising given the limited phenotypic resistance observed (Table 2).
The Alberta and GPS Project genomes were characterized further in terms of genomic composition. All isolates had Average Nucleotide Identity (ANI) values above 98 %, indicating their close genomic relatedness. Similarities were also observed for virulence factor carriage, which was comparable across all isolates, with most virulence factors being present in all isolates (Table 3). No distinct patterns emerged linking virulence factors to the invasive Alberta isolates in comparison to the predominantly non-invasive GPS Project isolates. Interestingly, the serogroup 20 isolates carried a high number of accessory and singleton genes. This observation aligns with the notion that S. pneumoniae has an 'open' genome, whereby the inclusion of more isolates would contribute an additional, albeit decreasing, number of genes in the form of non-core genes [72]. Notably, the Alberta isolates carried the fewest singleton genes, potentially due to the clonal nature of these isolates, particularly those belonging to the ST235 lineage (Fig. 4).
Currently, PPV23 is the only vaccine that contains serogroup 20. Previous studies have determined that PPV23 contains serotype 20A polysaccharide and that antibodies from PPV23 vaccination mediate opsonophagocytic killing of serotype 20B in vitro; nevertheless, the clinical relevance of this cross-reactivity requires further epidemiological investigation [10,11]. In Alberta, despite the implementation of PPV23 vaccination in 1997, cases of invasive serogroup 20, appearing to be predominantly serotype 20B, have been increasing, potentially calling into question the cross-protectivity of the vaccine. However, as detailed clinical and socioeconomic data was not available for the cases presented in this study, it is possible that the increase in cases is due to transmission among vulnerable groups that have not been previously vaccinated. In Alberta, the increase in case numbers has been primarily among middle-aged males, who would not necessarily be eligible under current vaccination programme criteria. We also saw clustering in cities, which could be an indicator of increased risk within these areas (Fig. 2b). For example, serogroup 20 has been found to be disproportionately represented among homeless individuals in comparison to the general population, and in general, IPD in this group has the highest incidence among middle-aged males [73]. If this is the case in Alberta, targeted vaccination programmes for at-risk individuals using serogroup 20-containing vaccines such as PPV23, or when available the conjugate vaccine V116, may be necessary if case numbers continue to increase.

CONCLUSION
The increasing prevalence of invasive S. pneumoniae serogroup 20 in Alberta underscores the need for ongoing surveillance and identification of high-risk population groups. This will require continued surveillance of serogroup 20 both locally and globally to better understand epidemiological trends, develop effective prevention and control strategies, and contribute to pathogen characterization. Genomics has emerged as a valuable tool in pathogen characterization and epidemiology, enabling better understanding of the spread and genetic characteristics of this pathogen. By leveraging genomics-based epidemiological studies to inform public health strategies, it is hoped that the burden of pneumococcal disease can be minimized.

Fig. 1. Sequence types (STs) of invasive S. pneumoniae serogroup 20 in Alberta over time. Stacked bars are coloured according to MLST.

Fig. 2. Geographic location of S. pneumoniae serogroup 20 isolates included in this study. (a) Worldwide distribution of serogroup 20 isolates from GPS Project (invasive and non-invasive isolates) and Alberta (invasive isolates). Pie charts are coloured by MLST and diameters correspond to number of isolates. Abbreviations: no., number; UA, unassigned; USA, United States of America. (b) Invasive serogroup 20 cases in Alberta. Dots are centred in postal code regions (first three digits) and circle diameters correspond to number of isolates. Twenty-six isolates were from individuals without an address (not shown on map).

Fig. 3. Maximum-likelihood phylogenetic tree of Alberta and GPS Project S. pneumoniae serogroup 20 isolates. GPS Project isolates that grouped with Alberta isolates are indicated with letters: (a) one isolate each from the USA (ST235; invasive) and Israel (ST235; invasive); (b) five isolates from Brazil (ST1030; invasive); (c) four isolates from the USA (ST1257; invasive) and one from Israel (ST13913; invasive); and (d) four isolates from Nepal (ST4745; one invasive, three colonizing). The reference sequence (Alberta isolate SRR3486078; ST235) used in SNP alignment generation is indicated with an asterisk (*). Branch tips are coloured by MLST and Alberta isolates are highlighted in grey. Abbreviation: UA, unassigned.

Fig. 4. Pangenome of Alberta (yellow) and GPS Project (grey) S. pneumoniae serogroup 20 isolates. Each ring represents a single genome and dark orange or dark grey designates the presence of a gene. The dendrogram of isolates is ordered by fastANI. Singleton genes ranged from 0 to 132 genes, GC-content from 39.4 to 40.6 %, and total length from 1908068 to 2251137 bp. Abbreviation: SGC, single gene cluster (singletons).

Table 1. Phenotypic antibiotic susceptibility of invasive Alberta S. pneumoniae serogroup 20
*Among the 270 isolates challenged with trimethoprim/sulfamethoxazole, 255 isolates tested strictly S, 4 isolates tested strictly I, 9 isolates tested S for DD and I for MIC, and one isolate tested I for DD and S for MIC. DD, disk diffusion; I, intermediate; MIC, minimum inhibitory concentration; Non-S, non-susceptible; PO, by mouth; R, resistant; S, susceptible; TMP-SMX, trimethoprim/sulfamethoxazole.

Table 2. Antimicrobial resistance genes carried by Alberta (AB) and GPS Project S. pneumoniae serogroup 20 isolates*
*Shown are perfect and strict Resistance Gene Identifier (RGI) hits to the Comprehensive Antibiotic Resistance Database (CARD) for 369 total strains of which 274 are from Alberta. †The lowest percent identity of an isolate gene to the RGI hit among all isolates with that gene. ‡Excludes multiple copies if applicable and includes only perfect and strict RGI hits.

Table 3. Virulence factors carried by Alberta (AB) and GPS Project S. pneumoniae serogroup 20 isolates*
*Shown are Virulence Factor Database (VFDB) hits for 369 total strains of which 274 are from Alberta. †The lowest percent sequence coverage of an isolate gene to the VFDB gene among all isolates with that gene. ‡The lowest percent identity of an isolate gene to the VFDB among all isolates with that gene. §Excludes multiple copies if applicable.