Local Genomic Surveillance of Invasive Streptococcus pyogenes in Eastern North Carolina (ENC) in 2022–2023

The recent increase in Group A Streptococcus (GAS) incidence in several European countries and some areas of the United States (U.S.) has raised concerns. To understand GAS diversity and prevalence, we conducted local genomic surveillance in Eastern North Carolina (ENC) in 2022–2023 with 95 isolates and compared the results to those of the existing national genomic surveillance in the U.S. in 2015–2021 with 13,064 isolates. We observed epidemiological changes before and during the COVID-19 pandemic and detected a unique sub-lineage in ENC within the most common invasive GAS strain, ST28/emm1. We further discovered a multiple-copy insertion sequence, ISLgar5, in ST399/emm77 and its single-copy variants in some other GAS strains. ISLgar5 was linked to a Tn5801-like tetM-carrying integrative and conjugative element, and its copy number was associated with an ermT-carrying pRW35-like plasmid. The dynamic insertions of ISLgar5 may play a vital role in genome fitness and adaptation, driving GAS evolution relevant to antimicrobial resistance and potentially GAS virulence.


Introduction
Streptococcus pyogenes, also known as Group A Streptococcus (GAS), is a strictly human pathogen causing over 700 million infections and over 500,000 deaths annually worldwide as of 2005 [1]. GAS is among the top ten infectious-disease-related causes of death globally [1]. The Centers for Disease Control and Prevention (CDC) has classified erythromycin-resistant GAS as a "concerning threat", causing 5400 infections and 450 deaths per year in the United States (U.S.) as of 2017, up from 1300 infections and 160 deaths as of 2010 [2]. In December 2022, the World Health Organization (WHO) reported a surge of invasive GAS (iGAS) infections in several European countries [3], including the United Kingdom [4,5], France [6], the Netherlands [7], Belgium [8], Portugal [9], and Denmark [10]. In February 2023, the CDC also reported a higher level of iGAS infections in some U.S. areas, including Colorado and Minnesota [11]. In response, we investigated iGAS infections locally in Eastern North Carolina (ENC) in 2022-2023 and compared them to U.S. iGAS infections in 2015-2021 that have been whole-genome sequenced under the CDC's Active Bacterial Core surveillance (ABCs).
As of 8 August 2023, there are 1384 multi-locus sequence types (MLSTs) [12] in the PubMLST typing scheme (https://pubmlst.org/) and 2713 cell surface M protein gene (emm) types and subtypes [13] in the U.S. CDC M type-specific sequence database (https://ftp.cdc.gov/pub/infectious_diseases/biotech/emmsequ/) for GAS, illustrating its vast genetic diversity. Despite the recently updated emm-typing protocol [14], whole-genome sequencing (WGS) is increasingly used to study GAS epidemiology and evolution, as it provides comprehensive information that includes MLSTs, emm-types, antimicrobial resistance genes, virulence factors, and alterations in genome organization. By determining the genetic relationship of isolates, WGS-based genomic surveillance could support the close monitoring and tracking of GAS transmission, assist in outbreak investigations, and help inform vaccine development [15,16].
In our previous genomic surveillance [17], we compared various genomic analysis methods, including single-nucleotide variant (SNV)-based, core- and/or pan-genome-based, and k-mer-based analyses. We found that an SNV as a unit was too small to identify structural variants, while a gene as a unit in core-/pan-genome analysis was too large. Also, SNV-based analysis is reference-dependent, whereas gene-based analysis is annotation-dependent. Any missing data or potential errors in the reference or during gene annotation can introduce bias into the results. As such, SNV-based analysis and gene-based core-/pan-genome analysis often result in a relatively lower resolution. In contrast, k-mer-based analysis is adjustable in size and is neither reference- nor annotation-dependent. With these advantages, we employed k-mer (i.e., 31-mer, unless specified otherwise) analysis in this local GAS genomic surveillance study.
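To make the k-mer unit concrete, the following is a minimal sketch of counting canonical 31-mers in a DNA sequence using only the Python standard library. It is an illustration under stated assumptions, not the pipeline used in this study (which relied on kWIP and kmdiff); the function names `revcomp` and `canonical_kmers` are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq: str, k: int = 31) -> Counter:
    """Count canonical k-mers in seq.

    A canonical k-mer is the lexicographically smaller of a k-mer and its
    reverse complement, so both strands map to the same key. Windows that
    contain any base other than A, C, G, or T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[min(kmer, revcomp(kmer))] += 1
    return counts
```

Because no reference genome or gene annotation enters the computation, two isolates can be compared directly through their k-mer count profiles, and changing `k` adjusts the size of the unit of comparison.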

Local Genomic Surveillance of Streptococcus pyogenes Strains
We conducted a retrospective genomic surveillance of 96 GAS isolates collected in ENC from March 2022 to July 2023, one of which (Spyo57) was found to be a mixture of GAS and Streptococcus agalactiae after WGS and was thus excluded from the study. Of the remaining 95 isolates, 54 were invasive, 32 were non-invasive (nGAS), and nine were from urine (uGAS). Invasive GAS was defined as the collection of a GAS isolate from a normally sterile site or from a wound in a patient with necrotizing fasciitis or streptococcal toxic shock syndrome (STSS), according to the standard definition [18]. Due to the retrospective nature of the study, we did not investigate the urine cases further to determine whether these uGAS strains were invasive or non-invasive. The median age of these patients was 33 years (ranging from 1 to 84), 45 years for iGAS and 10.5 years for nGAS, consistent with the prevalence of nGAS pharyngitis in children. Additionally, 47/95 (49.5%) patients were male, including 32/54 (59.3%) of those with iGAS and 14/32 (43.8%) of those with nGAS. Blood was the main source of iGAS isolates (31/54, 57.4%). Of 54 iGAS patients, 22 were in the intensive care unit (14 males with a median age of 58.5), and 15 had STSS (11 males with a median age of 61), seven of whom died (six males with a median age of 61). The demographic and clinical information of these 95 patients, including the antimicrobial susceptibility test results of 36 isolates (27 iGAS), is summarized in Supplementary Table S1. No tested isolate was found to be resistant to penicillin, ceftriaxone, or vancomycin.
The CDC ABCs is an active laboratory- and population-based surveillance system for invasive bacterial pathogens of public health importance, including iGAS. To compare our local genomic surveillance with the existing national surveillance in the U.S., we analyzed all WGS data of iGAS isolates publicly available in the ABCs' BioProject PRJNA395240, a total of 13,064 isolates. Notably, the iGAS surveillance data in 2015, 2016, and 2017 were reported previously [23,24]. Nevertheless, Table 2 summarizes the main GAS genotypes each year in the U.S. from 2015 to 2021, including ST399/emm77, for a comparison to those in our dataset. Among these, ST28/emm1 was the most prevalent each year, except in 2021 (n = 17, only 2.73%), and ST36/emm12 decreased almost every year, from 9.1% (n = 132, the third most common) in 2015 to 4.1% (n = 76, 6th) in 2020, with only six (0.96%) in 2021. The decrease in ST101/emm89 is also notable, like that of ST36/emm12. Intriguingly, the COVID-19 pandemic not only reduced iGAS incidence [25] but also changed the pattern of prevalent iGAS genotypes. While ST334/emm82 increased significantly in 2020 and 2021 (ranking second in both years), ST433/emm49 became the most common in 2021 (15.27%). In contrast to this national surveillance data, ST28/emm1 and ST36/emm12 were much more dominant in our local iGAS data during 2022-2023, with 16/54 (29.6%) and 18/54 (33.3%), respectively, along with an upsurge of ST399/emm77 (4/95, 4.2%).
Our focused kWIP analysis on the M1a-2b isolates further distinguished M1UK from M1global (Figure S4C,D), the discrepancy between which was defined as 27 SNVs and four indels [21,27]. Our three nGAS strains, along with 21 isolates from 2019 and 20 isolates from 2020 in the U.S., were close to M1UK. We validated this clustering result by using the SNV-based analysis with MGAS5005 (M1 serotype, NC_007297) [26] as a reference and confirmed 23/27 SNVs in three M1UK isolates and all 27 SNVs in the remaining 41 M1UK isolates (Figure S4E). The relatedness clustering of all ST28/emm1 GAS strains (n = 939) in our local and national comparative genomic analyses is detailed in Table S5.

Multiple-Copy ISLgar5 and Single-Copy ISLgar5 Variants in Streptococcus pyogenes
To identify genetic elements associated with iGAS infection, we performed a genome-wide association study (GWAS) on 54 iGAS vs. 32 nGAS strains using the kmdiff algorithm, a differential k-mer analysis between two populations [28]. A total of 189 differential k-mers were identified in iGAS with significance (p ≤ 2.2 × 10^−9), from which two fragments (123 bp each) were assembled de novo. A BLAST search of these two fragments hit a single insertion sequence, ISLgar5, with one at its 5′ end and the other at its 3′ end (Figure 3A). According to ISFinder [29] (https://www-is.biotoul.fr/, accessed on 8 August 2023), ISLgar5 is 1336 bp long, with 26 imperfect terminal repeat sequences (IRs, four mismatches), belonging to the IS256 family. Initially identified in Lactococcus garvieae IPLA31405, ISLgar5 has been found in Firmicutes, mainly in Enterococcus, Staphylococcus, and Streptococcus, but only one has been found in GAS, i.e., emmSTG866.1 (CP035428) in Kenya in 2005 [16], in our BLAST search of the NCBI public nucleotide database. However, we later found that our GWAS was biased by multiple copies (14-15 as estimated) of ISLgar5 existing in our four ST399/emm77 isolates, based upon the assessment of differential sequencing coverage between the mobile genetic element (MGE) and the whole genome. In our screening of the NCBI Sequence Read Archive (SRA) public database, we detected a total of 111 isolates harboring multiple copies of ISLgar5, all of which belonged to the genotype ST399/emm77, with one exception, SRR18933625 (ST458/emm28) (Table S6A). Of 110 ST399/emm77 isolates, 99 were from the ABCs' BioProject PRJNA395240 in the U.S. from 2015 to 2021; one was from Texas in 2014 (PRJNA494557); one was from Ireland in 2020 (PRJEB34287); and nine were from the UK before 2015 (PRJEB13551, PRJEB12015, and PRJEB17673), indicating its geographically wide existence. The estimated ISLgar5 abundance in each ST399/emm77 isolate ranged from 3 to 39, with a median of 11.
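The copy-number estimate from differential sequencing coverage can be sketched as follows. The text does not give the exact formula, so this example assumes a common approximation: the mean read depth over the element divided by the mean read depth over the whole genome. The function name `estimate_copy_number` and the depth values are hypothetical and are not data from this study.

```python
from statistics import mean


def estimate_copy_number(mge_depths, genome_depths):
    """Estimate the copy number of a mobile genetic element.

    Assumed formula: mean per-base depth over the element divided by
    mean per-base depth over the whole genome.
    """
    genome_mean = mean(genome_depths)
    if genome_mean <= 0:
        raise ValueError("genome depth must be positive")
    return mean(mge_depths) / genome_mean


# Hypothetical per-base depths for illustration only.
mge = [1400, 1450, 1350]
genome = [100, 95, 105]
print(round(estimate_copy_number(mge, genome)))
```

An estimate of this kind is sensitive to uneven coverage and to reads that map ambiguously across copies, which is one reason a long-read assembly is useful for confirming the count.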

Association of ISLgar5 with Streptococcus pyogenes Antimicrobial Resistance
Most ISLgar5-carrying isolates (121/129, 93.8%), including our four ST399/emm77 isolates, harbored the tetM gene, conferring resistance to tetracycline [15,31], which led us to suspect that ISLgar5 is linked to tetM. By using the emmSTG866.1 genome as a reference, we identified a novel ~26 kb Tn5801-like integrative and conjugative element (ICE, 986,768-960,966 in CP035428) carrying both ISLgar5 and tetM (Figure 3C). The ICE starts with ISLgar5 and ends with a site-specific integrase, between which are genes encoding ATP-dependent endonuclease and helicase, anti-restriction protein ArdA, conjugal transfer proteins, regulatory transcription factors (the XRE family and sigma-70 family), accessory tetM, etc. All tetM-positive iGAS strains harbored this tetM-carrying ICE in the chromosome, whereas the tetM-negative ST399/emm77 isolates (n = 7) had a large deletion of internal genetic components, with only ISLgar5, the sigma-70 family transcription factor, and integrase remaining (Figure 3C). Notably, compared to its genome sequencing coverage, SRR7706789 had a relatively lower sequencing coverage on the ICE, suggesting this MGE was not fully integrated into the genome (either moving in or moving out), putting the tetM positivity in question (Table S6B). In the ISLgar5-3-carrying ST904/emm77 isolate, the tetM-carrying ICE had a 53 bp deletion at the 3′ end of a gene encoding a DUF87 domain-containing protein, disrupting its translation (Figure 3C). Mutations of the Tn5801-like ICE have also been found in the ISLgar5-1-carrying isolate SRR18923745 (ST677/emm12).

Characterization of Multiple-Copy ISLgar5 in ST399/emm77
Currently, no complete ST399/emm77 genome is available in the NCBI public genome database. We therefore conducted Pacific Biosciences (PacBio) HiFi long-read sequencing on Spyo01 and Spyo09, along with an M1a-3 isolate, Spyo06, to obtain their complete genomes. Figure 4A shows a whole-genome comparison of these isolates with references M1, emmSTG866.1, and emm77 (ST588, CP035439). We confirmed 14 copies of ISLgar5 in the Spyo01 chromosome (CP136948), but only 13 copies in Spyo09 (CP136951), one copy short of our initial estimation. Their insertion sites and correspondingly affected genes are summarized in Table S7. Interestingly, ISLgar5 had also inserted into its own ICE in both Spyo01 and Spyo09 (#6, in contrast to the original #5, Figure 3C), which might prevent the MGE from further transposition. Apart from the one-copy ISLgar5 difference between Spyo01 and Spyo09, there was an insertion of 10 bp (2× TGTTT repeat) in Spyo01 impacting the expression of LPXTG-anchored collagen-like adhesin Scl2 (R3H37_05230 in Spyo01, compared to R3H61_05235 in Spyo09) and an insertion of ~5.3 kb in Spyo09 containing eight genes (R3H61_07350-07385), including the exotoxin gene speJ, which is consistent with the virulence factor profiling shown in Table S4. Other than these discrepancies, the genomes of Spyo01 and Spyo09 are almost identical.
The insertion of ISLgar5 generates an eight bp duplicate repeat (DR) or target site repeat (TSR) at its ends. We uncovered that all single-copy ISLgar5 isolates and their variants had the same conservative DR, TTATAATG, in the ISLgar5-containing ICE. However, in the ICE of multiple-copy ISLgar5-containing iGAS strains, there was a four or five bp deletion in their 3′ DR. Moreover, we found more deletions of various lengths at the 3′ end of ISLgar5, e.g., 82 bp at #7, 598 bp at #13, and 5994 bp at #10, including the aforementioned large deletion in the tetM-negative ST399/emm77 isolates, which led to their 3′ DR differing from their 5′ DR (Table S7). We thus suspect that multiple-copy ISLgar5 variants may have an additional role in deletion, distinct from single-copy ISLgar5 variants.
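The comparison of 5′ and 3′ DRs described above can be sketched as a simple flank check. This is an illustrative assumption about how such a check might be written, not the method used in the study; the function names are hypothetical, and the element body in the toy sequence is a placeholder rather than real ISLgar5 sequence. Only the DR motif TTATAATG comes from the text.

```python
def flanking_repeats(genome: str, start: int, end: int, dr_len: int = 8):
    """Return the dr_len bases immediately left and right of an element
    occupying genome[start:end] (0-based, end-exclusive)."""
    if start < dr_len or end + dr_len > len(genome):
        raise ValueError("element is too close to the sequence ends")
    return genome[start - dr_len:start], genome[end:end + dr_len]


def has_intact_dr(genome: str, start: int, end: int, dr_len: int = 8) -> bool:
    """True when the 5' and 3' flanks of the element are identical."""
    left, right = flanking_repeats(genome, start, end, dr_len)
    return left == right


# Toy example: placeholder element flanked by the DR reported in the text.
element = "ACGTACGTAC"
toy = "GGG" + "TTATAATG" + element + "TTATAATG" + "CCC"
start = 3 + 8
end = start + len(element)
print(has_intact_dr(toy, start, end))
```

A 3′ flank shortened by a deletion no longer matches the 5′ flank, so the same check would flag insertion sites like those listed in Table S7 for closer inspection.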

Discussion
We employed a k-mer-based kWIP relatedness analysis for the genomic surveillance of local iGAS strains and compared the results to those of the national surveillance. The k-mer-based population approach has the discriminatory power to differentiate isolates mainly via structural variants (e.g., ∆11k and contaminated phi174), as well as SNVs (e.g., M1UK vs. M1global). In the national genomic re-analysis of M1 GAS, we found three main lineages and several sub-lineages and uncovered GAS epidemiological changes in the U.S. before and during the COVID-19 pandemic. Notably, compared to the previous M1UK emergence [27] and recent M1UK outbreaks in European countries [4,7-9,21], only a limited number (n = 41, 2018-2020) of M1UK isolates were identified in the U.S., similar to an earlier observation [35], which demonstrates a geographic difference in GAS circulation. Additionally, we revealed significant GAS prevalence differences between 2018 and 2019: the M1b lineage dominated in 2018 but disappeared from 2019 to 2020, and the M1a lineage prevailed in 2019. During the COVID-19 pandemic, iGAS cases and deaths were significantly reduced [25], from 7.6 in 2019 to 6.1 in 2020 and from 0.67 in 2019 to 0.55 in 2020 per 100,000 population, respectively (https://www.cdc.gov/abcs/reports-findings/surv-reports.html, accessed on 14 May 2024). As shown in Table 2, we also observed a pattern change in iGAS prevalence. These changes might be closely correlated with the implementation of COVID-19-associated nonpharmaceutical interventions [25].
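As a worked calculation, the rates quoted above correspond to relative reductions of roughly 20% in iGAS cases and 18% in iGAS deaths between 2019 and 2020. The short sketch below derives these percentages from the quoted rates; the function name `relative_reduction` is hypothetical, and the percentages are computed here rather than taken from the source.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the percentage decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("baseline rate must be positive")
    return (before - after) / before * 100


# Rates per 100,000 population, as quoted in the text for 2019 and 2020.
cases_drop = relative_reduction(7.6, 6.1)
deaths_drop = relative_reduction(0.67, 0.55)
print(f"cases: {cases_drop:.1f}%  deaths: {deaths_drop:.1f}%")
```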
Our local genomic surveillance of GAS revealed a unique sub-lineage, M1a-5, in ENC among the most common iGAS genotype, ST28/emm1. However, our local genomic surveillance was conducted over a relatively short time period with a relatively small sample size and was confined to ENC, a relatively large geographic region with a relatively small population. The significant age difference between our iGAS and nGAS collections is noted, mainly because nGAS is much more common among young people, whereas iGAS is much more common among adults. Compared to national GAS in the U.S. before 2022, our local GAS in 2022-2023 showed a high percentage of ST28/emm1, ST36/emm12, and ST399/emm77 isolates. Unfortunately, to date, there are no national data available on iGAS in 2022-2023. It will be of additional interest in the future to compare our local GAS data with the national GAS data in the same period to further explore GAS geographic features.
We noticed an inconsistency between resistance phenotypes and genotypes in the Spyo02, Spyo11, Spyo19, Spyo35, and Spyo51 strains. We therefore investigated rare erythromycin and clindamycin resistance mechanisms involving mutations in 23S rRNA, as well as ribosomal proteins L4 and L22 [36]. Although we observed some mutations in 23S rRNA, e.g., A1302C, T2021G, and T2166C (using AE004092 as a reference), they showed no significant difference between our susceptible and resistant isolates, except a rare mutation, C2702T, in Spyo11. No mutation in L4 or L22 was found. We reason that additional resistance mechanism(s) might exist in these erythromycin- and clindamycin-resistant isolates, which remain to be explored.
We employed both a kWIP k-mer-based relatedness analysis and a kmdiff differential analysis to investigate ST28/emm1. The combined use of kWIP and kmdiff can discriminate clusters and/or subclusters without any reference or genome annotation and can identify structural variants, such as ∆11k, phi174 contamination, and ISLgar5. In a subsequent global survey, we found multiple-copy ISLgar5 in ST399/emm77 isolates and single-copy ISLgar5 variants in other GAS strains. The host specificity of multiple-copy ISLgar5 in ST399/emm77 (except SRR18933625) strains and single-copy ISLgar5-3 in ST904/emm77 strains is intriguing. Based upon our current characterization of ISLgar5 and its variants, we assume the ISLgar5 mobility between GAS strains is through the ISLgar5- and tetM-carrying ICE, whereas the ISLgar5 mobility inside GAS depends on the T980 (W296) variant.
The ISLgar5- and tetM-carrying Tn5801-like ICE was originally detected in Staphylococcus aureus Mu50 (NC_002758) [37], and its presence was documented as early as 1953 in Streptococcus agalactiae [38]. Compared to Tn916 [39,40], Tn5801 shares the common synteny but has an extra ISLgar5 transposase, ATP-dependent endonuclease, and ATP-dependent helicase at one end, and it differs in the site-specific integrase at the other end. In consideration of ICE excision circularization, we propose a four-component cluster for the transfer mechanism of Tn5801-like ICE, comparable to the bacterial UvrABCD excision repair system [41]. Interestingly, in a genomic regional comparison between CP035428 and CP043530, this ICE integration generated an 11 bp target site replication (though attP GAGTGGGAGTA and attB GAATGGGAATA were not a perfect match), also comparable to the 12 bp excision by UvrABC. The 11 bp sequence located at the end of the glutamine-hydrolyzing GMP synthase-coding gene guaA is consistently found in association with Tn5801-like MGE in various species of Enterococcus, Staphylococcus, and Streptococcus [42], suggesting it is the Chi (crossover hotspot instigator) site specific for the Tn5801-like tyrosine recombinase.
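The imperfect match between the two 11 bp attachment sequences quoted above can be made explicit with a position-by-position comparison. The sketch below is illustrative only; the function name `mismatch_positions` is hypothetical, and the two sequences are those given in the text.

```python
def mismatch_positions(a: str, b: str) -> list:
    """Return the 1-based positions at which two equal-length
    sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


att_p = "GAGTGGGAGTA"  # attP, as quoted in the text
att_b = "GAATGGGAATA"  # attB, as quoted in the text
print(mismatch_positions(att_p, att_b))  # → [3, 9]
```

The two sequences differ at positions 3 and 9 (G in attP, A in attB), so 9 of the 11 bases are identical.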
ISLgar5 contains a unique 396 aa transposase with a DDE motif that has an 88% similarity to that in ISEfm2, both of which belong to the IS256 family. The IS256 family is characterized by replicative copy-out/paste-in transposition, generating an eight to nine bp DR on insertion [43,44]. The deletions identified at the 3′ end of multiple-copy ISLgar5 are extraordinary. W296, which is essential for multiple-copy ISLgar5 and its deletions, resides on the protein surface away from the DDE activity center in our predicted three-dimensional protein structure. The vital role of W296 in ISLgar5 transposition thus merits further investigation.
Like multiple-copy ISLgar5, IS256 also produces multiple insertions throughout the bacterial chromosome [45]. A recent study [46] demonstrated that IS256 mobility was tightly controlled at the transcriptional level and that IS256 insertion abundance coincided with phage infection and antibiotic exposure. Our genomic surveillance also revealed the association of ISLgar5 with antimicrobial resistance. Additionally, our transcriptomics analysis suggested that multiple-copy ISLgar5 might alter GAS virulence and pathogenesis. These results support the view that multiple-copy ISLgar5 is a crucial component of rapid genome fitness and adaptation in ST399/emm77. Environmental selective pressures, such as phages and antibiotics, may promote ISLgar5 diversification inside the GAS genome. After all, insertion sequences are key drivers of bacterial genome evolution, shaping bacterial responses [43,46,47].
In conclusion, using k-mer-based analyses, we revealed three main clusters of M1, i.e., M1a, M1b, and M1c, and a novel sub-cluster, i.e., M1a-5, in our local genomic surveillance of GAS along with the publicly available national genomic surveillance of iGAS. We also identified multiple-copy ISLgar5 specific to ST399/emm77 isolates, the copy number of which was associated with the antibiotic resistance gene ermT. Meanwhile, we demonstrated the power of reference-free and annotation-free genomic analyses, which should have wider applications in the future.

Bacterial Isolates
All GAS isolates in this retrospective study were collected from March 2022 to July 2023 in the Clinical Microbiology Laboratory of the ECU Health Medical Center. Clinical data on all cases were retrieved from patients' electronic medical records for review and consisted of age, gender, sample collection source and date, infection type/site, clinical diagnosis, and mortality. The clinically isolated GAS strains were cultured on 5% sheep blood agar plates and amplified in LIM broth (Todd Hewitt with CAN) or Tryptic Soy Broth (Becton, Dickinson and Company, Sparks, MD, USA). When clinically indicated, antimicrobial susceptibility testing was performed utilizing a combination of ETEST® for penicillin, ceftriaxone, and vancomycin (bioMérieux Inc., Durham, NC, USA) and Kirby-Bauer disk diffusion for erythromycin and clindamycin (Becton, Dickinson and Company). This retrospective study was approved by the University and Medical Center Institutional Review Board at East Carolina University (UMCIRB 23-000323).

Short-Read Whole-Genome Sequencing
Genomic DNA was extracted using the GenFind v3 kit and the Biomek i7 automation system (Beckman Coulter, Indianapolis, IN, USA). DNA quantification was performed using the AccuClear Nano dsDNA Assay and SpectraMax iD3 Fluorometer (Molecular Devices, San Jose, CA, USA). Multiplex sequencing libraries were prepared with the Nextera XT Library Prep kit (Illumina, San Diego, CA, USA), the quality and quantity of which were measured using the 4200 TapeStation (Agilent, Santa Clara, CA, USA) and the Qubit 4 Fluorometer (ThermoFisher, Waltham, MA, USA), respectively. Paired-end sequencing (300 × 2 cycles) was conducted using the NextSeq 2000 or MiSeq (Illumina) platform.

Long-Read Whole-Genome Sequencing
High-molecular-weight (HMW) genomic DNA was extracted using the Quick-DNA HMW MagBead kit (Zymo Research, Irvine, CA, USA) and sheared to 7-10 kb using g-TUBE (Covaris, Woburn, MA, USA). A HiFi long-read sequencing library of each isolate was prepared using the SMRTbell Prep kit 3.0 and Barcoded Adapter Plate 3.0 (PacBio, Menlo Park, CA, USA). Pooled libraries were loaded using the Polymerase Binding kit 3.2 and sequenced with a SMRT Cell 8M tray in the Sequel IIe system (PacBio), with a 2 h pre-extension time and a 30 h movie run.

Figure 1. Relatedness analysis of 95 Streptococcus pyogenes isolates in Eastern North Carolina, using kWIP analysis. (A) Hierarchical clustering of the 95 Group A Streptococcus (GAS) isolates along with three complete genome references: M1 (ST28/emm1), HKU488 (ST28/emm1, M1UK), and HKU360 (ST36/emm12). Height indicates the degree of difference between branches. (B) Multi-dimensional scaling (MDS) plot of the 95 GAS isolates along with the references. ST28/emm1 and ST36/emm12 are the two main clusters. The statistical significance p-value of MDS clustering is listed beside the figure, along with cluster number k. ST: sequence type.

Figure 2. Relatedness analysis of Streptococcus pyogenes ST28/emm1 (serotype M1) isolates in both local and national comparative genomic surveillances, using kWIP analysis. The statistical significance p-value of each multi-dimensional scaling (MDS) clustering is listed beside the figure, along with cluster number k. (A) MDS plot of 939 M1 isolates. Three main clusters, M1a, M1b, and M1c, are identified, compositions of which are parsed in (B), with the dominant M1a isolates boxed in orange. (C) MDS plot of 633 M1a isolates demonstrates five different clusters, compositions of which are parsed in (D). Each cluster has two sub-clusters, one with intact prophage 315 and the other with deletions in the ~11 kb region of the prophage (Δ11k). Numbers in parentheses (B,D) are non-invasive isolates sequenced in Eastern North Carolina (ENC). The novel sub-lineage identified in ENC is shaded in blue (D), which comprises two invasive and two non-invasive isolates.

Figure 3. Identification of an insertion sequence, ISLgar5, in association with invasive Streptococcus pyogenes antimicrobial resistance. (A) Identification of ISLgar5 in a genome-wide association study of Group A Streptococcus (GAS), though this was later found to be biased by multiple copies of ISLgar5 in ST399/emm77 isolates. ISLgar5 encodes a transposase of 396 amino acids and features an 8 bp duplicate repeat (DR). (B) Three ISLgar5 variants identified in single-copy ISLgar5-containing GAS, which have 1, 2, and 3 amino acids changed in their encoded transposase, respectively, as marked in the figure with nucleotide changes (orange bars) on the top and amino acid changes at the bottom. The DDE sites necessary for the transposase activity are also labeled. All single-copy ISLgar5 variants have the same conservative DR, TTATAATG. (C) A ~26 kb Tn5801-like integrative and conjugative element (ICE) carries both ISLgar5 and tetM. Gene annotation of the ICE is from emmSTG866.1, 986,768-960,966 in CP035428. Shown on the top are the 53 bp deletion in ISLgar5-3-carrying ST904/emm77 (green triangle), the deleted region in tetM-negative ST399/emm77, and the site of ISLgar5 insertion in some of the ICEs (red arrow). TF: transcription factor. (D) The abundance of ISLgar5 in ST399/emm77 is associated with the ermT antimicrobial resistance gene. Left: Isolates without ermT (in red) have a much lower copy number of ISLgar5 than those with ermT (in cyan). Right: Isolates with ermT (in cyan) increase over the surveillance period, with more isolates without ermT (in red) in 2015-2018 (n = 21) than in 2019-2021 (n = 2).

Table 1. Genotyping of the 95 local Group A Streptococcus isolates in Eastern North Carolina during 2022-2023.

Table 2. National genotyping surveillance of invasive Group A Streptococcus in the U.S. during 2015-2021, with a total of 13,064 isolates in BioProject PRJNA395240. Shaded in blue: top one for the year; shaded in grey: the 2nd-5th ones for the year (from the darkest to the lightest).