Genetic determinants underlying the progressive phenotype of β-lactam/β-lactamase inhibitor resistance in Escherichia coli

ABSTRACT Currently, whole-genome sequencing (WGS) data have not shown strong concordance with Escherichia coli susceptibility profiles to the commonly used β-lactam/β-lactamase inhibitor (BL/BLI) combinations: ampicillin-sulbactam (SAM), amoxicillin-clavulanate (AMC), and piperacillin-tazobactam (TZP). Progressive resistance to these BL/BLIs in the absence of cephalosporin resistance, also known as extended-spectrum resistance to BL/BLI (ESRI), has been suggested to primarily result from increased copy numbers of bla TEM variants, which is not routinely assessed in WGS data. We sought to determine whether addition of gene amplification could improve genotype-phenotype associations through WGS analysis of 147 E. coli bacteremia isolates with increasing categories of BL/BLI non-susceptibility ranging from ampicillin (AMP) susceptibility to being fully resistant to all three BL/BLIs. Consistent with a key role of bla TEM in ESRI, 112/134 strains (84%) with at least ampicillin non-susceptibility encoded bla TEM. Evidence of bla TEM amplification (i.e., bla TEM gene copy number estimates > 2×) was present in 40/112 (36%) strains. There were positive correlations between bla TEM copy numbers with minimum inhibitory concentrations of AMC and TZP (P < 0.05) but not for SAM (P = 0.09). The diversity of β-lactam resistance mechanisms, including non-ceftriaxone hydrolyzing bla CTX-M variants, bla OXA-1, and ampC and bla TEM strong promoter mutations, was greater in AMC- and TZP-non-susceptible strains but rarely observed within SAM- and AMP-non-susceptible isolates. Our study indicates that comprehensive analysis of WGS data, including β-lactamase-encoding gene amplification, can help categorize E. coli with AMC or TZP non-susceptibility but that discerning the transition from SAM susceptibility to SAM non-susceptibility using genetic data requires further refinement. IMPORTANCE The increased feasibility of whole-genome sequencing has generated significant interest in using such molecular diagnostic approaches to characterize difficult-to-treat, antimicrobial-resistant (AMR) infections. Nevertheless, there are current limitations in the accurate prediction of AMR phenotypes based on existing AMR gene database approaches, which primarily correlate a phenotype with the presence/absence of a single AMR gene. Our study utilized a large cohort of cephalosporin-susceptible Escherichia coli bacteremia samples to determine how increasing the dosage of narrow-spectrum β-lactamase-encoding genes in conjunction with other diverse β-lactam/β-lactamase inhibitor (BL/BLI) genetic determinants contributes to progressively more severe BL/BLI phenotypes. We were able to characterize the complexity of the genetic mechanisms underlying progressive BL/BLI resistance including the critical role of β-lactamase encoding gene amplification. For the diverse array of AMR phenotypes with complex mechanisms involving multiple genomic factors, our study provides an example of how composite risk scores may improve understanding of AMR genotype/phenotype correlations.

complex mechanisms detectable by WGS analysis also contribute to AMC and TZP non-susceptibility.The lack of ability to separate SAM susceptibility from SAM non-sus ceptibility using WGS data, including analysis of the bla TEM copy number, indicates that improvements of these genotype/phenotype correlations may require alternative approaches to current AMR genetic database query methods.
The plural majority (230/637; 36%) of CRO-susceptible E. coli BSI isolates were pan-susceptible to the studied β-lactams (Group 1).The next most common isolates were SAM-NS (Group 3; 27%) and AMC-NS (Group 4; 20%) whereas the lowest frequen cies were found within AMP-NS (Group 2; 9%) and TZP-NS (Group 5; 8%).We next looked at the minimum inhibitory concentration (MIC) distributions for each respective BL/BLI across each of the BL/BLI groups (Fig. S1B).For AMP and TZP antimicrobial susceptibility testing (AST), there was a bifurcation of susceptible and resistant isolates with very few isolates near the respective MIC breakpoints (Fig. S1B).Conversely, for SAM and AMC AST, there were large numbers of isolates, which had intermediate resistant phenotypes, which likely reflects a more diffuse spectrum of resistance for SAM and AMC vs. AMP and TZP (Fig. S1B).
From the CRO-susceptible E. coli BSI cohort, we selected 147 isolates for wholegenome sequencing across the spectrum of BL/BLI susceptibilities with the group distribution of sequenced strains relative to the total cohort shown in Fig. 1.We underand over-sampled Group 1 and Group 5, respectively, whereas Groups 2, 3, and 4 had similar proportions of isolates sequenced within each group respective to the total cohort (148/637; 23%) as shown in Fig. 1.
For both AMP-NS and SAM-NS strains, nearly all strains carried bla TEM-1 alone with a single SAM-NS strain containing bla CARB-2 whereas bla LAP and bla CARB-2 were present in two AMC-NS strains.Conversely, strains in the TZP-NS had the most diverse array of β-lactamase-encoding genes including two strains with IRT β-lactamase variants (n = 2) and CTX-M enzymes (n = 2), respectively.One CTX-M enzyme was a deriva tive of CTX-M-15 (i.e., CTX-M-189) with an S133G polymorphism whereas the other was a derivative of CTX-M-27 (i.e., CTX-M-255) and contained a G239S polymorphism.Both of these polymorphisms have been shown in an experimental model to engen der resistance to β-lactamase inhibitors but also to diminish the ability to hydro lyze cephalosporins (31).There was a statistically significant difference in exogenous β-lactamase gene content between the groups (χ 2 , P < 0.05).Specifically, AMP-NS strains were significantly less likely to contain any exogenous β-lactamase whereas TZP-NS strains were more likely to contain bla OXA-1 (Fisher's exact test, P < 0.001 for each).When only Groups 2 through 4 (i.e., AMP-NS, SAM-NS, and AMC-NS) were analyzed, no statistically significant differences in β-lactamase-encoding gene content were observed.Thus, we conclude that the fully susceptible and fully resistant groups can be distin guished from the other groups using exogenous β-lactamase content alone, but the middle three groups cannot.

FIG 1
Summary of ceftriaxone-susceptible (i.e., CRO MIC < 4 mg/L) Escherichia coli bloodstream isolates (n = 637) stratified by increasing BL/BLI-resistant phenotype (Groups 1-5) along with the number of sequenced isolates from each group.Unique BSI samples are grouped into progressively more resistant BL/BLI phenotypes from being pan-susceptible (Group 1) to being resistant to all four drug combinations in our study (Group 5).Frequency counts above each bar give group totals with the percentage of total population in parenthesis [e.g., there were 230 Group 1 isolates out of the total 637 (36%)].We further split groups into sequenced (n = 147) (dark gray) vs non-sequenced (light gray) with frequency counts and percentages of each respective group labeled (e.g., 13

Contribution of additional mechanisms to BL/BLI non-susceptibility
Given that exogenous β-lactamase-encoding genes only separated the fully susceptible and fully resistant strains from the remainder of the cohort, we next focused on other factors previously associated with BL/BLI resistance (8).First, E. coli strains encode an AmpC β-lactamase which is typically transcriptionally silenced but can become active in the presence of promoter mutations (27).We identified three strains, all in AMC-NS isolates, which contained ampC promoter variations previously associated with AMC resistance (27).Similarly, we assessed for variation in the bla TEM promoter region and found nine instances of strong bla TEM promoter variants (i.e., eight Pa/Pb and one P5) (25,26).For eight of the nine cases, strains were in the TZP-NS group with the remaining strain having a borderline TZP MIC of 16 mg/L indicating that bla TEM promoter mutations were associated with TZP non-susceptibility.Finally, we analyzed the OmpC and OmpF contents of our cohort inasmuch as variation in the outer membrane protein profile has been associated with BL/BLI resistance (20).Only four strains had predicted inactivating ompC (n = 1) or ompF (n = 3) mutations, and the strains were present in varied groups (one in AMP-NS, one in SAM-NS, and two in TZP-NS).Thus, 16 strains (11%) had genetic changes predicted to increase ampC or bla TEM expression or eliminate ompC/ompF production, which have been previously associated with BL/BLI resistance.

Association of bla TEM amplification and BL/BLI resistance
Increased TEM-1 activity has previously been shown to be an integral aspect of progressive BL/BLI resistance (16,18) with mechanisms involving increased transcription due to promoter variation and/or increase in bla TEM-1 copy number (18,25,26).The median copy number estimate of bla TEM was 1.57× with a maximum of 45× observed (Fig. 2A).Taking a cut-off of 2.0-fold DNA coverage as evidence of amplification (32), 40/115 bla TEM NIR-containing strains (35%) had bla TEM amplification.To understand the relationship more clearly between bla TEM amplifications and BL/BLI resistance, we focused on the 91 strains that only contained N-IRT genes as a β-lactam resistance mechanism (i.e., no other exogenous β-lactamase, no ampC/bla TEM promoter variation, and no ompC/ompF disruptions).We found statistically significant increases in bla TEM amplification levels between SAM-NS and TZP-NS isolates (Fig. 2B).However, even for groups where there was a statistically significant difference, there remained an overlap such that bla TEM-1 amplifications did not clearly distinguish between the various resistance profiles (Fig. 2B).Consistent with the inability of bla TEM-1 amplifications to distinguish between strains from AMP-NS and SAM-NS, there was no significant correlation between SAM MIC and bla TEM-1 amplification (P = 0.09; Fig. 2C).However, AMC and TZP MICs were significantly associated with bla TEM-1 amplification (P < 0.01; Fig. 2D and E).When bla TEM amplification > 2× was considered as a potential β-lac tam mechanism, there was a statistically significant difference in β-lactamase content between SAM-NS and AMC-NS (Fisher's exact test, P < 0.001) whereas AMP-NS and SAM-NS continued to have similar mechanisms (Fisher's exact test, P = 0.17).Thus, these data suggest that bla TEM-1 copy numbers help distinguish AMC resistant from AMC-susceptible strains but do not assist with assessing SAM susceptibility.As previously noted, bla OXA-1 -containing strains were only present in AMC-NS and TZP-NS isolates with the vast majority of bla OXA-1 -containing strains belonging to TZP-NS (16/19).The three bla OXA-1 AMC-NS strains did not contain another BL/BLI resistance mechanism whereas six TZP-NS strains contained both bla OXA-1 and bla TEM-1 , and one TZP-NS strain had a TEM-promoter variation, bla OXA-1 , and bla TEM-1 .Similar to bla TEM-1 , an increase in bla OXA-1 CNV estimates was commonly observed in our cohort with 11/19 strains (57%) having >2.0× normalized coverage depths.When strains containing only bla OXA-1 as a β-lactam-resistant mechanism were considered, there was an increase in CNV positively correlated with TZP MIC albeit this did not reach statistical significance (Fig. 2F).Consistent with these data, all strains with a bla OXA-1 copy number ≥2× were in the fully resistant category.

Correlations of BL/BLI genetic determinants with BL/BLI groups across the population structure
A core gene alignment-inferred maximum-likelihood phylogeny is presented in Fig. 3A.There were eight core gene-inferred clusters identified through the hierBAPS algo rithm (33) that are highlighted in the phylogeny (Fig. 3A).Each of the cluster levels was associated with previously established phylogroups (34) or more closely related sequence types (STs) with highly related STs labeled on branch tips grouping together (Fig. 3A).The majority of isolates in this CRO-S cohort belonged to the B2 clade (60%; 88/147) although a total of 37 distinct STs were present.Strains of the pandemic ST131 clade (33%; 49/147) or the emergent ST1193 (16%; 23/147) were the most frequently observed STs with both belonging to the B2 phylogroup.There was only one other ST with more than 10 observations, which was ST69 (7%; 11/147).Other STs comprising of five or more strains included ST73 (n = 8), ST648 (n = 6), and ST10 (n = 5).We grouped STs with less than five strains together as "Rare STs, " and such strains accounted for 31% of the cohort (n = 45).When comparing all STs comprising of ≥5 isolates, there was a statistically significant relationship between ST and AMR grouping (χ 2 simulated P < 0.001) (Fig. 3B).Specifically, ST648 strains were more likely to be TZP-NS, and ST1193 strains were more likely to be SAM-NS (pairwise Fisher's test adj.P < 0.05).Conversely, ST131 strains were not statistically significantly associated with any particular susceptibil ity profile.While not statistically significant, similar trends can be seen in Fig. 3A, where ST73/ST12 isolates (hierBAPS level 7; light blue) have 85% of isolates with AMC-NS or TZP-NS phenotypes whereas ST69 has more susceptible patterns with 91% of isolates having SAM-NS or a more susceptible phenotype.We further characterized each of the BL/BLI groups by the number of BL/BLI genetic determinants, which could range from 0 to 10 (see Materials and Methods for details) (Fig. 3C).Each of these BL/BLI genetic determinants was based on the presence/absence of genetic features that have been previously identified as contributing to BL/BLI resistance (13,20,25,27,31,35).The number of unique combinations of BL/BLI genetic determinants across each of the BL/BLI phenotype groups is shown in Fig. S2.All PAN-S isolates (n = 13) had no identifiable BL/BLI determinants whereas almost all isolates (130/134; 97%) with AMP-NS or higher had at least one genetic determinant identified (Fig. S2).Interestingly, all isolates in our cohort with 3+ BL/BLI determinants (n = 10) were TZP-NS (Fig. 3C).Importantly, there was a positive monotonic relationship (ρ = 0.62, P = 2.2e−16) between the BL/BLI group status and the BL/BLI genetic determinant score, thus underscoring how increasing numbers of BL/BLI genetic determinants impact BL/BLI non-susceptibility.Nevertheless, there was a similar distribution of BL/BLI genetic determinant scores across Group 2 and Group 3 isolates (Table S1), further highlighting the difficulty in discriminating genetic differences across these isolates.
Finally, we next sought to combine genetic determinants associated with BL/BLI resistance along with our phylogenomic structure to assess the odds of being in a greater or lesser BL/BLI non-susceptibility group using ordinal logistic regression methods.Table 1 provides an overview of BL/BLI and population-level covariate relationships with the odds of being in each BL/BLI phenotypic group.There were no statistically significant associations observed between the increasing BL/BLI group and ST or phylogroup; however, there was a weak association between Cluster 2, predomi nantly consisting of B1/C isolates that had 0.3 times (95% CI: 0.1-0.9)lesser odds of belonging to a higher (i.e., less susceptible) BL/BLI group compared with Cluster one isolates (ST131).When looking at the relationships between increasing BL/BLI non-sus ceptibility with the composite BL/BLI genetic determinant score, for every one unit increase in BL/BLI genetic determinant, the odds of being in a higher BL/BLI group were 10.9 times (95% CI: 6.2-20.2) greater than a lower BL/BLI group (Table 1).Importantly, there were no associations between N-IRT gene carriage and the BL/BLI group (OR = 1.1; 95% CI: 0.5-2.4)whereas bla TEM amplification, bla OXA-1 presence, and other non-TEM bla amplifications all appeared to contribute to increasing odds of BL/BLI non-suscepti bility.Thus, these data suggest that the presence and amplification of narrow-spectrum β-lactamases are the primary drivers of the progressive BL/BLI phenotype.

DISCUSSION
The increasing affordability and availability of WGS data have markedly increased the understanding of bacterial AMR to the point where it has been postulated that such analysis might augment or even replace routine phenotypic susceptibility testing (36).For fast-growing bacteria, such as Enterobacterales, WGS has shown good concordance with the AMR phenotype for many "bug-drug" combinations, but in general, WGS has not accurately predicated Enterobacterales susceptibility to commonly used BL/BLIs such as SAM, AMC, and TZP (5,7,37).Herein, we sought to determine whether a compre hensive WGS analysis, including assessment of the bla copy number, could separate cephalosporin-susceptible E. coli based on BL/BLI resistance patterns.Our data indicate that an increase in copy numbers of narrow-spectrum β-lactamases does occur with progressive BL/BLI resistance but accounts for only part of this process in conjunction with an accumulation of a variety of other mechanisms (13).
A key strength of our study was using WGS data to analyze a phenotypically diverse cohort of E. coli bloodstream isolates from across the spectrum of BL/BLI resistance in contrast to previous studies which investigated a single BL/BLI combination (8,16,20).Moreover, we focused on cephalosporin-susceptible strains and integrated analyses of bla gene copy numbers to elucidate the recently proposed concept of extended-spec trum resistance to BL/BLI (13).This approach allowed for an augmented understanding of how each β-lactam resistance mechanism contributes to the spectrum of BL/BLI resistance in clinical isolates.For example, mutations in the ampC promoter were associated with resistance through AMC but not TZP whereas mutations in the bla TEM promoter were observed only in fully resistant strains except for a single isolate with a TZP MIC of 16/4 mg/L which would be considered susceptible dose dependent (SDD) by current CLSI MIC breakpoints.Similarly, although bla OXA-1 has been associated with TZP resistance (16), we observed that strains in which bla OXA-1 was the only β-lactamase and did not have an increase in copy number remained TZP susceptible although resistant to AMC and SAM suggesting a necessary gene dosage and/or OXA-1 activity effect for progressive resistance.Given that bla OXA-1 amplification is clearly associated with TZP resistance (16,17), these and other data raise concerns regarding the clinical efficacy of TZP therapy of bla OXA-1 -containing strains even if they test initially susceptibility given the capacity of such strains to readily increase the bla OXA-1 copy number (38).
A major surprising finding of our study was that whereas there was a progressive increase in bla TEM copy numbers moving from SAM-NS to AMC-NS to TZP-NS strains, there was conversely a small decrease in bla TEM copy numbers between AMP-NS (Group 2) and SAM-NS (Group 3) isolates.Such data were even more surprising given that bla TEM-1 was nearly universal and the only β-lactam resistance mechanism for both groups, leading us to hypothesize that increased TEM-1 production due to gene amplification would be the major mechanism distinguishing these two groups (13).Differentiation between AMP-NS and SAM-NS E. coli has not been systemically studied using WGS data such as has been done for AMC-NS and TZP-NS isolates (8,13,15,16,19), and thus, we are limited in our ability to benchmark our data against oth ers.Noguchi et al. studied SAM-resistant isolates from Japan using phenotypic assays and found some relationship between increased TEM-1 activity, decreased membrane permeability, and SAM resistance (20).Consistent with our data, they found only a small percentage of strains had inactivating mutations in ompC or ompF.Thus, together with other data regarding E. coli responses to antimicrobials, we hypothesize that bla TEM-1 harboring E. coli develops SAM-NS through an increase in TEM-1 activity that is not detectable via assessment of the gene copy number when the organism is grown in the absence of antimicrobial pressure, along with decreased SAM entry through outer membrane changes which are not due to defined genetic mutations.As E. coli moves from SAM-NS to AMC-NS to TZP-NS, the resistance mechanisms then become both more diverse and genetically fixed in the population such that they become detectable using WGS-based methodologies.Thus, barring identification of new genetic mechanisms that can separate AMP-NS and SAM-NS isolates, it seems unlikely that WGS methods using bacteria grown in the absence of antimicrobial pressure will be able to accurately separate these two groups.Consistent with data from other groups (8,16), WGS analyses are likely to be useful in detecting AMC and TZP resistance in E. coli but need to consider a large group of potential mechanisms, including single-nucleotide polymorphisms (SNPs), which are additive in nature and thus much more difficult to predict relative to ESBL phenotypes primarily driven by single genes.There is growing interest to create "mechanism agnostic" machine learning models that can classify AST phenotypes based on detecting associations with SNPs and gene presence/absence after controlling for population structure (39)(40)(41)(42).
There are several limitations to our study worth noting.We sought to provide a "real-world" assessment of using WGS to analyze BL/BLI resistance and thus did not recapitulate the phenotypic data nor did we assess BL/BLI phenotypes comparing different methodologies such as broth microdilution vs. agar dilution.Thus, given the difficulties in the accuracy of BL/BLI susceptibilities using automated methodologies (24), it is likely that there was some misclassification of organisms into the various groups, particularly for those whose MICs were near the susceptibility breakpoints.Additionally, we used the WGS data to assess for known β-lactam resistance mechanisms and thus cannot be certain that other mechanisms were not present.Furthermore, bacterial outer-membrane remodeling and changes in efflux pump activity can further affect susceptibility to multiple classes of antibiotics and cannot be easily assessed using the WGS-based methods we employed in our study (43).These aforementioned changes may lead to phenomena known as antibiotic tolerance as well as heteroresist ant subpopulations that can render significant challenges to antibiotic susceptibility interpretations (44)(45)(46).Further work is necessary to elucidate the contributions of these phenomena to treatment failure and how to effectively incorporate these factors into prediction models (47,48).Even though we assessed nearly 150 strains, our sample sizes for some groups, such as group 2, were relatively small which could have limited our power to detect statistically significant differences in various mechanisms.Finally, we only sequenced strains from a single center meaning we do not know the generalizabil ity of our findings although our data are in accord with other data from geographically distant locales (13-15, 18, 21).
In summary, we present herein a comprehensive WGS analysis of a large cohort of cephalosporin-susceptible E. coli strains from across the susceptibility spectrum of commonly used BL/BLIs.Our data both support and add to the complexity of the progressive ESRI spectrum proposed by Rodriguez-Villodres et al. (13).Our data suggest that the addition of β-lactam copy number and promoter variation assessment does assist with the delineation of AMC-NS and TZP-NS but not SAM-NS suggesting that the initial movement of E. coli along the ESRI spectrum may not be detectable using methods that query AMR databases.

Bacterial strains and AST characterization
Escherichia coli bloodstream isolates are collected weekly through an IRB-approved protocol and stored at −80°C in 40% glycerol stocks.Initial antimicrobial suscepti bility testing is performed on a Vitek2 (bioMérieux) automated platform through the University of Texas MD Anderson Cancer Center clinical microbiology laboratory.Additional AST was performed on selected candidate isolates to determine cohort inclusion using gradient ETest strips (Liofilchem) for SAM, AMC, and TZP, respectively, on MHB plates.MIC breakpoints were determined based on the current CLSI guidelines as established in M100-ED33 (49).Isolates with "intermediate" susceptibilities to the respective BL/BLI were grouped with isolates with interpretations of "resistant" within the same BL/BLI group and thus defined as non-susceptible to that particular BL/BLI combination.Further AST information is included in Table S2.

Whole-genome sequencing of bacterial isolates
Isolates with no indication of contamination and that fit into one of five groups based on AST confirmation were sub-cultured in LB and incubated for 3 hours at 37°C with 225 rpm agitation.A culture cell pellet was used to perform genomic DNA extraction using the QIAGEN DNeasy Blood & Tissue Kit following the manufacturer's instructions.The quality of gDNA was assessed through a measure of gDNA concentration with the Qubit 4 fluorometer (Invitrogen) and high molecular weight with the 4200 Tapesta tion (Agilent) system using the Genomic DNA ScreenTape assay.The respective quality controlled (QC) gDNA was submitted to the MDACC Advanced Technology Genomics Core (ATGC) for library preparation and short-read, paired-end (150 × 2) whole-genome sequencing using the Illumina NovaSeq6000 instrument.Samples were multiplexed, barcoded, and sequenced to achieve coverage depths > 100×.

Short-read sequencing data QC and copy number quantification
Short-read data were QC'd and processed through a bespoke pipeline (W.Shropshire, spades_pipeline, GitHub: https://github.com/wshropshire/SPAdes_pipeline).Briefly, paired-end fastq reads have Illumina adaptors and low-quality bases trimmed using Trimmomatic-v0.39with a seed match (16 bp) that allows up to two mismatches, a sliding window of 4 bp, minimum average quality = 15, and minimum length of 36 bp.Trimmed fastq reads were then quality assessed using fastqc-v0.11.9.Sequencing QC information as well as BioSample IDs are provided in Table S2.
Trimmed reads were inputted into the copy number variant quantification tool (CONVICT; W. Shropshire; GitHub: https://github.com/wshropshire/convict). A core-gene control file, which is created through the panaroo pan_reference.fafile, is used to normalize coverage depths.CONVICT uses a coverage-depth normalization approach whereby the coverage-depth of target genes is divided by the coverage-depth of control genes to give an estimation of the ploidy-agnostic copy number.In the first step of the CONVICT algorithm, reads are aligned with a target set of antimicrobial resistance genes determined by kmerresistance (50,51) with the resfinder database (v4.0) (52,53).Aligned reads are then sorted with Samtools into a corresponding bam file.Coverage measurements are determined by pileup.shfrom the BBMAP suite of tools in bins of 100 bases along the length of a target gene.Bins are removed or maintained in an iterative process by comparing coverage-depth of a given bin to the mean coverage-depth of all bins for the respective gene until a specific coefficient of variation (CV = 0.175) is met.Coverage-depth for control genes is calculated in the same way as for target genes.
Copy numbers are then estimated by dividing the coverage-depth of target genes by the coverage-depth of control genes.Complete convict-estimated copy number results are provided in Table S2.

Short-read assembly, database query, and phylogenetics
Trimmed reads are then used as input for short-read genome assemblies using SPAdes-v.3.15.5 using the -isolate parameter (54).Short-read assemblies were used as input to AMRfinder Plus (v3.11.11) using database version 2023-04-17.1 to con firm AMR variant alleles (55) detected with the short-read-based convict tool.We used known strong bla TEM promoter variants (25) and ampC promoter variants (27) that have been experimentally characterized as database input into a variant anno tator tool (Selvalakshmi27, VariantAnnotator, https://github.com/Selvalakshmi27/VariantAnnotator).Each variant was confirmed using blast searches as well as visually inspected on SnapGene-v5.0.8.Genome assemblies were first annotated using Prokka-v1.14.6 (56) Annotated .gfffiles and were used to produce the core gene alignment file using panaroo-v1.2.10 (57).We used the MAFT-v7.505aligner (58), with the panaroo param eters in strict mode and a core gene threshold of greater than 99% present within the population (n = 147).The core gene alignment file was then used to gener ate a maximum-likelihood (ML) phylogeny using IQtree2-2.2.0-beta.Parameterization included 1,000 replicates for the SH approximate likelihood ratio test, 1,000 replicates for ultrafast bootstrap (59), and model optimization using ModelFinder (60).The optimized ML model was an unrestricted model with optimized base frequencies using ML with a FreeRate of heterogeneity chosen according to BIC.A core gene alignment file was used as input to investigate the population structure using an implementation of the hierarchical population clustering tool, RhierBAPS (33), with a max depth of hierarchical search = 2 and max populations = 30.The ggtree-v3.3.1 R package was used for tree visualization and mapping metadata to each individual tree branch-tip (61).

Statistics
Genetic determinants were coded into 10 binary variables to create a cumulative BL/BLI genetic determinant score (Table S3).These variables included N-IRT presence, TEM gene amplification (≥2×), IRT gene, strong bla TEM promoter variant, ampC promoter variant, bla OXA-1 , bla CTX-M , other bla gene, other bla gene amplification (≥2×), and predicted Omp mutations based on previous research (13,20,25,27,31,35).A Spearman's rho coefficient was calculated to test for the relationship between BL/BLI groups and the BL/BLI genetic determinant score.Global Wilcoxon rank sum or Kruskal-Wallis tests were performed to detect significant differences in log-transformed copy numbers between groups.For global comparisons with significant P-values, pairwise comparisons using Wilcoxon rank sum exact test were performed with a false discovery rate P-value adjustment method.Ordinal logistic regression was performed using the BL/BLI group as an ordinal outcome and various genetic features as independent covariates with the MASS R package (v.7.3-54) and the "polr" function with a proportional odds logistic regression model.The brant test was performed to determine if parallel regression assumption held for each model using the "brant" R package (v0.3-0).Statistical analysis was performed using R-v4.0.4.

FIG 3
FIG 3 BL/BLI genetic determinant distribution across ESRI population.(A) Core gene alignment inferred maximum-likelihood phylogeny of 147 CRO-S E. coli bloodstream isolates.The tree is mid-point rooted with the timescale indicating mean nucleotide substitutions per site.Clades are shaded by core hierarchical population structure using the hierBAPS algorithm with each clade strongly associated with one or more phylogroups.Branch tips are labeled by the respective sequence type.Inner rings 1 and 2 correspond to BL/BLI groups and BL/BLI determinant scores, respectively, as labeled in the legend.Outer ring corresponds to bla TEM , bla OXA-1 , and other bla gene copy number variants with copy number estimates color coded as indicated in the legend.(B) Stacked bar chart of the total proportion of STs across each of the BL/BLI groups.(C) Stacked bar chart of frequency counts of BL/BLI groups across each of the BL/BLI determinant scores.

TABLE 1
Ordinal logistic regression to measure covariate associations with BL/BLI groups a OR, odds ratio.b CI, confidence interval.c BL/BLI genetic determinant score composite variable of 10 binary-coded variables (see Materials and Methods).d Each group is compared with Cluster 1 which is B2 phylogroup ST131 isolates; parenthesis includes phylogroup and most common ST group where applicable.