Genomic Sequence Analysis of Methicillin- and Carbapenem-Resistant Bacteria Isolated from Raw Sewage

ABSTRACT Antibiotic resistance is one of the largest threats facing global health. Wastewater treatment plants are well-known hot spots for interaction between diverse bacteria, genetic exchange, and antibiotic resistance. Nonpathogenic bacteria theoretically act as reservoirs of antibiotic resistance subsequently transferring antibiotic resistance genes to pathogens, indicating that evolutionary processes occur outside clinical settings and may drive patterns of drug-resistant infections. We isolated and sequenced 100 bacterial strains from five wastewater treatment plants to analyze regional dynamics of antibiotic resistance in the California Central Valley. The results demonstrate the presence of a wide diversity of pathogenic and nonpathogenic bacteria, with an arithmetic mean of 5.1 resistance genes per isolate. Forty-three percent of resistance genes were located on plasmids, suggesting that large levels of gene transfer between bacteria that otherwise may not co-occur are facilitated by wastewater treatment. One of the strains detected was a Bacillus carrying pX01 and pX02 anthrax-like plasmids and multiple drug resistance genes. A correlation between resistance genes and taxonomy indicates that taxon-specific evolutionary studies may be useful in determining and predicting patterns of antibiotic resistance. Conversely, a lack of geographic correlation may indicate that landscape genetic studies to understand the spread of antibiotic resistance genes should be carried out at broader scales. This large data set provides insights into how pathogenic and nonpathogenic bacteria interact in wastewater environments and the resistance genes which may be horizontally transferred between them. This can help in determining the mechanisms leading to the increasing prevalence of drug-resistant infections observed in clinical settings. IMPORTANCE The reasons for the increasing prevalence of antibiotic-resistant infections are complex and associated with myriad clinical and environmental processes. Wastewater treatment plants operate as nexuses of bacterial interaction and are known hot spots for genetic exchange between bacteria, including antibiotic resistance genes. We isolated and sequenced 100 drug-resistant bacteria from five wastewater treatment plants in California’s Central Valley, characterizing widespread gene sharing between pathogens and nonpathogens. We identified a novel, multiresistant Bacillus carrying anthrax-like plasmids. This empirical study supports the likelihood of evolutionary and population processes in the broader environment affecting the prevalence of clinical drug-resistant infections and identifies several taxa that may operate as reservoirs and vectors of antibiotic resistance genes.

could not be identified beyond the genus level, and the remaining 75 isolates comprised 18 species.
All but two isolates-one Paenibacillus sp. and one Weissella cibaria isolate-had at least 2 known antibiotic resistance (AR) genes, with an arithmetic mean of 56.83 (SD, 70.62) gene hits per isolate. The arithmetic mean number of drug classes to which a given sample contained resistance genes was 5.1 (SD, 2.12). When visualized by site ( Fig. 3 and 4) and species ( Fig. 5 and 6), 22.1% of AR genes were specific to a single site, 26.4% were specific to a single genus, 32.9% were found in all sites, and none were found in all species.
A substantial proportion (43.4%) of isolates had resistance genes on plasmids, with an arithmetic mean of 11.30 (SD, 27.74) hits per isolate, when considering only isolates with AR genes on plasmids. The arithmetic mean number of drug classes a given isolate contained resistance genes to was 2.5 (SD, 1.58). As visualized by site ( Fig. 3 and 4) and species (Fig. 5 and 6), 85.6% of AR genes were specific to a single site, 90.1% were specific to a single species, only a single gene (MexI) was found at all sites, and none were found in all species. No AR genes were found among predicted prophage genes.
Alignment of Fresno16 to pX01 plasmid toxin genes pagA, lef, and cya (42) yielded 81%, 100%, and 93.8% length matches at 14Â, 478Â, and 1,418Â average coverage depths, respectively. Pairwise similarities of these matches were 53.9%, 87.1%, and 93.8%, respectively. Translation alignment of these three toxin genes showed at least one frameshift mutation and multiple coverage gaps for pagA and a frameshift mutation and a number of premature stop codons in lef and cya.
Alignment to the pX01 regulatory genes atxA and pagR (43,44) yielded length matches of 100% for both at average coverage depths of 1,591.7Â and 9.0Â, respectively, with pairwise similarities of 55.1% and 76.0%, respectively. Translation alignment of the two genes yielded a frameshift mutation in atxA and 29 nonsynonymous single nucleotide polymorphisms (SNPs) in pagR. Alignment to the capBCADE operon of the We did not find a significant correlation between human population size and number of AR genes using factorial logistic regression or analysis of covariance (ANCOVA) when controlling for species diversity (residual deviance = 0.10, degrees of freedom = 1, and   P value = 0.74) and not controlling for species diversity (residual deviance = 5.85, degrees of freedom = 3, and P value = 0.12). ANCOVA results were similarly nonsignificant when species number was included (F value = 11.86 and P value = 0.18) and excluded (F value = 12.04 and P value = 0.17) as a covariate. We did not find significant correlation between the list of AR genes detected at each locality and the geographic distance between those localities (R = 20.27 and P value = 0.76) or significant correlation between the taxonomic composition of each locality and geographic distance between them (R = 20.41 and P value = 0.86), nor did we find a significant correlation between AR genes present and geographic distance when controlling for taxonomic composition (R = 0.10 and P value = 0.38) using full and partial Mantel tests for comparison. However, there was significant positive correlation between taxonomic composition and AR genes present (R = 0.80 and P value = 0.01).

DISCUSSION
This study demonstrated a wide diversity of AR bacteria and AR genes present in wastewater samples in the Central Valley of California. Despite this region being a major agricultural production center (29), and a nexus for water transport from the Sierras to southern California (47), few studies of AR resistomes have focused on this region (48)(49)(50)(51). Studies of geographic distribution of antibiotic resistance show varied results: a study of the freshwater lakes (52) and forest soils (53) showed patterns of variation by geographic distance over large spatial scales; conversely, another study of resistance genes present in glaciers did not find spatial structure, even at a global scale (54). Small-scale studies have shown similar variation in spatial distributions of catchment variation in antibiotic resistance genes, with variation detected in association with wastewater treatment plants (55) but not in association with agricultural runoff (56). Spatial patterns of antibiotic resistance fluctuate, and understanding the ecological and evolutionary dynamics of AR in the Central Valley is critical to developing predictions of AR impacts for the population of California as a whole.
Of the 16 genera and 18 species detected in this study, only 35% are routinely considered human pathogens (57); however, other isolates may cause infections in rare cases. This study therefore highlights that the evolution and spread of AR in the environment involves nonpathogenic vectors and reservoirs (23). Studies entirely focused on clinical isolates are insufficient to holistically understand the processes which govern antibiotic resistance (14,15,58).
The initial intent of this study was to specifically evaluate the population diversity of methicillin-resistant Staphylococcus aureus (MRSA) and carbapenem-resistant Enterobacteriaceae. However, none of the samples isolated from MRSA selective medium plates were S. aureus, and only 36% of the samples isolated from carbapenemase-producing Enterobacteriaceae plates were Enterobacteriaceae. While it should be acknowledged that the samples in this study were not clinical samples, it does raise the issue of potential misdiagnosis when using selective medium plates to identify clinical infections, a subject which has not been widely studied to date (59)(60)(61).
We found two isolates that displayed resistant phenotypes on selective media but did not detect any known resistance genes using genomic methods. There are two plausible explanations for these results; first, the sequencing effort may have not captured the resistance genes present in our data (62), and second, the resistance mechanisms in these bacteria may not be currently documented (63,64). As both genera, Paenibacillus and Weissella, are not generally considered human pathogens (65,66), it is possible that resistance mechanisms in these isolates require further characterization. However, the presence of these AR isolates in wastewater alongside significant human pathogens highlights their potential role as vectors of antibiotic resistance.
The 100 genomes sequenced in this study were from both organisms known to be human pathogens and those not generally known to cause human infections (Fig. 7). Comparing the resistance genes present in pathogenic versus nonpathogenic species (57), it is observed that 35.5% of genes are specific to nonpathogens, 22.0% are specific to pathogens, and 42.6% are found in both nonpathogens and pathogens (Fig. 5). This analysis illustrates that horizontal gene transfer between pathogenic and nonpathogenic bacteria under diverse environmental conditions is likely to be a significant factor in the ecological and evolutionary dynamics of antibiotic resistance (23,67). Evolutionary selection for antibiotic resistance in nonpathogenic bacteria can play a significant role in the development of resistant clinical infections (68). This result highlights that a comprehensive understanding of resistance mechanisms in the broader environment is necessary to provide context for studies focusing on antibiotic resistance in clinical settings. Trends observed in the clinic may be the result of processes occurring outside of it, potentially even in the case of nonpathogenic bacteria.
A large proportion of sequenced isolates (43.4%) had resistance genes on plasmids, indicating that conjugative transfer is an important mechanism in the development and spread of antibiotic resistance in wastewater communities (69,70). Plasmids can be shared between distantly related bacterial species (71), further reinforcing the possibility of nonpathogenic bacterial populations contributing to clinically relevant antimicrobial resistance. Conversely, no resistance genes were identified in regions identified as prophage, despite 90% of isolates containing genomic regions identified as prophage sequences. This supports previous work that demonstrates that antibiotic resistance genes are rarely present in prophage (72,73) and that conjugation is likely to be of greater importance in the evolution and spread of antibiotic resistance than transduction (74).
Multiple methods identify sample Fresno16 as Bacillus cereus; however, present in the genome of this sample were anthrax-like capsid and toxin plasmids. Other anthraxlike Bacillus cereus strains have been well characterized (41,75), and the toxin-like plasmid in this strain is clearly distinct from previously described anthrax-like B. cereus, conversely capsid-like plasmid appears similar to other described capsid-like plasmids. Of note, B. anthracis and the anthrax-like B. cereus strains are generally not reported to be drug resistant (75,76). We demonstrate that Fresno16 phenotypically demonstrates b-lactamase expression and contains genes encoding resistance to b-lactams (i.e., bcI, bcII, bla1, and bla2) (77,78), fosfomycin (i.e., fosB) (79), mupirocin (mupA) (80), and vancomycin (i.e., vanRM and vanZF) (81,82) and the multidrug resistance gene vlmR (83). This indicates that Fresno16 is likely multidrug resistant. Alignment of Fresno16 to anthrax pathogenicity genes suggests that it is unlikely that this strain is pathogenic in the same manner as anthrax and anthrax-like infectious agents, but this potential remains to be tested. It may be a worthy model for the study of multidrug-resistant, anthrax-like agents, pending further investigation of its phenotypic characteristics.
The correlation between species diversity and antibiotic resistance indicates that a population-scale evolutionary process may well be key to elucidating geographic patterns of antibiotic resistance (17,58,84). Factors affecting bacterial species diversity in the broader environment may be critical in predicting and managing environmental antibiotic resistance (14, 15, 67). We did not observe a significant correlation with geographic distance, although the geographic scale of the study may have been FIG 7 Venn diagram of antibiotic resistance genes specific to bacterial strains identified as pathogens and nonpathogens and those which were found in both groups. The largest proportions are shared, which suggests that nonpathogenic bacteria in wastewater can act as reservoirs of resistance that can be transferred to pathogens. The fact that many genes are also specific to each group highlights the necessity of considering ecological and evolutionary processes affecting both clinically relevant bacterial strains and the non-clinically relevant strains they may come into contact with in the broader environment in understanding the evolution and spread of antibiotic resistance. insufficient to detect regional variation in antibiotic resistance. It does raise the question of the landscape genetic processes which drive genetic interconnections between geographically discrete bacterial communities, which may be elucidated using population-level genetic studies (85).

MATERIALS AND METHODS
Sample collection, DNA extraction, and sequencing. Single 1-day composite samples of wastewater influent from treatment plants in Fresno, Los Banos, Mariposa, Merced, and Modesto were collected in 50-ml Falcon tubes. Samples were transported on ice and stored at 2°C prior to being plated onto two ChromID (bioMérieux, France) selective medium plates-one MRSA (i.e., methicillin) and one CARBA (i.e., carbapenem) plate-according to manufacturer guidelines. Ten green-pigmented colonies from each plate (i.e., 20 total per site) were picked from each plate and cultured in liquid LB broth for 24 h. Whole genomic DNA from isolates was then extracted using an innuPREP bacterial DNA kit (Analytik Jena, Germany) according to the manufacturer's guidelines, with the exception of the addition of lysozyme and lysostaphin to ensure complete lysis of cells. Library preparation was performed using an Illumina MiSeq V2 300-cycle kit in a paired-end configuration. Samples were pooled into two multiplexed libraries of 50 samples. Sequencing was performed at UC Merced using an Illumina MiSeq sequencer.
Sequences were quality filtered using Sickle v1.33 (86) using default settings (data not shown). Unassembled reads were taxonomically identified using both Kmerfinder 3 (87) and Strainseeker (88). Isolates were categorized as either pathogenic or nonpathogenic based on the NIH's National Microbial Pathogen Data Resource (NMPDR) disease phenotype records and Public Health Agency of Canada's pathogen safety data sheets (PSDSs). For Bacillus isolates, samples were compared to a custom BLAST database of diagnostic genes (for B. thuringiensis, Cry1 to Cry78; for B. cereus, Nhe, Hbl, and CytK; and for B. anthracis, pX01 and pX02) to diagnose species.
For sample Fresno16, we used BWA (89) to align trimmed reads to the Bacillus anthracis pX01 (GenBank accession no. CP008847.1) and pX02 (GenBank accession no. CP008848.1) plasmids and anthrax-like Bacillus cereus plasmids pBCX01 (GenBank accession no. NC_010934) and pBC218 (GenBank accession no. AAEK01000004) using default settings. Single nucleotide polymorphisms were called using the GATK pipeline (90). Consensus sequences of regions mapping to the toxin component genes (i.e., pagA, lef, and cya) and regulatory genes (i.e., atxA and pagR) of the pX01 plasmid, and the capsule synthesizer operon capBCADE of the pX02 plasmid, were translated and aligned to corresponding B. anthracis protein sequences using MUSCLE v3.8.31 (91) to determine the potential presence and functionality of these anthrax-specific genetic components. We also aligned the plcR gene to determine if Fresno16 carries this gene in either an activated or inactivated state. To assess reference bias in individual alignments, Fresno16 reads were simultaneously aligned to pX01/pBCX01 and pX02/pBC218 using GenomeMapper (92) using default settings.
De novo assembly of reads was performed using SPAdes v3.14.0 (93), and de novo plasmid assembly was performed using plasmidSPAdes v1.0 (94). Assemblies were compared to the Comprehensive Antibiotic Resistance Database (CARD) (95) using BLASTn (96). Matches to large, highly similar gene families [i.e., ACT, CMY, LEN, OKP, PDC, SHV, and TEM beta-lactamase families, ANT aminoglycoside modifiers, AAC(3) and AAC(6) acetyltransferases, MCR phosphoethanolamine transferase group, and quinolone resistance proteins] were considered single hits due to the sequence similarity of these groups. Prophage sequences for each isolate were estimated using PHASTER (97), and these were also compared to CARD (95) using BLASTn (96). Antibiotic resistance genes detected in both whole-genome assemblies and plasmid assemblies were visualized by site and by species using ggplot2 (98), and interactions between species and sites were visualized using upsetR (99).
In order to test the hypothesis that the diversity of AR genes is higher in larger urban centers, we determined if the human population size of the sampling locality was correlated with the number of resistance genes detected when controlling for species diversity and without, we conducted factorial logistic regressions using a Poisson linear model using the glm function in R (100) and ANCOVA using the aov function in R (101) to allow for the addition of controlling covariate data. Finally, to test the hypothesis that similarity of resistomes between sites was correlated with geographic proximity, we generated matrices of AR genes found at each site, and species diversity at each site using Jaccard/Tanimoto coefficients (102), and compared distance matrices using both full (i.e., AR genes Â geographic distance, species diversity Â geographic distance, and AR genes Â species diversity) and partial (i.e., geographic distance Â AR genes Â species diversity) Mantel tests implemented in the Vegan package of R (103), using 9,999 permutations.
Whole genome sequencing data for this project is available through NCBI's Short Read Archive BioProject#PRJNA734303.