Chicken Production and Human Clinical Escherichia coli Isolates Differ in Their Carriage of Antimicrobial Resistance and Virulence Factors

ABSTRACT Contamination of food animal products by Escherichia coli is a leading cause of foodborne disease outbreaks, hospitalizations, and deaths in humans. Chicken is the most consumed meat both in the United States and across the globe according to the U.S. Department of Agriculture. Although E. coli is a ubiquitous commensal bacterium of the guts of humans and animals, its ability to acquire antimicrobial resistance (AMR) genes and virulence factors (VFs) can lead to the emergence of pathogenic strains that are resistant to critically important antibiotics. Thus, it is important to identify the genetic factors that contribute to the virulence and AMR of E. coli. In this study, we performed in-depth genomic evaluation of AMR genes and VFs of E. coli genomes available through the National Antimicrobial Resistance Monitoring System GenomeTrackr database. Our objective was to determine the genetic relatedness of chicken production isolates and human clinical isolates. To achieve this aim, we first developed a massively parallel analytical pipeline (Reads2Resistome) to accurately characterize the resistome of each E. coli genome, including the AMR genes and VFs harbored. We used random forests and hierarchical clustering to show that AMR genes and VFs are sufficient to classify isolates into different pathogenic phylogroups and host origin. We found that the presence of key type III secretion system and AMR genes differentiated human clinical isolates from chicken production isolates. These results further improve our understanding of the interconnected role AMR genes and VFs play in shaping the evolution of pathogenic E. coli strains.

IMPORTANCE Pathogenic Escherichia coli causes disease in both humans and food-producing animals. E. coli pathogenesis is dependent on a repertoire of virulence factors and antimicrobial resistance genes. Food-borne outbreaks are highly associated with the consumption of undercooked and contaminated food products. This association highlights the need to understand the genetic factors that make E. coli virulent and pathogenic in humans and poultry. This research shows that E. coli isolates originating from human clinical settings and chicken production harbor different antimicrobial resistance genes and virulence factors that can be used to classify them into phylogroups and host origins. In addition, to aid in the repeatability and reproducibility of the results presented in this study, we have made a public repository of the Reads2Resistome pipeline and have provided the accession numbers associated with the E. coli genomes analyzed.


Reads2Resistome assessment
Using Reads2Resistome, we assembled and characterized the AMR genes, virulence genes, and prophage sequences of genomes from two bacterial isolates recovered from the ceca of 2-week-old broiler chickens: SH-IC, Salmonella enterica serovar Heidelberg (S. Heidelberg), and EC-IC, Escherichia coli (Table S2). Illumina, PacBio, and Oxford Nanopore MinION sequences were used to evaluate and compare the three assembly methods available through the pipeline: short read-only, long read-only, and hybrid assembly.
Short read-only and long read-only assemblies, regardless of the read source, had the shortest run-time, averaging 6 minutes per sample. Hybrid assembly, as expected, was the most time-intensive method, taking on average 1 hour and 8 minutes per sample regardless of long read source (Table S3).
Genome assembly and annotation metrics were compiled from QUAST and Prokka outputs. Hybrid assembly of both EC-IC and SH-IC using MinION long reads and Illumina short reads gave fewer contigs, a longer total length, and more annotated genes than long read-only assembly using MinION.
Hybrid assembly of both isolates using PacBio reads resulted in fewer contigs but comparable total length to that of the MinION hybrid assembly. Hybrid assembly gave the best genome contiguity, which can be visualized with Bandage-generated graphs. While hybrid and long read-only assemblies were comparable in number of contigs and genome length, the long read-only assemblies had far fewer annotated genomic features and resistome elements.
For both isolates, the number of annotated genes and features was considerably lower under long read-only assembly than under the other methods, while the short read-only and hybrid methods gave comparable numbers of annotated genes. We suspect this is due to the relatively lower quality of long reads compared with Illumina short reads. The same pattern appears in AMR and virulence gene characterization and in prophage identification: short read-only and hybrid assemblies of both isolates identified comparable resistome elements and prophage sequences, whereas long read-only assemblies identified substantially fewer (Table S4, Table S5).
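
The contiguity comparison above rests on per-assembly figures such as number of contigs and total length. As a minimal sketch of how such figures can be derived from an assembly in FASTA format, the Python below computes contig count, total length, and N50. It is not QUAST's implementation and is not part of Reads2Resistome; the function names and toy sequences are hypothetical, and N50 is included only as a common contiguity measure that the text itself does not report.

```python
from typing import Dict, Iterable, List


def parse_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # stays None until the first header line is seen
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_summary(lengths: List[int]) -> Dict[str, int]:
    """Contig count, total length, and N50 for one assembly."""
    total = sum(lengths)
    running = 0
    n50 = 0
    # N50: length of the contig at which the cumulative length of the
    # longest contigs first reaches half of the total assembly length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {"contigs": len(lengths), "total_length": total, "n50": n50}


# Hypothetical toy assemblies (illustration only, not data from this study).
hybrid = [">c1", "ACGT" * 5, ">c2", "ACGT"]
long_only = [">c1", "ACGT" * 2, ">c2", "ACGT" * 2, ">c3", "ACGT"]

hybrid_summary = assembly_summary(parse_fasta_lengths(hybrid))
long_summary = assembly_summary(parse_fasta_lengths(long_only))
print(hybrid_summary)
print(long_summary)
```

On these toy inputs the hybrid-like assembly has fewer contigs and a larger N50 than the long read-like one. Annotation counts such as those from Prokka would need to be read from that tool's own output.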

Each command was executed independently on a Linux server with 128 compute cores and 504 GB of memory.
Resources allocated and run-time in Table 2 were obtained from the "report.html" file, which is generated using the '-with-report' option.
Results from our case study indicated that a highly contiguous genome assembly with robust gene annotation, prophage identification, and resistome characterization is best obtained under a hybrid assembly approach. While hybrid assembly is the most time-intensive assembly method, it produces the most complete annotated genomes in our case study. Long read-only assembly is able to produce a respectable genome length with high contiguity but falls short when annotating genomic features.

3) Supplementary Figures
Figure S1. Principal component analysis of identified antimicrobial resistance genes from RGI and ResFinder. Isolates are labeled corresponding to their state origin.

Table S6. Drug classes, identified through the ResFinder output of acquired AMR genes, which significantly differed in proportion between human clinical isolates and chicken production isolates as determined by the Wilcoxon rank-sum test. Chicken production isolate average proportions were compared against human clinical isolates. Genes conferring resistance to drug classes were enumerated for each isolate and a proportion was calculated using the total number of genes in the study population conferring resistance to a given drug class. P value adjustment performed by the Benjamini-Hochberg false discovery rate correction.

Table S7. Drug classes, identified through RGI (CARD) output of AMR genes, which significantly differed in proportion between human clinical isolates and chicken production isolates as determined by the Wilcoxon rank-sum test. Chicken production isolate average proportions were compared against human clinical isolates. Genes conferring resistance to drug classes were enumerated for each isolate and a proportion was calculated using the total number of genes in the study population conferring resistance to a given drug class. P value adjustment performed by the Benjamini-Hochberg false discovery rate correction.

Escherichia coli GlpT with mutation conferring resistance to fosfomycin: E448K
R20H and G121D
Escherichia coli soxS with mutation conferring antibiotic resistance (fluoroquinolone antibiotic; monobactam; carbapenem; cephalosporin; glycylcycline; cephamycin; penam; tetracycline antibiotic; rifamycin antibiotic; phenicol antibiotic; triclosan; penem): n/a
Escherichia coli acrR with mutation conferring multidrug antibiotic resistance (fluoroquinolone antibiotic; cephalosporin; glycylcycline; penam; tetracycline antibiotic; rifamycin antibiotic; phenicol antibiotic; triclosan): n/a

Table S15. Proportion (%) of virulence factor-associated functions across identified phylogroups. The set of functions for each gene was counted and summed for all isolates in a given phylogroup. Virulence genes associated with each function were enumerated for each isolate and a proportion was calculated using the total number of genes in the study population with the given function.
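
The procedure described in the captions for Tables S6 and S7 (per-isolate proportions, a Wilcoxon rank-sum test between host groups, then Benjamini-Hochberg adjustment) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' analysis code. The gene counts are hypothetical, the function names are our own, and the rank-sum p value uses the normal approximation without a tie correction, which may differ from the software used in the study.

```python
import math
from statistics import NormalDist
from typing import Dict, List, Sequence


def class_proportions(counts: Sequence[int], population_total: int) -> List[float]:
    """Per-isolate gene count divided by the study-wide total for one drug class."""
    return [c / population_total for c in counts]


def rank_sum_p(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided Wilcoxon rank-sum p value (normal approximation, no tie correction)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share their mean 1-based rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-isolate gene counts (illustration only, not study data).
counts = {
    "aminoglycoside": {"chicken": [3, 4, 5, 4, 6], "human": [0, 1, 1, 2, 0]},
    "fosfomycin": {"chicken": [1, 0, 1, 0, 1], "human": [0, 1, 0, 1, 1]},
}

raw_p: Dict[str, float] = {}
for drug_class, groups in counts.items():
    total = sum(groups["chicken"]) + sum(groups["human"])
    chicken = class_proportions(groups["chicken"], total)
    human = class_proportions(groups["human"], total)
    raw_p[drug_class] = rank_sum_p(chicken, human)

adjusted = dict(zip(raw_p, benjamini_hochberg(list(raw_p.values()))))
for drug_class, p in adjusted.items():
    print(f"{drug_class}: adjusted p = {p:.4f}")
```

Because every isolate's count for a drug class is divided by the same study-wide total, the ranks and therefore the rank-sum test are unchanged by the scaling. The proportions mainly put different drug classes on a common scale for reporting.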