Hybrid genome assembly of colistin-resistant mcr-1.5-producing Escherichia coli ST354 reveals phylogenomic pattern associated with urinary tract infections in Brazil

Highlights
• Colistin-resistant E. coli carrying mcr-type genes have spread rapidly worldwide.
• E. coli ST354 carrying mcr-1.5/IncI2 from urinary tract infection is reported in Brazil.
• Phylogenomic cluster of ST354 associated with urinary tract infections is highlighted.
• The rapid adaptation of mcr-positive E. coli within a One Health context is discussed.


Materials and methods
In January 2017, a patient was admitted to a hospital in South Brazil with a urinary tract infection. A urine sample was collected and submitted to urine culture. One E. coli isolate was recovered from the urine sample, identified by matrix-assisted laser desorption ionisation-time of flight mass spectrometry (MALDI-TOF MS) analysis, and further confirmed by whole genome sequencing (WGS) analysis. E. coli strain 14005RM was subjected to antimicrobial susceptibility testing by the disk-diffusion method following the recommendations of the Clinical and Laboratory Standards Institute (CLSI Supplement M100, 30th ed.; https://clsi.org). Breakpoints for enrofloxacin and ceftiofur were obtained from CLSI supplement VET08, 4th ed. Colistin susceptibility was determined by the broth microdilution method according to European Committee on Antimicrobial Susceptibility Testing (EUCAST) guidelines (http://www.eucast.org/ast_of_bacteria/warnings/#c13111).
Genome sequencing was carried out on the Illumina NextSeq platform (paired-end; Illumina, San Diego, USA) and a MinION sequencer (Oxford Nanopore Technologies, Oxford, UK). For Illumina sequencing, total genomic DNA was extracted using a PureLink quick gel extraction kit (Life Technologies, CA), and the library was constructed with a Nextera DNA Flex Kit (Illumina, San Diego, CA). For Nanopore sequencing, genomic DNA was extracted using the MasterPure Complete DNA and RNA Purification Kit (Lucigen), and the library was constructed with the Nanopore Rapid Barcoding Sequencing Kit (SQK-RBK004; Oxford Nanopore, Oxford, UK). Total DNA was sequenced on an R9.4.1 MinION flow cell (FLO-MIN106) for a 48 h run using MinKNOW v.19.10.1 software.
The short reads were initially subjected to a quality check using FastQC software (http://www.bioinformatics.babraham.ac.uk/projects/fastqc), and the paired reads were trimmed to remove adapters and low-quality regions (PHRED quality score below 20) using TrimGalore v0.6.5 (https://github.com/FelixKrueger/TrimGalore). Guppy v3.3.3 software was used for long-read basecalling and to trim barcode and adapter sequences. Additionally, Filtlong v0.2.0 (https://github.com/rrwick/Filtlong) was used to filter long reads based on quality, employing the default parameters of --min_length 1000 (discarding reads shorter than 1 kbp); --keep_percent 90 (eliminating the worst 10% of read bases); --trim (trimming bases from the start and end of reads that do not match a k-mer in the reference); and --split 500 (splitting reads when 500 consecutive bases fail to match a k-mer in the reference).
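The long-read filtering step can be illustrated with a minimal sketch. This is not Filtlong's actual scoring scheme: it assumes reads are already parsed into (name, sequence, quality string) tuples, ranks them by mean PHRED score only, and omits the reference-based trimming and splitting. The function names are hypothetical.

```python
from statistics import mean


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean PHRED score of a FASTQ quality string (PHRED+33 encoding assumed)."""
    return mean(ord(char) - offset for char in quality)


def filter_long_reads(reads, min_length=1000, keep_percent=90.0):
    """Simplified length-and-quality filter (an assumption, not Filtlong itself).

    reads: list of (name, sequence, quality) tuples.
    Reads shorter than min_length are discarded; the remaining reads are
    ranked by mean PHRED score and kept, best first, until roughly
    keep_percent of the remaining bases have been retained.
    """
    long_enough = [read for read in reads if len(read[1]) >= min_length]
    ranked = sorted(long_enough, key=lambda read: mean_phred(read[2]), reverse=True)
    target_bases = sum(len(read[1]) for read in ranked) * keep_percent / 100.0

    kept, kept_bases = [], 0
    for read in ranked:
        if kept_bases >= target_bases:
            break
        kept.append(read)
        kept_bases += len(read[1])
    return kept


# Hypothetical toy reads: one good, one too short, one low quality.
example_reads = [
    ("read_good", "A" * 1500, "I" * 1500),
    ("read_short", "A" * 500, "I" * 500),
    ("read_poor", "A" * 1500, "#" * 1500),
]
kept_names = [name for name, _, _ in filter_long_reads(example_reads, keep_percent=50)]
```

Averaging PHRED scores directly is a shortcut; a more careful implementation would average the underlying error probabilities.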
To investigate the phylogenetic relationship between E. coli 14005RM and other E. coli strains from Brazil carrying mcr-1 variants, we used the available genomes deposited in the NCBI GenBank database (https://www.ncbi.nlm.nih.gov/genbank/). Genomes were annotated with Prokka (https://github.com/tseemann/prokka). We used Roary v3.13.0 (https://github.com/sanger-pathogens/Roary) to deduce the group of genes (core genome) shared by the colistin-resistant mcr-1.5-producing E. coli ST354 and the 28 related E. coli strains of interest. A multi-FASTA alignment of all of the core genes was created, and SNPs found in the genes in the core genome (core genome SNPs) were used to infer relationships between the strains. SNP-sites (https://github.com/sanger-pathogens/snp-sites) was used to extract polymorphic sites, and SNP-dists (https://github.com/tseemann/snp-dists) was used to construct an SNP distance matrix giving the number of single nucleotide polymorphisms between each pair of isolates in the alignment. In brief, our analysis focused on SNP variations in the core genome shared among isolates. A limitation of this methodology is that deletions could not be verified.
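The pairwise SNP distance matrix described above can be sketched as follows. This is an illustrative reimplementation, not the SNP-dists code: it assumes the core-genome alignment is already loaded as a mapping from strain name to aligned sequence, and it counts a difference only where both bases are A, C, G, or T, so gaps and ambiguous bases are skipped. The strain names and sequences are hypothetical.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences carry different A/C/G/T bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(
        1
        for base_a, base_b in zip(seq_a.upper(), seq_b.upper())
        if base_a != base_b and base_a in VALID_BASES and base_b in VALID_BASES
    )


def snp_distance_matrix(alignment: dict) -> dict:
    """Return a symmetric {(strain_a, strain_b): distance} mapping."""
    names = list(alignment)
    matrix = {(name, name): 0 for name in names}
    for name_a, name_b in combinations(names, 2):
        distance = snp_distance(alignment[name_a], alignment[name_b])
        matrix[(name_a, name_b)] = distance
        matrix[(name_b, name_a)] = distance
    return matrix


# Hypothetical toy alignment of three strains over six core-genome sites.
toy_alignment = {
    "strain_1": "ACGTAC",
    "strain_2": "ACGTAT",
    "strain_3": "TCG-AN",
}
toy_matrix = snp_distance_matrix(toy_alignment)
```

Because gapped positions are ignored in this scheme, deletions contribute nothing to the distance, which is consistent with the limitation noted above.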
Illumina sequencing generated a total of 2,901,194 reads, with an average read length of 75 bases, resulting in a cumulative sequence length of 291,150,004 nucleotides. In contrast, Nanopore sequencing yielded 119,978 reads, with a total of 854,994,143 nucleotides and read lengths ranging from 110 to 102,758 bases (median read length of 3,984). After filtering, a total of 34,007 reads and 500,000,984 nucleotides, with an 11,699 median read length, were obtained.
The genome size of E. coli 14005RM after hybrid assembly was 5,333,039 bp, comprising 23 contigs with a GC content of 50.4%. The NCBI PGAP annotation identified 4832 protein-coding genes.
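As a rough cross-check, the mean sequencing depth implied by these figures can be estimated by dividing the total sequenced bases by the assembly size. The source does not report coverage, so the values below are a back-of-the-envelope estimate that assumes all bases map uniformly to the 5,333,039 bp assembly; the function name is hypothetical.

```python
def mean_depth(total_bases: int, genome_size: int) -> float:
    """Approximate mean depth as sequenced bases divided by assembly length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


GENOME_SIZE_BP = 5_333_039          # hybrid assembly size reported above
ILLUMINA_BASES = 291_150_004        # cumulative Illumina sequence length
NANOPORE_FILTERED_BASES = 500_000_984  # Nanopore bases after filtering

illumina_depth = mean_depth(ILLUMINA_BASES, GENOME_SIZE_BP)
nanopore_depth = mean_depth(NANOPORE_FILTERED_BASES, GENOME_SIZE_BP)

print(f"Illumina: ~{illumina_depth:.1f}x, Nanopore (filtered): ~{nanopore_depth:.1f}x")
```

Under these assumptions the estimate comes to roughly 55x for Illumina and 94x for the filtered Nanopore reads.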
In silico analysis based on MLST, Clermont typing, serotype, and fimH subtyping revealed that E. coli strain 14005RM belonged to ST354, phylogroup F, O153:H34, and fimH38, respectively. ST354 has been recovered globally from human, animal, soil, and raw vegetable samples [8, 9, 18-25], supporting its potential for dissemination and adaptation to different settings and representing a One Health concern. Notably, previous studies also reported phylogroup F E. coli ST354 causing human bloodstream and urinary tract infections (UTIs) in China [21] and Brazil [22], as well as UTIs in a dog in Thailand [23]. In this regard, the high ability of phylogroup F isolates to infect both the bladder and the kidney has also been reported [20].
Hybrid genome assembly revealed that colistin-resistant E. coli strain 14005RM harboured the mcr-1.5 gene on a circular plasmid, p14005RM, 65 kb in size (GenBank accession no. JAAWUF020000023.1) and belonging to the IncI2 replicon type (Figure 1A). The p14005RM plasmid shared a very similar genetic environment (identity: >99% and coverage: >94%) with the IncI2/mcr-1.5 plasmid (pMCR-015049) from ST6756 E. coli previously identified in a human from Argentina (GenBank accession no. KY471308) [26] (Figure 1B). The mcr-1.5 gene of the p14005RM plasmid was flanked upstream by mobC and relaxase genes, an IS30-like element ISApl1, an IS3 family transposase, and a second IS30-like element ISApl1, while a pap2 gene and an IS30 element ISApl1 were located downstream (Figure 1B). Additionally, the mcr-1.5 gene shared a very similar genetic environment to previously
For phylogenomic analysis, we selected 28 E. coli genomes from the NCBI database that harboured mcr-1 variants, isolated from humans, animals, food, and natural environments in Brazil (Supplementary Table S2). A total of 2714 core genes were shared by all E. coli strains. The E. coli strains from distinct hosts and sources (i.e. humans, animals, food, and the environment) were closely grouped on the tree (Figure 2). Interestingly, the mcr-1.5 gene was present in only two of these E. coli strains (i.e. 14005RM/ST354 and 5137/ST57; GenBank accession no. JAATKR000000000.1).
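The notion of core genes shared by all strains, as in the count reported above, can be sketched as a set computation. This is a simplified illustration rather than Roary's clustering pipeline: it assumes each genome is already reduced to a set of gene-cluster names, and the strain and gene names shown are hypothetical. The min_fraction parameter is an added assumption that allows a relaxed core definition.

```python
from collections import Counter


def core_genes(gene_sets: dict, min_fraction: float = 1.0) -> set:
    """Return gene clusters present in at least min_fraction of the genomes.

    gene_sets maps a strain name to the set of gene clusters it carries.
    With min_fraction=1.0, a gene is core only if every strain carries it.
    """
    n_genomes = len(gene_sets)
    if n_genomes == 0:
        return set()
    counts = Counter(gene for genes in gene_sets.values() for gene in genes)
    return {gene for gene, count in counts.items() if count / n_genomes >= min_fraction}


# Hypothetical toy pan-genome with three strains.
toy_genomes = {
    "strain_A": {"gyrB", "recA", "mcr-1"},
    "strain_B": {"gyrB", "recA"},
    "strain_C": {"gyrB", "recA", "blaCTX-M"},
}
toy_core = core_genes(toy_genomes)
```

In this toy case only gyrB and recA are carried by all three strains, so accessory genes such as a plasmid-borne resistance gene fall outside the core and do not contribute to the core-genome SNP comparison.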
In summary, we report the first draft genome sequence of an E. coli ST354 carrying an IncI2 plasmid-mediated mcr-1.5 gene isolated from a human patient in Brazil. We also demonstrated that, although there is a certain degree of relatedness among the mcr-1-producing E. coli clones in Brazil, the diversity within the selected strains may reflect factors that shape their genetic makeup, such as different hosts and sources, and may indicate their adaptability and versatility. Therefore, the identification of MDR E. coli carrying an IncI2 plasmid-mediated mcr-1.5 gene among human clinical isolates represents a clinical challenge and an epidemiological alert that deserves continuous and effective surveillance. Considering the increasing rates of such pathogens globally, not only in human nosocomial settings but also outside hospitals, epidemiological genomic studies are urgently needed. Finally, our data might provide additional information for comparative genomic analyses of molecular mechanisms, genetic structure, and epidemiological links of mcr-1-positive E. coli strains under the One Health umbrella.

Fig. 1. Circular representation of plasmid harbouring mcr-1.5 in Escherichia coli strain 14005RM. (A) A schematic representation of genes encoded by p14005RM plasmid showing colistin resistance in red, ABC transporter membrane/periplasmic binding protein in black, heavy-metal resistance in pink, plasmid replication in lime green, mobile genetic elements in jade green, type IV secretion system in French lime green, type III secretion system in magenta, type IV conjugation system in jungle green, topoisomerase III in purple, virulence factors in orange, recombinant protein/cytoplasmic membrane protein in mulberry, and hypothetical proteins in dark grey. (B) The genetic context of mcr-1.5 is highlighted in a linear view.

Fig. 2. Phylogenetic relationship between mcr-1.5-producing Escherichia coli strain 14005RM (this study), belonging to the international clone ST354, and 28 other E. coli genomes from Brazil carrying the mcr-1 gene. iTOL version 5.6.1 (https://itol.embl.de) was used to view the image.