Genomic characterization of Escherichia coli LCT-EC001, an extremely multidrug-resistant strain with an amazing number of resistance genes

Background Multidrug resistance is a growing global public health threat with far more serious consequences than generally anticipated. In this study, we investigated the antibiotic resistance and genomic traits of a clinical strain of Escherichia coli, LCT-EC001. Results LCT-EC001 was resistant to 16 kinds of widely used antibiotics, including fourth-generation cephalosporins and carbapenems. In total, 68 determinants associated with antibiotic resistance were identified, including 8 beta-lactamase genes (notably producing ESBLs and KPCs), 31 multidrug efflux system genes, 6 outer membrane transport system genes, 4 aminoglycoside-modifying enzyme genes, 10 two-component regulatory system genes, and 9 other enzyme or transcriptional regulator genes, covering nearly all known drug-resistance mechanisms in E. coli. More than half of the resistance genes were located close to mobile genetic elements, such as plasmids, transposons, genomic islands, and insertion sequences. Phylogenetic analysis revealed that this strain may have evolved from E. coli K-12 but is a completely new MLST type. Conclusions Antibiotic resistance was extremely severe in E. coli LCT-EC001, mainly due to mobile genetic elements that allowed the gain of a large number of resistance genes. The antibiotic resistance genes of E. coli LCT-EC001 can probably be transferred to other bacteria. To the best of our knowledge, this is the first report of a strain of E. coli with such a large number of antibiotic resistance genes. Apart from providing an E. coli reference genome with an extremely high multidrug-resistant background for future analyses, this work also offers a strategy for investigating the complement and characteristics of genes contributing to drug resistance at the whole-genome level. Electronic supplementary material The online version of this article (10.1186/s13099-019-0298-5) contains supplementary material, which is available to authorized users.


Background
According to the World Health Organization (WHO) report 'Antimicrobial resistance: global report on surveillance 2014', multidrug resistance is a growing global public health threat with far more serious consequences than generally anticipated. Of the WHO member states, 50% reported that E. coli isolated within their borders was resistant to third-generation cephalosporins and fluoroquinolones, the best antibiotics available for treating multidrug-resistant bacteria. In February 2017, the WHO published its first-ever list of antibiotic-resistant "priority pathogens", a catalogue of 12 families of bacteria that pose the greatest threat to human health. E. coli was listed among the most critical multidrug-resistant bacteria, which are considered to have built-in abilities to find new ways to resist treatment and to pass along genetic material that allows other bacteria to become drug-resistant as well. It is widely accepted that infections caused by antibiotic-resistant bacteria burden healthcare resources and increase the risk of poor clinical outcomes for patients. Global estimates suggest that more than 700,000 people per year die from drug-resistant infections [1]. It is predicted that antibiotic-resistant infections will kill ~10 million people per year by 2050, costing the global economy ~$100 trillion [2]. The seriousness of this situation was summarized in the WHO report: 'A post-antibiotic era, in which common infections and minor injuries can kill, is instead a very real possibility for the 21st century'.
Revealing the mechanisms underlying drug resistance in bacterial pathogens is crucial for infectious disease control and management. With significant progress in high-throughput sequencing and bioinformatics analysis of pathogens, whole-genome sequencing has become more accessible for the identification and tracking of multidrug-resistant (MDR) microorganisms in hospitals and communities [3]. In this study, we isolated E. coli strain LCT-EC001 from a 78-year-old male patient with several health issues, including diabetes, hypertension and chronic obstructive pulmonary disease, who had received long-term therapy with multiple drugs. The drug resistance of E. coli strain LCT-EC001 was tested, and whole-genome sequencing was conducted to understand the genetic elements contributing to antibiotic resistance. This work contributes a clinically isolated drug-resistant E. coli strain as a valuable reference for future studies and presents a strategy for the comprehensive analysis of drug resistance at the whole-genome level.

Bacterial isolation and culture conditions
An E. coli isolate (designated LCT-EC001) was obtained from the sputum of a 78-year-old male patient who had several health issues (diabetes, hypertension and chronic obstructive pulmonary disease) and had received multidrug therapy over a long time period. The bacterium was inoculated in Brain Heart Infusion (Oxoid, UK) medium at 37 °C.

High-throughput sequencing and assembly
Genomic DNA was isolated using the cetyltrimethylammonium bromide (CTAB) method. The total DNA obtained was subjected to quality control by agarose gel electrophoresis and quantified by Qubit [5]. The genome of E. coli strain LCT-EC001 was sequenced with Illumina massively parallel sequencing (MPS) technology. Two DNA libraries were constructed: a paired-end library with an insert size of 500 bp and a paired-end library with an insert size of 5 kb. Both libraries were sequenced on an Illumina HiSeq 2000 platform (Illumina, USA). Quality control of the reads from the two paired-end libraries was performed using the readfq program (version 10) [6] with the following steps: (1) remove reads in which the number of low-quality bases (Q-value ≤ 38) exceeds the threshold (40 bp by default); (2) remove reads in which the number of N bases exceeds the threshold (10 bases by default); (3) remove reads whose overlap with the adapter exceeds the threshold (15 bp by default); and (4) remove duplicates, keeping only one copy of identical reads. For the 500 bp library, 6.19% of reads were filtered out; for the 5 kb library, 8.48% of reads were filtered out. The filtered reads were assembled with SOAPdenovo [7] to generate scaffolds. The parameters used for assembly were as follows: SOAPdenovo all -F -K 107 -k 107. All reads were then used for gap closure with GapCloser (version 1.12) [8] with default parameters.
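The four read-filtering rules described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the readfq implementation: the function names, the representation of a read as a (sequence, quality list) pair, and the way adapter overlap is measured are all assumptions made for the example, and duplicates are detected on single-read sequences rather than on read pairs.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, List[int]]  # (sequence, per-base quality values); assumed layout


def adapter_overlap(seq: str, adapter: str) -> int:
    """Adapter match length: the whole adapter if it occurs inside the read,
    otherwise the longest read suffix that equals an adapter prefix."""
    if adapter and adapter in seq:
        return len(adapter)
    for k in range(min(len(seq), len(adapter)), 0, -1):
        if seq.endswith(adapter[:k]):
            return k
    return 0


def keep_read(seq: str, quals: List[int], adapter: str,
              low_q: int = 38, max_low_q_bases: int = 40,
              max_n: int = 10, max_adapter_overlap: int = 15) -> bool:
    """Apply rules (1)-(3); defaults mirror the thresholds quoted in the text."""
    low_quality_bases = sum(1 for q in quals if q <= low_q)
    if low_quality_bases > max_low_q_bases:           # rule 1
        return False
    if seq.upper().count("N") > max_n:                 # rule 2
        return False
    if adapter_overlap(seq, adapter) > max_adapter_overlap:  # rule 3
        return False
    return True


def filter_reads(reads: Iterable[Read], adapter: str) -> Iterator[Read]:
    """Yield reads that pass rules (1)-(3), keeping one copy of identical
    sequences (rule 4)."""
    seen = set()
    for seq, quals in reads:
        if not keep_read(seq, quals, adapter):
            continue
        if seq in seen:
            continue
        seen.add(seq)
        yield seq, quals
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive the filtered fraction is to each rule.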

Phylogenetic analysis and multilocus sequence typing (MLST)
The genomes of 62 other E. coli strains were compared with the genome of LCT-EC001 for SNP detection using MUMmer (version 3.22) with default settings. Then, the repeat regions of LCT-EC001 were detected by self-BLAST (blastall with the BLASTn program, BLAST v2.2.23), TRF and RepeatMasker, and SNPs located in repeat regions were filtered out. Based on the array of SNP locations, a phylogenetic tree was generated using the neighbor-joining method with 1000 bootstraps in MEGA6. MLST was performed on the assembled genome with the web tool at http://cge.cbs.dtu.dk/services/MLST/. The MLST type was determined by comparing the sequences of seven housekeeping genes (ADK, FUMC, GYRB, ICD, MDH, PURA, RECA) in LCT-EC001 with those in the database.
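The step of discarding SNPs that fall in repeat regions can be illustrated as an interval lookup. The sketch below assumes SNPs are given as (scaffold, position) pairs and repeats as 1-based inclusive (start, end) intervals per scaffold; these data layouts and function names are hypothetical and are not taken from MUMmer, TRF or RepeatMasker output formats.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end); assumed convention


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent repeat intervals into a sorted list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def in_repeat(pos: int, merged: List[Interval]) -> bool:
    """True if pos lies inside one of the merged, sorted intervals."""
    i = bisect_right(merged, (pos, float("inf"))) - 1
    return i >= 0 and merged[i][0] <= pos <= merged[i][1]


def filter_snps(snps: List[Tuple[str, int]],
                repeats: Dict[str, List[Interval]]) -> List[Tuple[str, int]]:
    """Keep only SNPs that do not fall in a repeat region of their scaffold."""
    merged = {name: merge_intervals(iv) for name, iv in repeats.items()}
    return [(scaffold, pos) for scaffold, pos in snps
            if not in_repeat(pos, merged.get(scaffold, []))]
```

Merging the intervals first means repeat calls from several tools can simply be concatenated before filtering.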

Analysis of antibiotic resistance genes
A BLASTp [10] search (E-value less than 1e-5, minimal alignment length percentage larger than 40%) was performed against three databases for drug resistance analysis: ARDB (Antibiotic Resistance Genes Database), CARD (Comprehensive Antibiotic Resistance Database) and ARG-ANNOT (Antibiotic Resistance Gene-ANNOTation). Then, all identified sequences were searched online with BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi) to match them to genes in NCBI. The identified resistance genes were further verified by PCR and Sanger sequencing. The positions of these genes relative to genomic islands, prophages, repeat regions, transfer elements, plasmids, and IS elements were analyzed.
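The two BLASTp cut-offs can be applied to tabular BLAST output as in the following sketch. It assumes the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and interprets "alignment length percentage" as alignment length divided by the length of the database protein; the text does not say which sequence the percentage refers to, so that choice, like the function name and the ref_lengths mapping, is an assumption.

```python
from typing import Dict, Iterable, List


def filter_blast_hits(lines: Iterable[str],
                      ref_lengths: Dict[str, int],
                      max_evalue: float = 1e-5,
                      min_aln_percent: float = 40.0) -> List[dict]:
    """Return hits with E-value < max_evalue and alignment length
    percentage > min_aln_percent (relative to the reference protein)."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        aln_len = int(fields[3])
        evalue = float(fields[10])
        aln_percent = 100.0 * aln_len / ref_lengths[subject]
        if evalue < max_evalue and aln_percent > min_aln_percent:
            hits.append({
                "query": query,
                "subject": subject,
                "identity": float(fields[2]),
                "aln_percent": aln_percent,
                "evalue": evalue,
            })
    return hits
```

Running the same filter against each of the three databases yields three hit lists that can then be merged by query gene.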

Strain LCT-EC001 is resistant to most clinical antibiotics
We tested the susceptibility of E. coli strain LCT-EC001 to 17 kinds of widely used antibiotics with the VITEK 2 Compact System in triplicate. E. coli strain LCT-EC001 was resistant to 16 of these antibiotics, including fourth-generation cephalosporins (cefepime) and carbapenems (ertapenem and imipenem), and was sensitive only to amikacin, indicating that it is a severely multidrug-resistant bacterium. However, the test for extended-spectrum β-lactamases (ESBLs) was negative. The results are shown in Table 1.
Normally, E. coli colonizes the intestines of humans and other animals [22]. However, it is a frequent cause of community and hospital-acquired infections, such as those of the urinary tract, bloodstream, abdomen, skin and soft tissues under certain circumstances [23]. This bacterium also causes pneumonia, neonatal meningitis and food-borne infections on a global scale [24]. It is well accepted that antimicrobial resistance is related to widespread antibiotic use, especially their inappropriate use in humans and other animals, as well as in the food industry [25]. With the increasing incidence of multidrug-resistant organisms, antibiotic resistance has now become a serious global public health problem.

Genomic features of the strain LCT-EC001
An illustration of the contents of the genome of E. coli strain LCT-EC001 is shown in Fig. 1. The final assembled genome consisted of 17 scaffolds with a total length of 5,198,242 bp and a mean GC content of 50.79%. The gene annotation included 5013 protein coding sequences (CDSs) accounting for 86.61% of the genome (Table 2), 84 tRNA (transfer RNA) fragments, 65 snRNA (small nuclear RNA) genes, 7 copies of 5S rRNA (ribosomal RNA), 6 copies of 16S rRNA, 6 copies of 23S rRNA (Additional file 1: Table S1), 17,031 bp of interspersed repeat sequences and 31,219 bp of tandem repeat sequences (Additional file 2: Table S2). Of the predicted genes, 69.18% were annotated in the GO database (Additional file 3: Table S3), 78.04% in the COG database (Additional file 4: Table S4), and 65.93% in the KEGG database (Additional file 5: Table S5).
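Summary figures such as scaffold count, total length and mean GC content can be recomputed directly from the assembled sequences. The sketch below is a minimal illustration that assumes the scaffolds are already loaded into a dictionary of name to sequence; excluding ambiguous bases (such as N) from the GC denominator is an assumption of the example, not something stated in the text.

```python
from typing import Dict


def assembly_stats(scaffolds: Dict[str, str]) -> Dict[str, float]:
    """Scaffold count, total length in bp, and GC percentage over A/C/G/T bases."""
    total_bp = 0
    gc = 0
    acgt = 0
    for seq in scaffolds.values():
        s = seq.upper()
        total_bp += len(s)
        g_c = s.count("G") + s.count("C")
        gc += g_c
        acgt += g_c + s.count("A") + s.count("T")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return {
        "n_scaffolds": len(scaffolds),
        "total_bp": total_bp,
        "gc_percent": gc_percent,
    }
```

The same approach extends to the coding fraction: summing CDS lengths and dividing by total_bp gives the percentage of the genome covered by CDSs.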

Phylogenetic tree and MLST analysis of LCT-EC001
To interpret the evolution of such an extremely multidrug-resistant Escherichia coli isolate, a selection of 62 complete E. coli genomes (1 chromosome each) downloaded from NCBI was used to construct a phylogenetic tree by the neighbor-joining method. All samples except LCT-EC001 were named as E. coli plus the NCBI uid. The results showed that LCT-EC001 was most closely related to E. coli K-12, which is widely used in laboratories (Fig. 2), indicating that LCT-EC001 may have evolved from E. coli K-12. MLST analysis showed that the alleles of the seven housekeeping genes in LCT-EC001 were ADK10, FUMC11, GYRB4, ICD8, MDH8, PURA13, and RECA2. However, no available MLST type matched this allele profile, indicating that this strain is a completely new type.
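Assigning an MLST type amounts to looking up the seven-allele profile in a table of known profiles. The sketch below shows that lookup together with a nearest-profile search; the LCT-EC001 profile is taken from the text, while the function names and any profile table passed in are hypothetical, and a real analysis would use the full profile table of the MLST database.

```python
from typing import Dict, Optional, Sequence, Tuple

LOCI = ("ADK", "FUMC", "GYRB", "ICD", "MDH", "PURA", "RECA")
Profile = Tuple[int, ...]

# Allele numbers reported for LCT-EC001, in the order of LOCI.
LCT_EC001: Profile = (10, 11, 4, 8, 8, 13, 2)


def lookup_st(profile: Sequence[int], table: Dict[Profile, str]) -> Optional[str]:
    """Return the sequence type with exactly this allele profile, or None."""
    return table.get(tuple(profile))


def n_mismatches(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of loci at which two profiles carry different alleles."""
    return sum(1 for x, y in zip(a, b) if x != y)


def nearest_st(profile: Sequence[int],
               table: Dict[Profile, str]) -> Optional[Tuple[str, int]]:
    """Closest known profile as (sequence type, mismatching loci), or None
    if the table is empty."""
    if not table:
        return None
    best = min(table, key=lambda known: n_mismatches(profile, known))
    return table[best], n_mismatches(profile, best)
```

A lookup that returns None corresponds to the situation described above, in which no existing type matches the profile; the nearest-profile search then indicates how many loci separate the isolate from the closest known type.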

Analysis of the complement of antibiotic resistance genes
Fig. 2 Evolutionary relationships between LCT-EC001 and other E. coli strains. Sixty-two strains of E. coli with complete genomes from NCBI were used for phylogenetic analysis. The phylogenetic tree was deduced by neighbor-joining. From the results, we found that the strain LCT-EC001 was close to the lineage of E. coli K-12 (represented with "Δ"). The names of the E. coli strains were composed of E. coli and the NCBI uid.
To understand the basis of antibiotic resistance in E. coli strain LCT-EC001, we carried out sequence alignments against the ARDB, CARD and ARG-ANNOT databases. A total of 68 determinants associated with antibiotic resistance were identified, with lengths ranging from 348 to 3594 bp and a mean length of 1305 bp (Additional file 6: Table S6). All of these determinants matched genes in NCBI with a similarity of at least 99% and were then named and classified according to the matched gene information: 8 beta-lactamase genes, 31 multidrug efflux system genes, 6 outer membrane transport system genes, 4 aminoglycoside-modifying enzyme genes, 10 two-component regulatory system genes, and 9 other enzyme or transcriptional regulator genes (Fig. 3). PCR and Sanger sequencing confirmed that all of these genes were present in E. coli strain LCT-EC001. Beta-lactamases are enzymes produced by bacteria that provide resistance to β-lactam antibiotics such as penicillins, cephalosporins, and cephamycins by breaking open the four-atom β-lactam ring at the core of these antibiotics. Among the 8 beta-lactamase genes, 2 were the extended-spectrum β-lactamase (ESBL) genes TEM-1 and CTX-M-14, and 1 was the Klebsiella pneumoniae carbapenemase (KPC) gene KPC-2. ESBLs can hydrolyze extended-spectrum cephalosporins, including cefotaxime, ceftriaxone, and ceftazidime, as well as the oxyimino-monobactam aztreonam. Thus, ESBLs confer resistance to these antibiotics and related oxyimino-β-lactams and play an important role in antibiotic resistance in E. coli. KPC is another key enzyme in MDR, owing to its ability to hydrolyze a broad variety of β-lactams, including carbapenems, cephalosporins and penicillins [26]. Interestingly, the ESBL genes were not detected by the VITEK 2 Compact System, highlighting a limitation of this system in clinical settings.
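Because the same gene can be hit in more than one database, per-database counts need not add up to the number of distinct determinants. The sketch below shows one way to merge hits into a non-redundant set and to check that the six reported categories sum to 68; the category counts come from the text, whereas the function name and the shape of hits_by_db are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple

# Category counts as reported in the text.
CATEGORY_COUNTS = {
    "beta-lactamase": 8,
    "multidrug efflux system": 31,
    "outer membrane transport system": 6,
    "aminoglycoside-modifying enzyme": 4,
    "two-component regulatory system": 10,
    "other enzyme or transcriptional regulator": 9,
}


def combine_hits(hits_by_db: Dict[str, Set[str]]
                 ) -> Tuple[Set[str], Dict[str, int], Dict[str, List[str]]]:
    """Merge per-database gene sets.

    Returns the non-redundant gene set, the number of genes hit per database,
    and, for each gene, the sorted list of databases that support it.
    """
    union: Set[str] = set().union(*hits_by_db.values())
    per_db = {db: len(genes) for db, genes in hits_by_db.items()}
    support = {
        gene: sorted(db for db, genes in hits_by_db.items() if gene in genes)
        for gene in sorted(union)
    }
    return union, per_db, support


# The six categories account for all reported determinants.
assert sum(CATEGORY_COUNTS.values()) == 68
```

The support mapping makes it easy to list which determinants are backed by all three databases and which by only one.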
The drug resistance genes in LCT-EC001 covered nearly all known drug-resistance mechanisms in E. coli. Of these genes, 34 were detected with the ARDB database, 61 with the CARD database, and 19 with the ARG-ANNOT database (Additional file 7: Table S7). In addition, 6 of these genes were located in genomic islands, 11 were located in plasmids, 3 were near transposons, 14 were near insertion sequences, and none were related to prophages or repeat regions (Additional file 7: Table S7).
A more concerning problem is that antibiotic resistance traits can be transferred between bacteria, regardless of their genus [27], via mobile genetic elements (MGEs) such as plasmids [28], insertion sequences [29], integrons/transposons [30], and chromosomal fragments (including resistance islands) [31]. A plasmid is an extrachromosomal DNA molecule that can replicate autonomously. A plasmid can harbor genes conferring resistance to β-lactams (including carbapenemase and extended-spectrum β-lactamase genes) and aminoglycosides [32], as well as genes encoding antibiotic-target-protecting proteins, antibiotic-modifying enzymes or multidrug efflux pumps [33]. Plasmids can also acquire mobile genetic elements by encoding endonuclease/methylase restriction systems [34]. Furthermore, plasmids can move from one bacterial cell to another by conjugal transfer [34], playing a vital role in the spread of resistance determinants among bacteria. An insertion sequence (IS) is an important MGE that is widespread in bacterial genomes, usually with a length of 0.6-2.0 kb [35]. IS elements can help resistance genes to transfer between and within bacteria [36] and can upregulate downstream resistance genes [37]. Integrons are another type of MGE responsible for the emergence and spread of antibiotic resistance genes, including those conferring resistance to β-lactams, aminoglycosides, and fluoroquinolones [38]. Transposons, like plasmids, have the potential to transfer horizontally or vertically among pathogens, driving the development of antibiotic resistance [39]. A genomic island (GI) is a large continuous genomic region, usually 4.5-600 kb in size, generated by lateral gene transfer (LGT). GIs can carry tens to hundreds of genes, which are often important for bacterial evolution, such as antibiotic resistance genes [40].
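Whether a resistance gene counts as being in or near a mobile element depends on a distance criterion that the text does not specify. The sketch below classifies a gene relative to a set of annotated elements using a hypothetical 10 kb window; the coordinate convention (1-based, inclusive), the window size and the function names are all assumptions chosen for illustration.

```python
from typing import Iterable, Optional, Tuple

Feature = Tuple[str, int, int]  # (scaffold, start, end), 1-based inclusive


def interval_distance(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """0 if the intervals overlap, otherwise the coordinate gap between them
    (directly adjacent intervals have distance 1)."""
    if a_end < b_start:
        return b_start - a_end
    if b_end < a_start:
        return a_start - b_end
    return 0


def nearest_distance(gene: Feature, elements: Iterable[Feature]) -> Optional[int]:
    """Smallest distance from the gene to any element on the same scaffold,
    or None if no element lies on that scaffold."""
    distances = [
        interval_distance(gene[1], gene[2], start, end)
        for scaffold, start, end in elements
        if scaffold == gene[0]
    ]
    return min(distances) if distances else None


def classify(gene: Feature, elements: Iterable[Feature],
             window: int = 10_000) -> str:
    """Label a gene as 'overlapping', 'near' (within the window) or 'distant'.
    The 10 kb default window is a hypothetical choice, not from the study."""
    d = nearest_distance(gene, elements)
    if d is None or d > window:
        return "distant"
    return "overlapping" if d == 0 else "near"
```

Running the classification once per element type (genomic islands, plasmids, transposons, insertion sequences) and tallying the labels would give counts comparable in form to those reported above, although the exact numbers depend on the chosen window.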
It is worth mentioning that our genome is a draft genome comprising 18 contigs, which means that there are 17 gaps where sequence is missing and that other drug-resistance genes may not have been identified.