Genome Sequence and Comparative Pathogenic Determinants of Multidrug Resistant Uropathogenic Escherichia coli O25b:H4, A Clinical Isolate from Saudi Arabia

Escherichia coli serotype O25b:H4 is involved in human urinary tract infections (UTIs). In this study, we sequenced and analyzed an E. coli O25b:H4 strain isolated from a patient suffering from recurrent UTIs in an intensive care unit at Hera General Hospital in Makkah, Saudi Arabia. We aimed to determine the genes responsible for virulence and drug resistance in this isolate and to compare them with those of other E. coli strains. We sequenced the E. coli O25b:H4 Saudi strain clinical isolate using next-generation sequencing. Using the ERGO genome analysis platform, we performed annotations and identified virulence and antibiotic resistance determinants of this clinical isolate. The E. coli O25b:H4 genome was assembled into four contigs: a chromosome of 5.28 Mb and three further contigs, including a 130.9 kb contig (virulence plasmid) bearing the bla-CTX gene and contigs of 32 kb and 29 kb. Comparison of this genome with other uropathogenic E. coli genomes identified unique drug resistance and pathogenicity factors. In this work, whole-genome sequencing and targeted comparative analysis of a clinical isolate of uropathogenic Escherichia coli O25b:H4 were performed. This strain encodes virulence genes linked with extraintestinal pathogenic E. coli (ExPEC) that are expressed constitutively in E. coli ST131. We identified the genes responsible for pathogenesis and drug resistance and performed comparative analyses of the virulence and antibiotic resistance determinants with those of other E. coli UPEC isolates. This is the first report of genome sequencing and analysis of a UPEC strain from Saudi Arabia.

Uropathogenic Escherichia coli (UPEC) are ubiquitous and are involved in human urinary tract infections. Although several uropathogenic strains have been studied, the emergence of multidrug resistance with virulent phenotypes is a worldwide concern (1). The increased prevalence of multidrug resistance in E. coli is attributed to horizontal exchange of genetic material or mobile genomes (2). The E. coli O25b:H4 isolate is a uropathogenic pandemic clone primarily involved in the emergence of antimicrobial drug resistant community infections (3). The multidrug resistance genes of virulent E. coli O25b:H4 sequence type 131 (ST131) have spread widely owing to its transferable plasmid. The emergence of clone ST131 O25b:H4 harboring extended-spectrum β-lactamase (ESBL) genes has been documented in several countries in Europe, Asia and the Middle East, including Saudi Arabia (4, 5). The serogroup O25 is associated with ST131 and is linked to enterotoxigenic E. coli (ETEC) (4). Whole genome sequencing studies have shown that E. coli ST131 strains encode virulence genes linked with extraintestinal pathogenic E. coli (ExPEC) and other virulence factor genes expressed constitutively in E. coli ST131. Therefore, the differences in virulence gene content may contribute to the variability in pathogenesis and host immunity (1). This work describes the whole genome sequencing and analysis of a UPEC strain that was isolated from a patient in the intensive care unit of a hospital in Saudi Arabia. We annotated the functions encoded in this genome and performed targeted comparative analysis of virulence and antibiotic drug resistance determinants with other E. coli UPEC isolates.

DNA isolation
The E. coli O25b:H4 Saudi strain was isolated from a male patient admitted to Hera General Hospital in Makkah, Saudi Arabia. Bacteria were isolated from a single colony of the E. coli O25b:H4 Saudi strain grown on 5% sheep blood agar and MacConkey agar, and genomic DNA was prepared using standard protocols from bacteria grown overnight in 5 ml LB broth at 37°C. The bacterial cells were centrifuged, and the cell pellet was used for genomic DNA extraction using the QIAamp DNA Mini Kit according to the manufacturer's instructions (Qiagen, Valencia, CA, USA).

Genome sequencing and annotations
The E. coli O25b:H4 Saudi strain genome (WLH) was sequenced using multiple next-generation sequencing strategies. First, a random DNA library was constructed and sequenced with paired-end reads using the Illumina MiSeq method, and the reads were assembled into 213 contigs using a CLC assembler. Second, the DNA was also sequenced from a library on a PacBio SMRT cell. The sequence reads were assembled into 4 contigs using PacBio SMRT Analysis software (version 2.1.1, Pacific Biosciences, California, USA); the default filters removed reads shorter than 50 bases or with less than 0.75 accuracy. The assembled PacBio version was used for further bioinformatics and targeted comparative analysis. The open reading frames (ORFs) were identified using a combination of Glimmer (v 2.1), CRITICA and Prokpeg (a protein sequence similarity based ORF caller), as described in Kapatral et al. (6). The reconciled predicted ORFs were integrated into the ERGO annotation environment for computing protein similarities and functional identification (7, 8). The virulence and antibiotic resistance features were compared with those of other sequenced uropathogenic strains.
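The read filter described above (discarding reads shorter than 50 bases or with accuracy below 0.75) can be illustrated with a minimal Python sketch. This is not the SMRT Analysis implementation: the `Read` container, the `passes_filter` and `filter_reads` helpers and the toy reads are hypothetical, and the per-read accuracy value is assumed to be supplied by the upstream software.

```python
from dataclasses import dataclass

MIN_LENGTH = 50       # bases; threshold stated in the text
MIN_ACCURACY = 0.75   # read accuracy; threshold stated in the text


@dataclass
class Read:
    """Hypothetical container for one sequencing read."""
    name: str
    sequence: str
    accuracy: float  # assumed to be reported by the upstream software


def passes_filter(read: Read) -> bool:
    """Keep a read only if it meets both the length and accuracy thresholds."""
    return len(read.sequence) >= MIN_LENGTH and read.accuracy >= MIN_ACCURACY


def filter_reads(reads):
    """Return the reads that survive the filter, preserving input order."""
    return [r for r in reads if passes_filter(r)]


# Toy reads for illustration only (not data from this study).
example_reads = [
    Read("short_read", "ACGT" * 10, 0.90),         # 40 bases: too short
    Read("low_accuracy_read", "ACGT" * 50, 0.60),  # accuracy below threshold
    Read("good_read", "ACGT" * 50, 0.85),          # passes both checks
]
kept_names = [r.name for r in filter_reads(example_reads)]
```

In this sketch a read exactly at either threshold (50 bases, accuracy 0.75) is retained, which matches the wording "removed reads shorter than 50 bases or with less than 0.75 accuracy".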

Phylogenetic analysis
We initially identified phylogenetic relationships between the hospital isolate and other pathogenic E. coli. Searches with three distinct phylogenetic marker DNA sequences (16S rRNA, dnaA and gyrB) against the Ribosomal Database Project (http://rdp.cme.msu.edu) and the NCBI database (http://blast.ncbi.nlm.nih.gov) showed that our drug resistant hospital isolate was most similar to uropathogenic E. coli O25b:H4-ST131 str. EC958.

Genome analysis
To identify the virulence and drug resistance determinants, we sequenced the genome (WLH) and assembled it into four contigs. The total size of the genome was ~5.4 Mb with an average GC content of 50%. The genome features are given in Table 1. The genome comprises a chromosome of 5.2 Mb and three plasmids (13 kb, 3.2 kb and 2.9 kb). A total of 6,108 ORFs were identified in the chromosome, including rRNA and tRNA operons. The second contig was a plasmid containing the ORFs for the multidrug resistance bla-CTX gene, a hemin receptor and a nucleotidyl transferase protein.
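Summary statistics of the kind reported above (number of contigs, total genome size and average GC content) can be derived directly from the assembled sequences. The following Python sketch shows one way to do so; the function names and the toy contigs are hypothetical and do not represent the WLH assembly or the software used in this study.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def assembly_summary(contigs: dict) -> dict:
    """Contig count, total length and length-weighted GC content."""
    total_length = sum(len(seq) for seq in contigs.values())
    # Concatenating the contigs weights each one by its length.
    combined = "".join(contigs.values())
    return {
        "n_contigs": len(contigs),
        "total_length": total_length,
        "gc": gc_content(combined),
    }


# Toy sequences for illustration only (not data from this study).
toy_contigs = {
    "contig_1": "ATGCGC" * 4,  # 24 bases, 16 of them G or C
    "contig_2": "ATATAG" * 2,  # 12 bases, 2 of them G or C
}
summary = assembly_summary(toy_contigs)
```

Because GC content is computed over the concatenated sequence, a large chromosome dominates the average and small plasmids contribute little to it.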
Using the ERGO annotation procedures described above, 70% of the ORFs were assigned a functional annotation. Nearly 70% of ORF functions belonged to COG categories (9), and 68% of ORFs had a Pfam domain (10), signifying potential functions. A significant proportion of ORFs (49%) were identified as fusion proteins or frameshifts. Fusion proteins were identified both as composites (33%) consisting of two or more fused ORFs and as ORFs representing individual components (34%).

Plasmid maintenance systems
As in other enteric bacteria, two ORFs belonging to the entericidin family operon ecnAB (RWLH03657 and RWLH03658) were identified. These proteins are involved in plasmid maintenance by post-segregation killing of cells (11).

Adhesion
Several ORFs had adhesion or invasion functionality. These included an ORF similar to the adhesin SefD (RWLH1073) and ORFs for invasion proteins (RWLH00839, RWLH05418, RWLH04377, RWLH00840, RWLH00837, RWLH05574 and RWLH00838); in addition, two ORFs (RWLH04280 and RWLH00341) had similarities to NlpC/P60 family proteins. One ORF for a lipoprotein antigen (RWLH02358) found in other pathogenic bacteria, such as S. flexneri 5 str. 8401, E. coli APEC O1 and E. coli O157:H7, was identified. Two ORFs with similarity to a polysaccharide involved in intercellular adhesion (RWLH00953 and RWLH05282) and a polysaccharide deacetylase (RWLH04186) similar to those of other enteric pathogens were identified. The biofilm polysaccharide operon pgaABCD, involved in the synthesis and export of poly-β-1,6-N-acetyl-D-glucosamine (PGA), was identified. Upstream of this operon was a PGA transcriptional regulator. As in the operons of other enteric pathogens, ORFs for the secretin protein PgaA (RWLH05284), a specific deacetylase PgaB (RWLH05283), the N-acetylglucosaminyltransferase PgaC (RWLH0582) and the polymerization protein PgaD (RWLH05281), which are involved in the synthesis and export of biofilm, were identified.

Motility
We identified both the flagellar and fimbrial types of motility that are typically found in enteric Gram-negative bacteria.
Among the signal transduction systems, we identified the RcsB-RcsC system, comprising three ORFs for the response regulator RcsB (RWLH01262), the sensor protein RcsC (RWLH012630) and a specific sensor kinase YojN (RWLH01261), along with the QseB-QseC pathway, consisting of the regulatory protein QseB (RWLH02221) and its cognate sensor protein QseC (RWLH02222). We conclude that both the lateral and peritrichous flagellar systems are potentially functional in this genome.

Swarming motility
Two types of swarming motility pathways, namely type I and curli pilin, were identified in this genome.

Curli pilin
Curli pili play a major pathogenic role by mediating adhesion to and invasion of the host cell. The curli pilin proteins are also known to interact with host proteins, such as fibronectin, laminin, MHC class I proteins, TLR2 and fibrinogen, resulting in systemic infection. The curli pilin is encoded by two divergently placed operons: csgBAC (RWLH05296, RWLH05297 and RWLH05298) and csgDEFG (RWLH05294, RWLH05293, RWLH0592 and RWLH05291). As in other enteropathogenic bacteria, the role of the CsgC protein is unknown (13).

Multidrug resistance

Beta-lactams
Resistance to beta-lactam antibiotics, such as penicillins, carbapenems and cephamycins, is due to the action of beta-lactamases. Three ORFs belonging to the beta-lactamase family, including ORFs similar to those of non-E. coli organisms, have been identified: one ORF (RWLH05858) is similar to that of Klebsiella oxytoca, a second ORF (RWLH03661) is similar to that of pathogenic E. coli APEC O1 and the third (RWLH05853) is most similar to that of Pelobacter propionicus. Interestingly, upstream of the ORF RWLH05853 is an ORF for aminoglycoside N6'-acetyltransferase, which is involved in aminoglycoside resistance; these two ORFs are flanked by an IS element that is located on the plasmid (contig 2: 139.5 kb) (Figure 1). It is interesting to note the divergence of two types of pathogenic E. coli. The genomes of WLH and E. coli 536 have identical organization at the locus for the ampicillin induction protein AmpE, unlike E. coli strains O111:H- 11128, E2348/69 and EDL933 (Figure 2).

Tetracycline resistance
We identified two ORFs similar to the tetA family. One ORF, similar to those of Salmonella spp. and Acinetobacter spp., was found in an IS element and was split into two ORFs, RWL05871 (792 bp) and RWLH05872 (447 bp). A second ORF (RWLH05709; 1272 bp), similar to the TetA protein of Acinetobacter spp., was identified in an IS element.
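The ORF lengths given above can be put on a common footing with a short worked calculation. The Python sketch below converts each length to codons and compares the combined length of the two split fragments with the intact 1272 bp ORF. Treating the intact ORF as a reference for the split gene is an assumption made purely for illustration and is not a claim of this study; the variable and function names are hypothetical.

```python
def codon_count(length_bp: int) -> int:
    """Number of codons in an ORF of the given length (stop codon included)."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a multiple of three")
    return length_bp // 3


# ORF identifiers and lengths as reported in the text.
split_fragments_bp = {"RWL05871": 792, "RWLH05872": 447}
intact_orf_bp = {"RWLH05709": 1272}

fragment_codons = {name: codon_count(bp) for name, bp in split_fragments_bp.items()}
intact_codons = codon_count(intact_orf_bp["RWLH05709"])

# Combined length of the two fragments versus the intact ORF
# (illustrative comparison only; see the assumption stated above).
combined_bp = sum(split_fragments_bp.values())
difference_bp = intact_orf_bp["RWLH05709"] - combined_bp
```

Under these figures the fragments correspond to 264 and 149 codons and the intact ORF to 424 codons; the two fragments together span 1239 bp, 33 bp less than the intact ORF.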

Chloramphenicol resistance
An ORF (RWLH02992) similar to the chloramphenicol export proton antiporter (multidrug efflux system protein MdtL) was identified, as in other pathogenic E. coli. These proteins are known to confer chloramphenicol resistance in several Gram-negative pathogens.

Polymyxin resistance
Polymyxins possess a long hydrophobic tail and are known to be effective against Gram-negative bacteria. Resistance against these types of compounds is accomplished by expressing proteins, such as invasin. This protein is essential for invading the host cell membrane; the ORF for invasin was mutated at four sites, creating four non-functional ORFs. Such frameshifts are also found in other pathogenic E. coli strains, such as 042, 536 and APEC O1.

DISCUSSION
We sequenced and analyzed a multidrug resistant uropathogenic E. coli isolated from a clinical setting in Saudi Arabia. Based on phylogenetic analysis of 16S rRNA, gyrB and dnaA sequences, the strain was identified to be similar to uropathogenic E. coli O25b:H4 ST131. As in other pathogenic E. coli, a number of drug resistance genetic determinants were identified along with other virulence, invasion, secretion and lysogenic phage genes. Interestingly, type III secretion systems necessary to deliver toxins into the host were not identified. Two sets of flagellar motility systems, lateral as well as peritrichous types, were identified. Similarly, the two swarming systems, type I and curli pilin, appeared to be functional.
Multidrug resistant isolates of E. coli ST131 have been isolated from several countries in Asia and the Middle East; in Japan, approximately 21% of isolates were recovered between 2002 and 2003. A number of multidrug resistant uropathogenic E. coli strains have been identified in Saudi Arabia (5); however, few strains have been fully sequenced. Considerable genetic diversity was found within those isolates. The most notable diversity was among strains carrying CTX-M determinants (14). A prevalence of fluoroquinolone and ciprofloxacin resistant isolates ranging from 25% to 63% was observed in various regions of Japan, China and the Philippines (15). These resistance markers were identified in this genome. In Lebanon, small hospitals reported E. coli with ESBL phenotypes from fecal samples (16). The exact epidemiology of E. coli ST131 clones in Saudi Arabia and neighboring countries (e.g., Kuwait, UAE, Qatar, Bahrain and Oman) remains unknown. However, the prevalence is likely to be underestimated owing to the lack of identification or reporting from these countries. Uropathogenic E. coli O25b:H4-ST131, to which strain EC958 belongs, is the largest group of E. coli involved in extraintestinal infection. In 2005, E. coli O25b:H4-ST131 strain EC958 was isolated from a urine sample of a young girl diagnosed with UTI in the United Kingdom (17). The whole genome sequence of EC958 was studied, and it was found to contain the drug resistance gene blaCTX-M-15 and multiple virulence factors associated with UPEC, including genes encoding autotransporter proteins (PicU, UpaH, UpaG and Ag43), adhesins (curli, type 1 fimbriae and a fimbrial adhesin) and siderophore biosynthesis genes (enterobactin, aerobactin and yersiniabactin). We found similar genes in our study, consistent with the findings of Totsika et al. (17).
A major dissemination of E. coli producing plasmid-mediated extended-spectrum β-lactamases (ESBLs) has become evident worldwide, similar to that of Acinetobacter baumannii, which is a rapidly emerging pathogen globally, including in Saudi Arabia (18). These bacterial strains, in addition to being resistant to β-lactams, are also resistant to aminoglycosides and fluoroquinolones. E. coli O25b:H4 ST131 is known for its CTX-M-15 production and its spread among both in-patients and out-patients globally (19). It harbors multiple drug resistance determinants on its plasmids in addition to virulence genes (20, 21). Consistent with previous studies, this study presents evidence of determinants of resistance to other drugs, such as tetracycline, chloramphenicol, aminoglycosides and acriflavine, and the ESBL determinants from this hospital isolate highlight its international dissemination (22). Consistent with our study, gene acquisition by uropathogenic E. coli may enhance urinary tract infections and may aid in evading the host immune response (23).
An ORF for FlhC (RWLH00606), which turns on the class II flagellar system, was identified.