Uropathogenic Escherichia coli pathogenicity islands and other ExPEC virulence genes may contribute to the genome variability of enteroinvasive E. coli

Enteroinvasive Escherichia coli (EIEC) may be the causative agent of some of the millions of cases of diarrheal illness reported worldwide every year and attributed to Shigella. This is because the two enteropathogens share many characteristics that make them difficult to distinguish, either by traditional microbiological methods or by the molecular tools used in clinical laboratory settings. While Shigella has been extensively studied, EIEC remains barely characterized at the molecular level. Recent important EIEC outbreaks, apparently generating more life-threatening cases, have prompted us to screen EIEC for virulence traits usually related to extraintestinal pathogenic E. coli (ExPEC). The acquisition of such traits could explain the appearance of EIEC strains presenting higher virulence potential. EIEC strains were distributed mainly in three phylogroups in a serogroup-dependent manner. Serogroups O124, O136, O144, and O152 were exclusively classified in phylogroup A; O143 in group E; and O28ac and O29 in group B1. Only two serogroups showed diverse phylogenetic origins: O164 was assigned to groups A, B1, C, and B2 (one strain each), and O167 to groups E (five strains) and A (one strain) (Table 1). Eleven of the 20 virulence genes (VGs) searched were detected, and the majority of the 19 different VG combinations found were serogroup-specific. Uropathogenic E. coli (UPEC) PAI genetic markers were detected in all EIEC strains. PAIs I J96 and II CFT073 were the most frequent (92.1 and 80.4%, respectively). PAI IV 536 was restricted to some serogroups from phylogroups A, B1 and E. PAI I CFT073 was detected only in phylogroups B2 and E. A total of 45 (88%) strains presented multiple PAI markers (two to four). PAIs I J96 and II CFT073 were found together in 80% of strains. EIEC is a diarrheagenic E. coli (DEC) pathovar that presents VGs and pathogenicity island genetic markers typically associated with ExPEC, especially UPEC.
These features are distributed in a phylogroup- and serogroup-dependent manner, suggesting the existence of stable EIEC subclones. The presence of strains from phylogroups B2 and E, together with the presence of UPEC virulence-associated genes, may underscore the ongoing evolution of EIEC towards a hypervirulent pathotype.


Background
Enteroinvasive Escherichia coli (EIEC) is a diarrheagenic E. coli (DEC) pathotype that shares its pathogenic behavior with Shigella, to the extent that the two have been considered a single pathovar [1]. As with the other DEC pathotypes, EIEC infections are transmitted by the fecal-oral route and are usually more prominent in low-income regions due to poor sanitation [2][3][4]. Nevertheless, outbreaks associated with contaminated food and water can occur among high-income populations as well [5][6][7][8][9].
The pathogenicity of EIEC/Shigella depends on virulence determinants encoded by genes frequently located in pathogenicity-associated islands (PAIs), which are dispersed on both the chromosome and the invasion plasmid (pINV) present in all virulent EIEC strains [10][11][12][13].
Compared to other DEC, EIEC is regarded as a rare etiological agent of endemic diarrhea, while Shigella is estimated to cause millions of cases annually worldwide, with a mortality rate that can reach 1% among children [14]. This discrepancy may be due to the absence of simple laboratory methods to differentiate EIEC from Shigella, since both share almost all the characteristics used in their identification. With all these common features, it is plausible that misidentification occurs and that many of those millions of Shigella cases are in fact caused by EIEC.
The use of modern approaches like multilocus enzyme electrophoresis (MLEE), multilocus sequence typing (MLST), and comparative genomics [12,15,16,17] is bringing new understanding of the phylogenetic origin and evolution of the pathogenic and commensal lifestyles of microorganisms in general. Clermont et al. [18] developed a simple multiplex PCR method that has allowed the phylogenetic origin of large numbers of E. coli isolates to be assessed in epidemiological surveys all over the world. A further improved method based on the genotype of genes arpA, chuA, yjaA and TspE4.C2 was able to correctly assign more than 95% of E. coli strains to eight phylogroups (A, B1, B2, C, D, E, F, and clade I) [19].
Studies using these various approaches have confirmed that E. coli has a clonal structure [20] and that nonpathogenic strains differ from diarrheagenic E. coli (DEC) and extraintestinal pathogenic E. coli (ExPEC) with respect to their phylogenetic origin. There is a consensus that both DEC and ExPEC have evolved through the acquisition of blocks of genes by strains with a permissive phylogenetic background. Those gene blocks, generally originating from unrelated microorganisms, were moved into E. coli via horizontal gene transfer mediated by mobile genetic elements such as PAIs, plasmids, and bacteriophages. Ultimately, these sets of foreign genes became established in specific genetic backgrounds, giving rise to the different E. coli pathovars, EIEC being one of them.
While there is an extensive literature covering the details of Shigella genetic virulence determinants, these aspects have been poorly explored in EIEC. The recent occurrence of two important outbreaks in Europe due to an EIEC strain [9,21] has motivated studies on the molecular characterization and comparative genomics of EIEC strains isolated in various geographic regions to better understand the evolution of this E. coli pathovar [12,17]. However, the possibility of EIEC genome rearrangements driven by the acquisition of mobile genetic elements involved in the virulence of other E. coli pathovars has not yet been addressed. Such acquisitions could explain the appearance of EIEC strains presenting higher virulence potential.
In this work, we examined a comprehensive set of EIEC strains encompassing most of the frequently isolated serogroups to assess their phylogenetic origin as well as their carriage of virulence genes (VGs) and PAI genetic markers considered to be involved in ExPEC pathogenicity. We found that EIEC strains are distributed in various phylogroups in a serogroup-dependent manner. In addition, they may harbor several VGs and PAI genetic markers usually associated with uropathogenic E. coli (UPEC). The simultaneous carriage of extraintestinal and diarrheagenic virulence traits by EIEC could play a role in their evolution towards hypervirulent clones. Such a possibility may turn out to be a matter of public health concern and deserves further investigation.

EIEC main biochemical characteristics
As is usual for EIEC, none of the strains decarboxylated lysine, and all were non-motile with the exception of two O124 strains exhibiting flagellar type H30. Typical E. coli biochemical behavior with respect to lactose fermentation and indole production was serogroup-related: it was uniformly observed for O29, O143, and O144, but not for O124, O136, and O152, which did not ferment lactose. All O136, one O28ac, and most O124 and O167 strains did not produce indole. Serogroup O164 was the only one that varied in lactose fermentation (two positive and two negative strains) (Table 1).

Presence of invasion plasmids
The strains presented variable numbers of plasmid bands, but all of them possessed a high molecular weight (MW) DNA band ranging from 180 to 220 kb, compatible with the MW described for invasion plasmids (pINV) [10,13].

Invasive phenotype
All strains were able to invade HeLa cells in the gentamicin protection assay, confirming the invasive phenotype and showing the stability of the genes necessary for cell culture invasion even after long periods of storage.

Determination of phylogenetic origin
By using the quadruplex PCR method [19], EIEC strains were distributed within five of the eight phylogenetic groups, with all strains in a given group presenting a uniform quadruplex (arpA, chuA, yjaA, and TspE4.C2) genotype. The frequencies of the phylogroups and their quadruplex genotypes were: 49% from phylogroup A (+ − − −), 29% from phylogroup E (+ + − −), and 18% from phylogroup B1 (+ − − +). Groups C (+ − + −) and B2 (− + + +) were represented by one strain each. It is worth mentioning that all but two serogroups were strictly confined to a single phylogroup: strains from serogroups O124, O136, O144, and O152 were from phylogroup A, those from O28ac and O29 were from phylogroup B1, and those from O143 were from phylogroup E. Serogroup O167 strains mostly originated from phylogroup E (83%), with only one strain typed as group A. A broad divergence was found only in serogroup O164, which was dispersed across phylogroups A, B1, B2, and C (one strain each) (Table 1).

Presence of UPEC PAI genetic markers
All EIEC strains, regardless of their phylogroup or serogroup, presented markers for at least one of the eight pathogenicity islands searched (Table 3). PAI I J96 and PAI II CFT073 were detected at high frequencies (92.2% and 80.4%, respectively), whereas PAI IV 536 and PAI I CFT073 were less frequent (37.2% and 31.4%, respectively). The majority of strains harbored multiple PAIs (two to four). In fact, strains harboring only one PAI were rarely found (one strain from group B2, two from group A, and three from group B1). It is worth mentioning that PAI I CFT073 was detected in all strains from phylogroups B2 and E, and only in those strains. In addition, PAI IV 536 was present only in strains from specific serogroups (O124, O152, O164, and O167) belonging to phylogroups A, B1, and E. All strains from EIEC serogroups O136, O143, O144, and O152 were uniform in their PAI content. Interestingly, strains from serogroup O164 varied in their PAI content according to the phylogroup from which they originated (Table 3).

Discussion
EIEC is known to be restricted to a dozen serogroups [22,23], all non-motile with very rare exceptions [21,22,24]. However, it has been demonstrated by molecular methods that non-motile EIEC strains carry the genes for flagella [12,25]. Although necessary for epidemiological assessment, molecular methods are not feasible in clinical laboratory settings in less developed geographical areas, where negative tests for motility and lysine decarboxylation continue to be important phenotypic clues to direct the diagnosis of EIEC. It was interesting to note that biochemical behavior regarding lactose fermentation and indole production was phylogroup-related. The lack of indole production, although rarely reported among E. coli isolates [26], was frequently found among EIEC and was notably associated with serogroups O124 and O136. It has been accepted that diarrheagenic E. coli pathotypes primarily originate from phylogenetic groups A and B1 [27]. Nevertheless, prototype strains from important DEC pathovars are assigned to more virulent phylogroups, as is the case for EPEC strain E2348 and EHEC strain EDL933, which originate from phylogroups B2 and E, respectively [19].
We have shown that the EIEC strains analyzed in this study do not have a unique phylogenetic origin. Instead, most (67%) have arisen from phylogroups A and E, while B1 origin was less common (18%). Moreover, the phylogenetic distribution proved to be strictly related to serogroup, since all O124, O136, O144, and O152 strains were assigned to phylogroup A, all O28ac and O29 strains to B1, and all O143 and the majority of O167 strains to group E. Serogroup O164 was unique in presenting a flexible phylogenetic distribution, with strains spread across phylogroups A, B1, C, and B2. By using various molecular approaches, several authors have also shown the distribution of EIEC in different clusters [1,28] or phylogroups [12]. In agreement with our results, Hazen et al. [12], using in silico analysis of complete genome sequences, showed that serogroup O124 was classified in phylogroup A and O143 in phylogroup E. However, our results differed from those of these authors with respect to serogroups O136, O144, and O164, which they assigned to group B1. One reason for this inconsistency might be the small number of strains sequenced in that study. Overall, and in accordance with other authors [1,12,28], the diversity in phylogenetic origin emphasizes the multiple evolutionary histories of EIEC. As mentioned, other DEC pathovars have already been demonstrated to exhibit diverse phylogenetic origins but, to our knowledge, the link between phylogeny and serogroup has not hitherto been covered in a comprehensive manner. Although EIEC is a diarrheagenic pathotype, it was shown to harbor two to seven of the 20 searched virulence genetic markers that are associated with virulence of ExPEC [29,30]. The data presented in this study corroborate and further extend the idea that diarrheagenic pathotypes may share ExPEC VGs, and vice versa [31][32][33]. In fact, some of the VGs found herein are widespread in E. coli, regardless of the pathogenic potential of the strain.
Despite the low number of strains in each serogroup in this study, the presence of those VGs in EIEC may not be coincidental, since the majority of strains (67 to 100%) presented a specific virulence gene profile in eight of the nine serogroups studied. Further research is needed to understand the contribution of extraintestinal VGs to the pathogenesis of EIEC. Nonetheless, it is already known that the combination of virulence traits from diverse E. coli pathotypes may give rise to extremely virulent clones, as recently happened with E. coli O104:H4, which combined enteroaggregative and toxigenic properties [34].
Using the same methodology described by Sabaté et al. [35], this study detected four of the eight PAIs primarily described in UPEC. Interestingly, those authors reported that 14% of UPEC lacked all these PAIs, while all EIEC strains analyzed herein presented one to four PAIs. It was remarkable that PAI I J96 was the most frequent in EIEC (92%), while it was not detected in any of the commensal or UPEC isolates in that study. Other authors have also reported the absence or very low frequency of PAI I J96 in commensal and extraintestinal pathogenic E. coli isolates, including UPEC [36][37][38][39]. PAI II CFT073 was detected in 80% of our EIEC isolates, none of which were from phylogroup B2, while Sabaté et al. [35] found it only in isolates from phylogroup B2. In agreement with Sabaté et al. [35], PAI I CFT073 was exclusively detected in EIEC strains from phylogroups B2 and E. Dobrindt et al. [40] reported the presence of PAIs I 536 to IV 536, or at least some of their DNA regions, in diarrheagenic isolates. Apart from PAI IV 536, no other PAIs from UPEC 536 were detected in this study. We have shown that EIEC also harbors multiple PAIs, as described for UPEC [38]. However, the most predominant combination found in EIEC (PAIs I J96 and II CFT073) was not detected in UPEC [35,38]. Except for PAI IV 536, the other three PAIs detected in this study did not correlate with the presence of the recognized virulence traits they are reported to encode, namely α-hemolysin, P-fimbriae, Prs-fimbriae, cytotoxic necrotizing factor 1, and aerobactin [35]. The occurrence of DNA rearrangements in these PAIs could explain the missing virulence traits [40,41], while other yet-unknown regions [42] could be perpetuated in EIEC.
Recently reported genomic analyses of EIEC strains [12,17] did not mention the presence of sequences similar to UPEC PAIs, although the serogroups and phylogroups studied encompassed some of those analyzed herein. A possible explanation is that those studies focused only on virulence markers already known to be involved in the pathogenicity of EIEC and Shigella in the intestinal tract.
While extraintestinal virulence determinants tended to be segregated according to serogroup and phylogroup, the enteroinvasive determinants, whether chromosomal or plasmid-located, were maintained in all strains in this study, regardless of their serogroup or phylogroup. This finding suggests that the enteroinvasive pathotype arose later in EIEC evolution, as pointed out by others [12,17].
Finally, as with the finding of ExPEC VGs, the presence of UPEC PAI genetic markers in EIEC needs to be further investigated to gain understanding not only of the pathogenesis but also of the evolution of this important diarrheagenic pathotype.

Conclusions
EIEC is a DEC pathovar that presents both VGs and PAI genetic markers typically found in ExPEC, especially UPEC. These features are distributed in a phylogroup- and serogroup-dependent manner, suggesting the existence of stable EIEC subclones. The constant and uncommon presence of PAI I J96 suggests an as yet unexplained feature of EIEC evolution that requires special attention. The data herein extend the observations reported in the current literature on the occurrence of heteropathogenic E. coli, meaning those strains that combine VGs specific to different E. coli pathovars.
The occurrence of clones combining virulence genes derived from different E. coli pathovars as a result of horizontal gene transfer could play a previously unsuspected role in the pathogenesis of EIEC infections.

Bacterial strains
A total of 51 enteroinvasive E. coli (EIEC) strains belonging to the bacterial collection of the Department of Microbiology, Immunology, and Parasitology (DMIP) at the Federal University of São Paulo were studied. The strains were isolated over a 24-year period (between 1964 and 1987) from sporadic cases of human diarrhea in various geographical regions: 44 in São Paulo city (Southeast-Brazil) and four in Pernambuco state (North-Brazil). The other three isolates were received from Chile, Japan, and CDC (Centers for Disease Control and Prevention)-USA. Upon arrival at DMIP, strains were biotyped and serotyped by conventional methods [26] and submitted to the keratoconjunctivitis assay (Serény test) [43]. After characterization, the collection was kept in 15% glycerol at −70°C, and in stab cultures at room temperature protected from light.

Growth conditions and characterization
For this study, bacteria were grown in tryptic soy broth (TSB) (Difco Laboratories, Detroit, MI, USA), purified on MacConkey agar (Difco Laboratories, Detroit, MI, USA), and serogroup-confirmed by conventional methods [26] using specific antisera (Probac do Brasil-SP-Brasil). Biochemical behavior was assessed for the following reactions: gas from glucose, H2S production, urea hydrolysis, tryptophan deamination, lysine decarboxylation, motility, indole production, and Simmons citrate as sole carbon source, by using the identification kit enterokitB® (Probac do Brasil-SP-Brasil) as recommended.

HeLa cells invasion assay
The invasion capacity was confirmed using cell culture assays as described by Keller et al. [44] with the following modifications: HeLa cells were used instead of HEp-2 cells, and the incubation time after inoculation of bacteria was two hours, followed by another three hours in the presence of 50 μg/mL gentamicin. After a saline wash, the coverslips containing the HeLa cells were fixed, stained, and observed under light microscopy (×1,000 magnification).

Plasmid profile
Plasmid profile was assessed using the rapid alkaline extraction method of Birnboim and Doly [45] exactly as described, except that the volumes were scaled down as follows: 1 mL of broth culture, 100 μL of solution I, 200 μL of solution II, 150 μL of solution III, and 100 μL of solution IV. Electrophoresis was carried out for 130 min at 100-120 V and 40 mA in 0.8% agarose gel. Stained gels were visualized and recorded (Universal Hood II, Bio-Rad Laboratories Inc., USA).

Determination of phylogenetic origin
Determination of phylogenetic origin was carried out following the revised method described by Clermont et al. [19]. The method is based on a quadruplex PCR to detect the genes arpA, chuA, yjaA, and TspE4.C2. The pattern of amplification for those four markers, in that order, defines the phylogroups as follows: A (+ − − −), B1 (+ − − +), B2 (− + + +) or (− + + −) or (− + − +), and F (− + − −). Strains presenting other amplification patterns were submitted to a second PCR for the distinction between phylogroups A and C, D and E, or E and clade I according to Clermont et al. [19]. Prototype strains E. coli RS218 and EDL933 were used as positive controls. Reactions were performed in a Mastercycler gradient thermocycler (Eppendorf) using boiled isolated colonies as template DNA, primers from IDT (Integrated DNA Technologies, USA), and GoTaq Green Master Mix (Promega, USA). The resulting amplified DNA fragments were resolved on 2% agarose gels, stained, and photographed in a digital image capture system (Universal Hood II, Bio-Rad Laboratories, Inc., USA). "1Kb Plus DNA Ladder" (Invitrogen™, Thermo Fisher Scientific) was used as a molecular mass marker.
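The pattern-to-phylogroup lookup described above can be illustrated with a minimal sketch. It encodes only the amplification patterns that this section resolves directly (A, B1, B2, and F); any other pattern is reported as unresolved because, as stated above, it requires the second PCR. The names `Quadruplex`, `parse_pattern`, `assign_phylogroup`, and `UNRESOLVED` are hypothetical and chosen for illustration; they are not part of the published method or of any existing software.

```python
from typing import NamedTuple


class Quadruplex(NamedTuple):
    """Presence (True) or absence (False) of each amplicon, in the order used in the text."""
    arpA: bool
    chuA: bool
    yjaA: bool
    tspE4C2: bool


# Only the patterns that the text assigns directly from the quadruplex PCR.
DIRECT_ASSIGNMENTS = {
    (True, False, False, False): "A",
    (True, False, False, True): "B1",
    (False, True, True, True): "B2",
    (False, True, True, False): "B2",
    (False, True, False, True): "B2",
    (False, True, False, False): "F",
}

# Hypothetical label for patterns that need the second PCR
# (A versus C, D versus E, or E versus clade I).
UNRESOLVED = "unresolved: second PCR required"


def parse_pattern(pattern: str) -> Quadruplex:
    """Parse a string such as '+ - - +' into a Quadruplex genotype."""
    symbols = [c for c in pattern if c in "+-\u2212\u02d7"]  # accept ASCII and typographic minus
    if len(symbols) != 4:
        raise ValueError(f"expected four +/- symbols, got {len(symbols)}")
    return Quadruplex(*(c == "+" for c in symbols))


def assign_phylogroup(genotype: Quadruplex) -> str:
    """Return the phylogroup for a directly resolved pattern, else the UNRESOLVED label."""
    return DIRECT_ASSIGNMENTS.get(tuple(genotype), UNRESOLVED)


if __name__ == "__main__":
    for text in ("+ - - -", "+ - - +", "- + + +", "- + - -", "+ + - -"):
        print(text, "->", assign_phylogroup(parse_pattern(text)))
```

In this sketch the pattern (+ + − −) is deliberately left unresolved, since distinguishing D from E depends on the second PCR, whose details are not given in this section.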