A «repertoire for Repertoire» Hypothesis: Repertoires of Type Three Effectors Are Candidate Determinants of Host Specificity in Xanthomonas

Background: The genetic basis of host specificity for animal and plant pathogenic bacteria remains poorly understood. For plant pathogenic bacteria, host range is restricted to one or a few host plant species reflecting a tight adaptation to specific hosts.


Introduction
Deciphering the mechanisms used by bacterial pathogens to evolve and adapt to new hosts is a major issue for both medical and plant sciences. Despite the tremendous achievements of 30 years of intensive research and the mass of information provided by the sequenced genomes in understanding the interactions between bacterial pathogens and their hosts, the molecular factors underlying host specificity of pathogenic bacteria still remain to be identified. Such fundamental knowledge tackles both the coevolution between the host and the pathogen, as well as the factors underlying the emergence of new pathogens. For example, recent studies investigated the similarities between human and avian pathogenic strains of Escherichia coli, to gain insight into potential zoonotic risks of avian pathogenic strains [1][2][3][4][5]. Indeed, strains of E. coli belonging to the same phylogenetic groups may display pathogenicity either on poultry or on humans, and the question whether avian strains may serve as potential reservoir of antibiotic resistance or virulence genes is of crucial importance [2].
For plant pathogenic bacteria, host specificity of strains is usually very high and well characterized. In many species of plant pathogenic bacteria, numerous pathovars are defined. A pathovar is a subspecific division that groups all bacterial strains that cause the same symptoms on the same plant host range [6]. Within a pathovar, a second level of specificity is defined: races were defined based on the observation that some strains, although fully pathogenic on most host cultivars, may prove avirulent on a certain cultivar. Such specificity between bacterial races and cultivars follows Flor's "gene for gene" theory [7], and has been largely exploited in crop breeding. In the last 20 years, examples have accumulated of bacterial plant pathogens bypassing monogenic resistances introduced in crops. By contrast, host jumps occur only seldom, suggesting that host specificity barriers are more difficult to bypass. Therefore, understanding the molecular mechanisms underlying host specificity may lead to engineering more durable resistances for crop protection.
In the last ten years, genomes of many animal and plant pathogenic bacteria were completely sequenced. Comparative genomics studies demonstrated that repertoires of virulence-associated genes can be highly variable among the sequenced strains, thus suggesting a role in host specificity [8][9][10][11][12][13][14]. However, despite the increasing number of microbial genomes available, genome comparison of several model strains may not fully represent the extreme variability of host specializations that can be found within one bacterial genus. For example, the Xanthomonas genus comprises 27 species causing diseases on more than 400 different host plants, including many economically important crops [15]. At present, genomes of only 12 model strains belonging to diverse species and pathovars of Xanthomonas are sequenced or in the process of being sequenced (http://www.genomesonline.org/). As long as at least one strain representative of each pathovar is not sequenced, comparative genomics, although highly informative, is not fully suited for the identification of the molecular determinants of host specificity.
A pioneering study by Sarkar and colleagues [16] determined the distribution of a wide range of virulence-associated genes in a collection of 91 strains of the plant pathogenic bacterium Pseudomonas syringae, isolated from diverse host plants. They postulated that looking at the distribution of these virulence-associated genes may provide clues on their possible role in host specificity: genes that are highly conserved among all strains, irrespective of the host of isolation, probably do not play a major role in host specificity. On the contrary, virulence-associated genes heterogeneously distributed among strains are good candidates to explain host specificity.
Among pathogenicity determinants shown to display heterogeneous distribution between strains are Type III Effectors (T3Es) [12,16]. T3Es are bacterial proteins that are directly injected inside the cytoplasm of the host cell by a bacterial molecular apparatus called Type III Secretion System (T3SS). This system is conserved among most of the gram negative plant and animal pathogenic bacteria. Using the T3SS, each bacterial strain can inject up to 30 T3Es in the eukaryotic host cell simultaneously [17,18]. In Xanthomonas, a mutation in the T3SS impairs the ability to inject T3Es in the host plant, and as a consequence abolishes pathogenicity and multiplication in planta [19]. More precisely, many studies demonstrated that T3Es alter the physiology of the host cell in a way that is beneficial for the pathogen [11,20].
In 2006, Jones and Dangl [21] proposed a mechanistic model for the interaction between gram negative bacteria and plants in which the combined action of T3Es leads to suppression of host defence reactions that are induced after recognition of Pathogen Associated Molecular Patterns (PAMPs), thus resulting in an effector-triggered susceptibility. In some cases, an individual T3E (or its action inside the cell) may specifically be recognized by the plant. This recognition induces a hypersensitive response (HR), resulting in an effector-induced resistance. Such specific recognition of a T3E in the host cell by the product of a plant resistance gene constitutes the molecular basis of the "gene for gene" theory ruling race/cultivar specificity. Thus, T3Es may enlarge the host range of a given bacterium by suppressing host defences, or narrow the host range when one of them is specifically recognized by the plant.
Experimental data suggest that T3Es may also be involved in host specificity. Indeed, several studies used suppression subtractive hybridization (SSH) approaches to perform genomic comparison of non-sequenced strains, that are very closely related phylogenetically but differing in the hosts they attack. Among the genes isolated in these SSHs were single T3Es, suggesting that they can play a role in host specificity in plant pathogenic bacteria [22][23][24] as well as in animal pathogenic bacteria [2].
The different T3Es injected in the host cell probably act collectively. Supporting this idea, mutations in one single T3E rarely affect the virulence phenotype of strains. This suggests that among all the T3Es injected in the host cell, some have redundant functions in virulence [25,26]. T3Es seem to act synergistically or antagonistically on different pathways of the host cell, to create a physiological status of the host that would be optimal for the development of the pathogen. Examples illustrating this idea can be found in plant and animal pathogenic bacteria. Among the conserved T3Es of Pseudomonas syringae pv. tomato DC3000, HopPtoM induces an increase of the number and the size of the lesions, whereas HopPtoN induces a decrease of the number and the size of the lesions [27,28]. In E. amylovora, a similar antagonism can be found between HrpN and HrpW. Indeed, HrpN induces ion fluxes that induce cell death in Arabidopsis, whereas HrpW induces ion fluxes that prevent cell death induced by HrpN. When both purified proteins HrpW and HrpN are added in the culture medium, the quantity of cell death depends on the relative concentration of the proteins [29]. In Salmonella typhimurium, the T3E AvrA stabilizes cell permeability and tight junctions in epithelial cells, whereas the T3Es SopB, SopE and SopE2 were shown to destabilize tight junctions [30].
Moreover, it was shown that in maize, resistance genes involved in race/cultivar specificity may also play a role in the recognition of the non-host bacterium Xanthomonas oryzae [31]. This shows that the same type of genes may be involved at both levels, race/cultivar resistance and non-host resistance. Host specificity is most probably not governed by a unique gene. If it were, emergence of new plant diseases would be as frequent as cases of bacterial pathogens escaping monogenic resistances. If we generalize the model described by Jones and Dangl [21], the outcome of the interaction would depend greatly on the confrontation of repertoires of T3Es and "guard genes" of the plant. Race/cultivar specificity would then be explained in a "Gene for Gene" manner [7] whereas host specificity would be explained in a "Repertoire for Repertoire" manner.
In the present study, we have chosen, as a model for investigating the molecular determinants of host specialization, the species Xanthomonas axonopodis [32]. Indeed, within this bacterial species, numerous pathovars displaying various plant host ranges have been defined [32][33][34][35]. Furthermore, it is worth noting that recent phylogenetic studies performed on strains belonging to X. axonopodis demonstrated that some pathovars did not form monophyletic groups [36,37]. Some strains that are phylogenetically very close may belong to different pathovars, whereas some strains belonging to the same pathovar may be phylogenetically distant. For example, the pathovar phaseoli, which groups all the strains pathogenic on bean, comprises four distinct genetic lineages [24]. Strains belonging to the genetic lineages 2 and 3 of X. axonopodis pv. phaseoli are phylogenetically closer to strains pathogenic on citrus or cotton than to strains belonging to the genetic lineage 1 of the pathovar phaseoli [24]. This suggests that in X. axonopodis, host specialization results from phylogeny-independent factors.
In this study, we assayed for the presence of 35 T3Es in a set of 132 strains of the species X. axonopodis, representative of 18 different pathovars. We then tested for associations between pathovars and the repertoire of T3Es of the tested strains, to provide data challenging a ''repertoire for repertoire'' hypothesis that may explain host and tissue specificity.

Results
Presence or absence of 35 T3Es in the strains used in this work was determined by PCR amplification with specific primers, as well as by dot-blot hybridization. Both methods were used in parallel to obtain complementary results: hybridization tells whether strains contain orthologs of the probed T3E gene, whereas the PCR approach provides clues on whether genetic rearrangements may have occurred in the target sequence.
Phylogenetic position of strains used in this work was obtained by sequencing the housekeeping gene rpoD. Comparison of dendrograms obtained using the rpoD phylogenetic data and the dendrogram obtained based on presence or absence of T3Es documents the involvement of T3E repertoires in host specialisation.
The X. axonopodis species comprises monophyletic and polyphyletic pathovars
In order to test whether the phylogeny could explain the distribution of the strains of our collection (Table 1) among the 18 pathovars of X. axonopodis, we sequenced rpoD, one of the housekeeping genes commonly used in multilocus sequence analysis and typing (MLSA and MLST) studies [37,58,59,60,61]. Phylogenetic trees based on rpoD sequences were constructed using the Maximum Likelihood method. The Akaike information criterion, as used in Modeltest [62], selected a model with a proportion of invariable sites (I) = 0.5065 and a Gamma distribution shape parameter (G) = 0.7388. The Maximum Likelihood tree, rooted with the orthologous rpoD sequence from X. campestris pv. campestris strain CFBP5241, is presented in Figure 1. A very similar tree was also obtained with sequences of four housekeeping genes originating from about forty strains belonging to diverse pathovars of X. axonopodis [36,37]. The tree constructed based on rpoD sequences matched well with the 6 rep-PCR clusters (from 9.1 to 9.6) defined within the X. axonopodis species (Figure 1) [34,35]. Moreover, high bootstrap values indicated that this clustering was well supported and that the tree was robust (Figure 1). It is worth observing that, depending on the X. axonopodis pathovars, strains were grouped together or were distributed into phylogenetically unrelated groups (Figure 1). From this phylogenetic analysis, pathovars anacardii, axonopodis, begoniae, citri, mangiferaeindicae, malvacearum, manihotis, ricini, vesicatoria and vignicola can be considered as monophyletic whereas pathovars alfalfae, allii, aurantifolii, citrumelo, dieffenbachiae, glycines, phaseoli and vasculorum can be considered as polyphyletic.
The rpoD-based tree was sufficient to highlight the existence of different genetic lineages within a pathovar. As an example, the four genetic lineages of the pathovar phaseoli that have been determined by Amplified Fragment Length Polymorphism (AFLP) [24, our unpublished data] were supported by the rpoD sequence analysis (Figure 1). Our rpoD-based tree confirmed that strains belonging to the genetic lineage 1 of the pathovar phaseoli are phylogenetically distant from strains belonging to the three other genetic lineages. X. axonopodis pv. phaseoli strain CFBP6987 diverged from all other strains belonging to this pathovar, which also supports our previous AFLP analyses [our unpublished data]. Concerning strains of the pathovar dieffenbachiae, it is interesting to notice that rpoD sequence analysis gathers strains into three phylogenetic groups according to their host of isolation (Anthurium sp., Dieffenbachia sp. and Philodendron sp.) (Figure 1). Another striking observation was that some strains belonging to different pathovars (aurantifolii, citri and glycines in a first case and anacardii, aurantifolii and citrumelo in a second case) exhibited identical rpoD sequences (Figure 1).
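The notion of a monophyletic pathovar used above can be illustrated with a minimal Python sketch. It assumes a rooted tree encoded as nested tuples and treats a pathovar as monophyletic when some clade contains exactly its strains; pathovars failing this test correspond to those called polyphyletic in the text. The strain names, pathovar labels and topology are hypothetical and are not taken from Figure 1.

```python
def leaves(tree):
    """Return the set of leaf names under a node (a leaf is a string)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree):
    """Yield the leaf set of every node of the rooted tree, leaves included."""
    yield frozenset(leaves(tree))
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, members):
    """True if some clade of the tree contains exactly the given strains."""
    return frozenset(members) in set(clades(tree))


def pathovar_status(tree, pathovar_of):
    """Map each pathovar to True (monophyletic) or False (not monophyletic)."""
    groups = {}
    for strain, pathovar in pathovar_of.items():
        groups.setdefault(pathovar, set()).add(strain)
    return {pv: is_monophyletic(tree, strains) for pv, strains in groups.items()}


# Hypothetical rooted topology: ((s1, s2), ((s3, s4), s5))
toy_tree = (("s1", "s2"), (("s3", "s4"), "s5"))
# Hypothetical pathovar assignment of the five toy strains
toy_pathovar = {"s1": "A", "s2": "A", "s3": "B", "s4": "C", "s5": "B"}

status = pathovar_status(toy_tree, toy_pathovar)
print(status)
```

In this toy example, pathovar A forms a clade on its own, whereas the strains of pathovar B are separated by a strain of pathovar C.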

T3E repertoires of X. axonopodis strains combine ubiquitous and variable effectors
Before investigating the T3E gene distribution among our collection of strains, we checked that all strains of our collection have a T3SS of the Hrp2 family that is usually present in xanthomonads [8,9,63,64,65,66]. This analysis, performed by using specific PCR primers [55], revealed that all strains of our collection carry this T3SS (data not shown). Then, the distribution of the 35 selected T3E genes among the X. axonopodis strains was studied and the results are presented in Figure 2 and in Table S1. When we did not detect the presence of a T3E gene in a given strain, we considered that this gene is absent or too divergent to be detected. Indeed, our approach cannot completely rule out the possibility that some T3E genes may have been subjected to diversifying selection, resulting in sufficient sequence divergence to avoid detection through dot-blot hybridization.
Our results clearly revealed that T3E repertoires contained two categories of genes. Some genes showed a broad distribution among strains whereas the remaining ones displayed a variable distribution. The first class comprised T3E genes (xopF1, avrBs2, xopN, pthA1, xopX, xopQ, avrXacE3 and xopE2) that were present in at least 87% of the strains tested. These genes will be referred to as ubiquitous T3Es. We could then consider ubiquitous T3Es as the core suite of T3E genes for strains of X. axonopodis. The other T3Es, whose distribution was not as broad as that of the ubiquitous genes, constituted a second class of genes: for instance xopP was detected in 67% of the strains, but 10 other genes were present in less than 10% of the X. axonopodis strains tested. Two of them, avrXccA1 and XCC2565, were not detected in any of the strains, although they were found in X. campestris pv. campestris strain CFBP5241. This second class of genes will be referred to as variable T3Es. We could then consider variable T3Es as the variable suite of T3E genes for X. axonopodis strains. Interestingly, all ubiquitous T3Es have a G+C content (~65%) similar to the average value of total DNA for X. axonopodis strains [8,32,64] whereas the majority of variable T3E genes have a considerably lower G+C content (down to 42.1%) (Figure 2).

Figure 1. [...] campestris. The tree constructed based on rpoD sequences is congruent with the previous grouping based on rep-PCR profiles by Rademaker [34]. Polyphyletic pathovars are reported in red, whereas monophyletic ones are reported in black. *GL1, GL2, GL3, and var. fuscans correspond to the 4 genetic lineages previously described in the pv. phaseoli [24]. The dendrogram also displays the evolutionary history of the T3E xac3090 as inferred from the parsimony method implemented in Mesquite [89]. Occurrence of xac3090 is indicated on the tree by gray branches. For example, xac3090 is present in all the strains of pv. glycines. The parsimony analysis indicates that this T3E appeared several times independently in strains of X. axonopodis pv. glycines, as well as in other parts of the tree. doi:10.1371/journal.pone.0006632.g001
Figure 2. Distribution of T3E genes among strains belonging to 18 pathovars of Xanthomonas axonopodis. In this figure, the presence or the absence of an ortholog of each selected T3E gene was determined by dot-blot hybridizations. Black squares represent presence of the corresponding gene, whereas white squares represent absence of a sequence similar to the probe used. In the latter case, the gene may be absent or its sequence may be too divergent to be detected. The GC% of each T3E gene is indicated on the basis of sequences of the orthologs found in the databases. *AMGE indicates whether the considered gene was reported to be Associated to Mobile Genetic Elements in Xanthomonas strains whose genome was sequenced (Y: Yes; N: No) [8,64]. doi:10.1371/journal.pone.0006632.g002

Different pathovars have different T3E repertoires. Some diversity in T3E repertoires may be observed within some pathovars
T3E repertoires were highly variable between X. axonopodis strains, both in terms of the T3Es present and of the size of the repertoires (Figure 2). Our results showed that repertoires of T3Es were different between strains belonging to different pathovars.
When looking at the size of T3E repertoires, we observed a large variability. Strains of the pathovar vasculorum harboured the smallest T3E repertoire (6 or 7 of the 35 selected T3E genes depending on the strains) whereas strains of pathovar vesicatoria exhibited the largest T3E repertoire (from 22 to 26 of the 35 selected T3E genes depending on the strains). Regarding strains of the 16 remaining pathovars, their T3E repertoires were composed of 10 to 20 of the 35 selected T3E genes.
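As a rough illustration of how such figures can be derived from a presence/absence table, the following Python sketch computes the prevalence of each gene across strains, labels genes as ubiquitous or variable, and counts the repertoire size of each strain. The 0.87 cut-off simply mirrors the "at least 87% of the strains" observation reported above and is used here as an assumed threshold; the strain names, gene names and matrix values are hypothetical toy data, not results from Figure 2.

```python
def gene_prevalence(matrix):
    """Fraction of strains carrying each gene.

    matrix maps strain -> {gene: 1 if detected, 0 otherwise}.
    """
    strains = list(matrix)
    genes = list(next(iter(matrix.values())))
    return {g: sum(matrix[s][g] for s in strains) / len(strains) for g in genes}


def classify_genes(prevalence, threshold=0.87):
    """Label each gene as ubiquitous (prevalence >= threshold) or variable."""
    return {
        gene: "ubiquitous" if freq >= threshold else "variable"
        for gene, freq in prevalence.items()
    }


def repertoire_sizes(matrix):
    """Number of detected genes per strain."""
    return {strain: sum(row.values()) for strain, row in matrix.items()}


# Hypothetical toy data: 4 strains x 3 genes
toy_matrix = {
    "s1": {"g1": 1, "g2": 1, "g3": 1},
    "s2": {"g1": 1, "g2": 1, "g3": 0},
    "s3": {"g1": 1, "g2": 1, "g3": 0},
    "s4": {"g1": 1, "g2": 0, "g3": 0},
}

prevalence = gene_prevalence(toy_matrix)
classes = classify_genes(prevalence)
sizes = repertoire_sizes(toy_matrix)
print(prevalence, classes, sizes)
```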
Within most pathovars, repertoires of T3Es were conserved. Indeed, we observed identical or almost identical (only one T3E gene in one strain differs) T3E repertoires among strains of the monophyletic pathovars anacardii, axonopodis, malvacearum, manihotis, mangiferaeindicae, ricini and vignicola (Figure 2). We also observed almost identical T3E repertoires (only one T3E gene in one or two strains differs) among strains belonging to the polyphyletic pathovars dieffenbachiae, glycines and vasculorum. For example, strain CFBP3132 of the pathovar dieffenbachiae carries one more gene (avrXccA2) than other strains of this pathovar (Figure 2).
When a significant variation in T3E repertoires occurred between strains of the same pathovar, the observed variation can be linked to the reported genetic diversity within this pathovar. For example, the four genetic lineages that were defined in the pathovar phaseoli [24] possess T3E repertoires that are similar but not identical (Figure 2). Some variations in T3E repertoires within genetic lineages of the pathovar phaseoli were noticed as well, but differences are smaller among strains belonging to the same genetic lineage than among strains belonging to different genetic lineages. In other cases, the variation observed within a pathovar can be linked to the host of isolation. For example, among strains of the pathovar anacardii, T3E repertoires are almost identical, but strains isolated from Mangifera indica (CFBP2913 and CFBP2914) carry one more T3E gene (xopX) than strains isolated from Anacardium occidentale (CFBP7240, CFBP7241, CFBP7242, CFBP7243) (Figure 2). A similar observation can be made for pathovar glycines strains isolated from Glycine hispida (CFBP1519, CFBP1559 and CFBP2526), which carry one more T3E gene (xopQ) than strains isolated from Glycine javanica or Glycine max (CFBP7118 and CFBP7119) (Figure 2).

Distance between studied repertoires highlights a correspondence between T3E repertoires and pathovars of X. axonopodis
To test the hypothesis of a correspondence between repertoires of T3Es and pathovars, we constructed a dendrogram from a matrix summarizing presence/absence of the T3Es for each of the 132 selected strains. Strikingly, this dendrogram grouped the tested strains according to their pathovar, and these groupings were supported by high bootstrap values (Figure 3). However, one exception was observed since strains of the pathovar aurantifolii were distributed in two distinct groups. This result was particularly interesting for the pathovars that appeared polyphyletic based on our rpoD sequence analysis (see above). For example, strains of the pathovars citrumelo, dieffenbachiae, glycines, and vasculorum, which have been split into phylogenetically distinct groups (Figure 1), clustered together based on T3E repertoires (Figure 3). Regarding the pathovar phaseoli, the four genetic lineages appeared tightly related when looking at the T3E repertoires whereas one genetic lineage was phylogenetically distant from the three other ones (Figures 1 and 3). The same observation could be made for the pathovar phaseoli strain CFBP6987, which was evolutionarily divergent based on rpoD sequencing (Figure 1) but indistinguishable from other strains of the genetic lineage 1 of the pathovar phaseoli based on T3E repertoires (Figure 3). Therefore, our results revealed some correlations between repertoires of T3Es and pathovars. This suggests that T3E repertoires might promote the pathogenicity of strains that are phylogenetically distinct on the same host plants. Thus, T3E repertoires represent candidate determinants of the pathological adaptation of the X. axonopodis strains on their hosts.
Furthermore, some T3Es may allow discrimination between pathovars. Indeed, among the T3Es we tested, some appear specific to certain pathovars. For example, xopD and xopO were specific to the pathovar vesicatoria. Some T3E genes also allowed discrimination between different genetic lineages of polyphyletic pathovars. For example, avrRxo1 allowed discrimination between genetic lineage 1 of the pathovar phaseoli and the other genetic lineages. Genetic lineages 2 and fuscans may be discriminated from the two other genetic lineages of the pathovar phaseoli by the presence of avrXacE1 and avrXacE2.
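The general idea of building a dendrogram from a presence/absence matrix can be sketched as follows. This section does not state which distance measure or clustering algorithm was used, so the Jaccard distance and average-linkage (UPGMA-style) clustering below are assumptions chosen for illustration, and bootstrap support is not computed. Strain names and profiles are hypothetical toy data.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - (genes present in both) / (genes present in at least one)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def average_linkage(profiles):
    """Cluster strains by average linkage; return a nested-tuple dendrogram."""
    # Each cluster is a pair: (subtree, list of member strain names)
    clusters = [(name, [name]) for name in profiles]
    dist = {
        frozenset((a, b)): jaccard_distance(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def mean_distance(c1, c2):
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the two clusters with the smallest mean pairwise distance
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: mean_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical toy profiles: 1 = T3E gene detected, 0 = not detected
toy_profiles = {
    "pvA_1": [1, 1, 1, 0, 0],
    "pvA_2": [1, 1, 1, 0, 0],
    "pvB_1": [1, 0, 0, 1, 1],
    "pvB_2": [1, 0, 0, 1, 0],
}

dendrogram = average_linkage(toy_profiles)
print(dendrogram)
```

With these toy profiles the two strains of each hypothetical pathovar are joined first, which is the kind of grouping by repertoire described in the text.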
When one considered the tissue specificity of the X. axonopodis strains, our results presented in Figures 2 and 3 showed no clear delineation between vascular and non-vascular pathogens. We did not detect T3E genes that allow distinction between vascular and non-vascular pathovars. No apparent correlation was observed between T3E repertoires and tissue specificity of the tested strains in contrast to our results between T3E repertoires and host specificity. Nevertheless, in Figure 3 we noted that certain vascular pathogens, such as pathovars vasculorum and manihotis, or certain non-vascular pathogens, such as pathovars citrumelo and alfalfae, appeared closely related in the dendrogram but these groupings were not supported by high bootstrap values.

Association between T3E genes within repertoires
The notion that a T3E repertoire enables a pathological convergence on a particular host implies that it is the coordinated action of several T3Es rather than the action of one unique T3E that matters. Thus, we wanted to estimate potential associations between T3Es, which may provide clues on potential functional synergies or redundancies between pairs of T3Es.
Therefore, we calculated for each pair of T3E genes a frequency of association, and we considered only cases where both T3E genes in a pair are present in the tested strains (Figure 4). Where we found high frequencies of association between T3E genes in X. axonopodis, the genes concerned proved to be either genetically linked or genetically unlinked, based on the genome sequence of the X. axonopodis pv. vesicatoria strain CFBP5618 (= strain 85-10) [64]. The first case is illustrated by avrBs1 and avrBs1.1, which are genetically linked and showed 100% association (Figure 4). Interestingly, the observed genetic linkage of xopN and xopF2 in X. axonopodis pv. vesicatoria strain CFBP5618 [64] does not seem to be conserved in all X. axonopodis strains since both genes, when present, did not exhibit 100% association but 45% association (Figure 4). The second case is illustrated for instance by avrBsT and xccB (46% association) or by avrRxv and xopJ (43% association) (Figure 4), which are genetically unlinked. This point is particularly important when one considers the functional families of T3Es and thus functional redundancy between T3Es. For instance, avrBsT, xccB, avrRxv and xopJ belong to the same functional family, namely the YopJ/AvrRxv family of cysteine proteases (Table S2) [67,68]. Interestingly, we also found high frequencies of association between genetically unlinked T3Es belonging to the HopX/AvrPphE family (Table S2) [8,68]. For example, we found that avrXacE1 was highly associated with xopE1 (87%), xopE2 (60%), avrXacE3 (60%) and avrXacE2 (59%) (Figure 4). Regarding the genetically unlinked genes hpaF and xac3090, which encode T3Es belonging to the PopC family (Table S2) [8], both genes showed 39% association (Figure 4). In contrast, the genetically unlinked genes xopX and ecf, which encode T3Es belonging to the HopAE1 family (Table S2) [68], exhibited a very low frequency of association (only 6%) in X. axonopodis strains (Figure 4).
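One possible way to compute a pairwise frequency of association from a presence/absence table is sketched below. The exact formula is not given in this section, so the definition used here (strains carrying both genes divided by strains carrying at least one of the two) is an assumption, as are the function name and the toy data.

```python
from itertools import combinations


def association_frequency(matrix, gene_a, gene_b):
    """Strains carrying both genes / strains carrying at least one of them.

    Returns None when neither gene is detected in any strain.
    """
    both = sum(1 for row in matrix.values() if row[gene_a] and row[gene_b])
    either = sum(1 for row in matrix.values() if row[gene_a] or row[gene_b])
    return None if either == 0 else both / either


def all_pairs(matrix):
    """Association frequency for every pair of genes in the table."""
    genes = sorted(next(iter(matrix.values())))
    return {
        (a, b): association_frequency(matrix, a, b)
        for a, b in combinations(genes, 2)
    }


# Hypothetical toy data: 5 strains x 4 genes (g4 is never detected)
toy_matrix = {
    "s1": {"g1": 1, "g2": 1, "g3": 1, "g4": 0},
    "s2": {"g1": 1, "g2": 1, "g3": 0, "g4": 0},
    "s3": {"g1": 0, "g2": 0, "g3": 1, "g4": 0},
    "s4": {"g1": 0, "g2": 0, "g3": 1, "g4": 0},
    "s5": {"g1": 0, "g2": 0, "g3": 0, "g4": 0},
}

pairs = all_pairs(toy_matrix)
print(pairs)
```

Under this assumed definition, two genes that always occur together score 1.0 regardless of how many strains carry them, which is consistent with a variable gene pair reaching 100% association.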

Few DNA rearrangements are identified within T3E genes
We tried to detect the presence of the 35 selected T3E genes in our large collection of X. axonopodis strains by both PCR and dot-blot hybridization methods. Interestingly, we observed that some PCR products were clearly different in size from the expected PCR products. Indeed, 22 PCR products were larger and one was smaller as compared with PCR products of the reference strains, suggesting insertion or deletion of DNA sequences within some T3E genes. To further characterize the DNA rearrangements within these T3E genes, we sequenced the PCR fragments generated from these genes. The sequence analyses revealed three types of DNA rearrangements: deletion, tandem duplication and insertion of IS elements (Table 2). We identified one in-frame deletion of 384 bp in the xopF2 gene of X. axonopodis pv. aurantifolii strain CFBP2866, the only strain of this pathovar carrying this gene (Table 2 and Figure 2). One perfect tandem duplication of 90 bp was identified in xopD of the X. axonopodis pv. vesicatoria strain CFBP6817 (Table 2).
Most of the DNA rearrangements (21/23) correspond to insertions of IS elements. We found that seven T3E genes (avrXv3, avrXacE2, avrRxo1, ecf, xopC, xopO, xopN) from strains belonging to 6 pathovars of X. axonopodis were disrupted by 6 different IS elements (IS1595, ISXca2, ISXac2, IS1389, IS1404 and IS1479). Interestingly, except for IS1479 and IS1595, these IS elements are closely related since they are classified within the single IS3 family-IS407 group (http://www-IS.biotoul.fr/is.html). The determination of the usual 4 bp DRs generated by insertions of the IS elements belonging to the IS3 family-IS407 group revealed no consensus sequence, thus reflecting no insertion site specificity (Table 2). The determination of the location of IS element insertions (Table 2) showed that a T3E gene can be disrupted at the same position by the same IS element in all strains of the same pathovar (for example avrXv3 disrupted by IS1595 at position 513 in all pathovar alfalfae strains) or at different positions by different IS elements in different strains of the same pathovar (for instance avrRxo1 disrupted by ISXca2 and IS1389 at positions 770 and 411 respectively in strains CFBP6369 and CFBP6107 of the pathovar allii). We also observed that a T3E gene can be disrupted by different IS elements at different positions in strains belonging to different pathovars. This is the case of xopC, which is disrupted by IS1404 and IS1479 in strains of pathovars citrumelo and mangiferaeindicae respectively. In this latter example, it is interesting to note that xopC, carried by pathovar mangiferaeindicae strains, is altered in strains isolated from Schinus terebenthifolius but not in those isolated from Mangifera indica. Since strains from both hosts of isolation exhibited identical T3E repertoires (Figures 2 and 5), this result suggests that the alteration of this T3E gene might have a role in host adaptation for pathovar mangiferaeindicae strains.
To gain insight into sequence variation among orthologs of variable T3Es, a subset of 120 sequences of variable T3Es was obtained. The genetic diversity thus observed was extremely low, and the sequences obtained were almost identical to those of the functional orthologs found in the databases (data not shown). Only the sequence of avrXccB in strain CFBP1845 of the pathovar phaseoli displayed a premature stop codon (data not shown).

Discussion
In this paper, we investigated the distribution of 35 T3Es among 132 strains belonging to 18 pathovars of the species X. axonopodis [32]. To our knowledge, this strain collection is the largest used in any distribution study of virulence-associated genes in plant pathogenic bacteria. To provide the largest diversity, strains were chosen to represent the broad host range, wide geographic distribution, and genetic diversity of the species X. axonopodis [32,34,35]. In the course of this study, the phylogeny of the 132 selected strains was also constructed based on the sequence of the housekeeping gene rpoD to provide the framework necessary for the analysis of the results of our distribution study.
T3E repertoires of X. axonopodis strains combine core and flexible gene sets that may play distinct roles in pathogenicity and may have evolved differently
It is important to note that our results support previous observations made for Pseudomonas syringae [16,69,70] or Ralstonia solanacearum [12]. Indeed, we identified two classes of genes within T3E repertoires of X. axonopodis strains. The first class comprises 8 ubiquitous T3E genes (avrBs2, xopN, xopF1, xopX, pthA1, xopE2, avrXacE3 and xopQ) whereas the second class contains the remaining T3E genes, which are variable among strains. One can thus consider that the first class represents the core T3E gene set and the second one the flexible T3E gene set of X. axonopodis strains. Both gene sets may play distinct roles in pathogenicity of the strains and may have evolved differently.
Regarding pathogenicity, the core T3E gene set could provide virulence functions of broad utility and thus target defence components broadly conserved among a wide range of hosts [11,70,71]. Loss of these ubiquitous T3Es would lead to loss of fitness for the pathogen. Indeed, such a hypothesis is supported by experimental data accumulated over two decades in diverse laboratories. For instance, mutations in avrBs2, xopX, xopN or members of the AvrBs3/PthA family were shown to alter fitness and pathogenicity of strains belonging to pathovars of X. axonopodis [72][73][74][75][76][77]. However, xopF1 and xopQ do not fit this picture, since their inactivation does not seem to alter pathogenicity of X. axonopodis pv. vesicatoria [74]. No data in the literature are available for the xopE2 and avrXacE3 genes. In contrast, the flexible T3E gene set could contribute to strategies specific to particular plant pathogen-host interactions and thus could account for host specificity of plant pathogenic bacteria [11,70,71]. The role of variable T3Es would then be more subtle, and loss of such effectors may not necessarily be associated with a decrease of pathogenicity. For example, avrBsT is a variable effector as it is mainly found in strains belonging to the pathovar phaseoli. Inactivation of avrBsT does not seem to alter the pathogenicity of X. axonopodis pv. phaseoli [our unpublished data]. The same observations were made for xopC, xopF2, xopJ, xopO and xopP, which appeared as variable T3E genes in our study [74,78]. Altogether, these data, obtained from Xanthomonas strains, can be compared to what is known in Pseudomonas syringae. Indeed, mutations in T3Es of the conserved effector locus (CEL) usually alter pathogenicity [79]. Substantial experimental evidence is available for hopPtoM, hopPtoN and avrE in Pseudomonas syringae [27,28,80], as well as for dspA/E in Erwinia amylovora [81,82]. Conversely, mutations in T3E genes of the exchangeable effector locus (EEL) of Pseudomonas syringae are not associated with strong impairment of pathogenicity [79].
Our study contributes also to a better understanding of the evolutionary history of T3E genes within the X. axonopodis species. The core T3E genes set might represent the ancient T3E gene suite, acquired by the ancestor of the X. axonopodis species before diversification of pathovars, and thus before host specialization occurred. These core T3E genes might have evolved from this ancestor by vertical descent among X. axonopodis strains. However, some of these core T3E genes might have been acquired later in the evolution and then have been stably inherited along with the core genome. One can also postulate that among the core T3E genes set, some genes might have been lost during evolution in phylogenetically closely related pathovars, such as xopE2 and avrXacE3 in pathovars manihotis and vasculorum or as xopQ in pathovars allii and ricini (Figures 1 and 2). In contrast, the flexible T3E genes set might have evolved by horizontal gene transfer even though we cannot completely rule out gene loss during evolution. Analyses of Xanthomonas genomes clearly showed that these bacteria have been subjected to numerous horizontal gene transfers during evolution, sometimes from phylogenetically distant organisms [83,84]. Moreover, gene acquisition is considered to be a major factor contributing to the genomic diversity of these bacteria but it seems that, once acquired, these genes are rarely transferred among lineages [85,86]. Horizontal gene transfer events were supported by the fact that the majority of the variable T3E genes in our study cluster within pathogenicity islands in their Xanthomonas host genomes [8,64,87]. Indeed, these variable T3E genes exhibit a G+C content lower compared to the average value of the rest of the host bacterial genome, they are often associated with integrase genes, transfer RNA genes and/or IS elements or remnants of them, and they are found sometimes on plasmids. 
Regarding ubiquitous T3E genes, no linkage to pathogenicity islands can be detected: their G+C content is similar to that of the rest of the host genome, they are flanked by orthologous sequences, they are not associated with mobile elements, integrase or transfer RNA genes, and they reside on the chromosome (except for pthA1).
Finally, the importance of knowing whether a T3E is ubiquitous or variable may be illustrated by the durability in the field of resistances introduced in crops. The pepper resistance gene Bs2, which matches the ubiquitous T3E avrBs2, has been widely deployed in the field and still provides a good level of resistance. In contrast, low durability was predicted for the resistance conferred by Bs1, which matches the variable T3E avrBs1 [88].
A correspondence between composition of T3E repertoires and pathovars of X. axonopodis supports a ''repertoire for repertoire'' hypothesis
The phylogeny of the strains used in this study was constructed from the sequence of the rpoD housekeeping gene. Our results confirm that host specificity is not necessarily correlated with phylogeny [24,35,37]. Indeed, some pathovars are clearly polyphyletic, e.g. pathovars phaseoli, dieffenbachiae, glycines and vasculorum. However, the dendrogram constructed from the T3E presence/absence matrix groups strains by pathovar (except for the pathovar aurantifolii), irrespective of the phylogenetic relationships between strains. For example, in the rpoD phylogeny, the pathovar phaseoli is scattered over the tree. In particular, genetic lineage 1 diverges strongly from the other lineages, as previously mentioned [24]. In contrast, on the dendrogram constructed from the presence/absence matrix of T3Es, the four distinct genetic lineages identified in the pathovar phaseoli cluster together. Thus, in our study, strains displaying a similar T3E repertoire belong to the same pathovar, even though they may be phylogenetically distant.
Conversely, strains displaying different host specialisation exhibit different T3E repertoires, even though these strains may be very close phylogenetically. For example, based on our rpoD phylogeny, strains belonging to the pathovar vignicola are mixed with strains belonging to the genetic lineage 2 of the pathovar phaseoli. However, their T3E repertoires are highly divergent, and strains do not display the same host range. Even more striking is the example of strains CFBP3541 and CFBP3835 that belong to the pathovars citrumelo and alfalfae, respectively. Phylogenetically, these strains are much closer to strains belonging to the pathovar anacardii or to the pathovar phaseoli than other strains of their respective pathovars. However, the T3E repertoire of strains CFBP3541 and CFBP3835 is identical or highly similar to that of other strains of pathovars citrumelo and alfalfae, respectively.
Such results support the hypothesis that T3E repertoires may explain a pathological convergence of phylogenetically distant strains. Thus, for a given strain, the T3E repertoire in its entirety would largely determine the host range. This hypothesis was also suggested by recent data obtained on a wide collection of Pseudomonas syringae strains isolated from different host plants [16]. In addition, we analysed T3E gene history using parsimony as implemented in the Mesquite software package [89]. The parsimony method is particularly well suited to binary data such as the presence or absence of a T3E gene. Figure 1 shows that the trait ''presence of the T3E gene xac3090'' appears at several nodes in the phylogenetic tree. For example, the occurrence of xac3090 in the pathovar glycines probably results from multiple independent evolutionary events, which is compatible with the hypothesis of an adaptive convergence for pathogenicity.
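To make the parsimony argument concrete, the following sketch counts the minimum number of gains or losses of a binary trait on a rooted binary tree using Fitch's algorithm. It is an illustration only and not the Mesquite implementation; the tree shape, the strain labels `s1` to `s5` and the trait states are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (rooted, binary).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, states: Dict[str, int]) -> Tuple[FrozenSet[int], int]:
    """Return (candidate states at this node, minimum number of changes below it)."""
    if isinstance(tree, str):
        return frozenset({states[tree]}), 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change is needed.
        return shared, left_cost + right_cost
    # The children disagree: one change is required on one of the two branches.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical example: 1 = gene present, 0 = gene absent.
toy_tree = ((("s1", "s2"), "s3"), ("s4", "s5"))
toy_states = {"s1": 1, "s2": 0, "s3": 0, "s4": 1, "s5": 0}
root_set, changes = fitch(toy_tree, toy_states)
print(changes)  # 2
```

In this toy case the gene is present in two strains that are not sister taxa, so at least two independent events are needed, which is the pattern interpreted above as convergence.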
The variability observed in T3E repertoires between strains belonging to the same pathovar may explain race/cultivar specificity. Furthermore, in polyphyletic pathovars such as pathovar phaseoli, the differences in repertoires observed between the four genetic lineages [24] of this pathovar may reflect differences in host range that have not yet been revealed. Pathovar phaseoli strains may have evolved diverse T3E repertoires to extend their host ranges or increase their survival on various unrelated plant species, as was postulated for Pseudomonas syringae strains [90]. We now plan to thoroughly test the host range of each genetic lineage of the pathovar phaseoli on plants belonging to the Fabaceae family in order to test this hypothesis.
Thus, our results support a ''repertoire for repertoire'' hypothesis as the molecular basis of host specificity of plant pathogenic bacteria. In such hypothesis, the outcome of the interaction between the bacterial pathogen and the plant would greatly depend on the confrontation of the repertoires of bacterial pathogenic determinants, such as T3E genes, and plant ''guard'' genes. Such hypothesis is compatible with the model proposed by Jones and Dangl [21], as well as with the fact that non-host resistance is constituted of multilayered basal defences that bacteria must overcome to induce disease [91,92].
Our next goal will be to determine by Southern blot hybridization whether T3E genes are present in multiple copies in our strain collection. Indeed, in Pseudomonas syringae pv. tomato strain DC3000, two copies of the hopAM1 gene have been found [18,93]. For T3E genes belonging to the avrBs3/pthA gene family, it is common to find more than 10 copies of these genes in strains of Xanthomonas such as X. axonopodis pv. malvacearum, X. oryzae pv. oryzae or X. oryzae pv. oryzicola [75]. The presence of such multiple copies of T3E genes within T3E repertoires may impact the host range of the strains. It has been reported that, in a given strain, the different avrBs3/pthA gene members do not contribute equally to pathogenicity: only a few members encode major virulence determinants, whereas other members are potential reservoir genes providing sources for rapid evolution and adaptation in the event of host recognition [75].
However, one should keep in mind that, although T3E repertoires of plant pathogenic bacteria probably have a strong impact on their host range, other molecular determinants are also likely involved in host specificity as well as in tissue specificity. In particular, early interactions such as host perception may also greatly impact host range in natural conditions. The importance of phenomena such as chemotaxis in the interactions between plant-associated bacteria and their hosts has been widely documented. In the case of the plant pathogen Ralstonia solanacearum, for example, a chemotactic mutant is not able to colonize its host when inoculated in the soil, whereas it retains full pathogenicity when infiltrated directly into the plant tissues [94]. Hemagglutinin-related proteins, which appeared variable among Ralstonia solanacearum strains, are molecular determinants that could account for host specificity [12]. Furthermore, a recent comparative analysis of eight Xanthomonas genomes revealed that host and tissue specificity may result from subtle changes in a small number of individual genes in the gum, hrp, xps, xcs or rpf clusters and from differences among regulatory targets, secretory substrates or genes for environmental sensing [38]. Analysis of amino acid residues revealed hpaA and xpsD as candidate determinants of tissue specificity in Xanthomonas [38]. Since our study did not reveal a correlation between T3E genes and tissue specificity, further sequencing of T3E genes and analysis of the polymorphisms of T3E gene products are now required to identify new candidate determinants of tissue specificity.
Our results, which show a correspondence between composition of T3E repertoires and pathovars of Xanthomonas, do support the hypothesis that T3Es can affect host range in Xanthomonas. Nevertheless, our approach based on PCR and dot-blot hybridization methods is not sufficient to unequivocally consider that repertoires of T3Es determine host specificity in Xanthomonas axonopodis pathovars. To support the ''repertoire for repertoire'' hypothesis, we now plan to point our work towards functional studies based on our results.
Typing the T3E repertoires of plant pathogenic bacteria may provide clues for functional studies on host specificity and insight into understanding the redundancy between T3Es
Repertoires of T3Es represent candidate determinants of host specificity of plant pathogenic bacteria, since it has been shown that many T3Es can act as molecular double agents that betray the pathogen to plant defences in some interactions and suppress host defences in others [11,20,95]. T3Es have been shown to be involved in varietal resistance as well as in non-host resistance, and they are reported to suppress both PTI (PAMP-triggered immunity) and ETI (effector-triggered immunity), the multilayered plant defences that bacteria must overcome to induce disease [21,31,91,92,96,97,98]. Thus, within a T3E repertoire, there is evidence of interplay among T3Es, since they can suppress ETI [96,99] and can make redundant contributions to virulence [25,26]. Moreover, an individual T3E may contribute differently to the outcome of the infection on different hosts [90]. Comparisons of T3E repertoires in Pseudomonas strains led to the conclusion that either different combinations of sequence-unrelated T3Es (or T3E alleles) with redundant functions or a few common T3Es may promote successful pathogenesis by distinct strains on the same hosts [90,93].
Our work provides clues for functional studies that will aim at showing gain or loss of function. For instance, focusing on strains of pathovars vasculorum and manihotis may be an excellent approach, since strains of both pathovars have similar T3E repertoires (Figure 2) and the number of variable T3Es is small enough to set up functional studies for further analysis of the role of T3E repertoires in host specificity. It would be interesting to observe whether the deletion of the variable T3E gene xopB in pv. vasculorum, or the transfer of this T3E gene to pv. manihotis, narrows or enlarges the host range of the strains. The same kind of functional studies might be performed with the three variable T3E genes avrXv3, avrXccA2 and avrRxo1 in pv. manihotis. Another example could be strains CFBP1519 (pv. glycines) and CFBP3530 (pv. aurantifolii). Indeed, these strains are phylogenetically closely related, since they exhibit the same rpoD sequence (Figure 1), and they harbour highly similar T3E repertoires, since only two T3E genes distinguish them: xopC (present in pv. glycines and absent in pv. aurantifolii) and avrXccB (present in pv. aurantifolii and absent in pv. glycines) (Figure 2).
A major pitfall in deciphering the role of T3Es in the pathogenicity of plant pathogenic bacteria is that inactivation of a single T3E has often no detectable effect on pathogenicity. Functional redundancy among T3Es has largely been hypothesized to explain such phenomenon [25,26]. Data provided in this study may help to better select strains for mutating single T3Es and combined T3Es to provide insight into the functional redundancy of T3Es that may have a role in the delineation of the host range of the strains. Several T3E families have been found within Xanthomonas genomes, such as the YopJ/AvrRxv family, the AvrBs3/PthA family, the HopX/AvrPphE family, the HopAE1 family or the PopC family [8,11,64,68]. For instance, to reveal functional overlap between T3Es, one could focus on T3Es that belong to the YopJ/AvrRxv family of cysteine proteases since, in our study, we selected several T3Es (XopJ, AvrRxv, AvrBsT, XccB and AvrXv4) of this family [67,68]. Noel and colleagues reported that a mutation of the T3E XopJ in X. axonopodis pv. vesicatoria strain 85-10 cannot be associated with any decrease in pathogenicity [78]. Genome sequence analysis [64] and results obtained in our study reveal that this strain carries AvrRxv, another cysteine protease of the same family that may partially complement an inactivation of XopJ. Furthermore, we show in the present study that both xopJ and avrRxv are frequently associated in the species X. axonopodis (Figure 4) even though they are not genetically linked [64]. It would now be interesting to construct a double mutant by deleting both xopJ and avrRxv in a Xanthomonas axonopodis strain in order to provide insight into the functional redundancy of these T3Es. Similarly, we constructed an avrBsT mutant in strain CFBP4834 of X. axonopodis pv. phaseoli. No phenotype could be associated to the mutation [our unpublished data]. 
However, besides AvrBsT, the repertoire of strain CFBP4834 also contains XccB, another cysteine protease of the YopJ/AvrRxv family. We also show that there is a high frequency of association between AvrBsT and XccB in the species X. axonopodis (Figure 4). However, some strains of the pathovar mangiferaeindicae or of the pathovar alfalfae display only one cysteine protease of the YopJ/AvrRxv family. Selecting one of these strains may ease functional studies on T3Es of the YopJ/AvrRxv family of cysteine proteases in plant pathogenic bacteria. In order to carry out functional comparisons of T3Es, one could also use gene ontology annotations, which do not depend only on sequence similarities [100][101]. Such an approach may highlight shared and divergent pathogenic strategies of T3Es deployed by the various pathovars of X. axonopodis. Our results will then help in the determination of redundant-effector groups (REGs) in Xanthomonas strains, as has been done recently in Pseudomonas syringae pv. tomato strain DC3000 [26]. These authors clearly demonstrated that plant pathogenic bacteria have evolved the capacity to deliver into plant cells T3Es with very little sequence similarity that are redundant in function [26]. Another example of sequence-unrelated T3Es that function in the same plant defense pathway is provided by AvrRpm1, AvrRpt2 and AvrB, which are not recognized by the same resistance genes but all target the Arabidopsis RIN4 protein [11]. Since these three T3Es only rarely co-occur in Pseudomonas syringae strains, this suggests that convergent evolution is driven by the need to manipulate particular host proteins [11]. Finally, elucidation of functional overlaps between T3Es should help us understand how the diverse T3Es in a repertoire may function as a system in plant hosts and may shape the host range of the strains.

Pathoadaptation of X. axonopodis strains is suggested by sequence variations revealed in some T3E genes
In regard to their central role in pathogenicity, T3Es are likely under strong selection pressures imposed by the defence system of the host plant. To escape plant defences, a pathogen may acquire new T3Es by horizontal gene transfer that would suppress defence reactions induced after recognition of the pathogen by the plant [99]. Alternatively, pathoadaptation of bacterial strains may occur through diverse mechanisms (single nucleotide polymorphism, insertion, deletion, or loss of a given T3E), to avoid being recognized by the host plant [102][103][104][105][106]. In the course of this study, we found DNA rearrangements that suggest pathoadaptation for X. axonopodis strains.
Our distribution study, performed by PCR amplification, allowed us to identify 23 DNA rearrangements within T3E genes. Interestingly, these DNA rearrangements were found only in T3E genes belonging to the accessory genome. If we consider that these variable genes may influence host specificity, the DNA rearrangements identified in some T3E genes might have a significant role in the pathological adaptation of these plant pathogenic bacteria to their hosts. Among the DNA rearrangements identified in the course of this study, there is a deletion within xopF2 of one pathovar aurantifolii strain and a perfect tandem duplication within xopD of one pathovar vesicatoria strain. Interestingly, neither of these DNA rearrangements shifts the reading frame, suggesting that these strains used these strategies to generate modified forms of the XopF2 and XopD proteins to avoid recognition by the plant. Regarding xopD, to our knowledge, this is the first example of a T3E gene, except for genes belonging to the avrBs3/pthA gene family [75], exhibiting a perfect tandem duplication within its nucleotide sequence. It is tempting to speculate that the tandem duplication in the xopD gene may affect the host adaptation of this pathovar vesicatoria strain. Indeed, in Xanthomonas, it has been reported that insertions or deletions in the central part of T3Es belonging to the AvrBs3/PthA family, where tandem duplications reside, alter the host range of the strains [75,107,108].
Another example of pathoadaptive evolution comes from the action of transposable elements. Indeed, in this study we identified several T3E genes that are disrupted by different IS elements. Numerous ISs have previously been found inserted into T3E genes among plant pathogenic bacteria, and some of them were shown to shift plant-pathogen interactions from incompatible to compatible [93,105,109,110,111,112]. In our study, the majority of identified IS elements belong to the IS3 family-IS407 group, whereas the remaining ones belong to the IS5 and IS1595 families. Interestingly, when looking at the flanking sequences of T3E genes in sequenced Xanthomonas genomes [8,64], we only found IS elements that belong to the same three families, again with a large majority of ISs classified within the IS3 family-IS407 group. Furthermore, another IS element (IS476) belonging to the IS3 family-IS407 group has been found in the avrBs1 gene of one X. axonopodis pv. vesicatoria strain [113]. Altogether, these observations suggest that, in Xanthomonas strains, members of these three IS families might play an important role in T3E gene evolution, since these mobile elements may alter T3E gene expression and may be involved in T3E gene mobility as well as in the terminal reassortment process [71,109,114,115,116,117,118]. It is also striking that some IS elements belonging to the IS3 family-IS407 group have been found just downstream of PIP boxes, the binding motif for the transcriptional regulator HrpX [119], in the sequenced genome of X. axonopodis pv. vesicatoria [64]. It is thus tempting to speculate that the transposition of such replicative IS elements [115], and the subsequent inactivation of a given T3E gene, might be co-regulated with the hrp gene cluster. Thus, inactivation of some T3E genes by IS elements might be of importance in host adaptation for plant pathogenic bacteria. To test this hypothesis, it would be interesting to focus, for instance, on avrXv3, since it is altered by IS1595 in all tested pathovar alfalfae strains. We plan to complement these strains with a functional avrXv3 gene in order to determine whether the interaction between the pathovar alfalfae strains and their hosts is modified. It would also be interesting to focus on xopC in pathovar mangiferaeindicae strains, since this gene is altered in strains isolated from Schinus terebinthifolius but not in those isolated from Mangifera indica (Figure 5). Since strains from both hosts of isolation exhibit identical T3E gene repertoires, the functional complementation of xopC might lead to a modification in the host adaptation of these strains.
The finding that multiple T3E genes are affected by DNA rearrangements raises the question of the functionality of these genes within the repertoires. Our approach allowed us to show that several T3E genes are likely inactive since they are disrupted by ISs or exhibit a frameshift mutation leading to a premature stop codon. But, T3E genes may be non-functional for other reasons that we did not challenge by our approach, such as lack of expression or inability to translocate T3E proteins. Schechter and colleagues, in 2006, by using multiple approaches on the T3E repertoire of the Pseudomonas syringae pv. tomato strain DC3000, revealed that 33 T3Es are likely to be active, 12 T3Es are likely to be inactive and 8 T3Es may or may not be produced at functional levels [18]. It will be now important to check the functionality of each T3E gene in the repertoires that may impact the host range of Xanthomonas strains. Knowing whether a T3E gene is active or inactive is of interest for evolutionary studies. Indeed, it is possible that selection pressure for the inactivation of a T3E gene may result from the acquisition of new gene functions in both the host and the bacterium and that loss of function may be an important factor in the evolution of Xanthomonas axonopodis virulence. Loss of gene function may be beneficial to bacterial strains and it is considered to be a contributing factor to the evolution of virulence of many pathogens [104,105,120].
Finally, our results support the hypothesis that T3E repertoires can affect the host range of Xanthomonas strains and that the evolution of T3E repertoires is driven by the need for interactions among T3Es as they co-ordinately disarm multiple layers of plant defenses. Our results also support the hypothesis that the evolution of T3E repertoires is likely driven by exposure to diverse resistance mechanisms in plants. Thus, the evolution and function of T3Es in a repertoire may be influenced by a co-evolutionary arms race between pathogens and hosts [105]. The second hypothesis is supported by the observation that the loss of a T3E, and hence the evolution of a T3E repertoire, can be driven by exposure to the host defence system [121]. Recently, it has been proposed that host defence can accelerate the generation of genomic rearrangements that provide a selective advantage to the pathogen [104]. Therefore, it is tempting to speculate that the numerous DNA rearrangements found in T3E genes from Xanthomonas strains in the course of our study may be the result of exposure to various host plants. In that case, one could speculate that the selection pressure imposed by host defence systems may have driven the inactivation of some T3E genes by insertion of ISs, or the modification of other T3Es by in-frame deletion or perfect tandem duplication. These DNA rearrangements may have had a significant role in the avoidance of host recognition and thus in shaping the host range of the Xanthomonas strains. It has also been reported that exposure of bacterial strains to environmental stress outside the host could drive the horizontal transfer of T3Es from ecologically related plant pathogens, which could lead to the evolution of T3E repertoires and thus of bacterial pathogenicity towards plants [104,105].

Perspectives
Our results provide resources for functional studies on host specificity of plant pathogenic bacteria. Our work will help to select strains to study the role of single or combination of T3E genes in the interaction with plants, as well as for studies aiming at understanding the molecular mechanisms of redundancy between T3Es. Moreover, the discovery of genetic rearrangements in genes encoding T3Es demonstrates the importance of looking at the allelic diversity of T3Es as well as at the expression of these genes. Indeed, impact of genetic rearrangements in T3E genes on host range has recently been well documented [105,106]. Thus, we plan to continue to analyze the allelic diversity of T3Es in our collection of Xanthomonas strains for evolutionary studies. Furthermore, our results strongly suggest that determination of T3E repertoires may be used for identification of Xanthomonas strains at the pathovar level. Thus, we will aim at developing a diagnostic tool for such purpose.
Finally, our study illustrates the importance of distribution analyses of virulence-associated genes using large collections of bacterial strains. This approach can be useful for the identification of candidate determinants of host specificity. For instance, it could be useful to perform such an investigation on large collections of E. coli strains that are pathogenic on poultry or on humans. Indeed, no set of virulence genes has yet been clearly identified that discriminates between avian and human strains [3]. To our knowledge, although the presence or absence of T3SS genes has been analyzed, repertoires of T3Es have not yet been compared. Interestingly, however, as was found for plant pathogenic bacteria, T3E genes appeared as differential genes in SSH between avian and human strains [2].

DNA extraction
Genomic DNA was extracted from all bacterial strains grown overnight at 28°C in YP medium (yeast extract, 7 g/liter; peptone, 7 g/liter) by using the standard hexadecyltrimethylammonium bromide method [122]. DNA quality and quantity were evaluated spectrophotometrically (Nanodrop ND-1000, Nanodrop Technologies).
Table S2 presents the complete list of the 35 T3E genes included in this study. Selected genes comprised those identified from the sequenced genomes of Xanthomonas strains (17 from X. axonopodis pv. vesicatoria strain CFBP5618, 8 from X. campestris pv. campestris strain CFBP5241 and 7 from X. axonopodis pv. citri strain 306) [8,64]. We also selected 3 avirulence genes from other X. axonopodis pv. vesicatoria strains whose genomes have not been sequenced: avrBsT from strain 75-3 [123] and avrXv3 and avrXv4 from strain 91-118 [124,125]. Some of the selected T3E genes are members of the defined T3E families in bacterial pathogens, such as the AvrRxv/YopJ (C55) family of cysteine proteases, the AvrBs3/PthA family of transcriptional activators, the PopC family of leucine-rich repeat proteins, the HopAE1 family and the HopX/AvrPphE family (Table S2) [8,11,64,68]. The other selected T3E genes have unknown functions to date. It is important to note that in the present study we tried to be as exhaustive as possible, since we selected T3E genes not from only one but from 5 different Xanthomonas strains that belong to diverse species and pathovars. We followed this approach to minimize the unavoidable bias of this kind of analysis; indeed, it is very likely that unidentified T3E genes reside in Xanthomonas strains whose genomes have not yet been sequenced.

PCR amplifications
Two complementary approaches were undertaken to characterize the T3E repertoires of our collection of strains: PCR and dot blot hybridization. The presence or absence of an ortholog of each selected T3E gene was first determined by PCR. All X. axonopodis strains were first subjected to PCR analysis using specific T3E primers. Pairs of primers (Table S2) were designed from the DNA sequences of the selected T3E genes available in databases. All of these primer pairs allowed the amplification of the full-length T3E DNA sequence, except for the avrBs2 and avrBs3 genes, for which only partial DNA sequences were amplified (Table S2). The Xanthomonas strains from which T3E genes were selected were used as positive or negative controls for all PCR experiments. PCR amplifications were carried out in a 20 µl reaction mixture containing 1× Go Taq Buffer (Promega), 200 µM dNTP, 0.5 µM of each primer, 0.4 U of Go Taq Polymerase (final concentrations) and 1 ng of template genomic DNA. The amplification conditions using the T3E primers were 2 min of initial denaturation at 94°C, followed by 30 cycles of 94°C for 1 min, 60°C for 1 min and 72°C for 2 min, with a final extension of 10 min at 72°C. A reaction was considered positive if a single clear band of the expected size was detected on agarose gels. When a single band of an unexpected size was observed, the amplified PCR product was recovered from the gel and then sequenced (see below).
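As a small worked example, the cycling program described above can be written down as data and its programmed hold time summed. The class name `PcrProgram` and its fields are hypothetical, and the total ignores the ramp times between temperatures, which depend on the thermocycler.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PcrProgram:
    initial_denaturation_min: float
    cycle_steps_min: Tuple[float, ...]  # hold times of the steps within one cycle
    n_cycles: int
    final_extension_min: float

    def total_minutes(self) -> float:
        """Sum of programmed hold times (ramping between temperatures is ignored)."""
        return (
            self.initial_denaturation_min
            + self.n_cycles * sum(self.cycle_steps_min)
            + self.final_extension_min
        )


# T3E amplification program from the text: 2 min at 94°C, then 30 cycles of
# 1 min at 94°C, 1 min at 60°C and 2 min at 72°C, then 10 min at 72°C.
t3e_program = PcrProgram(2, (1, 1, 2), 30, 10)
print(t3e_program.total_minutes())  # 132
```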

Dot blot hybridizations
As sequence variation may occur between T3E orthologs, thus preventing annealing of the PCR primers used, presence or absence of an ortholog was then confirmed by nucleic acid hybridization. For each dot blot hybridization experiment, we included, as negative and positive controls, water, E. coli strain DH5α and Xanthomonas strains whose genome has been sequenced (see above for strain details).
Hybridization probes were obtained by PCR amplification of the selected T3E genes from the sequenced genomes of Xanthomonas strains using the specific primers listed in Table S2. Probes were labelled using the PCR Digoxigenin (DIG) labelling mix (Roche Applied Science, France). PCR reactions contained 200 µM dNTP-DIG, and the other components as above. PCR for the preparation of DIG-labelled DNA probes was performed in a thermocycler programmed for denaturation at 94°C for 2 min, followed by 30 cycles of 94°C for 1 min, 60°C for 1 min and 72°C for 2 min, and finally 72°C for 10 min. The PCR products were purified by using the Nucleospin® Extract II kit (Macherey-Nagel, Hoerdt, France).
Genomic DNA (250 ng) of each strain was transferred to Biodyne® N+ membranes (Pall Gelman Laboratory). DNA samples were randomised on the membranes. Prehybridization, hybridization and detection were carried out by using the DIG Labelling and Detection Starter kit (Roche Applied Science, France) according to the manufacturer's instructions. Hybridizations were performed overnight at 42°C. To ensure high stringency, membranes were washed twice for 15 min in 2× SSC and 0.1% SDS buffer and twice in 0.1× SSC and 0.1% SDS buffer at 68°C. Hybridization signals were detected using the Fab fragments of an anti 1-2 digoxigenin antibody conjugated with alkaline phosphatase (Anti-DIG-AP) and Nitro-Blue Tetrazolium Chloride/5-Bromo-4-Chloro-3'-Indolyphosphate p-Toluidine Salt (NBT/BCIP). A subset of hybridization experiments was replicated twice to assess the reproducibility of the dot blot results. Furthermore, we assessed the robustness of our approach by using sequenced Xanthomonas strains. Indeed, for these strains we were able to compare the T3E repertoires obtained by PCR and dot blot hybridization with the T3E repertoires expected from the genome sequence. For each of these strains, the obtained T3E repertoire corresponded to the expected T3E repertoire. This approach, combined with BLAST analyses (http://blast.ncbi.nlm.nih.gov/Blast.cgi), allowed us to determine that the identity at the DNA level between a T3E ortholog on the membrane and the T3E probe had to be at least 71% to give a signal above background.
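The 71% identity threshold can be illustrated with a simplified sketch that computes percent identity between two pre-aligned nucleotide sequences and compares it with the threshold. The function names are hypothetical, the example sequences are invented, and the identity is computed over ungapped aligned columns only, which is a simplification and may differ from the identity reported by BLAST.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


DETECTION_THRESHOLD = 71.0  # minimum percent identity reported in the text


def expected_signal(aligned_probe: str, aligned_target: str) -> bool:
    """True if the identity reaches the threshold for a signal above background."""
    return percent_identity(aligned_probe, aligned_target) >= DETECTION_THRESHOLD


# Invented 10-nucleotide sequences, for illustration only.
print(percent_identity("ATGCATGCAT", "ATGCATGCTT"))  # 90.0
print(expected_signal("ATGCATGCAT", "ATGGTTGCTA"))   # False
```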

rpoD amplification, sequencing and sequence analysis
The phylogenetic analysis of X. axonopodis strains was performed by sequencing the housekeeping gene rpoD (RNA polymerase sigma-70 factor). Primers were designed from the rpoD sequence of the X. campestris pv. campestris strain CFBP 5241 (GenBank accession no. NP639081) (Table S2). PCR amplifications were performed in a total volume of 50 µl using 3 ng of genomic DNA, 200 µM dNTPs, 0.5 µM of each primer, 1.5 mM MgCl2 and 0.4 U of Go Taq polymerase in 1× Colorless Go Taq buffer (Promega). The PCR cycling conditions consisted of an initial denaturation step at 94°C for 5 min, followed by 30 cycles of 94°C for 30 s, 60°C for 60 s and 72°C for 30 s, with a final extension step at 72°C for 7 min. PCR amplicons were then sent to the Ouest Genopole sequencing platform (Nantes, France). Forward and reverse sequences were obtained using the rpoD-specific PCR primers. These sequences were edited and assembled using PREGAP 4 and GAP 4 of the Staden Package [126]. All rpoD sequences were then aligned using BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). All rpoD sequences have been deposited in the GenBank data library (accession numbers FJ561596 to FJ561725).

Data analysis
Based on the presence/absence matrix of T3E genes for each of the 132 strains of X. axonopodis, we constructed a dendrogram using Jaccard distances and the Neighbor-Joining method. Bootstrapping was performed with 1,000 replicates to assess the robustness of the dendrogram. The resulting dendrogram was visualised using the PAST 1.81 software (http://folk.uio.no/ohammer/past/).
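As an illustration of the distance step only, the following minimal Python sketch computes pairwise Jaccard distances from presence/absence profiles. It is not the PAST implementation used in this study; the strain names, the toy profiles and the convention for two empty profiles are hypothetical assumptions, and the Neighbor-Joining and bootstrap steps are not shown.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two presence/absence (1/0) profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)     # genes present in both strains
    either = sum(1 for x, y in zip(a, b) if x or y)    # genes present in at least one
    # Assumption: two profiles with no gene at all are treated as identical.
    return 0.0 if either == 0 else 1.0 - both / either


def distance_matrix(profiles):
    """Return a dict mapping (strain, strain) pairs to Jaccard distances."""
    names = sorted(profiles)
    dist = {(n, n): 0.0 for n in names}
    for p, q in combinations(names, 2):
        d = jaccard_distance(profiles[p], profiles[q])
        dist[(p, q)] = dist[(q, p)] = d
    return dist


# Hypothetical toy data: three strains scored for four T3E genes.
toy_profiles = {
    "strain_A": [1, 1, 0, 1],
    "strain_B": [1, 0, 0, 1],
    "strain_C": [0, 0, 1, 0],
}
toy_dist = distance_matrix(toy_profiles)
```

Shared absences do not contribute to the Jaccard distance, so two strains are not considered similar merely because both lack a given T3E gene.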
Phylogenetic trees based on the rpoD sequences were constructed using the maximum likelihood method. The best-fitting nucleotide substitution model was identified using MODELTEST v.3.7 [62], with the Akaike Information Criterion (AIC) as the model selection criterion. Parameters of the selected model were used for a maximum likelihood heuristic search in PAUP*4.0 beta10 [127]. Node confidence was assessed with 1,000 bootstrap replicates.
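For readers unfamiliar with AIC-based model selection, the criterion is AIC = 2k − 2 ln L, where k is the number of free parameters and ln L the maximised log-likelihood; the model with the lowest AIC is preferred. The short Python sketch below illustrates this rule only. It does not reproduce MODELTEST, and the log-likelihood values are invented for illustration, not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """Return the name of the model with the lowest AIC.

    `fits` maps a model name to a (log_likelihood, n_params) tuple.
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits: (maximised log-likelihood, free substitution parameters).
toy_fits = {
    "JC69": (-2050.0, 0),
    "HKY85": (-2010.0, 4),
    "GTR+G": (-2008.0, 9),
}
selected = best_model(toy_fits)
```

In this toy case the richer GTR+G model has a slightly higher likelihood than HKY85, but the gain does not offset the penalty for its additional parameters.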
Based on the presence/absence matrix of T3E genes, we calculated frequencies of association between T3E genes within X. axonopodis strains. We retained only cases where both T3E genes were present in the same strains. Furthermore, to analyze T3E gene histories, we used the parsimony method as implemented in the Mesquite software package [89].
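One possible reading of the association frequencies described above can be sketched in Python as follows. The sketch assumes that the frequency for a gene pair is the fraction of strains carrying both genes, and it drops pairs that never co-occur; this definition, the gene labels and the toy matrix are assumptions made for illustration, not the exact procedure used in this study.

```python
from itertools import combinations


def cooccurrence_frequencies(matrix, genes):
    """Fraction of strains in which both genes of each pair are present.

    `matrix` is a list of rows (one per strain) of 1/0 values, with columns
    in the same order as `genes`. Pairs that never co-occur are omitted.
    """
    n_strains = len(matrix)
    freqs = {}
    for i, j in combinations(range(len(genes)), 2):
        both = sum(1 for row in matrix if row[i] and row[j])
        if both:  # keep only pairs found together in at least one strain
            freqs[(genes[i], genes[j])] = both / n_strains
    return freqs


# Hypothetical toy data: four strains scored for four T3E genes.
toy_genes = ["t3e_1", "t3e_2", "t3e_3", "t3e_4"]
toy_matrix = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
toy_freqs = cooccurrence_frequencies(toy_matrix, toy_genes)
```

Here t3e_4 is absent from every strain, so no pair involving it is retained.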