Insights into the metabolic specificities of pathogenic strains from the Ralstonia solanacearum species complex

ABSTRACT All the strains grouped under the species Ralstonia solanacearum represent a species complex responsible for many diseases on agricultural crops throughout the world. The strains have different lifestyles and host range. Here, we investigated whether specific metabolic pathways contribute to strain diversification. To this end, we carried out systematic comparisons on 11 strains representing the diversity of the species complex. We reconstructed the metabolic network of each strain from its genome sequence and looked for the metabolic pathways differentiating the different reconstructed networks and, by extension, the different strains. Finally, we conducted an experimental validation by determining the metabolic profile of each strain with the Biolog technology. Results revealed that the metabolism is conserved between strains, with a core metabolism composed of 82% of the pan-reactome. The three species composing the species complex could be distinguished according to the presence/absence of some metabolic pathways, in particular, one involving salicylic acid degradation. Phenotypic assays revealed that the trophic preferences on organic acids and several amino acids such as glutamine, glutamate, aspartate, and asparagine are conserved between strains. Finally, we generated mutants lacking the quorum-sensing-dependent regulator PhcA in four diverse strains, and we showed that the phcA-dependent trade-off between growth and production of virulence factors is conserved across the R. solanacearum species complex. IMPORTANCE Ralstonia solanacearum is one of the most important threats to plant health worldwide, causing disease on a very large range of agricultural crops such as tomato or potato. Behind the R. solanacearum name are hundreds of strains with different host range and lifestyle, classified into three species. Studying the differences between strains allows to better apprehend the biology of the pathogens and the specificity of some strains. None of the published genomic comparative studies have focused on the metabolism of the strains so far. We developed a new bioinformatic pipeline to build high-quality metabolic networks and used a combination of metabolic modeling and high-throughput phenotypic Biolog microplates to look for the metabolic differences between 11 strains across the three species. Our study revealed that genes encoding enzymes are overall conserved, with few variations between strains. However, more variations were observed when considering substrate usage. These variations probably result from regulation rather than the presence or absence of enzymes in the genome.

These strains collectively constitute a devastating plant pathogen responsible for many diseases such as the bacterial wilt disease of solanaceous plants, the potato brown rot, the Moko disease on banana trees, or the Sumatra disease on clove (2,3). Strains can infect at least 392 hosts in 78 different botanical families (3,4) and are responsible for important economic losses throughout the world (5). The RSSC includes a large diversity of strains with phenotypic characteristics that may be specific for some strains (e.g., adaptation to cool temperatures or insect transmission) and not necessarily related to phylogeny (e.g., host range). Historically, several classification systems have been used to differentiate these strains, either on host range ("race") or metabolic ("biovar") criteria (6), but these systems have been progressively abandoned as not robust enough (7). Based on genomic comparison methods, strains were classified into four phylogenetic groups called phylotypes (I-IV). Each phylotype corresponds roughly to the geographical origin of the strains but is not related to host specificity (8). The three distinct species of RSSC correspond to these phylogenetic groups: R. pseudosolanacearum corresponds to phylotypes I and III, R. solanacearum corresponds to phylotype II, and R. syzygii corre sponds to phylotype IV, the latter being divided into three subspecies (1).
To better apprehend the genomic diversity of the RSSC strains, comparative genomic and proteomic analyses were performed on several strains either study ing the whole genome (2,3,(9)(10)(11)(12)(13)(14) or focusing on specific genes such as type 3 secretion effectors (15), antiphage systems (16), or mobile genetic elements (17,18). In particular, the species complex was shown to have a large pan-genome composed of at least 13,000 distinct genes, and a core genome composed of approximately 3,200 genes (14). The division into phylotypes of the RSSC is also supported by genomic and proteomic evidence (3,14).
Two studies have focused on differences in trophic preferences (i.e., the identification of the metabolic substrates preferentially metabolized by a strain) for some strains (19,20), but no study has attempted to connect metabolic specificities of strains with life traits within the RSSC, an approach that has become possible with the advent of genomic-based methodologies. To date, the metabolic network was studied only in GMI1000 strain (21,22). The metabolic network of R. pseudosolanacearum strain GMI1000 was reconstructed (21), which first provided a global view of the strains' anabolic and catabolic capacities. This study revealed the existence of a metabolic trade-off between virulence functions and bacterial growth, with the cost of producing virulence factors reducing the maximum growth rate of the pathogen. Indeed, it was shown that GMI1000 mutant strain lacking the central regulator PhcA could grow faster and on a wider range of substrates than the wild-type strain (21). PhcA, a central regulator of virulence (23), thus appears also as a major regulator of metabolism. Finally, the mapping of the substrates preferentially metabolized by the GMI1000 strain was carried out, thus allowing the identification of the compounds that sustain fast bacterial growth and were likely to be assimilated in tomato xylem sap (22).
In this manuscript, we present a systemic comparison of metabolism and trophic preferences of 11 strains belonging to all three species of the RSSC. To this end, the metabolic network of the 11 strains was reconstructed from their genome sequence, and trophic preferences of each strain were assessed using Biolog phenotypic microplates in order to establish the major convergences/divergences at this level. We also created phcA mutants in three strains to have a representative mutant of each species in order to establish the extent to which PhcA-dependent regulation impacts metabolism within the species complex and whether the metabolic trade-off between virulence and growth observed in GMI1000 is conserved.

Genome sequencing and annotation
Sequencing was performed as described by Gopalan-Nair et al. (24) using PacBio technology. Library preparation was performed at GeT-PlaGe core facility, INRAE Toulouse, France and SMRT (single molecule real time technology) sequencing at Gentyane core facility, INRAE Clermont-Ferrand, France. The mean reference genome coverage obtained was 227×.
For all the genomes, the Prokka software has been used to infer gene boundaries (25) with these parameters: --cdsrnaolap and --coverage 60. The coding sequences were translated in amino acid sequences. The quality of each genome annotation for completeness and contaminations has been measured with Busco (26), considering Burkholderiales lineage as reference (Fig. S1 at https://github.com/cbaroukh/rssc-meta bolic-networks).

Genome-scale metabolic network reconstruction
For the automatic reconstruction, 10 metabolic models were used as reference ( Table  1). The references were chosen because they are using BiGG ontology (27), they are of high curation quality, they have a phylogenetic proximity to R. solanacearum, and they were pathogens or had a similar lifestyle. Each model was downloaded in the systems biology markup language (SBML) format (28). Metabolic pathways information was added using BiGG database and ad hoc scripts. Reaction and metabolite identifiers were standardized, based mainly on their formula, to avoid redundancies. For instance, two reactions involving exactly the same participants are identified by the same identifier.
We built a pipeline called Meroom (MEtabolic Reconstruction from Orthology and Ordered Metabolic models) to build the metabolic network of each R. solanacearum strain studied. The Autograph method was used as starting point to infer the meta bolic reactions (29). Autograph builds a new target metabolic network using only one reference network and performing orthology associations between the target and the reference genes. Gene-reaction associations are propagated from the reference to the target. Meroom improved the Autograph method by (i) allowing several references at the same time and (ii) allowing the reconstruction of several metabolic networks at the same time. In addition, Meroom orders the references so that the gene associations found in the first references are privileged over the others in case of conflict or duplication. It also computes the pan-network and comparison matrices for reactions, metabolites, and pathways. Fig. S2 at https://github.com/cbaroukh/rssc-metabolic-networks gives an overview of the complete pipeline of Meroom approach.
Orthology relations between target and reference proteomes have been computed using Orthofinder (30). A 40% of identity has been set as parameter for the diamond execution. Then, orthologies are filtered according to the order of the references. For the first reference, all the orthology associations are kept. For the following references,

Computational simulations
Since some essential reactions for biomass production were missing in the metabolic network reconstructed by Meroom, we added manually these reactions to perform computational simulations. Tryptophan tRNA charging reaction was added because it was missing in the metabolic network of all strains. This was due to the fact that the GPR (gene-protein-reaction link) of this reaction relies only on an RNA sequence and not a protein, thus making it impossible to propagate to other networks, since orthologs were inferred at the protein level. Cobalamin was removed from BDBR229 biomass equation. Glycolaldehyde dehydrogenase was added to all R. syzygii strains as well as K60 and RUN2340. 1,4-Alpha-glucan branching enzyme was added in R24.
Simulations were performed using in-house Python 3.7 scripts. Reactions were parsed from metabolic network tabular files generated by Meroom. This allowed to create a numerical stoichiometric matrix and reversibility constraint vectors for each metabolic network. The linear programming solver CPLEX (Python API), developed by IBM, was used to solve the system and get solutions. All scripts and command lines are available online on GitHub: https://github.com/cbaroukh/rssc-metabolic-networks.

Carbon substrate phenotyping
Phenotyping was performed using Biolog Phenotype Microarray plates PM1, PM2-A, and PM3-B following the manufacturer's protocol. An initial OD of 0.10 (600 nm) was used for inoculation. Incubation time was between 67 h and 192 h, depending on the strain. Three independent replicates were performed.
Growth was assumed proportional to respiration and was assessed by calculating either the area under the curve (AUC) or the maximal intensity achieved (A, average on the top 10 values). Growth was considered effective if A >50. Fast growth was considered if AUC >7,000. The raw Biolog results are available on the GitHub repository (https:// github.com/cbaroukh/rssc-metabolic-networks).

Construction of phcA mutant strains
Disruptions of the phcA gene of the Psi07, CFBP2957, and UW551 receptor strains were created with the pGAΩ plasmid that was previously used to create the phcA mutant in strain GMI1000 (23). pGAΩ carries an insertion of the Ω interposon 255 bp downstream of the phcA start codon. The HindIII-linearized construct was recombined in the genome of recipient strains through natural transformation (31). Competence of recipient strains has been achieved after growth for 48 h in minimal medium (32) supplemented with 2% glycerol. Transformants were selected on medium supplemented with spectinomycin (40 µg/mL −1 ), and the genetic structure of the phcA::Ω recombinant locus was checked by PCR using the primers fw (forward) 5′-GGTACGACAACGAGTGG-3′ and rev (reverse) 5′-TTCATCAGCGAGTTGACCGT-3′ (except for strain CFBP57: rev was 5′-TTCATCAGCGAATT GACCGT-3′).

Metabolic network reconstruction of 11 strains belonging to the three species
To represent the diversity of the RSSC, strains from each species of the species complex were chosen, with more strains R. solanacearum to better apprehend its diversity ( Table  2). The genome of each strain was either taken from literature or sequenced by ourselves using PacBio technology and structurally annotated ( Table 2) using Prokka (25).
We developed an in-house reconstruction algorithm, which we called Meroom to reconstruct automatically the metabolic network of each of the 11 strains (https:// lipm-gitlab.toulouse.inra.fr/LIPM-BIOINFO/meroom-singularity). Briefly, this algorithm uses reference strains whose metabolic networks have either a high curation quality and/or are phylogenetically close to the organism whose metabolic network is to be reconstructed. The first step of the algorithm consists in defining ortholog groups using Orthofinder (30). Then each metabolic reaction linked to an orthogroup is propagated into the metabolic network under reconstruction (Fig. 1). The reference strains are ordered so that in case of conflicts, the reference strain with the highest order of priority is trusted. This allowed to yield draft metabolic networks of high quality, with few false positives and few gaps. The closer the model strains are phylogenetically, the easier it is to reconstruct the metabolic networks to perform flux balance analysis (33). Because the reference strain GMI1000 already had its metabolic network reconstructed (21), it allowed to generate in a straightforward manner the metabolic networks of the other 10 strains. Other bacteria, such as Cupriavidus necator (34) or Escherichia coli (35), were also used as models (see Materials and Methods for more details). The GMI1000 metabolic network was also reconstructed de novo with the Meroom algorithm to avoid any bias in a The Busco quality scores for completeness and contamination of each genome are shown in Fig. S1 at https://github.com/cbaroukh/rssc-metabolic-networksn. The number of proteins was computed from Prokka automatic annotation (25). The novel genome sequences are available in GenBank (Table S1 at https://github.com/cbaroukh/rsscmetabolic-networks). the analysis. The GMI1000 network, therefore, refers to the new reconstructed metabolic network, whereas the model network from Peyraud et al. (21) is referred to as Model GMI1000. All the metabolic networks are available as File S1 in a table format or in SBML format at https://github.com/cbaroukh/rssc-metabolic-networks. The merged metabolism of all strains, i.e., the pan-metabolome, is composed of 2,573 reactions and 2,562 metabolites. The common metabolism of all strains, i.e., the core metabolome, represents a majority of the pan-metabolome, since 2,111 (82%) reactions and 2,251 (88%) metabolites are present in all strains. For comparison, the metabolic network of GMI1000 strain shares 1,149 reactions to the E. coli metabolic network (36). A metabolic network contained on average 2,390 reactions; the smallest one was obtained for BDBR229 (2,315 reactions) and the largest for PSI07 and PSS4 (2,440 reactions), both strains belonging to R. syzygii (Fig. S3 at https://github.com/cbaroukh/rssc-metabolicnetworks). Meroom propagates the complex links between genes and reactions formed by AND (protein complexes) and OR (isoenzymes). In the propagation, an orthologous gene participating in an enzymatic complex may be missing, so the link between genes and reactions is said to be incomplete. These reactions with incomplete gene links represent only 3.1%-7.7% of the reactions associated with a gene in the different networks.

Tracking major metabolic differences among the strains
In order to identify metabolic markers that differentiate the 11 RSSC strains, we generated tables of presence/absence of the metabolic reactions for each strain and performed a clustering analysis and a multiple component analysis (MCA) on these FIG 1 Meroom pipeline to reconstruct automatically draft metabolic networks of high quality. The algorithm relies on reference strains that are trusted for their metabolic network quality and/or are phylogenetically close to the strains under reconstruction. The first step consists in using Orthofinder (30) to determine orthogroups between reference strains and strains under reconstruction. The second step consists in propagating reactions from reference metabolic models using gene-proteins-reactions associations and the orthogroups issued from step 1. The third step consists in merging the metabolic network obtained from each reference strain, for each strain under reconstruction. A more detailed description of the pipeline is available in Fig. S2 at https://github.com/ cbaroukh/rssc-metabolic-networks.
Research Article mSystems tables (Fig. 2). The clustering analysis clustered strains belonging to the same phylotype ( Fig. S4 https://github.com/cbaroukh/rssc-metabolic-networks) and strains belonging to same species. The MCA separated on the first axis (27% of explained variance) strains according to their phylotype, thus discriminating each species of the species complex. The second axis (21% of explained variance) separated strains within their phylotype. Phylotype and species could thus be separated using only the presence/absence of reactions in each metabolic network. From this MCA analysis, we wanted to identify the metabolic pathways that distinguished the four phylotypes (see the list File S2 at https://github.com/cbar oukh/rssc-metabolic-networks). Looking into which reaction(s) contributed the most to the first axis, we found that several reactions belonged to the general pathway of benzoate degradation. Most of these reactions appeared in fact to be related to salicylate degradation pathways, indicating that diverse pathways are able to degrade salicylate within the species complex ( Fig. 3; Fig S5 at https://github.com/cbaroukh/rssc-meta bolic-networks). Indeed, all strains possess a 4-aminobenzoate degradation pathway going through 3-oxoadipate, reaching the central core carbon network via succinyl-CoA and acetyl-CoA. All strains also have a salicylate degradation pathway via genti sate, reaching the core metabolic network via fumarate and pyruvate. However, R24, K60, and CFBP2957 have lost some of the genes in the operon (2, 4 genes) coding for this pathway (corresponding to RSc1085-RSc1091 in GMI100). Finally, all strains have a reaction converting salicylate to catechol ( Fig. 3; Fig. S5 at https://github.com/ cbaroukh/rssc-metabolic-networks). Phylotype I and phylotype III strains have another gentisate degradation pathway (RSc1821-1829 in GMI1000, Fig. S5 at https://github.com/ cbaroukh/rssc-metabolic-networks). Phylotype I has a catechol degradation pathway going through 2-oxopent-4-enoate to acetyl-CoA and pyruvate ( Fig. 3; Fig. S5 at https:// github.com/cbaroukh/rssc-metabolic-networks). All phylotype II and IV strains, except R24, have an operon degrading catechol to 3-oxodipate through cis,cis-muconate (Fig. 3, Fig S5 at https://github.com/cbaroukh/rssc-metabolic-networks). In summary, according blue, phylotype IIA strains; green, phylotype IIB strains; purple, phylotype III strains; and orange, phylotype IV strains.
Research Article mSystems to the phylotype, different pathways are used by RSSC strains to degrade salicylate; only R24 appears to have a non-functional degradation pathway of this metabolite.
Other differences contributing to the first axis of the MCA (listed File S2 at https:// github.com/cbaroukh/rssc-metabolic-networks) relied on the presence/absence of several catabolic pathways, rather involving non-central metabolites, or single determi nants in the primary metabolism which appear discriminating between groups of strains. For example, all the phylotype II and IV strains have a second glutamate dehydrogenase, converting glutamate to alpha-ketoglutarate, an extra phosphate import reaction, and a second pathway converting acetate into acetyl-CoA. Phylotype IIB and IV strains possess a gallate degradation pathway encoded in one operon (RSPSI07_04540-RSPSI07_04548). PSI07 (phylotype IV) and phylotype I and III strains have a degradation pathway for sarcosine, an intermediate metabolite for glycine-betaine degradation, and this pathway appears duplicated in phylotype I. Only phylotype I and III strains have a reaction converting N 2 O into N 2 and a reaction repairing di-iron centers in Fe-S proteins, in agreement with previous reports (3,37,38), and phylotype IV strains have a thymine degradation pathway which is absent in other strains.
Looking at the second axis of the MCA, both R24 and BDBR229 miss an enzyme in ketogluconate catabolism, an enzyme involved in nitrate and nitrite import/export, and four enzymes in the catabolism of glycogen. R24 also misses two additional enzymes belonging to the glycogen catabolic pathway and one for glycogen synthesis (1,4 alpha-branching enzyme). Finally, three enzymes involved in galactonate degradation are missing in phylotype II strains. In conclusion, reactions discriminating strains are involved in salicylate degradation pathways, catabolism of specific substrates, reactions involved in secondary metabolism, and some specific reactions involved in primary metabolism. present in all phylotype I and III strains; purple arrow, present in all phylotype I strains. 4hbz, 4-aminobutyrate; 5odhf2a, 5-oxo-4,5-dihydrofuran-2-acetate; 3oxoadp, 3-oxoadipate; succoa, succinyl-coa; accoa, acetyl-coa; ccmuc, cis,cis-muconate; op4en, 2-oxopent-4-enoate; fum, fumarate; pyr, pyruvate. A detailed representation of the diverse metabolic pathways comprising all reactions and all genes implied in each pathway is available in Fig. S5 at https://github.com/ cbaroukh/rssc-metabolic-networks.

Prediction of metabolic pathways critical for biomass growth in each reconstructed network
In order to predict in silico how the different strains were able to achieve growth under standard constraints, we performed flux balance analysis (33) on each reconstructed network, by setting the same constraints previously used for strain GMI1000 (21). Briefly, growth was simulated on L-glutamate, with imposed excretion of putrescine, EPS, ethylene, the diffusible signal molecule 3OHMAME (3-hydroxy fatty acid methyl esterase), and a protein substrate from the type II secretion system, as described by Peyraud et al. (21). Unfortunately, for most strains, the models were unable to yield growth or metabolite excretion, indicating that some essential reactions were missing in the corresponding reconstructed networks. To identify these essential reactions, assumed to be present in the ModelGMI1000, we have computed the systematic addition of each reactions from ModelGMI1000 missing in a given strain and then performed an in silico reaction essentiality test on each of them. This allowed to unravel the missing reactions that were mandatory for growth and metabolites excretion in our FBA models. Detailed results are available in File S3 at https://github.com/cbaroukh/rsscmetabolic-networks. First, BDBR229 missed nine reactions from the cobalamin synthesis pathway (vitamin B12), implying eight enzymes belonging to the same operon. The entire operon has disappeared in BDBR229 (e.g., RSp0614-RSp0628 in GMI1000). However, cobala min is probably non-essential for growth (39) and contributes to a faster growth of the bacteria in media without the presence of cobalamin in the environment (36). In addition, BDBR229 has lost its S-adenosylmethionine decarboxylase (RSp1293 in GMI1000) necessary for synthesizing decarboxylated-S-adenosylmethionine, an intermediate metabolite which allows the synthesis of spermine and spermidine from putrescine. However, these polyamines are not always essential for bacterial growth (40). Only putrescine, in GMI1000, was shown essential (41). Finally, R24 missed a 1,4 alpha glucan branching enzyme, necessary for the synthesis of glycogen. Looking in more details into the operon to which this enzyme belongs (RSp0235-RSp0242), the synteny of the operon is highly conserved in each strain, except for BDBR229 and R24. The operon is implied in glycogen synthesis and degradation. R24 misses seven genes out of the eight genes and BDBR229 four genes. The metabolic networks were modified to perform FBA for all strains by adding essential reactions or modifying the biomass equation. Results obtained also illustrates the high quality of the metabolic networks generated by Meroom since very few reactions (maximum 3) were modified or added in the metabolic network to be able to predict biomass growth and metabolite excretion. Some reactions were present in a strain but absent from Model GMI1000. To know if the presence of these reactions conferred any gain in growth, we performed a similar analysis as for the missing reactions. The flux of each extra reaction was set to zero to see if biomass growth was impacted. Results showed that none of the reactions conferred any significant gains in growth or metabolite excretion on glutamate (File S3 at https:// github.com/cbaroukh/rssc-metabolic-networks).

Prediction of growth robustness of RSSC strains through gene essentiality analysis
With the 11 metabolic networks reconstructed for each strain, we performed in silico a comparative gene essentiality analysis to estimate the growth robustness of each strain when growing on L-glutamate as sole carbon source (results listed in File S4 https://github.com/cbaroukh/rssc-metabolic-networks). Beyond a group of genes (196) that are essential for all strains and belong to the essential central metabolism, this analysis identified 32 genes for which essentiality differed between strains (File S4 https://github.com/cbaroukh/rssc-metabolic-networks). We examined this list in more detail to understand why these genes were predicted to be essential for growth in one strain and not in another, in order to uncover these specificities. The 32 genes were involved in majority in the synthesis of amino acids, or essential cofactors and vitamins such as folate, ubiquinone, or flavin. Some of the genes were essential only in specific phylotypes as, for example, a reaction step in the synthesis of leucine which did not have any associated isoenzyme in phylotype II and IV strains but had one in phylotype I and III strains (RSc1988 and RSp0329).

Trophic preferences among the phylotypes
In parallel to the metabolic network analysis, the carbon and nitrogen trophic prefer ences of the 11 strains (Table 2) were studied using Biolog phenotypic microplates type PM1, PM2-A, and PM3-B. Results confirmed that BDBR229 has a fastidious growth compared with the other strains (2), since 192 h instead of 96 h were necessary to reveal a metabolic activity (Fig. 4A). We developed a script to infer automatically if there was growth (respectively fast growth) and applied it on each substrate and for each strain (cf. Materials and Methods for details, and File S5 at https://github.com/cbaroukh/rsscmetabolic-networksfor detailed results). Overall, there is a noticeable level of variation in metabolic versatility between strains, whether for nitrogen or for carbon sources. At both extremes, strain R24 appears to be able to metabolize twice as many substrates as either strain K60 or BDBR229 (101, 49, or 44, respectively). BDBR229 is an exception with a reduced versatility and only one substrate enabling rapid growth (Table 3), which probably explains the slow growth phenotype of this strain compared with the others. For K60, versatility is also reduced but seems to be compensated by a better ability to metabolize carbon substrates ensuring a rapid growth. All the strains could grow on 11 common carbon sources and 8 nitrogen sources (Table 3; File S5 at https://github.com/ cbaroukh/rssc-metabolic-networks). When considering substrates which could sustain growth for 10 out of the 11 strains studied, we found 14 additional carbon sources and 6 additional nitrogen sources in common. The carbon sources included mainly organic acids and several amino acids but here too there were variations between strains, including strains within the same phylotype. This is particularly visible for certain amino acids (proline, alanine, serine, and threonine) or sugars (sucrose, fructose, and trehalose). We also detected automatically carbon substrates that could support a fast growth in at least 9 strains out of 10 (we excluded BDBR229, which does not have a "fast growth" phenotype). We found that glutamine, glutamate, aspartate, asparagine, fumarate, citrate, and malate could support a fast growth in most of the strains.
We performed a principal component analysis (PCA) on the Biolog profiles of each strain using the maximal intensity (A) reached for each substrate as values characterizing each strain. The PCA could not distinguish the four phylotypes, contrary Carbon sources were divided as sustaining a fast growth or a slow growth. Strains were grouped by phylotypes. The trophic preferences were assessed using Biolog phenotype microplates PM1, PM2-A, and PM3-B and an in-house script, which detect automatically if there is growth or fast growth (see Materials and Methods for details).

Research Article mSystems
to the MCA performed on the metabolic network reconstructions (Fig. S6 at https:// github.com/cbaroukh/rssc-metabolic-networks). Similar results were obtained using the AUC criterion instead of maximal intensity (data not shown). We performed hierarchi cal clustering, using the same table of values characterizing the strains (A and AUC).
Here again, the strains were not clustered according to their phylogenetic relationships ( Fig. S7 at https://github.com/cbaroukh/rssc-metabolic-networks). Only strains from phylotype I clustered together when performing clustering on AUC for PM1 plate only ( Fig. S8 at https://github.com/cbaroukh/rssc-metabolic-networks). Specific markers of the relationship between these strains were the ability to metabolize sorbitol, dulcitol, and mannitol. This finding confirmed previous results that identified a 22-kb region specifically present in phylotype I strains (42) and was associated to the degradation of these three sugar alcohols.

Conserved versus specific impact of the PhcA-dependent regulation on the metabolism of RSSC phylotypes
Since PhcA was shown to regulate metabolism (21), we wanted to determine if this PhcA-mandated regulatory pattern on the pathogen's metabolism was conserved within the RSSC. We, therefore, built phcA mutant strains in three strains representative of each species: CFBP2957 for R. solanacearum (phylotype IIA), UW551 for R. solanacea rum (phylotype IIB), and Psi07 for R. syzygii (phylotype IV). We then performed Biolog phenotype microplates experiments for each of these mutants. In addition, we already had Biolog results for phcA mutant in GMI1000 (21). Results showed that any phcA mutant strain grew faster on more carbon substrates than the wild type and could also grow on a larger number of substrates (Fig. 4B). Overall, strains carrying the phcA mutation acquire a capacity to metabolize substrates ranging from 30% to 37% higher than the wild type except for strain UW551 where this rate is only 13%. This   Phylotype  I  I  III  IIA  IIA  IIA  IIB  IIB  IV  IV  IV a The trophic preferences were assessed using Biolog phenotype microplates PM1 and PM2-A and an in-house script, which detect automatically if there is growth (+) or fast growth (+++, see Materials and Methods for details). The rest of the substrates is available in File S5 at https://github.com/cbaroukh/rssc-metabolic-networks.
behavior reveals that, as in GMI1000 (21), the phcA mutation leads to enlarged metabolic capacities and enhanced growth in representative strains of the four phylotypes and so suggests the occurrence of similar growth/virulence trade-off. It is interesting to note that these effects take place on a wide range of substrates, which again can vary between strains (Table 4). Several carbon sources that are not or poorly metabolized by wild-type strains can support a fast growth for phcA mutants in a majority of strains (proline, histidine, alanine, and gluconate), while other carbon sources appear to be exploited more specifically by the phcA mutant of only a given strain (e.g., sucrose and malonic acid for strain CFBP2957). This observation underlines that a specific PhcA-medi ated regulation may exist in some strains for some specific substrates.

DISCUSSION
In this study, we set up a metabolic network propagation pipeline on RSSC strains using genomic sequences and reference networks, including the manually curated network of GMI1000 strain. We sequenced (or re-sequenced) several strains in order to have at least three strains for each species of the RSSC with high-quality genome sequence. This sample size is still limited and must weigh the generality of the conclusions, but the group of 11 strains used, beyond evolutionary (i.e., phylogenetic) relationship, also covers large phenotypic differences (broad versus narrow host range strains, adaptation to cool temperature, and insect transmission versus root infection) and geographical origin.  Analysis of the 11 reconstructed networks reveals that the metabolic diversity is not so high between these strains, with a reactome (i.e., the set of possible metabolic reactions associated with the genome) comprising on average 2,390 reactions and a core reactome consisting of 82% of the pan-reactome. This is in sharp contrast to the inter-strain comparison at the genomic level as the core genome was estimated to comprise 1,940-2,370 genes (11.6%-17.9% of the pan-genome, respectively), thus showing a high degree of genomic diversity between strains throughout the species complex (3,13,14,(43)(44)(45). The high proportion of the core reactome thus reflects a very high (or even near complete) conservation of the core metabolism in the network of the different strains. Interestingly, on average, the metabolic network encompasses only 32% of the genome of each strain (Table S2 at https://github.com/cbaroukh/rssc-metabolicnetworks), allowing diversity over the remaining 68%. Another point to consider is that most of the metabolic genes involved in secondary metabolism, such as production of toxins or various uncharacterized diffusible molecules beside ralfuranones, ralsolamycin, Hrp-dependent diffusible factors, etc. (46)(47)(48)(49), are also probably involved in adaptation processes and may vary between strains. However, the current state of knowledge of these secondary metabolic processes is relatively poor, thus making it nearly impossible to incorporate them into the reconstruction step of the metabolic networks.
Based on a criterion of presence/absence of reactions in the metabolic network of each studied strain, both the MCA and clustering analyses were congruent with phylogeny, distinguishing correctly in both cases the four phylotypes and three species. Strains from phylotype I and III clustered together, IIA/IIB strains were distinct from each other but clustered together in the phylotype II group, and phylotype IV strains also clustered even if they were the ones with the largest differences between strains ( Fig.  2; Fig. S4 at https://github.com/cbaroukh/rssc-metabolic-networks). This observation is in agreement with the current view of the phylogeny of the RSSC, with phylotypes I and III taxonomically closer together and grouped into the R. pseudosolanacearum novel species and phylotype IV in which a wider diversity is predicted (10,14).

No clear association between metabolic specificities and phenotypic traits but a common preference for organic acids and some amino acids
Apart from metabolic pathways that had already been identified as strain/phylotype specific [metabolism of sugar alcohols (42) and nitrate assimilation (37)], the results point to a range of pathways involved in the catabolism of various "non-core" com pounds (salicylate, sarcosine, gallate, benzoate, galactonate, etc.). Interestingly, gallate and salicylate are found abundantly in plants. Gallate and derivatives, such as methyl gallate or epigallocatechin gallate, are metabolites exhibiting antimicrobial activities (50,51). In particular, it was shown that methyl gallate can inhibit growth of R. solana cearum (51). The fact that some strains of the RSSC possess a degradation pathway suggests that some plants used gallate as an immunity response to R. solanacearum. It can be hypothesized that the pathogen has acquired a gallate degradation pathway to counter the effect of these antimicrobial compounds. In a similar way, salicylate, a plant molecule involved in defence against pathogens, was shown to be degraded by R. solanacearum to protect the bacteria against inhibitory levels upon infection (52). Our study reveals that the degradation of salicylate involves up to four distinct salicylic acid degradation pathways in the RSSC pan-reactome. Some of the salicylate degradation paths were only present in some phylotypes and not others, possibly reflecting a variety of evolutionary solutions to degrade this molecule, probably through selection for more efficient degradation in some strains. Intriguingly, only one strain (R. syzygii R24) appears to be unable to degrade salicylate, which raises questions about the existence of an as yet unidentified alternative pathway or a dependence on its particular lifestyle (insect transmitted and restricted to the clove tree host).
Two of the strains belonging to phylotype IV (BDBR229 and R24) have both very distinct behaviors from the other studied strains. Some metabolic determinants that are widely conserved in the species complex appear to be missing in these strains such as the glycogen degradation pathway, a functional salicylate degradation pathway for R24 or genes for the biosynthesis of the cofactor cobalamin in BDBR229. Biolog phenotyping confirmed that these two strains had atypical metabolic profiles, with BDBR229 having a limited number of substrates that could support growth, and R24, in contrast, being able to grow on the largest number of different substrates. These observations also support the view that BDBR229 has a fastidious growth character, similar to other insect-borne plant pathogens (36,53), but the case of R24, also insect transmitted, remains enigmatic, and it will probably be necessary to obtain other genomes of R. syzygii strains for a better understanding. The rough appearance of the R24 strain on plate may suggest that a "phcA-like" mutation has occurred, either in the wild-type strain or during laboratory manipulation conditions. The hypothesis that R24 is a natural phcA-like mutant would be consistent with the fact that this strain is insect transmitted and thus restricted to the xylem compartment during plant infection, a condition for which we know, thanks to an experimental evolution study that this type of mutation can occur (54).
To a lesser extent, strain K60 (phylotype II) also has characteristics that distinguish it from most other RSSC strains. K60 seems to be able to metabolize a smaller number of substrates than the average of the other strains but with greater efficiency (higher proportion of substrates promoting fast growth, see Table 3). The growth of this strain also appears to be deficient on several sugars (e.g., lacking some transporters such as for sucrose). It has recently been proposed that strain K60 may be representative of a new IIC clade (14), and it is, therefore, unclear at present whether the metabolic behavior of K60 is related to this phylogenetic distinction or not.
Beyond an apparent diversity in the trophic spectrum of the RSSC strains, our study highlights the importance of organic acids (such as citrate, malate, or pyruvate) and amino acids (glutamine, glutamate, aspartate, and asparagine) as a common substrate base supporting efficient bacterial growth in all strains tested (except BDBR229). Amino acids and sometimes organic acids are the main components of xylem sap of plants, with glutamine and asparagine often present in high amount in xylem sap (55). The experimental approach showed the importance of amino acids, such as glutamine and asparagine, in enabling abundant multiplication of strain GMI1000 in tomato xylem (22,56). Amino acids and many organic acids are also present in soil extracts (57). The organic acids probably result not only from the decomposition of organic matter in the soil by bacteria and archaea but also from root exudates (58). Roots also appear to exudate amino acids (58). Interestingly, L-malate was shown to be a chemoattractant of R. solanacearum (59), and more generally, organic acids could play a role in the survival of the bacteria in the soil.

PhcA-mediated regulation reveals the metabolic potential of strains and also distinguishes substrate usage specificities
It is known that PhcA, a master transcriptional regulator in the RSSC, regulates in a direct or indirect manner expression of some metabolic pathways (60). GMI1000 phcA mutant could also grow faster and on more substrates than the wild-type relative (21). We generated phcA mutants in three additional strains, representing each species, including the IIA and IIB phylotype distinction. Characterization of the metabolic profile of these mutants showed that they could also grow faster and on more carbon and nitrogen substrates than wild-type strains, although there is significant variation in the number of additional carbon and nitrogen substrates between strains. Qualitatively, there were also significant variations in the type of substrates metabolized by the different phcA mutants, independent of phylogeny. These observations suggest that: (i) the trade-off between bacterial growth and virulence is globally conserved within the species, (ii) PhcA-mediated regulation may operate in a distinct manner on different metabolic genes depending on the strain and that this regulation introduces a strong differentiation at the metabolic level, probably linked to the necessities of the strains' life style and adaptated to their immediate environment.
The existence of such differentiation could also explain why the PCA analysis of Biolog metabolic profiles does not overlap with phylogeny (unlike the analysis based on the presence/absence of reactions). Indeed, we can see that the presence of a metabolic gene does not necessarily imply its expression and that the regulation that can take place will not depend on the phylogenetic origin of the strain. Another possible explanation to some discrepancies between Biolog profiling data and metabolic network prediction is that the absence of a single gene such as a substrate transporter can lead to a lack of growth even if the metabolic pathway is present (e.g., UW551 misses the trehalose transporter).