Simulating the evolutionary trajectories of metabolic pathways for insect symbionts in the Sodalis genus

Insect-bacterial symbioses are ubiquitous, but there is still much to uncover about how these relationships establish, persist and evolve. The tsetse endosymbiont Sodalis glossinidius displays intriguing metabolic adaptations to its microenvironment, but the process by which this relationship evolved remains to be elucidated. The recent chance discovery of the free-living species of the Sodalis genus, S. praecaptivus, provides a serendipitous starting point from which to investigate the evolution of this symbiosis. Here, we present a flux balance model for S. praecaptivus. Metabolic modelling is used in combination with a multi-objective evolutionary algorithm to explore the trajectories that S. glossinidius may have undertaken after becoming internalised. The time-dependent loss of key genes is shown to influence the evolved populations, providing possible targets for future in vitro genetic manipulation. This method provides an unusually detailed perspective on possible evolutionary trajectories for S. glossinidius in this fundamental process of evolutionary and ecological change.

2 Introduction

[...] biological conditions. The aim was to investigate computationally the route that S. glossinidius may have taken in its transition to symbiosis. It is not known whether the solutions found by S. glossinidius, described in iLF517, are the only possible outcomes given the metabolic constraints of the microenvironment, or whether the symbiont's unusual metabolic network evolved by chance. The application of the MOEA to iRH830 enabled the observation that certain key pseudogenisations may have occurred earlier in the symbiosis than previously thought. The effect of exposing the ancestral Sodalis to contrasting diets was also modelled, mirroring the different trajectories that this genus has taken within blood- and sap-feeding insects. It is hoped that the techniques used here can be applied to other symbiotic systems to drive forward the discovery of novel relationship criteria.

The S. praecaptivus genome was mined for orthologues to metabolic genes in E. coli and S. glossinidius, before compiling into a draft model. An iterative process of testing and gap filling was then performed, using information provided in various databases.
3.2 S. praecaptivus can grow on the unusual sugar alcohol xylitol

Computational models are strengthened when accompanied by experimental verification of their in silico predictions. A series of biochemical screens was therefore conducted using Biolog phenotypic microplates to investigate carbon utilization by S. praecaptivus. In total, 190 metabolites were tested for their ability to act as the sole carbon source for S. praecaptivus. Experiments were conducted in triplicate with full results detailed in Supplementary File 2. Through this phenotypic screen, it was found that S. praecaptivus was able to use 19 of the metabolites tested as a sole source of carbon (Table 1). When these metabolites were tested in silico by exogenous addition to iRH830, it was found that all but two mirrored the in vitro data qualitatively: N-acetyl-D-galactosamine (GalNAc) and xylitol. GalNAc did not produce a viable biomass output, and xylitol was not included as a metabolite in iRH830 initially. Xylitol is not found in any prokaryotic model in the BiGG Models database, hence it had not been considered for inclusion in the construction of iRH830.

Table 1: Positive results from one representative S. praecaptivus phenotypic screen. iRH830, iJO1366, and iLF517 ∆biomass outputs (gr DW (mmol glucose)⁻¹ hr⁻¹) following the addition of these metabolites to models currently not supplied with a sugar. Metabolites in bold initially did not produce a positive biomass output in early iterations of the S. praecaptivus model.

Robustness analysis was used to examine reaction essentiality and therefore redundancy in the iRH830 network. iRH830 was run on a tsetse-specific nutrient limited medium ("famine") and a blood medium simulating the internal tsetse environment and informed by S. glossinidius requirements (Hall et al., 2019) ("blood", Table S1).

This is significantly more than for sap (13 in less than 0.1% of 1888 models) and blood (five in less than 0.1% of 1194 models).
These core non-essential reactions were then analysed by subsystem to assess themes across the different conditions. In blood, over half (eight of 14) of these are secondary [...]

The time-dependent nature of pseudogenisations can therefore be estimated, using the resulting evolutionary trajectories as a guide.

[...] iRH830 was supplied with either 6 mmol gr DW⁻¹ hr⁻¹ GlcNAc and 1 mmol gr DW⁻¹ hr⁻¹ thiamine (henceforth "famine"), a tsetse-specific medium (henceforth "blood", Table S1) [...]

[...] The population is then copied, allowed to mutate and the fitness evaluated again. A new population is selected from the original and copy populations. Green boxes represent the start and final populations, pink boxes represent the iterative process of mutation and selection.
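The copy, mutate, evaluate and select cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: the two objectives (a growth proxy and the number of reactions lost), the loss-only mutation, the dominance-count selection, and the names `WEIGHTS`, `toy_fitness`, `mutate`, `select` and `evolve` are all assumptions made for the example. In practice the growth objective would come from a flux balance simulation of the model rather than from a weighted sum.

```python
import random

# Hypothetical contribution of each of six reactions to a growth proxy.
WEIGHTS = [5, 3, 1, 1, 0, 0]

def toy_fitness(genome):
    """Return (growth proxy, reactions lost); both are maximised (assumed objectives)."""
    growth = sum(w * g for w, g in zip(WEIGHTS, genome))
    return (growth, genome.count(0))

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def mutate(genome, rate, rng):
    """Switch active reactions (1) off with probability `rate`.

    Losses are never reverted here; this is an assumption mirroring gene loss.
    """
    return [0 if g == 1 and rng.random() < rate else g for g in genome]

def select(pool, scores, size):
    """Keep the `size` individuals that are dominated by the fewest others."""
    ranks = [sum(dominates(other, s) for other in scores) for s in scores]
    order = sorted(range(len(pool)), key=lambda i: ranks[i])
    return [pool[i] for i in order[:size]]

def evolve(population, fitness, generations, rate, seed=0):
    """Iterate: copy and mutate, evaluate, then select from original plus copy."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        copies = [mutate(g, rate, rng) for g in population]
        pool = population + copies
        scores = [fitness(g) for g in pool]
        population = select(pool, scores, size)
    return population

# Usage: start from eight identical individuals with every reaction active.
start = [[1] * len(WEIGHTS) for _ in range(8)]
final = evolve(start, toy_fitness, generations=20, rate=0.1)
for genome in final:
    print(genome, toy_fitness(genome))
```

Selecting from the combined original and copy populations means that an unmutated parent survives whenever its mutated copy is strictly worse on every objective, which is one simple way to retain good solutions between iterations.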

Population initiation
Prior to starting an evolutionary run, reactions essential to growth were identified using a single reaction knockout. Essential reactions were defined as those producing a biomass output of less than 1 × 10⁻³ gr DW (mmol glucose)⁻¹ hr⁻¹. Reactions that were identified as essential were not included in the subsequent mutation strategy. [...] or 0 corresponded to the reaction being active or inactive, respectively. This is a proxy for gene loss, where a one-to-one gene-protein-reaction mapping is assumed.
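As a rough illustration of this set-up, the sketch below screens a toy network by single reaction knockout, applies the 1 × 10⁻³ biomass threshold, and builds binary individuals over the remaining reactions. The reaction names, the `toy_growth` function and the choice to start every individual with all reactions active are hypothetical; in the real workflow the biomass value would be obtained by running flux balance analysis on the model with the reaction removed.

```python
ESSENTIAL_THRESHOLD = 1e-3  # biomass cut-off quoted in the text

# Hypothetical four-reaction network: two alternative routes feed biomass.
REACTIONS = ["R_uptake", "R_pathA", "R_pathB", "R_biomass"]

def toy_growth(active):
    """Stand-in for an FBA biomass output given the set of active reactions."""
    has_core = {"R_uptake", "R_biomass"} <= active
    has_route = "R_pathA" in active or "R_pathB" in active
    return 0.5 if has_core and has_route else 0.0

def find_essential(reactions, growth):
    """Knock out each reaction alone; essential if biomass falls below the threshold."""
    essential = set()
    for reaction in reactions:
        remaining = set(reactions) - {reaction}
        if growth(remaining) < ESSENTIAL_THRESHOLD:
            essential.add(reaction)
    return essential

def initial_population(reactions, essential, size):
    """Encode only non-essential reactions: 1 = active, 0 = inactive.

    Starting with every reaction active is an assumption of this sketch.
    """
    mutable = [r for r in reactions if r not in essential]
    population = [[1] * len(mutable) for _ in range(size)]
    return mutable, population

essential = find_essential(REACTIONS, toy_growth)
mutable, population = initial_population(REACTIONS, essential, size=3)
print(sorted(essential))  # reactions excluded from mutation
print(mutable)            # reactions represented in each binary individual
```

In this toy network the two alternative routes are individually non-essential because each can compensate for the other, so both remain open to mutation even though losing both would abolish growth.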