Molecular noise shapes bacteria-phage ecologies

Mathematical models have been used successfully at diverse scales of biological organization, ranging from ecology and population dynamics to stochastic reaction events occurring between individual molecules in single cells. Generally, many biological processes unfold across multiple scales, with mutations being the best studied example of how stochasticity at the molecular scale can influence outcomes at the population scale. In many other contexts, however, an analogous link between micro-and macro-scale remains elusive, primarily due to the chal-lenges involved in setting up and analyzing multi-scale models. Here, we employ such a model to investigate how stochasticity propagates from individual biochemical reaction events in the bacterial innate immune system to the ecology of bacteria and bacterial viruses. We show an-alytically how the dynamics of bacterial populations are shaped by the activities of immunity-conferring enzymes in single cells and how the ecological consequences imply optimal bacterial defense strategies against viruses. Our results suggest that bacterial populations in the presence of viruses can either optimize their initial growth rate or their steady state population size, with the first strategy favoring simple and the second strategy favoring complex bacterial innate immunity.


Introduction
One of the major challenges in biology is to understand how interactions between individual molecules shape living organisms and ultimately give rise to emergent behaviors at the level of populations or even ecosystems.
At the very bottom of this hierarchy, inside single cells, interacting biomolecules such as DNA or proteins are often present in small numbers, giving rise to intrinsic stochasticity of individual reaction events [1,2]. As a result, genetically identical organisms occupying identical environments can express dierent phenotypes [3,4] and make dierent decisions when presented with identical environmental cues [5,6]. This molecular noise is known to be the cause of biologically and medically important traits of bacteria such as persistence in response to antibiotics [7,8] and competence during acquisition of heterologous DNA [9]. However, while its causes and consequences are relatively well-studied at the organismal level [10,11,12], how molecular noise propagates to higher scales of biological organization to aect the ecology and evolution of organisms remains mostly unknown [4]. Many ecosystems have been shown to follow surprisingly deterministic trajectories despite the prevalence of stochastic events [13,14], yet these trajectories could themselves be strongly inuenced by molecular noise. Thus, the extent to which ecological interactions are aected by molecular noise, and the extent to which these ecological consequences feed back to reshape individual traits, remain to be explored.
Perhaps the most prevalent biological systems in which molecular noise plays an important role are restrictionmodication (RM) systems [15]. Present in nearly all prokaryotic genomes [16], RM systems are a highly diverse class of genetic elements. They have been shown to play multiple roles in bacteria as well as archaea, including regulation of genetic ux [17] and stabilization of mobile genetic elements [18], but have most frequently been described as primitive innate immune systems due to their ability to protect their hosts from bacterial viruses [19].
When a virus (bacteriophage or phage) infects a bacterium carrying an RM system, the DNA of the phage gets cleaved with a very high probability, thus aborting the infection. With a very small probability, however, the phage can escape and become immune to restriction by that specic RM system through epigenetic modication, leading to its spread and potentially death of the whole bacterial population in absence of alternative mechanisms of phage resistance [20]. Thus, in the context of RM systems, molecular noise occurring at the level of individual bacteria can have profound ecological and evolutionary consequences. Because RM systems are ultimately based on only two very well characterized enzymatic activities (restriction and modication) [21], they represent a simple and tractable biological system in which we can investigate propagation of eects of molecular noise across dierent scales of biological organization.
Here, we mathematically model the action of RM systems from individual molecular events occurring inside a single cell, through individual bacteria competing in a population, to interactions between populations of bacteria and phages in a simple ecological setting, as shown in  may inadvertently lead to full methylation of invading phage DNA in a process known as phage escape from the bacterial immunity, resulting in immediate consequences at the ecological scale (blue shade). Also, stochastic activity of the same enzymes may lead to accidental cutting of bacterial own DNA in a process known as self-restriction, resulting in lowered bacterial growth rate at the population level (red shade). Trade-os between selfrestriction and phage escape at the ecology scale in turn inuence R and M enzyme activities in surviving cells, thereby optimally balancing the eciency and cost of immunity.

Results
Self-restriction in single cells and in growing populations RM -systems consist of two enzymes, a restrictionendonuclease R, that recognizes and cuts specic DNA sequences (restriction sites), and a methyl-transferase M , that recognizes the same DNA sequences and ensures that only invading phage DNA can be cut by the endonuclease while the bacterial DNA remains methylated and protected. However, since chemical reactions occur stochastically, RM -systems can produce errors and fully methylate invading phage DNA before it is cut and degraded (phage escape) [22]. Similarly, it is possible that newly replicated restriction sites on the bacterial DNA, which are originally unmethylated, are accidentally cleaved instead of methylated (self-restriction) [23].
Inside a single cell, the probability of such selfrestriction events depends on the total activity, r, of all restriction enzyme molecules R, the total activity, m, of all methylation enzyme molecules M , as well as the bacterial replication rate λ, since λ determines the rate at which new unmethylated restriction sites are generated.
To investigate how self-restriction depends on these parameters, we model the corresponding biochemical reactions at each individual restriction site on the bacterial DNA with the stochastic reaction network displayed in til the rst self-restriction event in a given celli.e., until that cell's death or substantial reduction in growth rate can be obtained as the time when the rst restriction site is cut, that is as τ S = min i∈{1,...,N S } τ i , for bacterial DNA with N S restriction sites, where τ i , i = 1, . . . , N S are the waiting times for cutting events at individual sites. It can be shown that all τ i follow a phase-type distribution (see [24] and Fig 2b,c): being the initial methylation conguration, i.e., the proportion of restriction sites that are unmethylated (p 0 ), hemi-methylated (p 1 ) and doublymethylated (p 2 ); see SI Appendix Section S.1.  nary dierential equation where λ e (r, m, λ) = λ − µ(r, m, λ) is the eective growth rate and µ(r, m, λ) is the rate of self-restriction, dened as the inverse of the per-cell expected waiting time until shown as a function of λ, the rate at which the cells, and thus the restriction sites, are replicated. In the small r & m regime, self-restriction is so infrequent that it can easily be outgrown by replication (except at very low λ). In the large r & m regime, m is suciently high to keep the restriction sites protected and thus self-restriction is rare, except at extremely large λ, where the green curve falls below the blue curve. In the reference regime, r is too large and m not high enough to protect, so self-restriction can not be outgrown; eective growth thus falls significantly below λ. Our numerical analyses further show that the self-restriction rate µ(r, m, λ) grows faster-thanlinearly with λ (SI Appendix Section S.1), causing the eective population growth to slow down and ultimately drop to zero at high enough λ. Assuming that all restriction sites are identical and independent, the probability of phage escape can be calculated [25] as where N V is the number of restriction sites on the phage DNA. From Eq [5] it is straightforward to see that p V (r, m) is monotonically increasing in m and decreasing in r. One might therefore expect that the balance between avoiding self-restriction that favors high m, Eq [4], and minimizing phage escape that favors low m, Eq [5], would impose a tradeo and thus lead to an optimal value of m. However, this is not the case, because phage escape probability p V (r, m) and the population self-restriction rate µ(r, m, λ) can both approach zero so long as r and m both increase to innity but r does so faster. While mathematically possible, this limit is, however, not biologically relevant: large enzyme expression levels should incur a cost (metabolic or due to toxicity) for the cells [26,27], which we sought to incorporate into our Our model can be generalized to multiple coexisting RM -systems that recognize dierent restriction sites and operate in parallel, as is often observed for bacteria in the wild [16]. This provides increased protection from phages since the phage has to escape all RM -systems to infect successfully. However, multiple RM -systems also imply that the bacteria either have to pay higher expression and self-restriction costs or that they have to rebalance the expression levels of the enzymes such that lower self-restriction rates per RM -system are obtained with the same overall enzyme activity. Allowing bacteria to have multiple RM -systems, but assuming for the sake of simplicity that these systems are all equivalent in terms of enzyme activities and number of recognition sites, we obtain the phage escape probability for k RM - The rst term on the right-hand side describes the growth of the population at an eective rate of λ e (r, m, k, λ) and already accounts for self-restriction.
The second term accounts for phage escape events. Here, v is the phage density which we take to be a constant parameter of the environment; phages enter bacterial cells at a rate proportional to vn(t), as prescribed by massaction kinetics, with a proportionality constant given by the phage adsorption rate, ρ. The rate of successful infections is then given by ρvp V (r, m, k)n(t). We assume that each successful infection event wipes out a fraction, 0 < l < 1, of the total population (i.e., on average, n(t)/l bacteria die following phage escape), yielding the full Eq [7]. For simplicity, we assume all infections to be lytic and do not consider the possibility of the phage lysogenizing the host bacteria.
The unknown parameters ρ, v, and l, enter Eq .
The same quantity, n s , also emerges as relevant in an alternative ecological model where bacteria grow exponentially until the occurrence of the rst phage escape event, which then wipes out the complete population.
This alternative ecology may be representative for bacteria that rarely encounter long-term stable environments (e.g. marine bacteria feeding on small organic particles in the ocean [28]) such that maximizing the number of ospring colonizing new environments may be more important than maintaining high steady-state population size (see SI Appendix S.4.1). In both ecological scenarios, two key quantities summarize the fate of bacterial populations coexisting with bacteriophages: λ e quanties the shortterm growth rate before rare but potentially catastrophic phage escape events are likely to occur, and n s , which quanties the long-term cost of phage escape. Importantly, both λ e and n s depend solely on three single-cell parameters: the restriction rate r, the methylation rate m, and the number of concurrently active RM-systems, k.

Tradeos and optimality in bacterial immunity
Can bacteria tune the single-cell parameters over evolutionary timescales in order to maximize the short-term growth rate, λ e , and the steady-state population size, n s ?
Equations [6,8] assert that these two quantities are necessarily in a tradeo and cannot be maximized simultaneously, which is illustrated in Fig 4a. This tradeo is the rst key result of the section.
With no single optimum possible, we look instead for Pareto-optimal parameter combinations, (r, m, k), i.e., solutions for which λ e cannot be further increased without reducing n s and vice versa [29,30].

Discussion
Despite the ubiquity of RM systems in prokaryotic genomes [16], basic ecological and evolutionary aspects of these otherwise simple genetic elements are poorly understood [19]. Although RM systems have been discovered more than six decades ago due to their ability to protect bacteria from phage [31] and this is often assumed to be their main function [32], only a few experimental studies focused on the ecological and evolutionary dynamics of interactions between RM systems and phage [33,34]. Similarly, eects of RM systems on their host bacteria, such as their cost in individual bacteria due to self-restriction, began to be addressed quantitatively only recently [23,35].
In this work, we bridged these two scales using math- Using this approach, we analytically described the tradeo between the cost and the eciency of immunity conferred by RM systems. The existence of a tradeo was previously indicated by quantitative single-cell experiments with two RM systems isolated from Escherichia coli [23]. We extended the mathematical model of restriction and modication in individual bacteria to a simple ecological setting, where a bacterial population grows in the presence of phages and thus showed that the tradeo between the cost and the eciency of immunity can profoundly aect the resulting population dynamics. As an important consequence of the tradeo between the cost and the eciency of immunity, there exists no optimal pair of R and M enzymatic activities suited for all ecological settings. Instead, we can expect the observed expression levels and enzymatic activities of naturally occurring RM systems to represent adaptations to specic environmental pressures. Such tuning of expression levels towards optimality has previously been directly experimentally shown in dierent molecular systems [26]. The expression levels of both R and M should be readily tunable by mutations in the often complex gene-regulatory regions [36].
While our predictions should be viewed as approximate, our analysis highlights two important conclusions.
First, optimality can make clear and quantitative predictions in the parameter regimes relevant for real strains, and improving the predictions to take into account more relevant biological detail (if needed and known) remains only a technical, rather than conceptual challenge. Second, parameter values measured for an E. coli RM system put optimal solutions into a regime that permits a large variation in the optimal number of RM systems, between one to six, with relatively small changes in the effective growth rate. This observation allows us to advance the following hypothesis: the number of RM-systems in dierent bacterial strains and species is not a historical contingency, but an evolutionary adaptation to dierent ecological niches characterized by dierent typical growth rates. In other words, the tradeo between the cost and the eciency of immunity can be partially alleviated in bacteria employing multiple RM systems. It is therefore interesting to note that many bacterial species carry multiple RM systems and the number of RM systems varies signicantly among bacteria with dierent genome sizes and lifestyles [16,15]. Our results indicate that dierent numbers of RM systems would be optimal in populations under dierent selection pressures (phage predation/resource limitation).
The analytical model presented here makes several simplifying assumptions. First, we consider only interactions between a single species of bacteria and single species of phage. In natural environments, many bacterial and phage species interact and this diversity will certainly impact the resulting ecological end evolutionary dynamics [37,38,33,39]. Second, we assumed the key parameters such as the numbers of restriction sites in bacterial and phage genomes to be constant in time and thus disregarded the long-term evolutionary dynamics. Bioinformatic studies have shown that many bacteria and phage avoid using restriction sites in their genomes [40,41,42]. Restriction site avoidance can represent an adaptive mechanism for increasing the likelihood of escape in phages [41,43] and decreasing the likelihood of self-restriction in bacteria [44,23]. The stochastic nature of RM systems observed at the level of individual cells is thus likely to critically shape the ecological and evolutionary dynamics of interactions between bacteria, RM systems and phage.