Molecular noise of innate immunity shapes bacteria-phage ecologies

Mathematical models have been used successfully at diverse scales of biological organization, ranging from ecology and population dynamics to stochastic reaction events occurring between individual molecules in single cells. Generally, many biological processes unfold across multiple scales, with mutations being the best studied example of how stochasticity at the molecular scale can influence outcomes at the population scale. In many other contexts, however, an analogous link between micro- and macro-scale remains elusive, primarily due to the challenges involved in setting up and analyzing multi-scale models. Here, we employ such a model to investigate how stochasticity propagates from individual biochemical reaction events in the bacterial innate immune system to the ecology of bacteria and bacterial viruses. We show analytically how the dynamics of bacterial populations are shaped by the activities of immunity-conferring enzymes in single cells and how the ecological consequences imply optimal bacterial defense strategies against viruses. Our results suggest that bacterial populations in the presence of viruses can either optimize their initial growth rate or their population size, with the first strategy favoring simple immunity featuring a single restriction modification system and the second strategy favoring complex bacterial innate immunity featuring several simultaneously active restriction modification systems.

In this section, we provide a more detailed description of self-restriction in single cells and demonstrate how a Markovian growth model for the population follows naturally under some mild assumptions.
Let N S be the number of restriction sites on the DNA (i.e. the number of binding sites for R and M ). Each of these sites can either be unmethylated (can be restricted or methylated), hemimethylated (can be further methylated), or double methylated (can be neither restricted nor methylated). In the following, we will denote the number of unmethylated, hemimethylated, and double methylated sites by S 0 , S 1 and S 2 , respectively. Assuming a well mixed reaction network and that it is equally likely that a molecule of M methylates an unmethylated or a hemimethylated site, it follows that the rates of restriction and methylation events are proportional to the number of available sites. Furthermore, we assume that no DNA-repair takes place and that any restriction event is lethal for the cell. With these mechanisms, the underlying stochastic process would be a finite state Markov chain with two absorbing states (all restriction sites double methylated or cell death) and more or less equivalent to the model that was used in [1] for calculating the probability of phage escape. However, in a growing cell population the DNA is continuously replicated with a speed that depends on the growth rate of the population and newly synthesized DNA is unmethylated. Given the double stranded nature of DNA and the mechanism of replication, when a double methylated restriction site gets replicated it will split into two hemimethylated sites, a hemimethylated site will split into a hemimethylated and an unmethylated site, and an unmethylated site will split into two unmethylated sites. This will double the total number of sites over the cell cycle. When the cell divides, daughter and mother will each receive N S of the 2N S sites. In order to not have to explicitly incorporate the cell cycle into the model, we average this over the cell cycle and assume that when a site splits into two sites in replication, with equal probability one of the new sites is kept by the cell whereas the other site is reserved for a future daughter cell and cannot cause self-restriction in the mother. This means that from the perspective of the mother cell, replication of a double methylated site converts the site into a hemimethylated site, a hemimethylated site gets converted to an unmethylated site with probability 1/2 or stays hemimethylated with probability 1/2, and an unmethylated site stays unmethylated. Ignoring events that do not change the state of the system, we obtain the following model: where D stands for a single absorbing death state that stops the entire process. This stochastic process is a finite state Markov chain with N 3 S + 1 states. Using the conservation law S 0 + S 1 + S 2 = N S , the number of states that are actually needed to fully describe the chain is ((N S + 1) 2 + N S + 1)/2 + 1, which means that the underlying master equation is a linear system of ((N S + 1) 2 + N S + 1)/2 + 1 ordinary differential equations. Unfortunately, the total number of sites N S typically ranges somewhere between 500 − 1000, which means that determining the solution of the master equation would require us to compute the matrix exponential of a matrix whose size tends to be in the hundreds of thousands, which is difficult even if we only require a numerical solution. For a single site, however, we can set N S = 1 and it is straightforward to calculate with this model whose master equation is then given by the simple system of ordinary differential equations and p(t) := [p D (t) p 0 (t) p 1 (t) p 2 (t)] with p D (t) := P (D(t) = 1) and p 0 (t), p 1 (t) and p 2 (t) as the probabilities for the site to be unmethylated, hemimethylated, or double methylated. Waiting times until self-restriction, such as displayed in Figure 2 in the main paper, can be readily derived from this equation using standard theory for finite state Markov chains [2].
In general, it becomes more difficult to obtain similar results when N S is large, but we can make use of the fact that conditional on the chain not having reached the death state D the reaction network is a monomolecular reaction network whose master equation admits an analytical solution [3], which is a consequence of the implicit assumption that all restriction sites are equivalent and independent. With this in mind, and denoting by S(t) = [S 0 (t) S 1 (t) S 2 (t)] the unconditional chain, we can define the process X(t) = S(t) | {D(t) = 0} and easily deduce that for all times t its distribution is a multinomial distribution where p c,0 (t), p c,1 (t) and p c,2 (t) are the probabilities for a single site to be unmethylated, hemimethylated, or double methylated conditional on it not having been cut, respectively.
This allows us to analytically calculate with the full single cell model conditional on restriction not having happened in the cell, which turns out to be convenient for understanding self-restriction in growing populations where cells that cut themselves are removed from the population such that the internal methylation configurations of the still alive cells are determined by these conditional methylation-replication dynamics.
Assuming that the dynamics in all cells are equilibrated, we can deduce that the internal methylation configuration of any randomly chosen cell at any time in a growing population follows the stationary distribution of X(t), also known as the quasi stationary distribution of the unconditional absorbing Markov chain S(t).
To obtain this quasi stationary distribution, and also the waiting time until self-restriction of cells in growing populations, we require the quasi stationary probabilities p Q,0 , p Q,1 , p Q,2 for single sites. Writing the matrix A from Eq.(2) as we can obtain the quasi stationary distribution p QSD = [p Q,0 p Q,1 p Q,2 ] as the normalized right eigenvector of the transpose C of C corresponding to the largest eigenvalue [4], i.e. as the solution of where γ 1 is the largest eigenvalues of C. Due to the properties of C, γ 1 is unique and guaranteed to be real and strictly negative if all reaction rates are positive λ, r, m > 0 (follows from the Perron-Frobenius theorem).
From a linear systems theory perspective the quasi stationary distribution can be understood as the distribution that equilibrates the system in all directions except the one leading into the absorbing state. A consequence is that if we use p QSD (and p D (0) = 0) as initial condition for the single site model, the solution of the master equation in Eq.
(2) will evolve along a line in four dimensional space given by 1 − p Q towards the steady state given by [1 0 0 0] which corresponds to probability one of the site being cut. These one-dimensional linear dynamics are the reason why the distribution of the waiting time τ for a single site to be cut has only one phase and reduces to an exponential distribution τ ∼ Exp(−γ 1 ) as stated in the main paper. The waiting time τ S until self-restriction of the cell is then obtained as the minimum of i.i.d. exponentially distributed random variables and is therefore also exponentially distributed τ S ∼ Exp(−γ 1 N S ). The result that this waiting time follows an exponential distribution implies that growth of a self-restricting population can be modeled as a Markovian birth-death process, and that it makes sense to reason about a rate of self-restriction µ(r, m, λ) of a growing population.
In the main paper, we claimed that the self-restriction rate µ(r, m, λ) converges to zero as the enzyme activities converge to infinity, i.e. that lim r,m →∞ (µ(r, m, λ)) = 0. It is important to realize that for this to be true it is necessary but not sufficient that the probability that unmethylated restriction sites are ever available converges to zero. To validate this result, one needs to study the largest eigenvalue γ 1 = γ 1 (r, m, λ) of C as a function of r and m and take the limit of infinite enzyme activities. We tested this numerically by calculating γ 1 (r, m, λ) for growing values of r and m and found that γ 1 (r, m, λ), and hence µ(r, m, λ), indeed converges to zero as r and m become large. Interestingly, this result would not hold if also hemi-methylated restriction sites could be recognized by the restriction endonuclease. This can be seen by adding a restriction reaction for hemi-methylated sites to the model, and verifying that the largest eigenvalue of the corresponding matrix C does not converge to zero as r and m converge to infinity, despite the fact that the probability that hemi-or unmethylated restriction sites are ever available converges to zero. We conclude that bacteria can only prevent both selfrestriction and phage escape at the same time because two subsequent demethylation steps need to take place to create susceptible sites, while any mechanism in which sites would become susceptible in a single step would retain a non-zero rate of self-restriction µ(r, m, λ) for large enzyme activities. Finally, to supplement the analysis in Figure 3c of the main paper, we investigated numerically how the effective growth rate λ e (r, m, λ) = λ−µ(r, m, λ) growth as a function of λ for a practically realistic scenario with N S = 599 restriction sites and different fixed enzyme activities. The results ( Figure A displays the case m = 77.02/min and r = 624.10/min) suggest that there always exists a value of λ at which the self-restriction rate µ(r, m, λ) starts to grow superlinearly in λ, which implies that increasing λ further only decreases the effective growth rate of the population and that λ e (r, m, λ) will eventually drop to zero as λ increases (albeit this only happens for values of λ that are unrealistically large and not practically achievable for bacteria).

S.2 Stochastic simulation of self-restriction in growing populations
To verify the analytical results, as well as to evaluate the assumptions made in the main paper, we implemented a stochastic simulation algorithm for a growing population that implements our model of self-restriction and explicitly simulates every methylation, demethylation and restriction event in every cell. Growth of the population is represented as a stochastic birth process, i.e. the waiting time for a birth event in a population of size N is exponentially distributed Exp(λN ). When a new cell is born, its initial methylation distribution is drawn from a multinomial distribution (see Eq. If the RM -system is such that the number of restriction sites that it recognizes is relatively large, for instance N S = 599, while at the same time phage escape probability needs to be kept low, then according to our model it is necessary that also the enzyme activities are relatively large, for instance m = 77.02/min and r = 624.10/min for a growth rate λ = 0.017. Smaller values of r and/or m would imply either large selfrestriction rates or phage escape probabilities (or both) since they would not allow to obtain a small number of cutting events at any of the N S = 599 bacterial restriction sites while ensuring at the same time that one of the typically small (e.g. N V = 5) number of restriction sites on the phage is reliably cut. Since the value of m is much larger than the growth rate λ = 0.017/min, which determines the demethylation rate of restriction sites, the QSD corresponding to these values of r and m is such that the probability for sites to be fully methylated is close to one while the probability for sites to be empty is extremely low (in the order of 10 −9 ). Self-restriction happens nevertheless due to the large number of restriction sites as well as the large restriction rates which leads to essentially direct cutting in the (rare) cases when sites become unmethylated. A consequence of this is that the assumption that the methylation distribution of newly born cells follows the QSD is similar to assuming that new cells are initially fully methylated for these parameters. From a biological perspective it is unclear how accurate this assumption is since the precise dynamics of the real methylation-replication process are only poorly understood. Within our stochastic simulation, it is possible to test what consequences different assumptions about the methylation distribution of newly born cells have for growth of the population. For instance, instead of representing the replication of sites as demethylation events, we can explicitly carry out all replication events at division events of cells. This corresponds to assuming that demethylated sites are not generated continuously as in our model in the main paper, but can only be produced in bursts at cell division events and revert back to being fully methylated when the cells are not dividing. Under this assumption, the methylation pattern of cells depends on when the cell has divided for the last time and growing populations will never be homogeneous. Furthermore, the assumption that all restriction sites in a cell are replicated in a single event leads to loss of independence of individual sites, which, for the purpose of a simulation of this model, implies that choosing consistent methylation patterns for the cells that are present at time zero is a non-trivial problem and cannot be done by assuming some distribution for an individual site (e.g. p QSD ) and simply drawing all sites of all cells according to this distribution. Therefore, to obtain statistics for the initial cells that are consistent with the population dynamics, we first run a simulation with arbitrary initial methylation patterns for a time horizon that is sufficiently large to guarantee convergence of all population statistics. We then extract some cells (N 0 = 40 in plots) from the final population and use them as initial cells for a second simulation. The results ( Figure C, panel A) show that this model can lead either to more or to less restriction events depending on the activities r and m of the enzymes. For the discussed parameters λ = 0.017, N S = 599, m = 77.02/min and r = 624.10/min, for which the rate of methylation is more than three orders of magnitude larger than the growth rate, in most cases all restriction sites are re-methylated after division before the cell divides again and the population self-restriction rate is very small (Figure C, panel B).

S.3 Fluctuations in enzyme levels
In the main paper, we assumed that the enzyme activities m and r are the same in all cells. In practice, it may be hard and costly for cells to always keep m and r constant.
In this section, we supplement the results of the main paper by investigating the consequences of fluctuations in enzyme activities. In particular, we study how populations grow if different cells have different enzyme activities. For now, let us assume that (iii) The population is growing much faster than it is self-restricting.
Under these assumptions, it is straightforwards to extend the results of the main paper to the case where m and r follow a distribution over the population and to show that the expected population size n(t) of such a population would follow the growth model This implies that the population self restriction rate can simply be obtained by averaging µ(r, m, λ) over the population distribution p R,M (r, m) of enzyme activities. Similarly, we can calculate a population phage escape probability by averaging p V (r, m) over p R,M (r, m) and ask how these quantities change when the variability in enzyme activities between cells increases. For µ(r, m, λ), we can understand this as imposing the distribution p R,M (r, m) onto Figure 3b in the main paper and averaging the displayed values over this distribution. More or less independently of the precise shape of p R,M (r, m), we find that increasing the marginal variances of p R,M (r, m) increases both the population self-restriction rate and phage escape probability, but that the increase in self restriction is small as long as the mean activity of M is large compared to its variance and the growth rate λ while the increase in phage escape probability is small if the distribution

S.4 Bacteria-phage ecology
In this section, we study alternative ecological scenarios and prove the results in Table  1 in the main paper. While a comprehensive investigation of all plausible ecological scenarios is out of scope of this paper, we do want to stress that the framework for quantifying the cost of immunity that we developed in this paper provides a general and easy to evaluate functional relationship between the single cell details of RM-systems and a population level growth cost. It is therefore straightforward to connect this framework to any standard model [6,7,8,9] of phage ecology and to study how the cost of immunity might affect bacteria-phage dynamics. In this paper, we limit ourselves to studying the consequences of the different efficiency criteria for bacteria colonizing phage dominated environments that are listed in the main paper. Furthermore, we provide simulation results for two simple models, tracking either only bacterial population growth or the ecological dynamics of methylated and non-methylated phages interacting with a bacterial population comprised of cells with and cells without RM-systems.

S.4.1 A minimal model of bacterial population dynamics under constant phage load
To gain some first insights into the dynamics of a bacterial population, n(t), under constant phage load, v, we consider a simplistic model that combines bacterial growth and reduction of the population size through phage infections in a single ordinary differential equation: The first term on the right-hand side describes the growth of the population at an effective rate of λ e (r, m, k, λ) and already accounts for self-restriction as detailed in the main paper. The second term accounts for phage escape events. Here, v is the (unit-free) size of the phage population which we take to be a constant parameter of the environment; phages enter bacterial cells at a rate proportional to vn(t), as prescribed by mass-action kinetics, with a proportionality constant given by the phage adsorption rate, ρ. The rate of successful infections is then given by ρvp V (r, m, k)n(t). We assume that each successful infection event wipes out a fraction, 0 < 1 l < 1, of the total population (i.e., on average, n(t)/l bacteria die following phage escape), yielding the full Eq (6). For simplicity, we assume all infections to be lytic and do not consider the possibility of the phage to lysogenize the host bacteria. In this minimal model, two key quantities summarize the fate of bacterial populations in the presence of bacteriophages: λ e quantifies the short-term growth rate before rare but potentially catastrophic phage escape events are likely to occur, while the long-term cost of phage escape can be quantified through the steady state population size to which the population can maximally grow for given v and a given configuration of the RM-system (see Figure F, panel (a)). It can be shown that the one biologically-relevant fixed point of the growth dynamics that represents the steady-state bacterial population size is given by where division by l in the definition of n s (r, m, k, λ) is incorporated here to point out that the steady state of this model is the same as the expression for the "expected growth until phage escape" that we found in Table 1 in the main paper up to a constant that does not depend in any way on the single-cell parameters r, m, and the number of concurrently active RM-systems, k. We can conclude that model Eq.(6) suggests to measure bacterial performance in the same way as scenario (i) in the main paper, and that analyzing trade-offs according to this model would lead to the same results as the main paper but additionally allows to interpret points on the provided Pareto fronts as possible trajectories of bacterial growth dynamics. To illustrate this, Figure F, panel (b), shows growth curves in which λ e is fixed to half the base growth rate λ and the RM-systems are optimized to maximize n s (r, m, k, λ) given this λ e for different number of concurrently active RM-systems, k. It can be seen that k opt = 2 is the optimal number of RM-systems to grow at the this rate λ e .

S.4.2 A coupled model of bacteria-phage ecology
To obtain a plausible model of the coupled ecological dynamics of phages interacting with bacteria carrying RM-systems, we start from the model in reference [6] where Campbell considered a simple phage predator model in a chemostat. In particular, Campbell's model comprises the phage concentration v(t) and the bacterial concentration n(t) evolving according to the following equations: where a is the flow rate of the chemostat, ρ is the phage adsorption rate, k I the rate of spontaneous phage inactivation, N the number of phages that are produced when a bacterium is infected. The growth law is assumed to be logistic with growth rate λ and maximal level L. Note that to be in line with Campbell's model, n(t) and v(t) are defined as concentrations in this section while elsewhere in the paper population sizes are generally treated as unit-free quantities. Consequently, also the parameters in this section have different units than in the rest of the paper. Furthermore, we point out that the original formulation of Campbell's model included a delay explicitly modeling the time that it takes from infection of a cell to lysis and the release of new phages. However, using such a delay in a chemostat model would require one to also take into account that some infected cells are washed out of the chemostat before they are lysed and release phage [7]. To keep our models easily understandable for readers not familiar with this literature, we have decided to omit the delay in our simulations. Eq.(8) captures the dynamics of phages interacting with bacteria that do not have a RM-system and has the property that, depending on the model parameters, the phage either dies out when it grows too slowly compared to the flow rate and the rate of phage inactivation or that it grows sufficiently fast and keeps the bacteria at a very low level [6]. To expand this model with bacteria that contain RM-systems, we need to add (at least) two additional species to it: bacteria that have RM-systems n r (t) and phages v r (t) whose restrictions sites are already methylated because they have been produced as the result of a phage escape event on RM + -bacteria. If methylated or normal phages infect bacteria that do not have a RM-system, the newly produced phages are unmethylated because the methyl-transferase is not present in the cell in which the new phages are produced. Overall, our expanded model is comprised of the following equations: (9) where all parameters have the same meaning as in Campbell's model and p V = p V (r, m, k) and λ e = λ e (r, m, k) are phage escape probability and effective population growth rate of RM + bacteria as defined in the main paper. For the sake of simplicity, we assume that bacteria have either no RM-system or that they have all k RM-systems (or alternatively that k = 1) since otherwise additional bacterial and phage species would need to be tracked in the model. For a first simulation of this model, we assumed that λ = 0.017min −1 , ρ = 0.001ml/min, k I = 0.0001min −1 , a = 0.005min −1 , L = 1/ml, N = 100, and that the RM-systems are not very costly but also not very efficient with p V = 10 −3 and λ e = 0.016min −1 . Figure G shows short and long-term dynamics of the system starting from an initial condition where bacterial and phage concentrations are low and methylated phages are completely absent. It can be seen that initially, due to the low phage concentration, both RM − and RM + bacteria grow almost unaffected by phages. Phages, on the other hand, do not grow but are maintained in the environment due to a small number of infections on RM − bacteria that approximately balance the flow rate of the chemostat and phage inactivation. At some point, however, when the population of RM − bacteria has grown to a sufficiently large size, the production of new phages surpasses the loss of phages. This leads to an increasing phage population and more infections, and consequently causes the concentration of RM − bacteria to drop quickly. With less RM − bacteria to prey upon also the phage population plummets again. However, the increased number of phages leads to phage escape events on RM + bacteria and the production of methylated phages. These methylated phages then start to prey upon the RM + bacteria leading to prey-predator damped oscillatory dynamics between these two populations. On a long time horizon the system equilibrates to a steady state where the environment contains mostly methylated phages while both RM − and RM + bacteria are maintained at low levels ( Figure G, panel b). To see whether more efficient RM-systems allow bacteria to defend themselves more successfully, we performed another simulation of the model assuming p V = 10 −10 and λ e = 0.016min −1 . Figure H shows that the dynamics of the system remain overall similar with the difference that the reduced phage escape probability delays the time when the population of methylated phages starts to increase. It does not, however, prevent methylated phages from eventually taking over and the steady state population sizes of bacteria remain at equally low levels as before. Decreasing the phage escape probability even further does not change anything about this result. To back up this result, assuming the same RM-system characteristics as in the main paper (N S = 599, N V = 5, c = 3.744 · 10 −7 ) and k = 1, we calculated effective population growth rate λ e (r, m, k) and phage escape probability p V (r, m, k) for 1.25 million parameter combinations of r and m (on a 2-dimensional grid up to r = 50000min −1 and m = 200min −1 ) according to the equations provided in the main paper. For each parameter combination, we used a numerical solver to approximately calculate the steady state of the model in Eq. (9) and found that in all cases the steady state concentration of RM + bacteria remains below 0.11/ml, that is below approximately 10% of the maximally possible concentration L in the logistic growth model. We conclude that, at least according to the considered deterministic model in which phages cannot go completely extinct, RM-systems can only provide transient resistance against phages and that the efficiency of RM-systems would have to be quantified as the duration of this transient resistance, similar to the scenarios studied in Table 1 in the main paper.  Table 1 of the main paper Theorem. Consider a bacterial population n(t) of initial size n(0) = n 0 growing exponentially at rate λ e according to n(t) = e λet n 0 in a well-stirred environment containing a number of phages v that remains constant at all times. Let c mut be the rate of the occurrence of immunity conferring mutations per cell and unit time, if λ e < 0.

Proof:
In a well-stirred environment with constant number of phages v, the rate of phages infecting bacteria is given by ρ · v · n(t) according to mass action kinetics. The rate of successful infections is then δ(t) := ρ v n(t) and the occurrence of successful infections can be regarded as firings of a non-homogeneous Poisson process with rate δ(t). It follows from the theory of non-homogeneous Poisson processes that the probability density function f τp of τ p is given by It follows that the density function can be expressed as To prove (a), writing out the expectation and plugging in Eq. (10) yields where Using n 0 > 0 and ρ v > 0, it follows that and thus which concludes the proof of (a).
To prove (b), writing out the expectation and plugging in Eq. (10) yields Using n 0 > 0 and ρ v > 0, it follows that which concludes the proof of (b) since c mut is just a multiplicative constant.
The theorem proves the results for the efficiency criteria (i) and (iii) in Table 1 of the main paper. To see why P (τ mut < τ p ) = cmut cmut+ρv for criterion (ii), we note that mutation events can be represented firings of an inhomogenous Poisson processes with a rate c mut n(t) that has the same dependence on the population size as the rate ρ v n(t) of successful infection events. As a consequence, the probability that mutation happens before phage escape is simply the same as the probability that a homogenous Poisson process with rate c mut fires before another homogenous Poisson process with rate ρ v , i.e., equal to cmut cmut+ρv .
Corollary 1. If the effective growth rate of the population is positive, the expected growth of the bacterial population until phage escape, the probability that mutation happens before phage escape, and the expected integrated population mutation rate until phage escape are all independent of the initial size of the population. The probability that mutation happens before phage escape and the population mutation rate are furthermore independent of the bacterial growth rate.
Corollary 2. For all three bacterial performance criteria in Table 1 in the main paper, the optimal bacterial defense strategy is constant in time.
Proof: This follows directly from the result that the performance criteria are independent of the initial population size. In particular, if a population is growing and phage escape has not happened up to a certain time t 1 , then the population size will have changed from n(0) to n(t 1 ). If the efficiency of immunity as quantified by the three bacterial performance criteria in Table 1 in the main paper would depend on population size, then the optimal bacterial defense strategy would need to be adjusted for the new population size n(t 1 ). Given that this is not the case, however, we find that there exist a single best defense strategy that does not change when the population is growing.
Since Corollary 2 has interesting biological implications, we will in the following discuss further for case (i) in Table 1 in the main paper how the expected amount by which the bacterial population can grow before phage escape, E[n(τ p )] − n 0 becomes independent of the initial population size n 0 . In fact, as we will demonstrate below one can also show the stronger results that g(τ p ) := n(τ p ) − n 0 follows an exponential distribution with parameter ρv λe that is independent of n 0 , despite the seemingly explicit dependence on n 0 and the fact that the probability density function of τ p does depend on n 0 . To see this, let us first denote the probability density function of g(τ p ) by f g(τp) (n), where we have chosen n as the running variable of the density to indicate that this is a probability density over population sizes. f g(τp) (n) can be obtained from the probability density function of τ p by exploiting the change of variables formula where g −1 is the inverse function of g. Since g(t) = n(t) − n 0 = e λet n 0 − n 0 we obtain its inverse function as g −1 (n) = 1 λ e log n + n 0 n 0 (12) To evaluate Eq.(11), we note that while the second term can be obtained by plugging g −1 (n) into Eq. (10): Plugging Eq.(13) and Eq.(14) into Eq. (11) gives which can be recognized as the probability density function of an exponential distribution with parameter ρv λe and hence mean λe ρv , as already derived earlier. We can conclude that the entire probability distribution of g(τ p ) = n(τ p )−n 0 is independent of n 0 . Correspondingly, for any bacterial performance measure that is derived from this probability distribution, such as the mean in criteria (i) in Table 1 in the main paper, one will obtain the result that the optimal bacterial defense strategy against phages does not depend on the bacterial population size. To visualize this mathematical result graphically, we fixed parameters ρv = 1min −1 , and λ e = 0.016min −1 and p V = 1.6·10 −5 in line with experimental results in [10] obtained for the EcoRI restriction modification system and calculated growth curves and all relevant probability densities using different values for initial population sizes. Figure I, panel (a), shows how two different initial conditions (n 0 = 10 and n 0 = 100) yield populations that require different amounts of time to grow to large sizes, but also how phages tend to escape faster in the case where the population starts at n 0 = 100. Panel (b) then visualizes how these two effects cancel each other out and how the probability distribution of the population increase until phage is the same in both cases. Table 1 of the main paper.

S.4.4 Pareto optimality for efficiency criteria (ii) and (iii) in
In this section, we investigate how the Pareto optimal fronts in Figure 4 of the main paper change when efficiency criteria (ii) or (iii) in Table 1 of the main paper are considered instead of efficiency criterion (i). Accordingly, using the same parameters as in panels c and d in Figure 4 of the main paper, and additionally assuming that the per cell mutation rate is given by c mut = 10 −8 min −1 , we calculated Pareto optimal fronts between λ e (r, m, k) and P (τ mut < τ p ) for criterion (ii) and E τp 0 c mut n(t)dt for criterion (iii). The results are displayed in Figure J. The results for efficiency criterion (iii) are overall relatively similar to the results with criterion (i) of the main paper. This is because the Pareto fronts are mostly shaped by the functional dependency of the criteria on the phage escape probability, which is the same in the two cases. In criterion (ii), on the other hand, the bacterial performance is quantified as P (τ mut < τ p ) = cmut cmut+ρv and the dependence on the phage escape probability is therefore more complex than just plain division through p V (r, m, k) as was the case for the two other criteria. For criterion (ii), we find that the main question is whether the rate of mutation c mut is a larger or smaller than ρ v = ρ · v · p v (r, m, k). If c mut is larger than ρ v , the probability that mutation happens before phage escape is typically close to one. If c mut is smaller the probability is typically close to zero. There also exists an intermediate regime in which c mut and ρ v are about the same size, or at least about the same order of magnitude, such that the probability for mutation to happen before phage escape is substantial but smaller than one. With the exception of the case where cells have only a single RM-system (k = 1), however, this regime is relatively small. Consequently, for any given mutation rate c mut there exists a minimal RM-system protection from phages that is necessary to ensure that P (τ mut < τ p ) ≈ 1 and a tuning of the RM-system(s) that realizes just this minimal protection while ensuring that the effective growth rate λ e is not reduced any more than is necessary. This optimal tuning is, however, a function of the environment given that ρ v = ρ · v · p V (r, m, k) depends on the number of phages v. For the parameter values considered here, we find that k = 3 RM-systems can ensure nearly full protection with the smallest reduction in effective growth rate (λ opt e ≈ 0.015min −1 , see black cross in Figure J, panel (a)).

S.4.5 Optimal regulation of enzyme expression levels
To supplement the analysis in the main paper, we investigated in more detail how the optimal enzyme activities change along the Pareto fronts in Figure 4b in the main paper, that is, how different trade-offs between λ e and n s are realized for different numbers of systems k. The results ( Figure K) suggest that any Pareto-optimal enzyme activities m k,opt and r k,opt are necessarily such that the growth cost λ − λ e is allocated in equal parts to self-restriction and metabolic cost. A consequence is that both the optimal total combined enzyme activity and the optimal total self-restriction rate are fully determined by the specific trade-off and do not depend on k, i.e. that k·c·(m k,opt +r k,opt ) = c·(m 1,opt +r 1,opt ) = µ(r 1,opt , m 1,opt ) = k·µ(r k,opt , m k,opt ) = λ − λ e 2 .
Phrased differently, this result can also be understood as follows: for every trade-off, the metabolic cost parameter c fixes the total allowed enzyme activity to λ−λe 2c . This enzyme activity then needs to be optimally allocated to restriction and modification taking into account the number of RM -systems k.  Figure 4c in the main paper. Additionally, the case k = 5 is shown (yellow) and it is outlined which system is optimal at which effective growth rate (thick black). (c,d) Same as panel a and b but showing the total enzyme activities r opt · k and m opt · k. (e) Optimal total combined activity (m opt + r opt ) · k of restriction and modification enzymes as a function of λ e (r, m, k). (f ) Logarithm of optimal phage escape probabilities p V (r, m, k) as a function of λ e (r, m, k).