Speed variations of bacterial replisomes

Replisomes are multi-protein complexes that replicate genomes with remarkable speed and accuracy. Despite their importance, their dynamics is poorly characterized, especially in vivo. In this paper, we present an approach to infer the replisome dynamics from the DNA abundance distribution measured in a growing bacterial population. Our method is sensitive enough to detect subtle variations of the replisome speed along the genome. As an application, we experimentally measured the DNA abundance distribution in Escherichia coli populations growing at different temperatures using deep sequencing. We find that the average replisome speed increases nearly fivefold between 17 °C and 37 °C. Further, we observe wave-like variations of the replisome speed along the genome. These variations correlate with previously observed variations of the mutation rate, suggesting a common dynamical origin. Our approach has the potential to elucidate replication dynamics in E. coli mutants and in other bacterial species.


Introduction
Every cell must copy its genome in order to reproduce. This task is carried out by large protein complexes called replisomes. Each replisome separates the two DNA strands and synthesizes a complementary copy of each of them, thereby forming a Y-shaped DNA junction called a replication fork. The speed and accuracy of replisomes is impressive (Baker and Bell, 1998). They proceed at several hundreds to one thousand base pairs per second (Pham et al., 2013;Elshenawy et al., 2015), with an inaccuracy of about one mis-incorporated monomer every 10 billion base pairs (Schaaper, 1993). In bacteria, two replisomes initiate replication at a well-defined origin site on the circular genome, progress in opposite directions, and complete replication upon encountering each other in a terminal region.
The initiation and the completion of DNA replication conventionally delimit the three stages of the bacterial cell cycle (Dewachter et al., 2018;Wang and Levin, 2009). The first stage, B, spans the period from cell birth until the initiation of DNA replication. The second stage, C, encompasses the time needed for replication. The last phase, D, begins at the end of DNA replication and concludes with cell division. While it is established that DNA replication and the cell cycle must be coordinated, their precise relation has been a puzzle for decades (Willis and Huang, 2017). A classic study by Cooper and Helmstetter, 1968 finds that, upon modifying the growth rate by changing the nutrient composition in Escherichia coli, the durations of stages C and D remain constant at about 40 min and 20 min, respectively. This means that the replisome speed must be unaffected by the nutrient composition, at least on average. When the cell division time is shorter than one hour, DNA replication is initiated in a previous generation. This implies that, in fast growth conditions, multiple pairs of replisomes simultaneously replicate the same genome (Fossum et al., 2007). Tuning the growth rate by changing the temperature has a radically different effect on bacterial physiology. For example, in vivo (Pierucci, 1972) and in vitro (Yao et al., 2009) studies show that the speed of replisomes is affected in this case.
More precise features of replisome dynamics, such as whether their speed is approximately constant or varies along the genome, are important to determine the location of their encounter point and the distribution of replication errors on the genome (Niccum et al., 2019;Dillon et al., 2018). However, this detailed information is hard to obtain (Pham et al., 2013). One way for inferring it is to measure the DNA abundance distribution, that is the frequency of DNA fragments along the genome in an exponentially growing cell population. In fact, the frequency of these fragments in the population depends on the proportions of synthesizing genomes of different lengths, which in turn depend on the replisome dynamics. Previous studies have used the DNA abundance distribution to understand the functioning of bacterial replication and how different proteins assist completion of DNA replication (Wendel et al., 2014;Wendel et al., 2018;Rudolph et al., 2013;Midgley-Smith et al., 2019;Midgley-Smith et al., 2018). However, these studies focused on qualitative analysis of the observed changes of the DNA distribution in knockout mutants with respect to the wild type, and did not attempt to predict the shape of the distribution using quantitative theoretical models. The DNA abundance distribution has also been used to identify actively growing species in a microbiome (Korem et al., 2015).
In this paper, we introduce a method to infer the replisome dynamics from the DNA abundance distribution. As an application, we experimentally measured the DNA abundance distribution of E. coli growing at different temperatures between 17 o C and 37 o C using high-throughput sequencing. Our approach, combined with our experiments, shows that the average speed of replisomes exhibits an Arrhenius dependence on the temperature, with an almost fivefold variation in the range we considered. Moreover, the precision of our experiments reveals that the speed of replisomes varies along the genome in a seemingly periodic and highly repeatable fashion around this average value. We find that this pattern is highly correlated with previously observed wave-like variations of the single base pair mutation rate along the bacterial genome (Niccum et al., 2019;Dillon et al., 2018). We discuss possible common causes for these two patterns.

Distribution of genome types
We consider a population of bacteria that grow exponentially in a steady environment. Each cell in the growing population can encompass three types of genomes, see Figure 1a and Figure 1b: (i) one template genome, that is, the genome that the cell inherited at its birth. (ii) incomplete genomes, that is, genomes which are being synthesized. (iii) post-replication genomes that will be passed to new cells and become their templates.
In nutrient-rich conditions, bacteria replicate their genome in parallel, so that the numbers of incomplete genomes and post-replication genomes per cell are variable, see Figure 1b. The classic Cooper-Helmstetter model (Cooper and Helmstetter, 1968) describes the dynamics of these genomes in a given cell through generations. We adopt a different approach and focus on the abundance dynamics of the three types of genomes in the whole population. We call N T (t) , N S (t) , N P (t) the total number of template genomes, incomplete (synthesizing) genomes, and post-replication genomes, respectively, that are present in the population at time t . Our first aim is to quantify the relative fractions of these three types of genomes.
The total number of genomes is N(T) = N T (t) + N S (t) + N P (t) . Since each cell contains exactly one template, the total number of cells is equal to N T (t) . The total number of genomes evolves as effect of: (a) replication initiation, which creates new synthesizing genomes at a rate k ; (b) completion of replication, which transforms synthesizing genomes into post-replication ones at rate β; and (c) cell division, Figure 1. Dynamics of genome types and DNA abundance distribution in an exponentially growing bacterial population. (a) Cell cycle. In slow growth conditions (top panel), newborn cells contain a template (stage B, red). As the cell cycle progresses, two replisomes synthesize a new genome (stage C, blue) starting from the origin on the template (yellow spot). When replication terminates, cells contain the original template and a post-replication genome (stage D, green). Upon subsequent cell division, the post replication genome becomes the template for the newborn cell. In fast growth conditions (bottom panel), newborn cells acquire a template which is already undergoing synthesis. In subsequent stages, multiple replicating genomes may exist in the same cell. (b) Composition of genomes in an exponentially growing population of cells. Each cell may contain a different number of genomes, depending on its stage in the cell cycle and growth conditions. (c) Dynamics of genome types. Dashed blue arrow represent initiation of  which turns post-replication genomes into templates at a rate α, see Figure 1c. This dynamics is described by the set of equations: It follows from Equations 1-3 that, in steady growth, the total number of genomes grows exponentially at a rate equal to the fork firing rate k . In this exponential regime, the fractions of the three genome types are constant: The ratio N/N T can be interpreted as the average number of genomes per cell. Since this ratio is constant, the fork firing rate k can also be identified as the exponential growth rate of the number of cells. For this reason, from now on, we refer to k as the 'fork firing rate' or the 'growth rate' interchangeably.
In principle, the rates α, β, and k should depend on the 'age' of each genome, that is the time spent by the genome in each stage. In Appendix 1, we generalize our model to an age-dependent model to account for this fact. We find that this age-dependent model is equivalent to Equations 1-3 in the exponential growth regime. This result supports the use of our simple model of genome-type dynamics.
We now analyze the incomplete genomes in more detail. We call x 1 and x 2 the portions of a given incomplete genome copied by the two replisomes at a given time, with 0 ≤ x 1 , x 2 ≤ L , see Figure 1d. Replication initiates at x 1 = x 2 = 0 and completes once the replisomes meet each other, that is, x 1 + x 2 = L . The replisome dynamics proceeds as follows. Each replisome is characterized by a speed which depends, in principle, on the replisome position (be it x 1 or x 2 ) and by a diffusion coefficient representing random fluctuations of the speed. The coordinates x 1 , x 2 of the two replisomes evolve according to the stochastic differential equations: where ξ 1 (t) and ξ 2 (t) are white noise variables satisfying ⟨ξ 1 (t)⟩ = ⟨ξ 2 (t)⟩ = 0 , ⟨ξ 1 (t)ξ 1 (t ′ )⟩ = ⟨ξ 2 (t)ξ 2 (t ′ )⟩ = δ(t − t ′ ) , and ⟨ξ 1 (t)ξ 2 (t ′ )⟩ = 0 . Here, ⟨. . . ⟩ denotes an average over realizations.
Close to thermodynamic equilibrium, the diffusion coefficient D can be estimated by the Stokes-Einstein relation (Hynes, 1977). However, since replisomes are driven far from equilibrium by hydrolysis of dNTPs, their diffusion coefficient could deviate from this estimate. In the absence of fluctuations ( D = 0 ), each of the two replisomes copies exactly half of the genome, whereas for D > 0 their meeting point is characterized by a certain degree of uncertainty.
replication. The dotted green arrow represents completion of replication. The solid red arrow represents cell division. (d) Replisome dynamics. Two replisomes begin replication at an origin and progress in opposite directions. Their speed may depend, in general, on their genome coordinate. (e) Sketch of the DNA abundance distribution as a function of the genome coordinate. All three types of genomes contribute to the DNA abundance distribution. Because of incomplete genomes, the DNA abundance is largest at the origin and smallest at the terminal region (i.e., towards the periphery). (f) Experimental DNA abundance distribution at different temperatures.

Figure 1 continued
In steady exponential growth, we call p st (x 1 , x 2 ) the stationary probability distribution of finding an incomplete genome with copied portions x 1 and x 2 . This probability distribution satisfies the equation: where we introduce the vector notation ⃗ v = (v(x 1 ), v(x 2 )) and ⃗ ∇ = (∂/∂x 1 , ∂/∂x 2 ) . The last term in the right hand side of Equation 8 is a dilution term that accounts for the exponential increase in newborn cells. We formally derive Equation 8 and discuss its boundary conditions in Methods.

DNA abundance distribution
The DNA abundance distribution A(y) is the probability that a small DNA fragment randomly picked from the population originates from genome position y , see Figure 1e. We define the genome coordinate y ∈ [−L/2, L/2] , where y = 0 corresponds to the origin of replication and L is the genome length. Templates and post-replication genomes yield a uniform contribution to the distribution A(y) (red and green in Figure 1e). In contrast, incomplete genomes contribute in a way that depends on the replisome positions along the genomes (blue in Figure 1e). Our experiments permit to measure the distribution A(y) with high accuracy, see Figure 1f and Methods.
To express the distribution A(y) in our model, we first introduce the probability P(y) that a randomly chosen genome (either complete or incomplete) in the population includes the genome location y . This probability is expressed by where we assumed that the dynamics of the two replisomes is symmetric, so that p st (x 1 , x 2 ) = p st (x 2 , x 1 ) , and we used that a randomly chosen genome is complete with probability (1 − N S /N) = β/(k + β) , see Equation 5. The DNA abundance distribution A(y) is proportional to P(y) , up to a normalization constant: For given choices of v(x) , D , and k , our theory permits to compute the distribution of incomplete genomes p st (x 1 , x 2 ) via Equation 8. From this solution, we can also calculate β as the steady rate at which replisomes complete replication (see Methods). This information can be used to compute the DNA abundance distribution A(y) using Equation 10. Therefore, by experimentally measuring the DNA abundance distribution, we can test our hypotheses on the speed function v(x) and the diffusion coefficient D .

Constant speed model
We first consider a scenario in which replisomes progress at a constant speed v and without fluctuations, D = 0 . We find that, in this case, the DNA abundance distribution is expressed by see Methods. We fit this distribution to the experimental data using maximum likelihood, see Figure 2a. The speed v is the only fitting parameter, because we independently measure the exponential growth rate k from the optical density, see Methods.
We find that the speed increases nearly fivefold with temperature in the range we considered and appears to follow an Arrhenius law, see Figure 2b. This behavior resembles that of the growth rate. The effective activation energy characterizing the cell cycle is larger than that characterizing the replisome speed, see Figure 2c, possibly due to the large number of molecular reactions involved in the cell cycle. The data point at 17°C appears to deviate from the Arrhenius law for both the speed and the growth rate (Roy et al., 2021), see Figure 2c. Figure 2c, the replisome speed does not increase as fast as the growth rate at increasing temperature. This observation suggests that the number of replisomes per genome should increase with temperature to compensate for this gap. Indeed, in the temperature range we studied, our model predicts that the average number of replisomes per complete genome increases from two to almost six, see Figure 2d and Methods.

As seen in
We now focus on the average genome content per cell. Since the model assumes that genomes evolve independently, the average DNA content per cell C is the product of the average genome    length ℓ times the average number of genomes per cell N/N T . Computing the average genome length in the model (see Methods) and using Equation 4 for the average number of genomes per cell, we obtain that the average DNA content per cell is expressed by The classic Cooper-Helmstetter model (Cooper and Helmstetter, 1968) predicts the DNA content per cell assuming constant durations of stages B, C, and D of the cell cycle. Since we assumed constant speed and D = 0 , the duration of the replication cycle L/(2v) is constant in our case as well. As a consequence, the prediction of Equation 12 is equivalent to that of the Cooper-Helmstetter model (see Appendix 3).
It is generally believed that the ratio between the average DNA content per cell C and the protein content per cell should be maintained approximately constant at varying physiological conditions. This implies that C should be proportional to the cell size. If the growth rate is varied by changing the nutrient composition, v remains constant (Cooper and Helmstetter, 1968). Equation 12 then predicts an approximately exponential growth of C with k , which is consistent with observations. In this case, the Schaechter-Maaloe-Kjeldgaard growth law states that the cell size grows exponentially with k (Schaechter et al., 1958), thereby ensuring DNA-protein homeostasis. In the case of varying temperature, we find that v and k present a similar dependence on T (see Figure 2c), so that their ratio and thereby C weakly depends on k (see Appendix 3-figure 1). Our result is consistent with observations showing that, at increasing temperature, the cell size remains approximately constant (Shehata and Marr, 1975) or possibly slightly increases (Trueba et al., 1982).   Table 1.
The online version of this article includes the following figure supplement(s) for figure 3:

Oscillating speed model
The assumption of constant speed leads to a rather good fit of our DNA abundance data. However, the precision of our data permits us to appreciate systematic deviations from the model predictions under the constant speed hypothesis, see This analysis further supports that this phenomenon is robust, at least in fast growth conditions. To account for these observations, we introduce a more refined model in which the replisome speed oscillates along the genome: where δ represents the relative amplitude of oscillations; ω their angular frequency along the genome; and ϕ their initial phase. We also take into account random speed fluctuations in this case, D ≥ 0 . In this case, we predict the DNA abundance distribution using stochastic simulations, see Methods. By fitting the DNA abundance, we estimate the parameters v , δ, ω, ϕ , and D , see Figure 3(f-j) and Table 1.
Our fitted speed oscillations are reminiscent of a previously observed wave-like pattern in the mutation rate along the genome of different bacterial species (Dillon et al., 2018;Niccum et al., 2019). For a quantitative comparison, we analyze this pattern in a mutant E. coli strain lacking DNA mismatch repair (Niccum et al., 2019). We find that the oscillations in mutation rate and speed are highly correlated, see Figure 4a. The mutation rate appears approximately in phase with the speed, meaning that regions where replisomes proceed at higher speed are characterized by a higher mutation rate. This observation leads to the hypothesis that the two phenomena have a common cause.
We consider two possible explanations for these oscillations. The first is that the oscillations originate from a systematic process related with the cell cycle (Niccum et al., 2019). The second explanation is that the oscillations are caused by competition among replisomes for nucleotides or other molecules required for replication. Assuming approximately constant cell division times, we estimate the cell division time as τ = (ln 2)/k . Since k is also equal to the fork firing rate per genome, the time between firing events in a cell is also approximately equal to τ, so that the two hypotheses lead to the same quantitative prediction. If the speed of replisomes was coupled to a factor oscillating with period τ, this would cause spatial oscillation of speed with angular frequency ω = 2π/(vτ ) = 2πk/[(ln 2)v] . This prediction qualitatively agrees with our fitted values of ω, see Figure 4b.
If the wave-like pattern were caused by competition among replisomes, one would expect either a minimum of the speed every time a new fork is fired ( ϕ = π ) or the speed to start decreasing when a new fork is fired ( ϕ = π/2 ). Our fitted values of the phase ϕ are also compatible with this range, see Figure 4c. Our results show that the diffusion coefficient D is quite small. For about one third of our experimental realizations at each temperature, our fitted value of D is not significant according to the Akaike information criterion (see Figure 4d and Figure 4-figure supplement 2). For comparison, we estimate the equilibrium diffusion constant of replisomes in the cytoplasm from the Stokes-Einstein relation as D SE ≈ 6 kbp 2 s −1 (see Appendix 5), of the same order of magnitude as our fitted values, see Table 1 and Figure 4. These results suggest that, despite their high average speed, the fluctuations of the replisome position are remarkably similar to the equilibrium case.  The diffusion coefficient determines the uncertainty about the genome site where the two replisomes meet. In the absence of diffusion ( D = 0 ), replisomes would always meet at the midpoint on the circular genome. For D > 0 , we estimate the typical size l D of the region in which the two replisomes meet as follows. Since the fitted values of δ and D are both small, we approximate the replication time as τ C ≈ L/(2v) . In this time, the accumulated uncertainty due to diffusion is equal to l D ≈ 2 √ 2Dτ D . From our estimated diffusion coefficients and average velocities, we obtain values of l D on the order of 100 − 200 kbp , depending on temperature.
We remark that our bacterial cultures are grown in LB medium. Although the growth curves appear exponential before saturating (see Methods), the nutrient composition can be such that the assumption of steady growth made in our model is not valid. It is therefore important to scrutinize whether the oscillations can be a consequence of this issue. To this aim, we analyzed a version of the model in which the fork firing rate is not steady, but gradually declines with time, see Appendix 7. We find that the discrepancy between the DNA abundance distributions predicted by this model and by the steady-state model is very small compared to the oscillations we observe. This observation supports that a potential lack of steady-state in the LB medium is not a likely cause for the oscillations.

Discussion
In this paper, we infer the dynamics of replisomes from the DNA abundance distribution in a growing bacterial population. Our theory can be seen as a generalization of the classic Cooper-Helmstetter theory (Cooper and Helmstetter, 1968;Bremer and Churchward, 1977), that permits to estimate the duration of the replication period from the abundance of certain genomic locations in a growing population, see e.g. (Zheng et al., 2016;Si et al., 2017). While the Cooper-Helmstetter theory assumes constant replisome speed, our approach allows for varying speeds. We test our method by measuring the DNA abundance distribution of E. coli populations growing at different temperatures. We thereby accurately estimate the average speed of replisomes in vivo, and their speed variations along the genome.
We find that the dependence of the average replisome speed on the temperature is well described by an Arrhenius law, similar to that governing the population growth rate. This quantitative dependence can be used to deduce other laws governing bacterial physiology at varying temperature. For example, we argue that precise DNA-protein homeostasis requires the cell size to mildly vary with temperature. This prediction is in qualitative agreement with previous observations (Shehata and Marr, 1975;Trueba et al., 1982) and calls for more systematic measurements of cell parameters at varying temperature, similar to what has been done in the case of varying nutrient composition (Si et al., 2017).
Our approach reveals a wave-like oscillation of the replisome speed along the E. coli genome. The relative amplitude of these oscillations ranges from 10% to 20% of the average replisome speed. A quantitatively similar pattern was observed in the bacterial mutation rate along the DNA of an E. coli mutant strain (Niccum et al., 2019) and of other bacterial species (Dillon et al., 2018). This similarity suggests that the two phenomena have a common dynamical origin. In particular, we hypothesize that this correlation could be a manifestation of the trade-off between accuracy and speed that characterizes DNA polymerases (Sartori and Pigolotti, 2013;Banerjee et al., 2017;Fitzsimmons et al., 2018). Because of this trade-off, any mechanism increasing the speed of a polymerase is expected to increase its error rate as well.
Our analysis of the frequency of these oscillations supports that this pattern may originate from a process synchronized with the cell cycle (Dillon et al., 2018), whose activity alters the replisome function. An alternative hypothesis is that the oscillations originate from competition among replisomes for shared resources, such as nucleotides. According to this idea, the firing of new forks can hinder the progression of existing replisomes. The frequency of oscillations is compatible with both explanations. The following additional evidence supports the latter hypothesis. We did not observe appreciable oscillations for our lowest temperature of 17°C, which according to our estimates falls outside the multi-fork replication regime. Further, we found that oscillations disappeared when analyzing previous data from a culture grown in a minimal medium (Midgley-Smith et al., 2018), where multi-fork replication is also not expected. On one hand, these facts point to competition between multiple replisomes in the same cell as a likely source for the oscillations. On the other hand, given the difficulty of obtaining steady exponential growth in LB medium, further experiments will be important to assess an eventual effect of the growth medium on the DNA abundance shape.
Beside these regular and repeatable variations, our analysis shows that random fluctuations of replisome speed are quite small, leading to an uncertainty of about 100 − 200 kbp on the location of the replisome meeting point. In bacteria, the terminal region of replication is flanked by two groups of termination (Ter) sites having opposite orientations. Ter sites are the binding sites for the Tus protein and permit passage of replication forks in one direction only (Elshenawy et al., 2015), so that the two groups effectively trap the two forks in the terminal region (Duggin et al., 2008). Out of the ten Ter sequences in E. coli, only two of them (TerB and TerC) are within 100 − 200 kbp of the point diametrically opposite to the origin. These two sequences have the same orientation. Our result therefore implies that most Ter sequences are usually not needed to localize the replisome meeting point. This prediction is consistent with previous observations that the phenotypes of Tus-E. coli mutants (Roecklein et al., 1991) or mutants lacking Ter sequences (Duggin et al., 2008) do not appear distinct from that of the wild type.
Quantitative modeling of the DNA abundance distribution has the potential to shed light on aspects of replisome dynamics beyond those explored in this paper. For example, it was observed that the knockout of proteins involved in the completion of DNA replication leads to either over-expression or under-expression of DNA in the terminal region (Wendel et al., 2014;Wendel et al., 2018;Sinha et al., 2018). Incorporating the role of these proteins into our model will permit to validate possible explanations for these patterns. More in general, our approach is simple and general enough to be readily applied to other bacterial species, to unravel common principles and differences in their DNA replication dynamics.

Methods
Cultivation and DNA extraction E. coli MG1655 was cultured in LB medium supplemented with 50 mM MOPS pH 7.2 and 0.2% glucose. Overnight cultures grown at 37 °C were diluted into fresh medium and grown until reaching an OD600 of about 1.0 at the target temperature. These cultures were used to inoculate 50 ml medium at the desired temperature in 500 ml Erlenmeyer flasks with baffles at a target OD of 0.01. Cultivation was performed with shaking at 250 rpm. OD was determined with a NanoDrop One in cuvette mode. harvested. Solid lines represents the exponential growth curve for each temperature. We computed the growth rate k i for each sample i = 1, 2, 3 at a given temperature by fitting the optical density to a logistic function a i /[1 + b i exp(−k i t)] , where a i and b i are sample-specific constants (Zwietering et al., 1990). The growth rate k for each temperature is the average of the k i s. (b) Same data as in (a), but the time in the x-axis is scaled by the growth rate k at each temperature. As a result of this rescaling, the growth curves collapse on each other. (c) Average growth rate as a function of temperature. The solid purple line is an Arrhenius fit to the data (see Figure 2), resulting in B = (6.0 ± 24.9) × 10 12 hr −1 and ∆ C = (74 ± 10) kJmol −1 . We exclude the data point for T = 17 o C from the fit.
The online version of this article includes the following figure supplement(s) for figure 5: The growth curves of E. coli were highly repeatable (over three replicate experiments for each temperature), see Figure 5a and Figure 5b. We computed the growth rate at each temperature by fitting a logistic function to individual growth curves, see Figure 5c. When the time was rescaled by the average growth rate, OD of different temperatures collapsed along a single curve. The cultures (1.4 ml) were harvested by centrifugation at 21,000 g for 20 s after reaching an OD of around 0.5 (midexponential phase, dashed lines in Figure 5a). Cells were kept growing for at least 45 doubling times (as measured in exponential phase) to reach stationary phase. Samples of 0.2 ml from the stationary phase cultures grown at 17°C, 27°C, and 37°C were harvested for DNA extraction. The pellets were immediately frozen at -80 °C until DNA extraction. DNA was extracted in parallel using Genomic DNA Purification Kit from Thermo Fisher Scientific.

Sequencing
We sequenced three samples in the exponential phase from different experimental realizations for each temperature. In addition, we sequenced three stationary samples at three different temperatures. DNA samples were sheared by ultrasound using Covaris AFA technology. Libraries were then prepared using the PCR-free NEBNext Ultra II DNA Library Prep Kit for Illumina. Sequencing was performed on a Novaseq6000 using paired-end 150 bp reads.

Alignment and bias elimination
We aligned reads from each sample using Bowtie2 2.3.4.1 (Langmead and Salzberg, 2012), using the MG1655 genome as a reference. We calculated the frequency of reads as a function of the genome coordinate with bin size 10kbp. To attenuate bias, we divided the frequency at each genome coordinate in a sample from the exponential phase by the frequency of the corresponding bin in a stationary sample (Wendel et al., 2014;Midgley-Smith et al., 2018). We alternatively used all of our three stationary samples to correct the bias of each sample in the exponential phase. Therefore, after bias elimination, we effectively have 3 × 3 = 9 different DNA abundance curves in the exponential phase at each temperature. See

Stationary distribution of replisome positions
In this section, we discuss how to compute the stationary distribution of incomplete replisome positions p st (x 1 , x 2 ; t) . We call n S (x 1 , x 2 ; t) the number density of incomplete genomes at time t with replisome positions at x 1 and x 2 . By definition ´L 0 dx 1´L −x1 0 dx 2 n S (x 1 , x 2 ; t) = N S (t) . It follows from Equation 7 that this number density evolves according to where ⃗ ∇ = (∂/∂x 1 , ∂/∂x 2 ) and ⃗ v = (v(x 1 ), v(x 2 )) . We now introduce the normalized probability p(x 1 , x 2 ; t) = n S (x 1 , x 2 ; t)/N S (t) . By substituting this definition into Equation 14, we obtain The stationary distribution p st (x 1 , x 2 ) is a time-independent solution of Equation 15, see Equation 8.
Because of replication completion, the line x 1 + x 2 = L is an absorbing state for the dynamics described by Equation 15. Equation 15 must be consistent with Equation 2, which describes the dynamics of incomplete genomes regardless of the coordinates of their replisomes. This implies that the rate β at which replication completes (see Equation 2) must equal to the probability flux through the absorbing boundary: where we introduce the probability current ⃗ J(x 1 , x 2 ) = ⃗ vp − D ⃗ ∇p , the unit vector n = (1/ √ 2, 1/ √ 2) , and the infinitesimal line increment dl along the absorbing boundary. Similarly, the probability flux entering the system at (x 1 , x 2 ) = (0, 0) must match the rate of replication initiation as given by Equation

2.
Given a hypothesis on the speed function v(x) and the diffusion coefficient D , we solve Equation 15 at stationarity using the experimentally measured growth rate k . From the stationary solution p st (x 1 , x 2 ) , we obtain β using Equation 16. Our approach does not permit to determine the cell division rate α appearing in Equation 1-Equation 3. However, this rate is not necessary to compute the DNA abundance distribution, which is expressed by Equation 9 and Equation 10 in terms of p st (x 1 , x 2 ) and β only.

Average genome length
To compute the average genome length, we first note that the integral of P(y) is equal to the average genome length ℓ in the population Combining Equation 17, Equation 10, and the fact that P(0) = 1 , we obtain a simple relation between the DNA abundance distribution and the average genome length:

Average number of replisomes per complete genome
We estimate the average number of replisomes per complete genome N in two alternative ways. On the one hand, using Equation 4-Equation 6 we find that On the other hand, it can be seen in Figure 2d that the fraction of complete genome in the population is equal to the ratio A(L/2)/A(0) between the DNA abundance at the terminal and at the origin. It follows that  Table 1), except for D which is chosen to be larger for illustration purposes.

Constant speed
We focus on the scenario with constant speed and D = 0 . In this case, the steady solution of Equation 15 is given by The rate at which replication completes is equal to Substituting Equation 21 and Equation 22 into Equation 9, we obtain.
from which Equation 11 follows by normalizing, see Equation 10. We exactly solved Equation 15 also in the case where the speed depends on the genome coordinate, provided that the diffusion coefficient vanishes, see Appendix 6.

Stochastic simulations
In the case of oscillating speed and D > 0 , we compute the stationary solution of Equation 15 using numerical simulations. To this aim, we interpret Equation 15 as describing a stochastic process subject to stochastic resetting (Evans and Majumdar, 2011). Specifically, we perform stochastic simulations of Equation 7. In addition to the dynamics described by Equation 7, with a stochastic rate equal to the fork firing rate k , trajectories are reset to the origin, x 1 = x 2 = 0 (blue trajectory in Figure 6a). Since the boundary x 1 + x 2 = L is an absorbing state, trajectories that reach this boundary are also reset to the origin (green trajectory in Figure 6a). The probability distribution associated with this dynamics evolves according to Equation 15. We simulate this stochastic dynamics to estimate the stationary distribution p st (x 1 , x 2 ) in a computationally efficient way, see Figure 6b. We estimate from the same simulations the parameter β as the empirical rate at which the absorbing boundary is reached, see Equation 16.

Acknowledgements
We are grateful for the help and support provided by the Scientific Computing and Data Analysis section of the Research Support Division at OIST. We thank the DNA Sequencing Section of OIST, in particular N Arasaki, for support with sequencing. This work was supported by JSPS KAKENHI Grant No. JP18K03473 (to SP) and a grant from Deutsche Forschungsgemeinschaft (DFG, grant number: 452628014 to SH). We thank M Cencini and P Sartori for feedback on a preliminary version of our manuscript. The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. Appendix 1

Additional information
Age-dependent dynamics of genome types In this Appendix, we generalize the model embodied in Equations 1-3 to the more realistic case in which genomes transition from one stage to another with an age-dependent rate. Specifically, we call τ the time since a genome fired its last fork and define the time-dependent fork firing rate k(τ ) . The time-dependent fork firing rate can be expressed in terms of the fork-firing time-distribution We also define the age a of genomes in synthesizing or post-replication stage, i.e. the time they spent in their stage. We define the age-dependent rate of completion β(a) and the age-dependent duration of post-replication stage α(a) . These rates can be expressed in terms of the distribution of time to complete the synthesizing stage, g(a) and the distribution of time to complete the postreplication stage, h(a) . The relations are: We call n T (τ ; t) the number density of templates at time t that fired their last fork at time t − τ . We call n S (a, τ ; t) , and n P (a, τ ; t) the number density of synthesizing and post-replication genomes (respectively) at time t that fired their last fork at time t − τ and with age a . These densities evolve according to the coupled equations: where n(τ ; t) = n T (τ ; t) +´∞ 0 n S (a, τ ; t) da +´∞ 0 n P (a, τ ; t) da is the total number density of genomes at time t that fired their last fork at time t − τ .
The total numbers of templates, synthesizing, and post-replication genomes are respectively given by N T (t) =´∞ 0 n T (τ ; t)dτ , N S (t) =˜∞ 0 n S (a, τ ; t)dadτ , and N P (t) =˜∞ 0 n P (a, τ ; t)dadτ . Integrating Equation 24 with respect to τ and Equations 25; 26 with respect to a and τ we obtain The total number of genomes, N(t) = N T + N S + N P grows with time as If the fork firing rate is independent of τ, k(τ ) = k , then the number of genomes grows as N = N(0)e kt . A computation of the growth exponent for general case requires knowledge of n(τ , t) . Using the definition of n (τ , t) in Equation 24-Equation 26 leads to We assume that the age dependent genome population scales with the total number of genomes, n(t; τ ) = q(τ ; t) N(t) . This assumption for n(t; τ ) yields an exponential growth for the number of genomes, N = N(0)ek (t)t , with a time-dependent exponent, Substituting this along with the relation n(τ ; t) = q(τ ; t) N(t) in Equation 31, we find In the steady state, qst(τ ) ≡ q(τ ; t → ∞) is independent of time. In this limit, the growth rate in Equation 32 is also time-independent, k =´∞ 0 k(τ )qst(τ )dτ . Using these conditions in Equation 33, we find where we used e −´τ 0 k(τ ′ )dτ ′ = 1 −´τ 0 f(τ ′ )dτ ′ . The normalization condition, ´∞ 0 qst(τ )dτ = 1 , yields the Euler-Lotka equation for the relation between k and fork firing time distribution f(τ ) , In the steady state, we assume that n T (τ ; t) = N T (t)q T (τ ) , n S (a, τ ; t) = N S (t)q S (a, τ ) and n P (a, τ ; t) = N P (t)q P (a, τ ) , see, e.g., (Jafarpour et al., 2018). Because of the definitions of N T , N S and N P , we have the normalization conditions These conditions permit to express the dynamics of genome types as where α =˜∞ 0 α(a)q P (a, τ )dadτ and β =˜∞ 0 β(a)q S (a, τ )dadτ . From Equation 37, we obtain in the long time limit: These relations are equivalent to Equation 4-Equation 6 for age dependent rates.

Appendix 2 Parameter estimation based on maximum likelihood
In this section, we outline the maximum likelihood method to fit the parameters of the model. We make a histogram of our data with bin size b = 10 kbp. We call B the total number of bins. We empirically estimate the DNA abundance in each bin i as where N e i and N s i are the numbers of reads in the i th bin from the exponential and stationary culture, respectively. We assume that the number of reads in each bin follows a Poisson distribution (Aird et al., 2011). In the limit of large N e i and N s i , the distribution of the empirical DNA abundance E i is Gaussian with a standard deviation The assumption of large number of reads is well satisfied for our sequencing depth: N e i ranges from 2.8 × 10 4 to 19.9 × 10 4 and N s i from 4.6 × 10 4 to 9.0 × 10 4 . We call A i ≡ A(ib) the DNA abundance predicted by our model at bin i for a given set of parameters. The joint likelihood of the empirical DNA abundance in all the bins is given by We fit the model parameters by maximizing the logarithm of the likelihood ln L .
In the constant speed case, we have a single fitting parameter k/v . In the oscillatory model without and with diffusion we have four and five fitting parameters, respectively.

Appendix 3 Correspondence with the Cooper-Helmstetter Theory
In this section, we study our model in the case of constant speed, constant duration of the B and D stages, and slow growth condition, so that all stages B, C, and D of the cell cycle are present. Our goal is to prove that, under these additional assumptions, our model predicts the same total DNA abundance as the Cooper-Helmstetter model.
The expression for the total DNA abundance in the constant speed case reads see Equation 12. The analogous expression for the Cooper-Helmstetter model is where τ C is the time taken by replisomes to complete replication on the genome and τ D is postreplication period (Cooper and Helmstetter, 1968).
Since we assumed constant replisome speed, we have τ C = L/2v . Therefore, proving that Equation 44 and Equation 45 are equivalent boils down to showing that e kτD = (α + k)/α . We call τ B the time spent by a bacterium in stage B. In the CH model the times τ B , τ C are constant. The division time is τ div = τ B + τ C + τ D . The population growth rate is k = ln 2/τ div .
Since the division time is constant, the steady-state age distribution is expressed by see e.g. Powell, 1956 We now express the steady-state fractions f B , f C , f D of cells in stage B, C, D as The ratio of the number of post replication genomes to the number of templates is equal to the fraction of cells in stage D. Therefore, equating the fraction f D with N P /N T = k/α obtained from Equation 4 and Equation 6, we obtain e kτD = (α + k)/α as expected. Exact solution of the oscillatory speed case for D = 0 In this section, we exactly solve the oscillatory speed model in the absence of diffusion ( D = 0 ). For an arbitrary choice of the function v(x) and D = 0 , the steady state solution of Equation 15 reads where A is a normalization constant that ensures ´L 0 dx 1´x 1 0 dx 2 p st (x 1 , x 2 ) = 1 . The rate at which replication completes is β = Ae −´L /2 0 dx ′ k/v(x ′ ) . From Equation 9, we obtain For the specific form of v(x) given in Equation 13, the integral in Equation 52 is equal to where ⌊·⌋ is the floor function. We use the expression of this integral in Equation 52 and substitute the result in Equation 10 to obtain the DNA abundance distribution. We computed the normalization factor of the DNA abundance distribution (i.e., the denominator of Equation 10) by numerical integration.

Appendix 7
Replisome dynamics with time-dependent fork firing rate To verify the robustness of our results, we generalize our model to the case in which the fork firing rate k depends on time. For simplicity, we assume that replisomes move with constant speed and without diffusion ( D = 0 ).
We assume that the total number of genomes grows with time as where N 0 = a/ (1 + b) is the initial number of genomes and we introduced the time-varying fork firing rate The step function θ(t) in Equation 54 serves to impose that replication occurs for t ≥ 0 only. The corresponding dynamics of synthesizing genomes reads where the delta function ensures that the replication initiates at x 1 = 0 . As we have set D = 0 , the coordinate x 2 of the second replisome obeys the same dynamics expressed in Equation 56 and the DNA abundance is symmetric around the replication origin.
The general solution to Equation 56 is of the form ns(x 1 ; t) = f(x 1 − vt) . Because of the boundary term at x 1 = 0 , i.e. ns(0; t) =k(t)N(t)/v , the general solution is The probability that a randomly chosen genome (either complete or incomplete) includes the genome location y at time t is given by P(y; t) = N P (t) + N T (t) +´L /2 |y| n S (x 1 , t)dx 1 N(t) .
where ´L /2 |y| n S (x 1 , t)dx 1 is the number of incomplete genomes which include the genome location y and N P (t) + N T (t) is the total number of complete genomes. From Equations 54; 55, and (Equation 56) , we obtain Therefore, the total number of incomplete genomes is . Further, because of the conservation of the total number of genomes, the number of complete genomes at time t is We now use this result to test robustness of the steady-state assumption. To this aim, we assume that the number of genomes follows a similar dynamics as the optical density. We accordingly consider, for each sample at each temperature, the parameters a i , b i , k i obtained from the OD fits (see Figure 5-figure supplement 1), and the average speed v estimated from the constant speed model (see Table 1). We substitute these in Equation 61 and compute the DNA abundance at time t = 4/k . This value is chosen as, in our experiments, cells were extracted after approximately four generations, see Figure 5.
We then attempt to fit these DNA abundance distribution with the one predicted the constant speed model, Equation 11, by least-square minimization and thus estimating the speed again. The newly estimated speeds at different temperatures are comparable to those inputted in the timevarying fork firing rate model, see Appendix 7-table 1. Finally, we plotted the discrepancy ratio in each of the cases, see Appendix 7-figure 1. At all temperatures, the discrepancy ratio is much smaller than the oscillations we observed in the data. This result supports that the potential nonstationary nature of the fork firing rate is not likely to be the cause of the oscillations.