The symmetrical pattern of base-pair substitution rates across the chromosome in Escherichia coli has multiple causes

Mutation accumulation experiments followed by whole-genome sequencing have revealed that for several bacterial species the rate of base-pair substitutions is not constant across the chromosome but varies in a wave-like pattern symmetrical about the origin of replication. The experiments reported here demonstrate that in Escherichia coli several interacting factors determine the wave. Perturbing replication timing, progression, or the structure of the terminus disrupts the pattern. Biases in error-correction by proofreading and mismatch repair are major factors. The activities of the nucleoid binding proteins, HU and Fis, are important, suggesting that mutation rates increase when highly structured DNA is replicated. These factors should apply to most bacterial, and possibly eukaryotic, genomes, and imply that different areas of the genome evolve at different rates.

The fidelity of DNA replication, which, in E. coli, is about 1 mistake in 1000 generations (7), is determined by the intrinsic accuracy of the DNA polymerase and error correction by proofreading and mismatch repair (MMR) (reviewed in (8, 9)). In E. coli, proofreading is performed by epsilon, a subunit of the DNA polymerase III holoenzyme. If the polymerase inserts an incorrect base, epsilon's 3' to 5' exonuclease activity degrades a few bases of the new strand and the polymerase then re-synthesizes it. The accuracy of DNA synthesis is improved about 4000-fold by proofreading (10). Mismatch repair is performed by three proteins, MutS, MutL, and MutH. MutS recognizes a mismatch and recruits MutL. Together they find a nearby GATC site in which, in E. coli, the A is methylated by the Dam methylase. Because methylation lags behind replication, unmethylated As identify the "new", and presumably error-containing, DNA strand. MutH is recruited by MutSL to the GATC site and activated to nick the unmethylated DNA strand, which is then degraded past the mismatch by the concerted activity of the UvrD helicase and one of four exonucleases. Pol III then re-synthesizes the strand. MMR improves the accuracy of DNA replication 100- to 200-fold (11).

In our previous study of the base-pair substitution (BPS) density pattern in MMR-defective E. coli (1), we correlated mutation rates to the chromosomal sites that are affected by two nucleoid-associated proteins (NAPs), HU and Fis. We suggested that when the replication fork encounters regions of the chromosome with high superhelical density due to the binding of these NAPs, the mutation rate increases. An alternative explanation ties mutation rates to replication timing (2, 3). An intriguing hypothesis is that mutation rates vary in concert with fluctuations in dNTP concentration when the replication origin fires

Because ribosomal operons are homologous, we could not call SNPs in these genes.
The RNA reads from the genes in ribosomal operons were also removed from the RNA-Seq data, but their positions have been indicated in Figure 1B. Interestingly, bin 35, which includes the ribosomal rrnG operon, has a high number of RNA reads even in the absence of reads from the rrnG genes. This high level of expression is almost exclusively due to the ssrA gene, which encodes transfer-messenger RNA (16) and is highly expressed under all three conditions (the RNA-Seq data will be further analyzed in a subsequent paper). The BPS density pattern of several of the strains that will be discussed below tends to reach a minimum in this general area, but it is as often at bin 33 as at bin 35. Given that the bin size is 100 Kb, it seems unlikely that the high level of transcription of ssrA, located in the middle of bin 35, is causing these patterns.

Replication Initiation
The BPS density pattern is centered on oriC, the origin of replication, and over many experiments in different genetic backgrounds the pattern around the origin has proved to be stable. In both replichores BPS rates decline to a minimum about 300 Kb from the origin and then increase until a peak is reached about 900 Kb from the origin. Mutation rates then fall again and reach a minimum about 3/5 of the distance along each replichore.

We tested whether replication initiation was responsible for the maintenance of this pattern by performing MA experiments with strains with errant replication start sites. The rnhA gene encodes RNase H1, which degrades RNA-DNA hybrids; in the absence of RNase H1, persistent R-loops can initiate aberrant DNA replication and disrupt normal fork migration (17, 18). But an MA experiment with a ∆rnhA ∆mutL mutant strain showed no difference in the BPS density pattern from that of the

replication initiation does not influence where mutations occur, at least not when a powerful oriC is present.

To further test the influence of replication initiation on the mutational density pattern, we performed MA experiments with strains that have a 5.1 Kb region containing oriC moved to the midpoint of the right replichore, where it is called oriZ (19). These strains are derived from E. coli K12 strain AB1157, instead of strain MG1655, the ancestor of our MA strains, and have a large inversion in the right replichore that relieves the head-on collision between replication initiating at oriZ and transcription of the rrnCABE operon (20). As a control we created an AB1157 ∆mutL mutant strain, which had a wave-like BPS density pattern similar to that of our MMR⁻ strains, indicating that the pattern is

relief is not clear. In the absence of SeqA, unregulated initiation presumably results in over-replication, at least when cells are rapidly growing in rich medium (21).
Downstream events, such as replication fork collapse, add to the phenotypes of seqA mutant cells (22).

Loss of SeqA affects chromosomal structure in areas distant from oriC. By binding to hemimethylated DNA, SeqA forms complexes behind the replication fork as it progresses around the chromosome (21). In addition, SeqA binds to areas of the chromosome with closely spaced GATC sites, as well as to particular genes regulated by GATC methylation (23). In the absence of SeqA the superhelicity of the chromosome increases, the nucleoid condenses (24), and transcription is altered

To test whether SeqA affects the BPS density pattern, we performed an MA experiment with a ∆mutL ∆seqA mutant strain. As shown in Figure 1F, Supplementary Figure S1F, and Tables 1 and 2, loss of SeqA somewhat amplified the BPS density pattern of the right replichore, but the pattern was still highly correlated to that of the right replichore of the MMR-defective strains. However, the pattern in the left replichore was disrupted in the ∆seqA ∆mutL mutant strain. Based on chromatin immunoprecipitation analysis, this area of the left replichore is not targeted by SeqA to a greater extent than the same area of the right replichore (23, 26), suggesting that the disruption of the mutational pattern is not due to loss of binding by SeqA.

As shown in Figure 1B, Table S1).

As shown in Figure 2E, Supplementary Figure S2E, Tables 1 and 2, loss of NrdR did not change the basic BPS density pattern, but the peak rate was shifted away from the origin about 200 Kb on each side. This pattern was similar to that observed when cells were grown at low temperature, as described above.
Replication fork progression is aided by the accessory replication helicase, Rep, and, in its absence, the time required for chromosome duplication is doubled (34). Rep removes proteins bound to the DNA in front of the fork (34, 35). While these nucleoproteins are primarily transcription complexes (36, 37), Rep could also free the DNA of blocking NAPs. In addition, Rep aids in restarting replication forks after they stall or collapse (38, 39). As shown in Figure 2F, Supplementary Figure S2F, Tables 1 and 2, with the exception of a region close to the origin, loss of Rep disrupted the BPS density pattern across the chromosome, suggesting that slowing or stalling the fork results in a distribution of

Replication Termination
The results from almost all the strains tested show an increase in the BPS rate in the region where replication terminates. The pattern of this increase varies somewhat among experiments. Usually there are two unequal peaks, as shown in Figure 1A, but in some experiments these peaks are better defined and of equal heights and occasionally there is just one peak. We do not know the source of this variation, but suspect it is simply random noise.

Replication terminates approximately 180° from the origin in a 1200 Kb region bounded by replication pause (Ter) sites; this region extends from bin 18 to bin 31 in our figures. The anti-helicase Tus protein binds to the Ter sites and allows each replication fork to enter but not to exit, creating a replication fork "trap", within which the two forks fuse and the chromosome dimer is resolved (40). To determine if the interaction of replication forks with Tus contributes to the increased mutation rate within this region, we performed an MA experiment on a ∆tus ∆mutL mutant strain. As shown in Figure 2G, Supplementary Figure S2G, Tables 1 and 2, loss of Tus did not affect the BPS density pattern.

The Ter macrodomain (MD) extends from 1200 Kb to 2200 Kb (12), which is roughly from bin 20 to bin 28 in our figures. The structure of the Ter MD is maintained by the MatP protein, which binds to 23 matS sites within this region (41). In the absence of MatP, the Ter MD is disorganized, the DNA is less compact, and the Ter MD segregates too early in the cell cycle and fails to localize properly at midcell (41, 42). Because the mobility of the Ter MD is increased in the absence of MatP, DNA interactions across MD barriers can occur in ∆matP mutant cells (41). As shown in Figure 2H, Supplementary Figure S2H, Tables 1 and 2, loss of MatP caused a severe disruption of the BPS density pattern.
The mutation rates in the Ter MD were depressed whereas new peaks appeared on either side of the Ter MD in the Right and Left MDs. Interestingly, the BPS pattern near the origin was maintained in the right but not in the left replichore.

Recombination and the SOS response
Homologous recombination is intimately connected to replication. As replication proceeds, various blocks, such as DNA lesions, transcription complexes, and DNA secondary structures, can cause the replisome to pause and to eventually disassemble. This potentially lethal event is prevented by recombination, which can repair and restart the replisome (43). In addition, the termination region is subject to hyperrecombination (44, 45), particularly in the region bounded by TerA and TerB (our bins 21 to 24), named the terminal recombination zone (TRZ) (46, 47).

Elimination of E. coli's major recombinase, RecA, had a modest effect on the mutational density pattern. As shown in Figure 3A

Tables 1 and 2). Either our protocol is not sensitive enough to detect an effect of loss of RecB, or another recombination pathway, e.g. RecFOR (48), is sufficient to maintain the mutation rate in the region.

In addition to its role in recombination, RecA is also a master regulator of the SOS response to DNA damage, which includes the induction of two error-prone DNA polymerases, DNA Pol IV and V. To test whether these polymerases are involved in determining the BPS density pattern, we performed an MA experiment on a strain deleted for the genes that encode Pol IV, dinB, and Pol V, umuDC. As shown in Figure 3B, Supplementary Figure S3B, Tables 1 and 2, the BPS density pattern in the mutL dinB umuDC mutant strain was not significantly different from the MMR⁻ pattern. The genes of the SOS response are repressed by the LexA protein; the lexA3 allele encodes a super-repressor LexA protein that prevents the SOS genes from being induced (49). When this allele was present the BPS density pattern was also unaffected (Figure 3C, Supplementary Figure S3C, Tables 1 and 2). Thus, the SOS response appears to play no role in determining the pattern of BPSs across the chromosome.

Nucleoid Associated Proteins
In a previous report (1), we found that the BPS density pattern of a ∆mutL strain was correlated with the density of genes activated by the HU protein and repressed by the Fis protein. Combining these two factors in a linear correlation model accounted for 33% of the variation in the mutational data. HU constrains supercoils and compacts the DNA into nucleosome-like particles; Fis also constrains supercoils but, in addition, bends the DNA (31). While both of these NAPs affect transcription, the general lack of correlation of the BPS rate with transcriptional levels ((7); also see above) led us to hypothesize that mutation rates across the chromosome were correlated not with transcription per se, but with areas of high DNA structure (1). To further test this hypothesis we performed MA experiments with MMR⁻ mutant strains also defective for each of a number of NAPs.

HU exists as a dimer of its two subunits, HUα and HUβ, encoded by the paralogous genes hupA and hupB, respectively, in the three possible configurations. While loss of both subunits confers a severe growth defect, loss of only one has little consequence during a normal growth cycle, suggesting they can substitute for each other. HUαβ is the dominant form over most of the cell cycle, but significant amounts of HUα₂ are found during lag phase and early exponential phase, and HUβ₂ is prominent in stationary phase (30). Chromatin immunoprecipitation sequencing (ChIP-Seq) results revealed that HU binds non-specifically to the chromosome and the DNA binding patterns of the three dimers appear to be identical (50).

We performed MA experiments with both ∆hupA ∆mutL and ∆hupB ∆mutL mutant strains. As shown in Figures

The NAP HNS binds to DNA at its high-affinity binding sites and then spreads by oligomerization along A:T rich regions of DNA.
Bridging between HNS-DNA complexes condenses the DNA into a few clusters per chromosome (53-55). However, as shown in Figure 3G, Supplementary Figure S3G, Tables 1 and 2, loss of HNS had little effect on the BPS density pattern, and, thus, the long-range structures produced by HNS appear not to affect BPS rates.

The DPS protein accumulates in stationary phase cells, condenses the nucleoid into a crystalline-like state, and protects the DNA from oxidative and other damage (56). Despite this radical physical change, loss of DPS had little effect on the BPS density pattern (Figure 3H, Supplementary Figure S3H, Tables 1 and 2). Of course, we do not know the degree to which cells in stationary phase contribute to the BPS rates under our experimental conditions.

Proofreading
As mentioned above, epsilon is the proofreading subunit of the DNA polymerase III holoenzyme. The mutD5 allele encodes an epsilon protein that is inactive for proofreading, and strains carrying this allele have a mutation rate 4000-fold greater than that of wild-type strains, and 35-fold greater than that of MMR-defective strains (10). As shown in Figure 4A, Supplementary Figure S4A, Tables 1 and 2, when proofreading was inactive but MMR was active, the BPS density pattern was less dramatic than when MMR was inactive and proofreading was active; but, nonetheless, the wave pattern was basically the same. When both MMR and proofreading were inactive, which reveals the mutations solely due to replication errors, the BPS density pattern was nearly flat but around the origin it retained significant correlations to the patterns of both the MMR-defective and mutD5 mutant strains (Figure 4B, Supplementary Figure S4B, Tables 1 and 2).
Thus, the pattern of BPS on both sides of the origin appears to be established by replication errors and then elsewhere across the chromosome the density pattern is largely due to differential error-correction by both proofreading and MMR.

Wild-type
The mutational density wave patterns evident in our data, and in data from other bacteria (2, 3), were obtained when MMR was inactive. Thus, the question arises: does the pattern appear in wild-type strains? It is difficult to answer this question because mutation rates in wild-type strains are so low (in E. coli, 120-fold lower than that of MMR-defective strains (7, 11)) that enormous experiments would have to be conducted in order to accumulate enough mutations to approach statistical confidence. In a recent study we compared the mutation rates and spectra of a number of E. coli strains defective in various DNA repair activities; of these, the results from seven strains were indistinguishable from those of the wild-type parent (57). By combining the mutations from these strains, we obtained 1933 BPSs (11), enough to expect to see a wave pattern if it existed. As is evident in Figures 4C, 4D, Supplementary Figures S4C, S4D, Tables 1 and 2, these BPSs did not fall into a recognizable pattern. Indeed, the pattern from the wild-type strains appears to be random; the variance to mean ratio of the binned mutations is 1.4, indicating that the values are not overdispersed, and the MatLab "runstest", a test for runs, returns a P value of 0.58, also indicating that the bin values are random. In the wild-type strain both MMR and proofreading are active, and, while these two activities have similar correction biases (Figure 4A), proofreading is much more powerful, producing the pattern seen in the MMR-defective strains. Although we cannot conclude that the mutational density pattern in the wild-type strain is other than random, it does have similarity to both the patterns seen in the MMR-defective strains and in the mutD5 mutant strain, particularly around the terminus (Figures 4C, 4D, Supplementary Figures S4C, S4D and Tables 1 and 2).
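
The two randomness checks mentioned above (the variance to mean ratio and a test for runs) can be illustrated with a minimal Python sketch. It is not the MatLab "runstest" implementation used in this study: it assumes runs are counted above and below the mean and uses the large-sample normal approximation, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def variance_to_mean_ratio(counts):
    """Sample variance divided by the mean; close to 1 for Poisson-like counts."""
    return variance(counts) / mean(counts)


def runs_test(values):
    """Wald-Wolfowitz runs test about the mean (normal approximation).

    Assumption: this mirrors the idea of a runs test, not MatLab's exact method.
    Returns (number_of_runs, z_score, two_sided_p_value).
    """
    center = mean(values)
    # Classify each value as above (True) or below (False) the mean;
    # values exactly equal to the mean are discarded.
    signs = [v > center for v in values if v != center]
    n_above = sum(signs)
    n_below = len(signs) - n_above
    if n_above == 0 or n_below == 0:
        raise ValueError("need values on both sides of the mean")
    n = n_above + n_below
    # A run is a maximal stretch of consecutive values on the same side.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    expected = 2 * n_above * n_below / n + 1
    var = (2 * n_above * n_below * (2 * n_above * n_below - n)) / (n**2 * (n - 1))
    if var == 0:
        raise ValueError("too few values for the normal approximation")
    z = (runs - expected) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return runs, z, p_value


if __name__ == "__main__":
    # Hypothetical binned BPS counts, for illustration only.
    example_bins = [40, 45, 38, 51, 42, 47, 39, 44, 50, 41]
    print(variance_to_mean_ratio(example_bins))
    print(runs_test(example_bins))
```

A large P value from such a test means that the order of high and low bins gives no evidence against randomness, which is the sense in which the wild-type bin values are described as random.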

Bacillus subtilis
In addition to E. coli strains, symmetrical mutational density patterns have been demonstrated in MMR⁻ derivatives of Vibrio fischeri, V. cholerae (2-4), Pseudomonas fluorescens (5), and P. aeruginosa (6). Here we add Bacillus subtilis to this list. As shown in Figure 4E, Supplementary Figure S4E, Tables 1 and 2, the BPS mutation rates in a B. subtilis mutS::Tn10 mutant strain fell into a wave-like pattern that was symmetrical about the origin. Although similar in shape, the pattern was significantly different from that of E. coli (Figure 4E, Supplementary Figure S4F, Tables 1 and 2). However, as in E. coli, the BPS rate appeared to increase in the terminus region, which in B. subtilis is not 180° from the origin and corresponds to bins 20-24 in Figure 4E.

Discussion
In this report we have examined a number of factors that could be responsible for establishment and maintenance of the symmetrical wave-like BPS density pattern across the chromosome. In broad terms these factors were: transcription; DNA replication initiation, progression, and termination; recombination and the SOS response to DNA damage; the binding of nucleoid-associated proteins; and error-correction by MMR and proofreading. As discussed above, we found that transcription and the SOS response had little effect, and the effect of recombination was modest and confined to the terminal region. We discuss the more significant factors in greater detail here.

DNA replication initiation, progression, and termination
Providing additional replication origins, either by eliminating RNase H1 or by inserting an ectopic oriC (oriZ), did not disrupt the wave (Figure 1C and 1D). However, when oriZ was the only origin of replication, the region of depressed BPS rate that surrounds oriC when it is in the normal position was re-established about the new origin (Figure 1E). Note that only 5.1 Kb of DNA containing oriC was relocated (19), whereas the region of reduced mutation rate is about 200 Kb; thus, the mutation rate is not determined just by the DNA sequence surrounding the origin. We hypothesize that the process of replication initiation protects the DNA from damage and/or that newly established replication forks have a low error rate. In addition, the BPS mutation rate was increased for about 1000 Kb (10 bins) on either side of the new origin so that it resembled the same region about the normal origin. The size of this area is close to that of the "interacting zones" detected around oriZ (58). However, the mutational density pattern across the rest of the chromosome did not re-establish itself to be symmetrical about oriZ, but remained symmetrical about the absent oriC. Thus, other factors must be important at distant regions. We can also conclude that the overall structure of the BPS pattern across the chromosome is not determined by active replication initiation per se, but may have evolved in response to replication initiation.

Both V. fischeri and V. cholerae have multiple circular chromosomes of different sizes; by comparing the mutational density patterns of these chromosomes, Dillon et al., 2018 (2) identified the timing of replication as a significant determinant of the mutational density patterns. As mentioned above, they suggested that the pattern could be the result of variations in the levels of dNTPs as origins fire during rapid growth.
Our results provide partial support for this hypothesis, but also indicate that growth on rich medium, not growth rate per se, is a significant determinant of the mutational density pattern, possibly because of effects on the expression and DNA binding of HU and Fis (see below). Our results with a ∆nrdR mutant strain (Figure 2E) also show that dNTP levels are important, but, again, other factors are driving the overall wave pattern. In addition, the results with the ∆rep mutant strain (Figure 2F) indicate that radically interrupting the progression of the replication fork disrupts the mutational density pattern.

After declining to a local minimum about 3/5 of the distance along each replichore, mutation rates rise in the terminus region (Figure 1A). We originally hypothesized that this increase was due to collisions of the replication complexes with the Tus anti-helicase, most of which would take place in the region between bins 18 and 31 (1). However, elimination of Tus had no effect on the wave pattern

Nucleoid-associated proteins
Our previous results predicted that the NAPs HU and Fis should play a role in establishing or maintaining the wave pattern of BPS, but HNS should not (1). The results presented here confirmed that prediction. In addition, we also found that Dps had no effect on the BPS density pattern. Because the NAPs affect gene expression in various ways, we cannot conclude that DNA binding by the NAPs themselves is responsible for the mutational pattern. And, indeed, we found no significant positive correlations between the BPS pattern and published binding sites of the NAPs, although the locations of the binding sites themselves vary widely among published results (e.g. see (32, 50) and the data in RegulonDB (62)). Nonetheless, we favor the hypothesis that structuring of the DNA by the NAPs, directly or indirectly, contributes to the BPS pattern.
The local effect of HU binding is to bend the DNA, but dimer-dimer interactions produce higher-order HU-DNA complexes that can constrain negative supercoils (63, 64). The analysis of the effect of HU on the mutational density pattern is also complicated by the variation in the cellular concentrations of the three forms with growth cycle (30). Although the two subunits can compensate for each other for viability, the affinities of the three forms of HU for various DNA structures (linear, nicked, and gapped) differ (65). Both HUα₂ and HUαβ can constrain supercoils, but HUβ₂ apparently cannot, at least in vitro

The mutational pattern in the ∆hupA ∆mutL mutant strain is also similar to that of the ∆recA ∆mutL strain (Figure 3A). A mutation accumulation experiment with a ∆hupA ∆recA ∆mutL mutant strain resulted in a pattern similar to both single mutant strains, and so was not informative (data not shown).

Fis is a major transcriptional regulator, either activating or repressing, directly or indirectly,

The loss of Fis nearly doubled the overall BPS rate, but flattened the wave pattern outside of the origin region (Figure 3F, Supplementary Figures S5F, S6F). As mentioned above, the mutational density pattern of MMR⁻ strains is correlated to the density of genes activated in a ∆fis mutant strain as reported by Blot et al., 2006 (70). This correlation is particularly strong in bins 14 to 33 (ρ = 0.62, P = 0.003), which corresponds to the area flattened in the ∆fis ∆mutL and ∆fis ∆mutS mutant strains. However, no correlation exists between our mutational data and genes found to be responsive to Fis in a recent study (32). Clearly more studies are needed to resolve these conflicts.

Error-correction
Assuming that the BPS recovered from the mutD5 ∆mutL mutant strain are due to intrinsic errors made by DNA polymerase, we conclude that the polymerase is accurate close to the origin, then becomes increasingly less accurate as replication proceeds to about 1/3 of the replichore, at which

Three biological replicates were prepared for each growth phase.

A complete analysis of the RNA-Seq results will be the subject of a subsequent report. For this report the numbers of RNA-Seq reads for each condition were first normalized to the number of reads mapped to the gene holD, which was determined by rtPCR to be expressed at the same level in all phases of growth. The means of the normalized RNA reads from the triplicates were then binned into the same bins used for the mutational analysis. A fourth-order Daubechies wavelet transform was performed on the binned RNA-Seq reads as described for the mutational data (1).
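
The normalization and binning of the RNA-Seq reads described above could be sketched as follows. This is only an illustration under stated assumptions: the data layout (one dictionary of per-gene read counts per replicate), the use of a gene midpoint measured from the start of the first bin, and all function names are hypothetical and are not taken from the analysis pipeline used in this study.

```python
from statistics import mean


def normalize_to_reference(gene_reads, reference_gene="holD"):
    """Divide every gene's read count by the reads mapped to the reference gene."""
    reference = gene_reads[reference_gene]
    return {gene: reads / reference for gene, reads in gene_reads.items()}


def mean_of_replicates(replicates, reference_gene="holD"):
    """Normalize each replicate separately, then average gene by gene."""
    normalized = [normalize_to_reference(r, reference_gene) for r in replicates]
    genes = normalized[0].keys()
    return {gene: mean(rep[gene] for rep in normalized) for gene in genes}


def bin_expression(mean_reads, gene_midpoints, bin_size=100_000, n_bins=46):
    """Sum normalized mean reads into fixed-width bins.

    gene_midpoints maps gene name to a position in bp measured from the
    start of the first bin (an assumption made for this sketch).
    """
    bins = [0.0] * n_bins
    for gene, value in mean_reads.items():
        index = min(gene_midpoints[gene] // bin_size, n_bins - 1)
        bins[index] += value
    return bins


if __name__ == "__main__":
    # Hypothetical read counts and positions, for illustration only.
    replicates = [
        {"holD": 10, "ssrA": 1000, "geneX": 20},
        {"holD": 20, "ssrA": 1800, "geneX": 40},
        {"holD": 5, "ssrA": 600, "geneX": 10},
    ]
    midpoints = {"holD": 50_000, "ssrA": 3_450_000, "geneX": 3_460_000}
    binned = bin_expression(mean_of_replicates(replicates), midpoints)
    print(binned[0], binned[34])
```

Normalizing each replicate before averaging keeps differences in sequencing depth between replicates from dominating the mean.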

Statistical Analysis
To obtain the BPS density patterns, the numbers of BPSs were binned into 46 bins, each 100 Kb long, as described (1). A fourth-order Daubechies wavelet transform was performed on the binned mutation data as described (1). For presentation in the figures, these results were converted into rates by dividing the number of BPS by the appropriate number of generations. Pearson's product-moment

1). Spearman's nonparametric correlation coefficient was also computed for a few data sets, but gave similar results. To account for multiple comparisons, p values were adjusted using the Benjamini-Hochberg method (71) with the false discovery rate set at 25%, implemented with the MatLab R2018a "mafdr" command. Because comparisons using the data from the same strain are not independent, this adjustment was made separately for each column in Tables 1 and 2.

To further compare the BPS density patterns between two data sets, wavelet coherence was calculated and plotted using the MatLab R2018a "wcoherence" command. While the Daubechies wavelet provides a good visual representation of the binned data, it is not continuous and thus not easily adapted for wavelet coherence analysis. The MatLab program first converts the binned data to Morlet wavelets and then computes the coherence between two of these wavelets. We chose to analyze the data with wavelet coherence because it gives a measure of the correlation between the signals (displayed as colors in the figures) (72, 73). In addition, the MatLab wavelet coherence plot indicates, as a dashed curve, the 'cone of influence' within which results are free of artifactual edge effects (72). The relative phase-lag between the two signals is indicated by small arrows: arrows pointing right indicate in-phase, arrows pointing left indicate 180° out-of-phase, and arrows pointing in other directions indicate the various degrees in between.
Because the MatLab program assumes a frequency-time series, the X axis of the plot is cycles/sample and the Y axis is time; we converted these to bins/cycle (on an inverted scale) and bins, respectively.
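
The binning, rate conversion, correlation, and multiple-comparison steps described in this section can be illustrated with a short Python sketch. It is not the MatLab code used in this study. It assumes that bins are counted clockwise from a chosen origin coordinate on a circular chromosome and that any partial final bin is folded into the last full bin; the function names and the `origin` and `genome_length` parameters are hypothetical. The adjustment shown is the standard Benjamini-Hochberg step-up procedure, which may differ in detail from the defaults of the MatLab "mafdr" command.

```python
import math
from statistics import mean


def bin_positions(positions, origin, genome_length, bin_size=100_000, n_bins=46):
    """Count mutations per bin, with bin 0 starting at `origin` (circular genome)."""
    counts = [0] * n_bins
    for pos in positions:
        offset = (pos - origin) % genome_length
        counts[min(offset // bin_size, n_bins - 1)] += 1
    return counts


def counts_to_rates(counts, generations):
    """Convert binned counts to mutations per generation per bin."""
    return [c / generations for c in counts]


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p values; a comparison is retained at a 25% FDR
    # when its adjusted p value is at most 0.25.
    adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
    print([p <= 0.25 for p in adjusted])
```

Applying `benjamini_hochberg` separately to each column of p values corresponds to the per-column adjustment described above.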

Declarations
The sequences and SNPs reported in this paper have been deposited with the National Center for Biotechnology Information Sequence Read Archive https://trace.ncbi.nlm.nih.gov/Traces/sra/ (accession no. in progress) and in the IUScholarWorks Repository (hdl.handle.net/2022/20340). Bacterial strains are available upon request.

Competing interests
The authors declare no competing interests.

, and DK2143 (NCIB3610 mutS::Tn10 ∆comI). There were no differences in mutation rates or spectra among these strains

Rep, the auxiliary replication helicase, affect the BPS density pattern. 2E, PFM799; 2F, PFM677.

Figure 2G and H. Loss of the Tus antihelicase has no effect, but loss of the terminus organizing protein, MatP, changes the BPS density pattern across the chromosome. The arrows in Figure 2G mark the major Ter sites where Tus binds. The bar in Figure 2H shows the region in which MatP binds. Strains: 2G, PFM256; 2H, PFM257.

Figure 4C and D. The BPS density pattern in the wild-type strain does not match the pattern of either the MMR-defective strain, or the mutD5 mutant strain. The green line in Figure 1D is the Daubechies wavelet transform of the mutD5 mutant strain (pink line in Figure 4A). Strains: 4C and D, eight strains with wild-type mutational phenotypes (see text).

Figure 4E. MMR-defective B. subtilis also has a symmetrical BPS density pattern, but it is different from E. coli's pattern. Strains: three B. subtilis mutant strains with the same mutational phenotype (see supplementary Tables S1 and S3).

The strains used in this study are listed in Supplemental Table S1. All E. coli strains were derived from PFM2 (1) or AB1157 (2). The oriC+ oriZ+ and ∆oriC oriZ+ strains were a gift from Rodrigo Reyes-Lamothe (McGill University). The mutD5 allele was obtained from Roel Schaaper (NIEHS). The deletion mutations originated in the Keio collection (3) and were moved by P1 phage transduction (4); the Knʳ element was removed by using FLP recombination (5). The deletions were confirmed by PCR analysis using the oligonucleotides listed in Supplemental Table S2. The B. subtilis strains were derived from the undomesticated ancestral strain NCIB3610 and were a gift from M.A. Konkol and D.B. Kearns (Indiana University).
Rich medium was Miller Luria Broth (LB) (Difco; BD); minimal medium was Vogel-Bonner minimal medium (VB min) with 0.2% glucose (6). When required, antibiotic concentrations were: carbenicillin (Carb), 100 µg/ml; kanamycin (Kn), 50 µg/ml; nalidixic acid (Nal), 40 µg/ml; chloramphenicol (Cam), 30 µg/ml; and rifampicin (Rif), 100 µg/ml. Half of these concentrations were used in minimal medium.

Estimation of mutation rates from fluctuation assays
Mutation rates were determined as described (7), using mutation to Nalᴿ or Rifᴿ. The Ma-Sandri-Sarkar maximum likelihood method was used to calculate the mutation rates by using the FALCOR web tool found at www.mitochondria.org/protocols/FALCOR.html (8).

Mutation accumulation experiments
The MA procedure has been described (1, 9, 10). The MA lines originated from single colonies isolated from a founder colony, obtained by streaking from a freezer stock onto agar plates of the medium to be used in the MA experiment. After incubation overnight at the experimental temperature, one well-isolated colony was excised from the agar plate, soaked for 30 minutes in 0.85% NaCl + 0.01% gelatin, and then vortexed for 60 seconds. Appropriate dilutions for obtaining well-isolated colonies were then plated onto the appropriate agar plates at the appropriate temperature to start MA lines. Plates were incubated at 37°C for most experiments, or at 28°C for the experiment at low temperature. Each MA line was periodically streaked for a single colony: on LB and supplemented VB min agar plates at 37°C, this was done daily; on VB min and diluted LB agar plates at 37°C, and on LB plates at 28°C, this was done every 48 hours. The number of passes required was determined by the preliminary mutation rate obtained from a fluctuation assay. The parameters of the MA experiment, including the number of lines used for each strain and the total number of generations per experiment, are given in Supplemental Table S3.

Estimation of generations
The method to estimate the number of generations undergone in each MA experiment is described (1, 9). The diameter of the single colonies streaked was recorded daily, and then the number of cells in colonies of different diameters was determined for each experiment as described (1)

Poor sequence coverage resulted in some MA lines being eliminated. Cross-contamination can occur during streaking, resulting in lines with identical mutations. If two lines shared over 50% of their mutations, then one of the lines was dropped from further analysis. If lines shared less than 50%, then each shared mutation was assigned to one of the lines and dropped from the others. If lineage could be established, the shared mutation was assigned accordingly; otherwise the mutation was assigned randomly.
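
The rule for handling shared mutations can be sketched in Python as below. This is a hedged illustration, not the procedure's actual implementation. It assumes that the shared fraction is computed relative to the smaller of the two lines, that the later-named line of an over-threshold pair is the one dropped, and that no lineage information is available, so every remaining shared mutation is assigned randomly. The function name and data layout are hypothetical.

```python
import random


def resolve_shared_mutations(lines, threshold=0.5, rng=None):
    """Filter MA lines for likely cross-contamination.

    lines: dict mapping line name -> set of mutation identifiers.
    Returns a new dict of the retained lines with each shared mutation
    kept in exactly one of them.
    """
    rng = rng or random.Random(0)
    kept = {name: set(muts) for name, muts in lines.items()}

    # Step 1: for each pair sharing more than `threshold` of their
    # mutations, drop one line (here, arbitrarily, the later-named one).
    names = sorted(kept)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if first not in kept or second not in kept:
                continue
            smaller = min(len(kept[first]), len(kept[second]))
            shared = kept[first] & kept[second]
            if smaller and len(shared) / smaller > threshold:
                del kept[second]

    # Step 2: assign every remaining shared mutation to a single line.
    owners = {}
    for name in sorted(kept):
        for mutation in kept[name]:
            owners.setdefault(mutation, []).append(name)
    for mutation, holders in owners.items():
        if len(holders) > 1:
            keeper = rng.choice(holders)  # random: no lineage information assumed
            for name in holders:
                if name != keeper:
                    kept[name].discard(mutation)
    return kept


if __name__ == "__main__":
    # Hypothetical lines and mutation identifiers, for illustration only.
    example = {"L1": {1, 2, 3, 4}, "L2": {1, 2, 3, 9}, "L3": {4, 10, 11, 12}}
    print(resolve_shared_mutations(example))
```

Where lineage can be established, the random choice in step 2 would be replaced by the line in which the mutation arose.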

RNA sequencing
The E. coli strain PFM144, which is PFM2 ∆mutL (10), was grown in LB and aliquots were collected during lag (OD = 0.022), log (OD = 0.3), and stationary (OD = 1.5) phase. The number of cells collected was kept constant for each growth phase. Cells were pelleted at 10,000 × g for 10 min, 1 ml of medium was added, and the cells were pelleted again at 10,000 × g for 2 min. The pellets were then flash frozen in liquid nitrogen and stored at -80°C. Three biological replicates were prepared for each growth phase. RNA was extracted using FastRNA Pro Blue kit (MP Biomedicals). DNA was removed by using

Arrows indicate the phase-lag between the two data sets; arrows pointing right indicate in-phase, arrows pointing left indicate 180° out-of-phase, and arrows pointing in other directions indicate the various degrees in between. Supplementary Figures S2A, B, C,