RNA polymerase drives ribonucleotide excision DNA repair in E. coli

Ribonuclease HII (RNaseHII) is the principal enzyme that removes misincorporated ribonucleoside monophosphates (rNMPs) from genomic DNA. Here, we present structural, biochemical, and genetic evidence demonstrating that ribonucleotide excision repair (RER) is directly coupled to transcription. Affinity pull-downs and mass-spectrometry-assisted mapping of in cellulo inter-protein cross-linking reveal that the majority of RNaseHII molecules interact with RNA polymerase (RNAP) in E. coli. Cryoelectron microscopy structures of RNaseHII bound to RNAP during elongation, with and without the target rNMP substrate, show specific protein-protein interactions that define the transcription-coupled RER (TC-RER) complex in engaged and unengaged states. The weakening of RNAP-RNaseHII interactions compromises RER in vivo. The structure-function data support a model where RNaseHII scans DNA in one dimension in search of rNMPs while "riding" the RNAP. We further demonstrate that TC-RER accounts for a significant fraction of repair events, thereby establishing RNAP as a surveillance "vehicle" for detecting the most frequently occurring replication errors.


INTRODUCTION
Due to a large excess of rNTPs over dNTPs 1,2 and only a single oxygen atom difference between the two nucleotide substrates, DNA polymerases misincorporate rNTPs in place of dNTPs at a remarkably high rate in all organisms during genome replication. 3-6 E. coli DNA polymerase III makes such errors at a rate of one ribonucleoside monophosphate (rNMP) every 2.3 kb, which is approximately 2,000 misincorporations per replication round. 7 To maintain genome integrity, the bulk of misincorporated rNMPs are removed by the evolutionarily highly conserved type 2 ribonuclease H (RNaseH)-dependent ribonucleotide excision repair (RER) pathway. 8-13 RNaseHII (in bacteria) and RNaseH2 (in archaea and eukaryotes) initiate the RER process by nicking double-stranded DNA containing a single rNMP at the 5′ side of the ribonucleotide. 14-17 The subsequent removal of the ribonucleoside is catalyzed by several enzymes that jointly constitute the RER pathway. 18-20 In E. coli, RNaseHII also removes abasic and oxidized ribonucleotides embedded in the DNA and exhibits elevated activity toward mismatched rNMPs, indicating an expanded role (compared with eukaryotic and archaeal counterparts) in maintaining genome integrity. 21
The principal unresolved question regarding RER is how the rNMPs, which are inserted throughout the genome without imparting any major distortions to the DNA, 22-25 can be promptly detected and discriminated from the bulk of canonical genomic DNA in vivo. RNaseHII has to identify thousands of genomic rNMPs within the 30-min intervals between replication rounds in exponentially growing E. coli cells. As RNaseHII has no intrinsic motor function, it would have to rely on some form of stochastic diffusion, which is usually invoked to explain protein trafficking on DNA, to locate its targets in vitro. This form of target search, however, must be highly inefficient in vivo for the low-abundance RNaseHII (~50-80 molecules per E. coli cell) 26 due to macromolecular crowding, DNA compaction, and numerous non-specific DNA-binding competitors (e.g., highly abundant histone-like proteins) occupying much of the chromosome at any given moment. 27,28 Given the haphazard distribution of rNMPs in a genome, the scarcity of RNaseHII molecules in the cell, and the lack of a motor function, the target search for RER is limited to a random walk with its forward/backward symmetry. 29 In eukaryotic cells, the interaction with the replicative machinery or proliferating cell nuclear antigen (PCNA) is thought to assist the delivery of RNaseH2 to its substrate, 30-33 acting as the symmetry-breaking biasing force, 34,35 but the mechanism of facilitated diffusion (if there is any) for bacterial RNaseHII is unknown.
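The scale of this search problem can be made concrete with a back-of-the-envelope calculation. The sketch below uses only the figures quoted above (one rNMP per 2.3 kb, 50-80 RNaseHII molecules per cell, a 30-min interval between replication rounds); the genome size of about 4.64 Mb for E. coli MG1655 is an assumption that is not stated in the text, and the calculation further assumes that every rNMP is processed by RNaseHII and that the load is shared evenly.

```python
# Rough estimate of the RNaseHII search load per replication round.
# GENOME_BP is an assumption (approximate E. coli MG1655 genome size);
# the other numbers are taken from the text.
GENOME_BP = 4_640_000          # assumed genome size, base pairs
BP_PER_RNMP = 2_300            # one rNMP every 2.3 kb
RNASEHII_COPIES = (50, 80)     # molecules per cell (low, high)
WINDOW_MIN = 30                # minutes between replication rounds


def rnmps_per_round(genome_bp: int, bp_per_rnmp: int) -> float:
    """Expected number of misincorporated rNMPs per replication round."""
    return genome_bp / bp_per_rnmp


def seconds_per_target(total_rnmps: float, copies: int, window_min: float) -> float:
    """Average time one enzyme has per rNMP if the load is shared evenly."""
    targets_per_enzyme = total_rnmps / copies
    return window_min * 60 / targets_per_enzyme


total = rnmps_per_round(GENOME_BP, BP_PER_RNMP)
budget = {n: seconds_per_target(total, n, WINDOW_MIN) for n in RNASEHII_COPIES}

print(f"rNMPs per round: {total:.0f}")
for copies, secs in budget.items():
    print(f"{copies} enzymes: about {secs:.0f} s available per rNMP")
```

Under these assumptions the estimate reproduces the roughly 2,000 misincorporations per round quoted above and leaves each enzyme on the order of a minute to find each target among several megabases of canonical DNA.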
Here, we demonstrate that bacterial RER relies on RNA polymerase (RNAP) for rapid target location. It appears that one-dimensional (1D) DNA scanning enabled by transcribing RNAP is important for the efficient functioning of RNaseHII in vivo.

Majority of RNaseHII molecules interact with RNAP in vivo
To determine the interacting partners of RNaseHII in live E. coli cells, we combined a high-affinity RNaseHII pull-down with quantitative mass spectrometry (qMS). We constructed an E. coli strain (MG1655) carrying a chromosomal copy of FLAG-tagged rnhB and confirmed that the presence of the tag did not interfere with its RER function (described below). Exponentially growing cells were treated with cell-permeable formaldehyde to preserve intracellular protein interactions, followed by cell lysis, nucleic acid digestion, affinity chromatography, washing, and qMS (Figure 1A) (STAR Methods). MS analysis demonstrates that the most prominent interactor of RNaseHII is RNAP, which is present in the eluted fraction almost stoichiometrically to RNaseHII (Figure 1A). Such an enrichment of the core RNAP subunits, and not the initiation-specific sigma factors, suggests that most of RNaseHII associates with transcription elongation complexes (ECs) in E. coli under normal growth conditions. A mass-spectrometry-assisted mapping of in cellulo inter-protein cross-linking (in vivo cross-linking mass spectrometry [XLMS]), 36 for which we utilized MG1655 cells carrying a chromosomal copy of a 10x histidine-tagged (His-tag) β′ subunit of RNAP, confirms the presence of RNaseHII among the prominent RNAP interactors and maps the contacts between RNaseHII and RNAP that occur in exponentially growing cells (Figure 1B) (see below). For comparison and context, we list in vivo cross-links discovered for general transcription elongation factors GreA and NusA (Figure 1B).

Structures of the TC-RER complexes
To characterize RNaseHII-RNAP interactions structurally and functionally, we reconstituted the transcription-coupled RER (TC-RER) complexes from individually purified E. coli proteins (Figures 1C and 1D). To assemble active ECs, we utilized two nucleic-acid scaffold derivatives of the scaffold we previously used to determine the cryo-EM structure of E. coli EC. 38 The first DNA scaffold (DNA29) was 29 base pairs (bp) long, with a 14-bp DNA duplex downstream of a 10-nt-long transcription bubble (Figure 2A). In addition to the recombinant wild-type (WT) RNaseHII, we also generated a catalytically inert (cleavage-deficient) version of the enzyme carrying an E17A substitution in its catalytic center, based on homology modeling with Thermotoga maritima RNaseHII. 15,39 RNaseHII(E17A) can still bind to its DNA substrate, but it is almost completely deficient in the ability to cleave DNA at the rNMP position (Figures 3 and S1A). The second scaffold (DNA59) carries a single guanosine monophosphate (rGMP) inserted at position +16 of the template strand within the 34-bp-long downstream DNA duplex, counting from position +1 of the catalytic site of RNAP (Figure 4A), and was used to assemble the TC-RER complex in its engaged state.
The EC18-RNaseHII and EC18(rGMP)-RNaseHII(E17A) complexes, containing 18-nt-long RNA, were assembled on DNA29 and DNA59 scaffolds, respectively, and purified by size-exclusion chromatography (SEC) (Figures 1C and 1D). In both cases, RNaseHII co-eluted with the EC as a single peak, suggesting that protein-protein interactions are primarily responsible for the EC-RNaseHII complex formation. Dynamic light scattering (DLS) analysis demonstrates that both complexes are uniform and monodisperse. Their sizes, on the basis of the Rayleigh sphere approximation, correspond to 414 and 431 kDa, that is, to one RNAP and one RNaseHII molecule per complex. We then subjected both complexes to single-particle cryo-EM analysis to determine their structures (Figures S1B-S1E).
The maps of EC18-RNaseHII and EC18(rGMP)-RNaseHII(E17A) reached a nominal resolution of 2.96 and 3.16 Å, respectively (Figures 2B, 4B, and S2A-S2C; Table S2; Videos S1 and S3). Both maps revealed specific protein-protein interactions between RNAP and RNaseHII. The major points of contact with RNaseHII occur at the front face of RNAP with respect to the direction of transcription (Figures 2C-2E and 4C; Videos S1 and S3).
In the unengaged state, represented by the EC18-RNaseHII complex, RNaseHII interacts with the RNAP β′ subunit via two different domains, forming an extensive interface (Figures 2D and 2E). The most prominent contacts are established between the α helix-loop region of RNaseHII (residues 48-53) and the loop region between the β3 and β4 strands of β′ (residues 151-155) (Figure 2D, lower panel). Well-defined interactions include three pairs of ionic and hydrogen bonds: RNaseHII(K48)-β′(E155), RNaseHII(R53)-β′(T152), and RNaseHII(K52)-β′(E175). Another important point of contact occurs between two loop regions of RNaseHII (residues 19-26 and 34-50) and the RNAP β′ α9 helix (residues 195-207) (Figure 2D, right panel). Here, RNaseHII(R20) interacts with β′(E203, E207), and RNaseHII(K47) interacts with β′(Q200).
Figure 1. In vivo discovery and in vitro assembly of the TC-RER complex (A) Quantitative mass spectrometry analysis of the RNaseHII-associated proteins in vivo. Stoichiometry was estimated in the RNaseHII pull-down fraction isolated from exponentially growing cells carrying chromosomally FLAG-tagged RNaseHII (RnhB). RNAP subunits, RNAP-associated factors, and factors involved in DNA replication are highlighted in pink, green, and blue, respectively. (B) A list of highly confident in vivo non-redundant inter-protein cross-links between RNAP subunits and RNaseHII from XLMS of the chromosomally 10xHis-tagged RNAP pull-down (highlighted in pink). The confidence score of RNaseHII-RNAP cross-links is comparable to that of bona fide transcription elongation factors GreA and NusA (highlighted in green). Confidence is inversely proportional to the score value as per the pLink2 scoring algorithm. 37 (C) Formation of the EC18-RNaseHII complex in vitro. Left: size-exclusion chromatography (SEC) of the EC18-RNaseHII complex assembled from individually purified proteins and synthetic nucleic acids. The red line indicates the peak fractions analyzed by SDS-PAGE. Middle: SDS-PAGE analysis of the peak fractions from SEC. Right: dynamic light scattering (DLS) analysis of the peak fraction from SEC (%Pd, the polydispersity statistics; MW-R, estimated molecular weight). (D) Formation of the EC18(rGMP)-RNaseHII(E17A) complex in vitro. Left: SEC of the EC18(rGMP)-RNaseHII(E17A) complex assembled from individually purified proteins and synthetic nucleic acids. The red line indicates the peak fractions analyzed by SDS-PAGE. Middle: SDS-PAGE analysis of the peak fractions from SEC. Right: DLS of the peak fraction from SEC. See also Figures S1 and S3 and Table S1.

The in vitro structure of EC-RNaseHII exhibits broad concordance with the in vivo XLMS results (Figure S3A); namely, the distance restraints expressed as Cα-Cα distances for DSS cross-links (listed in Figure 1B) fall well within the range expected for this cross-linker connecting residues embedded in the flexible (loop) regions of interacting proteins. The EC-RNaseHII structure is also compatible with the structures of the E. coli EC in complex with general elongation factors, such as NusA, NusG, GreA/B, and Rho (Figure S3B).
To investigate the mechanism of substrate recognition by the TC-RER complex, we determined the structure of the EC18(rGMP)-RNaseHII(E17A) complex (Figure 4; Video S3). The overall map of EC18(rGMP)-RNaseHII(E17A) is very similar to that of EC18-RNaseHII. It clearly shows the protein-protein interfaces between RNaseHII and RNAP, with RNaseHII again residing at the front face of RNAP (Figure 4B). In vivo cross-linking sites between RNaseHII and RNAP also support the structural model of the EC18(rGMP)-RNaseHII(E17A) complex (Figure S3A). The local density of RNaseHII is better defined, suggesting that RNaseHII is less flexible on RNAP when engaged with its DNA substrate (Figure S2C). In the presence of rGMP-containing DNA, the RNaseHII C-terminal domain swings toward the downstream DNA duplex, losing contact with the RNAP β′ α9 helix (residues 195-207), while keeping the points of contact with the loop region between the β3 and β4 strands of the RNAP β′ subunit (Figures 4C, lower left panel and S2H; Video S2). The well-defined interactions switch to another three pairs of ionic and hydrogen bonds: RNaseHII(K52)-β′(E155), RNaseHII(R53)-β′(Q158), and RNaseHII(E60)-β′(N153). For substrate DNA binding, RNaseHII utilizes its positively charged groove to bind the rNMP-containing DNA backbone. The interactions occur with residues K47, R111, K124, K140, and R192 of RNaseHII (Figure S2D).
Without an embedded rNMP, the binding of RNaseHII to DNA is weak but detectable (Figure S1A, lanes 2, 3). However, a single rNMP (rGMP) is sufficient to trigger a major shift under the same conditions (lanes 5, 6), which is indicative of tight DNA binding. Thus, in the absence of a substrate, RNaseHII can remain in the "scanning" mode in the TC-RER complex until it encounters an rNMP in the template strand.
The catalytic center of RNaseHII is oriented toward the substrate rGMP in the template DNA strand. Here, residue Y165 of RNaseHII recognizes the 2′-OH group of rGMP(+16), while the conserved GRG motif (residues 19-21) holds it in place for cleavage (Figure 4C, lower middle panel; Video S3). The active site of E. coli RNaseHII is composed of four conserved carboxylate residues: D16 and E17 of the β1 strand; D108 at the end of the β4 strand; and D126 of the α3 helix, which is in close proximity to the scissile phosphate group of rGMP(+16) (Figure 4C, lower right panel). Thus, the structure of the engaged TC-RER complex from E. coli provides further support for the model of substrate recognition and cleavage suggested by the Thermotoga maritima RNaseHII-DNA complex structures, 15 and it implies that RNaseHII scans DNA in search of its substrates while riding the EC.
Guided by the structures of the TC-RER complexes, we constructed two clusters of point mutations in the β′ subunit of RNAP (rpoC151-155 MTNLE/GTGLG; rpoC195-207 EQECEQLREELNE/AAECAQAAAELAA), which are expected to perturb the interacting surface with RNaseHII (Figures 2D, 2E, and 4C). Indeed, such a mutant RNAP (RNAP HII) purified from exponentially growing cells binds RNaseHII less strongly, as determined by SEC (Figures S4A-S4C), while remaining as active as the WT RNAP (Figures S4D and S4E), thus validating our structural model.
According to the structures of TC-RER complexes, the distance between the catalytic center of RNAP and that of RNaseHII engaged with its substrate is 15 ± 1 nt. To determine if such an arrangement holds during transcription elongation, we monitored RNAP progression in the presence of WT RNaseHII and the catalytically deficient RNaseHII(E17A), which retains its ability to bind substrate DNA (Figure 3D). The WT RNaseHII caused RNAP to pause at position +65 followed by a slow progression to position +68, which is the site of the embedded rCMP (Figure 3E), suggesting that pausing was the result of a nick in the template strand introduced by RNaseHII. In contrast, RNaseHII(E17A) forced RNAP to pause at position +54, demonstrating that the binding of RNaseHII(E17A) to its substrate is strong enough to halt the moving EC. The 14-nt distance between the site of substrate binding of RNaseHII and the position of the RNAP catalytic center is in good agreement with the structural models of the TC-RER complexes (Figures 2C and 4C).
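The comparison between the biochemical and structural distances is simple arithmetic, spelled out in the sketch below. The positions (+68 for the embedded rCMP, +54 for the RNaseHII(E17A)-induced pause) and the structural estimate (15 ± 1 nt) come from the text; the variable and function names are illustrative only.

```python
# Compare the pause-site distance from the transcription assay with the
# catalytic-center spacing inferred from the cryo-EM models.
RNMP_POSITION = 68        # embedded rCMP in the template strand
E17A_PAUSE_POSITION = 54  # RNAP pause induced by RNaseHII(E17A)
STRUCTURAL_SPACING = 15   # nt, from the TC-RER structures
STRUCTURAL_TOLERANCE = 1  # nt


def spacing(rnmp_position: int, pause_position: int) -> int:
    """Distance (nt) between the RNAP active site at the pause and the rNMP."""
    return rnmp_position - pause_position


observed = spacing(RNMP_POSITION, E17A_PAUSE_POSITION)
consistent = abs(observed - STRUCTURAL_SPACING) <= STRUCTURAL_TOLERANCE

print(f"observed spacing: {observed} nt; within 15 ± 1 nt: {consistent}")
```

The observed 14-nt spacing falls at the lower edge of the 15 ± 1 nt window, which is the sense in which the assay and the structures agree.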

TC-RER in living cells
In vivo XLMS suggests that most RNaseHII molecules associate with RNAP at any given moment during exponential growth (Figure 1A). The structures of TC-RER complexes further suggest that the binding of RNaseHII to the EC precedes its three-dimensional (3D)-diffusion-limited interaction with the substrate, resulting in facilitated delivery of RNaseHII to rNMP substrates via 1D scanning during transcription. To test this hypothesis, we examined the effect of transcription elongation on the rate of RER in vivo.
Figure 2. Overview of the cryo-EM density map and the structural model of the unengaged TC-RER complex (A) A schematic of the nucleic acid scaffold used to assemble the tertiary elongation complex (EC18). ntDNA, non-template DNA strand; tDNA, template DNA strand; nt, nucleotide. Upstream (−) and downstream (+) positions are indicated and numbered accordingly. (B) A surface view of the cryo-EM density map of the EC18-RNaseHII complex at a 2.96-Å nominal resolution. The color coding is maintained in all figures, unless otherwise indicated. RNAP core subunits: α1, olive; α2, gray; β, lime; β′, blue; ω, purple; RNaseHII, cyan; tDNA, orange; ntDNA, red; RNA, yellow. uDNA, upstream DNA; dDNA, downstream DNA. (C) The structural model of EC18-RNaseHII shown in a cylinders/stubs representation with individual components denoted. (D) Major protein-protein interactions of RNaseHII with the EC18. Upper left: overview of the EC18-RNaseHII complex. The arrow indicates the direction of DNA movement during transcription elongation. Upper right and lower: magnified view of the boxed region. The RNAP β′ subunit is colored blue, and RNaseHII is colored cyan with the α helix shown as cylinders. Interacting residues are labeled and shown as sticks. Black dashed lines denote H-bonds and salt bridges. (E) Extensive interface between RNaseHII and the RNAP β′ subunit. The RNAP β′ subunit is shown in surface representation with electrostatic potential. Positively and negatively charged regions are shown in blue and red, respectively. RNaseHII is in semi-transparent ribbons/slabs representation. The interacting surface has an area of 671 Å², measured as the buried solvent-accessible surface area (SASA) with the probe radius set to 1.4 Å using the ChimeraX "measure buried area" function. 40 The RNaseHII-interacting surface of the RNAP β′ subunit (two separate domains) is outlined in green. RNaseHII residues R20, K47, K48, K52, and R53 are shown as sticks and labeled. See also Figures S1 and S2, Table S2, and Video S1.
To estimate the impact of transcription coupling on RER, we measured the chromosomal rNMP removal as a function of local transcription (Figure 5). Genomic DNA was isolated and treated with RNaseHII, and rNMP density within a region of interest (ROI) was estimated using a semi-long-range quantitative PCR (SLR-qPCR) (Figures 5A-5D, S5A, and S5B; see STAR Methods). In vitro RNaseHII probing ensures that the amplification-interfering single-strand breaks are specific to misincorporated genomic rNMPs. Indeed, in contrast to RNaseHII, the treatment with RNaseHI does not affect the reaction threshold cycle (C T) values (Figure S5C). To firmly control transcription within a ROI, we utilized chromosomal insulators. 36 To minimize transcriptional readthrough from upstream or downstream regions, a cluster of intrinsic terminators was placed to flank a ROI (Figure 5A). A cognate isopropyl β-D-1-thiogalactopyranoside (IPTG)-inducible promoter (lac p ::lacZ) allows transcription to occur within the lacZ insulator. Alternatively, the insulator is severely deprived of transcription due to a promoter deletion (Δlac p ::lacZ) (Figure 5A). IPTG induction resulted in strong transcription within the lac p ::lacZ insulator, but not in the case of the Δlac p ::lacZ insulator (Figures 5B and S5D). Cells displayed robust RER within the IPTG-activated lac p ::lacZ insulator, which was greatly diminished within the promoter-less Δlac p ::lacZ insulator (Figure 5C). Similar results were obtained using another insulator constructed at a different chromosomal location (nupG) (Figures 5B and 5C). These results demonstrate that RER is coupled to transcription within the two different ROIs in vivo.

Next, we engineered a chromosomal RNAP β′ mutation (rpoC HII), which carries the same mutations that weakened RNaseHII binding to RNAP in vitro (Figure S4), in the WT and ΔrnhB cells. The transcriptional activity of mutant RNAP HII was virtually indistinguishable from the WT RNAP both in vitro and in vivo (Figures S4D and S4E). The rpoC HII mutation increased the amount of unrepaired rNMPs approximately 5-fold within the lacZ ROI (Figure 5D); the deletion of RNaseHII itself (positive control) resulted in an approximately 11-fold higher density of rNMPs within the same ROI and experimental time frame. Introducing the rpoC HII mutation into the RNaseHII-deficient cells does not cause any additional suppression of RER above the ΔrnhB background, demonstrating a fully epistatic relationship between rpoC HII and ΔrnhB (Figure 5D). Considering that RNAP HII only weakens, but does not abolish, the interaction between RNaseHII and RNAP in vitro (Figures S4A-S4C), these results demonstrate that even a partial disruption of RNaseHII-RNAP contacts has a major negative effect on the efficiency of repair and that local transcription is critical for RER.
Figure 4. Overview of the cryo-EM density map and the structural model of the engaged TC-RER complex (A) A schematic of the nucleic acid scaffold used to assemble the tertiary elongation complex EC18(rGMP). Ribo-GMP is shown as rG at position +16 and colored red. Upstream (−) and downstream (+) positions are indicated and numbered accordingly. (B) A surface view of the cryo-EM density map of the EC18(rGMP)-RNaseHII(E17A) complex at a 3.16-Å nominal resolution. Template DNA strand (tDNA) and non-template DNA strand (ntDNA) are indicated. (C) A structural model of EC18(rGMP)-RNaseHII(E17A) shown in a cylinders/stubs representation with individual components denoted. Lower: left, major protein-protein interactions of RNaseHII with the EC18(rGMP); middle and right, the substrate recognition and cleavage by transcription-coupled RNaseHII. Individual components are denoted, and the rG at position +16 is colored magenta with heteroatom-coloring. See also Figures S1 and S2, Table S2, and Video S3.
To further support this conclusion, we constructed RNaseHII mutations that, according to our structural model, should compromise the interacting surface with RNAP. Four residues of RNaseHII were selected to weaken the RNaseHII-RNAP interface without affecting the overall architecture and catalytic properties of the enzyme (Figures 2D, 2E, and 4C). The resulting chromosomal mutant K48A/K52A/R53A/E60A was designated as rnhB RNAP. We confirmed that the purified RNaseHII RNAP enzyme is nearly as active as the WT RNaseHII enzyme in vitro (Figures S6A and S6B). We also combined rnhB RNAP with rpoC HII to make a "double" chromosomal mutant rpoC HII/rnhB RNAP and examined the mutant strains for their ability to perform RER (Figure 5D). The rnhB RNAP mutant compromises RER to approximately the same extent as rpoC HII does (Figures 5D and S6C), i.e., ~50% that of ΔrnhB. The rpoC HII/rnhB RNAP double mutant exhibits only a minor additive effect of the two individual mutants, indicating that these mutants are largely epistatic. Together, they diminish RER to almost 60% that of ΔrnhB. Epistasis is expected, as the two mutants disturb the same interaction surface from opposite sides. We conclude that our structural model accurately describes the interacting surface between RNAP and RNaseHII. These results further demonstrate that specific interactions between RNAP and RNaseHII are critical for RER in vivo.
The involvement of transcribing RNAP in repair of rNMPs implies that in the absence of ongoing transcription there is limited recruitment of RNaseHII to chromosomal DNA. To verify this assumption, we performed chromatin immunoprecipitation coupled with quantitative PCR (ChIP-qPCR) to examine DNA association of RNaseHII as a function of local transcription. To this end, we used a strain carrying a copy of rnhB 10xHis-tagged at its native chromosomal location. We confirmed that His-tagged RNaseHII fully retains its RER capability (Figure S6D). As expected, we detected a robust RNaseHII signal using anti-His antibodies within the insulator upon IPTG induction, but only if the lacZ promoter was present (Figure 5E). Thus, RNaseHII recruitment to DNA does indeed depend on local transcription elongation.
To confirm that RER is coupled to transcription genome-wide, we studied the effect of rpoC HII and rnhB RNAP mutants, as well as the specific RNAP inhibitor rifampicin (Rif), on the removal of rNMPs from genomic DNA, detected as RNaseHII-sensitive sites (Figure 6). The high molecular weight (HMW) genomic DNA was isolated from mutant cells (Figures 6A and 6B), or WT cells at different time points of incubation with Rif (Figure 6C), treated with RNaseHII, and then resolved on a mildly alkaline agarose gel. For the positive control, we used RNaseHII-deficient cells. The loss of HMW DNA products isolated from ΔrnhB and ΔrnhAΔrnhB cells, lacking both RNaseHI and RNaseHII, but not ΔrnhA cells, is indicative of multiple genomic rNMPs (Figure 6A). This result confirms that we detected chromosomal misincorporations repaired predominantly by RNaseHII, whereas RNaseHI plays a secondary role in this process. 41 A significant loss of HMW DNA products (in response to RNaseHII treatment) isolated from rpoC HII, rnhB RNAP, and rpoC HII/rnhB RNAP cells demonstrates that even a partial uncoupling from transcription by the mutations compromises RER genome-wide (Figure 6B), supporting the results of local RER (Figure 5D).
Figure 5. RER is coupled to transcription in vivo (A-C) Depriving the genomic loci of active transcription compromises local RER. (A) Transcriptional insulator. Chromosomal lacZ and nupG loci, with or without native promoters, were shielded from upstream or downstream RNAP readthroughs with the clusters of strong intrinsic terminators. 36 Positions of the primers used for RT-qPCR (short amplicon) and SLR-qPCR (short and long amplicon) are indicated. (B) Transcription within lacZ and nupG insulators as detected by RT-qPCR. Statistical analysis was performed using a two-tailed unpaired non-parametric Mann-Whitney test. All values are mean ± SD from at least six independent experiments. **p < 0.01. (C) rNMP repair within the insulators as detected by the SLR-qPCR assay (STAR Methods). RER depends on promoter-initiated transcription within the insulator. All values are mean ± SD from at least six independent experiments. **p < 0.01. (D) The rpoC HII and rnhB RNAP mutations (in RNAP and in RNaseHII, respectively), which weaken RNaseHII-RNAP interactions, and the combination of these two mutations (rpoC HII/rnhB RNAP) compromise local RER. The relative efficiency of RER was determined as in (C). rnhB deletion (ΔrnhB) is used as a positive control. The negative effect of the ΔrnhB/rpoC HII double mutant on local RER is no greater than that of ΔrnhB alone, indicating the fully epistatic relationship between the two mutants. All values are mean ± SD from at least six independent experiments. **p < 0.01. (E) Recruitment of RNaseHII to DNA depends on ongoing local transcription. The occupancy of RNaseHII within the lacZ insulator (with or without promoter) was determined by ChIP-qPCR (STAR Methods). **p < 0.01; ns, not significant. All values are mean ± SD from at least six independent experiments. See also Figures S4 and S5.
The amount of HMW products gradually decreased over time in the presence of Rif (50 μg/mL), demonstrating that at 90 min of incubation with Rif, the level of misincorporated rNMPs has increased to approximately 40% of the level observed in ΔrnhB cells (Figure 6C), arguing that continuous transcription is important for global RER. Of note, this "low" amount of Rif, while sufficient to stop bacterial growth, is not enough to stop all transcription in E. coli 42; it requires a much higher dose to achieve near absolute transcription inhibition. 36 However, the "high" dose of Rif (750 μg/mL), while still bacteriostatic for E. coli, also halts chromosomal replication initiation 43 and as a result would prevent most of the de novo rNMP misincorporation. This explains the ability of low but not high Rif to promote the accumulation of RNaseHII-sensitive genomic products over time (Figures 6C, right panel and S6B) and provides an internal control for the specificity of Rif effects on TC-RER. We thus conclude that even a partial inhibition of productive transcription elongation has a major negative effect on global RER. Western blot analysis confirms that the intracellular level of RNaseHII does not decrease during the time of incubation with Rif (Figure S6E), and Rif does not affect the activity of RNaseHII (Figure S6F), thus supporting the direct role of transcription coupling in RER.
Figure 6. Partial uncoupling from transcription compromises genomic RER (A) Global RER assay. Genomic DNA was isolated, treated with RNaseHII, and then resolved on mildly alkaline agarose gels. Representative gel for the WT and ΔrnhA/B mutants is shown on the left. Relative amount of rNMP-free DNA (%) was plotted as a ratio of high molecular weight (HMW) products before and after RNaseHII treatment. Numbers are mean ± SEM from three independent experiments. (B) rpoC HII and rnhB RNAP mutations that weaken RNAP-RNaseHII interactions compromise genomic RER. Genomic DNA from WT and mutant cells was isolated, treated with RNaseHII, or left untreated, and then resolved on alkaline agarose gels. Representative gel is shown on the left. HMW area (indicated by the bracket) is used for quantitation. The rnhB deletion is used as a positive control. The percentage of repaired (rNMP-free) DNA in RNaseHII-treated samples is plotted relative to the untreated samples. Data are mean ± SEM from three independent experiments. *p < 0.05. (C) The inhibition of promoter escape by RNAP with rifampicin suppresses global RER. Cells were incubated with low (50 μg/mL) or high (750 μg/mL) rifampicin for the indicated times, followed by genomic DNA processing as in (A). Representative alkaline gel for low rifampicin samples is shown on the left (see also Figure S6B for high Rif comparison). Data are mean ± SEM from at least three independent experiments. *p < 0.05. See also Figures S4 and S6.

DISCUSSION
The only DNA repair pathway for which transcription coupling (TC) has been unequivocally established is nucleotide excision repair (NER). 36,44-47 In TC-NER, RNAP acts as a primary sensor of DNA damage, as a vehicle that delivers NER enzymes to the lesion sites, and as a scaffold for proper assembly of the functional NER complex. 36,46 Remarkably, the present results show that RNAP serves similar functions in RER, except that the primary damage sensor must be RNaseHII itself, whereas RNAP operates primarily as a "delivery vehicle" (Figure 7). Indeed, RNAP enables 1D unidirectional diffusion of RNaseHII along DNA, while precisely orienting its catalytic center relative to the DNA substrate for prompt lesion recognition and processing (Figure 7B). Such an assisted diffusion resolves an apparent incongruity between highly efficient RER and multiple confounding factors that hamper this process in living cells, including the low abundance of RNaseHII, 26 high frequency of chromosomal rNMPs, 4,7,9 macromolecular crowding, DNA compaction, and numerous DNA-binding proteins capable of shielding the substrate from RNaseHII. 27
Bulky DNA lesions, such as those processed by NER, usually present an insurmountable block to the elongating RNAP, which then requires UvrD-mediated RNAP backtracking to expose them to NER enzymes. 36,48 In contrast, misincorporated rNMPs only transiently pause RNAP (Figure 3E) and are unlikely to trigger extensive backtracking in vivo. Indeed, we see no significant effect of uvrD deficiency on RER (Figure S6D), arguing that TC-RER is unlikely to operate via a backtracking step. In contrast to TC-NER, where RNAP serves as the primary sensor for chromosomal lesions, TC-RER utilizes RNAP not as a scanner but as a "motor vehicle" to facilitate a 1D search for misincorporated rNMPs. Backtracking-driven TC-NER may also contribute to repair of rNMPs. 41 However, the role of TC-NER in this process must be secondary, as the deletion of uvrA has virtually no effect on RER in the presence of active RNaseHII (Figure S6D).
The conformation of RNaseHII in the engaged TC-RER complex allows it to bind an rNMP in the template strand and nick the 5′ O-P bond (Figure 4C). Such an apparent strand specificity implies that all parts of the bacterial genome should be transcribed for TC-RER to function on both strands. Indeed, the phenomenon of pervasive transcription has been well established in bacteria. 49 Our RNA-seq analysis supports this view, demonstrating that virtually all antisense and intergenic regions are transcribed, albeit at a generally lower efficiency, in exponentially growing E. coli (Figure S3C). 46 Mammalian RNaseH2 has been proposed to interact with the replicative machinery and/or PCNA, which may enable its 1D scanning for rNMPs. 30,31 Our in vivo XLMS analysis detects no stable or specific interaction between RNaseHII and the bacterial replisome or its associated factors, including the PCNA paralog, the DnaN beta clamp (Figure 1A). It should be noted that bacterial RNaseHII is active as a single polypeptide, whereas the eukaryotic enzyme is active as a three-subunit complex. The modes of recruitment and the interaction partners may therefore differ according to subunit composition. It appears that bacterial RNaseHII relies exclusively on transcription coupling to achieve efficient RER. Recently, the genomic co-localization of RNaseH2 with RNAP II in human cells has been reported. 50 Although the role of RNAPII-associated RNaseH2 was proposed to be in diminishing co-transcriptional R-loops, it is tempting to speculate that it may also function in TC-RER. Such mammalian TC-RER could oppose transcription-associated mutagenesis involving ribonucleotides, 51 a process which has recently been implicated in cancer. 52

Limitations of the study
Our findings establish the TC-RER phenomenon in E. coli and explain its molecular mechanism. However, it remains to be determined how conserved this process is across the bacterial kingdom and beyond. Our study does not provide a high-resolution genome-wide analysis of RER, which may eventually establish the rates and distribution of RER hotspots as a function of transcription elongation across bacterial genomes.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

ACKNOWLEDGMENTS
We thank William Rice, Bing Wang, and Alice Paquette for helping with sample screening and data collection at NYU Langone Health's Cryo-EM Laboratory. We thank the BigPurple HPC core at NYU Langone Health and the Greene HPC at NYU for computer access.

DECLARATION OF INTERESTS
The authors declare no competing interests.

RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Dr. Evgeny Nudler (evgeny.nudler@nyulangone.org).

Materials availability
This study did not generate new unique reagents. All strains and plasmids are available upon request.

Data and code availability
Coordinates for the structural models of EC18-RNaseHII and EC18(rGMP)-RNaseHII(E17A) have been deposited in the Protein Data Bank under PDB accession numbers 7UWE and 7UWH, respectively. The cryo-EM density maps for EC18-RNaseHII and EC18(rGMP)-RNaseHII(E17A) have also been deposited in the Electron Microscopy Data Bank under accession numbers EMD-26830 and EMD-26832, respectively. All deposited data are publicly available as of the date of publication. This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

EXPERIMENTAL MODEL AND SUBJECT DETAILS
The E. coli strain MG1655 and its derivatives were used for all genetic and functional studies. The E. coli strain BL21(DE3) Tuner was used for recombinant protein expression for structural and biochemical studies.

METHOD DETAILS
In vivo XLMS (covalent crosslink mapping by mass spectrometry) and data analysis
E. coli strain construction and in vivo XLMS analysis were carried out using a previously described protocol. 38
Strain construction
rpoC:10XHis and rnhB:3XFLAG strains were constructed by the introduction of coding sequences for 10XHis and DYKDDDDKDYKDDDDKDYKDDDDK after codons 1407 (rpoC) and 198 (rnhB), respectively, into the parental E. coli strain MG1655 by means of lambda Red-mediated gene replacement. 64 Successful construction was confirmed in each case by genomic sequencing and whole-cell enumerative proteomics.
In vivo DSS crosslinking
Cells were grown in 0.5X Terrific Broth (Thermo Fisher Scientific, AAJ75856A1) at 37 °C with agitation (250 rpm). When OD600 reached 0.5, the culture was supplemented with 350 mM DSS (Proteochem, c1105) in DMSO ZerO2 (Millipore Sigma, 900645) to a final concentration of 2 mM. The reaction was quenched after 45 min of incubation by the addition of Tris-HCl, pH 8.0 (Thermo Fisher Scientific, AAJ22638AP) to a final concentration of 5 mM. Cells were harvested by centrifugation at 6,000 × g for 5 min at 4 °C and processed immediately or stored at -80 °C.
In vivo formaldehyde cross-linking
Cells were grown in 0.5X Terrific Broth (Thermo Fisher Scientific, AAJ75856A1) at 37 °C with agitation (250 rpm) until OD600 reached ~0.5 and were then supplemented with a 10% aqueous solution of formaldehyde (Electron Microscopy Sciences, 15712) to a final concentration of 0.025%. Cells were harvested by centrifugation at 6,000 × g for 5 min at 4 °C and processed immediately or stored at -80 °C.
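The amount of cross-linker stock needed for a given culture follows from simple dilution arithmetic (C1·V1 = C2·V2). The snippet below is only an illustrative sketch: the stock and final concentrations are those stated above, whereas the 1,000 ml final volume and the function name `stock_volume` are assumptions introduced for the example.

```python
def stock_volume(c_stock: float, c_final: float, v_final: float) -> float:
    """Return the stock volume V1 that satisfies C1 * V1 = C2 * V2.

    c_stock and c_final must share one unit; the result has the unit of
    v_final, the total volume after the stock has been added.
    """
    if not 0 < c_final <= c_stock:
        raise ValueError("final concentration must be positive and not exceed the stock")
    return c_final * v_final / c_stock


# Hypothetical 1,000 ml final volume; concentrations are taken from the protocol.
dss_ml = stock_volume(c_stock=350.0, c_final=2.0, v_final=1000.0)            # mM
formaldehyde_ml = stock_volume(c_stock=10.0, c_final=0.025, v_final=1000.0)  # percent

print(f"DSS stock: {dss_ml:.2f} ml; formaldehyde stock: {formaldehyde_ml:.2f} ml")
```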
Affinity purification of 10XHis-tagged complexes
Cells were suspended in lysis buffer (50 mM HEPES, pH 7.5, 500 mM NaCl, 2 mM MgSO4, 5 mM ZnSO4, 1 mM TCEP, 1X ProBlock Gold Bacterial 2D, Gold Bio, GB-108-2) and lysed by the combined action of lysonase (Millipore Sigma, 71230) and ultrasonication. A cell-free extract was prepared by centrifugation at 29,500 × g for 45 min at 4 °C, and 10XHis-tagged proteins were purified using His Mag Sepharose excel (Cytiva, 17371220), according to the manufacturer's protocol.
Affinity purification of 3XFLAG-tagged complexes
Cells were suspended in a lysis buffer (50 mM HEPES, pH 7.5, 125 mM NaCl, 1 mM TCEP, 1X ProBlock Gold Bacterial 2D, Gold Bio) and lysed by the combined action of lysonase and ultrasonication. A cell-free extract was prepared by centrifugation at 29,500 × g for 45 min at 4 °C, and 3XFLAG-tagged proteins were purified using Pierce Anti-DYKDDDDK Magnetic Agarose (Thermo Fisher Scientific, PIA36797), according to the manufacturer's protocol.

Mass spectrometry and data analysis
Scientific, LS121-500). Eluted peptides were dried and resuspended in 20 µl of 0.1% (v/v) formic acid (FA, Thermo Fisher Scientific, LS118-4) for MS analysis. Peptides were analyzed in an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific) coupled to an EASY-nLC (Thermo Scientific) liquid chromatography system, with a 2 µm, 500 mm EASY-Spray column (Thermo Scientific, ES903). The peptides were eluted over a 120-min linear gradient from 96% Buffer A (0.1% FA in water) to 40% Buffer B (0.1% FA in ACN), then continued to 98% Buffer B over 20 min with a flow rate of 250 nl/min. Each full MS scan (R = 60,000) was followed by 20 data-dependent MS2 scans (R = 15,000) with high-energy collisional dissociation and an isolation window of 2.0 m/z. The normalized collision energy was set to 35. Precursors of charge state 2 and higher were collected for MS2 scans in enumerative mode, and precursors of charge state 4-6 were collected for MS2 scans in crosslink discovery mode (both were performed for each sample); monoisotopic precursor selection was enabled and the dynamic exclusion window was set to 30.0 s. Raw files obtained in the enumerative mode were analyzed using the pFind3 software 54 in open search mode, using the entire MG1655 proteome as the search space (Uniprot UP000000625). Fasta sequences of identified proteins formed the search space for crosslink discovery by pLink2 37 ; protein modifications inferred by pFind3 and comprising >0.5% of the total were included as variable modifications in the pLink2 search parameters. pLink2 results were filtered for FDR (<5%), e-value (<1.0E-3), score (<1.0E-2), and abundance (PSMs ≥ 5). Label-free protein quantitation was performed using MetaMorpheus. 65

Protein expression and purification
WT E. coli RNAP was purified as previously described. 38 RNAP mutants were purified using the same protocol.
The open reading frame of the full-length E. coli RNaseHII protein was cloned into a modified pSUMO vector, which has the 10XHis tag conjugated with an N-terminal SUMO tag. The plasmid was used to transform E. coli strain BL21 (DE3), and recombinant protein expression was auto-induced. 66 After 16 hours of cultivation at 30 °C, cells were harvested by centrifugation (4,000 × g for 10 min at room temperature) and pellets were stored at -80 °C. Cell pellets were resuspended in a lysis buffer (50 mM Tris-HCl, pH 7.5, 10% (v/v) glycerol, 50 mM KCl) supplemented with complete, EDTA-free protease inhibitor cocktail tablets (Roche Applied Science) and lysed using sonication (5-second pulses with 10-second intervals for 10 minutes on ice). The cell lysate was clarified by centrifugation (30,000 × g for 40 minutes at 4 °C) to remove insoluble debris. The supernatant was applied to a HisTrap column (GE Healthcare) equilibrated in HisTrap Buffer A (50 mM Tris-HCl, pH 8.0, 5% (v/v) glycerol, 0.5 mM β-mercaptoethanol, 500 mM NaCl, 10 mM imidazole). The column was washed with 20 column volumes (CV) of HisTrap Buffer A. Protein was eluted with HisTrap Buffer B (50 mM Tris-HCl, pH 8.0, 5% (v/v) glycerol, 0.5 mM β-mercaptoethanol, 250 mM NaCl, 250 mM imidazole). Fractions containing recombinant 10XHis-SUMO-RNaseHII eluted from the HisTrap column were subjected to 10XHis-SUMO tag cleavage using SUMO protease (Invitrogen) in dialysis buffer (50 mM Tris-HCl, pH 8.0, 5% (v/v) glycerol, 0.5 mM β-mercaptoethanol, 100 mM NaCl) at 4 °C for 16 hours. After the 10XHis-SUMO tag cleavage, the protein mixture was applied to a HisTrap column (GE Healthcare) equilibrated in HisTrap Buffer A. The flow-through containing non-tagged RNaseHII was pooled, concentrated, and applied to a Superdex 200 size-exclusion chromatography (SEC) column (GE Healthcare) equilibrated in 20 mM Tris-HCl, pH 7.5, 2 mM MgCl2, 150 mM KCl, 1 mM dithiothreitol. The RNaseHII-containing peak fractions were collected, flash-frozen in liquid nitrogen, and stored at -80 °C. The RNaseHII mutants were purified using the same protocol.
Nucleic-acid scaffold preparation
Synthetic DNA and RNA oligonucleotides were obtained from Integrated DNA Technologies (IDT). The nucleic acids were dissolved in RNase-free deionized water at a concentration of 1 mM. To assemble the scaffold, template DNA and RNA were mixed at a 1:1 ratio and annealed by incubation at 95 °C for 2 min, 75 °C for 2 min, and 45 °C for 5 min, followed by decreasing the temperature by 5 °C every 2 min until reaching 25 °C. The annealed template DNA:RNA hybrid was stored at -20 °C until use.
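For readers programming a thermocycler, the annealing program above can be written out as an explicit list of (temperature, hold time) steps. This is a minimal sketch rather than the authors' instrument file; in particular, the function name `annealing_schedule` and the assumption that the final 25 °C step is itself held for 2 min are introduced here for illustration.

```python
def annealing_schedule(ramp_start=45, ramp_end=25, ramp_step=5, ramp_hold_min=2):
    """Return the annealing program as (temperature in deg C, hold in min) tuples."""
    # Fixed holds stated in the protocol.
    steps = [(95, 2), (75, 2), (45, 5)]
    # Step-down ramp: lower the temperature by ramp_step until ramp_end is reached.
    temp = ramp_start - ramp_step
    while temp >= ramp_end:
        # Assumption: every ramp step, including the last one, is held for 2 min.
        steps.append((temp, ramp_hold_min))
        temp -= ramp_step
    return steps


schedule = annealing_schedule()
total_minutes = sum(hold for _, hold in schedule)

for temp_c, hold_min in schedule:
    print(f"{temp_c} C for {hold_min} min")
print(f"total hold time: {total_minutes} min")
```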

Preparation of EC-RNaseHII complexes for Cryo-EM
Elongation complex formation was performed as described previously. 38 Purified E. coli RNAP was mixed with a corresponding template DNA:RNA hybrid at a molar ratio of 1:1.3 and incubated for 30 min at 30 °C. Nontemplate DNA was added at a molar ratio of 3:1 and incubated for 20 min. To remove excess nucleic acid, the complexes were applied to a Superose 6 Increase SEC column (GE Healthcare) that was equilibrated in 20 mM Tris-HCl, pH 8.0, 2 mM MgCl2, 50 mM KCl, and 1 mM dithiothreitol. The peak fractions containing the elongation complex (EC) were pooled and subsequently mixed with purified RNaseHII/RNaseHII(E17A) at a molar ratio of 1:5, followed by SEC over a Superdex 200 Increase column (GE Healthcare) that was equilibrated in 20 mM Tris-HCl, pH 8.0, 2 mM MgCl2, 50 mM KCl, and 1 mM dithiothreitol to remove excess RNaseHII/RNaseHII(E17A). EC-RNaseHII complexes were then concentrated to ~3.5 mg/ml. CHAPSO [3-([3-cholamidopropyl]dimethylammonio)-2-hydroxy-1-propanesulfonate] (Sigma Millipore) was added to the samples to 8 mM right before grid preparation.

RNaseHII-binding assay
For the RNaseHII-binding assay of WT RNAP and RNAP HII, the same protocol for EC18-RNaseHII formation as described above was used. The final product was subjected to SDS-PAGE analysis and Coomassie blue staining, followed by gel quantitation using ImageJ (V1.53k). 61
Dynamic light scattering (DLS) analysis
DLS of RNAP-RNaseHII complexes was carried out using a DynaPro Nanostar instrument (Wyatt Technology Corporation). 67 Aliquots (0.1 ml) of sample at 1-2 mg/ml in single-use cuvettes (Uvette 220-1600, Eppendorf 952010077) were stabilized at 25 °C, and measurements were taken as a series of 20 acquisitions of 5 s each, with auto-attenuation of laser power on. Polydispersity (%) and radius (nm), averaged over the 20 individual measurements, were determined using Dynamics V7 software (Wyatt Technology Corporation), and molecular weight (kDa) was estimated from the apparent radius (Rayleigh sphere approximation) using the same software.
Cryo-EM grid preparation
UltrAuFoil (Quantifoil) R-0.6/1 Au 300 mesh grids were plasma-cleaned for 20 seconds. After applying 3.5 µl of sample, grids were blotted for 0.5 or 1 second with a blotting force of 0 and vitrified in liquid ethane using a Vitrobot Mark IV (FEI) with 100% humidity at 22 °C.

Cryo-EM data acquisition and processing
The EC18-RNaseHII dataset was collected on a Titan Krios electron microscope (FEI) operated at 300 kV and equipped with a Gatan K3 Summit direct electron detector. The EC18(rGMP)-RNaseHII(E17A) dataset was collected on a Talos Arctica electron microscope (FEI) operated at 200 kV and equipped with a Gatan K3 Summit direct electron detector. Images of EC18-RNaseHII were recorded in super-resolution mode with a pixel size of 0.426 Å and a defocus range of 0.8-2.0 µm, using a total dose of 62 electrons/Å² fractionated over 50 frames. Images of EC18(rGMP)-RNaseHII(E17A) were recorded in super-resolution mode with a pixel size of 0.548 Å and a defocus range of 0.8-2.0 µm, using a total dose of 56 electrons/Å² fractionated over 50 frames.
Collected micrographs were drift-corrected and dose-weighted in MotionCor2, 57 and the contrast transfer function (CTF) parameters were estimated using CTFFIND4. 56 Approximately 10,000 particles were manually picked and subjected to 2D classification in RELION-3, 58 which was used for all subsequent image processing. Projection averages of the 10 most populated classes were used as templates for automated particle picking in RELION-3. The original drift-corrected micrographs were manually inspected and used for particle extraction. All the extracted particles were subjected to 2D classification in RELION-3. Poorly populated classes were removed, resulting in datasets of 2,928,164 particles (EC18-RNaseHII dataset) and 1,134,629 particles [EC18(rGMP)-RNaseHII(E17A) dataset].
The EC18-RNaseHII particles were first 3D auto-refined, then subjected to focused 3D classification at the RNaseHII density and classified into 2 classes using the refined angles without alignment. The class that showed better density in the RNaseHII sub-region contained 681,875 particles. These particles were then 3D auto-refined and subjected to 3D classification on the overall density into 5 classes using the refined angles and local angular searches. The best 3 classes, which showed the highest resolution in both the EC and RNaseHII sub-regions and contained 423,328 particles, were auto-refined, post-processed, and subjected to two cycles of CTF refinement, particle polishing, and 2D classification (cleanup) in RELION-3, yielding the final density map at an overall resolution of 2.96 Å. The final dataset contains 142,145 particles (~5% of the particles in the starting dataset). Local resolution calculations were performed using RELION-3.
The EC18(rGMP)-RNaseHII(E17A) particles were first 3D auto-refined, then subjected to 3D classification into 5 classes using the refined angles and local angular searches. The class that showed clearer density of RNaseHII, containing 17.2% of the particles in the starting dataset, was subjected to focused 3D classification into 2 classes on the RNaseHII sub-region. The class that showed better density in the RNaseHII sub-region, containing 90,749 particles, was auto-refined, post-processed, and subjected to two cycles of CTF refinement, particle polishing, and 2D classification (cleanup) in RELION-3, yielding the final density map at an overall resolution of 3.16 Å. The final dataset contains 70,705 particles (~10% of the particles in the starting dataset). Local resolution calculations were performed using RELION-3.

Model building and refinement
To build an initial model for EC-RNaseHII, the atomic model of the EC part (PDB: 6XAS) 38 and a homology model of E. coli RNaseHII were fit into the cryo-EM density map using ChimeraX. 40 The same structures plus the atomic model of downstream-extended DNA were fit into the cryo-EM density map of EC18(rGMP)-RNaseHII(E17A). These initial models were real-space refined in PHENIX. 59 The RNAP subunits, RNaseHII, and the nucleic acids were first refined as rigid bodies and were subsequently refined with secondary-structure restraints. Mutated residues were built de novo in Coot 55 and real-space refined with the previously refined model.

E. coli RNAP and RNaseHII mutants and strains construction
To construct an E. coli RNAP core expression plasmid carrying rpoC151-155 MTNLE/GTGLG & rpoC195-207 EQECEQL REELNE/AAECAQAAAELAA mutations, a fragment of the rpoC gene with two clusters of point mutations was cloned and amplified from the pVS10 plasmid 53 and was inserted into the BsmI and SbfI sites of pVS10 to replace the corresponding region of the WT gene.
E. coli strains with rpoC multi-point mutations were constructed by using the lambda Red recombineering method together with CRISPR-Cas9 counterselection as previously described. 68 Double-stranded DNA (dsDNA) for recombineering was PCR amplified using oC16 and oC17 oligos, and the pVS10-RNAP HII plasmid carrying two clusters of mutations as a template. pKDsg derivative (pSg-155) used for counterselection was constructed by circular polymerase extension cloning using primers (o155sgF and o155sgR) with overlapping 20-bp protospacer sequences that corresponded to the fragment adjacent to appropriate PAM site (5'-NGG-3') in the mutated parental sequences.
E. coli MG1655 strains were first transformed with the pCas9cr4 plasmid and subsequently transformed with the sgRNA encoding plasmid (pSg-155). Cells that possessed both plasmids were grown in Super Optimal Broth with spectinomycin (Sp, 50 mg/l) and chloramphenicol (Cm, 30 mg/l) at 30 C. When OD 600 reached $0.5, lambda Red was induced with 1.2% (w/v) L-arabinose, and the cells were grown for another 20 min. Then, the dsDNA for recombineering (0.5 mg) was electroporated into the cells. After 2 hours of recovery, the cells were plated on Luria broth (LB) with Sp, Cm and anhydrotetracycline (aTc, 100 ng/l) and incubated overnight at 30 C to select for survivors of the CRISPR/Cas9 selection. Colonies were screened by PCR with the forward mutation-specific primers oN152A (to check the rpoC151-155 mutation cluster) or oN206A (specific to the rpoC195-207 mutation cluster) and the reverse primer oC11. The relevant chromosome region was verified by sequencing.
To eliminate the pSg-155 plasmid, cells were incubated in LB for 12 hours at 37 C and streaked on LB agar plates. Individual colonies were selected and assessed for Sp resistance loss. The pKDsg-15a plasmid was used to cure the pCas9cr plasmid that targeted the p15a origin of replication of pCas9cr. Upon the transformation of pKDsg-15a into cells that contained pCas9cr, the cells were recovered in SOC (Super Optimal broth with Catabolite repression) for 2 hours at 30 C, then aTc (100 ng/l) was added and incubated for an additional 2 hours before plating on LB with Sp and aTc. The pKDsg-15a plasmid was cured by growth at 37 C.
The rnhB mutations encoding RNaseHII K48A K52A R53A E60A were introduced into the MG1655 strain or the strain containing rpoC mutations by ssDNA recombineering and CRISPR-Cas9 counterselection. The sgRNA-encoding plasmid (pSg-H2-53) and the mutagenic ssDNA oligo (oRnh48-60) were used. To obtain pSg-H2-53, the pKDsg plasmid was re-targeted by circular polymerase extension cloning using primers oH2R53sgF and oH2R53sgR. Colonies were screened with the primers oRnh48c (specific to the K48A change) and oRnh2 (reverse). The sequence of the rnhB region of the resulting strains was verified by DNA sequencing.
An E. coli strain with chromosomal 10XHis-tagged RNaseHII was constructed by introducing the coding sequence for 10XHis at the C-terminus of rnhB in the E. coli MG1655 strain using the λ Red expression plasmid system. 64,69 The other mutants, ΔrnhA, ΔrnhB, and ΔrnhA ΔrnhB, were constructed using the same protocol. All generated mutant strains were confirmed by PCR and DNA sequencing (Macrogen USA).

RNA isolation and RT-qPCR analysis
An overnight bacterial culture (~16 hours) was diluted 100-fold in Luria broth (LB) and grown at 37 °C with agitation (250 rpm). Transcription was induced at an optical density of 0.3 using 0.1 mM IPTG for 30 minutes. Total RNA was isolated using RNAlater (Ambion) stabilization solution and the Direct-zol RNA Microprep kit (Zymo Research). RNA was quantified and reverse transcribed using the Quantitect reverse transcription kit (Qiagen). cDNA was used as the template in qPCR using a Quantstudio 7 (Applied Biosystems) in a 96-well plate. The qPCR mixture contained 1X SYBR Green master mix, 500 nM of each primer, and 1 µl of template in a reaction volume of 20 µl. Relative expression levels were determined by normalizing reaction threshold cycle (Ct) values to that of the reference gene, cysG. Ct values for un-induced cultures were used as calibrators for determination of relative expression by the relative quantification method (2^-ΔΔCt). 70 Statistical analysis was done using GraphPad Prism (ver. 9.0). An unpaired, non-parametric, two-tailed Mann-Whitney test was used to determine statistical significance. Primer details are given in Table S3.
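The relative quantification step can be illustrated with a short calculation. The sketch below implements the standard 2^-ΔΔCt relationship with the reference gene (cysG) and the un-induced calibrator described above; the function name and the Ct values are hypothetical and serve only as a worked example.

```python
def relative_expression(ct_target_induced: float, ct_ref_induced: float,
                        ct_target_uninduced: float, ct_ref_uninduced: float) -> float:
    """Fold change of the target gene by the 2^-ddCt method.

    dCt   = Ct(target) - Ct(reference gene), computed per condition
    ddCt  = dCt(induced) - dCt(un-induced calibrator)
    """
    d_ct_induced = ct_target_induced - ct_ref_induced
    d_ct_calibrator = ct_target_uninduced - ct_ref_uninduced
    dd_ct = d_ct_induced - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values (not from the study): the target amplifies 5 cycles
# earlier after induction, while the reference gene is unchanged.
fold_change = relative_expression(
    ct_target_induced=18.0, ct_ref_induced=20.0,
    ct_target_uninduced=23.0, ct_ref_uninduced=20.0,
)
print(f"fold change: {fold_change:.1f}")
```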
Semi long-range quantitative PCR (SLR-qPCR) and chromosomal rNMPs detection
Validation of SLR-qPCR for rNMP detection
SLR-qPCR can be used for any DNA lesion, including single-strand breaks or double-strand breaks, that creates a barrier to amplification by DNA polymerase. To verify that RNaseHII-generated single-strand nicks can inhibit DNA polymerase, a 107 bp double-stranded DNA with a single rNMP embedded in the middle of each strand, 5 bp apart, was synthesized (Figure S5A). 1.0 µg of the rNMP-containing DNA was treated with 100 nM of RNaseHII for 30 minutes at 37 °C in 1X ThermoPol buffer (NEB). The DNA was purified using a PCR purification kit (Qiagen) and subjected to qPCR (Figure S5A).
To validate the detection of rNMPs by SLR-qPCR, a series of 1 kb synthetic substrates with one, two, and three rNMPs embedded in the center was synthesized (Figure S5B). 1.0 µg of each substrate was treated with a sub-stoichiometric concentration of RNaseHII (100 nM) for 30 minutes at 37 °C in 1X ThermoPol buffer (NEB). The RNaseHII-treated substrates were purified using a PCR purification kit (Qiagen) and used to quantify rNMPs with SLR-qPCR (see below for details). Data analysis was done as described. 71 Briefly, two sets of qPCRs were performed, one with a short amplicon (~100 bp) to normalize for the concentration of template and a second one with a long amplicon (1,000 bp) to detect rNMPs. Lesion density was calculated as described 71 and expressed as lesions per 10 kb. The lesion rates detected by SLR-qPCR were plotted against the expected DNA lesions (Figure S5B). The SLR-qPCR assay, as designed, limits the concentration of RNaseHII to sub-saturating conditions where the nicking of the template is directly proportional to the number of rNMPs in the template. Each embedded rNMP has a probability M (0 < M << 1) of being cut by RNaseHII, whereas the probability of one nick in the template is N (0 < N < 1), where N = M·n and n is the number of rNMPs in the template.
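The lesion-density calculation can be sketched in a few lines. The expression used here is the form commonly applied in SLR-qPCR analyses, lesions per 10 kb = (1 - 2^-(Δlong - Δshort)) × 10,000 bp / long-amplicon size, where Δ is the Ct difference between RNaseHII-treated and untreated template; it is an assumption that this matches the exact formula used in the study, and the function name and Ct values below are hypothetical.

```python
def lesions_per_10kb(ct_long_treated: float, ct_long_untreated: float,
                     ct_short_treated: float, ct_short_untreated: float,
                     long_amplicon_bp: int) -> float:
    """Estimate lesion density from short- and long-amplicon Ct values.

    The short amplicon normalizes for template amount; the long amplicon
    reports on lesions that block amplification.
    """
    delta_long = ct_long_treated - ct_long_untreated
    delta_short = ct_short_treated - ct_short_untreated
    # Fraction of long-amplicon template that is still amplifiable.
    intact_fraction = 2.0 ** (-(delta_long - delta_short))
    return (1.0 - intact_fraction) * 10_000 / long_amplicon_bp


# Hypothetical Ct values: a one-cycle delay of the 1,000 bp amplicon after
# RNaseHII treatment, with no shift of the short amplicon.
density = lesions_per_10kb(
    ct_long_treated=21.0, ct_long_untreated=20.0,
    ct_short_treated=15.0, ct_short_untreated=15.0,
    long_amplicon_bp=1000,
)
print(f"{density:.1f} lesions per 10 kb")
```

A one-cycle delay corresponds to half of the long templates being nicked, which for a 1,000 bp amplicon scales to 5 lesions per 10 kb under this expression.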

Chromosomal rNMP detection
To quantify rNMP incorporation in genomic DNA, the indicated strains were inoculated in LB. The overnight cultures were diluted 100-fold in LB and grown at 37 °C to an OD600 of ~0.3-0.4. Transcription was induced using 0.1 mM IPTG for 30 minutes. 1 ml of culture was collected by centrifugation and the cell pellet was washed twice with phosphate-buffered saline (PBS, pH 7.4). The cells were used for the isolation of genomic DNA or were stored in a -20 °C freezer for further use. Genomic DNA was isolated using the Monarch genomic DNA purification kit (NEB) according to the manufacturer's instructions. 1.0 µg of genomic DNA was treated with 100 nM of RNaseHII for 30 min at 37 °C in 1X ThermoPol reaction buffer (NEB). DNA was purified using the PCR purification kit (Qiagen) and used to quantify rNMP incorporation using SLR-qPCR. 71 The reaction mixture contained 1X SYBR Green dye, 500 nM of each primer, and 1 ng of template DNA in a total volume of 20 µl per well. Two sets of primers, yielding a short (130 bp; RR fwd and rev) and a long (2,900 bp for lacZ and 700 bp for mCherry; RR fwd and SLR rev) amplicon, provide data representing the total amount of template (an internal normalization control) and undamaged DNA, respectively. The two primer pairs had comparable efficiency of amplification and yielded a single PCR product as judged by agarose gel analysis.
Data analysis was done as described 71 and above. Data analysis is based on the fact that the probability of encountering a lesion is very low in a short amplicon as opposed to a long amplicon. Furthermore, for the long amplicon, DNA synthesis is more efficient in RNaseHII-untreated samples than in treated samples. Lesion density in RNaseHII-treated samples is measured using untreated samples as the reference. Analysis depends on the measurement of CT values of the short and long amplicons. The difference in the crossing point (ΔCT) of a long amplicon vs. a short amplicon is used as a measure of the lesion frequency in relation to the size of the long amplicon. A modified version of the 2^-ΔΔCT method was used 70 (see above). The DNA damage was calculated as lesions per 10 kb of DNA for each ROI. Statistical analysis was done using GraphPad Prism (ver. 9.0). An unpaired, non-parametric, two-tailed Mann-Whitney test was used to determine statistical significance.

Chromatin immunoprecipitation (ChIP-qPCR)
Overnight cultures (~16 hours) of the indicated strains were diluted 100-fold in 50 mL LB and grown at 37 °C to an OD600 of ~0.6. Transcription was induced at an optical density of 0.3 using 0.1 mM IPTG for 30 minutes. Cells were crosslinked with 1% (v/v) formaldehyde for 20 min at room temperature. Excess formaldehyde was quenched with glycine at a final concentration of 100 mM. ChIP was performed as described previously. 72 10XHis-tagged RNaseHII was immunoprecipitated using anti-His antibody (Abcam, ab9108). RNAP was used as a positive control and immunoprecipitated using an antibody raised against the RNAP β-subunit (Biolegend, Clone-8RB13; 663905). Mouse IgG (Santa Cruz Biotechnology, sc2025) was used as a negative control. Immunoprecipitated DNA was quantified by qPCR using a Quantstudio 7 (Applied Biosystems) in a 96-well plate and expressed as a percentage of input DNA. Primers lacZ1 and lacZ2 were used and are detailed in Table S3. Statistical analysis was done using GraphPad Prism (ver. 9.0). An unpaired, non-parametric, two-tailed Mann-Whitney test was used to determine statistical significance.
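Expressing ChIP-qPCR signal as a percentage of input is usually done by first correcting the input Ct for the fraction of chromatin set aside as input. The following is a minimal sketch of that widely used calculation; the function name, the input fraction, and the Ct values are hypothetical and are not taken from the study.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Return the ChIP signal as a percentage of input DNA.

    input_fraction is the share of chromatin saved as input (for example,
    0.01 for a 1% input). The input Ct is shifted to what it would be for
    100% of the chromatin before it is compared with the IP sample.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in the interval (0, 1]")
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


# Hypothetical example: IP and a 1% input sample reach threshold at the same
# cycle, so the IP recovered about 1% of the input DNA.
signal = percent_input(ct_ip=25.0, ct_input=25.0, input_fraction=0.01)
print(f"{signal:.2f}% of input")
```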
RNA-sequencing (RNA-seq)
RNA was isolated from exponentially growing MG1655 E. coli cells as previously described. 46 Ribosomal RNA was removed using the RiboMinus Transcriptome isolation kit for bacteria (ThermoScientific). The libraries were prepared using the NEBNext Ultra II Directional RNA Library Preparation kit (NEB) according to the manufacturer's instructions. The libraries were sequenced on a NextSeq 500 instrument (Illumina) in 2x41 bp paired-end mode to a sequencing depth of 20-30 million reads per sample. The reads were aligned to the reference genome (E. coli MG1655, RefSeq NC_000913.3) using Bowtie2, 62 and strand-specific per-nucleotide coverage was calculated using the BEDTools software suite. 63

In vitro transcription assays
E. coli RNAP and NTPs were purified as described previously. 48 WT and mutant RNaseHII were purified as described above. A DNA template carrying a single rNMP at position +68 of the template strand was constructed by PCR amplification using a biotinylated forward primer and a reverse primer containing a single ribo-CMP (IDT). The resulting template (rC-DNA) was purified from an agarose gel using a gel-extraction kit (Qiagen) according to the manufacturer's protocol. A control template without rNMP was produced the same way using a non-modified DNA oligo.
For measuring DNA cleavage and EMSA, DNA was labeled with 5 µCi γ-[32P]-ATP (3,000 Ci mmol-1; Perkin Elmer) for 30 min at 37 °C using 20 units of T4 polynucleotide kinase (NEB). Radiolabeled DNA was ethanol precipitated and re-dissolved in TE buffer. The sequence of the ribo-CMP template is shown below. The non-transcribed part is italicized. Primers used for PCR amplification are underlined. Bold T indicates the position of the initial EC. Bold G marks the position (in the complementary strand) of rCMP. The complete sequence of the DNA template used in the assay is: tccagatcccgaaaatttatcaaaaagagtttgacttaaagtctaacctataggatacta cagccATCGAGAGGGC CACGGCGAACAGCCAACCTAATCGACACCGGGGTCCGGGATCTGGATCTGGATCGCGAATTCCAGGCC TGCTGGTAATCTTTGGATCCC.
To measure RNaseHII-induced cleavage, a 100 nM 32P-labeled rC-DNA template in TB100 buffer [40 mM Tris-HCl pH 8.0; 10 mM MgCl2; 100 mM NaCl] was mixed with serially diluted RNaseHII (5 nM, 50 nM, 500 nM, or 5 µM) for 1 minute at 22 °C before quenching with Stop Buffer (SB; 1X TBE buffer, 8 M urea, 20 mM EDTA, 0.025% xylene cyanol, and 0.025% bromophenol blue), and the products were resolved by 10% PAGE with 8 M urea in TBE buffer for 30 min at 50 watts (W). The gel was exposed to a phosphor screen. Products were visualized by scanning the screen with a Typhoon Imager (GE Healthcare) and analyzed using ImageQuant software (GE Healthcare).
For the EMSA experiment, a 10 nM 32P-labeled rC-DNA template in 20 µL of TB50 (TB with 50 mM NaCl) was mixed with 10 nM of WT or mutant RNaseHII for 1 minute at 22 °C. 4 µL (one fifth of the volume) of Purple Loading Dye (NEB) was added and the mixtures were immediately loaded onto an 8% native PAGE gel in TBE. The gel was run for 30 min at 50 watts, transferred onto film, covered with saran wrap, and exposed to an X-ray film, followed by scanning with a Typhoon Imager (GE Healthcare) and analysis using ImageQuant software (GE Healthcare).
For the RNaseHII roadblock experiment, the transcription reaction was initiated by mixing 10 pmol of RNAP with an equimolar amount of the biotinylated rC-DNA template in 20 µL of TB50 buffer for 5 minutes at 37 °C. 10 µM ApUpC RNA primer and 25 µM ATP and GTP were added, and incubation was continued for 5 min at 37 °C. The resulting complex (EC10) was immobilized on NeutrAvidin UltraLink affinity resin (~10 µl, Fisher Scientific) in the presence of 1.5 mg/ml heparin for 5 min at room temperature. RNA was labeled with 2 µCi α-[32P]-CTP (3,000 Ci mmol-1; Perkin Elmer) for 5 min at room temperature. The resulting EC29 was washed twice with 1 ml of TB1000 (TB with 1 M NaCl) and twice with 1 ml of TB100. From the 200 µL suspension, three 50 µL aliquots and five 10 µL aliquots were taken. One 10 µL aliquot was quenched with 10 µL SB (no chase). The 50 µL aliquots were mixed with WT or E17A RNaseHII up to 0.5 mg/mL each or with an equal volume of TB100. Samples were incubated for 1 min at 22 °C and chased at 22 °C with 0.1 mM NTPs. 10 µL aliquots were withdrawn at 10, 20, 30, 40, or 80 second intervals and quenched in fresh tubes with 10 µL SB. For the sequencing reaction, four 10 µL aliquots were chased for 10 min at 22 °C with 25 µM NTPs and one corresponding 3' deoxynucleotide triphosphate (dNTP) (3:1 ratio NTP/3'dNTP) before being quenched with 10 µL SB. The products were separated by 10% PAGE with 8 M urea in TBE. The gel was transferred to film, covered with saran wrap, and exposed to a phosphor screen. The products were visualized by scanning the screen with a Typhoon Imager (GE Healthcare) and analyzed using ImageQuant software (GE Healthcare).

RER quantitation in E. coli cells
The WT E. coli MG1655, ΔrnhA, ΔrnhB, and ΔrnhA-ΔrnhB strains were inoculated in LB. The overnight cultures were diluted 1:100 and grown in M9 medium (1X M9 salt, 0.4% glucose, 0.2% casamino acids, 2 mM MgSO4, 0.1 mM CaCl2, and 1 mM thiamine hydrochloride) at 37 °C to an OD600 of ~0.3-0.4. 1 ml of culture was collected, mixed with pre-chilled 2X NET buffer (1X NET buffer: 100 mM NaCl, 10 mM Tris [pH 8.0], 20 mM EDTA [pH 8.0]), and stored on ice. For the rifampicin (Rif) experiments, Rif (50 µg/ml or 750 µg/ml) was added, and samples were collected at different time points. The cells were centrifuged and used for the isolation of genomic DNA or stored in a -20 °C freezer for further use. Genomic DNA was isolated with the Lucigen MasterPure Complete DNA and RNA purification kit following the manufacturer's protocol. The isolated genomic DNA was dissolved in 35 µl of T10E1; 15 µl of each DNA sample was treated with reaction buffer (100 mM NaCl, 1 mM DTT, 1 mM EDTA, 25 mM Na2HPO4 [pH 7.2 at 25 °C], 1x BSA, and 10 mM MgCl2) supplemented with 100 nM RNaseHII for 30 min at 37 °C, and the remaining 15 µl was treated similarly without the enzyme. Both the undigested and digested genomic DNA samples were then electrophoresed on 0.5% alkaline agarose gels in 30 mM NaOH, 1 mM EDTA at 25 V for 18-20 hours, then stained and visualized with SYBR Gold. The intensity of each high-molecular-weight band was determined using ImageJ (V1.52a),61 and the fraction of lesion-free DNA (% RNaseHII-resistant DNA) was quantified as the ratio of the product left undigested by RNaseHII to the no-enzyme control.

Quantitation of endogenous RNaseHII in E. coli cells
An E. coli rnhB-10X-His strain was grown at 37 °C in LB with agitation (250 rpm) until OD600 0.3 ± 0.05, and rifampicin (50 or 750 µg/ml) was added. Samples were taken at the indicated time intervals. Cells were washed twice with cold 1X PBS and stored at -80 °C.
For the Western blot analysis, the cells were lysed in lysis buffer (500 mM NaCl, 50 mM HEPES pH 7.5, and 5% glycerol) containing a protease inhibitor cocktail, using lysozyme (2 mg/ml) treatment and ultrasonication. The cell-free extract was prepared by centrifugation at 20,800 × g for 10 min at 4 °C, and 25-50 µg of protein was resolved on a 3-12% SDS-PAGE gradient gel, followed by transfer to a PVDF membrane via electroblotting at 25 V for 1 hour. Western blotting was performed according to the general instructions provided, with some modifications depending on the antibodies used. PVDF membranes were blocked with blocking buffer (5% skim milk in PBS pH 7.4 with 0.05% Tween-20) for 1 hour at room temperature and incubated with anti-6XHis antibodies (Abcam, Ab9108) and anti-E. coli RNAP β monoclonal antibodies (BioLegend, 663905). All antibodies were diluted 1:2000 in PBST buffer before use. The intensities of the protein bands corresponding to the RNAP β-subunit and RNaseHII were measured using ImageJ (V1.53k).61 To normalize the intensities, the ratio of the two band intensities at each time point was divided by the same ratio in the control (mock-treated) samples, and the change in the protein level was presented as % RNaseHII intensity as a function of Rif treatment time (min).
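The normalization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis script: the function name `percent_rnasehii` is hypothetical, the per-sample ratio is assumed to be RNaseHII divided by RNAP β, and the band intensities are invented for the example.

```python
def percent_rnasehii(rnasehii, rnap_beta, control_index=0):
    """Return % RNaseHII intensity for each sample relative to the control.

    Assumption: the per-sample ratio is RNaseHII / RNAP beta band intensity,
    and the sample at control_index is the mock-treated control.
    """
    # Loading-normalized RNaseHII signal for every lane.
    ratios = [h / b for h, b in zip(rnasehii, rnap_beta)]
    control = ratios[control_index]
    # Express each lane as a percentage of the control lane.
    return [100.0 * r / control for r in ratios]


# Hypothetical ImageJ band intensities (arbitrary units).
# Index 0 is the mock-treated control; later entries are Rif time points.
rnasehii_bands = [1200.0, 900.0, 600.0]
rnap_beta_bands = [2400.0, 2400.0, 2000.0]
percentages = percent_rnasehii(rnasehii_bands, rnap_beta_bands)
```

Dividing by the RNAP β signal first corrects for lane-to-lane loading differences, so the final percentage reflects a change in RNaseHII relative to the control rather than a change in the total protein loaded.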

QUANTIFICATION AND STATISTICAL ANALYSIS
Error bars, sample size, and data fitting are indicated in the corresponding figure legends and STAR Methods section. Statistical analysis was performed using GraphPad Prism version 9.1.2 for Windows (GraphPad Software). Where indicated, statistical analysis was performed using a two-tailed unpaired nonparametric Mann-Whitney test.

Figure S1. RNaseHII mutant enzymatic test and cryo-EM analysis of TC-RER complexes, related to Figures 1, 2, 3, and 4
(A) RNaseHII-DNA interaction. Substrate DNA with or without incorporated rGMP (red) is shown on top. The same rGMP-containing DNA was used for the assembly of EC18(rGMP)-RNaseHII(E17A), except that the −10 to +1 sequence of the non-template (nt) strand was identical to the corresponding positions (CCTCTCCATG) on the template (t) strand to maintain the transcription bubble. EMSA with WT and E17A mutant RNaseHII used in structural studies is shown below. Radiolabeled DNA was incubated with RNaseHII enzymes for 1 min at 22 °C before separation by native PAGE. The fraction of unbound DNA (%) and the DNA-RNaseHII complexes are indicated.
Duplex DNA is in a ladder representation. Template strand (tDNA) is shown in orange, non-template strand (ntDNA) in red, and the rGMP +16 in magenta. RNAP β′ subunit is in blue cylinders/stubs. (E) FSC curves calculated between the refined structure and the half-map used for the refinement (work), the half-map not used for the refinement (free), and the combined map for EC18-RNaseHII (left panel) and EC18(rGMP)-RNaseHII(E17A) (right panel).
Figure S5. Validation of SLR-qPCR experiments, related to Figure 5
(A) Detection of rNMPs using qPCR. A 107-bp DNA with two rGMPs incorporated in the middle (shown in red) was synthesized (top). The rGMP-containing DNA (1.0 µg) was digested with 100 nM of RNaseHII at 37 °C for 30 min. RNaseHII-treated and untreated DNA samples were mixed in varying proportions and subjected to qPCR. The sequence of the primers is indicated (top). CT values were plotted against the corresponding ratio of untreated to treated DNA. Values are mean ± SD from three independent experiments. (B) Validation of SLR-qPCR for detecting rNMPs in genomic DNA. 1 kb synthetic substrates with 1, 2, or 3 ribonucleotides (shown in red) embedded in the center were synthesized (top). dsDNA containing varying numbers of ribonucleotide triphosphates (rNTPs) was digested with a sub-stoichiometric amount of RNaseHII, creating probabilistic single-strand breaks (see STAR Methods). The lesion rates detected by SLR-qPCR were plotted against expected DNA lesions in three independent experiments. The plot demonstrates a linear correlation between expected and detected lesion density: as the number of lesions in the substrate increases, the number of detected lesions also increases. The entire range of lesion density obtained in Figures 5C and 5D is within the validated SLR-qPCR sensitivity range (0.25-3.5 rNMPs/10 kb). (C) RNaseHI does not impact SLR-qPCR. Genomic DNA (1 µg) was treated with 100 nM of RNaseHI for 1 h. DNA was purified and used for qPCR with long-amplicon (2,900 bp) primers. RNaseHI-treated and untreated samples showed comparable CT values, indicating that, in contrast to RNaseHII, RNaseHI does not create any amplification-interfering breaks. Values are mean ± SD from three independent experiments. (D) Transcription is strongly inhibited within the insulator in the absence of a dedicated promoter. A standard curve was created using PCR-amplified lacZ DNA.
Total RNA was isolated from control and insulator (with or without promoter) strains. RNA was reverse transcribed followed by qPCR. Number of copies of lacZ was determined by interpolation. Values are mean ± SD from three independent experiments.
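Copy-number determination from a standard curve, as mentioned in (D), can be illustrated with a short Python sketch. The legend states only that a standard curve was made and copies were obtained by interpolation; the log-linear model (CT as a linear function of log10 copy number), the function names, and all numerical values below are assumptions for illustration.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to interpolate a copy number from a CT value."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series of the standard DNA and its measured CT values.
standard_copies = [1e3, 1e4, 1e5, 1e6]
standard_ct = [30.0, 26.7, 23.4, 20.1]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Hypothetical sample CT; it lies inside the range covered by the standards.
sample_copies = copies_from_ct(25.05, slope, intercept)
```

Interpolation is only meaningful for CT values that fall within the range spanned by the standards; values outside that range would be extrapolated rather than interpolated.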