Structure of a bacterial RNA polymerase holoenzyme open promoter complex

Initiation of transcription is a primary means for controlling gene expression. In bacteria, the RNA polymerase (RNAP) holoenzyme binds and unwinds promoter DNA, forming the transcription bubble of the open promoter complex (RPo). We have determined crystal structures, refined to 4.14 Å-resolution, of RPo containing Thermus aquaticus RNAP holoenzyme and promoter DNA that includes the full transcription bubble. The structures, combined with biochemical analyses, reveal key features supporting the formation and maintenance of the double-strand/single-strand DNA junction at the upstream edge of the −10 element where bubble formation initiates. The results also reveal RNAP interactions with duplex DNA just upstream of the −10 element and potential protein/DNA interactions that direct the DNA template strand into the RNAP active site. Addition of an RNA primer to yield a 4 base-pair post-translocated RNA:DNA hybrid mimics an initially transcribing complex at the point where steric clash initiates abortive initiation and σA dissociation. DOI: http://dx.doi.org/10.7554/eLife.08504.001


Introduction
Transcription initiation is a major control point of gene expression. The initiation process is best understood in the bacterial system (Saecker et al., 2011) where the conserved ∼400 kD catalytic core of the RNA polymerase (RNAP or E, subunit composition α 2 ββ′ω) combines with the promoter-specificity factor σ A to form the holoenzyme (Eσ A ), which locates promoter DNA and unwinds 12-14 base pairs (bps) of the DNA duplex to yield the transcription-competent open promoter complex (RPo). In the presence of nucleotide substrates, RNA synthesis begins with the formation of an initial transcription complex (RP ITC ). Before the transition to a stable elongation complex, steric clash between the elongating RNA transcript and elements of σ sets up abortive initiation, where the RNAP repeatedly generates and releases short transcripts without dissociating from the promoter (McClure et al., 1978; Murakami et al., 2002a; Goldman et al., 2009). Eventually, the transcript reaches a length of around 17 nt, where σ dissociation and the transition to the stable elongation complex begin (Nickels et al., 2005).
The architecture of Eσ A recognition of the key −35 and −10 promoter elements was delineated by the structure of Thermus aquaticus (Taq) Eσ A bound to an upstream fork (us-fork) promoter fragment, but the low resolution (6.5Å) prevented the visualization of molecular details (Murakami et al., 2002b). Although high resolution crystal structures defined key, sequence-specific interactions of σ with the −35 element (Campbell et al., 2002), the melted −10 element (Feklistov and Darst, 2011), as well as with downstream promoter DNA in the context of holoenzyme, these structures did not contain the full transcription bubble with the upstream double-strand/single-strand (ds/ss) DNA junction at the upstream edge of the −10 element where transcription bubble formation initiates.

eLife digest
Inside cells, molecules of double-stranded DNA encode the instructions needed to make proteins. To make a protein, the two strands of DNA that make up a gene are separated and one strand acts as a template to make molecules of messenger ribonucleic acid (or mRNA for short). This process is called transcription. The mRNA is then used as a template to assemble the protein. An enzyme called RNA polymerase carries out transcription and is found in all cells ranging from bacteria to humans and other animals.
Bacteria have the simplest form of RNA polymerase and provide an excellent system to study how it controls transcription. It is made up of several proteins that work together to make RNA using DNA as a template. However, it requires the help of another protein called sigma factor to direct it to regions of DNA called promoters, which are just before the start of the gene. When RNA polymerase and the sigma factor interact the resulting group of proteins is known as the RNA polymerase 'holoenzyme'.
Transcription takes place in several stages. To start with, the RNA polymerase holoenzyme locates and binds to promoter DNA. Next, it separates the two strands of DNA and exposes a portion of the template strand. At this point, the DNA and the holoenzyme are said to be in an 'open promoter complex' and the section of promoter DNA that is within it is known as a 'transcription bubble'. However, it is not clear how RNA polymerase holoenzyme interacts with DNA in the open promoter complex. Bae, Feklistov et al. have now used X-ray crystallography to reveal the three-dimensional structure of the open promoter complex with an entire transcription bubble from a bacterium called Thermus aquaticus. The experiments show that there are several important interactions between RNA polymerase holoenzyme and promoter DNA. In particular, the sigma factor inserts into a region of the DNA at the start of the transcription bubble. This rearranges the DNA in a manner that allows the DNA to be exposed and contact the main part of the RNA polymerase. If the holoenzyme fails to contact the DNA in this way, the holoenzyme does not bind properly to the promoter and transcription does not start.
These findings build on previous work to provide a detailed structural framework for understanding how the RNA polymerase holoenzyme and DNA interact to form the open promoter complex. Another study by Bae et al. (which involved some of the same researchers as this study) reveals how another protein called CarD also binds to DNA at the start of the transcription bubble to stabilize the open promoter complex.

Figure 1. (A) Oligonucleotides used for RPo crystallization. The numbers above denote the DNA position with respect to the transcription start site (+1). The DNA sequence is derived from the full con promoter (Gaal et al., 2001). The −35 and −10 (Pribnow box) elements are shaded yellow, the extended −10 (Keilty and Rosenberg, 1987) and discriminator (Feklistov et al., 2006; Haugen et al., 2006) elements purple. The nt-strand DNA (top strand) is colored dark grey; t-strand DNA (bottom strand), light grey; RNA transcript, red. (B) Overall structure of RPo. The nucleic acids are shown as CPK spheres and color-coded as above. The Taq EΔ1.1σ A is shown as a molecular surface (αI, αII, ω, grey; β, light cyan; β′, light pink; Δ1.1σ A , light orange), transparent to reveal the RNAP active site Mg 2+ (yellow sphere) and the nucleic acids held inside the RNAP active site channel. (C) Electron density and model for RPo nucleic acids. Blue mesh, 2F o − F c maps for nucleic acids (contoured at 0.7σ). DOI: 10.7554/eLife.08504.003

data were collected and analyzed (Table 1). The structure was determined by molecular replacement, which identified two complexes per asymmetric unit, and refined using data extending to 4Å-resolution (Table 1, Figure 1-figure supplement 2). The solvent content of the crystals was 82% and examination of the crystal packing revealed space for the expected position of additional promoter DNA.
We therefore formed a complete RPo by combining Taq EΔ1.1σ A with a duplex promoter DNA scaffold (−36 to +12 with respect to the transcription start site at +1) but with a non-complementary transcription bubble generated by altering the sequence of the t-strand DNA from −11 to +2. RPo crystallized in the same habit and diffraction data were analyzed to 4.7Å-resolution (Table 1). In the resulting electron density maps, most of the ss t-strand DNA was poorly ordered and unable to be modeled. To stabilize the t-strand DNA, we added an RNA primer complementary to the ss t-strand DNA from +1 to −3, yielding a 4 bp RNA:DNA hybrid ( Figure 1A). We crystallized the resulting complex (437 kD, which we call RPo hereafter), collected and analyzed diffraction data, and refined the structure using reflections to a minimum Bragg spacing of 4.14Å (Table 1, Figure 1-figure supplement 2). In RPo, good electron density for all of the nucleic acids included in the scaffold was observed ( Figure 1C). The protein/DNA contacts seen in the us-fork complex are essentially identical to the relevant subset of contacts in RPo.
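Promoter positions are numbered relative to the transcription start site (+1), and there is no position 0, so span lengths are easy to miscount. The minimal sketch below counts the positions in the spans quoted above. The helper name `span_length` is hypothetical and is not part of the original study.

```python
def span_length(start: int, end: int) -> int:
    """Count positions from start to end, inclusive, in promoter
    coordinates where +1 is the start site and position 0 does not exist."""
    if start == 0 or end == 0:
        raise ValueError("promoter numbering has no position 0")
    lo, hi = sorted((start, end))
    n = hi - lo + 1
    # A span that crosses the start site skips the non-existent position 0.
    if lo < 0 < hi:
        n -= 1
    return n


print(span_length(-36, 12))  # duplex promoter scaffold -> 48
print(span_length(-11, 2))   # non-complementary bubble -> 13
print(span_length(-3, 1))    # RNA primer / RNA:DNA hybrid -> 4
```

The last value agrees with the 4 bp RNA:DNA hybrid described in the text, and the 13-nt engineered bubble falls within the 12-14 bp range given in the Introduction.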
Despite the relatively low resolution of our analysis (Table 1), important protein side chain/nucleic acid interactions were resolved in electron density maps. Protein side chain/nucleic acid interactions specifically discussed in this paper are supported by unbiased simulated annealing omit maps shown for each case (see below). These interactions occur via conserved (often universally conserved) residues of the RNAP β′ or σ A subunits. The level of conservation of relevant β′ residues, determined from an alignment of 834 bacterial RNAP β′ subunit sequences (Lane and Darst, 2010), is tabulated in Table 2. An alignment of 1002 diverse σ A sequences was constructed (Supplementary file 1; a sub-alignment of selected diverse sequences is shown in Figure 1-figure supplement 3) and the level of conservation of relevant σ A residues is tabulated in Table 3.

RNAP interacts with ds DNA just upstream of the −10 element and specifically recognizes the extended −10 element
Starting from the upstream end of the promoter DNA, the −35 element interacts exclusively with σ A 4 in a manner consistent with the high-resolution (2.4Å) structure of the isolated σ A 4 /−35 element complex (Campbell et al., 2002). The duplex DNA just upstream of the −10 element (−17 to −13) interacts with β′, σ A 3 , and σ A 2 (Figure 1B). Previously, conserved residues of the β′-zipper (β′Y34 and, to a lesser extent, β′R35; Table 2) that contribute to RPo stability by interacting with duplex spacer DNA were identified (Yuzenkova et al., 2011). In the RPo structure, both β′Y34 and β′R35 are positioned to form polar interactions with the −17 nt-strand DNA (−17(nt)) phosphate (Figure 2A,C).
The primary role of σ 2 in −10 element recognition was first uncovered when substitutions of invariant Q260 (σ 70 Q437; Figure 1-figure supplement 3; Table 3) were shown to affect sequence-specific recognition of the −12 bp (Kenney et al., 1989; Waldburger et al., 1990). Modeling suggested that Q260 may H-bond with the major-groove edge of A −12 (t) (Feklistov and Darst, 2011). However, in our structures, the amide group of the Q260 side chain points away from the major-groove edge of A −12 (t) and cannot form H-bonds (Figure 2B,D). We suggest that Q260 may form base-specific H-bonds with the −12 bp in an intermediate during the pathway to RPo formation (Saecker et al., 2011), whereas our structures represent the final, transcription-ready RPo, explaining the genetic data.

(Lane and Darst, 2010). †Blosum62 score calculated by PFAAT (Johnson et al., 2003). DOI: 10.7554/eLife.08504.008

Structural role of σ A aromatic residues in forming and stabilizing the upstream ds/ss junction of the transcription bubble
Flipping of the A −11 (nt) base from the duplex DNA into its recognition pocket in σ A 2 is thought to be the key event in the initiation of promoter melting (Chen and Helmann, 1997; Lim et al., 2001; Heyduk et al., 2006; Feklistov and Darst, 2011). Strand opening propagates downstream to +1, but in the upstream direction, the base-paired T −12 (nt) interacts with an invariant W-dyad of σ A 2 (W256/W257, σ 70 W433/W434; Figure 1-figure supplement 3; Table 3) to maintain the ds/ss (−12/−11) junction at the upstream edge of the transcription bubble (Figure 3A,C,D, Figure 3-figure supplement 1). The stabilization of the upstream ds/ss junction involves a previously unseen rearrangement of the W256 side chain.
In all previous high resolution structures of σ A /σ 70 , determined in many different contexts but never with an upstream ds/ss junction (Malhotra et al., 1996; Campbell et al., 2002; Vassylyev et al., 2002; Feklistov and Darst, 2011; Zhang et al., 2012), the W256 side chain makes an 'edge-on' interaction with W257 (Figure 3B). In the presence of the upstream ds/ss junction, the W256 side chain rotates away from W257, filling the space vacated by the flipped-out A −11 (nt) and forming a π-stack with the face of T −12 (nt) otherwise exposed by the absence of A −11 (nt). The W-dyad forms a 'chair'-like structure, with W256 serving as the back of the chair, and W257 as the seat, buttressing T −12 (nt) from the major groove side (Figure 3A,C,D). The methyl group of the T −12 (nt) base approaches the face of the W257 side chain at a nearly orthogonal angle, possibly forming a favorable methyl π interaction (Umezawa and Nishio, 1998; Brandl et al., 2001) (Figure 3C).

Figure 3 (caption, partial): Figure 1B). The boxed area is magnified on the right. (Right) Magnified view showing the upstream ds/ss junction of the transcription bubble in RPo (the RNAP β subunit, which obscures the view, has been removed). RNAP is shown as a molecular surface, except side chains of key σ A residues (R217, R220, W256, R288, R291) are shown (orange). The orthogonal directions of the ss nt- and t-strand DNA following the upstream ds/ss junction are denoted by black arrows. The dashed, curved line denotes the potential path of the t-strand −11 base from its position in the duplex DNA (base-paired to A −11 (nt)) to its position in the structure. (B) Structure of Taq σ A 2 bound to the ss, nt-strand −10 element
Examination of the structure near the upstream ds/ss junction revealed the solvent-exposed aromatic face of a conserved σ A 2 Tyr side chain, Y217 (σ 70 Y394; Figure 3A,E; Figure 1-figure supplement 3; Table 3), that does not appear to play an important role in the σ structure per se, but lies along the path the −11(t) base could follow from its position in duplex DNA (base-paired to A −11 (nt)) to its position in the structure when orphaned by the flipped out A −11 (nt) (dashed line, Figure 3A). The −11(t) nucleotide is almost always a T, being complementary to A −11 (nt), the most highly conserved position of the −10 element (Shultzaberger et al., 2007). In the us-fork, the −11(t) nucleotide is absent (Figure 1-figure supplement 1), whereas in RPo, the −11(t) nucleotide is an (atypical) A, being part of the engineered non-complementary transcription bubble ( Figure 1A). In RPo, the A −11 (t) base is not stacked on Y217 but instead is about 12Å away, flipped up alongside the σ A 3 − 3:0 α-helix, sitting between R288 and R291 ( Figure 3A; Figure 1-figure supplement 3; Table 3).
We reasoned that we may not observe the orphaned −11(t) base stacked on Y217 for two reasons that are not mutually exclusive. First, Y217 may play an important role in stabilizing the melted state of the −11 bp during an intermediate of the normal promoter melting pathway (Saecker et al., 2011). Second, structural modeling suggested that the A −11 (t) purine base present in the synthetic promoter construct ( Figure 1A) may be too bulky to stack on Y217, which sits at the bottom of a narrow trough in the σ A 2 structure ( Figure 3A). To investigate the role of Y217 further, we crystallized Taq EΔ1.1σ A with an us-fork template containing a complementary A:T bp at the −11 position (us-fork (−11 bp); Figure 4A). To avoid model bias, we determined the structure by molecular replacement using the Taq EΔ1.1σ A /us-fork (−12 bp) structure (lacking the −11(t) base; Figure 1-figure supplement 1). The structure was modeled and refined (4.6Å-resolution, Table 1, Figure 4-figure supplement 1), and the unbiased density maps revealed clear difference density for the T −11 (t) base stacked on Y217 ( Figure 4B).

Functional role of σ A aromatic residues in forming and stabilizing the upstream ds/ss junction of the transcription bubble
A functional role for W256 in promoter melting was first proposed by Helmann and Chamberlin (1988). Ala substitution of the corresponding Trp in Bacillus subtilis σ A gave rise to severe promoter melting defects in vitro and corresponding cold-sensitive phenotypes in vivo (Juang and Helmann, 1994; Panaghie et al., 2000). The functional role of Y217 has not, to our knowledge, been previously examined.
We investigated the effects of individual Ala substitutions in Eco σ 70 W433 and Y394 (Taq W256 and Y217) on the kinetics of RPo formation (Roe et al., 1984; Buc and McClure, 1985) using a recently reported fluorescence assay (Ko and Heyduk, 2014). The assay relies on a Cy3 fluorophore attached to the promoter nt-strand at position +2; fluorescence yield in this context is sensitive to the local environment and increases more than twofold upon RPo formation. Unlike previously used non-equilibrium methods (EMSA, filter binding), this assay allows detection of promoter melting at equilibrium and does not depend on the use of competitors, such as heparin. For these assays, we used one of the most thoroughly characterized promoters, λ P R (Saecker et al., 2002, 2011). Control assays showed that under saturating conditions, both σ 70 substitutions (W433A and Y394A) associated with core RNAP and supported abortive transcription as well as wild-type σ 70 (data not shown), confirming their structural integrity.

Figure 3 (caption, continued): (Feklistov and Darst, 2011) showing the disposition of the universally conserved σ A W-dyad (Taq σ A W256/W257). Shown is the ss DNA from −14 to −7 (−10 element colored yellow), the σ A 2.3 helix (light orange) and the W-dyad (orange side chains with transparent CPK atoms). W256 makes an edge-on interaction with the face of W257, as observed in all other σ 70 /σ A structures in many different contexts (Malhotra et al., 1996; Campbell et al., 2002; Vassylyev et al., 2002; Murakami et al., 2002a, 2002b; Feklistov and Darst, 2011; Zhang et al., 2012). (C) Disposition of the W-dyad in RPo (containing upstream ds/ss junction, shown schematically above). Only the nt-strand DNA from −14 to −7, the σ A 2.3 helix, and the W-dyad are shown (as in B). (D) Same view as (C). Superimposed is the simulated annealing omit map (grey mesh, 2F o − F c , contoured at 1σ), calculated from a model where the following segments of σ A were completely removed (216-221, 255-258, and 287-292) and shown only within 2Å of omitted atoms. (E) Similar view as (A) (right). Superimposed is the simulated annealing omit map (grey mesh, 2F o − F c , contoured at 1σ), calculated from a model where the following segments of σ A were removed (216-221, 255-258, and 287-292) and shown only within 2Å of omitted atoms. Clear Fourier density for σ A Y217 and R288 is shown. DOI: 10.7554/eLife.08504.011 A figure supplement is available for Figure 3.

Figure 4. The σ A Y217 may stack on the T −11 (t) base orphaned by the flipped out A −11 (nt) base. (A) Synthetic oligonucleotides used for us-fork (−11 bp) crystallization. The numbers above the sequence denote the DNA position with respect to the transcription start site (+1). The DNA sequence is derived from the full con promoter (Gaal et al., 2001). The −35 and −10 (Pribnow box) elements are shaded yellow, the extended −10 element (Keilty and Rosenberg, 1987) purple. The nt-strand DNA (top strand) is colored dark grey; the t-strand DNA (bottom strand), light grey; the RNA transcript, red. (B) The T −11 (t) base orphaned by the flipped out A −11 (nt) stacks on σ A Y217 in the us-fork (−11 bp) structure. The 4.6 Å-resolution electron density map (contoured at 0.7σ) is shown (grey mesh). Also superimposed is the simulated annealing omit map (green mesh, F o − F c , contoured at 3σ), calculated from a model where σ A Y217 was mutated to Ala and the T −11 (t) nucleotide was deleted. DOI: 10.7554/eLife.08504.013 A figure supplement is available for Figure 4.
The multistep process of promoter opening can be described by a simplified kinetic scheme ( Figure 5A) (McClure, 1980) where an initial promoter complex (RP i ) existing in rapid equilibrium with free promoter and RNAP (binding step described by a dissociation constant K d ) is converted in a rate-limiting step to RPo (isomerization described by the rate constant k 2 ). Fluorescence traces of RPo formation under pseudo first-order conditions (Roe et al., 1984) recorded at increasing RNAP concentrations were fit to single-exponentials and yielded observed rate constants (k obs ) for RPo formation ( Figure 5B). Nonlinear fits to the resulting hyperbolic curves ( Figure 5C) allowed the determination of K d and k 2 (Saecker et al., 2002) ( Figure 5D).
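For the two-step scheme described above (rapid-equilibrium binding followed by a rate-limiting, effectively irreversible isomerization), the observed rate is expected to depend hyperbolically on RNAP concentration, k obs = k 2 [RNAP]/(K d + [RNAP]). The sketch below illustrates how K d and k 2 can be recovered from such a curve. It is not the authors' fitting procedure: it swaps the nonlinear fit for a simpler double-reciprocal linear regression, and the function names, units and numerical values are hypothetical.

```python
def k_obs(rnap_nM: float, kd_nM: float, k2_per_s: float) -> float:
    """Observed rate constant for the two-step scheme (hyperbolic in [RNAP])."""
    return k2_per_s * rnap_nM / (kd_nM + rnap_nM)


def fit_hyperbola(concs, rates):
    """Estimate (Kd, k2) by least squares on the double-reciprocal line
    1/k_obs = (Kd/k2) * (1/[RNAP]) + 1/k2."""
    xs = [1.0 / c for c in concs]
    ys = [1.0 / r for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals Kd / k2
    intercept = mean_y - slope * mean_x  # equals 1 / k2
    k2 = 1.0 / intercept
    kd = slope * k2
    return kd, k2


# Hypothetical, noise-free example values (not data from this study).
concs = [10.0, 25.0, 50.0, 100.0, 200.0, 400.0]
rates = [k_obs(c, kd_nM=80.0, k2_per_s=0.05) for c in concs]
kd_fit, k2_fit = fit_hyperbola(concs, rates)
print(kd_fit, k2_fit)  # recovers the input Kd and k2 up to rounding error
```

With noisy measurements the double-reciprocal transform over-weights the low-concentration points, so a direct nonlinear fit of the hyperbola, as used in the study, is generally preferable.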
Neither σ 70 W433A nor Y394A had a significant effect on K d for RP i formation, but the substitutions decreased the rate of isomerization by about twofold to threefold (at 37˚C, Figure 5D). At suboptimal temperature (25˚C) the effect of the W433A substitution was more pronounced, resulting in an ∼sevenfold reduction in isomerization rate. Neither σ 70 W433A nor Y394A significantly altered the affinity of holoenzyme binding to ss oligos comprising the nt-strand of the −10 element (Tomsic, 2001) ( Figure 5E).
W256 appears to make the primary contribution to maintaining the ds/ss junction at the upstream edge of the transcription bubble (Figure 3A), suggesting that this residue may play an important role in preventing transcription bubble collapse and dissociation of RPo. To probe the roles of both σ 70 W433 and Y394 in maintaining RPo stability, we rapidly destabilized preformed RPo with 1.1 M NaCl (Gries et al., 2010) and followed the loss of RPo by monitoring the decay of fluorescence intensity with time (Figure 5-figure supplement 1). The dissociation curves are complex, reflecting the detection of a short lived intermediate (expected under these conditions) (Gries et al., 2010) by this assay. Although a full analysis is beyond the scope of this study, the overall apparent rate of RPo decay (k app off) was determined from single-exponential fits of the decay curves. The σ 70 W433A and the Y394A variants both gave a ∼fourfold higher rate of RPo dissociation under high salt conditions than did wild-type σ 70 (Figure 5D, Figure 5-figure supplement 1).

σ A directs the ss t-strand to the RNAP active site
Downstream from the point of melting, the two DNA strands are directed on orthogonal paths (black arrows, Figure 3A). The nt-strand (−11 to −4) drapes across the surface of σ A 2 , directed by phosphate backbone interactions and notable base-specific recognition of A −11 (nt) and T −7 (nt) of the −10 element, and G −6 (nt) of the discriminator (Feklistov and Darst, 2011; Zhang et al., 2012). Further downstream, interactions of the nt-strand from −3 to +2 occur exclusively with the RNAP β subunit, including base-specific recognition of G +2 (nt).
At the point of melting, a ∼90˚ turn of the t-strand backbone (between −12 and −11) may be effected by electrostatic interactions between conserved basic residues of σ A 2 (R220; Figure 1-figure supplement 3; Table 3) and σ A 3 (R288, R291) and four t-strand backbone phosphates in a row (−13, −12, −11, −10) encompassing the turn (Figure 3A). Strong simulated annealing omit 2F o − F c density is associated with σ A 3 R288, confirming its role in interacting with the −13(t) phosphate (Figure 3E). The σ A 2 R220 and σ A 3 R291 give weaker difference density, so their role in interacting with the −12(t) and −11(t) phosphate groups is tentative. The turn directs the t-strand away from the nt-strand and towards the RNAP active site (Figure 3A). The ss t-strand DNA from −9 to −5 is guided towards the RNAP active site through a tunnel formed between the RNAP β1-lobe (called the protrusion in eukaryotic RNAP II; Cramer et al., 2001) and the σ 3.2 -loop (also referred to as the σ-finger), an extended linker that loops into and out of the RNAP active-site channel (Murakami et al., 2002a; Zhang et al., 2012), connecting the σ 3 and σ 4 domains (Figure 6).

The σ 3.2 -loop sterically blocks extension of the 4 nt RNA transcript
Previous structural analyses predicted that the σ 3.2 -loop would physically occupy the path of the elongating RNA and must be displaced for full RNA extension to occur (Vassylyev et al., 2002; Murakami et al., 2002a). Indeed, the upstream edge of the post-translocated 4-nt transcript fits snugly between the RNAP active site and the distal tip of the σ 3.2 -loop, which contacts the upstream RNA:DNA bp at −3, and the t-strand bases at −4 and −5 (Figure 6). Extension of the RNA transcript and translocation to form a 5 bp post-translocated RNA:DNA hybrid cannot occur without displacement of the σ 3.2 -loop (Basu et al., 2014), marking the point in transcription initiation (translocation of the 4-5 bp RNA:DNA hybrid from pre- to post-translocated) where steric clash between the elongating RNA transcript and the σ 3.2 -loop begins effecting abortive initiation and σ release (Murakami et al., 2002a; Nickels et al., 2005; Kulbachinskiy and Mustaev, 2006).

Figure 5 (caption, partial): (Ko and Heyduk, 2014) for Eco holoenzymes with σ 70 (wt) as well as σ 70 carrying substitutions W433A or Y394A. Error bars denote standard errors of the mean for ≥three independent measurements. (D) Summary of effects of σ 70 W433A and Y394A substitutions on thermodynamic and kinetic parameters of RPo formation. The data were normalized to the % observed with wild-type Eσ 70 . (E) Equilibrium binding of ss nt-strand oligos of λ P R promoter −10 element detected in the fluorescent RNAP beacon assay (Feklistov and Darst, 2011; Mekler et al., 2011) to Eco holoenzymes with σ 70 , as well as σ 70 carrying substitutions W433A or Y394A. DOI: 10.7554/eLife.08504.015 A figure supplement is available for Figure 5.

Discussion
Our structures reveal that the overall architecture of the Taq RPo (Figure 1) closely resembles that of the Eco RPo (Zuo and Steitz, 2015), but the improved resolution of our analysis allows a more detailed description of protein/DNA interactions (Figure 2), particularly interactions involved in forming and stabilizing the ds/ss junction at the upstream edge of the transcription bubble (Figure 3). Previous models of RPo were pieced together from structures of σ domains or RNAP holoenzyme complexed with promoter fragments (Campbell et al., 2002; Murakami et al., 2002b; Feklistov and Darst, 2011; Zhang et al., 2012). The Taq RPo structure upstream of the −10 element matches the overall architecture of the low-resolution (6.5Å) Taq RNAP holoenzyme/upstream-fork promoter complex (Murakami et al., 2002b), except that, unlike the upstream-fork structure (where the RNAP holoenzyme/−35 element interactions were distorted by crystal packing interactions), the Taq RPo recapitulates the σ 4 /−35 element interactions seen in the high-resolution (2.4Å) crystal structure of the Taq σ A 4 /−35 element DNA complex (Campbell et al., 2002). The Taq RPo structure also recapitulates the σ 2 /−10 element interactions seen in high-resolution (2.1Å) structures of Taq σ A 2 complexes with ss −10 element DNA (Feklistov and Darst, 2011). The interactions of the RNAP holoenzyme with the ss discriminator element (ss nt-strand DNA from −6 to −3; Figure 1A), the ss nt-strand DNA from −2 to +2 (including base-specific interactions of G +2 (nt) with a pocket in the RNAP β subunit), and the downstream edge of the transcription bubble and downstream duplex DNA are very similar to those observed in a 2.9Å-resolution structure of Tth RNAP holoenzyme complexed with a downstream-fork promoter template.

Role of conserved σ A aromatic residues in promoter opening
Our results clarify the role of the universally conserved W-dyad of housekeeping (also called primary or group 1) σ's (Gruber and Bryant, 1997) in the promoter opening pathway, particularly for Taq σ A W256 (Eco σ 70 W433), which rotates into the DNA duplex and serves as a steric mimic of the flipped-out A −11 (nt) base by a stacking mechanism (Figure 3A,C,D). The bacterial RNAP σ subunit can be added to the list of proteins using a wedge residue (usually an aromatic side chain) to invade the DNA duplex to stabilize the extrahelical conformation of a flipped-out base (Lau et al., 1998; Davies et al., 2000; Yang et al., 2009; Yi et al., 2012). We also identified another conserved σ A aromatic residue (Taq σ A Y217) that plays an important role in the promoter opening pathway, possibly by stacking with T −11 (t) orphaned when the conserved A −11 (nt) base flips out (Figure 4B).
The kinetic studies reveal that both aromatic residues (W256 and Y217) act in a context-dependent manner. They are not important for the initial promoter binding step (Figure 5D) nor for binding the ss −10 element DNA (Figure 5E); instead, W256 and Y217 act to increase the rate of the isomerization (promoter opening) step itself (Figure 5E,D), possibly by making contacts unique to the transition state that lower the energy barrier between RPi and RPo in the two-step kinetic scheme (Figure 5A). Since the initial promoter binding step (formation of RPi, Figure 5A) is not affected by the σ 70 W433A substitution (Figure 5D), we surmise that RPi does not feature the stacking interaction formed by W433 on the T −12 (nt) base (exposed by the flipping-out of A −11 (nt)). Since the −11 bp is thought to be the first bp disrupted in the promoter opening pathway (Chen and Helmann, 1997; Lim et al., 2001; Heyduk et al., 2006; Feklistov and Darst, 2011), this implies that RPi is a closed complex (RPc) comprising duplex promoter DNA.
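As a rough guide to the size of the barrier effect discussed above, a fold-change in isomerization rate can be converted into an apparent change in activation free energy. The sketch below assumes simple transition-state behavior, ΔΔG‡ = RT ln(k wt /k mut ), which is an interpretive assumption and not an analysis reported in the study; the function name is hypothetical, and the fold-changes are the approximate values quoted in the Results.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def barrier_change_kcal(fold_slower: float, temp_c: float) -> float:
    """Apparent increase in activation free energy (kcal/mol) implied by a
    rate reduced fold_slower-fold, assuming ddG = R * T * ln(k_wt / k_mut)."""
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(fold_slower)


# Approximate fold-changes quoted in the text: two- to threefold at 37 C,
# about sevenfold for W433A at 25 C.
for fold, temp_c in [(2.0, 37.0), (3.0, 37.0), (7.0, 25.0)]:
    print(fold, temp_c, round(barrier_change_kcal(fold, temp_c), 2))
```

Under this assumption, the two- to threefold effects correspond to roughly 0.4 to 0.7 kcal/mol, and the sevenfold effect to a little over 1 kcal/mol. These are modest contributions, consistent with individual side-chain contacts rather than a wholesale change in mechanism.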
The effects of σ 70 W433A that we observed are consistent with previous observations using nonequilibrium methods (Fenton et al., 2000;Tomsic, 2001;Fenton and Gralla, 2003;Schroeder et al., 2009). These observations support the critical role of σ A W256 and Y217 (σ 70 W433 and Y394) in formation and stability of RPo.
In addition to the housekeeping σ (σ A in Taq or σ 70 in Eco) that controls transcription of the majority of cellular genes (with consensus −35 and −10 elements of TTGACA and TATAAT, respectively; Shultzaberger et al., 2007), bacteria rely on alternative σ's to direct RNAP to highly specialized promoters (with alternative −35 and −10 elements) controlling operons in response to environmental and physiological cues (Feklistov et al., 2014). Although the W-dyad is universally conserved in housekeeping σ's (Gruber and Bryant, 1997), it is not a conserved feature of alternative σ's (Lonetto et al., 1992; Helmann, 2002; Campbell et al., 2003); bulky hydrophobic residues are favored at the corresponding positions of alternative σ's (but rarely W). The W-dyad is likely to be the optimal configuration for supporting the upstream ds/ss junction of the transcription bubble, giving the housekeeping σ's a powerful DNA-melting capacity, allowing them to function on thousands of highly divergent, nonoptimal promoter sequences. Alternative residues supporting the upstream ds/ss junction of the transcription bubble may weaken the ability of RNAP with alternative σ's to form RPo, fine-tuning their specificity (Feklistov et al., 2014). The residue corresponding to Taq σ A Y217 (σ 70 Y394) appears to be conserved as either Y or F among σ 70 -family alternative σ's, suggesting that this residue plays a key role common to all σ's.

Transcript elongation, scrunching, and σ-release
Zuo and Steitz (2015) soaked crystals of Eco transcription initiation complexes (containing a full transcription bubble) with NTP substrates to generate short transcripts (with 5′-triphosphate) in crystallo. A pre-translocated 4-nt transcript did not reach the σ 3.2 -loop, whereas a pre-translocated 5-nt transcript appeared to just reach and interact with the σ 3.2 -loop. Attempts to generate longer transcripts resulted in severe degradation of the crystals, suggesting significant conformational changes of the RNAP that were incompatible with the crystal packing either due to transcript/σ 3.2 -loop interactions, 'scrunching' of the t-strand DNA (Kapanidis et al., 2006; Revyakin et al., 2006; Roberts, 2006), or both. The upstream edge of our post-translocated 4-nt transcript is equivalent to the pre-translocated 5-nt transcript observed by Zuo and Steitz (2015): in both cases the upstream edge of the RNA just contacts the σ 3.2 -loop and the conformation of the σ 3.2 -loop is very similar, indicating that, at least in this case, the presence or absence of the 5′-triphosphate does not alter the gross interaction of the elongating transcript with the σ 3.2 -loop. In vitro, RNAP initiates efficiently with dinucleotide primers lacking a 5′-triphosphate without obvious defects in σ release or promoter escape. Basu et al. (2014) were able to generate a 6-nt pre-translocated transcript (containing a 5′-triphosphate) in crystals of Tth transcription initiation complexes with a downstream-fork promoter template that lacks duplex DNA upstream of the −10 element and is therefore unable to 'scrunch' the t-strand DNA. In this case, the 5′-nt of the transcript displaces the σ 3.2 -loop, which is not modeled and presumably disordered. Other conformational changes of the RNAP or changes in σ/RNAP interactions were not observed.

Relationship to RPo formation in eukaryotes
In vitro, the rate-limiting step of bacterial RNAP transcription is often the isomerization step to open the promoter and form RPo (McClure, 1980, 1985; Amouyal and Buc, 1987). The kinetics of the many steps of the transcription cycle in vivo have not been characterized, but many transcription units are clearly controlled at the initiation step (Paul et al., 2004). In bacteria, recognition of the promoter −10 element and DNA opening are directly coupled (Feklistov and Darst, 2011; Liu et al., 2011), with the Trp stacking interaction (Figure 3A,C) playing a key role.
In contrast to tight coupling between promoter recognition and transcription bubble formation at most bacterial promoters, in eukaryotes promoter recognition, RNAP II recruitment, and promoter opening appear to be uncoupled. The preinitiation complex (PIC) is the molecular assembly through which eukaryotic RNAP II locates and utilizes a promoter, which may be pre-recognized by basal transcription factors. RPo formation requires ATP hydrolysis by the Ssl2 (XPB) subunit of TFIIH, which translocates downstream DNA into RNAP II against fixed upstream contacts to force DNA melting (Kim et al., 2000;Grünberg and Hahn, 2013). This contrasts with the spontaneous unwinding driven by RNAP/promoter DNA interactions alone during bacterial RPo formation (Liu et al., 2011).
Although there are clear similarities between σ and the eukaryotic basal transcription factor IIB in the contacts made to the 5′ RNA, hybrid junction, and ss-tDNA, there is no structural similarity between σ and TFIIB (Kostrewa et al., 2010; Liu et al., 2010; Sainsbury et al., 2013). These contacts may play similar roles in aiding promoter escape by helping eject σ or TFIIB from the RNAP active site cleft, but it is currently unclear whether any eukaryotic basal transcription factor stabilizes an upstream fork-junction by interactions similar to the σ-mediated Trp stacking (Figure 3A,C). Further, although effects on RPo formation may help regulate some eukaryotic promoters (Kouzine et al., 2013), other steps, including removal of nucleosomes and promoter-proximal pausing (Boeger et al., 2003; Adelman and Lis, 2012), appear to be rate-limiting at many eukaryotic promoters. Even when promoters are nucleosome-free, assembly of the PIC, rather than promoter opening, may be rate-limiting. Further mechanistic and structural studies of RNAPII on promoters with diverse architectures, including both TATA-containing and TATA-less promoters, are needed for a better understanding of the steps in RNAPII initiation.

Conclusions
The structures of RPo determined here reveal how the RNAP holoenzyme recognizes the extended −10 element, stabilizes the transcription bubble, directs the t-strand DNA into the RNAP active site, and how the RNA:DNA hybrid initiates σ A release. Supported by the real-time kinetic data, the structures elucidate the roles of individual aromatic amino acid residues in nucleation of the transcription bubble and maintenance of RPo stability, in part through previously unobserved stacking mechanisms. The results also provide a basis for more incisive investigations of RPo formation and transcriptional regulation (Bae et al., 2015).

Materials and methods
Preparation and crystallization of Taq Δ1.1σ A -holoenzyme/promoter complexes
Taq core RNAP and Δ1.1σ A were prepared as described previously (Murakami et al., 2003). Promoter DNA strands (Oligos Etc.) were annealed in 10 mM Tris-HCl, pH 8.0, 1 mM EDTA, 0.2 M NaCl and aliquots were stored at −20˚C.
For crystallization, aliquots of purified Taq core RNAP and Δ1.1σ A were thawed on ice and buffer-exchanged into crystallization buffer (20 mM Tris-HCl, pH 8.0, 0.2 M NaCl). Taq Δ1.1σ A -holoenzyme was formed by adding a 1.2-fold molar excess of Δ1.1σ A to the core RNAP and the mixture was incubated for 15 min at room temperature. A 1.5-fold molar excess of promoter DNA was then added to the holoenzyme along with MgCl 2 (10 mM final) and incubated for 15 min at room temperature. When present, a fivefold molar excess of RNA primer (GE Dharmacon, Lafayette, CO, United States) was also added. The final RNAP concentration was adjusted to 25 μM. Crystals were grown by vapor diffusion at 22˚C by mixing 1 μl of sample with 1 μl of reservoir solution (20 mM MgCl 2 , 20 mM Tris-HCl, pH 8.0, 1.6 M ammonium sulfate) in a 48-well hanging drop tray (Hampton Research, Aliso Viejo, CA, United States). Thin rod-shaped crystals (typically, 30 × 30 × 300 μm) appeared after about 5 days. The crystals were transferred into reservoir solution supplemented with 25% (vol/vol) glycerol in two steps for cryoprotection, then flash frozen by plunging into liquid nitrogen.

Structure determination
X-ray diffraction data were collected at Brookhaven National Laboratory National Synchrotron Light Source (NSLS) beamline X29 and at Argonne National Laboratory Advanced Photon Source (APS) NE-CAT beamlines 24-ID-C and 24-ID-E. Data were integrated and scaled using HKL2000 (Otwinowski and Minor, 1997). The diffraction data were anisotropic. To compensate, isotropy was approximated by applying a positive b factor along a* and b* and a negative b factor along c* (Table 1), as implemented by the UCLA MBI Diffraction Anisotropy Server (http://services.mbi.ucla.edu/anisoscale/) (Strong et al., 2006), resulting in enhanced map features (Figure 1C, Figure 4B).
In the RPo structure, the ss t-strand DNA from −11 to −4 was only modeled in one complex of the asymmetric unit. In the other complex, strong, connected Fourier difference density for this segment of DNA was observed but the density was relatively featureless and we were unable to model this segment of the DNA. In the us-fork (−11 bp) complex, the t-strand T −11 was modeled in only one complex of the asymmetric unit. In the other complex, density for this base was absent.

Resolution limit and structure validation
We follow the criteria of Karplus and Diederichs (2012), who showed that the R merge statistic commonly used to evaluate data quality is 'seriously flawed' and should not be used (Diederichs and Karplus, 1997), and that the commonly used criterion of <I>/σI > 2 also results in the loss of much useful crystallographic data (Karplus and Diederichs, 2012). Karplus and Diederichs (2012) showed, using objective and unbiased analyses, that inclusion of weak X-ray diffraction data (R merge values >> 1.0 and <I>/σI << 1) resulted in improved structural models. They introduced an improved statistic, CC* (essentially a Pearson correlation coefficient), that provides a single statistically valid guide for deciding whether diffraction data are useful.
Since most of the analyses described herein were performed from the RPo structure, we justify the inclusion of diffraction data to 4.14 Å-resolution for this case. Data in the highest resolution shell (4.29-4.14 Å) are very weak when examined by standard criteria (high R pim values and <I>/σI = 0.8, Table 1), but have good multiplicity (21.6) and completeness (99.8%), and yield a CC1/2 of 0.157, which is significantly different from zero for the large sample size (16,966 unique reflections) at exceedingly low p values (Rahman, 1968). That the highest resolution shells contain useful data and not noise is reflected in the observation that the R free and R work for the model refinement do not diverge (Figure 1-figure supplement 2, Figure 4-figure supplement 1). Inclusion of higher resolution data resulted in unacceptably low completeness in the highest shells due to the data anisotropy.
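As a rough numerical check of the statistics quoted above, the sketch below converts CC1/2 to CC* using the relation given by Karplus and Diederichs (2012), CC* = sqrt(2 CC1/2 / (1 + CC1/2)), and asks whether a correlation of 0.157 over 16,966 reflections differs from zero. The Student's t statistic for a Pearson correlation and the normal approximation to its tail are assumptions made here for illustration; they are not necessarily the procedure the authors followed, and all function names are hypothetical.

```python
import math


def cc_star(cc_half):
    """CC* from CC1/2 (Karplus and Diederichs, 2012)."""
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


def correlation_t_statistic(r, n):
    """Student's t statistic for a Pearson correlation r over n pairs."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def two_sided_p_normal(t):
    """Two-sided p value, using a normal approximation (assumed valid for large n)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Values quoted in the text for the highest resolution shell.
CC_HALF = 0.157
N_REFLECTIONS = 16966  # assumption: used as the number of correlated pairs

cc_star_value = cc_star(CC_HALF)
t_value = correlation_t_statistic(CC_HALF, N_REFLECTIONS)
p_value = two_sided_p_normal(t_value)

print(f"CC* = {cc_star_value:.3f}")
print(f"t   = {t_value:.1f}")
print(f"p   = {p_value:.2e}")
```

Under these assumptions the t statistic is large because of the sample size, which is the sense in which a small CC1/2 can still be significantly different from zero.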
In the final 2F o − F c electron density maps, numerous protein side chains were resolved, including many that appeared to form important protein/nucleic acid interactions. To confirm these protein side chain positions, we produced unbiased difference Fourier maps using a simulated annealing omit procedure. Protein segments flanking the side chains in question were removed completely from the structural model, and the modified models were subjected to simulated annealing refinement using PHENIX (Adams et al., 2010). We used annealing temperatures of 1000, 2500, 5000, and 10,000 K. All temperatures gave the same result (recovery of electron density for the omitted side chains), but the 5000 and 10,000 K refinements gave rise to obvious local structural distortions (expected for such high annealing temperatures with our low-resolution data), so the unbiased 2F o − F c maps were calculated from the 2500 K annealing refinements (Figures 2C,D, 3D,E).

Preparation of DNA for kinetic measurements
A 135 bp λ P R promoter with a Cy3 label at position +2 of the nontemplate strand was prepared using a 79 nt long synthetic oligonucleotide containing amino-dT at +2: ATCTATCACCGCAAGGGATAAATATCTAACACCGTGCGTGTTGACTATTTTACCTCTGGCGGTGATAATGGTTGCA/iAmMC6T/GT. The oligonucleotide was modified with Cy3-NHS and purified by reverse phase HPLC. The duplex was then prepared by Taq DNA polymerase extension of a partial duplex formed by mixing 0.25 μM Cy3-labeled nontemplate strand and 0.275 μM 79 nt template strand (TGCTGACTGCTTAATCGCTTCTAGGGATATAGGTAATTCCATACCACCTCCTTACTACATGCAACCATTATCACCGCCA) containing at the 3′-end a 23 bp sequence complementary to the 3′-end of the nontemplate strand. Extended duplex was purified on a 1 ml Resource Q column (GE Healthcare Bio-Sciences, Marlborough, MA, United States) using a gradient of 0-1 M NaCl in 25 mM Tris-HCl (pH 8), 10 μM EDTA. Fractions containing labeled promoter were precipitated with ethanol to remove salt.
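The oligonucleotide design above can be checked with a short sketch: two 79 nt strands whose 3′-ends are complementary over 23 nt should extend to a 135 bp duplex (79 + 79 − 23). The code below is illustrative only; the internal amino-dT (/iAmMC6T/) is represented as a plain T, which is an assumption for sequence comparison, and the function names are hypothetical.

```python
# Strands as given in the text; the amino-dT modification is written as "T" (assumption).
NONTEMPLATE = (
    "ATCTATCACCGCAAGGGATAAATATCTAACACCGTGCGTGTTGACTATTTTACCTCTGGCGGTG"
    "ATAATGGTTGCA"
    "T"   # /iAmMC6T/ at position +2
    "GT"
)
TEMPLATE = (
    "TGCTGACTGCTTAATCGCTTC"
    "TAGGGATATAGGTAATTCCATACCACCTCCTTACTACATGCAACCATTATCACCGCCA"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overlap(strand_a, strand_b):
    """Length of the longest stretch over which the two 3'-ends can base pair."""
    for k in range(min(len(strand_a), len(strand_b)), 0, -1):
        if reverse_complement(strand_a[-k:]) == strand_b[-k:]:
            return k
    return 0


overlap = three_prime_overlap(NONTEMPLATE, TEMPLATE)
duplex_length = len(NONTEMPLATE) + len(TEMPLATE) - overlap

print(f"strand lengths: {len(NONTEMPLATE)} nt, {len(TEMPLATE)} nt")
print(f"3'-end overlap: {overlap} nt")
print(f"extended duplex: {duplex_length} bp")
```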

Mechanistic model
Quantitative mechanistic studies have found at least two kinetically significant intermediates (designated I 1 and I 2 ) on the pathway to formation of RPo by Eco RNAP at the λ P R promoter (Davis et al., 2007; Gries et al., 2010; Saecker et al., 2011):
$$\mathrm{R} + \mathrm{P} \rightleftharpoons \mathrm{I}_1 \rightleftharpoons \mathrm{I}_2 \rightleftharpoons \mathrm{RPo} \tag{1}$$
where the interconversion between I 1 and I 2 is rate-limiting in both directions (Buc and McClure, 1985; Saecker et al., 2002). The rate-limiting step in the forward direction is the conversion of I 1 to I 2 , so under standard solution conditions, I 2 is never significantly populated (Gries et al., 2010). Because I 2 is not significantly populated under the conditions of association experiments, the three-step mechanism simplifies to the two-step mechanism (Figure 5A), where I 1 = RPi. Since the kinetics observed in the forward direction are well fit by a single exponential (Figure 5B), we deduce that RPi does not give rise to a significant fluorescence signal in our assay. In the reverse direction, however, rapid destabilization of RPo (such as with 1.1 M NaCl used here) generates a burst of I 2 (Kontur et al., 2006; Gries et al., 2010). The complexity and shapes of the dissociation curves observed by our fluorescence assay are consistent with the detection of a transient burst of I 2 after challenging pre-formed RPo with 1.1 M NaCl (Figure 5-figure supplement 1) (Gries et al., 2010). Real-time observation of I 2 is an important finding that merits further, quantitative study but is beyond the scope of this work. Instead, we have characterized the overall dissociation rate ($k_{\mathrm{off}}^{\mathrm{app}}$) by fitting the dissociation curves with a single exponential, which reveals the gross (>fourfold) differences in overall dissociation rates observed between wild type and mutant σ's (Figure 5D,

Forward kinetics
To measure the kinetics of RPo formation, Eco RNAP holoenzyme was loaded in one syringe of a stopped-flow instrument (SF-300X, KinTek Corporation, Austin, TX, United States) and Cy3-labeled promoter DNA in the other. After rapid mixing at the indicated temperature (37˚C or 25˚C), the final concentrations were: promoter DNA, 0.3 nM; RNAP, 2 to 150 nM in binding buffer (20 mM HEPES, pH 8.0, 100 mM K-Glutamate, 10 mM MgCl 2 , 1 mM DTT). Cy3 fluorescence emission was measured in real time with a 586/20 single-band bandpass filter (Semrock) and excitation at 550 nm. The kinetics of Cy3 fluorescence were determined at various RNAP concentrations and fit to a single exponential equation (Figure 5B):
$$F_t = F_0 + (F_\infty - F_0)\left(1 - e^{-k_{\mathrm{obs}} t}\right) \tag{2}$$
where F t is the fluorescence intensity of Cy3 as a function of time (t), F 0 is the initial fluorescence intensity, F ∞ is the fluorescence intensity at t = ∞, and k obs is the pseudo-first-order observed rate constant of the increase in Cy3 fluorescence. The data were interpreted assuming the following kinetic scheme (Figure 5A; McClure, 1980; Buc and McClure, 1985):
$$\mathrm{R} + \mathrm{P} \overset{K_d}{\rightleftharpoons} \mathrm{RP_i} \xrightarrow{k_2} \mathrm{RPo}$$
where the initial RNAP/promoter complex (RP i ), existing in rapid equilibrium with free promoter and RNAP (described by a dissociation equilibrium constant K d ), is converted in a rate-limiting step to RPo (described by the rate constant k 2 ). To obtain K d and k 2 , the observed rate constants (k obs , average values determined from >3 replicates) were plotted against RNAP concentrations (Figure 5C) and the data were fit to a hyperbolic equation (Saecker et al., 2002):
$$k_{\mathrm{obs}} = \frac{k_2\,[\mathrm{RNAP}]}{K_d + [\mathrm{RNAP}]}$$
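To illustrate how K d and k 2 can be extracted from the dependence of k obs on RNAP concentration, the sketch below assumes the standard rapid-equilibrium hyperbola, k obs = k 2 [RNAP]/(K d + [RNAP]), and recovers both parameters from synthetic, noise-free data. The parameter values (k 2 = 0.05 s⁻¹, K d = 20 nM) are hypothetical and are not results from this study. For compactness the fit uses a double-reciprocal linearization; this is a swapped-in simplification, and direct nonlinear least-squares fitting is generally preferable for noisy data.

```python
def k_obs_hyperbola(rnap_nM, k2, kd_nM):
    """Observed rate constant for rapid-equilibrium binding followed by isomerization."""
    return k2 * rnap_nM / (kd_nM + rnap_nM)


def fit_double_reciprocal(rnap_nM, k_obs):
    """Estimate (k2, Kd) from the line 1/k_obs = 1/k2 + (Kd/k2) * (1/[RNAP])."""
    xs = [1.0 / c for c in rnap_nM]
    ys = [1.0 / k for k in k_obs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals Kd / k2
    intercept = mean_y - slope * mean_x    # equals 1 / k2
    k2 = 1.0 / intercept
    kd = slope * k2
    return k2, kd


# Hypothetical parameters and concentrations spanning the 2-150 nM range in the text.
TRUE_K2 = 0.05      # s^-1, illustrative only
TRUE_KD = 20.0      # nM, illustrative only
concentrations = [2.0, 5.0, 10.0, 25.0, 50.0, 100.0, 150.0]
rates = [k_obs_hyperbola(c, TRUE_K2, TRUE_KD) for c in concentrations]

k2_fit, kd_fit = fit_double_reciprocal(concentrations, rates)
print(f"k2 = {k2_fit:.4f} s^-1, Kd = {kd_fit:.2f} nM")
```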

Reverse kinetics
Cy3-labeled DNA promoter fragments (0.3 nM) in binding buffer were mixed with RNAP-holoenzyme (100 nM) and incubated at 37˚C for 20 min to preform RPo. The complexes were then rapidly mixed in the stopped-flow instrument with the same buffer supplemented with NaCl to give a final NaCl concentration of 1.1 M. The kinetics of high-salt induced RPo decay were recorded in the same manner as for the forward direction. Averaged time traces from ≥3 replicates were fit to a single exponential (Equation 2) corresponding to a simplified kinetic scheme:
$$\mathrm{RPo} \xrightarrow{k_{\mathrm{off}}^{\mathrm{app}}} \mathrm{R} + \mathrm{P}$$
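A minimal sketch of the single-exponential analysis follows, assuming the form F t = F ∞ + (F 0 − F ∞ ) exp(−k t) and a known plateau value F ∞ . The trace is synthetic and noise-free, and the rate constant (0.02 s⁻¹) and fluorescence levels are hypothetical. The log-linear regression used here is a simplification chosen to keep the example dependency-free; it is not necessarily the fitting procedure used for the data in this study.

```python
import math


def single_exponential(t, f0, f_inf, k):
    """Fluorescence at time t for a single-exponential approach from f0 to f_inf."""
    return f_inf + (f0 - f_inf) * math.exp(-k * t)


def fit_rate_log_linear(times, signal, f_inf):
    """Estimate k from the slope of ln|F_t - F_inf| versus t.

    Points equal to the plateau must be excluded by the caller (log of zero).
    """
    ys = [math.log(abs(f - f_inf)) for f in signal]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    return -sty / stt


# Hypothetical decay trace: values are illustrative, not measured.
TRUE_K_OFF = 0.02          # s^-1
F0, F_INF = 1.0, 0.4       # arbitrary fluorescence units
times = [10.0 * i for i in range(21)]  # 0 to 200 s
trace = [single_exponential(t, F0, F_INF, TRUE_K_OFF) for t in times]

k_off_fit = fit_rate_log_linear(times, trace, F_INF)
print(f"apparent k_off = {k_off_fit:.4f} s^-1")
```

With real traces the plateau is usually a fitted parameter, so a nonlinear fit of all three parameters would be the more robust choice.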