A mismatch bubble in double-stranded DNA suffices to direct precise transcription initiation by Escherichia coli RNA polymerase.

Formation of a transcription-competent "open" complex between Escherichia coli RNA polymerase and a promoter, where base pairing is disrupted over a region of 12 base pairs including the start site of transcription, is a complex process involving at least three steps: recognition of specific DNA sequences, a conformational change in RNA polymerase, and DNA melting. By using synthetic constructs devoid of promoter-specific sequences, we show here that a mismatch bubble of 12 base pairs suffices to direct transcription initiation in divergent directions from its edges, reflecting the absence of polarity determinants for RNA polymerase binding. Bubble transcription is obtained with both core polymerase and holoenzyme, but efficient formation of heparin-resistant initiation complexes requires the sigma (specificity) factor. Based on these results it is likely that the sigma factor blocks access of the heparin to a site on the holoenzyme.

Escherichia coli RNA polymerase holoenzyme consists of a multi-subunit catalytic form (core polymerase) and a specificity factor designated σ. Holoenzyme is able to recognize its start site for RNA synthesis (the promoter) by virtue of two regions of sequence conservation that are designated -10 and -35 to indicate their approximate positions upstream of the start site (1, 2). It is thought that the σ factor plays a direct role in the recognition process by contacting the -10 and the -35 regions (3-5). The existence of a variety of σ factors, specific for different promoter sequences, has been documented (6, 7), but we here only consider holoenzyme containing the main E. coli σ factor, σ⁷⁰. As was first shown by Siebenlist et al. (8), in a transcription-competent or "open" complex between RNA polymerase holoenzyme and a promoter, base pairing in the region from -10 to +2 is disrupted. Several studies indicate that at least two steps precede open complex formation, namely promoter binding and a conformational change in RNA polymerase (9). In an attempt to simplify the study of this process, we had previously constructed variants of a phage λ promoter for which the region normally opened by RNA polymerase contained an engineered mismatch "bubble" (10). Unexpectedly, an additional transcription start site was observed in the bubble region, from which transcription proceeded in the divergent direction. This result raised the possibility that RNA polymerase recognized the mismatch bubble itself, but due to the sequence of the template the activation of an otherwise poor promoter remained a viable alternative explanation.
* This research was supported by National Institutes of Health Grants GM 31808 (to P. L. H.) and GM 47446 (to J. D. H.). The core facility at Case Western Reserve University (oligonucleotide synthesis) is supported by United States Public Health Service Grant P30CA43703. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
To whom all correspondence should be addressed: Dept. of Biochemistry, School of Medicine, Case Western Reserve University, 10900 Euclid Ave., Cleveland,
To distinguish between these two alternatives, we constructed synthetic templates containing a 12-base bubble, embedded in sequences designed to have little similarity to the consensus E. coli promoter. Here we report that such a bubble is indeed sufficient to direct the initiation of RNA synthesis. These results expand the utility of oligonucleotide bubble templates to the study of transcription initiation in addition to their demonstrated use as model elongation substrates (11).

EXPERIMENTAL PROCEDURES
Materials-E. coli RNA polymerase holoenzyme was purified as described (12) with the addition of a phosphocellulose column eluted with buffer containing 50% glycerol as the final step (13) or purchased from Epicentre Technologies (Madison, WI); the latter was also the source of the core enzyme. Oligonucleotides were synthesized using Applied Biosystems DNA synthesizers or purchased from Midland Certified Reagent Co. (Midland, TX). The "L" primer was identical to the 5'-most 18 bases of the top strand in Fig. 1. (Partially) complementary DNAs were annealed by mixing 10 pmol of each strand in 20 µl of 0.5 M NaCl, 0.01 M Tris, pH 8.0, heating to 85 °C, and slowly cooling to room temperature. The gapped template was assembled from three oligodeoxyribonucleotides; the same top strand as above was hybridized to two smaller oligodeoxyribonucleotides to yield a template that lacks the 12-base stretch of DNA on the bottom strand of the bubble region. The hairpin template has the left stem and a 26-base pair loop corresponding in sequence exactly to the bubble region (including the first base pair of the right stem); it consists of one oligodeoxyribonucleotide, which was quick-cooled on ice after heating to 85 °C.
RNA Synthesis and Purification-Runoff RNA synthesis was carried out as described in Ref. 14. The reactions (20 µl) contained DNA and RNA polymerase (at concentrations indicated in the legend to Fig. 2) in 30 mM Tris-HCl (pH 8.2), 100 mM KCl, 3 mM MgCl₂, 0.1 mM EDTA, 10 µM dithiothreitol, and 45 µg/ml bovine serum albumin. After a 5-min preincubation, heparin was added to 50 µg/ml, followed after another 5-min incubation by 5 µl of an NTP mix, providing final concentrations of 0.2 mM GTP, ATP, and CTP, 2 UTP (containing 15 µCi of [α-³²P]UTP). RNA products were run on analytical or preparative 10% acrylamide, 8 M urea gels; the procedure for RNA extraction from excised bands was as described (15). Primer extension with the leftward and rightward RNAs was carried out using the procedure of A. Aiyar and J. Leis (personal communication) with primers L and R. Briefly, about 0.1 pmol of RNA was mixed with 1 pmol of phosphorylated primer (10⁶ cpm) in 5 µl of water; the mixture was heated to 100 °C for 1 min, cooled to 40 °C, and then placed on ice. The volume was raised to 10 µl by the addition of KCl and Tris-HCl (pH 8.0) to 0.1 M for each. Five µl of this mix were heated to 40 °C for 10 min, and then primer extension was initiated by the addition of dNTPs and rATP to 125 µM in a reaction buffer containing 6 mM MgCl₂, 7 mM dithiothreitol, 25 mM Tris-HCl, 25 mM KCl, and 5 units of avian myeloblastosis virus reverse transcriptase (a gift from Dr. J. Leis). The reaction was incubated at 37 °C for 40 min, phenol extracted, and loaded onto a 10% sequencing gel, which also included dideoxy sequencing reactions off of the two DNA strands for calibration.
Gel Mobility Shift-RNA polymerase (60 nM holoenzyme or 66 nM core polymerase) and 40 nM DNA (10⁶ cpm; end-labeled) were incubated for 10 min in transcription buffer, followed by a 1-min challenge with heparin added to a concentration of 50 µg/ml. At this time the complexes were exposed to various mixtures of nucleoside triphosphates (final concentration, 300 µM) for 5 min. Complexes of DNA and RNA polymerase were separated from free DNA on 4% (59:1 acrylamide:bisacrylamide) nondenaturing gels (16) buffered by 0.5 x Tris borate EDTA buffer. Subsequent to electrophoresis the bands on the gel were transferred to a polyvinylidene difluoride membrane as described (15). After blocking the membrane and exposing it to 0.5 µg/ml 2F8 anti-σ antibody (17) followed by 1.7 µg/ml goat anti-mouse antibody coupled to horseradish peroxidase (Amersham Corp.), chemiluminescence was developed as described by the manufacturer. For its detection a 2-min exposure to Kodak X-Omat AR film was sufficient. For the detection of radioactivity, the chemiluminescence signal was blocked by a layer of aluminum foil, followed by autoradiography overnight.
DNase I Footprinting-These experiments were performed essentially as described in Ref. 18. Complexes between RNA polymerase (60 nM holo or 66 nM core) and singly end-labeled DNA (approximately 10⁵ cpm) were preincubated for 10 min at 37 °C in transcription buffer containing MgCl₂ to a final concentration of 20 mM. After the addition of heparin to 10 µg/ml for 1 min, DNase I was added to 0.2 mg/ml for 1 min, after which the reaction was terminated by the addition of EDTA to 20 mM. The reactions were then loaded onto a 4% nondenaturing polyacrylamide gel as described above, the bands corresponding to the RNA polymerase-DNA complexes were cut out, and the labeled DNA was extracted and rerun on a 10% denaturing polyacrylamide gel to visualize regions of protection from DNase I cutting. The regions of protection were assigned by comparison to DNA sequencing reactions using primers L or R. In view of the differences in DNA ends between the sequencing products and those produced by DNase I cutting, the assignments may be inaccurate by as much as two phosphodiester bonds.

RESULTS
Requirements for Initiation of RNA Synthesis at a DNA Bubble-To assess whether a mismatch bubble was sufficient to direct initiation of RNA synthesis at a specific site, we constructed the template TB12 shown in Fig. 1. This construct has a 12-base pair mismatch bubble, asymmetrically disposed on the 90-base pair duplex. TB0 has the same top strand, but a bottom strand that is fully complementary to it. The constituent oligonucleotides were designed to minimize the extent of unintended promoter-like sequences. Using the method of McClure and colleagues (19) we searched the sequence for similarity to an extended promoter sequence as defined by the authors. The results (not shown) indicated that no promoter sequence with a "similarity score" over 42 was present, a score lower than that characteristic of even poorly active promoters. If the search was limited to matches to the consensus -10 and -35 hexamers separated by 15-19 base pairs, three potential promoter sequences (similarity scores 50-55) were found, only one of which has a predicted start site in the bubble region (see below).
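The restricted search described above (matches to the two consensus hexamers at a 15-19 base pair spacing) can be illustrated with a minimal sketch. This is not the similarity-score method of Ref. 19; it merely counts mismatches against the consensus hexamers TTGACA and TATAAT. The function names, the mismatch threshold, and the toy sequence are all assumptions made for illustration, and the toy sequence is not the TB12 sequence.

```python
from typing import List, Tuple

CONSENSUS_35 = "TTGACA"  # consensus -35 hexamer
CONSENSUS_10 = "TATAAT"  # consensus -10 hexamer
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, so the other strand can be scanned too."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_promoter_like(seq: str, max_mismatches: int = 5,
                       spacers: range = range(15, 20)
                       ) -> List[Tuple[int, int, str, str, int]]:
    """List candidate (-35, -10) hexamer pairs read 5'->3' on one strand.

    Each hit is (index of -35 hexamer (0-based), spacer length,
    -35 hexamer, -10 hexamer, total mismatches to both consensus hexamers).
    The threshold is an arbitrary illustrative cutoff, not a published one.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hex35 = seq[i:i + 6]
        for spacer in spacers:
            j = i + 6 + spacer
            if j + 6 > len(seq):
                break  # longer spacers would run off the end as well
            hex10 = seq[j:j + 6]
            total = mismatches(hex35, CONSENSUS_35) + mismatches(hex10, CONSENSUS_10)
            if total <= max_mismatches:
                hits.append((i, spacer, hex35, hex10, total))
    return hits


# Hypothetical toy sequence with a perfect promoter at a 17-bp spacing.
toy = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GG"
for strand, s in (("top", toy), ("bottom", reverse_complement(toy))):
    for hit in scan_promoter_like(s, max_mismatches=2):
        print(strand, hit)
```

Scanning both the sequence and its reverse complement matters here because a promoter-like element may lie on either strand of the duplex.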
The ability of E. coli RNA polymerase to use TB12 and TB0 as templates for RNA synthesis is shown in Fig. 2. The combination of holoenzyme and TB12 leads to the heparin-resistant synthesis of two predominant RNA products; more slowly migrating faint bands have mobilities consistent with their being end-to-end transcripts. No activity is seen with TB0 (lane 2) or with TB6 containing a 6-nucleotide bubble (S. E. Aiyar, J. D. Helmann, and P. L. deHaseth, unpublished experiments), establishing a mismatch spanning between 6 and 12 nucleotides as a requirement for initiation of RNA synthesis by holoenzyme. The RNA products made off of the bubble template are sensitive to RNase A but resistant to RNase H; after heating and slow cooling of the transcription mixture, the pattern of sensitivities to the two RNases is reversed (data not shown). We interpret the latter results as an indication that the RNA is displaced from the template by the RNA polymerase rather than persisting as an RNA-DNA hybrid as has been observed with some artificial DNA templates (e.g. Ref. 20). With core polymerase TB12 was an efficient template only in the absence of a heparin challenge (compare lanes 10 and 7). The addition of anti-σ antibody to the holoenzyme leads to a drastic reduction in the amounts of RNA synthesized (compare lanes 5 and 6). These results as well as those of Fig. 4 indicate that core complexes are more sensitive to displacement by added heparin than complexes with holoenzyme. A few complexes must escape the effect of heparin, as in the presence of a 10-fold increased amount of ATP, CTP, and GTP (2 mM versus 0.2 mM), enhanced levels of RNA synthesis are seen with core polymerase. The levels of synthesis are still well below those with holoenzyme, however; the ratio of RNA synthesis for holo and core is about 9 at 0.2 mM and about 5 at 2 mM (data not shown).
Start Sites for Initiation of Bubble Transcription-Comparison with markers indicated that the two RNAs (indicated by arrows in Fig. 2) were approximately 46 and 35 nucleotides long, consistent with RNA synthesis initiating in the bubble region and proceeding right- and leftward, respectively. That this was indeed the case was demonstrated in primer extension experiments to map the start sites of the individual gel-purified RNA species. The results are shown in Fig. 3, where the lengths of the primer-extended products are compared with sequencing reactions carried out using the same primers; lanes 1-5 show the results with the longer (rightward) RNA and lanes 6-10 those for the shorter (leftward) RNA. With each primer a unique product is obtained using the purified RNA as the template (similar results were obtained when unpurified RNA was used), indicating that only one start site is used for transcription in each direction. The primer-extended product made using either RNA as template does not line up perfectly with any band of the corresponding chain-terminated sequencing reactions, presumably due to differences in their sequence in the bubble region. At the start sites shown, the first base inserted would be an A, the vastly preferred base for initiation at E. coli promoters (1, 2). Transcription originates in either half of the bubble and proceeds outward, resulting in RNAs of 47 and 36 nucleotides for the right- and leftward transcript, respectively.
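The relation between a start site and the length of its runoff transcript is simple arithmetic, sketched below. The coordinate convention (positions numbered 1 to 90 from the left end of the duplex, blunt ends, transcription running to the last base) and the start positions 36 and 44 are assumptions back-calculated from the reported transcript lengths; they are not read from Fig. 1.

```python
from typing import Tuple


def runoff_lengths(template_len: int, left_start: int,
                   right_start: int) -> Tuple[int, int]:
    """Return (leftward, rightward) runoff lengths in nucleotides.

    Positions are 1-based from the left end of the duplex (assumed convention).
    A leftward transcript starting at position p covers p..1, i.e. p nucleotides;
    a rightward transcript starting at q covers q..template_len.
    """
    for pos in (left_start, right_start):
        if not 1 <= pos <= template_len:
            raise ValueError("start position outside the template")
    return left_start, template_len - right_start + 1


def starts_from_lengths(template_len: int, leftward_len: int,
                        rightward_len: int) -> Tuple[int, int]:
    """Invert runoff_lengths: recover (left_start, right_start)."""
    return leftward_len, template_len - rightward_len + 1


# Hypothetical start positions inferred from the 36- and 47-nucleotide RNAs.
left_start, right_start = starts_from_lengths(90, 36, 47)
print(left_start, right_start, right_start - left_start)
```

Under these assumptions the two start sites lie 8 positions apart, which is compatible with both falling within a 12-base bubble.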
RNA Polymerase Complexes with Bubble DNA-In agreement with the observation that in the absence of heparin both core and holo RNA polymerases are able to initiate RNA synthesis, we find that both forms of RNA polymerase bind in a bubble-dependent manner to DNA, as shown in Fig. 4. Complexes formed between RNA polymerase and bubble DNA were separated on a nondenaturing gel and transferred to a membrane that was then subjected to both autoradiography to detect radiolabeled DNA (Fig. 4a) and probing with anti-σ antibody (Fig. 4b). Fig. 4a, lane 4, shows that the addition of holoenzyme resulted in the appearance of a new DNA band that migrated with greatly reduced mobility compared with the free DNA. With core, two species are detected, bracketing in mobility that formed with holoenzyme. In the absence of added heparin the top band predominates, but after a 1-min challenge with 50 µg/ml heparin, the bottom band is most abundant. As our transcription runoff experiments (Fig. 2) demonstrate that challenge with this concentration of heparin prevents RNA synthesis by core polymerase, it is tempting to speculate that only the more slowly migrating species is initiation-competent. Both are obliterated by the addition of heparin to 250 µg/ml, a concentration that does not affect the complex with holoenzyme (data not shown).
The addition of all four nucleoside triphosphates to the complex with holoenzyme before loading it on the gel (lane 8) leads to the almost complete disappearance of the holo-specific band and the appearance of a species corresponding in mobility to the lower core-specific band. This is consistent with the release of the σ factor at some point during the elongation reaction and the rebinding of core polymerase in an initiation-incompetent fashion. Clearly the vast majority of the complexes with holoenzyme is transcription-competent. The omission of GTP leads to the synthesis of small RNAs, nominally tetramers and hexamers in the left- and rightward directions, respectively, and the conversion of only a small amount of the holo complex (lane 7). With other subsets of NTPs, no effect was seen on the holo complex (lanes 5 and 6). To directly verify the presence of the σ factor in the various complexes detectable on the gel, the membrane was also probed with anti-σ antibody, followed by the detection of the antibody by chemiluminescence (Fig. 4b). This approach confirmed the presence of the σ factor in the complex of TB12 with holoenzyme both in the absence of any NTPs and in the presence of various subsets. Apparently the σ factor stays bound during the synthesis of the short RNAs.
Some of the bubble DNA-RNA polymerase complexes have been further characterized by DNase I footprinting. Complexes were formed and treated with DNase I and then isolated on a nondenaturing polyacrylamide gel as described under "Experimental Procedures." After removal of bound RNA polymerase, the DNA was analyzed on a denaturing polyacrylamide gel. Results for the complex with holoenzyme are shown in Fig. 5.
Holoenzyme protects close to 40 base pairs on either DNA strand of the bubble complex, with the bubble situated approximately in the middle of this region; on the top strand, 15 and 12 phosphodiester bonds are protected 5' and 3' of the bubble, respectively, while on the bottom strand 10 and 15 bonds are protected on the 5' and 3' sides of the bubble. Due to the nonuniform cutting of bonds by DNase I, the slight apparent asymmetry in the positioning of the footprint with respect to the location of the bubble is probably not significant. It is likely that the footprints obtained are the sum of protections obtained with two populations of RNA polymerase-DNA complexes, poised to initiate in either direction (see also below). Each of these footprints might well be asymmetric, resulting in the incomplete protections seen for the regions furthest removed from the bubble. Even so, it is evident that the upstream portion of the holoenzyme footprint does not extend as far from the region of strand separation as that for open complexes at promoters; the region of DNA that is in close contact with RNA polymerase is shorter by 20-30 bases for the complexes at the bubble than for those at the promoters. In comparison, we had previously determined a size of 70 base pairs for the region protected by holoenzyme when bound over a mismatch bubble constructed in the context of promoter DNA (10). With core polymerase the protected region is shorter but did not clearly stand out, and the borders were difficult to assign (data not shown).
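As a rough check on the "close to 40" figure, the flanking protections quoted above can be added to the bubble length. The sketch below is only bookkeeping; it assumes that each protected phosphodiester bond corresponds to roughly one base position and that the bubble contributes its full 12 bases, and the variable names are invented for illustration.

```python
BUBBLE_LEN = 12  # bases in the mismatch bubble of TB12

# Protected bonds flanking the bubble, as (5' side, 3' side) for each strand.
FLANKS = {"top": (15, 12), "bottom": (10, 15)}


def footprint_span(five_prime: int, three_prime: int,
                   bubble: int = BUBBLE_LEN) -> int:
    """Approximate footprint size: both flanks plus the bubble itself."""
    return five_prime + bubble + three_prime


spans = {strand: footprint_span(*flanks) for strand, flanks in FLANKS.items()}
print(spans)
```

Both strands come out a little under 40 positions on this accounting, in line with the statement in the text and well short of the 70 base pairs quoted for the promoter-context bubble.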
Modulation of Bi-directional RNA Synthesis-Our results indicate that specific contacts between RNA polymerase and the -10 and -35 regions, crucial in promoter recognition and open complex formation, are not required for start site selection or initiation of RNA synthesis at a mismatch bubble. In the absence of sequence recognition no directionality is imparted on the polymerase, and its interaction with the bubble region gives rise to complexes that can transcribe in either direction. (The concurrent binding of two polymerases to one bubble seems highly unlikely, but we have not formally demonstrated that the two complexes are mutually exclusive.) Directionality is again observed in the presence of CI protein (Fig. 2, lane 3). Rightward transcription is little affected, while transcription in the leftward direction is greatly reduced, as the operator-bound CI protein presumably interferes with RNA polymerase binding and initiation in the latter direction.
Other DNA Templates-It is known that various other template constructs can be used by RNA polymerase, some preferentially by core polymerase (21, 22). To define further initiation at the bubble templates, we also used the "gapped" and "hairpin" templates (Fig. 1). The results are shown in Fig. 6. The use of a 10-fold higher concentration of nucleoside triphosphates (2 versus 0.2 mM in Fig. 2) caused a greater amount of RNA to be synthesized with both templates, but without affecting the overall trends. The gapped template, which contains a 12-nucleotide gap in the DNA of the bottom strand as compared with the TB12 template, is better used by holoenzyme than by core, both leading to the synthesis of the 36- but not the 47-nucleotide RNA product. This result would indicate that the presence of a single-stranded DNA region on the template rather than the nontemplate strand is crucial, as expected from other work (23). The hairpin template is conceptually equivalent to TB12 lacking the right stem of double helical DNA; thus it is not surprising that the RNA made has the size (36-mer) of the leftward transcript from TB12. Interestingly, this template is used preferentially by core polymerase even after a heparin challenge. The results obtained for the gapped and hairpin templates in the absence of a heparin challenge are similar in every respect to the pattern shown in Fig. 6 (data not shown). We do not understand why holoenzyme initiates better than core polymerase on the gapped template even in the absence of heparin.

DISCUSSION
Initiation at a Bubble Versus Initiation at a Promoter-Initiation by RNA polymerases at a mismatch bubble and a promoter is similar in several respects. In both cases, the σ factor is required for the formation of a heparin-resistant complex, and it stays bound while short RNA molecules are being synthesized. For initiation at promoters σ release occurs when the length of the nascent RNA is 10 nucleotides (1, 2, 24). With the bubble templates we did not investigate the specific release of the σ factor, but our experiments did indicate that no release occurred if the RNA was 6 nucleotides or shorter. Each complex uses a unique start site and synthesizes abortive products if provided with an incomplete set of rNTPs. We have also observed³ that initiation at a bubble is sensitive to rifampicin, again analogous to initiation at promoter DNA. These results indicate that bubble templates (without a hybridized primer) have the potential to serve as model systems for the study of steps in the process of formation of an RNA polymerase-promoter open complex. Application of this assay to the study of mutant σ factors is in progress.
The rightward start site is positioned more proximal to the edge of the bubble than is the leftward start site. We considered the possibility that RNA polymerase-induced strand separation occurred at the A-T rich right edge of the bubble, extending the bubble in this orientation and positioning this start site also further from the true edge. Chemical probing with potassium permanganate (25) did not support this hypothesis, however. Another possibility is that a promoter-like sequence detected in a computer search (see above), with a -10 region (CATAGT on the bottom strand; nonconsensus bases underlined) 9 base pairs upstream of the leftward start site, might be influencing start site selection despite its poor match with the consensus promoter (a total of 5 nonconsensus base pairs) and the long distance between the -10 region and the start site. Finally, it is possible that the known propensity of RNA polymerase toward end-binding (26) plays a role in the asymmetry of start site selection. Due to the 14-base pair difference in distance between the left and right ends and the bubble, it is possible that initiation in each direction could be affected differently by an end-bound RNA polymerase. It seems improbable, however, that an end-bound RNA polymerase could itself initiate at the bubble.
Initiation on Other Nonpromoter Templates-The similarity in behavior between initiation at promoters and on the model template described here is more remarkable if our results for TB12 are compared with other systems where anomalous DNA molecules have been employed as templates for RNA synthesis. Our own experiments have shown that preferential initiation by holoenzyme is obtained with the gapped but not the hairpin template. Precise initiation of RNA synthesis by E. coli RNA polymerase core and holoenzymes at nonpromoter sites has also been observed on other template constructs. Kadesch and Chamberlin (23) showed that various polymerases can initiate at the single strand-double strand junction of a template with a poly(dC) tail added to the 3' end of duplex DNA; in some respects these constructs are reminiscent of our gapped template. Recently Daube and von Hippel (11) have developed a bubble template with an RNA primer hybridized to one of the single strands as a model system for the study of elongation. Specific extension of the primer RNA was accomplished by core polymerase in the absence of the σ factor. Schaller and co-workers (27) have mapped the start site of the primer RNA for second strand synthesis on the single-stranded DNA phage M13 to a single-stranded region 6 bases upstream of a long double-stranded stem. To what extent this system is similar to RNA synthesis directed by the hairpin template is not clear. Finally, early results on the interaction of RNA polymerase with DNA containing nicks bear mention. While the exact relationship between the position of the nicks and the site of initiation was not determined, it was found that the presence of nicks markedly stimulated initiation by core but not by holo polymerase (21, 22). The nicks were envisaged to facilitate strand separation during formation of open complexes by core polymerase. In contrast, holo polymerase was proposed to bind extremely tightly to the nicks, resulting in a complex that did not engage in RNA synthesis, due to an inability to transition into an elongation mode.
Effect of Heparin-The experiments presented here suggest a model for the role of the σ factor in the formation of heparin-resistant RNA polymerase-DNA complexes. Heparin, a polyanionic polymer that mimics the highly charged sugar-phosphate backbone of nucleic acids, is thought to inactivate RNA polymerase by competitive binding to one or more nucleic acid binding sites to form a highly stable inhibited complex (28). For example, RNA polymerase in solution binds heparin and will no longer bind a DNA template, suggesting that heparin can occupy the template binding site (29). However, several RNA polymerase-DNA complexes are also heparin-sensitive. These include nonspecific core-DNA complexes, closed complexes, and, to varying degrees, open complexes (29, 30). Heparin appears to directly attack these complexes, because inactivation is much more rapid than the rate of dissociation to free RNA polymerase (29, 30).
Several factors appear to determine the level of heparin resistance of open complexes, including the presence or absence of magnesium ion (24) and promoter-specific variability in RNA polymerase conformation (29).
With the bubble template, it is possible to directly compare the heparin sensitivity of core and holo polymerase bound in similar transcriptionally active complexes. The much greater resistance of holo complexes to heparin challenge demonstrates that the σ factor stabilizes the bubble DNA complex against heparin attack, but the mechanism of this stabilization is not obvious. The σ factor is thought to play a role in both promoter recognition (3-6) and in the strand separation process that accompanies open complex formation (31). Neither of these activities is apparently required for the stabilization of bubble DNA complexes because (i) initiation is independent of promoter sequences and (ii) initiation on bubble DNA at low temperature is not impaired by mutations in region 2.3 of the σ factor, which impair the ability of holoenzyme to melt promoter DNA.⁴ We propose that the primary pathway for heparin inactivation of bubble DNA complexes involves attack on a product RNA binding site (cf. 32-34); in complexes with holoenzyme access to this site would be partially occluded by the bound σ factor. Consistent with this model, we note that if a partially complementary RNA oligomer is annealed to a DNA bubble, core polymerase can associate in a heparin-resistant fashion (11). In addition, elongation of RNA from single-stranded DNA templates, which leads to formation of an RNA-DNA hybrid and a presumably vacant product binding site, is immediately halted by the addition of heparin (28). The idea that the σ factor and the product RNA occupy partially overlapping sites on RNA polymerase is supported by binding studies. For example, the addition of RNA to holoenzyme displaces the σ factor and leads to the formation of an RNA-core complex (35). Growth of the nascent RNA chain into this putative product binding site may be one factor that helps trigger release of σ during the maturation of RNA polymerase into the highly processive and heparin-resistant elongation complex.