The nucleotide sequence in the promoter region of the gene for an Escherichia coli tyrosine transfer ribonucleic acid.

The sequence of the first 29 nucleotides in the promoter region of a tyrosine tRNA gene has previously been determined (Sekiya, T., van Ormondt, H., and Khorana, H.G. (1975) J. Biol. Chem. 250, 1087-1098). This work has now been extended to give the sequence of a total of 59 nucleotides; the sequence is as follows: (see article). The general approach used in the determination of the sequence involved the DNA polymerase I-catalyzed elongation of synthetic deoxyribopolynucleotide primers hybridized to the l-strand of phi80psu+III DNA at the appropriate site. Sequencing of the newly added nucleotides was facilitated by the use of a number of techniques including (a) elongation of the primer with the use of all of the four nucleoside 5'-triphosphates but limiting the concentration of one of the triphosphates, (b) insertion of ribonucleotide units at appropriate sites so as to permit subsequent specific cleavages by pancreatic RNase, and (c) two-dimensional fingerprinting of the oligonucleotides in conjunction with partial exonucleolytic degradation, comprehensive nearest neighbor analyses, and the determination of pyrimidine tracts.

The general approach used for the sequence determination involved the DNA polymerase I-catalyzed elongation of suitable deoxyribopolynucleotide primers when hybridized to the l-strand of phi80psu+III DNA at the appropriate site. Sequences of the newly grown oligonucleotide chains were determined by a combination of two-dimensional fingerprinting following partial exonucleolytic degradation, nearest neighbor analyses, and determination of pyrimidine tracts. Primer elongations were carried out in a controlled and stepwise manner and the newly grown oligonucleotide chains were kept short by incorporating the following features into the method: (a) the insertion of a ribonucleotide unit at or near the 3' terminus of the primers; (b) the use of a maximum of three nucleoside 5'-triphosphates in the first stage of the elongation reaction, isolation of the elongated primer, and its reuse in a second step together with different sets of deoxynucleoside triphosphates; and (c) elongation of the primer using all of the four nucleoside triphosphates with one of the triphosphates being supplied in a limiting concentration.
Studies carried out with short single- and double-stranded DNAs as templates for the Escherichia coli DNA-dependent RNA polymerase showed (3, 4) that there was a lack of specificity in both the initiation and termination of transcription. Consequently, the RNA products were heterogeneous and, furthermore, extensive synthesis of the transcripts was not realized.
Therefore, with the aim of realizing properly initiated and terminated synthesis of an RNA, a system was chosen which would allow the investigation of the promoter and terminator regions as well. Thus, work was initiated on the synthesis of the DNA corresponding to the E. coli tyrosine tRNA.
The gene for this tRNA can be inserted into the transducing bacteriophage phi80 (5) and biochemical studies using the transducing bacteriophage have closely defined the initiation and termination sites for the transcription of this gene by the isolation and characterization of a precursor to the tRNA (6). Further, the transducing bacteriophage containing the above tRNA gene provides a convenient starting material for structural work in the promoter and terminator regions by the approach described previously (1, 7-9) and below. Therefore, concurrently with the synthetic work on the DNA corresponding to the precursor for the above tRNA, investigation of the nucleotide sequences in the promoter and terminator regions of this gene was undertaken.
In previous papers, the sequence of 23 nucleotides beyond the C-C-A end of the above tRNA gene, the terminator region, has been reported (1). We now wish to report on the sequence of 29 nucleotides in the promoter region of this gene, the region preceding the starting point of the initiation of transcription. Preliminary accounts of this work have already appeared (10, 11).
The general approach used in the sequencing work involves the separation of the strands of the bacteriophage phi80psu+III DNA carrying the gene for the tyrosine suppressor tRNA, the hybridization of appropriate deoxyribopolynucleotide primers at the tyrosine tRNA gene termini in the r- or l-strand of the above DNA, and the controlled DNA polymerase-catalyzed extension of the primers at their 3' ends and determination of the nucleotide sequence of the newly added nucleotides to the primers.

FIG. 1. Experimental plan for sequencing and the nucleotide sequences determined in the promoter region of the tyrosine tRNA gene. DNA I: (5') T-A-C-T-G-G-C-C-T-G-C-T-C-C-C-T-T-A-T-C-G (3'). The primer-template complexes were initially obtained by hybridizing DNA I through DNA III to the l-strand of the phi80psu+III DNA. DNA polymerase-catalyzed elongations were carried out using the nucleoside triphosphates shown. The new nucleotide sequence discovered after each elongation and subsequent alkaline cleavage and analysis is shown in the appropriate dashed box. The asterisk after C, the 33rd nucleotide in the elongated primers, indicates that this position was occupied partly by dC and partly by rC. This was because the DNA III used contained two components; one contained eight nucleotides into the promoter region, whereas the second was shorter by two nucleotides (rC-A) at the 3' end. Elongation of this mixture with the deoxynucleoside triphosphate mixtures shown, therefore, gave dC in addition to rC at the seventh nucleotide in the promoter region.

FIG. 2. The nucleotide sequence in the promoter region of the tyrosine tRNA gene. Elements of 2-fold symmetry in the sequence are shown in the boxes with matching arrows pointing to the axis of symmetry.
The plan of the present experiments for the determination of the promoter sequence and the results obtained are illustrated in Fig. 1. The primer, DNA I, was extended by a guanine ribonucleotide unit at the 3' end (DNA II). Controlled chain elongation of the latter was performed by using dATP, dGTP, and dCTP.
The new nucleotide chain thus formed was isolated by alkaline cleavage at the rG site. Its sequence was shown to be as in the dashed box in A (Fig. 1). Next, the primer, DNA I, was first elongated using the three triphosphates mentioned above except that rCTP replaced dCTP.
The product, designated DNA III (Fig. 1), was elongated further in the presence of dATP, dTTP, and dCTP.
The addition of 11 new nucleotides was now observed in the major product and their sequence was determined after cleavage at the rC sites (dashed box in B, Fig. 1). In a third experiment, DNA III was extended using dATP, dCTP, dGTP, and a very low concentration of dTTP.
One of the products formed contained the new decanucleotide sequence shown in the dashed box in C (Fig. 1). The total sequence thus obtained is shown in the double-stranded form in Fig. 2.
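The stepwise logic of the plan above can be sketched in a few lines of Python. This is a deliberately simplified model, not the authors' procedure: it assumes the polymerase adds nucleotides until it needs a triphosphate that was not supplied, and it ignores limiting concentrations, misincorporation, and incomplete elongation. The function name `elongate` and the variable names are illustrative only. The 21 nucleotides used are the 11- and 10-nucleotide stretches reported in the section headings later in this paper.

```python
def elongate(downstream, available):
    """Return the nucleotides added to a primer before synthesis stalls.

    downstream: sequence (5'->3') the growing strand would acquire if all
                four triphosphates were present.
    available:  set of bases whose triphosphates are supplied.
    """
    added = []
    for base in downstream:
        if base not in available:
            break  # simplified model: stall at the first missing triphosphate
        added.append(base)
    return "".join(added)


# Nucleotides following the 3' end of DNA III, as reported in the text:
# an 11-nucleotide stretch followed by a 10-nucleotide stretch.
NEXT_11 = "TCATATCAAAT"
NEXT_10 = "GACGCGCCGC"
downstream_of_dna3 = NEXT_11 + NEXT_10

# Elongation of DNA III with dATP, dTTP, and dCTP (no dGTP).
step_b = elongate(downstream_of_dna3, {"A", "T", "C"})
print(step_b, len(step_b))
```

In this model the elongation with dATP, dTTP, and dCTP stops in front of the first G, which corresponds to the 11 new nucleotides described for the major product.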

EXPERIMENTAL PROCEDURE
Materials and Methods-Except for the following, these were as described in a previous paper (1).
DNA I-This was prepared by the T4 polynucleotide ligase-catalyzed joining of the chemically synthesized oligonucleotides 5'-32P-d-G-C-T-C-C-C-T-T-A-T-C-G and the unphosphorylated d-T-A-C-T-G-G-C-C-T in the presence of the complementary oligonucleotides. The details will be described elsewhere.
DNA II-The chemically synthesized and 5'-phosphorylated oligonucleotide, ... was elongated with rG units at the 3' end as follows (12). ... Final pH of the solution was 6.9. The incubation was carried out at 37° for 18 hours. The product was isolated by gel filtration through a Sephadex G-50 column (0.9 X 24 cm). Analysis showed that 4 guanylate units had been added to the deoxyoligonucleotide. Thus, the two-dimensional fingerprint (Fig. 3) ... Finally, degradation to 3'-mononucleotides gave the radioactive products dGp, rGp, and pdGp, the ratio of radioactivity in dGp + rGp to pdGp being 4.1:1.0. The above product was joined enzymatically to d-T-A-C-T-G-G-C-C-T and the joined product was isolated as shown in Fig. 4. The material (56 pmol) from the first peak in Fig. 4 ... three bands, A1 to A3, were observed.
FIG. 3. ... Conditions for the enzymatic reaction and isolation of the product have been given in the text. Partial degradation was with venom phosphodiesterase. Homochromatography in the second dimension was performed using Homomix II.
FIG. 4. ... in the presence of the complementary polynucleotides and isolation of the joined product. The reaction mixture (104 μl) contained 720 pmol of ...
... Analysis of the primer hybridized to the l-strand was carried out by filtration through an Agarose 1.5m column as described previously (8). ... For isolation, the reaction mixture was dialyzed, then heated at 100° for 2 min, and loaded on an Agarose 1.5m column (0.9 X 24 cm). Peak I contained the l-strand which had incorporated some radioactivity and Peak II was the required elongated DNA II, whereas Peak III contained the excess of nucleoside triphosphates.
After alkaline treatments to cleave at the rG site, the fragments containing the new sequences were separated by one-dimensional homochromatography.
As seen in Fig. 6, b to d, several products were obtained from each of the bands shown in Fig. 6a. A1 gave more of the shorter fragments whereas, as expected, the longer product, A3, gave longer oligonucleotides.
FIG. 6. ... Dyes I and II in Channel a indicate the positions of the dyes, xylene cyanol and bromphenol blue, respectively, used as markers. Homochromatography was carried out using Homomix II. The position of the decanucleotide, 32P-d-G-G-A-G-C-A-G-G-C, used as a marker in homochromatography, is indicated as Deca.
Although the oligonucleotides shown in Fig. 6 ... nucleotide, corresponded to the structure shown in the dashed box in Fig. 1 (Experiment A). The sequences of all of the oligonucleotides investigated are shown in Fig. 9. The sequences of the above products were confirmed by extensive nearest neighbor analyses and by Burton degradation.
The results are shown in Tables II and III. As is seen in Table II for H7, when [α-32P]dATP was used, radioactivity was found in dAp, dGp, and dCp in the ratio of 1:1:1. When [α-32P]dGTP was used, radioactivity was found in dAp, dGp, and dCp in the ratio of 1:3:2. Finally, when [α-32P]dCTP was used, radioactivity was in dGp only.
The results are all consistent with the sequence derived above. Similar analyses of the shorter fragments were also completely consistent with the sequences listed in Fig. 9.
The products formed on Burton degradation of the oligonucleotides were separated by electrophoresis on DE81 paper and the results are summarized in Table III.
The bands H2, H3, H4, H6, and H7 gave Pi and pdCp in the expected molar ratio.
The compounds H1 and H5 gave pdC in addition to Pi or pdCp or both, as would be expected from their 3'-terminal sequences.
Preparation of Primer, DNA III, for Further Sequencing-The sequence deduced above contains dC units at the initiation site and at positions 5 and 7 in the promoter region (Figs. 1 and 9).
Because CTP can substitute for dCTP in the DNA polymerase reaction (18, 19), the polymerase reaction described in the preceding section was repeated except that CTP replaced dCTP. The expected product (DNA III) would be an attractive primer for obtaining a manageable new sequence. In view of the incomplete elongation encountered above using 1 μM concentrations of the triphosphates, the present reaction was performed using 300 μM CTP, 18 μM dGTP, and 1.6 μM [α-32P]dATP.4 DNA III was isolated from excess nucleoside triphosphates and the template DNA by gel filtration through an Agarose 1.5m column. The product was analyzed by electrophoresis on a 15% polyacrylamide gel (Fig. 10, Channel a). The DNA III preparation contained two compounds, designated Band 1 and 2. From their mobilities on the gel and the results described below, it was concluded that Band 1 corresponded to DNA III containing the full eight nucleotides in the promoter region, whereas Band 2 lacked the dinucleotide sequence rC-A at the 3' end.
... were treated with venom phosphodiesterase and the digests were subjected to the two-dimensional fingerprinting procedure as that described under "Materials and Methods." The homomix used in A, B, C, and D was III, whereas that in E and F was II.

Sequence of Next 11 Nucleotides is d-T-C-A-T-A-T-C-A-A-A-T (Fig. 1)-DNA III prepared as above was used in two ... Only a few nucleotides were added as shown by the mobility of the extended primer on a polyacrylamide gel (Fig. 10, Channel b, A4). After alkaline cleavage of A4, separation by homochromatography (Fig. 11a) gave an oligonucleotide (H8) which was identified as d-A-T-C. Thus, degradation to 3'- and 5'-nucleotides (Table IV) gave the results expected for this sequence. Because the dA unit in this sequence is the 3'-terminal nucleotide (eighth nucleotide in the promoter) in DNA III, the sequence of the next two nucleotides is d-T-C.
4 Analysis of the composition of the products H1 to H7 shows that the incorporations of dGTP and dCTP were particularly rate-limiting, whereas dA incorporation evidently went to completion. Therefore, in this experiment, higher concentrations of dGTP and CTP were used.
This conclusion was also confirmed by the following experiment.
DNA III was next elongated using dATP, dTTP, and dCTP.
6 With this combination of the triphosphates, only the component carrying the complete octanucleotide sequence in DNA III (Band 1 in Fig. 10a) would undergo elongation.
The second component (Band 2 in Fig. 10a) lacking rC-A at the 3' terminus would be left out because of the absence of dATP. After the usual separation, the elongated product was subjected to electrophoresis on a polyacrylamide gel. As seen in Channel c in Fig. 10, one main product, designated A5, was obtained with the estimated size of 45 nucleotides.
The products formed after alkaline hydrolysis are shown in Fig. 11b. Two products, called H9 and H10, were obtained which were sequenced by partial enzymatic degradations followed by fingerprinting (Fig. 12). The sequences derived are also shown in Fig. 9.
Degradation to 3'-and 5'-nucleotides gave results which confirmed the above sequences (Table IV).
Thus, when [α-32P]dTTP was used and degraded to 3'-nucleotides, the radioactivity was found only in dAp in both H9 and H10. When [α-32P]dATP was used, the radioactivity in H9 was found in dAp, dTp, and dCp in the ratio of 2:1:2; however, the ratio of radioactivity in H10 was 2:1:3.
The latter results are consistent with the fact that one of the components in DNA III lacked the terminal rC-A sequence.
Elongation of this primer would lead first to the repair of this sequence and, therefore, when [α-32P]dATP is used, an extra mole of radioactive dCp would be found relative to H9 after degradation.
When [α-32P]dCTP was used, degradation of H10 to 3'-nucleotides gave radioactive dGp and dTp in the ratio of 1:2, whereas degradation of H9 gave radioactivity only in dTp. These results are again consistent with the structures shown in Fig. 9 and the fact that H10 must have arisen from the component in DNA III which lacked rC-A.
Depurination of H9 and H10 by the Burton method gave the results described in Table V. As can be seen, both products contain 2 pdTpdCp residues followed by 1 or more dA residues (formation of radioactive Pi) and also a pdTp residue again followed by 1 or more dA residues.
When [α-32P]dTTP was used, both H9 and H10 gave radioactivity in pdT. This result showed that the 3'-terminal nucleotide in both cases was dT. Further, as seen in Table V, one difference was observed between H9 and H10. When [α-32P]dATP was used, radioactive pdCp was found only in the case of H10. All of the above results are in accord with the sequences in Fig. 9 for H9 and H10.
The products were separated into Pi and pdCp + pdC by electrophoresis on DE81 paper. pdCp and pdC were separated by cellulose thin layer chromatography using Solvent III described previously (12). The numbers in parentheses are the observed molar ratios.
Sequence of Next 10 Nucleotides is d-G-A-C-G-C-G-C-C-G-C (Fig. 1)-During the above primer elongation experiments using a mixture of dATP, dGTP, and dCTP, the formation of products longer than those expected was sometimes noticed.

FIG. 9 (in part). Sequences of the oligonucleotides.
H10 G-C-A-T-C-A-T-A-T-C-A-A-A-T
H11 A-T-C-A-T-A-T-C-A-A-A-T-G-A-C-G-C-G-C-C-G-C
H12 G-C-A-T-C-A-T-A-T-C-A-A-A-T-G-
Appearance of the new products varied with the preparations of the triphosphates used and seemed to depend on the presence of a small amount of dTTP as an impurity.
Therefore, further elongation reactions were performed by the deliberate addition of 0.015 μM dTTP to the mixture of the remaining three triphosphates. In this way, it was hoped to obtain an elongated primer of the desired size (55 to 60 nucleotides) as a major product. When DNA III was used as the primer and the extended product examined by electrophoresis, the pattern shown in Fig. 10d was obtained.
The main band, A6, was subjected to alkaline hydrolysis and the polynucleotide with the new sequence was further purified (Fig. 11c). The main product now obtained (H11) was sequenced by fingerprinting of its partial enzymic digest (Fig. 13). Two sets of spots were seen, the major set corresponding to the sequence shown in Fig. 9 for H11. The faint spots, which also corresponded to a set, belonged to the contaminant, H12 (Fig. 9), present in H11. The contaminant, which was purified in a small amount by careful elution from the thin layer plate (Fig. 11), was examined for nearest neighbor analysis and for pyrimidine tracts. All of the evidence and its lower mobility pointed to its having the additional d-G-C sequence at the 5' end. These results were expected because the DNA III contained a component which lacked the rC-A sequence at the 3' end and terminated in the preceding rC-G sequence. Elongation of this sequence would, therefore, start with the d-C-A sequence and, after alkaline cleavage, the sequence at the 5' end in the new product would be d-G-C-A.
FIG. 10. ... Preparation of the primer, DNA III, elongation of DNA III, and isolation of elongation products were carried out as described under "Materials and Methods." The primer, DNA III, and the elongation products were separated by electrophoresis on 15% polyacrylamide gel using the conditions described under "Materials and Methods." Channel a contains the primer DNA III which contains two components, Band 1 and 2. ... in the presence of a limited amount of unlabeled dTTP (0.015 μM) which gave Band A6. Dyes I and II indicate the positions of the dyes, xylene cyanol and bromphenol blue, respectively, used as markers.
Results of the nearest neighbor analyses performed on different preparations of the products, H11 and H12, are compared in Table VI. Thus, when H11 was prepared using [α-32P]dATP as the labeled nucleotide, radioactivity was found in dAp, dGp, dTp, and dCp in the ratio of 2:1:1:2. A similar preparation of H12 gave the same results except that dCp was present in the molar ratio of 3. When [α-32P]dGTP was used in the elongation reaction, both H12 and H11 gave radioactivity in dTp and dCp in the identical ratio of 1:3. When [α-32P]dTTP was used as the labeled nucleotide, again both H11 and H12 gave dAp as the sole labeled nucleotide. Finally, when [α-32P]dCTP was used in the preparation of H11, radioactivity was found in dAp, dGp, dTp, and dCp in the ratio of 1:3:2:1. All of these results support the conclusions drawn from the fingerprints.
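The nearest neighbor ratios quoted above can be checked against the proposed sequences with a short calculation. The sketch below is an illustration under stated assumptions, not part of the original analysis: it assumes that the 32P introduced as the 5'-phosphate of each labeled residue is recovered, after degradation to 3'-mononucleotides, on the residue immediately to its 5' side, and that the first residue of each alkali-released fragment contributes no label to the fragment. The function name `nearest_neighbor` is hypothetical. H12 is written as H11 with the additional d-G-C at the 5' end, as concluded in the text.

```python
from collections import Counter


def nearest_neighbor(fragment, labeled_base):
    """Count the 3'-mononucleotides expected to carry label.

    Each occurrence of `labeled_base` (except at the 5' end of the fragment)
    transfers its 5'-phosphate label to the residue on its 5' side.
    """
    counts = Counter()
    for left, right in zip(fragment, fragment[1:]):
        if right == labeled_base:
            counts[left] += 1
    return dict(counts)


H11 = "ATCATATCAAATGACGCGCCGC"
H12 = "GC" + H11  # additional d-G-C at the 5' end

for base in "AGTC":
    print(base, nearest_neighbor(H11, base))
```

Under these assumptions the counts for H11 and H12 agree with the ratios reported for each of the four labeled triphosphates, including the extra mole of dCp for H12 when [α-32P]dATP is used.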
*p indicates the position of the radioactive phosphate which was determined by treatment with bacterial alkaline phosphatase followed by an electrophoresis on DE81 paper.
FIG. 11. ... and hydrolyzed with 0.5 N KOH. The hydrolysates were subjected to homochromatography. Channel a contains the newly formed oligonucleotide (H8) obtained from Band A4 in Fig. 10b. Channel b contains the oligonucleotides (H9 and H10) obtained from Band A5 in Fig. 10c. Channel c contains the oligonucleotides (H11 and H12) obtained from Band A6 in Fig. 10d. Homochromatography as shown in Channel a was carried out using Homomix III. Homochromatography shown in Channels b and c was carried out using Homomix II and I, respectively. The markers, Penta, Deca, and Dodeca, are as described in Fig. 6. The marker, DNA I, indicates the position taken by [32P]DNA I.
The results of Burton degradation on the products H11 and H12 are shown in Table VII. When [α-32P]dTTP was used, H11 gave *pdTp and *pdTpdCp in the ratio of 2:2. H11 prepared by using [α-32P]dCTP gave *pdCp, *pdC, *pdC*pdCp, and pdT*pdCp in the ratio of 2:1:1:2. These results indicate that H11 contains in its sequence 2 pdCp, 1 pdCpdCp, 2 pdTp, and 2 pdTpdCp as pyrimidine tracts and pdC as the 3' end nucleotide. When [α-32P]dATP was used, pdT*p and pdTpdC*p were produced in the ratio of 1:2 from H11. This suggests that both of the pdTpdCp sequences and 1 of the 2 pdTp residues are followed by a dA residue. H11 prepared from [α-32P]dGTP gave pdC*p, pdCpdC*p, and pdT*p in the ratio of 2:1:1, and the result indicated that pdCpdCp, both the pdCp, and 1 of the 2 pdTp are followed by a dG residue. These results completely agree with the above described sequence for H11.
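The pyrimidine tract analysis can be reproduced from the sequence deduced for H11. The following sketch assumes only that depurination releases each uninterrupted run of pyrimidines as a separate tract; it does not model phosphate labeling or recovery. The function names `pyrimidine_tracts` and `tracts_followed_by` are illustrative, not taken from the paper.

```python
import re
from collections import Counter


def pyrimidine_tracts(fragment):
    """Return (internal, three_prime) pyrimidine runs of a sequence.

    internal:    Counter of runs that are followed by a purine.
    three_prime: the run at the 3' end, or None if the sequence ends
                 in a purine.
    """
    internal = Counter()
    three_prime = None
    for match in re.finditer(r"[CT]+", fragment):
        if match.end() == len(fragment):
            three_prime = match.group()
        else:
            internal[match.group()] += 1
    return internal, three_prime


def tracts_followed_by(fragment, purine):
    """Count the pyrimidine runs whose 3' neighbor is the given purine."""
    return Counter(
        m.group()
        for m in re.finditer(r"[CT]+", fragment)
        if fragment[m.end():m.end() + 1] == purine
    )


H11 = "ATCATATCAAATGACGCGCCGC"
internal, three_prime = pyrimidine_tracts(H11)
print(dict(internal), three_prime)
print(dict(tracts_followed_by(H11, "A")))
print(dict(tracts_followed_by(H11, "G")))
```

For H11 this gives two T-C, two T, two C, and one C-C tract plus a 3'-terminal C, and the tracts preceding dA and dG match the 1:2 and 2:1:1 ratios described for the [α-32P]dATP and [α-32P]dGTP experiments.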
Results of the parallel experiments carried out with H12 (Table VII) were also consistent with the sequence deduced for H12.
There was, however, one deviation from the expected results which was observed when [α-32P]dCTP was used as the radioactive triphosphate. The ambiguity evidently arose from the lack of separation of H11 and H12 in this particular experiment and, because H11 was the major product, analyses carried out on the product corresponded more closely to H11 than to H12. Thus, only 2 mol of pdCp were recovered in place of the expected 3 mol (data not shown).
FIG. 12. ... phosphodiesterase degradations of the products H9 and H10. The oligonucleotides, H9 and H10, purified as in Fig. 11b and under "Materials and Methods" were treated with snake venom phosphodiesterase (A and C) or spleen phosphodiesterase (B and D). The digests were subjected to the two-dimensional fingerprinting procedure as that described under "Materials and Methods." The homomix used was II. The markers, Penta, Deca, and Dodeca, are as described in Fig. 6. The sequence derivation from the pattern of products is shown at the left in each case.
FIG. 13. ... The oligonucleotide H11 obtained as in Fig. 11c and under "Materials and Methods" was treated with snake venom phosphodiesterase and the digests were subjected to the two-dimensional fingerprinting procedure. For homochromatography, Homomix II was used. The sequence is derived from the pattern as shown in the reproduction on the left of the fingerprint.
DISCUSSION
The approach used in the present work has involved the DNA polymerase I-catalyzed elongation of a primer hybridized at the appropriate site to the single-stranded DNA template containing the polynucleotide sequence of interest. The same approach has been used previously for the determination of the nucleotide sequence in the terminator region of the same gene (1, 9), and it is also being used extensively in other laboratories (17, 20). The potential and flexibility of the method have now been enhanced by the introduction of three types of procedures: the insertion of ribonucleotide units at strategic places so as to allow specific cleavages and isolation of shorter chains containing the new nucleotide sequences; carrying out controlled and stepwise elongation by using three (or fewer) nucleoside triphosphates, isolating the extended primer, and repeating the elongation reaction with a different set of nucleoside triphosphates; and the use of all of the four nucleoside triphosphates in an elongation reaction but supplying one of the triphosphates in a limiting concentration. With these techniques, it seems feasible to determine, in a controlled fashion, sequences of relatively long stretches of DNA. Thus, no obstacle is expected in extending the present work further, if necessary, until the nature of the promoter region is elucidated completely.
Several methods were used for the determination of the sequences of the purified oligonucleotides. These included two-dimensional fingerprinting of the digests obtained on partial degradations with exonucleolytic enzymes, the extensive nearest neighbor analysis by the use of different α-32P-labeled deoxynucleoside triphosphates, and the determination of pyrimidine tracts by Burton's method. Finally, in data not given, confirmation of the sequences thus derived was also obtained by the characterization of the oligonucleotide fragments obtained by pancreatic DNase degradation of the extended primers. The nucleotide sequence now discovered (Fig. 2) possesses remarkable features. It contains two outer regions, which bear a 2-fold symmetry relationship with one another and contain five exclusively G:C base-pairs. The central part of the sequence is A:T rich and also contains regions which have 2-fold symmetry. The total sequence possesses a rotational axis around the 15th base-pair, which is a G:C base-pair and which is flanked by A:T base-pairs on both sides. The regular double helix can open up to adopt the secondary structure shown in Fig. 14. Thus, the sequence has not only regions with asymmetrical base composition, but can exist in two alternative structures (Figs. 2 and 14). The structure in Fig. 14 also has the same regions of 2-fold symmetry, but their orientation is reversed relative to that in Fig. 2. These features must have biological significance because their occurrence on a random basis seems most improbable.
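What 2-fold rotational symmetry means for a double-stranded sequence can be made concrete with a small check: a duplex has such symmetry when one strand reads the same as the reverse complement of itself. The sketch below is illustrative only. The promoter sequence of Fig. 2 is not reproduced in this text, so the example uses the EcoRI recognition site GAATTC as a stand-in, and the function names are hypothetical. When the axis passes through a base-pair, as described for the 15th base-pair here, the central position cannot pair with itself, so `symmetric_pairs` skips it and counts only the flanking positions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_twofold_symmetry(seq):
    """True if the duplex is identical after a 180 degree rotation."""
    return seq == reverse_complement(seq)


def symmetric_pairs(seq):
    """Count positions i where seq[i] is complementary to seq[-1 - i].

    For odd lengths the central base is skipped, since the axis then
    passes through that base-pair.
    """
    half = len(seq) // 2
    return sum(COMPLEMENT[seq[i]] == seq[-1 - i] for i in range(half))


print(has_twofold_symmetry("GAATTC"))   # EcoRI site, axis between base-pairs
print(symmetric_pairs("GAAGTTC"))       # made-up example, axis through central G:C
```

The second call uses a made-up seven-base sequence to show the case of an axis running through a central G:C pair, where the flanking three positions on each side are symmetric although the sequence as a whole is not its own reverse complement.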
Regions with 2-fold symmetry are being found with increasing frequency in DNA. Thus, they have been found around the sites of action of several restriction enzymes and related methylases (21-24), of the enzyme cleaving the covalently closed DNA of bacteriophage λ to generate cohesive ends, of the ter function in bacteriophage λ (25), and in the lac operator (26). The sequences previously described for the terminator region in the tyrosine tRNA gene (1) and the sequences found for the promoter regions of the bacteriophage fd DNA7 and the leftward promoter ... (31), the latter would appear to recognize all of the promoters on the E. coli chromosome as well as at least some of the promoters on the genomes of the bacteriophages which infect E. coli. It is therefore of immediate interest to compare the sequences of some of the different promoters recognized by the E. coli transcriptase.
Progress is just beginning to be reported in this area. The sequence of 36 nucleotides in the leftward promoter of bacteriophage λ has been determined in this laboratory (27) and by Ptashne and colleagues (28). The sequence of one of the strong RNA polymerase binding sites present in the RF of the bacteriophage fd has been determined by Schaller and colleagues.7 Information has also been forthcoming on the sequence of the promoter in SV40 DNA which is recognized by the E. coli polymerase (32, 33). Finally, the sequence of the promoter region in the lactose operon is also known.8 The striking fact which emerges is that the above promoters all differ widely in primary sequence. Therefore, the important concept must forthwith be invoked that the polymerase recognizes a structure rather than a linear sequence in the double helix. That DNA may be recognized by proteins by virtue of specific looped-out structures has already been proposed by Gierer (34, see also Ref. 35). However, it remains for future work to determine the nature of the presumed three-dimensional structure recognized by the RNA polymerase because a comparison at this time of the known promoter sequences does not readily reveal a common pattern such as was the case for the tRNA sequences. Thus, of the five promoters whose sequences are known, elements of symmetry can be detected only in the promoter in the tyrosine tRNA gene, the leftward promoter for the N gene in bacteriophage λ, and the strong promoter for the RF of the bacteriophage fd, and only the first two promoters could possibly be regarded as being similar in regard to the symmetry elements.
The sequences for the lac promoter and for the SV40 DNA promoter for the E. coli polymerase evidently lack any recognizable symmetry patterns.
Despite the fact that there are large unknowns at present regarding the RNA polymerase-promoter interaction, it is tempting to see some significance in the similarities between the structures shown in Figs. 2 and 14 for the tRNA gene promoter. It is possible that RNA polymerase at first recognizes all or a part of the linear double helix containing regions of 2-fold symmetry (structures of Fig. 2) and binding of the enzyme to this site takes place. Then, there ensues a conformational change in the enzyme concomitant with a transition in the DNA structure to that shown in Fig. 14. One of two things may then follow. Either the enzyme binds to one or the other of the looped-out arms such that the sequence recognition is still maintained. By this new mode of binding of the enzyme, selection of the strand as well as the initiation site could both be accomplished. Alternatively, the enzyme after the conformational change may be able to recognize the symmetrical regions in both arms of the structure in Fig. 14. In this case also, strand selection and initiation site would be accomplished simply by the asymmetrical configuration of the multisubunit enzyme.
Studies are now under way to gain insights into these aspects of the mechanism of transcription by using the approach described below.
Although it seems highly likely that the interesting structures shown in Figs. ... transcription, it is not clear at this time how much of the promoter region is actually represented in the sequence now known. Further sequence work would certainly be desirable.
However, definitive answers can come only by actual in vitro studies of the transcriptional process, including specificity of initiation. These require, in turn, DNA segments which contain varying lengths of sequences into the preinitiation region and also an adequate length of the DNA into the postinitiation region. DNAs of this kind can be obtained by the synthetic methodology which has been developed in this laboratory.
However, for initial studies, the desired DNAs can also be prepared from the primer-template complexes used in the present work.
After controlled elongation, the primer-template complexes may be digested with an endonuclease specific for single-stranded DNA.
In this way, double-stranded DNAs corresponding in length exactly to the elongated primers may be isolated.
Experiments along these lines are in progress.
Studies are also in progress on the mechanism of the termination of transcription in the tyrosine tRNA gene, by using DNA segments corresponding to the elongated primers described previously (1). It is hoped that with the elucidation of the exact chain lengths of the promoter and terminator regions, it should prove possible to reconstruct, by total synthesis, a gene containing its control elements.
Finally, the availability of such a totally synthetic gene should enable systematic studies of the structure-function relationships in the above tRNA by predetermined alterations in the structural gene.
Acknowledgments-Initial experiments on primer-dependent nucleotide incorporations in the promoter region were carried out by Drs. Peter Loewen and Marvin H. Caruthers. Their assistance in this work is gratefully acknowledged.