Nucleotide Sequence of the asd Gene of Streptococcus mutans: Identification of the Promoter Region and Evidence for Attenuator-like Sequences Preceding the Structural Gene*

The complete nucleotide sequence of the asd gene of Streptococcus mutans has been determined. The gene encodes aspartate β-semialdehyde dehydrogenase (EC 1.2.1.11), an enzyme comprised of 357 amino acids, having an Mr of 38,897 and active in the biosynthetic pathway of lysine, threonine, methionine, diaminopimelic acid, and isoleucine. In addition, we report the 276 nucleotides upstream of the structural gene, which contain a highly efficient promoter identified by both RNA polymerase binding and in vitro transcription analysis. A leader transcript which terminates at a fixed point immediately preceding the asd promoter region was identified in the DNA sequence and confirmed by in vitro transcription analysis as well. The close proximity of this transcript and its ρ-independent transcriptional terminator to the asd coding sequence suggests involvement in a mechanism of regulation. Message stability experiments indicate the half-life of asd-specific messages to be comparable to that of Escherichia coli messages. Varying the concentrations of lysine, threonine, and methionine exerts no apparent control over expression of the S. mutans asd gene in Escherichia coli, suggesting the requirement of an accessory regulatory element specific for the S. mutans asd gene.


The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J02667.
Present address: Sungene Technologies Corp., Palo Alto, CA.
Present address: Dept. of Biology, Washington University, St. Louis, MO.

al., 1982) producing a protein of ~40,000 Mr. Given the limited concentrations of the components necessary for replication, transcription, and translation found in minicells (Frazer and Curtiss, 1975), this suggested that the S. mutans asd gene contained an efficient promoter which competed favorably with the promoters of pBR322 for available DNA-dependent RNA polymerase. Such competition has been reported among endogenous pBR322 promoters (Stuber and Bujard, 1981). Alternatively, or in addition, the mRNA transcript of the S. mutans asd gene might be unusually stable in E. coli or translated more efficiently than most E. coli mRNA species.
pYA575, a chimeric plasmid capable of producing a functional asd gene product, contains ~1330 bp of S. mutans DNA inserted between the EcoRI and HindIII sites of pBR322 (Jagusztyn-Krynicka et al., 1982). Subclones pYA576 and pYA577, containing deletions of the first 208 bp of the insert proximal to the EcoRI site and the last ~700 bp proximal to the HindIII site, respectively, were unable to complement Δasd E. coli mutants and, therefore, conferred an Asd⁻ phenotype. However, pYA577 produced a novel and apparently fused protein while no such protein was seen for pYA576. This suggested that putative regulatory sequences were located within the first 208 bp of the insert, proximal to the EcoRI site of pBR322 (Jagusztyn-Krynicka et al., 1982). pYA574, containing ~3000 nucleotides of additional contiguous S. mutans DNA upstream from these sequences, showed approximately 2-fold higher levels of asd expression than did pYA575. The region required for elevated expression was narrowed to 700 bp in pYA631.
To facilitate study of the asd gene, we prepared a detailed restriction map of the S. mutans insert in pYA575. E. coli RNA polymerase filter binding assays were performed under conditions which enhance selective binding to promoter-containing DNA fragments (Strauss et al., 1981). The assays indicated preferential binding to an EcoRI/HinfI fragment comprised of the first 139 bp of the insert, consistent with the subcloning data. The nucleotide sequence of this fragment contains regions of homology with those sequences involved in the promotion of RNA transcription in E. coli systems, including five overlapping and tandem Pribnow box-like sequences (Hawley and McClure, 1983; Pribnow, 1975a; Pribnow, 1975b; Rosenberg and Court, 1979; Siebenlist et al., 1980). A second HinfI/EcoRI fragment of 636 bp, downstream and internal to the asd structural gene, also showed binding under conditions of reduced stringency. Haziza et al. (1982) have reported the nucleotide sequence of the asd gene of E. coli. They had hoped to find evidence of regulation by transcriptional termination at an attenuator (Yanofsky, 1981) but did not find the appropriate sequences. We report here the complete nucleotide sequence of the coding region for the S. mutans asd gene product, contained within 1071 bp and capable of coding for a protein of 357 amino acids and Mr = 38,897. The sequence extending 276 bp upstream of the asd ATG start codon contains, in addition to the promoter sequence described above, signals for both the promotion and termination of transcription (Rosenberg and Court, 1979) as well as those for ribosome binding (Gold et al., 1981) that would permit the production of a 44-amino acid polypeptide. Upon initial inspection, this sequence bears resemblance to the reported arrangement of attenuation signals in amino acid biosynthetic operons in Gram-negative bacteria (Kolter and Yanofsky, 1982). However, several differences were apparent.
The ρ-independent termination sequences begin in the middle of the presumptive leader peptide coding region and are preceded by only three of a total of seven lysine, threonine, or methionine codons in the peptide. Of these seven, only the two lysine codons at the very end of the peptide are contiguous. No alternative secondary structure to prevent formation of a transcription termination hairpin loop is apparent.
We altered the growth conditions of χ1825, an E. coli K-12 strain deleted for asd, harboring either pYA574 or the smaller pYA575 to determine if expression was affected by different concentrations of the appropriate amino acids and thereby indicate whether regulation occurred in the presence of the upstream sequences in the larger chimera. Very little difference was evident. In vitro run-off transcription analysis did demonstrate, however, that a leader transcript was produced and terminated, at ~50% efficiency, at the proposed ρ-independent termination sequence. Four transcripts were produced from the asd promoter region unless rifampicin was added to the reaction, in which case only the shortest of the transcripts, which mapped to the fourth of the five -10 sequences preceding the asd structural gene, was retained. An additional transcript was found to initiate approximately 652 bp into the asd structural gene sequence, which coincided with RNA polymerase binding assays and subcloning experiments (Jagusztyn-Krynicka et al., 1982). Measurement of the rate of degradation of asd-specific messages by hybridization of pulse-labeled RNA to specific DNA probes bound to filters (Mostellar et al., 1970) demonstrated the half-life to be on the order of 2.5 min, which is comparable to E. coli-specific messages (Adesnik and Levinthal, 1970; Baker and Yanofsky, 1970; Geiduschek and Haselkorn, 1969), indicating that message stability is not involved in high levels of asd expression.

MATERIALS AND METHODS
Enzymes-T4 DNA ligase and restriction enzymes were purchased from Bethesda Research Laboratories except for NarI, FnuDII, ClaI, NaeI, EcoRV, RsaI, and HhaI, which were purchased from New England Biolabs. Digestions were normally carried out in 10 mM Tris, pH 7.5, 10 mM MgCl2, 6 mM β-mercaptoethanol, 100 μg of bovine serum albumin/ml, and the appropriate concentration of NaCl or KCl as recommended by the supplier. Ligations and EcoRI and SmaI digestions were carried out in the buffers recommended by the supplier. E. coli RNA polymerase holoenzyme was the kind gift of Peter Chan, David Wood, and Jacob Lebowitz. E. coli alkaline phosphatase was from Worthington, calf intestinal alkaline phosphatase from Sigma, T4 polynucleotide kinase from Pharmacia P-L Biochemicals, and the large fragment (Klenow) of E. coli DNA polymerase I from New England Nuclear. The gene 6 exonuclease of T7 was prepared by the method of Sadowski (1972a, 1972b).
Plasmids and Transformation-The plasmids pYA574 and pYA575 were as described (Jagusztyn-Krynicka et al., 1982). pYA631 was derived from pYA574 digested with EcoRV and treated with calf intestinal alkaline phosphatase to dephosphorylate the 5' ends, followed by ligation to EcoRV-digested pBR322. Strain χ1825 was transformed by the method of Dagert and Ehrlich (1979) and the transformation mix plated on L agar (1.5% agar (w/v) in L broth) containing 12.5 μg of tetracycline/ml and no DAP. Transformants were screened for plasmid DNA by the minilysate technique of Birnboim and Doly (1979), and verification of the desired construction was achieved by restriction analysis.
Plasmid Purification-To amplify plasmid DNA, 170 μg of chloramphenicol (Sigma) was added per ml of a culture grown in M9 minimal medium to an absorbance of 0.5-0.6, and growth was continued at 37 °C for 16 h. Cell lysis was by the method of Guerry et al. (1973) modified by a freeze-thaw step after lysozyme digestion and the use of 3% sodium dodecyl sulfate (SDS). CsCl-ethidium bromide buoyant density gradient centrifugation, collection, and concentration of DNA have been described (Holt et al., 1982) except that Beckman VTi 50 and VTi 65 rotors were used at 45,000 rpm for 20 and 14 h, respectively.
Restriction Mapping and Gel Electrophoresis-DNA restriction fragments were mapped by single and multiple digestions and subjected to electrophoresis on 2, 3, or 4% (w/v) agarose gels (Bio-Rad and Sigma) in a horizontal electrophoresis apparatus in 100 mM Tris, pH 8.3, 100 mM boric acid, 2 mM Na2EDTA (1× TEB). AluI and HinfI digests of pBR322 served as markers. Restriction fragments were purified from acrylamide gels as described (Maxam and Gilbert, 1980) or from agarose by electrophoresing the fragment into a DEAE membrane (Schleicher and Schuell) and eluting as recommended by the manufacturer. RNA transcripts and DNA sequence reactions were both run on 6-8% polyacrylamide, 7 M urea gels (380 × 280 × 0.375 mm). The running buffer in all cases was 1× TEB, pH 8.3.
RNA Polymerase Binding Assay-The method employed was essentially that of Chan et al. (1979) except that the KCl concentration was varied as indicated in the figure legend.
DNA Sequence Analysis-The DNA sequence was determined by the methods of Maxam and Gilbert (1980) and Sanger et al. (1977). In the case of the latter method, templates were prepared by digesting pYA574 with either ClaI or SalI, each of which has a single site in the vector, pBR322, on opposite sides of the insert. Linearized molecules were then made single-stranded by digestion with the 5'-specific gene 6 exonuclease of bacteriophage T7 (Kerr and Sadowski, 1972b; Kolter and Yanofsky, 1982). Primers were gel-purified restriction fragments.
Analysis of Repression/Derepression of asd by Altering Concentrations of Lysine, Threonine, and Methionine-This experiment was carried out under conditions similar to those described by Haziza et al. (1982). A 25-μl inoculum of χ1825, alone or transformed with pBR322, was added to 5 ml of M9 minimal salts, 0.5% glucose, 1 mM MgSO4, 0.1 mM CaCl2, and per ml, 5 μg of biotin, 50 μg of DAP (0.26 mM), 80 μg of DL-threonine (0.67 mM), and 20 μg of L-methionine (0.134 mM) and allowed to grow at 37 °C with shaking. χ1825 transformed with either pYA574 or pYA575 was grown under the same conditions except for substitution of 88 μg of L-lysine HCl/ml (0.48 mM) in place of DAP. These were considered normal conditions and served as controls. χ1825 containing pYA574 or pYA575 was also grown in this medium except that the concentrations per ml of lysine, threonine, and methionine were varied as follows: (i) none added; (ii) repression: 1.827 mg of L-lysine (10 mM), 0.595 mg of DL-threonine (5 mM), 0.298 mg of L-methionine (2 mM); (iii) lysine limitation: 5.5 μg of L-lysine (0.03 mM), 80 μg of DL-threonine (0.67 mM), 20 μg of L-methionine (0.134 mM); (iv) lysine excess: 1.827 mg of L-lysine (10 mM), 80 μg of DL-threonine (0.67 mM), 20 μg of L-methionine (0.134 mM). Cells were grown to an absorbance of 0.6, which required 9 ± 0.5 h, except under conditions of no lysine, threonine, or methionine, which required 13 h, and were placed on ice. After 5 min a 300-μl sample of each culture was removed and the cells pelleted in a 1.5-ml Eppendorf tube for 2 min at ~15,000 × g. The pellets were suspended in 25 μl of sample buffer (2.3% SDS, 5% β-mercaptoethanol, 10% glycerol, 0.1% bromphenol blue, 62.5 mM Tris, pH 6.8), heated at 100 °C for 5 min, and subjected to SDS-polyacrylamide gel electrophoresis by the method of Laemmli and Favre (1973) followed by staining with Coomassie Blue (Weber and Osborn, 1969). Low molecular weight protein markers were from Bio-Rad.
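The mass and molar concentrations quoted for the amino acid supplements can be cross-checked with a short calculation. The sketch below is illustrative only: the molar masses are standard approximate values that are not given in the text, and the lysine entries are assumed to refer to L-lysine HCl, as stated for the control medium.

```python
# Approximate molar masses in g/mol. These are standard textbook values and
# are an assumption of this sketch; they are not stated in the paper.
MOLAR_MASS = {
    "L-lysine HCl": 182.65,
    "DL-threonine": 119.12,
    "L-methionine": 149.21,
    "DAP": 190.20,  # diaminopimelic acid
}


def millimolar(compound: str, micrograms_per_ml: float) -> float:
    """Convert a mass concentration in ug/ml to a molar concentration in mM.

    ug/ml equals mg/l, and mg/l divided by g/mol gives mmol/l (mM).
    """
    return micrograms_per_ml / MOLAR_MASS[compound]


if __name__ == "__main__":
    quoted = [
        ("DAP", 50.0),
        ("L-lysine HCl", 88.0),
        ("L-lysine HCl", 1827.0),
        ("L-lysine HCl", 5.5),
        ("DL-threonine", 80.0),
        ("DL-threonine", 595.0),
        ("L-methionine", 20.0),
        ("L-methionine", 298.0),
    ]
    for compound, ug_per_ml in quoted:
        print(f"{compound}: {ug_per_ml} ug/ml = {millimolar(compound, ug_per_ml):.3f} mM")
```

With these molar masses the computed values agree with the millimolar figures given for each growth condition to within rounding.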
In Vitro Transcription Analysis-Various DNA restriction fragments, covering an area in pYA575 and pYA631 contained within a 2170-bp EcoRV fragment, beginning 575 bp upstream from the start of the asd structural gene and continuing downstream 160 bp beyond the end of the insert into pBR322, were used as transcriptional templates. Transcription conditions were essentially as described by Turnbough et al. (1983). KCl concentrations and presence or absence of rifampicin are as indicated in figure legends. RNA markers were kindly provided by John Donahue (University of Alabama).
Message Stability Determination-Cultures of χ1849 alone or transformed with either pBR322 or pYA575 were grown with shaking at 37 °C in supplemented M9 media to an absorbance of 0.6, at which time [3H]uridine (ICN, specific activity, 44 Ci/mmol) was added at 25 μCi/ml. Shaking was continued for an additional 2 min. Rifampicin (Sigma) and cold uridine were added at respective concentrations of 300 and 200 μg/ml. Zero time, 30-s, and 1-min samples were taken, followed by samples every minute thereafter for a period of 7 min total time. Samples were immediately made 20 mM in sodium azide and poured over crushed ice. Cells were harvested and lysed, and the RNA was extracted by the method of Summers (1970).
pBR322 and pYA575, linearized with HindIII, and the 1100-base pair EcoRI fragment of the asd insert were loaded onto nitrocellulose filters (Schleicher and Schuell BA85) by the method of Bøvre and Szybalski (1971). A hole punch was used to make "Mini" filters to more accurately compare the amount of hybridization with the tritiated RNA. DNA-RNA filter hybridization was as described by Davis and Vapnek (1976) except that the hybridization mixture was incubated for 20 h at 42 °C (Maniatis et al., 1982). Subsequent washing of filters and RNase digestion were as described by Yamamoto et al. (1982).

RESULTS
Restriction maps of pYA575, pYA631, and pYA574 are presented in Fig. 1. Proteins solubilized from whole cells of χ1825 alone or containing pBR322, pYA574, pYA575, or pYA631 are shown in Fig. 2. The band corresponding to the asd gene product is indicated. The higher level of activity of the asd gene consistently seen in pYA574 is retained in pYA631, indicating that those sequences required for greater expression than seen in pYA575 are located within the additional 700 bp of S. mutans DNA found in pYA631 upstream from the asd coding region.
Subcloning experiments (Jagusztyn-Krynicka et al., 1982) had suggested that the asd transcription regulatory signals should be located within the first 208 bp of the insert. The likelihood of a second promoter, not involved in asd transcription but responsible for the retention of activity of the tetracycline resistance genes of pBR322 in pYA575, despite disruption of the tet promoter by insertion of DNA at the HindIII site, had also been suggested by the subcloning data (Jagusztyn-Krynicka et al., 1982). Alternatively, fortuitous placement of -35-like sequences within 14 bases of the HindIII end of the insert could account for this activity. Based on these considerations, HinfI was deemed the enzyme of choice to generate fragments for the RNA polymerase binding assay. Thus, both pBR322 and pYA575 were cut with EcoRI and HindIII to excise the insert DNA in pYA575, followed by HinfI digestion and exposure to RNA polymerase at 50, 100, or 200 mM KCl concentrations. DNA fragments, complexed with RNA polymerase and bound to nitrocellulose filters, were eluted, concentrated, and run on a 6% polyacrylamide gel as seen in Fig. 3.
The only pBR322 fragments selectively retained under the stringent conditions imposed by the high KCl concentration (Belintsev et al., 1980; Strauss et al., 1980) were the 1000-bp fragment, which contains the β-lactamase promoter, and the 517-bp fragment, which specifies the 104-nucleotide RNA associated with incompatibility and copy number control in plasmids which contain ColE1 replication functions (Chan et al., 1979; Stuber and Bujard, 1981). The EcoRI/HinfI fragment corresponding to the first 139 bp of insert DNA, as seen in Fig. 1, was also retained, further indicating the presence of an efficient promoter in this region. The 626-bp HinfI/EcoRI fragment appears, faintly, as well. All of the other fragments generated from the insert DNA, except the 124-bp HinfI/HinfI fragment, demonstrate binding under less stringent conditions.
DNA Sequence-The sequence of 140 nucleotides encompassing the asd gene is presented in Fig. 4. The majority of the sequence was generated by the chain-terminating method of Sanger et al. (1977). The sequencing strategy, indicating primer extension and the 295 bases determined by the method of Maxam and Gilbert (1980), is shown in Fig. 1 including regions involved in the initiation and termination of both transcription and translation. The asd gene has an unusual promoter region located between nucleotides 205 and 259. It contains a -35-like RNA polymerase recognition sequence of TTGTAT, preceded by a run of Ts and a frequently conserved A at a spacing of 8 nucleotides upstream from TTG (Hawley and McClure, 1983; Siebenlist et al., 1980). At a spacing of 6 nucleotides downstream from the -35 hexamer begins a series of five potential RNA polymerase binding sequences contained within a stretch of 24 nucleotides. All five contain a first and second position TA and a sixth position T. The second of these is a perfect Pribnow box consensus of TATAAT(G) (Pribnow, 1975a; Pribnow, 1975b). Each of these sequences is followed by one or more purines, in all cases an A, in some a G as well, at a spacing from the sixth position T of the -10 hexamer of 5-8 nucleotides, which is appropriate for the site of initiation of transcription (Hawley and McClure, 1983). Spacing between the -35 and -10 hexamers in E. coli is optimally 17 ± 1 nucleotides (Hawley and McClure, 1983; Siebenlist et al., 1980), suggesting the fourth sequence of TATTAT, at a spacing of 18 nucleotides, as the most likely RNA polymerase binding site.
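The criteria used above to pick out candidate -10 hexamers (TA in the first two positions, T in the sixth, and a spacing of 17 ± 1 nucleotides from the -35 hexamer) can be expressed as a simple pattern scan. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the demonstration sequence is invented for illustration and is not the asd promoter, and the scan is not the procedure used by the authors.

```python
import re


def find_minus10_candidates(seq: str, minus35: str = "TTGTAT"):
    """Return (hexamer, spacing) pairs for TAnnnT hexamers downstream of -35.

    spacing is the number of nucleotides lying between the end of the -35
    hexamer and the start of the candidate -10 hexamer.
    """
    seq = seq.upper()
    start35 = seq.find(minus35)
    if start35 < 0:
        return []
    downstream = seq[start35 + len(minus35):]
    # A lookahead is used so that overlapping hexamers are all reported.
    return [(m.group(1), m.start())
            for m in re.finditer(r"(?=(TA[ACGT]{3}T))", downstream)]


def well_spaced(candidates, optimum: int = 17, tolerance: int = 1):
    """Keep candidates whose spacing from -35 lies within optimum +/- tolerance."""
    return [c for c in candidates if abs(c[1] - optimum) <= tolerance]


if __name__ == "__main__":
    # Hypothetical sequence: a -35 hexamer, a 17-nt spacer, then TATAAT.
    demo = "AAAA" + "TTGTAT" + "GCGCGCGCGCGCGCGCG" + "TATAAT" + "GCA"
    hits = find_minus10_candidates(demo)
    print(hits)
    print(well_spaced(hits))
```

Applied to a region with several overlapping TAnnnT hexamers, such a scan would list every candidate together with its spacing, so the one closest to the 17 ± 1 optimum can be read off directly.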
Slightly farther downstream is a 9-base Shine-Dalgarno ribosome binding sequence of TAAAGAGGT bearing strong complementarity to the 3' terminus of E. coli 16 S ribosomal RNA (Gold et al., 1981; Shine and Dalgarno, 1974) (calculated ΔG = -13.5 kcal (Salser, 1977)). This is followed at a spacing of 5 nucleotides by an ATG translation initiation codon. The entire coding sequence of the asd gene is contained within nucleotides 274-1344, which would permit the synthesis of a protein of 357 amino acids and a molecular weight of 38,897. This is in good agreement with the size estimated by SDS-polyacrylamide gel electrophoresis (Fig. 2).
Sequences upstream of the promoter region contain features which suggest involvement in regulation of asd expression. An area of dyad symmetry is found between nucleotides 160 and 177 which is followed by a run of Ts. Such a sequence has been implicated in the ρ-independent termination of transcription (Holmes et al., 1983; Rosenberg and Court, 1979), particularly with regard to the regulation of gene expression in the mechanism of attenuation in amino acid biosynthetic operons (Farnham and Platt, 1980; Kolter and Yanofsky, 1982). A ΔG = -10.7 kcal can be calculated for the most stable configuration of this putative hairpin (Salser, 1977). The sequences from 2 to 47 also bear strong homology to sequences found to be involved in the promotion of transcription (Hawley and McClure, 1983; Rosenberg and Court, 1979; Siebenlist et al., 1980) with a -35-like sequence at 11-16 and a -10-like sequence at 34-39. This would allow transcription initiation at the A at position 46 and would permit the synthesis of a transcript, if terminated after the putative hairpin, of 140-145 nucleotides.
A Shine-Dalgarno sequence (Shine and Dalgarno, 1974) involved in ribosome binding (Gold et al., 1981) is found at 82-90, within the putative leader transcript region. This sequence is very similar to that preceding the asd coding sequence and gives the same calculated ΔG = -13.2 kcal (Salser, 1977) for interaction with the 3' end of E. coli 16 S ribosomal RNA (Gold et al., 1981; Shine and Dalgarno, 1974). An ATG start codon occurs at a spacing of 4 nucleotides at positions 95-97, followed by an open reading frame through position 226 which would permit the synthesis of a leader peptide of 44 amino acids. Unlike leader peptides formed in other amino acid biosynthetic operons (Kolter and Yanofsky, 1982), few codons for the end product amino acids of the biosynthetic pathways in which aspartate β-semialdehyde dehydrogenase is involved (lysine, threonine, and methionine) are found. Three other features are significantly different as well. The ρ-independent transcription termination sequence occurs in the middle of the coding region for the peptide, no alternative secondary structure to block formation of the terminator hairpin is evident, and a completely separate set of sequences capable of promoting transcription of the asd structural gene is present.
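The coordinates quoted in this section can be checked against the stated peptide and transcript sizes with simple arithmetic on 1-based, inclusive positions. The sketch below is only a consistency check; the helper names are hypothetical and the numbers are taken directly from the text.

```python
def orf_codons(first: int, last: int) -> int:
    """Number of complete codons in a 1-based, inclusive coordinate range."""
    length = last - first + 1
    if length % 3 != 0:
        raise ValueError(f"range {first}-{last} ({length} nt) is not a multiple of 3")
    return length // 3


def transcript_end(start: int, length: int) -> int:
    """Last template position of a transcript of the given length."""
    return start + length - 1


if __name__ == "__main__":
    # asd coding sequence: nucleotides 274-1344
    print("asd codons:", orf_codons(274, 1344))
    # leader open reading frame: ATG at 95-97, open through position 226
    print("leader codons:", orf_codons(95, 226))
    # leader transcript: initiation at position 46, 140-145 nucleotides long
    print("leader transcript ends at:",
          transcript_end(46, 140), "to", transcript_end(46, 145))
```

The ranges give 357 and 44 codons, matching the stated sizes of the enzyme and the leader peptide, and place the end of a 140-145-nucleotide leader transcript at positions 185-190, just downstream of the dyad symmetry at 160-177.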

Effect of Lysine, Threonine, or Methionine Concentration on Expression of the S. mutans asd Gene-Since the asd gene product is easily detectable by SDS-polyacrylamide gel electrophoresis (Fig. 2), alterations in the activity of the gene should be readily discernible. Samples of proteins extracted from whole cells of χ1825, alone or transformed with pBR322, pYA574, or pYA575 and grown under conditions of various concentrations of lysine, threonine, and methionine, as outlined under "Materials and Methods," were subjected to electrophoresis by the method of Laemmli and Favre (1973) and are seen in Fig. 5. No striking difference is apparent in the amount of aspartate β-semialdehyde dehydrogenase produced relative to other cellular proteins, regardless of growth conditions. However, a slight increase may occur under conditions of limited lysine. In all cases, the amount of aspartate β-semialdehyde dehydrogenase is greater from cells which harbor the larger chimera, pYA574, containing the sequences for the putative leader transcript and peptide.
Transcripts were subjected to electrophoresis and mapped according to their size.
The efficiency and specificity of in vitro transcription increase with increasing salt concentration, as can be seen in Fig. 6. A 1009-bp EcoRV/PvuII fragment containing the asd promoter region was used as a template. This specificity is further enhanced by the addition of rifampicin to a concentration of 4 μM, 30 s after initiating the reaction. The main transcripts run essentially as two doublet bands of approximately 235/237 and 242/246 nucleotides. The presence of rifampicin virtually eliminates all of these but the 235-nucleotide transcript, which maps to a position 9 nucleotides after the fourth and most favorably spaced -10 sequence. The 237-nucleotide transcript also maps to this sequence, while the 242- and 246-nucleotide transcripts would initiate at positions following the third and perhaps second -10-like sequences, respectively. This suggests that in vitro, and under the least stringent conditions, spacing constraints between the -35 and -10 hexamers may be overcome. The 143- and 442-nucleotide transcripts are initiated upstream of the aspartate β-semialdehyde dehydrogenase promoter, as can be seen in Fig. 7.
The entire 2170-bp EcoRV fragment produced three major transcripts of approximately 1400, 725, and 143 nucleotides as seen in Lane 1 of Fig. 7. The 143-nucleotide transcript was localized to a 490-bp EcoRI fragment (Lane 4) immediately upstream of asd and was found in transcription of all fragments (Lanes 1-4) which contained this DNA segment. This suggests a fixed terminus for this transcript. Based on its synthesis from specified DNA fragments and its size, it was identified as the putative leader transcript suggested by the nucleotide sequence, which also indicated the location of the ρ-independent termination site. A 299-bp EcoRI to PvuII fragment (nucleotides 189-487 of the sequence) produced four transcripts that migrate essentially as two doublet bands of approximately 235, 237, 242, and 246 nucleotides (Lane 5). These same transcripts were produced by the 1009-bp EcoRV to PvuII fragment (Lane 3 and Fig. 6), which also produced the transcripts of approximately 442 and 143 nucleotides. The 442-nucleotide transcript may be produced by inefficient termination of the 143-nucleotide transcript since it is absent in both Lanes 4 and 5 of Fig. 7 and is the appropriate size. Transcription of a 1335-bp EcoRV to BamHI fragment seen in Lane 2 of Fig. 7 supports this contention. Transcripts of approximately 143 and 563 nucleotides, corresponding to the terminated leader transcript and that for asd, are seen. Also evident is a transcript of approximately 768 nucleotides which, again, is appropriate for a read-through of the termination site for the 143-nucleotide transcript. ITP has been demonstrated to reduce efficient termination by interfering with the formation of stable stem-loop structures (Lee and Yanofsky, 1977; Miller, 1972). If ITP is substituted for GTP in the reaction, the 143-nucleotide transcript disappears; however, the 768-nucleotide transcript does not increase as anticipated (data not shown).
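The argument that the 442- and 768-nucleotide species are read-through products of the leader transcript rests on size arithmetic: run-off transcripts from one template share a 3' end, so a difference in length equals the distance between initiation sites. A minimal sketch of that check follows, using the approximate sizes quoted above and assuming, as proposed in the text, that the read-through transcripts initiate at position 46. The variable and function names are hypothetical.

```python
# Approximate sizes in nucleotides as quoted in the text for two templates.
TEMPLATES = {
    "EcoRV-PvuII": {"template_bp": 1009, "asd": [235, 237, 242, 246], "readthrough": 442},
    "EcoRV-BamHI": {"template_bp": 1335, "asd": [563], "readthrough": 768},
}
LEADER_START = 46  # proposed leader initiation site (assumption taken from the text)


def start_offsets(entry):
    """Distance between the read-through and asd initiation sites.

    Both run-off transcripts end at the same template end, so the length
    difference equals the spacing of their start sites.
    """
    return [entry["readthrough"] - size for size in entry["asd"]]


def implied_asd_starts(entry, leader_start=LEADER_START):
    """Sequence positions of the asd start sites implied by the offsets."""
    return [leader_start + offset for offset in start_offsets(entry)]


if __name__ == "__main__":
    for name, entry in TEMPLATES.items():
        print(name, start_offsets(entry), implied_asd_starts(entry))
    # Lengthening the template should lengthen every run-off transcript equally.
    extension = TEMPLATES["EcoRV-BamHI"]["template_bp"] - TEMPLATES["EcoRV-PvuII"]["template_bp"]
    print("template extension:", extension)
    print("read-through extension:",
          TEMPLATES["EcoRV-BamHI"]["readthrough"] - TEMPLATES["EcoRV-PvuII"]["readthrough"])
```

On these figures both templates give an offset of about 205 nucleotides between the two start sites, the implied asd start positions (242-253) fall within the promoter region at nucleotides 205-259, and the 326-bp difference between the templates matches the difference between the two read-through transcripts. Given that the sizes are gel estimates, this agreement is supportive rather than conclusive.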
A third independent transcript seen in Lanes 7-9 of Fig. 7 is initiated within the 453-bp BamHI to EcoRI fragment producing transcripts of approximately 342 and 725 nucleotides when run out to the EcoRI and EcoRV sites, respectively.
The initiation site of this transcript may be located at a position 6 nucleotides downstream from a possible -10 RNA polymerase binding site at positions 913-918 of the sequence (Fig. 4), which is preceded by a -35 sequence identical to that for the leader transcript at positions 889-894. Alternatively, a second possible promoter is located slightly downstream with a -35 sequence at positions 899-904 and a -10 sequence at 922-927. The location of a transcriptional initiation site within this region had been indicated by both RNA polymerase binding and subcloning experiments, and the site probably serves as a promoter in expression of the pBR322 tet gene in these constructs (Jagusztyn-Krynicka et al., 1982).
Stability of asd-specific mRNA-To determine if efficient expression of the S. mutans asd gene in E. coli is a result of slow turnover of the asd-specific message, stability experiments were undertaken by hybridizing [3H]uridine pulse-labeled mRNA to specific DNA probes bound to filters. As seen in Fig. 8, the half-life for mRNA purified from strain χ1849 containing pYA575 and hybridized to the 1100-bp EcoRI fragment (EcoA), containing the majority of the coding sequence of asd, was approximately equal to that of the same messages hybridized to the vector, pBR322. The half-life determinations for pYA575/pYA575 and pBR322/pBR322 RNA/DNA hybridizations were essentially the same. Of possible significance is that in both hybridizations involving pYA575 RNA and DNA containing asd-specific sequences, the initial rate of degradation was slightly more rapid. Also, asd-specific messages are produced at a level approximately 2-fold higher than total messages specific for pBR322 sequences when the template for transcription is pYA575.
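A half-life can be estimated from a pulse-chase time course of this kind by fitting a first-order decay to the hybridized counts. The sketch below shows one common way to do so, a least-squares line through the logarithm of the counts. It is not necessarily the authors' procedure, and the counts used are synthetic values generated for illustration, not data from Fig. 8; only the sampling times follow the scheme described under "Materials and Methods."

```python
import math


def half_life(times_min, counts):
    """Estimate a first-order half-life (min) by a log-linear least-squares fit.

    Fits ln(counts) = ln(N0) - k * t and returns ln(2) / k.
    """
    logs = [math.log(c) for c in counts]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # equals -k for a decaying signal
    return -math.log(2) / slope


if __name__ == "__main__":
    # Sampling scheme from the methods: 0, 0.5, 1 min, then every minute to 7 min.
    times = [0, 0.5, 1, 2, 3, 4, 5, 6, 7]
    # Synthetic counts decaying with a 2.5-min half-life (illustrative only).
    counts = [10000 * 0.5 ** (t / 2.5) for t in times]
    print(f"estimated half-life: {half_life(times, counts):.2f} min")
```

A slightly faster initial decay, as noted above for the asd-specific hybridizations, would appear as curvature in the log plot; fitting early and late time points separately is one way to examine it.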

DISCUSSION
Southern blot hybridization analysis (Southern, 1975) of the original S. mutans asd clones indicated a lack of homology with the E. coli genome (Jagusztyn-Krynicka et al., 1982). Comparison of the nucleotide sequences of the S. mutans and E. coli asd genes confirms this absence of homology both in the coding region and in that containing the signals involved in the initiation of transcription and translation. Despite the fact that the asd gene of S. mutans efficiently complements a deletion of the corresponding gene in E. coli, little homology exists between the deduced amino acid sequences of the two aspartate β-semialdehyde dehydrogenase proteins. Codon utilization is very different as well, as can be seen in Table I. The S. mutans coding region favors codons with high A/T content and in fact is 60% A/T, which seems to be characteristic of many genes from Gram-positive bacteria (Hollingshead et al., 1986). There does, however, appear to be some similarity in hydrophilic and hydrophobic regions as determined by computer analysis (data not shown). Biellmann et al. (1980) have identified the amino acid sequence Phe-Val-Gly-Gly-Asp-His-Thr-Val-Ser as the active site for the aspartate β-semialdehyde dehydrogenase molecule coded by E. coli, with the histidine indicated as the active site residue. Haziza et al. (1982) have found a corresponding sequence encoded in amino acid residues 130-138 of their deduced sequence with a substitution of asparagine for aspartic acid and cysteine for histidine at residues 134 and 135. A cysteine residue at the active site is more in line with that seen for other dehydrogenases (Harris and Waters, 1976). Two of the three cysteine residues found in the S. mutans deduced amino acid sequence occur at positions 125 and 128. If one assumes the cysteine residue at position 128 to be the active site of the S. mutans molecule, the sequence can be overlapped with that of E. coli so that 16 of 42 surrounding amino acids coincide, the longest run of homology being 9 of 13 residues between positions 156 and 168 in the E. coli sequence and 149 and 161 of the S. mutans sequence. This indicates some degree of sequence conservation about what may be the active site of both molecules.
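Both the A/T content and the codon utilization discussed above are simple tallies over a coding sequence. The following sketch shows how such figures could be computed; the function names are hypothetical and the short demonstration sequence is invented for illustration and is not taken from the asd gene.

```python
from collections import Counter


def at_fraction(seq: str) -> float:
    """Fraction of positions in a DNA sequence that are A or T."""
    seq = seq.upper()
    return sum(seq.count(base) for base in "AT") / len(seq)


def codon_usage(cds: str) -> Counter:
    """Count codons in a coding sequence read in frame from its first base."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


if __name__ == "__main__":
    # Hypothetical A/T-rich reading frame used only to exercise the functions.
    demo = "ATGAAAACTTTAGGTAAATAA"
    print(f"A/T fraction: {at_fraction(demo):.2f}")
    for codon, count in sorted(codon_usage(demo).items()):
        print(codon, count)
```

Run on the full 1071-bp coding region, tallies of this kind would yield the overall A/T percentage and the per-codon counts that a codon utilization table such as Table I summarizes.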
Attenuation mechanisms (Kolter and Yanofsky, 1982; Yanofsky, 1981) have been implicated in the regulation of several amino acid biosynthetic operons (Barnes, 1978; Bertrand et al., 1975; DiNocera et al., 1978; Gardner, 1979; Lawther and Hatfield, 1980; Lee and Yanofsky, 1977; Nargang et al., 1980; Zurawski et al., 1978) as well as in the regulation of resistance to erythromycin (Horinouchi and Weisblum, 1980) and pyrimidine biosynthesis (Turnbough et al., 1983). In general, in the amino acid operons the mechanism involves a transcribed leader region, prior to the coding sequence for the first structural gene, which encodes a leader peptide containing several residues corresponding to the amino acid end product of the operon. Two regions of dyad symmetry occur in the transcript, one in the coding region for the leader peptide followed by a second, closely resembling a ρ-independent transcription terminator (Rosenberg and Court, 1979), preceding a run of uridine nucleotides. These regions in turn can interact to form a single base-paired structure. Termination or read-through of transcription is dependent on the availability of the appropriate charged tRNAs for translation, which determines which secondary structure will form.
In determining the nucleotide sequence of the asd gene of S. mutans we were somewhat surprised to find a sequence resembling an attenuator preceding the coding region for aspartate β-semialdehyde dehydrogenase, since no evidence of an attenuator had been identified in the E. coli system (Haziza et al., 1982). The fact that the E. coli gene is regulated principally by levels of lysine, but also by threonine and methionine, whereas no such regulation could be demonstrated for the S. mutans gene, at least in E. coli, indicates that this attenuator-like sequence does not function in that capacity. However, in vitro transcription indicates that a leader transcript is synthesized and terminated with about 50% efficiency. A putative peptide is encoded by the leader transcript, but it spans the terminator and continues for 13 amino acid residues beyond the point of transcription termination. Seven codons specifying amino acids that are ultimate products of the action of the asd gene product occur in the peptide; except for two lysines these amino acids are noncontiguous, and four are found subsequent to the transcription termination sequence. It is possible that this transcription termination site serves simply as a pause site to permit better strand separation about the downstream promoter immediately preceding the asd structural gene. This would allow greater efficiency of transcription initiation and hence higher levels of aspartate β-semialdehyde dehydrogenase production.
Alternatively, the structure of the asd-specific promoter, with its multiple -10-like sequences, may serve as an "antenna," as suggested by Reznikoff et al. (1985), enhancing the association of RNA polymerase with the promoter by providing multiple sites of interaction. While spacing constraints would seem to favor the fourth -10 sequence, at least one and perhaps two other sequences appear to be involved in initiating transcription in in vitro runoff transcription experiments. Several other Streptococcus sp. gene sequences have recently been published (Hollingshead et al., 1986; Malke et al., 1985; Mannarelli et al., 1985; Stassi et al., 1982). While all of these are preceded by sequences bearing homology to the E. coli consensus promoter (Hawley and McClure, 1983), only the region preceding the type 6 M protein gene of S. pyogenes contains several potential promoters, and the activity of these sequences has yet to be determined (Hollingshead et al., 1986).
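As an illustration of how multiple -10-like sequences might be located in an upstream region, the following is a minimal sketch that scans a DNA string for hexamers resembling the E. coli -10 consensus, TATAAT. It is not the procedure used in this work. The function name, the mismatch threshold, and the toy sequence are hypothetical and chosen only for demonstration; the toy sequence is not the asd upstream region.

```python
# Minimal sketch: locate -10-like hexamers by counting mismatches
# against the E. coli -10 consensus (TATAAT).
# Assumptions: function name, threshold, and toy sequence are illustrative only.

CONSENSUS_MINUS_10 = "TATAAT"


def find_minus10_like(seq, consensus=CONSENSUS_MINUS_10, max_mismatches=1):
    """Return (0-based position, hexamer, mismatch count) for every window
    of seq that differs from the consensus at no more than max_mismatches
    positions."""
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        mismatches = sum(1 for a, b in zip(window, consensus) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Toy sequence (hypothetical, NOT the asd promoter region).
toy = "GGTATAATCCTATGATGG"
for pos, hexamer, mm in find_minus10_like(toy):
    print(pos, hexamer, mm)
```

A scan of this kind only flags candidates. Which of them function as promoter elements depends on spacing relative to a -35-like sequence and must be settled experimentally, as with the runoff transcription experiments described above.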
The efficiency of the asd promoter seems clear. The RNA polymerase binding data demonstrate its affinity for polymerase under the most stringent conditions. The in vitro transcription data support this, as do the message stability data. While the half-life of the asd message is comparable to that found for E. coli-specific messages, the level of synthesis of the asd message is considerably higher than that of pBR322 sequences when pYA575 is the transcriptional template. This includes both the β-lactamase gene and the sequences coding for the 104-nucleotide RNA, which have been reported to have strong promoters (Chan et al., 1979; Stuber and Bujard, 1981). Furthermore, a fusion of the asd promoter and translational signals with the E. coli lacZ gene on the plasmid vector pMLB1034 (Berman, 1983) is capable of producing over 8000 units of β-galactosidase activity² (Miller, 1972). We have also used this promoter region to construct an expression vector, pYA626, which has been successfully employed by members of our laboratory to express Mycobacterium leprae genes in E. coli (Clark-Curtiss et al., 1985; Jacobs et al., 1986). These characteristics are comparable to those described by Gentz and Bujard (1985) for "efficient unregulated promoters in the E. coli system."
We had anticipated that the S. mutans asd gene, constitutively expressed in E. coli on the plasmid pYA575, would respond to excess or limiting concentrations of lysine when upstream sequences coding for the leader transcript were supplied in the plasmid pYA574. The fact that, regardless of lysine concentration, the pYA574-carried gene specifies higher levels of aspartate β-semialdehyde dehydrogenase may alternatively be explained by the availability of the readthrough leader transcripts for translation. However, this explanation seems inadequate and, of course, is based only on in vitro and not in vivo studies of the transcripts formed.
We firmly believe the presence of the leader region is more than coincidence and plays some role in expression of the asd gene. Haziza et al. (1982) proposed the involvement of a regulatory protein mediating the effects of lysine, threonine, and methionine concentration. If such a molecule recognized sequences specific to the E. coli gene, it is unlikely that it would act upon the S. mutans gene, given the lack of homology between the sequences.
To determine a mechanism of regulation would then require returning the asd gene to S. mutans. To this end, we have constructed a shuttle vector, pYA629, capable of being transformed into and replicated by both E. coli and S. mutans (Murchison et al., 1986). Transformation of S. mutans with pYA629 carrying the asd/lacZ fusion described above will provide us with a readily assayable gene product which will facilitate the study of regulation of the gene and its level of activity in its normal host.
² G. A. Cardineau, unpublished data.