Nucleotide sequence of the overlapping genes for the subunits of Bacillus subtilis aspartokinase II and their control regions.

The nucleotide sequence of a 2.9-kilobase Bacillus subtilis DNA fragment containing the entire coding region of aspartokinase II and adjacent chromosomal regions (Bondaryk, R. P., and Paulus, H. (1985a) J. Biol. Chem. 260, 585-591) has been determined. The results confirmed the earlier prediction that the two subunits of aspartokinase II, alpha and beta, are encoded by in-phase overlapping genes. The nucleotide sequence showed strong ribosome binding sites before the translation initiation codons of the alpha and beta subunits. Deletion of most of the coding region unique to the alpha subunit had no effect on the synthesis of the smaller beta subunit, demonstrating that the beta subunit is indeed the product of independent translation. The site of transcription initiation of the aspartokinase gene was found to be more than 300 nucleotides upstream from the translation start of the alpha subunit. The intervening region contained a short reading frame capable of encoding a 24-residue lysine-rich polypeptide, which overlaps a region of extensive dyad symmetry culminating in a rho-independent transcription terminator. This region may be an attenuator control element that regulates the expression of the aspartokinase gene in response to the availability of lysine, the end product of the pathway. The coding sequence of the aspartokinase II subunits was immediately followed by a rho-independent transcription terminator. This termination site has an unusual symmetry, which allows it also to serve as transcription terminator for a gene that converges on the aspartokinase II gene from the opposite direction, an interesting example of genetic economy. The deduced amino acid sequence of B. subtilis aspartokinase II was compared with the sequences of the three aspartokinases from Escherichia coli (Cassan, M., Parsot, C., Cohen, G. N., and Patte, J. C. (1986) J. Biol. Chem. 261, 1052-1057). 
Significant sequence similarities suggest a close evolutionary relationship between the four enzymes.

The coding region specifying the two subunits of aspartokinase II (ATP:L-aspartate 4-phosphotransferase, EC 2.7.2.4) from Bacillus subtilis has recently been cloned in a bacterial plasmid (Bondaryk and Paulus, 1985a). Characterization by specific cleavage with restriction endonucleases suggested a map for the aspartokinase II gene in which the coding sequence for the smaller β subunit overlaps the promoter-distal portion of the coding sequence for the α subunit in the same reading frame (Bondaryk and Paulus, 1985a). Studies of the expression of aspartokinase in Escherichia coli transformed with a recombinant plasmid carrying the complete coding region showed the product to be indistinguishable in its molecular and regulatory properties from the aspartokinase II isolated from B. subtilis, indicating that the cloned DNA fragment contained all the information necessary for the structure and synthesis of the enzyme (Bondaryk and Paulus, 1985a, 1985b).
In this paper, we present the nucleotide sequence of the entire aspartokinase II gene and the adjacent regions on the B. subtilis chromosome. Our data allow the tentative identification of potential control regions and support the earlier proposal (Bondaryk and Paulus, 1985b) that the two aspartokinase subunits are the products of independent translation of in-phase overlapping genes. Comparison of the deduced amino acid sequence of B. subtilis aspartokinase II with that of the three E. coli aspartokinases provides interesting insights into the evolutionary relationships between these enzymes.

EXPERIMENTAL PROCEDURES AND RESULTS
Nucleotide Sequence of the Aspartokinase II Gene-The nucleotide sequence of the entire 2.9-kb PstI fragment, known to contain the complete aspartokinase II coding region (Bondaryk and Paulus, 1985a), is shown in Fig. 2, together with the amino acid sequence defined by the major open reading frames. Our earlier studies had shown that the coding sequences of both the α and β subunits of aspartokinase straddle a unique BamHI site (Bondaryk and Paulus, 1985a) [...]
FIG. 2 (legend, in part). Regions of dyad symmetry are overlined with arrows and their centers indicated by a dot, potential ribosome binding sites are marked by triangles, and the -10 and -35 regions of the putative aspartokinase II promoter are enclosed in boxes, with the probable transcription start point shown by a star. The deduced amino acid sequence and putative control elements of a potential operon with opposite polarity are shown in a separate figure (Fig. 9, miniprint).
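The pairing of nucleotide and amino acid rows in Fig. 2 can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code applies; the 54-nucleotide string is one legible row of Fig. 2 lying just downstream of the β subunit start, and the function and variable names are illustrative only.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (third base varies fastest). "*" marks a termination codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna: str) -> str:
    """Translate a DNA string (coding strand, 5'->3') in frame 1."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def three_letter(protein: str) -> str:
    """Render a one-letter protein string in three-letter code."""
    return " ".join(THREE_LETTER[aa] for aa in protein)


# One legible row of Fig. 2 (within the beta subunit coding region).
FIG2_ROW = ("AAT TTA ATT GTC AGA GGC ATT GCA TTT "
            "GAA GAT CAA ATC ACA AGA GTA ACC ATT")

if __name__ == "__main__":
    print(three_letter(translate(FIG2_ROW)))
```

The three-letter output can be compared directly with the amino acid row printed under the same nucleotides in Fig. 2.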
[FIG. 2. Nucleotide sequence of the 2.9-kb PstI fragment with the deduced amino acid sequence (sequence display not reproduced; the β subunit start is marked in the figure).]

[...] 1835 as that of the aspartokinase α subunit. The amino acid sequence predicted by residues 615-641 is in agreement with the amino-terminal nonapeptide sequence of the α subunit (see the Miniprint). The amino-terminal hexadecapeptide of the aspartokinase β subunit corresponds to the amino acid sequence translated from nucleotide residues 1347-1394, indicating that residues 1347-1835 encode the smaller aspartokinase subunit. These assignments were confirmed by the observation that nucleotide residues 1830-1835 encode the amino acid sequence alanyl-valine, the known carboxyl terminus of both aspartokinase subunits (Moir and Paulus, 1977b).
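The residue ranges quoted above can be cross-checked with simple coordinate arithmetic. The sketch below is illustrative rather than part of the original analysis: the coordinates are those given in the text, while the helper names and the derived codon counts (408 and 163, which include the initiator methionine and exclude the stop codon) are computed here and are not figures stated in the paper.

```python
def span_length(first: int, last: int) -> int:
    """Number of nucleotides in the inclusive range first..last."""
    return last - first + 1


def codon_count(first: int, last: int) -> int:
    """Number of whole codons in an inclusive nucleotide range."""
    n = span_length(first, last)
    if n % 3:
        raise ValueError(f"{first}-{last} is not a whole number of codons")
    return n // 3


def in_phase(start_a: int, start_b: int) -> bool:
    """True if two start positions lie in the same reading frame."""
    return (start_b - start_a) % 3 == 0


# Coordinates taken from the text.
ALPHA_ATG = 612          # translation start of the alpha subunit
BETA_ATG = 1347          # translation start of the beta subunit
LAST_CODING = 1835       # last nucleotide of the shared Ala-Val terminus

if __name__ == "__main__":
    print("nonapeptide codons:", codon_count(615, 641))
    print("hexadecapeptide codons:", codon_count(1347, 1394))
    print("C-terminal dipeptide codons:", codon_count(1830, 1835))
    print("alpha and beta in phase:", in_phase(ALPHA_ATG, BETA_ATG))
    print("alpha reading frame codons:", codon_count(ALPHA_ATG, LAST_CODING))
    print("beta reading frame codons:", codon_count(BETA_ATG, LAST_CODING))
```

The check confirms that each quoted range is a whole number of codons of the stated peptide length and that the two start codons share a reading frame, as required for in-phase overlapping genes.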

Site of Transcription Initiation-The approximate position of the elements essential for transcription initiation was deduced by the use of promoter probe plasmids. DNA fragments were analyzed for promoter activity by inserting them into a plasmid just upstream from a promoterless β-galactosidase or chloramphenicol acetyltransferase gene and by analyzing an appropriate host strain transformed with the recombinant plasmid for β-galactosidase activity or chloramphenicol resistance, respectively. The inserted fragments began at the upstream end of the 2.9-kb PstI fragment carrying the aspartokinase coding sequence and extended various distances into it. DNA fragments carrying residues 1-243 were unable to support the synthesis of E. coli β-galactosidase or the expression of chloramphenicol resistance in either B. subtilis or E. coli, whereas fragments extending to residue 380 or beyond had significant promoter activity (Tables I-III). The observation that residues 1-598 effectively promoted chloramphenicol resistance in E. coli and B. subtilis, whereas both fragments produced by cleavage with BglII at residue 261 were inactive (Tables II and III), indicated that elements essential for promoter activity occurred both upstream and downstream of the unique BglII site.
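The deletion-mapping logic in this paragraph can be written out explicitly. The sketch below is an illustration only: it assumes that an active fragment contains every essential element, that an inactive sub-fragment lacks at least one, and that the shortest active fragment is residues 1-380 (the text says only that fragments extending to residue 380 or beyond were active). The function name is hypothetical.

```python
Interval = tuple[int, int]  # inclusive residue range (first, last)


def missing_parts(active: Interval, inactive: Interval) -> list[Interval]:
    """Return the parts of an active fragment absent from an inactive
    sub-fragment. At least one essential promoter element must lie
    somewhere in the returned intervals."""
    (a0, a1), (i0, i1) = active, inactive
    if not (a0 <= i0 and i1 <= a1):
        raise ValueError("inactive fragment must lie within the active one")
    parts = []
    if i0 > a0:
        parts.append((a0, i0 - 1))
    if i1 < a1:
        parts.append((i1 + 1, a1))
    return parts


if __name__ == "__main__":
    # 1-243 inactive, 1-380 active: something essential lies in 244-380.
    print(missing_parts((1, 380), (1, 243)))
    # 261-598 inactive, 1-598 active: something essential lies in 1-260.
    print(missing_parts((1, 598), (261, 598)))
    # 1-260 inactive, 1-598 active: something essential lies in 261-598.
    print(missing_parts((1, 598), (1, 260)))
```

Taken together, the three comparisons place essential elements on both sides of the BglII cut between residues 260 and 261, which is the conclusion drawn in the text.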
These results suggested that the aspartokinase II promoter is located in the vicinity of residue 261, about 350 base pairs upstream from the translation initiation site of the aspartokinase gene (residue 612). That the extensive intervening region has a regulatory function was suggested by the data in Tables I-III, which show that DNA segments extending be[...] (Table III), suggesting that the region between residue 489 and the translation start site represents a negative control element. Although the quantitative aspects of these results must be interpreted with caution, since it was not established that the copy numbers of the various plasmids were the same, the smaller effect of the intervening region seen in E. coli (Tables I and II) suggested that the full expression of the negative control requires the physiological context of the B. subtilis cytoplasm. As discussed later, the intervening region probably functions in the attenuation of transcription.

TABLE I
Analysis of promoter activity of various B. subtilis DNA fragments inserted into promoter-probe vector pCED6 and expressed in E. coli RV
The appropriate segments of the 2.9-kb B. subtilis DNA fragment were obtained as inserts in M13mp19 in the course of DNA sequencing and were excised by cleavage in the polylinker and supplied with HindIII linkers where appropriate. The fragments were inserted by ligation either into pCED6 cleaved with HindIII and PstI or into HindIII-cleaved pCED6, followed by identification of inserts with the desired polarity by restriction analysis. Cultures of E. coli RV transformed with the recombinant plasmids were harvested in midexponential phase and assayed for β-galactosidase activity as described under "Experimental Procedures." Enzyme activity was expressed per ml of culture and normalized to a culture density at 600 nm of 1.0.

TABLE II (legend)
The DNA fragments described in Table I, two additional fragments generated from one of these by internal BglII cleavage at position 261, and an EcoRV fragment (residues 957-1484) derived from the entire 2.9-kb PstI segment were provided with suitable cohesive ends using synthetic DNA linkers and ligated into the BamHI and SalI sites of the polylinker region of pSL100, except for segment 1-260, which was ligated into the BamHI and HindIII sites. Cultures of E. coli HB101 were transformed with the recombinant plasmids, 10[?] cells were plated at various concentrations of chloramphenicol as indicated, and the number of colonies was scored.

TABLE III
Analysis of promoter activity of various B. subtilis DNA fragments inserted into promoter-probe vector pPL703 and expressed in B. subtilis BR151
The same DNA fragments described in Table II, provided with appropriate cohesive ends, were ligated into the BamHI and SalI sites of the polylinker region of pPL703 and introduced into B. subtilis BR151 by protoplast transformation. Cells (10^4) of transformed B. subtilis BR151 were plated at various concentrations of chloramphenicol, and the number of colonies was scored.

The same promoter probe plasmids were also used both in B. subtilis and in E. coli to examine the region near the beginning of the β subunit. No promoter activity was found associated with residues 957-1484 (Tables II and III). As this includes nearly 400 base pairs upstream from the beginning of the β subunit, it seems likely that the latter is translated from the same primary transcript as the α subunit.
The site of transcription initiation was also determined by the direct examination of RNA transcripts by hybridization mapping of mRNA isolated from E. coli JM101 transformed with a plasmid carrying the B. subtilis aspartokinase II gene or from B. subtilis VB217 derepressed for aspartokinase II.
The procedure involved the extension of a radiolabeled primer, annealed to a single-stranded DNA template, by bacteriophage T4 DNA polymerase in the presence of mRNA hybridized to the same template (Hu and Davidson, 1986). Since T4 DNA polymerase cannot displace a hybridized RNA moiety from its DNA template, primer extension should stop at the 5' terminus of the hybridized mRNA, the 3'-end of the growing DNA chain thus marking its position. Experiments were carried out with three different templates that had been inserted into M13mp18 or M13mp19 at the BglII site (residue 261), the DdeI site (residue 212), or the SmaI site (residue 99) of the 5'-flanking region of the B. subtilis aspartokinase II gene and which extended into the coding region for the α subunit. As shown in Fig. 3, the site of the first significant termination of primer extension in the presence of mRNA was at residues 280, 281, and 282, respectively, with the three DNA templates, whereas no significant termination was seen in that region in the absence of added RNA. The first DNA residue hybridized with mRNA was thus at position 281, suggesting that this might represent the transcription start site. Considerable primer extension proceeded beyond residue 280, probably because of the presence of partially degraded mRNA molecules and perhaps also because of limited RNA strand displacement by T4 DNA polymerase. The latter was suggested by the progressive increase in length of the extended primer as the distance between the priming site and the transcription start site increased with the three templates employed (Fig. 3). On the other hand, the possibility cannot be excluded that the 3-residue range of primer extension observed in these experiments was due to limited exonucleolytic degradation of the 5'-ends of the transcripts or to heterogeneity in the transcription start site. Almost identical results were obtained with mRNA isolated from an aspartokinase II overproducing strain of B. subtilis and from E. coli transformed with a plasmid carrying the aspartokinase II gene.
Sites of Translation Initiation-The translation initiation site of the α subunit of aspartokinase II is unambiguously defined as the ATG sequence at position 612, the amino-terminal nonapeptide of the α subunit being encoded by residues 615-641 and the in-phase TAA at position 606 precluding the use of other potential formyl-methionine codons upstream. On the other hand, the fact that residues 1347-1394 code for the amino terminus of the β subunit does not necessarily designate the ATG sequence at residue 1347 as the translation initiation site, since the isolated β subunit might instead be the product of proteolytic processing of the α subunit, as had been originally proposed (Moir and Paulus, 1977b). To distinguish between these possibilities, the production of β subunit was examined in E. coli HB101 transformed with a plasmid carrying a deletion of residues 566-1234, which included the translation start and a large portion of the coding sequence for the α subunit. Extracts from the transformed cells were subjected to polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulfate and analyzed for the presence of aspartokinase II subunits by immunoblotting with antiserum against the β subunit, which cross-reacts with both aspartokinase subunits (Moir and Paulus, 1977b). The results showed that the deletion of residues 566-1234 prevented the production of the aspartokinase II α subunit but not of the β subunit (Fig. 4). The observation that, under these conditions, β subunit could be produced in the absence of α subunit clearly demonstrated that β subunit is not derived from the latter but is the product of independent translation. Since the deletion used in the experiment encompassed the entire translation initiation region of the α subunit, it is most likely that the translation of β subunit starts at the ATG sequence at residue 1347, which corresponds to the amino-terminal methionine residue of the β subunit and is preceded by a strong ribosome binding site (residues 1333-1338). The nature of the additional band of molecular weight about 30,000 seen in these experiments (Fig. 4, lane 1) is not clear. The question of whether it is a degradation product of α subunit or the result of translation initiation at an additional site is under investigation.

FIG. 3 (legend, in part). [...] (lanes 4-7). Part of the nucleotide sequence deduced from the sequencing lanes is shown on the right of each panel, with the putative -10 region of the promoter indicated by a box and the shortest extended primer segment by an asterisk.

FIG. 4 (legend). The production of aspartokinase II subunits was measured by Western blotting as described under "Experimental Procedures" in E. coli HB101 transformed with a pUC18 plasmid carrying the entire aspartokinase II coding region (lane 1), transformed with a similar plasmid from which a portion of the coding sequence of the α subunit (residues 566-1234) had been deleted (lanes 3 and 4), or untransformed (lane 5). Lane 2 is a sample of purified B. subtilis aspartokinase II, with the α and β subunits identified on the left.

Translation Initiation Sites for the Aspartokinase Subunits-The cloned 2.9-kb PstI fragment of B. subtilis DNA contains three major open reading frames: residues 1-264; residues 612-1835; and (with opposite polarity) residues 2328-1885. The aspartokinase II α subunit is encoded by the central reading frame, its amino-terminal nonapeptide being specified by residues 615-641 and its carboxyl-terminal dipeptide by residues 1830-1835. The amino-terminal hexadecapeptide of the β subunit corresponds to residues 1347-1394 within the α subunit coding sequence, in agreement with our earlier suggestion (Bondaryk and Paulus, 1985a) that the two aspartokinase subunits are specified by in-phase overlapping reading frames. The coding regions for the amino termini of the aspartokinase subunits either start with ATG or are directly preceded by ATG, consistent with the notion that they represent translation start sites. Just upstream of the putative formyl-methionine codons of the α and β subunits are the sequences AAAGG (residues 597-601) and AGGAGG (residues 1333-1338), respectively, both complementary to the 3'-end of B. subtilis 16 S ribosomal RNA (5'-CCTCCTTTCT-3') and, therefore, potential ribosome binding sites (McLaughlin et al., 1981). It is very likely that the synthesis of the α subunit starts with the ATG sequence at residue 612 and that the amino-terminal formyl-methionine residue is removed from the nascent α subunit post-translationally. The translation start of the β subunit is more problematical, since the sequence data alone cannot rule out the possibility that the β subunit is derived post-translationally from α subunit by proteolytic processing. However, that possibility could be eliminated by the use of a mutant plasmid from which the translation initiation site and the amino-terminal half of the α subunit had been deleted. Strains transformed with such a deletion plasmid were unable to produce α subunit but did produce normal β subunit, a clear demonstration that under these conditions the β subunit of aspartokinase II is not derived from α subunit but rather is translated independently.
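The complementarity of the two putative ribosome binding sites to the 16 S rRNA 3'-end can be verified directly. This is a minimal sketch that considers only contiguous Watson-Crick pairing (G-U wobble pairs are ignored) and uses the rRNA sequence in the DNA alphabet exactly as quoted in the text; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# 3'-end of B. subtilis 16 S rRNA as quoted in the text (5'->3', T for U).
RRNA_3PRIME_END = "CCTCCTTTCT"


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def pairs_with_rrna(site: str, rrna: str = RRNA_3PRIME_END) -> bool:
    """True if the whole site can base-pair contiguously with the rRNA,
    i.e. its reverse complement occurs within the rRNA sequence."""
    return reverse_complement(site) in rrna


def pairing_offset(site: str, rrna: str = RRNA_3PRIME_END) -> int:
    """0-based position in the rRNA where the paired stretch begins,
    or -1 if the site is not fully complementary."""
    return rrna.find(reverse_complement(site))


if __name__ == "__main__":
    for name, site in (("alpha", "AAAGG"), ("beta", "AGGAGG")):
        print(name, site, pairs_with_rrna(site), pairing_offset(site))
```

Both sites pair in full, with the β site spanning six nucleotides and the α site five, in keeping with the stronger binding site assigned to the β subunit.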
The observation that the aspartokinase II subunits are translated independently raises the interesting question of how their synthetic rates are coordinated. In exponentially growing cells of B. subtilis, the α and β subunits are synthesized at equivalent rates, whereas α subunit is produced in 2-fold excess in germinating B. subtilis spores and β subunit is overproduced nearly 4-fold in E. coli transformed with a recombinant plasmid carrying the B. subtilis aspartokinase II gene (Bondaryk and Paulus, 1985b). This suggests that in growing B. subtilis a balance has evolved between the various factors that determine the rate of peptide chain initiation to assure equimolar intracellular levels of the aspartokinase subunits and that this balance is perturbed under non-steady state conditions (germinating spores) or in a foreign cytoplasm. Among the factors that could influence the relative rates of initiation of α and β subunits are the relative affinities of ribosomes for the respective ribosome binding sites, the competition between ribosomes actively translating the α subunit mRNA and those binding to the mRNA at the internal site for the initiation of β subunit, and the rate of degradation of the 5'-terminal portion of the aspartokinase mRNA. The sequence data allow us to evaluate only the first factor and indicate that the ribosome binding site at the start of the β subunit is considerably stronger than that at the beginning of the α subunit. The calculated free energies of interaction (Freier et al., 1986) of 16 S ribosomal RNA with the ribosome binding sites for the α and the β subunit are -4.4 and -9.1 kcal/mol, respectively, for B. subtilis and -3.5 and -9.1 kcal/mol, respectively, for E. coli ribosomes. The relatively stronger interaction with the β subunit start site may serve to compensate for possible negative factors. This compensation may be excessive in E. coli, where ribosomes would bind more weakly to the α start site than in B. subtilis, and thus lead to 4-fold overproduction of β subunit.
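To give a sense of scale for the free energies quoted above, the difference between two binding energies can be converted into an equilibrium affinity ratio with the Boltzmann relation K_β/K_α = exp((ΔG_α - ΔG_β)/RT). The sketch below is a back-of-the-envelope illustration and not a calculation from the paper: it assumes simple two-state binding at 37 °C, and an equilibrium affinity ratio is not the same as a ratio of initiation rates.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def affinity_ratio(dg_alpha: float, dg_beta: float,
                   temp_k: float = 310.15) -> float:
    """Equilibrium affinity of the beta site relative to the alpha site,
    given binding free energies in kcal/mol (more negative = stronger).
    Assumes two-state binding; the 37 C default is an assumption."""
    return math.exp((dg_alpha - dg_beta) / (R_KCAL * temp_k))


if __name__ == "__main__":
    # Free energies as quoted in the text (alpha site, beta site).
    b_subtilis = affinity_ratio(-4.4, -9.1)
    e_coli = affinity_ratio(-3.5, -9.1)
    print(f"B. subtilis ribosomes: beta/alpha affinity ~ {b_subtilis:.0f}")
    print(f"E. coli ribosomes:     beta/alpha affinity ~ {e_coli:.0f}")
```

Under these assumptions the affinity ratios are orders of magnitude larger than the observed 1- to 4-fold differences in subunit synthesis, which is consistent with the suggestion that the stronger β site offsets other, unfavorable factors.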
Transcription Control Sites-Various polynucleotide segments upstream from the translation start sites of the aspartokinase II subunits were analyzed for promoter activity by cloning in promoter probe plasmids. The results of this analysis showed that an effective promoter region was located between residues 243 and 380 of the cloned B. subtilis PstI segment but that the promoter was inactivated by cleavage with BglII endonuclease at position 261. Examination of the nucleotide sequence near the unique BglII site revealed the presence of an AT-rich region, followed by hexanucleotide sequences (TTGTCC and TAAAAT) homologous to those frequently found about 35 and 10 nucleotides upstream of bacterial transcription start sites (TTGACA and TATAAT, respectively; Hawley and McClure, 1983). The -35 and -10 consensus sequences are separated by 17 base pairs, the preferred spacing for bacterial promoters (Hawley and McClure, 1983). The presence of the BglII endonuclease recognition site in this spacer region explains the inactivation of the promoter by cleavage with BglII. Mapping experiments of mRNA isolated from E. coli JM101 transformed with a recombinant plasmid carrying the B. subtilis aspartokinase gene or from B. subtilis VB217, an overproducer of aspartokinase II, indicated that its 5' terminus corresponded to residues 281-283 in the DNA sequence, consistent with the proposed promoter location. Analysis of other polynucleotide sequences including more than 300 residues just upstream from the α subunit translation initiation site (residues 261-598) and 400 residues upstream from the β subunit start (residues 957-1484) failed to reveal any other promoter elements, suggesting that the two aspartokinase subunits are encoded by a single mRNA species initiated near residue 281. The aspartokinase II mRNA might thus be considered a polycistronic mRNA but with the unusual property that the two cistrons are overlapping.
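The similarity of the two promoter hexamers to the consensus sequences can be quantified by counting matching positions. The following is a minimal sketch using only the sequences quoted in the text; the scoring by simple position-wise identity and the function name are illustrative choices, not the criterion used by the authors.

```python
def matching_positions(site: str, consensus: str) -> int:
    """Count positions at which two equal-length sequences agree."""
    if len(site) != len(consensus):
        raise ValueError("sequences must have equal length")
    return sum(a == b for a, b in zip(site.upper(), consensus.upper()))


# Hexamers and consensus sequences as quoted in the text.
MINUS_35 = ("TTGTCC", "TTGACA")
MINUS_10 = ("TAAAAT", "TATAAT")
SPACER_BP = 17  # spacing reported between the two hexamers

if __name__ == "__main__":
    for label, (site, consensus) in (("-35", MINUS_35), ("-10", MINUS_10)):
        n = matching_positions(site, consensus)
        print(f"{label}: {site} vs {consensus}: {n}/6 positions match")
    print(f"spacer: {SPACER_BP} bp")
```

The -35 hexamer matches the consensus at four of six positions and the -10 hexamer at five of six.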
Another unusual property of the aspartokinase II mRNA is that more than 300 nucleotides intervene between its 5'-end and the site of translation initiation of the α subunit. Examination of this leader sequence reveals two interesting features (Fig. 5). One of these is an open reading frame (residues 362-433) which encodes a 24-residue polypeptide and is preceded by a strong ribosome binding site (residues 347-352). The other is the presence of four regions with extensive self-complementarity and thus with the potential to form hairpin loops. The first of these palindromic regions overlaps the open reading frame, whereas the last has the characteristic of a ρ-independent terminator by ending with seven consecutive uridylate residues (Adhya and Gottesman, 1978). Furthermore, sequence complementarity exists between the first, second, and fourth palindromic regions such as to allow an alternate pattern of secondary structure (indicated by the lines in Fig. 5). This structural pattern is characteristic of the transcription attenuator elements found in many biosynthetic operons (Kolter and Yanofsky, 1982), the first, second, and fourth hairpin loops corresponding to the protector, preemptor, and terminator elements, respectively (Nargang et al., 1980). The putative 24-residue leader peptide contains 4 lysine residues in the region that precedes or overlaps the protector loop, so that lysine deficiency would cause the stalling of ribosomes in this region and thereby destabilize the protector structure and favor the alternate base pairing pattern involving two preemptor loops without a functional transcription terminator (Nargang et al., 1980). According to this model, limitation of lysine, a major end product of the aspartate pathway, would allow transcription to proceed beyond the attenuator site into the structural gene for aspartokinase, thus enhancing the rate of synthesis of the lysine biosynthetic enzyme. Experiments are in progress to test this model by mapping the transcripts produced in B. subtilis under conditions of lysine deficiency and excess. It should be noted that the structure of the putative aspartokinase II attenuator is much more complex than that of the only other B. subtilis attenuator described (Shimotsu et al., 1986), which does not seem to involve the synthesis of a leader peptide but rather the interaction with a regulatory protein. Undoubtedly, the control of transcription of the aspartokinase gene will be more complex than outlined above and may involve additional factors, such as a putative negative control element defined by one class of mutants resistant to S-(2-aminoethyl)-L-cysteine (Yeh and Steinberg, 1978; Mattioli et al., 1979). It may be that the attenuation control of the aspartokinase II operon combines the elements involved in the regulation of E. coli and of B. subtilis transcription attenuators, i.e. the synthesis of a leader peptide as well as the interaction with a regulatory protein. Extensive studies on transcription both in vitro and in vivo will be necessary to test this possibility.

FIG. 5. Potential transcription attenuator of the B. subtilis aspartokinase II gene. The diagram shows a possible secondary structure of the leader transcript, with an alternative structure indicated by lines connecting residues that could interact by stable base pairing if the leftmost stem-loop structure were disrupted. Also shown are a strong ribosome binding site (triangles) and the disposition of lysine residues in the putative leader peptide. The RNA is shown terminated at the ρ-independent transcription termination site.

The termination of the aspartokinase II transcript appears to occur right after the translation termination site at a region of dyad symmetry, which in its latter half contains a run of five thymidylate residues. The symmetrical region resembles a ρ-independent terminator site (Adhya and Gottesman, 1978) except that the consecutive thymidylates are within the palindromic sequence. This type of structure has also been observed in the bidirectional transcription terminator of the [...] (Grundstrom and Jaurin, 1982). Transcription of either DNA strand would give rise to an RNA molecule which could assume a hairpin structure with five uridylates in its distal stem as required for a ρ-independent terminator site (Fig. 6). As mentioned earlier, the reading frame for the aspartokinase II subunits converges on a large open reading frame of opposite polarity (residues 2328-1885), and the transcription of both coding regions is presumably terminated at the same bidirectional ρ-independent terminator element.
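The reason a region of dyad symmetry with its thymidylate run inside the palindrome can terminate transcription from either direction can be shown with a toy sequence. The sketch below uses a hypothetical stem and loop (LEFT_ARM and LOOP are invented for illustration and are not the aspartokinase II terminator sequence); it only demonstrates that the transcripts of both strands end in the same 3' stem arm carrying the uridylate run.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def transcript(coding_strand: str) -> str:
    """RNA with the same sequence as the given coding strand (5'->3')."""
    return coding_strand.replace("T", "U")


def forms_hairpin(rna: str, arm: int) -> bool:
    """True if the first and last `arm` residues are fully complementary."""
    five_prime, three_prime = rna[:arm], rna[-arm:]
    as_dna = three_prime.replace("U", "T")
    return reverse_complement(as_dna).replace("T", "U") == five_prime


def distal_u_run(rna: str, arm: int, run: int = 5) -> bool:
    """True if the 3' stem arm contains `run` consecutive uridylates."""
    return "U" * run in rna[-arm:]


# Hypothetical terminator: the 5' arm carries an A-run, so the 3' arm
# (its reverse complement) carries the T-run inside the palindrome.
LEFT_ARM = "GGCAAAAAGCC"   # invented for illustration
LOOP = "GTAA"              # invented for illustration
TOP_STRAND = LEFT_ARM + LOOP + reverse_complement(LEFT_ARM)
BOTTOM_STRAND = reverse_complement(TOP_STRAND)

if __name__ == "__main__":
    for name, strand in (("rightward", TOP_STRAND), ("leftward", BOTTOM_STRAND)):
        rna = transcript(strand)
        print(name, rna, forms_hairpin(rna, len(LEFT_ARM)),
              distal_u_run(rna, len(LEFT_ARM)))
```

Because the stem is its own reverse complement, only the loop differs between the two transcripts; each folds into the same hairpin with five uridylates in its distal arm.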

Relation to Adjacent Open Reading Frames-Extensive open reading frames both precede and follow that of the aspartokinase II subunits. As discussed above, the reading frame just downstream from the aspartokinase gene is of opposite polarity and appears to share with it a bidirectional transcription terminator. It potentially encodes a 148-residue polypeptide of unknown function and is preceded by a strong ribosome binding site and a potential transcription promoter (see the Miniprint), consistent with the idea that it represents a converging monocistronic operon.
The cloned 2.9-kb B. subtilis DNA fragment begins with an open reading frame that encodes 88 amino acid residues and is followed by tandem termination codons. If this indeed represents a fragment of a potential coding region, one can ask the interesting question of whether it may be functionally related to the aspartokinase gene. Such a relationship is suggested by the fact that the putative promoter of the aspartokinase gene overlaps the carboxyl-terminal portion of the open reading frame, a situation that has also been observed elsewhere, e.g. the ampC and frd operons of E. coli (Postle and Good, 1985). A consequence of such an arrangement is that the terminator of the upstream operon is interposed between the transcription and translation start sites of the second operon and thus serves as an attenuator of its transcription. In the case at hand, it is the attenuator element discussed earlier, postulated to control the transcription of the aspartokinase II gene in response to the availability of lysine, which would serve as transcription terminator of the earlier operon. Consequently, under conditions of lysine limitation, termination of transcription of the upstream operon would be incomplete and transcriptional read-through into the aspartokinase operon would yield a polycistronic mRNA in addition to the transcript initiated at the aspartokinase II promoter. Such a situation would of course not occur with the recombinant plasmid under study, which carries only a terminal promoterless fragment of the upstream operon, but would obtain with the genes on the B. subtilis chromosome.
The identity of the gene adjacent to the aspartokinase II locus would thus be of considerable interest. Inspection of the amino acid sequence of the sequenced fragment between nucleotide residues 112 and 189 reveals striking clusters of lysine residues, perhaps a clue to a possible relationship to lysine biosynthesis.
Amino Acid Sequence Comparison with the E. coli Aspartokinases-The nucleotide sequences of the genes encoding the three E. coli aspartokinases have been elucidated by Cohen and co-workers (Katinka et al., 1980; Zakin et al., 1983; Cassan et al., 1986). A comparison of the corresponding amino acid sequences has revealed considerable similarity between aspartokinase III, the product of the lysC gene, and the aspartokinase domains of aspartokinase I-homoserine dehydrogenase I (thrA) and aspartokinase II-homoserine dehydrogenase II (metL) (Cassan et al., 1986). In Fig. 7 these sequences, aligned as proposed by Cassan et al. (1986), are compared with the deduced amino acid sequence of B. subtilis aspartokinase II. Considerable similarity is seen between B. subtilis aspartokinase II and the three E. coli aspartokinases, 31, 26, and 22% of its amino acid residues being identical with those of E. coli aspartokinase III, I, and II, respectively. The regions of major interspecies similarity correspond to the regions of major similarity between the three E. coli aspartokinases (Cassan et al., 1986), which lie between residues 7-55 and residues 136-238 of B. subtilis aspartokinase II. Especially extensive similarity is seen in the 145-191 region, suggesting that it may represent the catalytic center of aspartokinase. An indication of similarity on the level of tertiary structure is provided by the observation that the start of the β subunit (residue 247) is in a position homologous to the site of partial trypsin cleavage of E. coli aspartokinase I-homoserine dehydrogenase I (Sibilli et al., 1981). Partial proteolysis experiments with B. subtilis aspartokinase II have indicated that the β subunit constitutes a discrete globular domain linked by a protease-sensitive hinge to the catalytic domain (Paulus, 1984), analogous to the hinge linking the aspartokinase domain to one of the homoserine dehydrogenase domains in the bifunctional E. coli enzyme (Fazel et al., 1983).
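The identity percentages quoted above are expressed relative to the residues of one reference protein (here B. subtilis aspartokinase II). A minimal sketch of that calculation for a pair of pre-aligned sequences is given below. It assumes gaps are written as "-" and that the denominator is the number of non-gap residues in the reference sequence; the function name `percent_identity` and the toy alignment are hypothetical and do not reproduce the alignment of Fig. 7.

```python
def percent_identity(ref_aligned, other_aligned, gap="-"):
    """Percentage of reference residues that are identical to the residue
    aligned with them in the other sequence.

    Both arguments are rows of the same alignment, so they must have equal
    length. Positions where the reference has a gap are ignored.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_residues = sum(1 for a in ref_aligned if a != gap)
    if ref_residues == 0:
        raise ValueError("reference sequence contains no residues")
    identical = sum(
        1 for a, b in zip(ref_aligned, other_aligned) if a != gap and a == b
    )
    return 100.0 * identical / ref_residues


# Toy alignment (hypothetical, for illustration only):
# 5 identical positions out of 7 reference residues.
toy_ref = "MKV-AGLT"
toy_other = "MRVQAG-T"
print(round(percent_identity(toy_ref, toy_other), 1))
```

Other conventions exist (for example, dividing by the alignment length or by the shorter sequence), so percentages computed under different conventions are not directly comparable.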
The close structural relationship between B. subtilis aspartokinase II and the E. coli aspartokinases suggests a common evolutionary origin. A closer relationship exists between B. subtilis aspartokinase II and E. coli aspartokinase III (31% of residues identical) than between the three E. coli aspartokinases (24.5-29.7% of residues identical; Cassan et al., 1986). Cassan et al. (1986) argue, on the basis of similarity of the carboxyl-terminal portion of E. coli aspartokinase III to the homoserine dehydrogenase domain of the two bifunctional aspartokinases, that fusion of aspartokinase with homoserine dehydrogenase to yield aspartokinase-homoserine dehydrogenases I and II must have occurred before the separation of aspartokinase III from the latter. According to this argument, B. subtilis aspartokinase II diverged from E. coli aspartokinase III after the latter diverged from the other E. coli enzymes. If this were indeed the evolutionary pathway of B. subtilis aspartokinase II, one could conclude that the β subunit of that enzyme originated from the homoserine dehydrogenase domain of the bifunctional E. coli aspartokinases or, more specifically, from the globular 25-kDa domain (ID) postulated to link the aspartokinase and homoserine dehydrogenase catalytic domains in E. coli aspartokinase I-homoserine dehydrogenase I (Fazel et al., 1983). In view of the fact that the function of the β subunit of B. subtilis aspartokinase II is not yet understood (Paulus, 1984), such a possibility would be of considerable interest.
Conclusions-Our results show that the B. subtilis aspartokinase operon has several unusual features. It is composed of two overlapping cistrons, a situation originally described only in viral and plasmid genomes (Normark et al., 1983) but now being increasingly recognized also on the bacterial chromosome (Smith and Parkinson, 1980; Plumbridge et al., 1985; Mackman et al., 1985; Flower and McHenry, 1986) and even in eukaryotes (Kozak, 1986). The structural genes are preceded by an exceptionally long leader sequence that appears to function as a transcription attenuator but differs from the only such regulatory element described in B. subtilis (Shimotsu et al., 1986) by encoding a leader peptide analogous to those found in E. coli (Kolter and Yanofsky, 1982). The transcription attenuator seems to function also as a transcription terminator of an adjacent operon, whereas the transcription termination site of the aspartokinase II operon is shared with a converging operon, interesting examples of genetic economy or regulatory subtlety. In order to understand the functioning of these unusual elements, it will be necessary to modify them by deletion or site-directed mutagenesis and study their expression when reintegrated (Haldenwang et al., 1980) into the B. subtilis chromosome. It is hoped that experiments of this type will not only advance our understanding of the control of lysine biosynthesis in B. subtilis and of structure-function relationships in aspartokinase II but also provide interesting insights into the evolution of regulatory strategies.