Structure and Autoregulation of the metJ Regulatory Gene in Escherichia coli*

The nucleotide sequence of the Escherichia coli metJ regulatory gene (312 nucleotides) has been determined, as well as that of two mutations located within the gene. Analysis of the sequence downstream from the metJ gene has revealed inverted repeats homologous to those found in several intercistronic regions and also reported to occur between operons. A hybrid protein in which the first 55 amino acid residues of the metJ protein substitute for the 8 amino acid residues at the NH2 terminus of β-galactosidase was produced by gene fusion. The hybrid protein, which retains β-galactosidase activity, was purified. Its amino-terminal sequence was determined, which allowed us to locate the translational start codon of the metJ gene. Evidence was provided for autoregulation by repression of the metJ gene. By sequencing upstream from metJ, the region situated between the metJ and metB genes was found to contain putative operator structures that we propose to call "Met boxes."
The levels of the proteins involved in methionine biosynthesis are elevated in strains of Escherichia coli bearing mutations in the metK or in the metJ gene (1). It appears that metJ codes for a regulatory protein which, when combined with methionine (or one of its derivatives), represses expression of the methionine regulon. The seven methionine structural genes are scattered on the E. coli chromosome (2) and are independent units of transcription, except for two genes (metB and metL) which are arranged in an operon (3, 36). The metJ aporepressor could thus interact with several operator loci.
Here, we present the complete nucleotide sequence of the metJ regulatory gene as well as that of two mutations located within the gene. We also address the question of autoregulation of repressor synthesis. To determine whether the product of metJ regulates its own synthesis, we have constructed in vitro a metJ-lacZ hybrid gene which encodes a hybrid protein endowed with β-galactosidase activity. In this system, the synthesis of β-galactosidase is used as an index of transcription from the metJ promoter. We have been able to investigate the quantitative effect of metJ+ alleles on expression of β-galactosidase activity in metJ+ derivatives of strains bearing a single copy of the metJ-lacZ hybrid gene.
* This work was supported by the Centre National de la Recherche Scientifique (U. A. 517), the Ministère de l'Industrie et de la Recherche, the Commissariat à l'Energie Atomique, and the Institut National de la Santé et de la Recherche Médicale (841006). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
$ To whom all correspondence should be addressed.

MATERIALS AND METHODS
Construction of a metJ-lacZ Hybrid Gene-We made use of the pMC1403 plasmid, which is suitable for detecting fragments with transcription and translation start signals (4). The hybrid gene was transferred to a λ bacteriophage. An EcoRI-SacII fragment of the recombinant plasmid carrying the hybrid gene was inserted between the left arm of λSEW (5) and the right arm of λgt4 (6). The recombinant bacteriophages were selected by detection of β-galactosidase activity in the plaques. These plaques were purified three times and used to lysogenize a Δlac-argE strain (XA100C su-). The lysogen was made metJ by cotransduction of the metJ185am marker (carried by 55119) with argE. The metJ genotype was recognized by its norleucine resistance and methionine excretion (7). The lysogenic strain was then transformed with the pMAD4 multicopy plasmid to provide a strain with several copies of the wild-type metJ gene.
Purification of the Hybrid Protein, Product of the metJ-lacZ Hybrid Gene-The MC1000 strain transformed by the pIP24 recombinant plasmid was used as a source of the pmetJ-lacZ hybrid protein. The purification is a slight modification of the procedure of Fowler and Zabin (8). Cultures were grown in a 15-liter fermenter in LB broth supplemented with ampicillin (50 μg/ml), and the bacterial pellet was frozen at -30 °C. All subsequent steps were performed in the presence of 1 mM phenylmethylsulfonyl fluoride. Bacterial cells (30 g) were broken by ultrasonic treatment. A fourth step, Sephacryl S300 chromatography in the presence of 6 M urea, was added to the purification scheme described (8). This procedure gave a protein that was pure enough, as judged by polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulfate, to be subjected to automated sequencing.
NH2-Terminal Sequence Determination of the pmetJ-lacZ Hybrid Protein-Approximately 8 nmol of pure pmetJ-lacZ were dialyzed against several liters of 50 mM ammonium bicarbonate. The sample was lyophilized and applied to the spinning cup of a modified SOCOSI-PS100 sequenator. A high-performance liquid chromatography system (Waters) was used to identify the phenylthiohydantoin amino acid derivatives, which were eluted with a methanol gradient from a Merck Lichrospher 60 CH8 super column.
Enzymes and Chemicals-β-Galactosidase was assayed by hydrolysis of o-nitrophenyl-β-D-galactoside according to Miller (9). DNA polymerase large fragment and restriction endonucleases were obtained from New England Biolabs. T4 polynucleotide kinase was from Boehringer. Acrylamide was from British Drug Houses, Ltd., urea and boric acid were from Serva, hydrazine was from Eastman, and dimethyl sulfate was from Aldrich. All other chemicals, mostly from Merck, were analytical grade or purer. Phenol was distilled.
The abbreviations used are: bp, base pairs; kb, kilobase pairs; pmetJ-lacZ, translational product of the metJ-lacZ hybrid gene.
Nucleotide Sequence Determination-The nucleotide sequences were determined by the chemical method of Maxam and Gilbert (10).

RESULTS AND DISCUSSION
DNA Sequence of the metJ Gene-The E. coli metJ gene was cloned together with the three other met genes of the metJBLF cluster in plasmid vectors (3, 13). In the pMAD4 plasmid, metJ was localized in a DNA segment starting at 5.1 kb and ending at 5.7 kb upstream from a single EcoRI site, which has been shown to be near the right (clockwise) end of the bacterial DNA insert containing the met genes (3, 13). Moreover, our previous structural evidence (3) indicates that metJ must lie within a 0.7-kb DNA segment situated upstream from the origin of the metB structural gene.
The complete nucleotide sequence of the metJ gene is presented (Fig. 1) as well as the deduced corresponding protein sequence. The DNA fragments and the restriction sites used in the sequence determination are indicated in Fig. 2. Analysis of the sequence (Fig. 1) shows that an open reading frame begins at position -141 and ends at position 312. Position -141 is localized precisely at 138 bp (counterclockwise) from the origin of the metB gene. As previously published (3), the metJ gene is transcribed in the opposite direction to that of the metBL and metF transcriptional units.
Two in-phase ATG codons (positions -141 and -3) could theoretically be the translational start codon of the metJ structural gene. To determine which ATG is used as the initiator codon, we constructed a metJ-lacZ hybrid gene (see "Materials and Methods"), purified the corresponding hybrid protein, and determined its N-terminal sequence. The result given by automated sequencing, Ala-Glu-Trp-Ser-Gly, allows the unambiguous choice of the ATG at position -3 as the initiator codon of metJ. The gene is thus 312 nucleotides long and encodes a single polypeptide chain of 104 residues with a molecular weight of 11,996. This is in agreement with the analysis of the peptides synthesized under the direction of plasmid genes by the maxicell procedure; the metJ product was thus identified in sodium dodecyl sulfate-polyacrylamide gel as a 12,000-dalton radioactive polypeptide (38). The same low-molecular-weight radioactive peptide was obtained when the maxicell extracts of a strain bearing the pMAD4 plasmid (which carries the metJBLF cluster) were analyzed (14). At the moment, no information is available about the possible oligomeric structure of the metJ protein.
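The frame and length arithmetic behind this assignment can be checked with a short sketch. It assumes, as the coordinates quoted above suggest, that Fig. 1 numbers the first base after the initiator ATG as +1 and has no position 0; the function names are illustrative and do not come from the original work.

```python
def linear_offset(pos):
    """Convert a Fig. 1 style coordinate to a 0-based offset from position +1.

    Assumption: the numbering has no position 0, so -1 is immediately
    followed by +1.
    """
    if pos == 0:
        raise ValueError("position 0 does not exist in this numbering")
    return pos if pos < 0 else pos - 1


def in_phase(pos_a, pos_b):
    """True when two coordinates fall at the same place within the codon frame."""
    return (linear_offset(pos_a) - linear_offset(pos_b)) % 3 == 0


def residues_encoded(first_pos, last_pos):
    """Number of codons in the inclusive span first_pos..last_pos."""
    length = linear_offset(last_pos) - linear_offset(first_pos) + 1
    if length % 3:
        raise ValueError("span is not a whole number of codons")
    return length // 3


# The two candidate ATG codons share a reading frame.
print(in_phase(-141, -3))          # True
# Positions 1 to 312 correspond to the 104 residues of the mature chain.
print(residues_encoded(1, 312))    # 104
```

Under the same assumption, the N-terminal Ala found by sequencing corresponds to the codon at positions 1 to 3, immediately after the ATG at -3.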
The codon usage in the metJ gene is not strictly characteristic of the codon usage found in weakly expressed genes by Grosjean and Fiers (15). Neither an AT-rich sequence followed by a sequence related to CAATCAA, reported (16) to be characteristic of rho-dependent terminators, nor a structure resembling a rho-independent terminator (16) has been found downstream of metJ. However, at positions 346 to 370 and 411 to 440 (Fig. 1), two potential stem and loop structures can be formed. The nucleotide sequence of these structures is very similar to the consensus structure found in several intercistronic regions (17). Moreover, this palindromic DNA segment has recently been reported to occur between operons as well (18), and it has been postulated that it could be involved in the regulation of transcription termination events. Further experiments are needed to confirm whether these DNA segments are indeed responsible for transcription termination of the metJ gene.
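As an illustration of how potential stem and loop structures of this kind can be located, the following minimal sketch scans a sequence for perfect inverted repeats separated by a short loop. The toy sequence is invented for the example and is not the metJ downstream sequence; the stem and loop size limits are arbitrary assumptions, and real analyses usually tolerate mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem=6, min_loop=3, max_loop=8):
    """List perfect inverted repeats as (start, end, left_arm, loop_length).

    Coordinates are 1-based and inclusive. Only exact matches are reported.
    """
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if seq[j:j + stem] == target:
                hits.append((i + 1, j + stem, left, loop))
    return hits


# Hypothetical sequence: a 6-bp stem (GCCGGA / TCCGGC) around a 4-base loop.
toy = "AAAA" + "GCCGGA" + "TTTT" + "TCCGGC" + "AAAA"
for hit in find_inverted_repeats(toy):
    print(hit)
```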
Analysis of the Deduced Protein Sequence and Comparison with Other Regulatory Proteins-Analysis of the protein sequence indicates that, although the total numbers of basic and acidic residues are of the same order, their distribution is very different: 67% of the basic residues are present in the N-terminal half of the protein and 68% of the acidic ones are in the C-terminal half. Also, 8 out of the 15 basic residues are present in tandem, mainly in the N-terminal region of the polypeptide sequence.
Table I, footnote a: The specific activity for GT745 grown in minimal medium was 632 units of β-galactosidase (defined as nanomoles of o-nitrophenol produced per min and per mg of protein). This activity was arbitrarily chosen as 1. The same activities were found in minimal medium supplemented with 5 mM methionine.
Comparison of the amino acid sequences of several prokaryotic regulatory proteins has revealed two discrete regions of homology, region 1 and region 2 (19). The results obtained when wild-type and mutant proteins were compared suggest that the homologous sequences of these two regions could play an important role in the interaction with DNA. Moreover, both regions of homology are located in the known DNA-binding domains of the lacI (for a review, see Refs. 20 and 21), cI (22), and crp (23, 24) products. In the so-called region 1 of homology (19), the most homologous segment is the tetrapeptide T-V(or I)-S(or G)-R, which is totally conserved in the crp, lacI, galR, araC, and lysR products and, to a lesser extent, in the trpR, lexA, and tnpR proteins. When the amino acid sequence of the metJ product is compared with the regulatory protein sequences mentioned, a tripeptide T-V-S characteristic of region 1 is found at positions 25, 26, and 27, corresponding to nucleotides 73 to 81 of Fig. 1. The position of the tripeptide in the N-terminal part of the metJ protein, together with the predominance of basic residues in the same part, may indicate that the N-terminal region of the metJ protein interacts with the DNA target sequence for the met repressor.
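The region 1 pattern and the residue-to-nucleotide correspondence quoted above can be expressed compactly. This is only a sketch: the peptide strings are made up for illustration and are not sequences of the proteins named, and the coordinate helper assumes that residue 1 is encoded by nucleotides 1 to 3.

```python
import re

# T-V(or I)-S(or G)-R tetrapeptide of region 1, in one-letter code.
REGION1 = re.compile(r"T[VI][SG]R")


def find_region1(protein):
    """Return (1-based residue position, matched tetrapeptide) for each hit."""
    return [(m.start() + 1, m.group()) for m in REGION1.finditer(protein)]


def nucleotide_span(first_residue, n_residues=1):
    """Inclusive nucleotide coordinates encoding a run of residues.

    Assumption: residue 1 corresponds to nucleotides 1 to 3.
    """
    start = 3 * (first_residue - 1) + 1
    return start, start + 3 * n_residues - 1


# Hypothetical peptides containing two variants of the motif.
print(find_region1("MKATVSRLL"))
print(find_region1("AATIGRAA"))

# Tripeptide at residues 25 to 27 of the metJ product.
print(nucleotide_span(25, 3))   # (73, 81)
```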
DNA Sequence of Two metJ Mutations-The metJ184 and metJ185 alleles responsible for a derepressed synthesis of the met proteins were isolated, in strain JJ100, by selection for ethionine resistance (12). The pRCG135 and pRCG137 recombinant plasmids were constructed (38) by cloning in pBR322, using the λdmet102 metJ184 and λdmet102 metJ185 transducing bacteriophages, respectively, as sources of the mutant alleles of metJ (12, 25). In addition, it was previously shown that the metJ185 defect must be due to an amber mutation (12).
The nucleotide sequences of the structurally altered metJ genes indicate that, in the pRCG137 DNA, the G at position 8 (Fig. 1) is changed into an A, transforming the Trp codon into an amber stop codon; in the pRCG135 DNA insert, the only modification is situated at position 178 (G → A; Fig. 1), transforming the Ala codon into a Thr codon. These results are consistent with the pattern obtained by analysis of the peptides synthesized under the direction of plasmid genes by the maxicell procedure (38). Indeed, the radioactive polypeptide identified as the metJ product is present in a reduced amount in extracts of maxicells containing the pRCG135 plasmid and absent in those with the pRCG137 plasmid (metJ185am). Identification of the structural change and absence of the gene product confirm the genetic evidence that the metJ185 allele is an amber mutation.
The reduced amount of the metJ product obtained in the pRCG135 maxicell preparation may indicate that the mutation (Ala → Thr) has modified the stability of the metJ polypeptide, perhaps rendering it more accessible to proteolytic degradation.
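The consequences of the two base changes follow directly from the standard genetic code, as the sketch below illustrates. It assumes that coding position 1 is the first base of codon 1; TGG is used for Trp because it is the only Trp codon, whereas the third base of the Ala codon is not stated in the text, so all four GCN codons are tried.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_and_offset(position):
    """Return (codon number, 0-based offset in the codon) for a position >= 1."""
    return (position - 1) // 3 + 1, (position - 1) % 3


def substitute(codon, offset, new_base):
    """Replace the base at the given offset of a three-letter codon."""
    return codon[:offset] + new_base + codon[offset + 1:]


# metJ185: G -> A at position 8, inside the Trp codon.
codon_no, offset = codon_and_offset(8)
amber = substitute("TGG", offset, "A")
print(codon_no, "TGG ->", amber, CODON_TABLE[amber])  # '*' marks a stop codon

# metJ184: G -> A at position 178, inside an Ala codon (GCN).
codon_no, offset = codon_and_offset(178)
for third in BASES:
    wild = "GC" + third
    mutant = substitute(wild, offset, "A")
    print(codon_no, wild, CODON_TABLE[wild], "->", mutant, CODON_TABLE[mutant])
```

Position 8 falls in codon 3, which agrees with Trp being the third residue of the N-terminal sequence Ala-Glu-Trp-Ser-Gly, and every GCN codon becomes an ACN (Thr) codon when its first base is changed to A.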
Regulation of β-Galactosidase in metJ-lacZ Fusion Strains-A metJ-lacZ hybrid gene was constructed (see "Materials and Methods") in order to assess expression of the metJ gene by that of β-galactosidase. If transcription from the metJ promoter can be repressed by the metJ gene product, then introduction of metJ+ alleles should cause repression of β-galactosidase synthesis in the λmetJ-lacZ lysogens. The presence of a single wild-type copy of the metJ gene on the chromosome of the λmetJ-lacZ lysogen was associated with decreased levels of β-galactosidase (factor 2.3), and the introduction of a multiple-copy plasmid bearing the metJ gene was associated with a further decrease (factor 6.5), as determined by enzyme assays. Table I gives data for a λmetJ-lacZ lysogen and metJ+ derivatives for cultures grown in minimal medium with or without 5 mM L-methionine.
It is apparent that the presence of the metJ+ allele brings about a decreased synthesis (factor 2.3 to 6.5) of β-galactosidase from the lacZ gene fused to the metJ promoter, providing strong evidence for autoregulation. The data also suggest that the repression of metJ protein synthesis is insensitive to supplementation of the medium with concentrations of methionine that normally repress the synthesis of the methionine structural genes (26).
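The relation between the repression factors and the reference activity can be made explicit with a small calculation. The unit values printed below are back-calculated from the factors quoted in the text (2.3 and 6.5) and from the reference activity of 632 units defined as 1; they are not measured values taken from Table I, and the function names are illustrative.

```python
REFERENCE_UNITS = 632.0  # specific activity arbitrarily defined as relative activity 1


def relative_activity(units, reference=REFERENCE_UNITS):
    """Activity expressed relative to the reference strain."""
    return units / reference


def repression_factor(units, reference=REFERENCE_UNITS):
    """Fold decrease relative to the reference strain."""
    return reference / units


def units_from_factor(factor, reference=REFERENCE_UNITS):
    """Specific activity implied by a given repression factor."""
    return reference / factor


for label, factor in [
    ("single chromosomal metJ+ copy", 2.3),
    ("multicopy metJ+ plasmid", 6.5),
]:
    units = units_from_factor(factor)
    print(f"{label}: about {units:.0f} units, "
          f"relative activity {relative_activity(units):.2f}")
```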
Autoregulation of regulatory genes appears to be found in a great number of cases. Synthesis of the araC product has been shown to be self-regulated (27), as has that of the lexA gene (28), the λcI gene (29), and the hutC gene of Salmonella typhimurium (30). However, the malT product is not autoregulated (31). Self-repression of the synthesis of repressors of biosynthetic pathways was shown for the first time in the case of the trpR gene (32, 37), and an operator-like sequence in the 5' part of the trpR gene has in fact been identified (32, 33).
In our case, analysis of the 5' part of the metJ gene shows a regulatory region for two divergent transcriptional units (the metJ gene and the metBL operon). The promoter of the metBL operon has been identified and the transcription start is located at position -147 of Fig. 1 (3). The results presented by Duchange et al. (3) show the presence of another promoter activity in the orientation opposite to that of the metBL operon, but do not allow the metJ promoter to be located precisely. This must await the determination of the transcription start of metJ. A plausible operator region (3) was proposed for the metBL operon (positions -174 to -195 of Fig. 1); it is homologous to a DNA segment located in the 5' flanking regions of the metF gene (14) and of the metA gene.2 Since metJ is transcribed on the other DNA strand and in the opposite direction to metB, the putative operator region of the metBL operon could also be that of the metJ gene. An example of a gene cluster consisting of two divergent operons with an internal common operator region has already been described (34). By analogy with the arg regulon, where the homologous operator regions were called "Arg boxes," we propose the name "Met boxes," although no operator-constitutive mutations altering these sequences have yet been reported. It is possible that regulation at the level of transcription is not the only regulation affecting metJ expression.

CONCLUSION
With the information presented in this paper and previously, we are able to give the complete organization of the metJBLF cluster of E. coli. The nucleotide sequences of the metB, metF, and metL genes are known from previous reports from this laboratory (3, 14, 35), and that of metJ is presented here. As indicated in Fig. 3, the three independent transcriptional units are organized as follows: the metJ structural gene, transcribed counterclockwise on the E. coli chromosome (2), occupies 312 bp, and the metBL and metF structural genes (clockwise) occupy 3585 and 888 bp, respectively. A complex regulatory region is found between the metJ and metB structural genes. The metJBLF cluster represents 5735 nucleotides from the last base of the second palindromic unit of metJ (position 440 of Fig. 1) to the last base of the putative rho-dependent terminator of metF (14). The transcription start sites for the metBL operon and for the metF gene were determined. Finally, the results presented here indicate that the metJ regulatory gene is autoregulated.