Regulation of Expression and Nucleotide Sequence of the Escherichia coli dapD Gene*

Regulation of the Escherichia coli dapD gene, which is involved in diaminopimelate and lysine biosynthesis, was unknown because no convenient enzymatic assay was available until recently. This gene was cloned into pBR322 from a λ transducing phage, and its complete nucleotide sequence was established. The sequence shows that the dapD gene is composed of a single cistron encoding a 274-amino acid polypeptide, Mr 30,040. Enzymatic activity measurements show that this gene encodes tetrahydrodipicolinate N-succinyltransferase, which catalyzes the third step of the specific lysine-diaminopimelate pathway. The transcriptional start of the dapD gene was localized; the identified promoter signals are weak compared to the E. coli promoter consensus sequence. The dapD coding sequence is followed by a typical ρ-independent transcriptional termination sequence. A study using an operon fusion constructed in vitro between the dapD promoter and the galK structural gene indicated that dapD gene expression is repressed by lysine; no attenuation-like sequence can be found to account for this regulation. At the present time, out of the 9 genes involved in diaminopimelate and lysine biosynthesis, 6 are known to be lysine regulated.
The 9 genes involved in diaminopimelate and lysine biosynthesis in Escherichia coli are of particular interest because they are scattered on the chromosome. The expression of 5 genes (lysC (1), asd (2), dapB (3), dapE (4), and lysA (5)) is known to be regulated by the intracellular lysine pool (4). There is no evidence that regulation by lysine depends on a common regulatory element. These 5 genes are currently being studied in our laboratory to compare their regulatory sequences.
For two other genes of the regulon, two genetic loci were identified by Bukhari and Taylor (6) in the 4-min region of the chromosome and were arbitrarily called dapC and dapD. These genes should encode, respectively, tetrahydrodipicolinate N-succinyltransferase and succinyldiaminopimelate aminotransferase (7). These enzymes of the lysine-diaminopimelate pathway were previously characterized by Gilvarg and co-workers (8, 9); however, substrates for these enzymes were not readily available, and the exact functional roles of the dapC and dapD loci were not identified.
* This work was supported by the Centre National de la Recherche Scientifique (LA 136) and by a grant from Roussel-UCLAF (for C. H.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed.
§ Present address, Laboratoire de Chimie Bacterienne, Centre National de la Recherche Scientifique, BP 71, 13277 Marseille Cedex 9, France.
In the case of the so-called dapD locus, several mutations are known (6, 10), and this region seems to be quite complex. First, one mutant of this locus (6) has a different phenotype (slow growth on L broth-rich medium even in the absence of diaminopimelate); second, according to a previous report on the cloned dapD gene (11), more than one cistron could be involved (12).
We report here the identification of the dapD gene product and the precise localization of this gene on the chromosomal fragment that we isolated (10) from a λgtλB bacteriophage library constructed by Thomas et al. (13). We also present the nucleotide sequence of the dapD gene and its transcriptional start. The regulation of expression of the dapD gene was studied using an in vitro operon fusion.

EXPERIMENTAL PROCEDURES
Media, Strains, and Plasmids-Bacterial strains and plasmids are listed in Table I. Bacteria were grown either in L broth-rich medium or in 63 minimal medium supplemented with the required metabolites and 0.4% glucose as the carbon source (18). For plasmid-harboring bacteria, appropriate antibiotics were added (ampicillin, 50 µg/ml; tetracycline, 10 µg/ml). For enzymatic studies, repression conditions were obtained by addition of 4 mM lysine during growth. Conditions leading to lysine limitation were obtained in the dapB leaky mutant RDB16 (10) grown in minimal medium. Under these conditions, the doubling time was 4 h (whereas in diaminopimelate and lysine excess, the doubling time was 1 h). Bacteria were harvested after 16 h of growth.
DNA Manipulations-Transformation of E. coli was done according to Davis et al. (19) after treatment of cells with calcium chloride. Large-scale purification of plasmid DNA was done according to Humphreys et al. (20) after chloramphenicol amplification (21). For DNA sequencing, plasmids were further purified on a 5-20% (w/v) sucrose gradient in 10 mM Tris-HCl, pH 8, 1 mM EDTA, and 100 mM NaCl. For rapid analysis of recombinant plasmids, the alkaline lysis procedure of Birnboim and Doly (22) was used.
Restriction enzymes (New England Biolabs and Boehringer Mannheim) were used according to Ref. 19. Ligations were performed with T4 DNA ligase (Boehringer Mannheim) in 20 mM Tris-HCl, pH 7.5, 10 mM MgCl₂, 10 mM dithiothreitol, and 1 mM ATP; incubation was done overnight at 4 °C in 10 µl.
Analytical or preparative electrophoresis was done in 10 mM Tris-borate, pH 8.3, 2.5 mM EDTA on agarose or polyacrylamide gels.
All DNA sequencing was done using the Maxam and Gilbert method (23). Purified plasmid was cut with appropriate restriction enzymes and 5' end-labeled using the exchange reaction of Berkner and Folk (24). After secondary restriction cleavage, the labeled fragments were separated by polyacrylamide gel electrophoresis and electroeluted by the method of McDonel et al. (25). Products of chemical degradation reactions were analyzed with the denaturing gels of Sanger and Coulson (26).
Mapping of the 5' end of transcripts was done by two methods: protection against S1 nuclease (Boehringer Mannheim) degradation (27) and primer extension with reverse transcriptase (gift from M. Yaniv, Institut Pasteur, Paris) (28). Total in vivo RNA was prepared from appropriate strains: either strain RM4102 (14), carrying the chromosomal dapD gene, or the same strain harboring multicopy plasmid pDD2 carrying the dapD gene. Hybridization was done for 3.5 h at 35 °C in the presence of 80% formamide. Enzymatic treatments were then carried out as described by Sollner-Webb and Reeder (28).
Enzyme Assays-Enzymatic determinations were made on cell-free crude extracts. Galactokinase assays were performed according to McKenney et al. (16); β-lactamase specific activities were measured as described in Chenais et al. (29).

RESULTS
Precise Localization and Identification of the dapD Gene-Precise localization of the dapD gene was performed on a 5.5-kilobase pair EcoRI bacterial fragment carrying this gene. This fragment was subcloned in both orientations from a λ bacteriophage into plasmid pBR322 to give plasmids pDD1 and pDD3 (10). Its restriction map is in agreement with a previously reported map of this region (12) except for the rightmost EcoRI site of our chromosomal fragment, which is absent in the λpolC9-transducing phage (12); other results from our laboratory indicate that EcoRI sites on the borders of the insert could be either created (dapD, this work; dapB (30)) or lost (dapA¹), probably due to the EcoRI* conditions used during the λgtλB library construction.
Construction of plasmids pDD2, pDD4, and pDD5 is shown in Fig. 1. These plasmids complement all dapD mutant strains: dapD4 and dapD12 from Bukhari and Taylor (6), and RDD3, RDD7, RDD15, RDD22, and RDD32 from our laboratory (10). These results show that the dapD gene defined by these 7 mutations is localized in the 1300-bp² AluI fragment of pDD5. Enzymatic studies (31) were performed on one of these strains, RDD32, which was found to lack any tetrahydrodipicolinate N-succinylase activity.³ Conversely, crude extracts from this strain harboring plasmid pDD5 were highly enriched in tetrahydrodipicolinate N-succinylase activity as compared to a wild-type strain.³
Nucleotide Sequence of the dapD Gene-The nucleotide sequence of the 1.3-kilobase pair AluI fragment was determined using the method of Maxam and Gilbert (23); the strategy is shown in Fig. 2. The complete sequence on both strands from the HinfI₁ site on the left part of the fragment (bp 1) to the last BstNI site on the right end (bp 1177) is given in Fig. 3. A large open reading frame of 274 triplets is found from ATG (bp 217) to ochre codon TAA (bp 1039); this open reading frame is preceded by a GAG sequence 7 nucleotides upstream from the initiating ATG, which could be part of a ribosome-binding site by complementarity to 16 S RNA (32).
¹ C. Richaud, manuscript in preparation.
² The abbreviation used is: bp, base pair(s).
³ C. Gilvarg, personal communication.
FIG. 2. The arrows indicate the sites for 5' labeling as well as the direction and extent of the sequences (only sites used for labeling are shown). This strategy allowed complete determination on both strands; all restriction sites of the sequence given in Fig. 3 used in labeling experiments were checked by overlapping with other fragments. The heavy line on the restriction map indicates the dapD gene coding part. * indicates the probes used for 5' mRNA mapping by S1 nuclease (S1) or reverse transcriptase (RT).
Codon usage has been compared to that of total tryptophan operon coding sequences (33). [...] peptide in E. coli maxicells harboring plasmid pDD2 (data not shown). Immediately downstream of the TAA nonsense codon, a typical ρ-independent terminator sequence (34) is found in the 3' flanking region: a GC-rich inverted repeat followed by a run of 7 thymidylate residues in the nontemplate DNA strand.
33 bp upstream from the dapD gene initiating ATG (bp 217), a TGA codon (bp 184) closes a reading frame that is open for at least 60 triplets (the limit of the sequence so far determined) and could be the end of an adjacent gene.
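The coordinates quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that each reported position refers to the first base of the codon in question, and the function and variable names are ours, not part of the original work.

```python
# Positions as reported in the text (assumed to be the first base of each codon).
ATG_START = 217      # initiating ATG of dapD
TAA_STOP = 1039      # ochre codon TAA ending the open reading frame
UPSTREAM_TGA = 184   # TGA closing the upstream reading frame
REPORTED_MR = 30040  # reported relative molecular mass of the polypeptide


def orf_summary(start, stop):
    """Return (coding length in bp, number of sense codons) for an open
    reading frame whose start and stop codons begin at `start` and `stop`."""
    coding_bp = stop - start  # the stop codon itself is not counted
    if coding_bp % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_bp, coding_bp // 3


coding_bp, codons = orf_summary(ATG_START, TAA_STOP)
gap_to_upstream_stop = ATG_START - UPSTREAM_TGA
mean_residue_mass = REPORTED_MR / codons

print(coding_bp, codons, gap_to_upstream_stop, round(mean_residue_mass, 1))
```

Under these assumptions the coding region is 822 bp, which corresponds to the 274 triplets reported, and the upstream TGA lies 33 bp before the ATG.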
Determination of the Transcriptional Start-The 5' end of the dapD messenger RNA has been localized by nuclease S1 mapping assay (24) and by the reverse transcriptase extension method (25). The strategy followed is shown in Fig. 2. As may be seen in Fig. 4, both methods localize the transcriptional start at nucleotides G and A (bp 187 and 185). Thus, a leader sequence of only 29-31 nucleotides is found before the translational initiation ATG (bp 217).
Upstream of the transcriptional start, the perfect "-35" RNA polymerase recognition site TTGACA (34) is found at the correct distance. As for the other important recognition sequence, the "Pribnow box" (35) (ideally TATAAT), two weak signals are found: either AACGAT (with 3 matches out of 6), 15 bp from the "-35" sequence and 11 nucleotides from the mRNA start, or GATAAA (with 4 matches out of 6), 18 bp from the "-35" sequence and 8 nucleotides from the mRNA start.
FIG. 4. Identification of the start point for dapD gene transcription. The protected fragments after treatment with S1 nuclease or reverse transcriptase (RT) are shown along with the sequencing reaction products of the HinfI₂-HincII fragment (see Fig. 2).
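The comparison of the two candidate "-10" hexamers with the ideal Pribnow box can be reproduced by simple position-by-position counting. The following is a minimal sketch, not the scoring method of Mulligan et al.: it only counts identities with TATAAT and compares each reported spacer with the 17-bp optimum mentioned in the Discussion. The dictionary layout and function names are illustrative assumptions.

```python
CONSENSUS_MINUS_10 = "TATAAT"  # ideal Pribnow box quoted in the text
OPTIMAL_SPACER = 17            # bp between "-35" and "-10" boxes, per the Discussion

# Candidate hexamers with the distances reported in the text:
# spacer = bp from the "-35" sequence, to_start = nucleotides to the mRNA start.
CANDIDATES = {
    "AACGAT": {"spacer": 15, "to_start": 11},
    "GATAAA": {"spacer": 18, "to_start": 8},
}


def count_matches(hexamer, consensus=CONSENSUS_MINUS_10):
    """Count positions at which the hexamer is identical to the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a == b for a, b in zip(hexamer, consensus))


summary = {}
for hexamer, info in CANDIDATES.items():
    summary[hexamer] = {
        "matches": count_matches(hexamer),
        # negative = shorter than optimal, positive = longer than optimal
        "spacer_offset": info["spacer"] - OPTIMAL_SPACER,
        # bp from the end of the "-35" box to the mRNA start
        "span_to_start": info["spacer"] + len(hexamer) + info["to_start"],
    }

for hexamer, row in summary.items():
    print(hexamer, row)
```

Both candidates give the same span between the "-35" box and the mRNA start (15 + 6 + 11 = 18 + 6 + 8 = 32), so the two sets of reported distances are mutually consistent.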
Regulation of Expression of the dapD Gene-To study possible regulation of the expression of the dapD gene, an in vitro fusion with the galK gene was constructed. Plasmid pKO4, carrying a promoterless galK gene, was designed by McKenney et al. (16) to analyze DNA fragments promoting procaryotic transcription. Such fragments can be introduced into unique sites, which places the galK-coding region under the control of the promoter present in the inserted fragment. It is assumed that the amount of galactokinase produced is linearly related to the transcriptional efficiency of the inserted promoter (16).
Restriction sites upstream from and inside the dapD gene did not allow direct construction of an operon fusion with the galK gene. A two-step procedure was therefore used (see Fig. 5). The constructed fusion, pCJ5, contains a 706-bp RsaI fragment beginning 370 bp before the translational initiation ATG and 305 bp before the "-35" transcriptional recognition sequence.
Promoter expression and regulation were further characterized by quantitative measurements of galactokinase synthesis in vivo. Enzyme levels were determined in two strains and in different growth conditions ranging from lysine excess to partial limitation. Results presented in Table II show that expression of the dapD gene as measured with the dapD-galK plasmidic fusion pCJ5 is subject to lysine regulation. A 5-fold difference could reproducibly be obtained between lysine excess and lysine limitation.
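The way such a fold difference is derived from the raw assays can be sketched as follows. According to the footnote to Table II, galactokinase specific activities are normalized to plasmid copy number, which is estimated from β-lactamase specific activity (29). The exact normalization formula is not given in the text, so the sketch assumes that copy number is simply proportional to β-lactamase activity relative to a reference value. All numerical values are hypothetical and chosen only for illustration; they are not data from Table II.

```python
from dataclasses import dataclass


@dataclass
class Extract:
    """Enzyme activities measured in one crude extract (hypothetical values)."""
    galk: float  # galactokinase specific activity (nmol/min/mg of protein)
    bla: float   # beta-lactamase specific activity, used as a copy-number proxy


def normalized_galk(extract, bla_reference):
    """Galactokinase activity per relative plasmid copy.

    Assumption: copy number is proportional to beta-lactamase activity,
    expressed relative to `bla_reference`.
    """
    if extract.bla <= 0 or bla_reference <= 0:
        raise ValueError("beta-lactamase activities must be positive")
    relative_copy_number = extract.bla / bla_reference
    return extract.galk / relative_copy_number


def fold_regulation(limited, excess, bla_reference):
    """Ratio of normalized galK expression: lysine limitation over lysine excess."""
    return (normalized_galk(limited, bla_reference)
            / normalized_galk(excess, bla_reference))


# Hypothetical measurements, for illustration only.
lysine_excess = Extract(galk=40.0, bla=2.0)
lysine_limited = Extract(galk=150.0, bla=1.5)

fold = fold_regulation(lysine_limited, lysine_excess, bla_reference=1.0)
print(f"fold regulation: {fold:.1f}")
```

Normalizing before taking the ratio matters because plasmid copy number may itself differ between growth conditions; without it, a change in copy number could be mistaken for a change in promoter activity.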

TABLE II footnotes
a Specific activities are expressed as nanomoles of galactose 1-phosphate produced/min/mg of protein in the crude extract, normalized to the plasmid copy number; plasmid copy number was calculated from β-lactamase specific activities as described in Chenais et al. (29).
b Growth conditions. In this dap leaky strain, minimal medium leads to lysine limitation with a doubling time of 4 h. Lysine limitation is monitored by derepression of lysine-sensitive aspartokinase; these conditions are only 30% derepressive as compared to the maximal level of aspartokinase found in a chemostat.

DISCUSSION
We report here the nucleotide sequence of the E. coli dapD gene located in the 4-min region of the chromosome (6, 7). The nucleotide sequence shows one open reading frame of 822 bp beginning with ATG (preceded by GAG) and terminated by TAA. The transcription start has been identified 30 ± 1 bp before the translational start, and a very strong putative signal of transcriptional termination was revealed immediately downstream from the translational end triplet. Comparison of our results with those of Bendiak and Friesen (12) reveals some discrepancies: our sequencing and transcription mapping data unambiguously identify one cistron between the HincII site and AluID (see Figs. 2 and 3), while Bendiak and Friesen argue for the existence of two complementation groups of Dap⁻ mutants located on both sides of the PstI site (12). Our recombination experiments (data not shown) with different deleted plasmids show that the dapD4 mutation is indeed localized on the right side of the PstI site (see Fig. 2) but cannot be complemented by a fragment containing only the right part of the bacterial fragment.
As already noticed by Bukhari and Taylor (6), the dapD12 mutant is different from all other dapD mutants. Our experiments (data not shown) localized this mutation in the 5' region of the gene. Slow growth on LB-rich medium in the absence of diaminopimelate may account for the leakiness of this mutant. Though phenotypically different from all other dapD mutants, it is located in the dapD gene, which actually consists of only one cistron. This gene encodes the tetrahydrodipicolinate N-succinyltransferase³ which catalyzes the third step of the specific lysine-diaminopimelate pathway, and not the succinyldiaminopimelate aminotransferase as was previously arbitrarily assumed (6, 7).
Results obtained with the in vitro fusion between the dapD promoter and the galK structural gene allow the conclusion that expression of the dapD gene is regulated by lysine. The 5-fold range of regulation observed between lysine excess and lysine limitation (only partial in our experiments) may reflect a larger phenomenon; we cannot exclude the possibility that a regulatory element present in limiting amount is involved. Experiments are in progress to obtain single-copy chromosomal fusions for further study of dapD expression.
Thus, out of the 7 genes of the lysine pathway studied so far in our laboratory (4), only the expression of dapA is not subject to repression by lysine. As for the other genes of the same regulon, examination of the sequence of the dapD gene does not reveal any of the characteristics of an attenuation-like structure (36).
The sequence immediately upstream from the transcriptional start may be compared to the consensus promoter sequence determined by Rosenberg and Court (34) and recently re-examined (37). The perfect consensus "-35" region TTGACA can be found at the correct distance from the transcriptional start. Downstream from this recognition sequence for RNA polymerase, we can identify two possible "Pribnow boxes" (35), AACGAT and GATAAA; both "-10" regions are far from the consensus sequence and not at the optimal distance of 17 bp from the "-35" sequence (37, 38). The strength of these promoter sequences was determined by the computing method of Mulligan et al. (39); scores of 43.8 and 44.8%, respectively, were found, which is under the limit of effective promoters. Thus, the dapD gene is expected to be poorly expressed.
Some of the data presented here do not fulfill this prediction: (i) galactokinase expression from plasmid pCJ5 carrying a hybrid dapD-galK operon is relatively high compared with data from "ideal" promoters constructed by De Boer et al. (40) or with the asd gene promoter, which is known to be well expressed⁴; (ii) when mapping the dapD gene transcriptional start, protected hybrids could be readily detected even with mRNA extracted from bacteria harboring only one copy of the dapD gene per genome (see Fig. 4). To reconcile this high in vivo expression of the dapD gene with its weak promoter sequence, we propose an activation mechanism like the one demonstrated for the lysA gene (41) and suggested for the dapB gene (30). Activation of the dapD gene promoter could be modulated by the internal lysine pool.
Experiments are currently in progress in our laboratory to answer this question.