The Primary Structure of Escherichia coli L-Threonine Dehydrogenase*

The complete primary structure of Escherichia coli L-threonine dehydrogenase has been deduced by sequencing the cloned tdh gene. The primary structure so determined agrees with results obtained independently for the amino acid composition, the N-terminal amino acid sequence (20 residues), and a short sequence at the end of an internal peptide of the purified enzyme. The presence of a predicted Asp-Pro bond at residues 148 and 149 was confirmed by treatment of purified threonine dehydrogenase with dilute acid and subsequent analysis of the resulting cleavage products. enzyme a zinc-containing long-chain dehydrogenase

The catabolic pathway initiated by threonine dehydrogenase is a major pathway for threonine degradation in Escherichia coli under many growth conditions (Fig. 1). Low levels of this enzymatic activity can be detected under all growth conditions so far examined (1-3). Leucine is an inducer of threonine dehydrogenase (2, 3), and mutations that lead to elevated activity of this enzyme enable cells to grow on threonine as a carbon source (1, 4). In contrast, the other threonine-degrading enzymes show little or no activity under most growth conditions; threonine aldolase has not been detected in cell extracts (3), biosynthetic threonine dehydratase activity is inhibited by small amounts of isoleucine (5, 6), and in order to detect biodegradative threonine dehydratase activity cells must be grown anaerobically or in the absence of glucose (2, 7).
The nucleotide sequence reported in this paper has been submitted to the GenBank/EMBL Data Bank with accession number X06690.
Threonine dehydrogenase from E. coli has recently been purified to homogeneity and extensively characterized (23, 24). In this paper, the nucleotide sequence of the structural gene (tdh) for E. coli threonine dehydrogenase is reported.
The presence of an acid-labile Asp-Pro bond, predicted from the deduced amino acid sequence, is demonstrated, and the amino acid sequence of threonine dehydrogenase is compared with that of other dehydrogenases.

Materials
Formic acid (98%) and hexafluoroacetone·3H2O were obtained from Aldrich. Formic acid (88%) was from J. T. Baker Chemical Co.
Fig. 1. Threonine dehydrogenase-initiated pathway for threonine utilization. Threonine is converted to glycine and acetyl-CoA in sequential reactions catalyzed by the gene products of the tdh operon, threonine dehydrogenase (tdh) and 2-amino-3-ketobutyrate CoA ligase (kbl). Glycine can then be converted to serine by serine transhydroxymethylase (glyA) to complete an alternate pathway for serine biosynthesis.
isolated by the alkaline lysis procedure of Ish-Horowicz and Burke (25). Plasmid and replicative form DNA were purified by equilibrium banding in CsCl gradients in the presence of ethidium bromide. Small preparations of plasmid and replicative form DNA were made by a scaled-down version of the alkaline lysis procedure. Cells were transformed by the method of Cohen et al. (26). DNA fragments used for cloning were purified electrophoretically on agarose or acrylamide gels. The appropriate fragment was cut from the gel after visualization with ethidium bromide and electroeluted as described by Maniatis et al. (27). Single-stranded M13 templates were prepared by the procedure of Sanger et al. (28).

Methods
DNA Sequence Analysis-DNA sequences were determined by the method of Sanger et al. (28) modified for the use of [α-35S]dATP as the labeling nucleotide (29). Recombinant M13 phage were used as templates.
Alkylation of Threonine Dehydrogenase with 14C-Labeled Iodoacetic Acid-Guanidine HCl and EDTA (final concentrations of 6 and 5.6 mM, respectively) were added to a sample of the enzyme, and the pH of this mixture was adjusted to 8.8 with triethylamine. The solution was then saturated with argon, and a 10-fold molar excess (with respect to protein thiols) of dithiothreitol was added. The sample was again briefly sparged with argon. Reduction of disulfides was allowed to proceed with stirring for 3 h at 25 °C. A 10-fold molar excess (with respect to dithiothreitol and protein thiols) of solid [1-14C]iodoacetic acid was finally added. The sample was again sparged with argon, and alkylation was allowed to proceed for 12 h in the dark at 37 °C with constant stirring. The reaction was terminated by adding a 10-fold molar excess (with respect to iodoacetic acid) of 2-mercaptoethanol. Thereafter, the mixture was dialyzed exhaustively against distilled water and then lyophilized to dryness; the alkylated enzyme preparation contained 300,000 cpm/mg of protein.
Acid Cleavage Conditions-For the results presented here, cleavage of Asp-Pro linkages in threonine dehydrogenase was achieved by two procedures. In one, hexafluoroacetone and 0.11 N HCl were added sequentially to a lyophilized sample (ranging from 300 µg to 2 mg) of the carboxymethylated protein so that the final solution contained 10% (v/v) hexafluoroacetone, 0.1 N HCl, and 1 mg/ml of protein.
After the vials containing this mixture were flushed with N2, they were sealed and incubated at 110 °C for 0, 5, 10, 20, or 40 min. Hydrolysis was terminated by rapidly cooling the solutions in ice. In the other procedure, the protein was placed in a solution (pH adjusted to 2.5 with pyridine) containing 7 M guanidine·HCl, and these solutions were incubated for 0, 24, 48, and 96 h at 40 °C. Thereafter, the samples were first neutralized by adding 10 M NaOH, then dialyzed extensively against distilled water, and finally lyophilized to dryness.
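The final concentrations quoted for the hexafluoroacetone procedure follow from simple dilution arithmetic. The sketch below is only a check under stated assumptions: that volumes are additive and that the 0.11 N HCl makes up the 90% of the final volume not occupied by hexafluoroacetone. The function names are illustrative and do not come from the paper.

```python
def final_concentrations(hcl_stock_n=0.11, hfa_fraction=0.10):
    """Return (final HCl normality, final hexafluoroacetone v/v fraction).

    Assumes additive volumes: the HCl stock supplies the remaining
    (1 - hfa_fraction) of the final volume.
    """
    hcl_fraction = 1.0 - hfa_fraction
    return hcl_stock_n * hcl_fraction, hfa_fraction


def volume_for_protein(protein_mg, target_mg_per_ml=1.0):
    """Final volume (ml) needed to bring a sample to the target protein concentration."""
    return protein_mg / target_mg_per_ml


hcl_n, hfa = final_concentrations()
print(f"{hcl_n:.3f} N HCl, {hfa:.0%} hexafluoroacetone")
print(f"{volume_for_protein(0.3):.1f} ml for a 300 µg sample")
```

Under these assumptions the 0.11 N stock dilutes to about 0.1 N, which matches the stated final acid concentration.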
Cleavage of threonine dehydrogenase was monitored by subjecting samples of the acid digests to SDS-polyacrylamide gel electrophoresis according to the method of Weber and Osborn (30) except that a solution of 10% trichloroacetic acid containing 3% sulfosalicylic acid replaced the usual fixative. Myoglobin (Mr = 17,200), lysozyme (Mr = 14,300), soybean trypsin inhibitor (Mr = 20,100), and trypsinogen (Mr = 24,000) served as standard proteins. Radioactive cleavage products of threonine dehydrogenase were detected on polyacrylamide gels by autoradiography on Kodak XAR film; protein bands were visualized by staining with Coomassie Brilliant Blue and destaining with a mixture of 7.5% acetic acid, 10% methanol.
Peptide Purification and Sequencing Procedures-Peptide purification and the chromatography of phenylthiohydantoin-amino acid derivatives were accomplished with a Waters Associates (Milford, MA) HPLC system consisting of two M510 pumps, a U6K injector, a model 441 detector, and a model 680 gradient controller; data were plotted and analyzed with the use of a Hewlett-Packard Co. integrator (model 3390A). Manual Edman sequencing of selected peptides was performed by the thin film method described by Tarr (31); phenylthiohydantoin-amino acid derivatives were separated and analyzed as reported in earlier studies (32). Sequencing of Peptide 2, which was anticipated to have an N-terminal proline residue, was carried out after all free α-amino groups in an acid digest were blocked by reaction with o-phthalaldehyde (this derivatized all nonprolyl N termini). For this purpose, 1 mg of carboxymethylated threonine dehydrogenase was cleaved by incubating in 1 ml of 0.1 N HCl containing 10% hexafluoroacetone (as described earlier). The sample was then lyophilized to dryness, and the residue was subsequently dissolved in 1 ml of 88% formic acid; this solution was transferred to a sequencing tube and dried under vacuum. The reaction with o-phthalaldehyde was then carried out as described (32) except that 20 µl of the o-phthalaldehyde reagent were added to the sample.
Lyophilized samples of threonine dehydrogenase that had been subjected to mild acid cleavage were dissolved in a minimal volume of 88% formic acid. These solutions were then diluted to 20% formic acid with Solvent A (0.1% trifluoroacetic acid in distilled water) and filtered through a Millex HV4 0.45-µm filter (Millipore). Acid cleavage products were separated at room temperature by injecting 3-30 nmol of a given sample onto a Beckman C3 Ultrapore HPLC column (4.6 mm × 7.5 cm) equilibrated with Solvent A. Peptide fragments were eluted with a nonlinear gradient of 100% Solvent A to 30% Solvent A, 70% Solvent B, where B consisted of 0.1% trifluoroacetic acid in a mixture of 25% 2-propanol, 75% acetonitrile. For determination of its N-terminal amino acid sequence, a sample of pure threonine dehydrogenase (2.3 mg of protein in 1.5 ml) was dialyzed exhaustively against water and then lyophilized to dryness. The first 20 amino acid residues of threonine dehydrogenase were determined by the University of Michigan Protein Sequencing Facility using an ABI model 470A gas phase Sequencer with the standard ABI software program OPNRUN, version 20.

RESULTS
Nucleotide Sequence of the tdh Gene-The tdh+ gene cloned within a multicopy plasmid reverses the nutritional phenotype of glyA tester strains (9). The tdh gene lies within a 3.6-kilobase EcoRI fragment situated at coordinate 81.2 of the standard E. coli map (33). The kbl gene is immediately upstream of the tdh gene within the same EcoRI fragment (9, 11). The tdh and kbl genes are transcribed from a common promoter (see Fig. 2).
The tdh gene was localized within the EcoRI fragment by insertional mutagenesis (9). Segments of the EcoRI fragment predicted to contain the tdh gene were subcloned into the M13 family of vectors. Subsequent dideoxy sequencing (   the protein encoded within the ORF agree with the known properties of purified threonine dehydrogenase (Table I).
Furthermore, the predicted N-terminal amino acid sequence agrees completely with the sequence of the N-terminal 20 residues determined for the purified enzyme using automated Edman degradation (Fig. 3). The ORF must, therefore, encode threonine dehydrogenase.
The tdh gene shows a codon bias that is similar to that of the upstream kbl gene. Such a codon bias is typical of many highly expressed E. coli genes (34) notwithstanding the fact that the tdh operon is weakly transcribed under most growth conditions (data not shown). Downstream of the tdh gene is a nucleotide sequence that strongly resembles the consensus ρ-independent transcription termination signal (35) (see Fig. 3). Thus, the EcoRI fragment contains an operon of at least two genes whose expression is associated with the operation of one pathway for threonine catabolism. Downstream of the proposed ρ-independent transcription termination site lies another unidentified ORF which continues through the end of the EcoRI fragment. The orientation of this ORF is the same as that of the kbl and tdh genes. It is not known whether the corresponding gene product plays a role in threonine catabolism.
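A codon bias of the kind mentioned above is usually assessed by first tallying how often each codon occurs in an ORF. The following is a minimal sketch of such a tally; the function name and the short ORF are hypothetical illustrations, and the ORF is not the tdh sequence.

```python
from collections import Counter


def codon_usage(orf):
    """Count in-frame codons of an ORF given as a DNA string (reading frame 1)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return Counter(orf[i:i + 3] for i in range(0, len(orf), 3))


# Hypothetical 6-codon ORF used only to exercise the function.
toy_orf = "ATGAAAGCGCTGTCTAAA"
for codon, count in sorted(codon_usage(toy_orf).items()):
    print(codon, count)
```

Comparing such counts between two genes, or against a reference set of highly expressed genes, is one simple way to judge whether their codon preferences are similar.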
Selective Cleavage of an Asp-Pro Bond-Peptide bonds in proteins between the α-carboxyl group of aspartic acid and the imino group of proline have been shown to be uniquely susceptible to cleavage at low pH. For some proteins, cleavage of this bond is highly specific and nearly quantitative under mild conditions of temperature and pH; in other instances, however, more drastic conditions are required and result in only partial hydrolysis with much less specificity. Nucleotide-sequencing data indicate the presence of an Asp-Pro bond between residues 148 and 149 of E. coli threonine dehydrogenase. Attempts were made to confirm the presence of this linkage in the enzyme by selective acid cleavage. Seven different protocols were tested to accomplish optimal cleavage of E. coli threonine dehydrogenase by mild acid; the same general pattern of protein bands was seen in every case. Typical results of such experiments are shown in Fig. 4, A and B. Three major bands attributable to threonine dehydrogenase, Peptide 1 (residues 1-148), and Peptide 2 (residues 149-341) were always evident, as were fainter bands corresponding to fragments attributable to nonspecific bond cleavage. The extent of random cleavage varied with the hydrolysis conditions used, as illustrated in Fig. 4, A and B, but in every instance the Rf values (and thus the molecular weights) of the major bands were the same. As can be seen, the intensity of the threonine dehydrogenase band diminished progressively with increasing time of acid hydrolysis, while those bands designated Peptides 1 and 2 became correspondingly darker. Random bond breakage with the appearance of numerous smaller peptides (Mr < 10,000) increased with time of incubation. We were surprised to observe that simple exposure of this enzyme to mild acid conditions at room temperature (for ~10 min, sample-handling time) caused some partial cleavage (see lane 1, Fig. 4A); protein fragments detected under such conditions were not seen when recently purified samples of the dehydrogenase (prepared and stored in 50 mM Tris·HCl buffer, pH 8.4) were immediately subjected to SDS-polyacrylamide gel electrophoresis.
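Predicting the fragments expected from cleavage at Asp-Pro bonds amounts to scanning a sequence for the dipeptide D-P and splitting between the two residues. A minimal sketch follows. The function name is illustrative, and the toy sequence is hypothetical: it joins the N-terminal residues Met-Lys-Ala-Leu-Ser and the residues Pro-Phe-Gly-Asn-Ala reported in this paper with made-up linker residues, and is not the real enzyme sequence.

```python
def asp_pro_fragments(sequence):
    """Split a one-letter protein sequence at every Asp-Pro (D-P) bond.

    Returns a list of (first_residue, last_residue, fragment) tuples with
    1-based residue numbers; Asp ends one fragment and Pro starts the next.
    """
    seq = sequence.upper()
    fragments = []
    start = 0
    for i in range(len(seq) - 1):
        if seq[i] == "D" and seq[i + 1] == "P":
            fragments.append((start + 1, i + 1, seq[start:i + 1]))
            start = i + 1
    fragments.append((start + 1, len(seq), seq[start:]))
    return fragments


# Hypothetical toy sequence with a single Asp-Pro bond.
for first, last, frag in asp_pro_fragments("MKALSKDPFGNAK"):
    print(first, last, frag)
```

Applied to a full-length sequence with one Asp-Pro bond, the same scan would return two fragments analogous to Peptides 1 and 2.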
Peptide bonds in proteins involving the α-carboxyl group of aspartic acid are the most acid-labile. Of these, the Asp-Pro linkage is generally the most sensitive (36). give Peptide 1 and a small fragment of Mr = 8,000; this would explain our observation that after a given period of treatment at low pH, the band corresponding to Peptide 1 is often more intense than that for Peptide 2. Attempts to purify Band B peptide have failed as it consistently coeluted with native dehydrogenase in HPLC.
Exposure of the gel shown in Fig. 4B to x-ray film clearly showed that the level of radioactivity present in threonine dehydrogenase alkylated with [14C]iodoacetate decreased with time of acid digestion, while the level in Peptide 1 correspondingly increased (see Fig. 4C). Similar results were obtained when the carboxymethylated enzyme was treated with 0.1% trifluoroacetic acid at 110 °C for 0-40 min (data not shown). The faint level of radioactivity associated with Band B peptide can be explained by a ready cleavage of the Asp-Val bond between residues 222 and 223 as suggested above. When the apparent molecular weights of Peptides 1 and 2 were estimated from their mobilities on SDS-polyacrylamide gels, values near those calculated from their amino acid compositions were obtained (18,300 and 22,200, respectively, for Peptides 1 and 2 versus 16,300 and 21,000 from their amino acid composition). Since the migration of low molecular weight peptides is often somewhat variable (30) and these two peptide fragments did have the N-terminal amino acid sequence expected (see the following results), the small discrepancy in their estimated molecular weights is not considered significant.
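Apparent molecular weights are commonly estimated from SDS gels by fitting log10(Mr) of the standard proteins against their relative mobility and interpolating the unknown band. The sketch below illustrates that calculation. The Mr values of the four standards are those listed under Methods, but every mobility value is a hypothetical placeholder, since the paper does not report mobilities; the function names are likewise illustrative.

```python
import math


def fit_log_mr(standards):
    """Least-squares fit of log10(Mr) = slope * mobility + intercept.

    standards: list of (relative_mobility, Mr) pairs.
    """
    xs = [rf for rf, _ in standards]
    ys = [math.log10(mr) for _, mr in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mr(mobility, slope, intercept):
    """Interpolate an apparent Mr from a relative mobility."""
    return 10 ** (slope * mobility + intercept)


# Mr values from the text; the mobilities are HYPOTHETICAL.
standards = [(0.40, 24000), (0.50, 20100), (0.58, 17200), (0.68, 14300)]
slope, intercept = fit_log_mr(standards)
print(round(estimate_mr(0.55, slope, intercept)))  # band at a hypothetical mobility
```

Because small peptides often migrate anomalously, estimates obtained this way are approximate, which is consistent with the modest discrepancies noted above.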

N-terminal Sequence of Peptides 1 and 2-When a crude acid digest of carboxymethylated threonine dehydrogenase was fractionated by HPLC on a C3 Ultrapore column, three major protein peaks that coincided with three peaks of radioactivity were detected. No significant radioactivity was observed anywhere else throughout the chromatographic run. By SDS-polyacrylamide gel electrophoresis, one of the three protein/radioactive peaks was shown to be identical with the native carboxymethylated dehydrogenase; a second was found to be a mixture of the carboxymethylated enzyme, Peptide 2, and Band B peptide; and the third comigrated with Peptide 1. Autoradiography of these gels confirmed the identities assigned.
Manual Edman sequencing (five cycles) of purified Peptide 1 gave the following sequence: Met-Lys-Ala-Leu-Ser; this is the N-terminal amino acid sequence of E. coli threonine dehydrogenase. Peptide 2 could not be obtained in pure form; it coeluted with carboxymethylated threonine dehydrogenase in each of eight different HPLC gradient systems tested. As a consequence, a crude acid digest was first treated with o-phthalaldehyde to block all primary amino groups, and this mixture was then subjected to manual sequencing. The following sequence corresponding to residues 149-153 of native threonine dehydrogenase was obtained in good yield in each of the five cycles: Pro-Phe-Gly-Asn-Ala.

DISCUSSION
Threonine Dehydrogenase-Our results with purified threonine dehydrogenase support the DNA-sequencing data indicating the presence of an Asp-Pro bond that is subject to hydrolytic cleavage at low pH. The enzyme also appears to have a second acid-labile bond (perhaps that between Asp-222 and Val-223). Under the hydrolysis conditions examined, trace quantities of as many as 30 peptides were detected; these are assumed to be products of random cleavage at the carboxyl side of other aspartyl bonds in the molecule. The level of such breakage was to some extent dependent on the conditions used, but no general trend was evident.
In addition to confirming the presence of the Asp-Pro linkage in E. coli threonine dehydrogenase, it had been our hope that selective, quite specific cleavage of this bond would be a convenient method for breaking the protein roughly in half, thereby significantly enhancing the capability of isolating, sequencing, and assigning the location of "active site" peptides in the molecule. The results presented make such an application quite unlikely. Only a small fraction of the carboxymethylated enzyme is cleaved at the Asp-Pro bond during a typical 40-min incubation in mild acid; most of the protein remains intact and, at best, the yield of Peptide 1 or 2 is only about 5-10% of the starting material. Furthermore, numerous other peptides are formed by random bond cleavage; although they are present in lesser amounts, their yield also increases with longer reaction times.
Homology of L-Threonine Dehydrogenase from E. coli with Other NAD+-dependent Dehydrogenases-The NAD+-dependent dehydrogenases constitute a large group of enzymes of which several have been extensively characterized catalytically and structurally. For example, both amino acid sequence and x-ray crystallographic data are available for lactate dehydrogenase, malate dehydrogenase (cytosolic and mitochondrial), glyceraldehyde-3-phosphate dehydrogenase, glutamate dehydrogenase, and horse liver alcohol dehydrogenase (37). In addition, amino acid sequence information has been published for many other enzymes in this group. These dehydrogenases as a class are structurally related as they share a common NAD+-binding domain (38). With the primary structure of E. coli threonine dehydrogenase now determined, we find this enzyme has considerable sequence homology with the zinc-containing long-chain alcohol/polyol family of dehydrogenases.
Protein amino acid sequence databases were searched through BIONET by the FASTP program of Lipman and Pearson (39) for proteins homologous with E. coli threonine dehydrogenase. The same program was used to obtain an initial alignment of the amino acid sequences of threonine dehydrogenase and sheep liver sorbitol dehydrogenase. With this done, the sequences of threonine dehydrogenase, sheep liver sorbitol dehydrogenase, horse liver alcohol dehydrogenase, maize alcohol dehydrogenase, and yeast alcohol dehydrogenase were simultaneously aligned so as to optimize conservation of residues invariant throughout the alcohol/polyol dehydrogenase family (40) while also maximizing matches between threonine dehydrogenase and the other enzymes. In general, the same relative alignment of the alcohol dehydrogenases and sorbitol dehydrogenase, as previously published (40), was retained. Occasionally, relative alignments were changed to accommodate the sequence data for threonine dehydrogenase. The horse liver alcohol dehydrogenase numbering system is used in this paper; residue numbers given are relative to those assigned the horse liver alcohol dehydrogenase sequence. Fig. 5 shows the results of such amino acid sequence alignments. As is evident, many residues are conserved in all five of these enzymes, while several others are shared among four of the five. The amino acid sequence of threonine dehydrogenase shows between 25 and 28% identity with those of the other four proteins. Threonine dehydrogenase is most similar to sorbitol dehydrogenase and yeast alcohol dehydrogenase (28 and 27% identity, respectively), while it has 25% identity with the sequences of both horse liver alcohol dehydrogenase and maize alcohol dehydrogenase. Furthermore, many of the nonidentical residues are conservative replacements making the overall sequence similarity higher than that suggested by the percent identity between molecules.
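Percent identity between two aligned sequences can be computed by counting matching residues over the aligned positions. The sketch below is one such calculation, under an explicit assumption: positions where either sequence has a gap are excluded from the denominator. The paper does not state which convention was used for its 25-28% figures, and the two short aligned strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are skipped (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, for illustration only.
print(f"{percent_identity('MKAL-SKD', 'MRALGS-D'):.1f}% identity")
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give somewhat different values, so the denominator should always be stated alongside the percentage.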
Jörnvall et al. (40) recently published an alignment of 16 different alcohol dehydrogenases together with sheep liver sorbitol dehydrogenase. They called this group of enzymes the zinc-containing long-chain alcohol/polyol dehydrogenases; all are polypeptides of approximately 350 amino acids (41) and require zinc for catalytic activity (40). Among these 17 enzymes, 22 amino acid residues are strictly conserved. Another 13 residues are highly conserved but not invariant. Examination of the amino acid sequence alignment presented here shows that of these 22 invariant residues, 20 are also found in threonine dehydrogenase (Glu-35 of horse liver alcohol dehydrogenase is replaced conservatively with Asp in threonine dehydrogenase, while threonine dehydrogenase has a Leu in place of Pro-31 of horse liver alcohol dehydrogenase). Of the 13 highly conserved residues in these 17 alcohol/polyol dehydrogenases, 6 are shared by threonine dehydrogenase, and 4 others are replaced conservatively. These homologous residues are spread throughout the length of the threonine dehydrogenase molecule and include many amino acids known to have important structural and/or catalytic roles in horse liver alcohol dehydrogenase.
Fig. 5. [...] type signifies sequence identities between threonine dehydrogenase and at least one of the other proteins. Residues known to be zinc ligands in liver alcohol dehydrogenase have been underlined. Residues conserved through the 17 alcohol/sorbitol dehydrogenases previously aligned (40) have been indicated with an arrow; those that are highly conserved but not invariant among these enzymes are identified by a triangle. Dashes signify gaps introduced to optimize homology. The sequence has been numbered as done previously for liver alcohol dehydrogenase. Residue numbers refer to positions in liver alcohol dehydrogenase or to a residue in the same position in one of the other proteins and correspond to the amino acid below the last digit of each number (e.g. residue 230 of sorbitol dehydrogenase is Ser).
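Counting strictly conserved residues in a multiple alignment reduces to finding columns in which every sequence carries the same residue and none has a gap. A minimal sketch of that check follows; the function name and the three short aligned strings are hypothetical, and the column numbers it reports index the alignment itself rather than the horse liver alcohol dehydrogenase numbering used in the paper.

```python
def invariant_columns(alignment, gap="-"):
    """Return (column_number, residue) for columns identical in all sequences.

    alignment: list of equal-length aligned strings; columns are numbered from 1.
    Columns containing a gap in any sequence are never reported.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    conserved = []
    for col in range(length):
        residues = {seq[col] for seq in alignment}
        if len(residues) == 1 and gap not in residues:
            conserved.append((col + 1, residues.pop()))
    return conserved


# Hypothetical three-sequence alignment, for illustration only.
print(invariant_columns(["GHEGAG", "GHE-VG", "GHDGSG"]))
```

Adding a further sequence to the alignment and rerunning the check shows directly how many previously invariant positions it retains.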
Probably the most interesting similarity between the amino acid sequences of threonine dehydrogenase and the alcohol dehydrogenase/sorbitol dehydrogenase group is that the two zinc-binding sites of horse liver alcohol dehydrogenase (presumed to fill the same role in the other enzymes) are also present in threonine dehydrogenase. Crystallographic studies have shown the ligands to the active site zinc of horse liver alcohol dehydrogenase are Cys-46, His-67, and Cys-174. Like horse liver alcohol dehydrogenase, threonine dehydrogenase has a cysteine residue at position 46 and a histidine residue at position 67. On the basis of computer modeling, it has been suggested that sorbitol dehydrogenase differs from the alcohol dehydrogenases in having a glutamic acid residue, Glu-174, as the third active site zinc ligand (42). The alignment presented here supports this idea as threonine dehydrogenase is seen to have an aspartate residue at position 174. In this context, threonine and sorbitol dehydrogenases show a clear relationship, and sorbitol dehydrogenase is not unique in having a carboxyl group as a third active site zinc ligand. Sorbitol dehydrogenase also differs from the related alcohol dehydrogenases in that it lacks a second site where the structural zinc of horse liver alcohol dehydrogenase is bound. In this regard, threonine dehydrogenase is more like the alcohol dehydrogenase family as it has cysteine residues at the four positions known to act as ligands for the structural zinc of horse liver alcohol dehydrogenase (i.e. Cys-97, Cys-100, Cys-103, and Cys-111).
The significance of the presence of two zinc-binding sites (i.e. active site zinc and structural zinc) in threonine dehydrogenase comparable to the sites of horse liver alcohol dehydrogenase is uncertain. We have recently found that threonine dehydrogenase, as isolated, does contain tightly bound zinc, but the stoichiometry has not yet been established. The native enzyme is known to be stimulated approximately 10-fold by added Mn2+; manganese binding is reversible with a Kd of 10 µM and a stoichiometry of 1 mol of Mn2+/mol of subunit (24). In contrast, added zinc has no effect on threonine dehydrogenase activity, but EDTA rapidly causes inactivation. The significance of such observations will be the focus of future studies.
L-Threonine dehydrogenase of E. coli, therefore, shares some significant sequence characteristics with other well characterized dehydrogenases. In addition, its molecular weight (37,200/subunit) is in the same range as found for these enzymes. Although the complete amino acid sequence is only distantly related, as seen in the low percent identity between aligned sequences, many of the residues conserved have important structural and/or catalytic roles in horse liver alcohol dehydrogenase. Hence, threonine dehydrogenase of E. coli appears to be a new and unique member of this class of zinc-containing long-chain NAD+-linked dehydrogenases.