Purification, Characterization, Cloning, and Amino Acid Sequence of the Bifunctional Enzyme 5,10-Methylenetetrahydrofolate Dehydrogenase/5,10-Methenyltetrahydrofolate Cyclohydrolase from Escherichia coli*

We have purified the enzyme 5,10-methylenetetrahydrofolate dehydrogenase (EC 1.5.1.5) from Escherichia coli to homogeneity by a newly devised procedure. The enzyme has been purified at least 2,000-fold in a 31% yield. The specific activity of the enzyme obtained is 7.4 times greater than any previous preparation from this source. The purified enzyme is specific for NADP. The protein also contains 5,10-methenyltetrahydrofolate cyclohydrolase (EC 3.5.4.9) activity. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis and behavior on a molecular sieving column suggest that the enzyme is a dimer of identical subunits. We have cloned the E. coli gene coding for the enzyme through the use of polymerase chain reaction based on primers designed from the NH2-terminal analysis of the isolated enzyme. We sequenced the gene. The derived amino acid sequence of the enzyme contains 287 amino acids of Mr 31,000. The sequence shows 60% identity to two bifunctional mitochondrial enzymes specific for NAD, and 40-45% identity to the presumed dehydrogenase/cyclohydrolase domains of the trifunctional C1-tetrahydrofolate synthase of yeast mitochondria and cytoplasm and human and rat cytoplasm. An identical sequence of 14 amino acids with no gaps is present in all 7 sequences.
Glycine and serine are the major sources of one-carbon metabolites. They yield 5,10-methylenetetrahydrofolate ( The activity was first found in vertebrate liver preparations (1, 2). These preparations from eukaryotes were initially not recognized to be composed of multifunctional peptides possessing additional enzymatic activities leading to further metabolism of the product of the dehydrogenase reaction, and the details of the dehydrogenase reaction were not established until the dehydrogenase was purified from Clostridium cylindrosporum, a bacterial source that possessed the dehydrogenase alone (3). The dehydrogenase also occurs as a bifunctional protein together with 5,10-methenyl-THF cyclohydrolase (EC 3.5.4.9) that catalyzes the hydrolysis reaction shown in Equation 2.
The trifunctional enzyme is referred to as C1-THF synthase. The mono-, di-, and trifunctional forms of the enzyme described above show a distinct preference for NADP. However, several methylene-THF dehydrogenases have been described that show optimum activity with NAD. One such enzyme originally described in Ehrlich ascites tumor cells (7) has more recently been shown to occur in mouse (8) and human (9) mitochondria and is associated with many transformed cells (10), but not with adult tissue. These dehydrogenase activities are also associated with cyclohydrolase activities, but differ from other 5,10-methylene-THF dehydrogenases in being dependent on Mg2+ for activity. The genes for the mouse and human enzymes have been cloned and sequenced. The physiological role of this enzyme is not known.
Still another form of 5,10-methylene-THF dehydrogenase has been described that is most active with NAD. It occurs in the bacteria Clostridium formicoaceticum (11), Acetobacterium woodii (12), Clostridium acidi-urici,* and in yeast cytoplasm (13). These enzymes are not dependent on Mg2+ and are monofunctional. Their amino acid sequences remain to be determined.
The enzymes described possessing dehydrogenase activity, whether mono-, di-, or trifunctional proteins, occur as dimeric proteins composed of two identical subunits. The only exception to this dimeric structure is the dehydrogenase of E. coli reported to have been isolated in homogeneous form and to be composed of 5 dissimilar subunits (6).
In view of the unusual structure of the E. coli enzyme and its importance in one-carbon metabolism, we undertook to reisolate 5,10-methylene-THF dehydrogenase from E. coli, reexamine its structure, and determine the amino acid sequence of the enzyme. The enzyme was isolated in homogeneous form through the use of a new purification procedure described here.
We also cloned the E. coli gene coding for 5,10-methylene-THF dehydrogenase. For this purpose, we designed a highly specific probe using PCR. From the known 35-amino acid sequence at the NH2 terminus, two 20-mer degenerate primers were designed (14). These primers were 34 bases apart. Using these primers with a genomic DNA template, we generated a 74-base pair product by the PCR. Determination of the sequence of this 74-base pair product gave the exact sequence of the internal 34 bases. From this information, we designed an oligonucleotide probe that is an exact match to the gene of interest. This enabled us to clone and sequence the E. coli gene that codes for this bifunctional enzyme and to determine that it is encoded by a locus we have named folD (15).
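The 74-base pair product size follows from the primer geometry: two 20-base primers flanking 34 internal bases. A minimal sketch of that arithmetic is given below; the function and parameter names are illustrative and do not come from the original work.

```python
def pcr_product_length(forward_len: int, reverse_len: int, internal_len: int) -> int:
    """Return the expected amplicon length in base pairs.

    The product spans the forward primer, the internal bases between the
    two primers, and the reverse primer.
    """
    if min(forward_len, reverse_len, internal_len) < 0:
        raise ValueError("lengths must be non-negative")
    return forward_len + internal_len + reverse_len


# Values from the text: two 20-mers separated by 34 bases.
product_bp = pcr_product_length(forward_len=20, reverse_len=20, internal_len=34)
print(product_bp)  # 74
```

Only the 34 internal bases are free of primer-derived sequence, which is why sequencing the product was needed to obtain an exact-match probe.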

The Enzyme
Materials-Common reagents were commercial products of the highest grade available. NADP was from Behring Diagnostics. Protamine sulfate was from Elanco (Indianapolis, IN). Bovine serum albumin was from Armour Pharmaceutical Co. (Phoenix, AZ). Protein Assay Dye Reagent Concentrate was from Bio-Rad. Folic acid, 5'-AMP, Trizma (Tris base), and Bicine were from Sigma. Sepharose CL-2B was from Pharmacia LKB Biotechnology Inc. Platinum oxide was from Alfa Products (Danvers, MA). 1,6-Diaminohexane, p-toluenesulfonyl chloride, liquid bromine, anhydrous dioxane, and pyridine were purchased from Aldrich Chemical Co. Tryptone and yeast extract were from Difco.
Folate Substrates-(6R,6S)-Tetrahydrofolate was prepared by hydrogenation of folic acid over platinum oxide in neutral aqueous solution (16) and was purified by chromatography on DEAE-cellulose (17). The stock solution at pH 7.5 was diluted to 10 mM (6R,6S)-THF as determined by enzymatic assay, and contained a final concentration of 0.5 M 2-mercaptoethanol.
Heparin-Agarose-Heparin was coupled via an amino group to CNBr-activated agarose (20). The concentration of the ligand was 4 mg of heparin/g wet weight of gel as determined by sulfate analysis. To avoid using benzidine (a Class C carcinogen), sulfate analysis was carried out using the BaCl2/gelatin method (21).

AMP-Sepharose-AMP-Sepharose was prepared by linking the C-8 of AMP to a 1,6-diaminohexane arm (22) which was then coupled to Sepharose CL-2B previously activated by tosyl chloride (23). The final product had 3.5 μmol of ligand/g wet weight of gel as determined spectrophotometrically in 87% glycerol (23). It should be noted that in most commercially available AMP-Sepharose the AMP is coupled at the N-6 amino group. We did not determine the affinity of the E. coli dehydrogenase for such resins.
Protein Determination-Protein concentration was determined by the method of Bradford with bovine serum albumin as a standard (24). When interfering substances (e.g. protamine sulfate) were present the protein was precipitated from 10% trichloroacetic acid and a modification of the Lowry method was used (25).
SDS-PAGE-Sodium dodecyl sulfate-polyacrylamide gel electrophoresis was carried out as described by Laemmli (26).
NH2-terminal Analysis-NH2-terminal amino acid analysis was done by Frank R. Masiarz at Chiron Corp., Emeryville, CA. The sequence analysis was performed by automatic Edman degradation with an Applied Biosystems 470A gas-phase protein sequenator (Foster City, CA). The phenylthiohydantoin-derivatives were identified by reverse-phase chromatography with an on-line Applied Biosystems PTH analyzer.
Enzyme Assays
5,10-Methylene-THF Dehydrogenase-A variety of methods have been used for determining 5,10-methylene-THF dehydrogenase activity (27). A modification of the single point determination of Scrimgeour and Huennekens (28) was used in the studies described here. The modified assay contained 1.5 mM (6R,6S)-CH2-THF (pH 8.5), 0.8 mM NADP, and 100 mM Bicine-(K+) (pH 8.6) in a final volume of 1 ml. The reaction mixture was incubated for 10 min at 37 °C after the addition of enzyme. The reaction was stopped and the reduced pyridine nucleotide formed in the reaction was destroyed by the addition of 2 ml of 0.36 N HCl for purified enzyme or 2% perchloric acid followed by centrifugation to remove the protein precipitate for crude extracts. After 10 min at room temperature the absorbance was read at 350 nm. The extinction coefficient of the remaining reaction product, 5,10-methenyl-THF, was ε350 = 24,900 M⁻¹ cm⁻¹ in acid. 1 Unit is equivalent to the formation of 1 μmol of product/min.
Cyclohydrolase-5,10-Methenyl-THF cyclohydrolase activity was measured at 37 °C as previously described (29). 1 Unit is equivalent to the formation of 1 μmol of product/min.
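The conversion from an absorbance reading to enzyme units in the dehydrogenase assay can be sketched with the Beer-Lambert law. The sketch below assumes a 1-cm light path and a 3-ml acidified volume (1 ml of reaction mixture plus 2 ml of acid); neither value is stated explicitly in the text, and the function name is illustrative. The extinction coefficient is taken as 24,900 M⁻¹ cm⁻¹ and 1 unit as 1 μmol of product/min.

```python
EPSILON_350 = 24_900.0  # M^-1 cm^-1, 5,10-methenyl-THF in acid (value from the text)


def dehydrogenase_units(a350: float,
                        total_volume_ml: float = 3.0,   # assumed: 1 ml assay + 2 ml acid
                        minutes: float = 10.0,          # incubation time from the text
                        path_cm: float = 1.0) -> float:  # assumed cuvette path length
    """Return activity in units (micromol of product formed per min)."""
    conc_molar = a350 / (EPSILON_350 * path_cm)            # Beer-Lambert: c = A / (eps * l)
    micromol = conc_molar * (total_volume_ml / 1000.0) * 1e6
    return micromol / minutes


# Hypothetical reading, chosen only to illustrate the calculation.
print(round(dehydrogenase_units(0.498), 4))  # 0.006
```

Dividing the result by the milligrams of protein in the assay would give a specific activity in units/mg.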

Enzyme Purification
Cell Growth-E. coli BE was grown in LB medium. Cells were harvested in late log phase (hW = 1.1) and stored as a frozen cell pellet at -70 °C. Cells stored under these conditions for up to 6 months showed no loss of dehydrogenase activity.
Extract-25 g of frozen E. coli cells was suspended in 75 ml of Buffer A (50 mM Tris-HCl (pH 7.5), 50 mM 2-mercaptoethanol) and disrupted by one pass through a French press at 10,000 psi. The temperature was kept at ≤15 °C during this step, and at 0-4 °C in all subsequent steps. The passthrough was centrifuged for 30 min at 30,000 × g and the supernatant solution was collected.
Protamine Sulfate-0.3 volumes of 2% protamine sulfate in Buffer A was added slowly to the extract with continuous stirring. After 20 min at 0 °C, the suspension was centrifuged for 15 min at 30,000 × g and the precipitate was discarded.
Ammonium Sulfate Fractionation-Finely ground ammonium sulfate was added slowly to bring the protamine sulfate supernatant solution to 35% saturation (19.4 g/100 ml). After 30 min, the suspension was centrifuged for 15 min at 30,000 × g. The supernatant solution was brought to 55% saturation by the further addition of ammonium sulfate (11.8 g/100 ml) and the precipitate was collected as described above, dissolved in 10 ml of Buffer A, and dialyzed overnight against Buffer A.
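The two ammonium sulfate additions are quoted per 100 ml of solution, so the mass to weigh out scales linearly with the measured volume at each step. A minimal sketch, in which the step labels and function name are illustrative and the example volumes are hypothetical:

```python
# Grams of solid ammonium sulfate per 100 ml of solution, as quoted in the text.
GRAMS_PER_100_ML = {
    "0->35%": 19.4,   # protamine sulfate supernatant brought to 35% saturation
    "35->55%": 11.8,  # 35% supernatant brought to 55% saturation
}


def ammonium_sulfate_grams(volume_ml: float, step: str) -> float:
    """Return grams of ammonium sulfate to add to volume_ml of solution."""
    if volume_ml < 0:
        raise ValueError("volume must be non-negative")
    return GRAMS_PER_100_ML[step] * volume_ml / 100.0


# Hypothetical volumes; measure the actual supernatant volume before each step.
print(round(ammonium_sulfate_grams(250.0, "0->35%"), 2))   # 48.5
print(round(ammonium_sulfate_grams(260.0, "35->55%"), 2))  # 30.68
```

The second addition is calculated on the volume of the 35% supernatant, not on the original extract volume.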
Heparin-Agarose Chromatography-The dialyzed ammonium sulfate fraction, diluted to a protein concentration of 20 mg/ml, was applied to a 1.5 × 15-cm heparin-agarose column equilibrated with Buffer A. The column was washed with Buffer A until protein stopped eluting (approximately 3 column volumes). The column was then developed with a linear gradient established by mixing 150 ml of Buffer A with 150 ml of Buffer A containing 150 mM KCl. The flow rate was maintained at 0.5 ml/min. The 5,10-methylene-THF dehydrogenase eluted as a sharp peak at about 35 mM KCl.
AMP-Sepharose Chromatography-The pooled peak from the heparin-agarose column was applied directly to a 1.5 × 6-cm AMP-Sepharose column equilibrated with Buffer A. The column was washed with Buffer A until protein ceased to elute (approximately 2.5 column volumes) and was then developed with a linear gradient established by mixing 75 ml of Buffer A with 75 ml of Buffer A containing 1.5 M KCl. The flow rate was kept at about 1 ml/min. The 5,10-methylene-THF dehydrogenase eluted as a sharp peak at about 350 mM KCl. The protein concentration of the AMP-Sepharose peak was very low (about 10 μg/ml) and dialysis led to extensive loss of activity. To minimize activity loss, the pooled peak was concentrated in a Diaflo apparatus with a YM-10 membrane and then diluted several times until the KCl concentration was ≤0.01 mM.
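For an ideal linear gradient, the salt concentration rises in proportion to the volume delivered, so the quoted elution concentrations can be related to a position in each gradient. The sketch below assumes ideal mixing and ignores column dead volume. It reads the AMP-Sepharose gradient as 75 ml of Buffer A mixed with 75 ml of Buffer A containing 1.5 M KCl, and the function names are illustrative.

```python
def gradient_concentration(volume_delivered_ml: float, total_ml: float,
                           start_mM: float, end_mM: float) -> float:
    """Concentration (mM) of an ideal linear gradient after a given delivered volume."""
    if not 0.0 <= volume_delivered_ml <= total_ml:
        raise ValueError("delivered volume must lie within the gradient")
    return start_mM + (end_mM - start_mM) * volume_delivered_ml / total_ml


def volume_at_concentration(target_mM: float, total_ml: float,
                            start_mM: float, end_mM: float) -> float:
    """Delivered volume (ml) at which an ideal linear gradient reaches target_mM."""
    if not min(start_mM, end_mM) <= target_mM <= max(start_mM, end_mM):
        raise ValueError("target concentration lies outside the gradient")
    return total_ml * (target_mM - start_mM) / (end_mM - start_mM)


# Heparin-agarose: 150 ml + 150 ml, 0 to 150 mM KCl, peak at about 35 mM.
print(volume_at_concentration(35.0, 300.0, 0.0, 150.0))   # 70.0
# AMP-Sepharose: 75 ml + 75 ml, 0 to 1.5 M KCl, peak at about 350 mM.
print(volume_at_concentration(350.0, 150.0, 0.0, 1500.0))  # 35.0
```

Under these assumptions both peaks emerge a little under a quarter of the way through their respective gradients.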
FPLC-The desalted AMP-Sepharose peak (volume = 10 ml; KCl ≤ 10 mM) was applied to a Mono Q (HR 5/5) column equilibrated with Buffer A. The column was washed with 2 ml of Buffer A and developed with a linear gradient
mg of lysozyme and 1 mg of RNase were added and the mixture was kept at 0 °C for 15 min. SDS was added to a final concentration of 0.5% and the mixture was shaken gently at 37 °C for 3 h. 0.6 mg of Proteinase K and 1.5 ml of chloroform/isoamyl alcohol = 24/1 were added and the mixture was shaken gently at 37 °C for 4 h. The mixture was extracted gently with 3 ml of phenol and centrifuged 30

Preparation of Probe-Two 20-mer degenerate oligonucleotide primers designed for the regions designated in Fig. 3  Laboratories) so that the final PCR product could be ligated into a cloning vector. The final PCR mixture was extracted with chloroform to remove the mineral oil. It was then treated with Klenow fragment of DNA polymerase I (Boehringer Mannheim) to blunt the ends. The 74-base pair product was isolated from a 4% low melting temperature agarose gel (NuSieve LMT) followed by phenol extraction and ethanol precipitation. The purified 74-base pair product was ligated into M13mp10 which had been cut with SmaI and treated with calf intestine alkaline phosphatase (Amersham). Single-stranded DNA was isolated from 8 positive plaques and the nucleotide sequence was determined. From this sequence, we designed an oligonucleotide that was an exact match to the NH2-terminal region of the enzyme. A 25-mer was chemically synthesized (see Fig. 3) and end-labeled with [32P] using T4 polynucleotide kinase (BRL).

Screen λ Library-A λSE6 library of E. coli W3110 DNA was purchased from the American Type Culture Collection (ATCC 37386) (30). Plaque hybridization was carried out with the 25-mer oligonucleotide probe (see Fig. 3). Two positive plaques were obtained from a screen of 1000 plaques. λ phage DNA was isolated from purified positive plaques. Restriction digests and Southern blot analysis were done to confirm the correct clone by comparison with the genomic Southerns (Fig. 1). It should be noted that the EcoRI site was not present in this clone. We assume this is a difference between the strains E. coli BE and E. coli W3110.
Sequence-The nucleotide sequence was determined for both strands with Sequenase and progressive synthetic 18-mer oligonucleotide primers. The first primer used was the 25-mer oligonucleotide probe described above.

RESULTS
Purification-The 5,10-methylene-THF dehydrogenase activity was purified from frozen E. coli cells by the procedure described under "Experimental Procedures." The results are summarized in Table I. The final product represented a 2,000-fold purification of the enzyme with a yield of 31%. Products of each purification step have levels of 5,10-methenyl-THF cyclohydrolase activity proportional to the dehydrogenase throughout the purification. However, 5 μg of the final product shows a single band corresponding to a molecular weight of 35,000 on SDS-PAGE (Fig. 2). We could easily detect 0.1 μg of protein by this procedure. These results suggest that the product is at least 98% pure, and that both activities reside on a single polypeptide.
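The 98% figure follows from the ratio of the gel's detection limit to the amount of protein loaded (0.1/5): a contaminating band at 2% or more of the load would have been visible. A minimal sketch of that bound, assuming that a contaminant would migrate as a distinct band and stain comparably (the function name is illustrative):

```python
def minimum_purity_percent(loaded_amount: float, detection_limit: float) -> float:
    """Lower bound on purity when no second band is seen on the gel.

    Both arguments must be in the same mass unit; only their ratio matters.
    """
    if loaded_amount <= 0 or not 0 <= detection_limit <= loaded_amount:
        raise ValueError("need 0 <= detection_limit <= loaded_amount and loaded_amount > 0")
    return 100.0 * (1.0 - detection_limit / loaded_amount)


# Amounts from the text: 5 loaded, 0.1 detectable (same unit).
print(round(minimum_purity_percent(5.0, 0.1), 1))  # 98.0
```

Strictly, this limits any single contaminant to less than 2% of the load; several faint contaminants, each below the detection limit, would not be excluded by the argument.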
Molecular Weight and Subunit Structure-SDS-PAGE (Fig. 2) shows a single band corresponding to an Mr of 35,000. Chromatography on Superose-12 showed a native molecular weight of approximately 70,000 relative to the marker proteins used (not shown). These data suggest that the enzyme is a dimer of subunits of equal size.
Bifunctional Composition-The 5,10-methenyl-THF cyclohydrolase activity co-purified with the dehydrogenase at a constant ratio, as shown in Table I. These data and the fact that the purified enzyme runs as a single band on SDS-PAGE (Fig. 2) suggest that the enzyme is bifunctional.
Pyridine Nucleotide-In the presence of saturating 5,10-methylene-THF, the Km of  The inhibition by 2-mercaptoethanol and to a lesser extent the KCl and Tris-HCl present problems because the stock solution of THF contains these materials. To minimize this problem, the 10 mM THF was always prepared in 0.2 M Tris-HCl (pH 7.5), 0.5 M 2-mercaptoethanol.
pH Optimum-The optimal activity of 5,10-methylene-THF dehydrogenase was found at pH 8.5 in Bicine-K+. This pH value agrees with the value previously reported for glycine or Tris buffer (31). Since Tris buffer inhibits the activity, we used Bicine-K+, which is less inhibitory than Tris and does not inhibit the reaction at 150 mM.
Stability-It has previously been reported that E. coli 5,10-methylene-THF dehydrogenase activity is unstable (6, 31). We did not find this to be true. Some activity loss was observed during dialysis and storage at 4 °C during the purification procedure. However, the purified enzyme stored at -20 °C in 50% glycerol was 95% active after 6 months. Since all the buffers contained 2-mercaptoethanol, the enzyme was never subjected to freezing and thawing throughout the purification procedure.
NH2-terminal Analysis-The sequence of the 35 NH2-terminal amino acids of the isolated purified enzyme was: Ala-Ala-Lys-Ile-Ile-Asp-Gly-Lys-Thr-Ile-Ala-Gln-Gln-Val-Arg-Ser-Glu-Val-Ala-Gln-Lys-Val-Gln-Ala-Arg-Ile-Ala-Ala-Gly-Leu-Arg-Ala-Pro-Gly-Leu-.
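Degenerate primers designed from a peptide sequence must cover every codon combination that could encode it, so stretches rich in low-degeneracy residues are generally preferred. The sketch below counts the codon combinations for windows of the 35-residue NH2-terminal sequence using the standard genetic code. The 7-residue window is an illustrative stand-in for a 20-mer and is not the authors' stated primer design, and the function names are hypothetical.

```python
# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

# The 35 NH2-terminal residues reported in the text, in one-letter code.
NH2_TERMINAL = "AAKIIDGKTIAQQVRSEVAQKVQARIAAGLRAPGL"


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that could encode the peptide."""
    total = 1
    for residue in peptide:
        total *= CODONS_PER_RESIDUE[residue]
    return total


def least_degenerate_window(sequence: str, window: int = 7):
    """Return (degeneracy, start index, peptide) for the least degenerate window."""
    if not 0 < window <= len(sequence):
        raise ValueError("window must fit within the sequence")
    candidates = [
        (degeneracy(sequence[i:i + window]), i, sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]
    return min(candidates)  # ties resolve to the earliest start index


print(degeneracy(NH2_TERMINAL[:7]))          # codon combinations for the first 7 residues
print(least_degenerate_window(NH2_TERMINAL))
```

In practice the last codon position of a primer is often trimmed and rare codons may be excluded, both of which reduce the effective degeneracy below these counts.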
Sequence-The nucleotide sequence of the folD gene and the derived amino acid sequence are shown in Fig. 3. It was recognized by the occurrence of a sequence corresponding exactly to the amino acid sequence of the NH2-terminal end of the isolated enzyme. This sequence occurred at the head of the only nonterminated polypeptide chain among the 3 reading frames of the sequence obtained equivalent to an Mr greater than 30,000. The isoelectric point of E. coli dehydrogenase/cyclohydrolase calculated from the amino acid composition is pH 5.6.

DISCUSSION
The enzyme 5,10-methylene-THF dehydrogenase was first reported in pigeon liver (1) and beef liver (2) and was subsequently partially purified from a variety of sources (27). This activity has been found in all sources examined (27, 32), although the molecular organization of the protein bearing the enzymatic activity has been found to vary. Among the enzymes that have been purified to homogeneity, the dehydrogenase occurs as a monofunctional enzyme with a native Mr of 60,000-70,000 in C. cylindrosporum (3). It occurs as a component of a bifunctional enzyme together with 5,10-methenyl-THF cyclohydrolase in C. thermoaceticum (33). It occurs as a component of a trifunctional enzyme together with 5,10-methenyl-THF cyclohydrolase and 10-formyl-THF synthetase in the following eukaryotes from which it has been purified to homogeneity: the livers of rabbit (34), sheep (19), pig (35), chicken (36), and rat (37), as well as yeast cytoplasm (38) and mitochondria (39). The enzymes noted above are all specific for NADP. Dehydrogenases specific for NAD that have been purified to homogeneity include the monofunctional enzymes from C. formicoaceticum (11), A. woodii (12), and C. acidi-urici,* and bifunctional dehydrogenase/cyclohydrolases from mitochondria of human transformed cells (9) and mouse Ehrlich ascites cells (8). The latter two enzymes also require Mg2+ for activity.
The genes for several of the enzymes described above derived from eukaryotes and from yeast have been cloned and their nucleic acid sequences have been determined. These cloned examples include both bifunctional enzymes and trifunctional enzymes, as well as enzymes specific for NAD and NADP. The amino acid sequences of the dehydrogenase/cyclohydrolase enzymes and the dehydrogenase domains of the trifunctional enzymes that have been sequenced show a high degree of homology that is independent of the source or the physical organization of the enzyme. But, up to now, no gene sequence has been determined for a mono- or bifunctional dehydrogenase from eubacteria.
The proteins possessing methylene-THF dehydrogenase activity as described above, whether derived from prokaryotes or eukaryotes, generally show extensive amino acid homology and general consistency in structure. However, the amino acid sequence of the E. coli dehydrogenase was of particular interest since the structure of the enzyme from this source was reported to be significantly different from that described in any other source. The enzyme in E. coli  In the results reported here on our investigation of the E. coli enzyme, we purified the 5,10-methylene-THF dehydrogenase to homogeneity, but found that it is a protein of approximately Mr 70,000 composed of two identical subunits. As reported previously (6), the enzyme is bifunctional, possessing dehydrogenase and cyclohydrolase activities. Our isolation procedure results in a purification of at least 2,000-fold to yield a protein with a specific activity of 200 μmol of product/min/mg of protein. The preparation of the bifunctional enzyme from E. coli previously reported by Dev and Harvey (6) was purified 450-fold from the crude extract to a specific activity of 31 μmol of product/min/mg of protein. It would therefore appear that the enzyme preparation previously obtained (6) was not pure and that the conclusions regarding its protein structure, if not the kinetic properties, are questionable.
The molecular properties of the enzyme purified here are further substantiated by the information concerning the amino acid composition of the protein derived from the nucleotide sequence of the gene coding for this protein (Fig. 3). An amino acid sequence corresponding exactly to the 35 amino acids determined by chemical analysis of the amino-terminal amino acid sequence of the isolated enzyme was found. This sequence, preceded by a methionine residue, is part of an open reading frame that codes for 287 amino acids of Mr 31,060, which is in fair agreement with the subunit size of 35,000 determined by SDS-PAGE.
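The "fair agreement" between the sequence-derived and gel-derived subunit sizes can be put in numbers, together with the subunit count implied by the native molecular weight of approximately 70,000. This is a minimal sketch using only values quoted in the text; the function names are illustrative.

```python
def mean_residue_mass(mr: float, residues: int) -> float:
    """Average mass per amino acid residue."""
    return mr / residues


def percent_difference(observed: float, predicted: float) -> float:
    """Signed difference of observed from predicted, as a percentage of predicted."""
    return 100.0 * (observed - predicted) / predicted


def subunit_count(native_mr: float, subunit_mr: float) -> int:
    """Nearest whole number of subunits in the native protein."""
    return round(native_mr / subunit_mr)


SEQUENCE_MR = 31_060.0   # from the open reading frame (287 residues)
SDS_PAGE_MR = 35_000.0   # from SDS-PAGE
NATIVE_MR = 70_000.0     # from Superose-12 chromatography

print(round(mean_residue_mass(SEQUENCE_MR, 287), 1))            # 108.2
print(round(percent_difference(SDS_PAGE_MR, SEQUENCE_MR), 1))   # 12.7
print(subunit_count(NATIVE_MR, SEQUENCE_MR))                    # 2
```

Either estimate of the subunit size gives two subunits per native molecule, consistent with the dimer assignment.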
A search of the two major protein sequence data bases (EMBL-24 and GenBank 65) recognized six other proteins that showed high homology to the E. coli dehydrogenase/cyclohydrolase enzyme sequence. These were the human (9) and mouse (8) bifunctional mitochondrial enzymes, the yeast mitochondrial (40) and yeast cytoplasmic (41) trifunctional enzymes, and the human (42) and rat (37) trifunctional enzymes. The percent identities calculated for the E. coli enzyme relative to the two other dehydrogenase/cyclohydrolase bifunctional enzymes and to the domains of the sequences listed above were 48, 50, 42, 45, 42, and 40, respectively. These sequences were aligned as shown in Fig. 4. The alignment shown is a composite of the output of the Intelligenetics, Inc. GenAlign program and further refinement done by eye. Amino acids common to all seven sequences are indicated by the shadowing. It is particularly noteworthy that a 14-amino acid sequence, -ITPVPGGVGPMTVA- (residues 256-269 in the E. coli sequence), is found in all the other six proteins with no substitutions or gaps. This sequence was not found in any other protein contained in the data bases examined.
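Percent identity and a search for the longest gap-free block conserved in every sequence can both be computed directly from a multiple alignment. The sketch below is illustrative only: the three short aligned sequences are hypothetical toy data built around the motif (read here as ITPVPGGVGPMTVA), and the convention of ignoring gapped columns when computing identity is an assumption, since the text does not state which denominator was used.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def longest_conserved_block(alignment, gap: str = "-") -> str:
    """Longest run of consecutive columns identical, and gap-free, in all sequences."""
    best_start = best_len = run_start = run_len = 0
    for col, residues in enumerate(zip(*alignment)):
        conserved = residues[0] != gap and all(r == residues[0] for r in residues)
        if conserved:
            if run_len == 0:
                run_start = col
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return alignment[0][best_start:best_start + best_len]


# Hypothetical toy alignment; not the sequences from Fig. 4.
TOY_ALIGNMENT = [
    "MKAITPVPGGVGPMTVAML",
    "MR-ITPVPGGVGPMTVASL",
    "LKGITPVPGGVGPMTVAAL",
]

print(longest_conserved_block(TOY_ALIGNMENT))  # ITPVPGGVGPMTVA
print(round(percent_identity(TOY_ALIGNMENT[0], TOY_ALIGNMENT[1]), 1))  # 88.9
```

Applied to the real seven-sequence alignment, the same column scan would report the shared 14-residue block and the pairwise function would give identity values under the stated gap convention.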
The identity/similarity matrix was determined for each of the sequences relative to each of the other sequences by the Profile Comparison of Multiple Alignment program (Table II) (43). The analysis suggests that the sequences fall into two groups: one composed of the bifunctional dehydrogenase from E. coli and the human and mouse mitochondrial enzymes and the second group composed of the dehydrogenase and cyclohydrolase domains of the trifunctional enzymes of human and rat cytoplasm and the yeast cytoplasm and yeast mitochondria. This is surprising since the dehydrogenases of E. coli and the four trifunctional enzymes are specific for NADP, while the dehydrogenases of the two bifunctional mitochondrial enzymes are specific for NAD.
We mapped the gene coding for the bifunctional protein described here (15) by the Gene Mapping Membrane Technique (44). It was located at 570 kilobases on the physical map or 12 min on the genetic map (45) of E. coli. No other enzymes associated with folate have been identified in this region. We named the gene coding for this bifunctional dehydrogenase/cyclohydrolase folD.
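The two map coordinates can be related by a simple proportional conversion. The sketch below assumes a chromosome length of about 4,700 kilobases and a 100-min genetic map with a shared origin; neither value is given in the text, and the real correspondence between the physical and genetic maps is only approximately linear.

```python
ASSUMED_CHROMOSOME_KB = 4700.0  # assumption: approximate E. coli chromosome length
GENETIC_MAP_MINUTES = 100.0     # assumption: 100-min circular genetic map


def kb_to_minutes(position_kb: float,
                  chromosome_kb: float = ASSUMED_CHROMOSOME_KB,
                  map_minutes: float = GENETIC_MAP_MINUTES) -> float:
    """Convert a physical map position (kb) to an approximate genetic map position (min)."""
    if not 0.0 <= position_kb <= chromosome_kb:
        raise ValueError("position must lie on the chromosome")
    return map_minutes * position_kb / chromosome_kb


print(round(kb_to_minutes(570.0), 1))  # 12.1
```

Under these assumptions the 570-kilobase position corresponds to roughly 12 min, in line with the genetic map position quoted above.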