Cloning of Human cDNAs Encoding Mitochondrial and Cytosolic Serine Hydroxymethyltransferases and Chromosomal Localization

Human cDNAs for cytosolic and mitochondrial serine hydroxymethyltransferase (SHMT) were cloned by functional complementation of an Escherichia coli glyA mutant with a human cDNA library. The cDNA for the cytosolic enzyme encodes a 483-residue protein of M_r 53,020. The cDNA for the mitochondrial enzyme encodes a mature protein of 474 residues of M_r 52,400. The deduced protein sequences share a high degree of sequence identity to each other (63%), and the individual isozymes are highly homologous to the analogous rabbit liver cytosolic (92% identity) and mitochondrial (97% identity) SHMT

one-carbon metabolism is the β-carbon of serine. Serine hydroxymethyltransferase (SHMT), a pyridoxal phosphate-containing enzyme, catalyzes the reversible conversion of serine and tetrahydrofolate to glycine and 5,10-methylenetetrahydrofolate (2,3). Incorporation of the β-carbon of serine into DNA and SHMT activity are increased when cells are stimulated to proliferate and during the S phase of the cell cycle (4, 5), and SHMT activity is elevated in a variety of tumor tissues (6, 7).
Some eukaryotic cells contain both cytosolic and mitochondrial forms of SHMT (3,8), and mammalian cells that lack mitochondrial SHMT activity are auxotrophic for glycine. It has been suggested that glycine synthesis from serine occurs in the mitochondria, whereas cytosolic SHMT may catalyze the conversion of glycine to serine (9), although direct evidence for this proposal is lacking. The cytosolic and mitochondrial SHMT isoenzymes from rabbit liver have been purified to homogeneity and compared with respect to reaction and substrate specificity (10-16). Each isoenzyme is a tetramer of identical subunits, and both have isoelectric points near 7.2 (10). A study of cysteine-containing peptides from tryptic digests demonstrated that the two isoenzymes had different primary structures (13-15). Recently, the complete primary structures of the rabbit cytosolic (12) and mitochondrial (17) isoenzymes were determined by sequencing the proteins, confirming these differences. The primary structures of several bacterial SHMT proteins have been deduced from the sequence of their genes (18,20,21).
We are interested in the role of subcellular compartmentation in the regulation of mammalian one-carbon metabolism, the role of the different SHMT isozymes in this process, and the potential of SHMT as a target for antiproliferative agents. These studies require a mammalian SHMT cDNA for studying SHMT expression and the effects of modulation of enzyme activity in specific subcellular compartments. As no mammalian or eukaryotic SHMT cDNA or gene had been isolated, we attempted to isolate a human SHMT cDNA by its ability to complement an Escherichia coli glyA (SHMT-) mutant (22). This report presents the cDNA and deduced protein sequences of human cytosolic and mitochondrial SHMT and the localization of their genes to separate chromosomes.
Elledge (Baylor College of Medicine). The construction of λKC and λYES-R and the phenotype of BNN132 have been described (23).
The cre gene on λKC allows automatic subcloning of plasmid pSE936, contained between lox sites on λYES-R, when E. coli is infected with λYES-R (23). The human cDNA library, containing EcoRI-XhoI-SfiI linkers, was made from mRNA derived from Epstein-Barr virus-transformed B-lymphocytes and was cloned into a unique EcoRI site located downstream from the lac promoter in the pSE936 region of λYES-R (23). λKC was rescued from BNN132 by mitomycin C induction and was used to infect GS245 (23). Kanamycin-resistant colonies were tested for the glyA phenotype on VB agar plates containing kanamycin (50 μg/ml), phenylalanine (50 μg/ml), and thiamine (10 μg/ml), with or without glycine (50 μg/ml) supplementation. Single cells of GS245(λKC) were isolated.
Cloning of SHMT-GS245(λKC) was grown overnight in LB medium (2.5 ml) containing 0.2% maltose, 1.0 mM isopropyl β-D-thiogalactoside (IPTG), and 50 μg/ml kanamycin, resuspended in 10 mM MgSO4, mixed with the human cDNA library in λYES-R (2.5 X IO' phage), and incubated at 30 °C for 30 min without agitation as described by Elledge et al. (23). VB medium (4 ml) containing phenylalanine (50 μg/ml), glycine (50 μg/ml), thiamine (10 μg/ml), mannitol (0.2%), and IPTG (1.0 mM) was added, and the culture was incubated at 30 °C with shaking for 2 h. Twenty-five million cells were resuspended in medium lacking glycine, spread on 10 VB/2% agar plates containing 0.2% mannitol, 0.001% thiamine, 50 μg/ml ampicillin, and 50 μg/ml phenylalanine, and incubated at 30 °C. Control plates also contained glycine (50 μg/ml). In a second experiment, IPTG (1 mM) was added to the culture plates.
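The media recipes above mix percent and μg/ml units. As a minimal sketch, assuming the percentages are weight/volume (the text does not say so explicitly), the conversion below shows that 0.001% thiamine on the plates corresponds to the 10 μg/ml used in the liquid medium. The function name is illustrative only.

```python
def percent_wv_to_ug_per_ml(percent: float) -> float:
    """Convert a % (w/v) concentration, i.e. grams per 100 ml, to ug/ml."""
    # 1% (w/v) = 1 g per 100 ml = 10 mg/ml = 10,000 ug/ml
    return percent * 10_000.0


# Assumption: both percentages quoted for the plates are w/v.
thiamine = percent_wv_to_ug_per_ml(0.001)
mannitol = percent_wv_to_ug_per_ml(0.2)
print(f"thiamine: about {thiamine:.0f} ug/ml; mannitol: about {mannitol:.0f} ug/ml")
```

On the same reading, 0.2% mannitol is about 2 mg/ml.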
DNA Sequencing-EcoRI inserts of pSE936 plasmids which complemented GS245(λKC) and GS245 were subcloned into pTZ19U and transformed into E. coli MV1190 (Bio-Rad). Single-stranded DNA, produced using helper phage M13K07, was sequenced by the method of Sanger et al. (24) using Sequenase 2 (United States Biochemical Corp.). Primers were synthesized by the Micro-Chemical Facility (University of California, Berkeley).
Chromosomal Localization-SHMT cDNA probes were labeled with biotin-11-dUTP by nick translation (26) and hybridized to metaphase chromosomes prepared from normal male peripheral blood by the bromodeoxyuridine synchronization method (27). In situ hybridization was performed by the method of Lichter et al. (28) with the following modifications. The hybridization solution contained probe DNA (400 ng), Cot 1 DNA (3 μg), and sonicated salmon sperm DNA (7 μg) per 10 μl of hybridization mix. After pre-annealing the probes for 10 min at 37 °C, the mixtures were applied to slides. Posthybridization washes were carried out with 2× SSC, 50% formamide at 44 °C (four times) and 1× SSC at 55 °C (three times). Hybridized DNAs were detected with avidin-conjugated fluorescein isothiocyanate (Vector Laboratories). Two amplifications were carried out using biotinylated anti-avidin. Metaphase chromosomes were counterstained with chromomycin A3 followed by distamycin A, by a modification of the procedure of Magenis et al. (29), to generate clear reverse bands. Images were photographed with Kodak technical pan (100 ASA) film using a Zeiss Axiophot microscope equipped with filter set 5.
[Figure: cDNA sequence and deduced amino acid sequence of human SHMT.] The amino acid sequence is numbered from the first codon in the cDNA. The polyadenylation signal is underlined.

RESULTS
Cloning of Human SHMT by Complementation of GS245(λKC)-GS245(λKC) cells were infected with the human cDNA library in λYES-R as described by Elledge et al. (23) and cultured for 2 h at 30 °C in nonselective medium (plus glycine) containing 1 mM IPTG. Washed cells were then plated on selective agar plates. After 3 days at 30 °C, about 100 colonies were obtained and 30 were re-streaked onto selective plates. Only a few colonies were obtained in the absence of IPTG. Twenty-four of the colonies continued to grow without glycine supplementation. Plasmids were isolated from the transformants and used to transform GS245. Several plasmids retained the ability to complement the glyA phenotype.
Restriction enzyme digestion of EcoRI cDNA inserts in pSE936 plasmids that complemented GS245 indicated two classes of inserts of approximately 1.7 and 1.9 kb, respectively, with distinctly different restriction maps. One EcoRI insert from each class was cloned into pTZ19U in both orientations for generating single-stranded DNA, and inserts were sequenced completely in both orientations. Preliminary sequencing of the inserts indicated that the deduced amino acid sequence of the 1.9-kb cDNA insert shared a high degree of sequence identity with rabbit liver mitochondrial SHMT (17), whereas the 1.7-kb cDNA coded for a protein homologous to the rabbit liver cytosolic isozyme (12).
Human Cytosolic SHMT-The cDNA sequence and the deduced protein sequence of human cytosolic SHMT are shown in Fig. 1. The cDNA contains an open reading frame of 1449 base pairs, which would code for a protein of 483 amino acid residues with an M_r of 53,020. The 3'-noncoding region lacks a poly(A) tail but contains a region that resembles, without exactly matching, the consensus polyadenylation signal. The cDNA insert in pSE936 and in pTZ19U complemented the glyA phenotype when cloned in an orientation that allowed expression from the lac promoter. However, the open reading frame is out of frame with the ATG translation start sites in the pSE936 and pTZ19U vectors, and translation must have initiated at an internal ATG in the cDNA, even though the 5'-untranslated region in the cDNA lacks sequences resembling a bacterial Shine-Dalgarno sequence.
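To illustrate what "out of frame" means for expression, the sketch below translates a short codon run that is legible in the printed sequence figure (CTG CAG AGG GAG AAG GAC AGG CAG TGT CGT GGC CTG GAG) in its native frame and again shifted by one nucleotide. The helper names are hypothetical; the codon table is the standard genetic code, written in the usual TCAG enumeration order.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate dna from offset `frame` (0, 1, or 2), dropping any trailing partial codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Codon run taken from the sequence figure (spaces removed).
fragment = "CTGCAGAGGGAGAAGGACAGGCAGTGTCGTGGCCTGGAG"
print("in frame:     ", translate(fragment, 0))
print("shifted by +1:", translate(fragment, 1))
```

The in-frame translation reproduces the printed residues (Leu Gln Arg Glu Lys Asp Arg Gln Cys Arg Gly Leu Glu), whereas the shifted frame gives an unrelated peptide, which is why a vector ATG in the wrong frame cannot yield active enzyme.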
Human Mitochondrial SHMT-The cDNA sequence and the deduced protein sequence of human mitochondrial SHMT are shown in Fig. 2. The mitochondrial cDNA lacks 5' residues that would code for the start methionine and a mitochondrial import sequence but does contain an open reading frame of 1422 base pairs, which would code for a protein of 474 amino acid residues with an M_r of 52,400. The 3' end of the cDNA contains a polyadenylation signal.
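The residue counts quoted for the two open reading frames can be checked with simple arithmetic. The sketch below is illustrative only (the function name is an assumption); it also derives the mean mass per residue implied by the reported M_r values. Because 1449 and 1422 divide by 3 to give exactly 483 and 474, the stated lengths apparently do not include a stop codon.

```python
def orf_summary(orf_bp: int, m_r: float) -> tuple[int, float]:
    """Return (number of codons, mean mass per residue) for an open reading frame."""
    if orf_bp % 3:
        raise ValueError("ORF length is not a multiple of 3")
    residues = orf_bp // 3
    return residues, m_r / residues


# Values as reported in the text for the two human cDNAs.
for name, bp, m_r in [("cytosolic", 1449, 53_020), ("mitochondrial", 1422, 52_400)]:
    residues, mean_mass = orf_summary(bp, m_r)
    print(f"{name}: {residues} residues, about {mean_mass:.1f} per residue")
```

Both isozymes come out near 110 per residue, a typical average for proteins.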
A comparison of the deduced human mitochondrial sequence with the sequence of mature rabbit liver mitochondrial SHMT (17) indicates that the human cDNA contains sufficient 5' residues to code for a mature mitochondrial protein (Fig. 3). The cDNA insert is in frame with an ATG translation start site in the pSE936 vector, and SHMT must have been expressed as a fusion protein from this ATG, which would have added the amino acid sequence MNSSRPRRP at the N terminus.
Amino Acid Homology among SHMTs-The deduced amino acid sequences of human cytosolic and mitochondrial SHMT are compared with the corresponding rabbit liver proteins in Fig. 3. The primary structures of the rabbit enzymes were obtained by sequencing the purified rabbit liver proteins (12,17). Each human isozyme shows a very high degree of sequence identity with its rabbit counterpart. Ninety-seven percent of the residues of the two mitochondrial proteins are identical, and about one-third of the remaining residues show conservative changes. About 92% of the residues of the two cytosolic proteins are identical, and about half of the remaining nonidentical residues represent conservative substitutions. The two human SHMT proteins share about 63% sequence identity, and about one-third of the remaining residues show conservative changes. The mitochondrial and cytosolic isozymes diverge most at their N termini and, to a lesser extent, at their C termini.
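Percent identity figures such as those above are computed from a pairwise alignment. The following is a minimal sketch, assuming the two sequences are already aligned to equal length and that gapped columns are excluded from the denominator; the paper does not state which convention was used, and the toy sequences are hypothetical rather than taken from Fig. 3.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identical positions between two pre-aligned, equal-length sequences.

    Columns in which either sequence has a gap are skipped (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, for illustration only.
seq_a = "MKTLRGA-SG"
seq_b = "MKSLRGAKSG"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Counting gaps in the denominator instead would lower the figure, so the convention matters when comparing values across studies.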
After completion of these studies, the sequences of two eukaryotic SHMT genes were reported: the Neurospora crassa cytosolic isozyme (30) and the pea mitochondrial enzyme (31). The deduced human sequences are also compared with these sequences and with four bacterial SHMT sequences in Fig. 3. The calculated evolutionary relationship between these proteins is shown in Fig. 4. The 10 proteins share a high degree of sequence identity. Twenty percent of the amino acid residues are common to all 10 proteins, and 39% are common to the six eukaryotic sequences (Fig. 3). The human cytosolic and mitochondrial isozymes are 57 and 60% identical, respectively, to the pea mitochondrial enzyme, 56% identical to the Neurospora cytosolic enzyme, and 42 and 43% identical to the E. coli protein. The major regions of sequence divergence are at the N and C termini. The eukaryotic sequences also contain three regions of amino acid insertions (4-14 residues) that are lacking from the prokaryotic sequences. The region around the pyridoxal phosphate binding site, Lys-257 in the consensus sequence (Fig. 3), is highly conserved in all the proteins. Lys-257 is preceded by a His residue, and the active site His-Lys is preceded by four Thr/Ser residues and followed by a Thr/Ser residue in all SHMTs (32, 33). The deduced evolutionary relationship between the SHMT proteins (Fig. 4) suggests that the cytosolic and mitochondrial isozymes diverged from each other, probably by a single gene duplication event, and that this occurred after the divergence of the bacterial and eukaryotic proteins.
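The conserved active-site pattern described above (four Thr/Ser residues, His-Lys, then one Thr/Ser) can be written as a regular expression. This is a sketch under that reading of the text, not a validated SHMT signature; the function name is hypothetical, and the test peptide (Asp Ile Val Thr Thr Thr Thr His Lys Thr Leu Arg Gly) is a stretch legible in the printed human sequence figure.

```python
import re

# Four Thr/Ser, then the active-site His-Lys, then one Thr/Ser.
ACTIVE_SITE = re.compile(r"[TS]{4}HK[TS]")


def find_active_site(protein: str):
    """Return (0-based index of the Lys, matched text), or None if the pattern is absent."""
    match = ACTIVE_SITE.search(protein.upper())
    if match is None:
        return None
    # The Lys sits five residues after the start of the match (4 Thr/Ser + His).
    return match.start() + 5, match.group()


fragment = "DIVTTTTHKTLRG"  # position within the full protein is not assumed here
print(find_active_site(fragment))
```

Run on a full-length sequence, the returned index would locate the pyridoxal phosphate binding lysine relative to that sequence's own numbering, which need not equal the consensus position 257.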
Nucleotide Homology between SHMT Isozyme cDNAs-The cDNA sequences of human cytosolic and mitochondrial SHMT show a high degree of sequence identity (57%) and are 65% identical over their protein coding regions, whereas there is no significant homology between the human and bacterial sequences.
Expression of SHMT Activity in Transformants-Complementation of GS245 under selective conditions was accompanied by restoration of SHMT activity. SHMT activity in crude extracts of GS245 transformants averaged 104 and 64% of wild type levels in transformants expressing the mitochondrial and cytosolic cDNAs, respectively, whereas no activity was detected in GS245 extracts. Although pTZ19U is a high copy number plasmid, none of the transformants displayed elevated SHMT activity. This probably reflects the fact that the mitochondrial enzyme would have been synthesized as a fusion protein, whereas the cytosolic cDNA lacks a region with homology to the Shine-Dalgarno ribosome binding site, so the translation efficiency of its mRNA in bacteria would be expected to be very poor.
Localization of Human SHMT Genes-The probe for the cytosolic SHMT gene was mapped to chromosome band 17p11.2 (Fig. 5, A and B). The mitochondrial SHMT cDNA probe was mapped to chromosome 12q13, most likely on subband 12q13.2 (Fig. 5, C and D). Two independent experiments were carried out, and over 400 metaphase cells were evaluated. For cytosolic SHMT, signals were noted on two chromatids of at least one chromosome 17p11.2-p12 in 45% of cells (n = 200) and the gene was finally localized to 17p11.2. For the mitochondrial SHMT probe, clear signals were noted on two chromatids of at least one chromosome 12q13 in 50% of cells (n = 200). No secondary signals on any chromosome band were noted in greater than 0.5% of cells.
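The scoring percentages above translate directly into cell counts. As a rough illustration, the sketch below converts them back to counts and attaches an approximate 95% Wilson score interval. The interval is an addition for illustration only: the paper reports no confidence intervals, and treating each scored cell as an independent binomial trial is an assumption.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Percentages and n as reported in the text.
for probe, percent, n in [("cytosolic, 17p11.2", 45, 200), ("mitochondrial, 12q13", 50, 200)]:
    cells = round(percent / 100 * n)
    low, high = wilson_interval(cells, n)
    print(f"{probe}: {cells}/{n} cells, interval about {100 * low:.0f}-{100 * high:.0f}%")
```

For comparison, the 0.5% ceiling on secondary signals corresponds to a single cell out of 200.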

DISCUSSION
This report describes the cDNA sequences of human cytosolic and mitochondrial SHMT and their deduced protein sequences. The relative ease with which these two cDNAs, which code for high abundance proteins, were cloned highlights the value of the λYES expression vector system (23). We have previously used this system to clone a cDNA for human folylpolyglutamate synthetase, an extremely low abundance protein (34). However, a limitation of this technique, at least when used for expression in bacteria, is the need for the expressed protein to be functionally active, and consequently full-length cDNAs for inactive pre-proteins, possessing N-terminal leader sequences, would not be isolated. The mitochondrial cDNA obtained in the current study lacked 5' sequences coding for the mitochondrial import signal and the start ATG, and the mitochondrial isozyme was expressed as a fusion protein from an ATG present in the λYES vector. Preliminary attempts at isolating the 5' region of the cDNA from the λYES library using polymerase chain reaction techniques were not successful. Although the possibility that the mitochondrial isozyme lacks a conventional leader sequence cannot be excluded, the presence of a leader sequence in the recently described pea mitochondrial isozyme (31) suggests this is unlikely. We have recently isolated genomic clones for both human SHMT isozymes,³ and sequence analysis of the mitochondrial clone should indicate the nature of the leader sequence.
Because of the localization of the cytosolic SHMT gene to chromosome region 17p11.2, it may be of interest to determine the relationship of SHMT to the duplication responsible for Charcot-Marie-Tooth neuropathy type 1A and the Smith-Magenis syndrome, as both are located in the same region (35). This region is located significantly centromeric to the region deleted in the Miller-Dieker syndrome and is relatively gene-rich, containing genes for ubiquitin B (UBB), muscle nicotinic cholinergic receptor β polypeptide (CHRNB1), keratin 18 (KRT18), and numerous other keratin-related sequences.
A gene for SHMT was previously mapped to chromosome 12q12-q14 using a panel of somatic cell hybrids of the Chinese hamster ovary cell glyA, which lacks mitochondrial SHMT activity (36). The mapping of the mitochondrial SHMT gene to chromosome region 12q13 suggests this is the same gene, as no additional signals were seen in this region suggestive of a third SHMT-related gene. The techniques used in the current study would have detected members of a multigene family sharing somewhat less than 90% homology over at least 1.2 kb.⁴ Chromosome region 12q13 is also relatively gene-rich, although the precise band assignments of many of these genes (12q12-14) are not known (37). Of interest here, however, may be the fragile site, folic acid type, rare fra(12q13.1) located within the same band, likely centromeric to SHMT (38,39). Several neoplasias are also associated with translocations involving 12q13, although the relationship, if any, to SHMT is unknown. Two additional keratin genes (6A and 7) map to the region 12q12-21, which has suggested homology between this region and region 17p (37). The localization of the SHMT genes to these chromosomal regions and the high degree of nucleotide sequence identity between mitochondrial and cytosolic SHMT cDNAs are consistent with this proposal and suggest that the genes arose by a relatively recent duplication event. The deduced evolutionary relationship between SHMT proteins also suggests that the two isozymes arose from a gene duplication after the divergence of bacterial and eukaryotic proteins, and it is interesting to note that while some eukaryotic cells possess both cytosolic and mitochondrial SHMT isozymes, other eukaryotic cells express only the cytosolic or the mitochondrial activity.
FIG. 5. In situ localization of cytosolic and mitochondrial SHMT genes to human chromosomal regions 17p11.2 and 12q13, respectively. A human chromosomal preparation was hybridized with plasmids containing the SHMT cDNAs labeled with biotin-11-dUTP. A and C, standard photomicroscopy. Human chromosomes shown with R (reverse) banding generated using chromomycin A3/distamycin A. B, arrows indicate the fluorescein isothiocyanate signals corresponding to chromosome band 17p11.2 in A. D, arrows indicate the fluorescein isothiocyanate signals corresponding to chromosome band 12q13 in C.
³ P. Stover, L. Chen, and B. Shane, unpublished data.
The primary structure of SHMT has been very highly conserved through evolution, which may reflect a rigid requirement of residues for substrate binding and catalysis. It has been proposed that SHMT may be part of a multi-protein complex involved in purine or thymidylate synthesis in eukaryotic cells (19,40), and the high degree of sequence conservation may also reflect conservation of residues or tertiary structure involved in protein-protein interactions. The availability of mammalian cDNAs for cytosolic and mitochondrial SHMT will allow an assessment of essential noncatalytic residues that may be involved in complex formation and will also allow studies on the regulation of SHMT in mammalian cells and on the physiological role of the cytosolic and mitochondrial isozymes.
⁴ J. Korenberg, unpublished data.