The primary structure of spinach glycolate oxidase deduced from the DNA sequence of a cDNA clone.

A cDNA clone encoding the peroxisomal enzyme glycolate oxidase (EC 1.1.3.15) was identified by probing a cDNA library of spinach with synthetic oligonucleotides based on the partial amino acid sequence of the enzyme. Determination of the DNA sequence of the 1526-nucleotide cDNA indicated a 1107-nucleotide open reading frame which encodes a polypeptide of 40,282 daltons. The polypeptide produced by in vitro transcription and translation of the cDNA insert had the same apparent subunit molecular mass as the enzyme purified from leaves, indicating that the cDNA encodes a full-length polypeptide and that no cleavage of the polypeptide is required for uptake of the polypeptide by peroxisomes. Comparison of the deduced amino acid sequence with those of two other plant peroxisomal proteins revealed a region of homology which may be involved in directing proteins to the peroxisome.

The flavoprotein glycolate oxidase (EC 1.1.3.15) is a peroxisomal enzyme which catalyzes the oxidation of α-hydroxy acids. In vertebrates the enzyme is believed to be involved in the metabolic production of oxalate by the oxidation of glycolate through glyoxylate (1). The enzyme is also present in the leaves of higher plants where it catalyzes the second reaction of the photorespiratory pathway, the oxidation of glycolate to glyoxylate and H2O2 (2). Both the mammalian (3) and plant (4-6) enzymes are generally isolated as tetramers or octamers composed of identical subunits of approximately 43 kDa. Glycolate oxidase is the only FMN-dependent oxidase for which the tertiary structure has been determined (5). The subunit structure is an eight-stranded α/β-barrel, similar to that of triose-phosphate isomerase, in which the coenzyme is bound to the barrel. The structure of the flavin binding domain distinguishes glycolate oxidase from the other flavin-containing enzymes for which structures have been obtained (5). A detailed description of the relationship between structure and function of the enzyme has been delayed by the lack of primary sequence information.
* This research was supported in part by a grant from the McKnight Foundation and by United States Department of Energy Grant AC02-76ERO-1338. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J03492.
‡ To whom reprint requests should be addressed.
The availability of high resolution structural information makes glycolate oxidase an attractive subject for mechanistic studies of protein transport into peroxisomes. It is now apparent that peroxisomal enzymes are transported post-translationally into the preformed organelle (7). Unlike proteins destined for other organelles, peroxisomal proteins are not cleaved during transport into the peroxisome, and the structural features of the polypeptides required for peroxisomal localization are not known. The only peroxisomal enzymes from plants for which amino acid sequence information is available are nodulin 35, a subunit of soybean uricase II (8), and a partial deduced amino acid sequence of the glyoxysomal malate synthase from cucumber (9). Thus, the availability of additional sequences of peroxisomal enzymes may be useful in determining the molecular basis for peroxisomal localization.
In this report we describe the cloning of a full-length cDNA for glycolate oxidase from spinach, the determination of the deduced amino acid sequence from the DNA sequence, and the identification of a short region of amino acid homology shared by the three plant peroxisomal proteins for which sequence is available.

EXPERIMENTAL PROCEDURES
Materials-Glycolate oxidase was purified to homogeneity from spinach leaves as described (5) or was a gift from Dr. C. I. Branden (Uppsala). Rabbit antiserum against glycolate oxidase was prepared by standard methods. A λgt11 library derived from mRNA of spinach leaf (Spinacia oleracea L., cv. American Hybrid 424) was generously provided by W. L. Ogren (United States Department of Agriculture, Agricultural Research Service, Urbana, IL). Oligonucleotides were synthesized by the phosphoramidite method (10). Plasmid pKSM13+ "Bluescript" was purchased from Stratagene (San Diego, CA).
Nucleic Acid Hybridization-Nitrocellulose plaque lifts (13) were probed with a mixture of synthetic oligonucleotides (Fig. 1) which were 5'-end-labeled to an average specific activity of 10⁹ dpm μg⁻¹ with [γ-³²P]ATP (3000 Ci mmol⁻¹) and T4 polynucleotide kinase. Filters were prehybridized 3 to 5 h at 52 °C in 6× SSC, 50 mM sodium phosphate (pH 6.8), 5× Denhardt's solution, 100 μg ml⁻¹ of sonicated herring DNA (13). The hybridizations were carried out at 52 °C for 24 to 30 h in the same solutions with the addition of 10% (w/v) dextran sulfate and 0.5 pmol ml⁻¹ of labeled oligonucleotide. The temperature for hybridization was based on an empirical formula (14) by assuming that all ambiguous positions contained A or T bases.
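The temperature estimate described above can be sketched as follows. The text does not state the empirical formula of reference (14); this sketch assumes the widely used rule of 2 °C per A/T pair plus 4 °C per G/C pair. The probe string is also an assumption: it is the coding-strand representation of the peptide Met-Val-Tyr-Asp-Tyr-Tyr-Ala from Fig. 1, written with IUPAC ambiguity codes and truncated to 20 nucleotides. The actual probe may be the complementary strand, which has the same A/T and G/C counts.

```python
# Assumed coding-strand probe for Met-Val-Tyr-Asp-Tyr-Tyr-Ala (Fig. 1).
# N = any base, Y = C or T; the final Ala codon is cut to 2 nt (20-mer).
PROBE = "ATGGTNTAYGAYTAYTAYGC"

# Resolve every ambiguous position to A or T, as described in the text.
AT_CHOICE = {"N": "A", "Y": "T"}


def min_tm_2_plus_4(seq):
    """Lowest melting temperature (deg C) under the assumed 2+4 rule."""
    resolved = "".join(AT_CHOICE.get(base, base) for base in seq)
    at = resolved.count("A") + resolved.count("T")
    gc = resolved.count("G") + resolved.count("C")
    return 2 * at + 4 * gc


print(len(PROBE), min_tm_2_plus_4(PROBE))  # prints: 20 50
```

Under these assumptions the lowest predicted melting temperature is 50 °C, close to the 52 °C used for hybridization.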
Hybrid-selected Translation-RNA was hybrid-selected as described (16) ... dry, and the RNA was eluted as described (16), translated in a reticulocyte lysate system (Promega), and resolved on a 12% SDS-polyacrylamide electrophoresis gel which was dried and autoradiographed.
DNA Sequence Analysis-The fragment to be sequenced was digested with Sau3A, TaqI, AluI, HaeIII, or EcoRI and cloned into the BamHI, AccI, SmaI, or EcoRI sites, respectively, of M13mp18 or M13mp19. Cloning and preparation of template DNA was performed using standard protocols (17). The inserts were sequenced by the chain-termination method and resolved on buffer gradient gels (18).
In Vitro RNA Transcription-Conditions for in vitro transcription of pMV2 were essentially as described (19) except that T7 RNA polymerase was used and the DNase treatment was omitted.

Met Val Tyr Asp Tyr Tyr Ala
FIG. 1. The oligonucleotide mixture used as a probe for the glycolate oxidase gene and the corresponding amino acid sequence. The methionine residue was not observed on the peptide which was obtained from cyanogen bromide cleavage but was inferred from the mode of action of cyanogen bromide.

RESULTS
From a partial amino acid sequence of the spinach glycolate oxidase² we designed a mixture of 64 synthetic oligonucleotides (20-mers) which represented all possible DNA sequences which could encode the amino acid sequence (Fig. 1). The utility of this mixture as a gene-specific probe was suggested by the observation that when used to probe a genomic Southern blot of spinach DNA only one major band of homology was observed under stringent washing conditions (results not presented). The temperature for hybridization was chosen, on the basis of empirical rules (14), to be 2 °C above the predicted melting temperature which would result if all ambiguous nucleotides in the probe were A or T. The washing conditions were based on the observation that replacement of sodium chloride with tetramethylammonium chloride permits relatively accurate prediction of melting temperatures solely on the basis of oligonucleotide length (15).
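The figure of 64 oligonucleotides follows from the codon degeneracy of the peptide in Fig. 1. A minimal sketch, assuming the probe is written as the coding strand with IUPAC ambiguity codes and that the 20-mer stops after the second base of the Ala codon (neither detail is stated explicitly in the text):

```python
from itertools import product

# IUPAC codes needed here: Y = C or T, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "N": "ACGT"}

# Met  Val  Tyr  Asp  Tyr  Tyr  Ala (first two bases only)
# ATG  GTN  TAY  GAY  TAY  TAY  GC
PROBE = "ATGGTNTAYGAYTAYTAYGC"


def expand(seq):
    """Enumerate every unambiguous sequence encoded by a degenerate one."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


oligos = expand(PROBE)
print(len(oligos), len(oligos[0]))  # prints: 64 20
```

The count is the product of the ambiguities at each position: 4 (Val) × 2 × 2 × 2 × 2 (three Tyr and one Asp) = 64. Dropping the third base of the Ala codon avoids a further fourfold increase.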
A λgt11 library derived from spinach leaf poly(A+) RNA was probed with the end-labeled oligonucleotide mixture using the same hybridization conditions established for the genomic Southern blot. Of 500,000 plaques screened, three gave a reproducible signal and were retained. One of these, designated λGLO-1, had a 1.5-kilobase insert which was subcloned in both orientations into the EcoRI site of pKSM13+ to produce plasmids pMV2 and pMV21.
In order to ensure that the cloned fragment encoded glycolate oxidase, the insert from pMV2 was used to select an mRNA from leaf poly(A+) RNA. This hybrid-selected RNA was then translated in vitro and the products were electrophoresed in SDS-polyacrylamide gels before and after immunoprecipitation with a rabbit antibody specific for glycolate oxidase (Fig. 2). Even without immunoprecipitation a labeled polypeptide of about 43 kDa was apparent, and after immunoprecipitation only that polypeptide was seen. The apparent molecular weight of these polypeptides corresponded to a 43-kDa protein which was one of two polypeptides immunoprecipitated from the in vitro translation products of leaf poly(A+) RNA. Thus, we concluded that the insert in pMV2 encodes glycolate oxidase. The identity of the 39-kDa polypeptide which is also immunoprecipitated from the translation products of poly(A+) RNA by the antibody is not known. Since the protein used to immunize the rabbit was highly purified (5), we tentatively conclude that the 39-kDa polypeptide shares an epitope with glycolate oxidase.
² C. Branden, personal communication.

The complete DNA sequence of the insert in pMV2 was determined on both strands according to the strategy outlined in Fig. 3. The composite nucleotide and deduced amino acid sequences are shown in Fig. 4. The entire insert was 1526 nucleotides and contained an open reading frame of 1107 nucleotides which gave a deduced amino acid sequence of 369 residues, corresponding to a protein of 40,282 daltons. The deduced amino acid sequence has an excess of positively charged amino acids (+5) which is consistent with the observation that other peroxisomal matrix proteins have relatively high isoelectric points (7). One potential site for asparagine-linked glycosylation is present in the deduced amino acid sequence (nucleotides 583-591).
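The figures quoted above can be checked with simple arithmetic. The sketch below uses only numbers stated in the text; the function name is hypothetical, and the mean residue mass is a derived value rather than one reported by the authors.

```python
ORF_NT = 1107        # open reading frame length, nucleotides (from the text)
PROTEIN_DA = 40282   # deduced polypeptide mass, daltons (from the text)


def orf_summary(orf_nt, protein_da):
    """Return (residue count, mean residue mass in Da) for an ORF."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    residues = orf_nt // 3
    return residues, protein_da / residues


residues, mean_mass = orf_summary(ORF_NT, PROTEIN_DA)
print(residues, round(mean_mass, 1))  # prints: 369 109.2
```

The 1107 nucleotides correspond to exactly 369 codons, so the quoted length appears to exclude the termination codon. The mean of about 109 Da per residue is in the range usually assumed for proteins.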
The first ATG in the sequence of the cDNA occurs at nucleotide 135. Thus, if translation were to begin at another site in a missing region of the RNA further upstream, the polypeptide would have a molecular mass at least 4.5 kDa greater than if it begins at nucleotide 135. In order to test whether the ATG at nucleotide 135 is the normal start site for translation, we determined the apparent molecular mass of the polypeptide produced by in vitro transcription and translation of the cDNA clone. This was facilitated by the fact that plasmid pMV2 was constructed in such a way that the 5' end of the cDNA was proximal to a site on the parent vector pKSM13+ for initiation of transcription by T7 RNA polymerase. The vector was also designed so that the ATG at nucleotide 135 is the first ATG in the transcript originating from the T7 polymerase initiation site. Comparison of the translation products from the in vitro transcript and poly(A+) RNA from leaves indicated that the polypeptide product of the cloned gene had the same apparent molecular mass as that from poly(A+) RNA (Fig. 5). Since a discrepancy of 4.5 kDa would have been readily apparent, we are confident that the open reading frame indicated in Fig. 4 is correct.

[Fig. 4, partial: ... Ser His Ile Ala Ala Asp Trp Asp Gly Pro Ser Ser Arg Ala Val Ala Arg Leu TER]

In vitro translation of spinach leaf poly(A+) RNA and the RNA produced by in vitro transcription of pMV2 with T7 polymerase resulted in the accumulation of several polypeptides, in addition to the 43-kDa polypeptide, which were immunoprecipitated by the anti-glycolate oxidase antiserum (Fig. 5). A similar observation by others (20) has been attributed to spurious initiation of translation at AUG codons other than the normally used initiator-AUG codon. The apparent molecular masses of the minor translation products in Fig. 5 (lane b) are 39, 32, 29, 27, 19, and 9 kDa. These values coincide very well with the predicted molecular masses of 38.1, 33.3, 30.8, 29, 19, and 11 kDa which would be obtained if translation of the in vitro synthesized transcript initiated at all AUG codons at a low frequency. The relatively high level of accumulation of the 39-kDa polypeptide in the translation products of poly(A+) RNA (Fig. 5) may reflect the fact that the poly(A+) RNA is capped whereas the transcript from pMV2 is not.
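The agreement between observed and predicted masses can be made explicit by pairing the two lists in the order given. This is a minimal sketch; the one-to-one pairing by rank is an assumption, since the text does not assign each band to a specific AUG codon.

```python
# Apparent masses of the minor products (kDa), from the text.
observed = [39, 32, 29, 27, 19, 9]
# Masses predicted for initiation at internal AUG codons (kDa), from the text.
predicted = [38.1, 33.3, 30.8, 29, 19, 11]

# Assumed pairing: largest observed band with largest predicted product, etc.
deviations = [round(o - p, 1) for o, p in zip(observed, predicted)]
largest = max(abs(d) for d in deviations)

for o, p, d in zip(observed, predicted, deviations):
    print(f"observed {o:>4} kDa  predicted {p:>5} kDa  difference {d:+.1f}")
print("largest absolute difference:", largest, "kDa")
```

With this pairing no band differs from its predicted mass by more than 2 kDa, which is within the resolution one would expect from SDS-polyacrylamide gel estimates.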
In order to examine the possibility that peroxisomal proteins share a common structural motif, we compared the deduced amino acid sequence of glycolate oxidase to the deduced amino acid sequences of the soybean nodulin 35 polypeptide (8) and cucumber malate synthase (9). In both comparisons, the FASTP algorithm designed by Lipman and Pearson (21) identified the same region of glycolate oxidase as having significant amino acid sequence homology with the other two polypeptides (Fig. 6). In the comparison of glycolate oxidase and malate synthase, a 19-residue region had 42% homology. In the comparison of glycolate oxidase and nodulin 35 there was 28% homology over a 28-residue region.
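To make the percentages concrete, the sketch below computes percent identity over an ungapped aligned region and back-calculates the number of matching positions implied by the reported values. It assumes that "homology" here means identical residues. The two short sequences are hypothetical toy strings, not the sequences of Fig. 6, and the function is not the FASTP algorithm itself.

```python
def percent_identity(a, b):
    """Return (matches, percent identity) for two equal-length aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned regions must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, 100.0 * matches / len(a)


# Hypothetical toy alignment, for illustration only.
print(percent_identity("MVYDYYAKLG", "MVHDYFAKIG"))  # prints: (7, 70.0)

# Matching positions implied by the reported values (region length, percent).
for length, pct in [(19, 42), (28, 28)]:
    print(length, "residues,", pct, "% ->", round(length * pct / 100), "matches")
```

Both comparisons correspond to roughly eight matching positions, so the lower percentage for nodulin 35 mainly reflects the longer region over which it was scored.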

DISCUSSION
Several lines of evidence indicate that pMV2 contains a cDNA sequence encoding the full-length glycolate oxidase. First, the cDNA uniquely selected mRNA which, when translated in vitro, yielded a polypeptide of the correct apparent molecular mass which was immunoprecipitated by antiserum against glycolate oxidase. Similarly, in vitro transcription and translation of the cDNA insert in pMV2 yields a polypeptide with an apparent molecular weight of about 43 kDa which comigrated with in vitro translated glycolate oxidase and was immunoprecipitated by the anti-glycolate oxidase antibody. In addition, the deduced amino acid sequence of the cDNA insert in pMV2 contained the partial amino acid sequence of glycolate oxidase which was used to design the mixed oligonucleotide probe.
It is generally accepted that peroxisomal proteins are translated on free cytoplasmic ribosomes and post-translationally imported without obligate cleavage of a signal or transit peptide (7). Thus, the information which directs a polypeptide to the peroxisome must be an intrinsic property of the structural protein. Although glycolate oxidase has one potential glycosylation site, the lack of an apparent difference in the molecular mass of the in vitro translated polypeptide and the subunit molecular mass of the enzyme purified from leaves (22) suggests that it is not glycosylated. Thus, an inherent structural feature of the polypeptide is implicated by default. It has recently been suggested (23) that proteins imported into glycosomes of Trypanosoma brucei have common elements on their surface which may serve as a topogenic signal for import into the glycosomes. In this respect it is interesting to note the homology found between a specific region of glycolate oxidase, uricase, and malate synthase (Fig. 6). Since these three plant peroxisomal proteins are functionally dissimilar, the existence of structural homology raises the possibility that this region may be involved in import of plant peroxisomal proteins. Alignment of the primary and tertiary structure of glycolate oxidase² indicates that the putative "sorting sequence" contained in the region of the polypeptide from amino acid 160 to 190 (Fig. 6) is located in a 45-residue loop which forms an exposed surface of the molecule (5). Thus, the site is in a suitable location to interact with components of the peroxisome involved in protein transport. The availability of the cloned glycolate oxidase gene and the system for in vitro expression of the gene described here should facilitate the direct testing of this hypothesis by permitting the creation and analysis of the effects of mutations in the structural gene on the in vitro transport of glycolate oxidase into isolated peroxisomes (24).