The Influence of Flanking Sequence on the O-Glycosylation of Threonine in Vitro*

To investigate the influence of flanking amino acid sequence on the O-glycosylation of a single threonine residue in vitro, we have examined a series of 52 related peptides. The substrates were based upon a sequence from human von Willebrand factor which is known to be glycosylated in vivo (-6PHMAQVTVGPGL+5). Each residue of the parent peptide was substituted, in turn, with isoleucine, alanine, proline, glutamic acid, or arginine. Peptides were glycosylated using a UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase purified 15,000-fold from bovine colostrum by chromatography on DEAE-Sephacel, SP-Sephadex, Sephacryl S-300, Affi-Gel Blue, and 5-mercuri-UDP-GalNAc thiopropyl-Sepharose. Single amino acid changes in the sequences flanking the threonine could profoundly alter the glycosylation of the substrate peptides. Substitution of any amino acid tested at positions +3, -3, and -2 markedly decreased O-glycosylation, as did the presence of a charged residue at position -1. The substitution of amino acids at the other positions of the peptide substrate had little effect on the incorporation of GalNAc.

(Received for publication, August 5, 1992)
Brian C. O'Connell‡, Fred K. Hagen§, and Lawrence A. Tabak¶
Statistical analysis of sequences flanking known glycosylated threonine and serine residues suggests that they should be glycosylated with equal efficiency in the same sequence context (O'Connell et al., 1991). However, the bovine colostrum transferase failed to glycosylate a peptide derived from human erythropoietin which contains a serine that is glycosylated in vivo (-5PPDAASAAPLR+5). When a threonine was substituted for the serine in this peptide (-5PPDAATAAPLR+5), the substrate proved to be an excellent acceptor of GalNAc. These observations indicate that although flanking amino acid sequence is important for the O-glycosylation of specific hydroxyamino acids, discrete threonine- and serine-specific transferases may exist.
* This work was supported in part by National Institutes of Health Grants DE-08108 and DE-089511 (to L. A. T.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ Supported by Dentist-Scientist Award K16 DE-00159; these studies constitute work toward fulfillment of the Ph.D. degree.
§ Supported by National Institutes of Health Grant T32.
¶ To whom correspondence should be addressed. Tel.: 716-275-0770; Fax: 716-473-2679.
The initial step of mucin-type O-glycosylation is the attachment of N-acetylgalactosamine to a hydroxyamino acid. The addition of this monosaccharide extends the conformation of the protein backbone and is the point at which other sugars may be added to build an oligosaccharide side chain. In contrast to N-glycosylation, there is no consensus sequence for glycosylation by UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase. A persistent problem with the detailed analysis of O-glycosylation is that the number of O-linked glycans which have been assigned to specific hydroxyamino acids has remained rather small, presumably due to the difficulty of directly sequencing O-linked glycoproteins (O'Connell et al., 1991).
Previous attempts at defining an O-glycosylation consensus signal have emphasized the importance of proline near the glycosylation site (Aubert et al., 1976; Wilson et al., 1991; Gooley et al., 1991). In particular, the observation that the Thr-Pro-Pro-Pro sequence in the bovine myelin protein A1 is glycosylated in vitro has dominated later work in this area (Hagopian et al., 1971; Young et al., 1979; Briand et al., 1981; Hughes et al., 1988; Cottrell et al., 1992) and has led to the proposal that O-glycosylation requires a turn in the substrate (Aubert et al., 1976; Eckhardt et al., 1987). However, the bovine myelin protein A1 is not glycosylated in vivo, and the triprolyl motif has been found in only a few glycoproteins (Smyth and Utsumi, 1967; Oppenheim et al., 1985). Moreover, there are several examples of O-glycosylated sites which lack proline within the surrounding flanking region (Honma et al., 1980; Rall et al., 1982).
Previously, we compiled a database of unambiguously defined O-glycosylation sites and compared their flanking amino acid sequences with those of nonglycosylated serine and threonine residues (O'Connell et al., 1991). Our analysis suggested that both the type and position of amino acids near a potential glycosylation site could be important for glycosylation by the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase. In particular, the presence of a proline, alanine, serine, or threonine at positions +3, -1, -6, and -3 was often associated with glycosylation of the hydroxyamino acid, whereas a charged residue at these positions was often associated with a serine or threonine which was not O-glycosylated (- and + refer to residues that are N-terminal and C-terminal to a hydroxyamino acid, respectively). Other positions flanking the potential glycosylation site appeared to have no influence on glycosylation.
In the present study we have investigated the influence of each residue from -6 to +5, relative to threonine, on the glycosylation of a peptide substrate. The sequence of the substrates is based on a region of human von Willebrand factor that contains a single threonine which is glycosylated in vivo (Titani et al., 1986). Five amino acids were chosen to substitute each position of the parent peptide, representing residues with different properties, i.e. hydrophobic residues, charged residues, small side chains, and association with turns. The substrates for the in vitro glycosylation assays consisted of 52 synthetic peptides, most of which were synthesized as mixtures and separated by HPLC. The substrates varied greatly in their ability to be glycosylated, confirming our hypothesis that single amino acid substitutions in the flanking sequence can alter the acceptor status of a hydroxyamino acid. Substitution by any amino acid tested at positions +3, -3, and -2 markedly decreased O-glycosylation. The presence of a charged residue (glutamic acid or arginine) at position -1 resulted in an inhibition of O-glycosylation. However, the substitution of amino acids at the other positions of the peptide substrate had little effect on the incorporation of GalNAc.
Although no difference was found between the sequence of amino acids surrounding glycosylated serines and threonines, when serine was substituted for threonine in the peptide derived from von Willebrand factor (-6PHMAQVSVGPGL+5), the substrate was not O-glycosylated (O'Connell et al., 1991). In this report, we demonstrate that the bovine colostrum transferase fails to glycosylate a substrate having a sequence from human erythropoietin (-5PPDAASAAPLR+5) which contains a serine that is glycosylated in vivo (Lai et al., 1986). Furthermore, we were unable to detect appreciable serine-specific activity in any column fraction throughout purification of the transferase or in crude colostrum. Although it has been assumed that the addition of GalNAc to serine and threonine is catalyzed by the same enzyme (Hagopian and Eylar, 1969; Hill et al., 1977; Sugiura et al., 1982; Elhammer and Kornfeld, 1986), collectively, our observations support the proposal that discrete threonine- and serine-specific transferases must exist (O'Connell et al., 1991; Wang et al., 1992).

Substrate Purification
A series of 12-residue peptides was synthesized based on the sequence surrounding a glycosylated threonine residue in human von Willebrand factor (HVF), -6PHMAQVTVGPGL+5. Single residue substitutions were made at each position from 6 residues N-terminal to the threonine (-6) to 5 residues C-terminal (+5) to the threonine. Each position of the sequence was substituted in turn by isoleucine, alanine, proline, glutamic acid, and arginine. The complete set of substrates consisted of 53 peptides, which were named for the substituted amino acids they contained; for example, peptide -2A had an alanine residue at position -2 relative to the threonine (Table I).
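The substitution scheme described above can be enumerated directly. The following is a minimal sketch, not the authors' procedure; the function names are hypothetical, and it assumes that a substitution reproducing the native residue (for example, alanine at -3) is counted as the parent peptide rather than as a new substrate.

```python
# Enumerate single-residue substitutions of the HVF parent peptide.
PARENT = "PHMAQVTVGPGL"          # positions -6 .. +5, threonine at position 0
SUBSTITUTES = "IAPER"            # Ile, Ala, Pro, Glu, Arg
THR_INDEX = PARENT.index("T")    # string index of the acceptor threonine


def position_label(index: int) -> str:
    """Signed position of a string index relative to the threonine."""
    return f"{index - THR_INDEX:+d}"


def substitution_library(parent: str = PARENT, substitutes: str = SUBSTITUTES) -> dict:
    """Map names such as '-2A' to sequences.

    Assumption: the threonine itself is never substituted, and a
    substitution identical to the native residue is skipped.
    """
    library = {}
    for i, native in enumerate(parent):
        if i == THR_INDEX:
            continue
        for aa in substitutes:
            if aa == native:
                continue  # e.g. Ala at -3 or Pro at -6/+3 is just the parent
            library[position_label(i) + aa] = parent[:i] + aa + parent[i + 1:]
    return library


library = substitution_library()
print(len(library) + 1)    # distinct variants plus the parent peptide
print(library["-2A"])      # sequence of the peptide named -2A
```

Under these assumptions the enumeration yields 52 distinct variants, which together with the parent peptide gives the 53 substrates stated in the text.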
Based on the sequence flanking a glycosylated serine residue in human erythropoietin (EPO), a set of six peptides was synthesized (-5PPDAASAAPLR+5). In this case, the hydroxyamino acid was changed from a single serine residue to threonine or to a pair of hydroxyamino acids in each combination of serine and threonine. These peptides were named EPO-S, EPO-T, EPO-SS, EPO-TT, EPO-ST, and EPO-TS, depending on the hydroxyamino acids they contained (see Table II). The EPO series of substrates were made individually and purified by reverse-phase HPLC. The peptides were not acetylated so that the acceptor residue could be identified by direct sequencing.
Step 1: Separation of Cellular Debris and Lipid Globules-Whole bovine colostrum (288 ml) was centrifuged for 10 h at 10,000 × g. The lipid layer and cellular pellet were separated from the supernatant by pouring through a few layers of cheesecloth. A minitangential flow apparatus (Millipore) was used to perform a buffer exchange twice into 10 volumes of Buffer A. The liquid phase of the colostrum was concentrated to 250 ml and then centrifuged at 80,000 × g for 90 min to remove any remaining lipid and insoluble material.
Steps 2 and 3: DEAE-Sephacel and SP-Sephadex Chromatography-The 80,000 × g supernatant was applied to a column (10 × 30 cm) of DEAE-Sephacel resin equilibrated with Buffer A. The flow-through fractions were detected by absorbance measurements at 280 nm and were pooled and applied directly to an SP-Sephadex column (10 × 30 cm) also equilibrated with Buffer A. The SP-Sephadex flow-through fractions were pooled and concentrated 3.5-fold in a minitangential flow apparatus.
Step 4: Sephacryl S-300 Chromatography-The SP-Sephadex flow-through fraction was brought to a protein concentration of 25 mg/ml and 0.10 volume of 10× Buffer B was added. This sample (150 ml final volume) was applied to a column (5 × 120 cm) of Sephacryl S-300 which was equilibrated in Buffer B. Elution of protein was monitored by absorbance at 280 nm, and fractions were assayed for transferase activity. The purest active fractions were pooled so that at least 80% of the total activity was recovered.
Step 5: Affi-Gel Blue Resin Chromatography I-The 210 ml of pooled material from Step 4 was diluted to 400 ml with Buffer B and applied to a column (2.5 × 25 cm) of Affi-Gel Blue, equilibrated with Buffer B. The column was eluted stepwise with 500 ml of Buffer B, 500 ml of Buffer C, 200 ml of Buffer D, and 200 ml of Buffer C. Bound protein (including transferase) was eluted from the column with 200 ml of Buffer E and collected in 6-ml fractions. The eluted material was pooled (150 ml) and immediately diluted to 750 ml with Buffer B. The resin was regenerated with 200 ml of Buffer F followed by 200 ml of Buffer G before proceeding to Step 6.
Step 6: Affi-Gel Blue Resin Chromatography II-Material from Step 5 was loaded onto a column (2.5 × 25 cm) of Affi-Gel Blue, equilibrated with Buffer B. The column was eluted stepwise with 250 ml of Buffer B, 250 ml of Buffer C, and 200 ml of Buffer D containing 0.2 mM cytidine 5'-monophosphate. The remaining chromatography in this and subsequent steps was done with glycerol-containing buffers. 200 ml of Buffer H was applied to the column and the enzyme was eluted with a linear gradient of 0.1 M NaCl to 1 M NaCl (125 ml of Buffer H and 125 ml of Buffer I), followed by 50 ml of Buffer I.
Fractions of 3.8 ml were collected, monitored for absorbance at 280 nm, and tested for transferase activity using a threonine-containing and a serine-containing substrate. The eluant was divided into pool 1 (33 ml) and pool 2 (80 ml), whereupon each pool was added to an equal volume of Buffer J and then dialyzed against 2 liters of Buffer J in Spectra/Por 1 dialysis tubing (6000-8000 Mr cut-off, 25.5 mm diameter). The Affi-Gel Blue resin was regenerated with Buffers F and G as in Step 5.
Step 7: Affi-Gel Blue Resin Chromatography III-Pool 1 from Step 6 was diluted to 120 ml with Buffer J and then applied to a column (1 × 19 cm) of Affi-Gel Blue resin, equilibrated with Buffer J. The column was rinsed with 60 ml of Buffer J, and the transferase activity was eluted with a linear gradient of NaCl using 50 ml of Buffer J and 50 ml of Buffer K. The gradient was followed by 50 ml of Buffer K and finally 50 ml of Buffer J. The 2-ml fractions were assayed in the same way as in the previous step. The transferase activity appeared to elute as two peaks, which were pooled separately and designated pool 1A (8 ml) and pool 1B (10 ml).
Pool 2 from Step 6 (200 ml) was applied to the column (1 × 19 cm) of Affi-Gel Blue and then treated exactly the same as the first pool. Two peaks of transferase activity were again seen and designated pools 2A (5.5 ml) and 2B (8.5 ml). The four pools collected from Step 7 were dialyzed against 2 liters of Buffer L.
Step 8: 5-Mercuri-UDP-N-acetylgalactosamine Thiopropyl-Sepharose Chromatography-The affinity resin used for this step was prepared essentially as described by Bendiak and Schachter (1987) for 5-Hg-UDP-GlcNAc thiopropyl-Sepharose, except that the UDP-GalNAc nucleotide sugar was substituted. The extent of mercuration of UDP-GalNAc was calculated by passage of an aliquot of the mercurated nucleotide mixture (1.6 μmol) through 0.5 ml of reduced thiopropyl-Sepharose and then measuring the absorbance of the eluted material at 267 nm. The yield of 5-Hg-UDP-GalNAc was 69%. The thiopropyl-Sepharose was reduced and coupled to the 5-Hg-UDP-GalNAc as described previously.
The 5-Hg-UDP-GalNAc thiopropyl-Sepharose resin (25 μmol of ligand/ml of gel) was divided into two columns (0.7 × 3.9 cm), which were equilibrated with 15 ml of Buffer L. Pools 1A and 2B from Step 7 were applied to the columns and then to each was added 7.5 ml of Buffer L, 15 ml of Buffer M, and 7.5 ml of Buffer N. The transferase was recovered with 4 ml of Buffer O, collected as 600-μl fractions into microcentrifuge tubes. The fractions were assayed and the activity from each column was found to be in one major peak. The peak tubes of each column were combined as pools 1AI and 2BI (2.3 ml). An aliquot of each pool was dialyzed against Buffer L and used subsequently to assay the substrates. The enzyme was stored in Buffer L at -70 °C and was stable for several months.

UDP-GalNAc:Polypeptide N-Acetylgalactosaminyltransferase Assays
The enzyme activity of the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase in bovine colostrum and purified fractions was determined by monitoring the transfer of GalNAc from 14C-labeled UDP-GalNAc to a peptide substrate.
In Steps 1 to 4 of the GalNAc transferase purification, enzyme activity was assayed using the tetrapeptide acceptor Ac-TPPP (50 nmol/assay) in a final volume of 50 μl of 125 mM Tris, pH 7.1, 0.5% Triton X-100, 10 mM MnCl2, and 7.3 μM UDP-[14C]GalNAc (40,000 cpm/assay). The amount of enzyme used was such that <5% of the substrate was consumed in the reaction. After a 1-h incubation at 37 °C, the reaction was terminated by addition of 10 μl of 150 mM EDTA, followed by immediate separation of unincorporated UDP-[14C]GalNAc on a 200-μl Dowex 1-X8 anion exchange resin (in formate form). Pools from each step of the purification procedure were also assayed in this way to determine the -fold purification of the enzyme, and the assays were repeated on three separate occasions.
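The assay conditions above fix the specific radioactivity of the donor, which is what converts measured counts into amounts of GalNAc transferred. The sketch below works through that arithmetic using only the stated values (7.3 μM donor, 50 μl, 40,000 cpm/assay). The function names are hypothetical, counting efficiency is ignored, and expressing consumption relative to the donor nucleotide sugar is an assumption, since the text does not say which substrate the 5% limit refers to.

```python
def pmol_in_assay(conc_uM: float, volume_ul: float) -> float:
    """Amount in pmol; 1 uM x 1 ul = 1 pmol."""
    return conc_uM * volume_ul


def specific_radioactivity(total_cpm: float, conc_uM: float, volume_ul: float) -> float:
    """Specific radioactivity of the donor in cpm/pmol."""
    return total_cpm / pmol_in_assay(conc_uM, volume_ul)


DONOR_PMOL = pmol_in_assay(7.3, 50.0)                    # UDP-[14C]GalNAc per assay
CPM_PER_PMOL = specific_radioactivity(40_000, 7.3, 50.0)


def pmol_transferred(product_cpm: float) -> float:
    """GalNAc incorporated into peptide, from counts in the product."""
    return product_cpm / CPM_PER_PMOL


def percent_donor_consumed(product_cpm: float) -> float:
    """Assumption: consumption is expressed relative to the donor."""
    return 100.0 * pmol_transferred(product_cpm) / DONOR_PMOL


print(round(DONOR_PMOL, 1), round(CPM_PER_PMOL, 1))
print(round(percent_donor_consumed(2000), 2))
```

With these figures each assay contains 365 pmol of donor at roughly 110 cpm/pmol, so a product fraction of 2,000 cpm would correspond to 5% of the donor.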
During the transferase purification, column fractions and pooled material were assayed for activity toward other serine- and threonine-containing substrates. These substrates were based on a naturally occurring glycosylated serine residue in human erythropoietin, EPO-S (-5PPDAASAAPLR+5) and EPO-T (-5PPDAATAAPLR+5). The assays consisted of 20 nmol of peptide, 40 mM sodium cacodylate, 0.1% Triton X-100, 4 mM 2-mercaptoethanol, 10 mM MnCl2, and 14.6 μM UDP-[14C]GalNAc, at pH 6.5 in a volume of 50 μl, to which 2 μl of the column fraction to be assayed was added. The reactions were incubated in a 37 °C heat block for 45 min to 2 h and then terminated by the addition of EDTA.
Glycosylated peptide was separated from unincorporated UDP-[14C]GalNAc on Bond Elut C18 columns (Varian) using a Vac Elut (Analytichem) apparatus. Briefly, each column was wetted with 1 ml of methanol, followed by 1 ml of water containing 0.1% trifluoroacetic acid. Another 1 ml of water/trifluoroacetic acid was placed in the reservoir of the column, and the entire assay mixture was added and mixed by pipetting. The sample was then drawn through the column, which was rinsed with 1 ml of water/trifluoroacetic acid. The glycosylated substrate was eluted directly into 7-ml glass scintillation vials using 1 ml of 30% acetonitrile, 0.1% trifluoroacetic acid in water. Universol ES (ICN) scintillation fluid was mixed with the eluted material and counted on a Beckman LS1801 liquid scintillation counter. The efficiency of counting was 97%.
For Km determinations of the peptides HVF, -6E, and +3A, the UDP-[14C]GalNAc concentration was 36.5 μM. The acceptor concentration range was 0.4-8 mM. All the reactions were incubated for 15 min at 37 °C, after which they were immediately terminated and applied to the C18 columns as before. The assays were performed in triplicate and values for the apparent Km of UDP-GalNAc were calculated with the k·cat program (BioMetallics, Inc.) using a constant relative error.
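For readers without access to the original fitting program, an apparent Km can be estimated from rate data by a simple linearization. The sketch below uses a Hanes-Woolf plot with unweighted least squares, which is a different technique from the constant-relative-error fit named above. The five acceptor concentrations and the rates are synthetic values generated from a Michaelis-Menten curve with the Km reported in this paper for the HVF peptide (1.8 mM) and an arbitrary Vmax; they are not measured data.

```python
def hanes_woolf_fit(substrate_mM, rates):
    """Estimate (Km, Vmax) from the Hanes-Woolf line s/v = s/Vmax + Km/Vmax."""
    x = list(substrate_mM)
    y = [s / v for s, v in zip(substrate_mM, rates)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # 1 / Vmax
    intercept = mean_y - slope * mean_x    # Km / Vmax
    vmax = 1.0 / slope
    return intercept * vmax, vmax


def michaelis_menten(s, km, vmax):
    """Rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


# Hypothetical concentrations spanning the 0.4-8 mM range used in the text.
concentrations = [0.4, 0.8, 2.0, 4.0, 8.0]
# Synthetic, noise-free rates (arbitrary units) for Km = 1.8 mM, Vmax = 1.0.
synthetic_rates = [michaelis_menten(s, 1.8, 1.0) for s in concentrations]

km_est, vmax_est = hanes_woolf_fit(concentrations, synthetic_rates)
print(round(km_est, 3), round(vmax_est, 3))
```

Because the synthetic rates are noise-free, the fit returns the input parameters; with real triplicate data a weighted nonlinear fit, as used by the authors, would generally be preferable.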
The substrate specificity of the two most purified transferase preparations, 1AI and 2BI, was tested with each of the 52 peptide substrates. The reactions contained 20 nmol of peptide, 40 mM sodium cacodylate, 0.1% Triton X-100, 4 mM 2-mercaptoethanol, 10 mM MnCl2, and 18.25 μM UDP-[14C]GalNAc, at pH 6.5, in a volume of 50 μl. Either 1 μl (7 ng) of enzyme 1AI or 3 μl (24 ng) of 2BI was used in each reaction. The incubation time was 45 min at 37 °C, whereupon the reactions were terminated and separated on the C18 columns as described. Each set of assays was performed with a negative control (no peptide added) to determine the background number of counts/min found in the material that was eluted from the C18 columns. The background count (less than 5% of the maximum substrate values) was subtracted from the total counts obtained for each substrate.
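The background correction described above, followed by expression of each substrate relative to the parent peptide, can be sketched as follows. The counts are hypothetical illustrations, not data from this study, and the function names are assumptions. The 50% and 10% cutoffs are the ones used in the Results to describe good and poor substrates.

```python
def net_cpm(total_cpm: float, background_cpm: float) -> float:
    """Counts attributable to glycosylated peptide after background removal."""
    return max(total_cpm - background_cpm, 0.0)


def percent_of_parent(sample_cpm: float, parent_cpm: float, background_cpm: float) -> float:
    """Incorporation into a substrate as a percentage of the parent peptide."""
    parent_net = net_cpm(parent_cpm, background_cpm)
    if parent_net == 0.0:
        raise ValueError("parent peptide shows no incorporation above background")
    return 100.0 * net_cpm(sample_cpm, background_cpm) / parent_net


def classify(percent: float) -> str:
    """Label a substrate using the 50% and 10% cutoffs from the Results."""
    if percent >= 50.0:
        return "at least 50% of parent"
    if percent <= 10.0:
        return "poor substrate"
    return "intermediate"


# Hypothetical counts/min: no-peptide control, parent peptide, two variants.
background, parent = 200.0, 5200.0
for name, cpm in {"variant_1": 700.0, "variant_2": 4200.0}.items():
    pct = percent_of_parent(cpm, parent, background)
    print(name, round(pct, 1), classify(pct))
```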

Amino Acid Sequencing of Peptides
In order to determine which hydroxyamino acid(s) of the multisite substrates was glycosylated, the peptides were subjected to amino acid sequencing. The peptides EPO-T, EPO-TT, and EPO-ST were glycosylated essentially as outlined above, except that the UDP-[14C]GalNAc concentration was increased to 73 μM and the reaction time was 20 h. The unincorporated nucleotide was removed, and an aliquot of the glycosylated peptide (approximately 10 nmol) was applied to a glass fiber filter that was pretreated with BioBrene (Applied Biosystems, Inc.). The filter was subjected to 12 normal (Edman) cycles in an Applied Biosystems 473A sequencer. The product of each cycle was collected, dried, and the counts/min determined.
B. C. O'Connell and L. A. Tabak, manuscript in preparation.
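Because each Edman cycle releases one residue from the N terminus, the cycle containing the radioactivity identifies the glycosylated residue. The sketch below pairs cycles with residues for the EPO-ST sequence. The per-cycle counts are hypothetical, the function name is an assumption, and carryover between cycles is not modeled.

```python
def assign_labeled_residue(sequence: str, cpm_per_cycle):
    """Return (cycle, residue, cpm) for the Edman cycle with the most counts.

    Cycle n (1-based) releases residue n of the peptide.
    """
    if len(cpm_per_cycle) > len(sequence):
        raise ValueError("more cycles than residues")
    best_index = max(range(len(cpm_per_cycle)), key=lambda i: cpm_per_cycle[i])
    return best_index + 1, sequence[best_index], cpm_per_cycle[best_index]


EPO_ST = "PPDAASTAAPLR"
# Hypothetical counts/min recovered from 12 sequencing cycles.
cycle_cpm = [30, 25, 28, 31, 27, 40, 610, 180, 60, 35, 30, 29]

cycle, residue, cpm = assign_labeled_residue(EPO_ST, cycle_cpm)
print(cycle, residue, cpm)
```

In this illustration the counts peak at cycle 7, the threonine of EPO-ST, with the smaller signal in cycle 8 standing in for the lag typically seen in Edman sequencing.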

Protein Determination and Amino Acid Analysis
To quantitate the test peptides and to verify their amino acid content, the peptides were hydrolyzed under standard conditions (vapor phase HCl, 106 °C, 20 h). The hydrolyzed material was then analyzed on a Hewlett-Packard amino acid analyzer with AminoQuant II software. Protein determination of the various purification pools was done in the same way as for the peptides, except that the buffers were first exchanged for water before hydrolysis, using Micron 10 ultrafiltration devices (Amicon).
For rapid estimation of protein concentration during purification Step 4, the Lowry assay was used.

RESULTS
Substrate Purification-Synthetic peptides were designed to study the effects of changes in the amino acid sequence surrounding a threonine residue on its ability to be glycosylated. The sequence of the substrate (-6PHMAQVTVGPGL+5) was changed by substitution at each residue in turn with isoleucine, alanine, proline, glutamic acid, and arginine (Table I). In order to facilitate the synthesis of many peptides, those which were altered at the same position were generally made as a mixture. The mixtures of crude peptides were separated into individual peptides by reverse-phase HPLC (Fig. 1). The composition and quantity of each peptide were verified by amino acid analysis. The peptides that contained substitutions of arginine and isoleucine were most easily purified from the mixture, whereas those containing alanine, proline, and glutamic acid could only be separated by recycling the mixture through analytical columns under near-isocratic conditions. The substrates used to compare the glycosylation of serine to threonine in this study were based on part of the sequence of human erythropoietin (EPO) containing a glycosylated serine residue, -5PPDAASAAPLR+5 (EPO-S), and its threonine-containing analogue, -5PPDAATAAPLR+5 (EPO-T). To investigate the interaction, if any, between two potential acceptor sites, we also synthesized EPO-SS (PPDAASSAAPLR), EPO-TT (PPDAATTAAPLR), EPO-ST (PPDAASTAAPLR), and EPO-TS (PPDAATSAAPLR). Each substrate in this series was synthesized and purified separately.
UDP-GalNAc:Polypeptide N-Acetylgalactosaminyltransferase Purification from Bovine Colostrum-A summary of the enzyme purification is given in Table III. Fractions were assayed for protein by absorbance at 280 nm and for transferase activity using both threonine- and serine-containing substrates. The amount of UDP-GalNAc incorporated into EPO-S was consistently about 5% of the amount incorporated into its analogue, EPO-T, for a given enzyme fraction. Steps 1-4 were employed to remove solid material, lipid, and some contaminating proteins (Fig. 2). The adsorption of the transferase to Affi-Gel Blue was used to concentrate the activity. The enzyme was passed through columns of Affi-Gel Blue three times and eluted with successively shallower gradients of NaCl (Figs. 3-5). The repeated passage of the transferase through Affi-Gel Blue resin resulted in separation of the enzyme into two peaks of activity, A and B. Following the Affi-Gel Blue chromatography, the transferase activity was purified 30-60-fold, compared with crude colostrum. An affinity column of 5-mercuri-UDP-GalNAc thiopropyl-Sepharose was used as the final purification step. Two of the enzyme pools, 1A and 2B, were applied to affinity columns and eluted with a buffer containing 20 mM UDP-GalNAc, resulting in step purification of 479- and 69-fold, respectively (Fig. 6). Following dialysis to eliminate UDP-GalNAc from the elution buffer, the final two purified pools of transferase, 1AI and 2BI, were used to assay the peptide substrates.
The apparent Km of UDP-GalNAc was determined for each of the purified enzyme pools. Using the tetrapeptide Ac-TPPP at saturating conditions, the Km(app) of UDP-GalNAc for transferase pool 1AI was 4.4 μM and for pool 2BI was 7.6 μM (Fig. 7). Apparent Km values were obtained for the peptides HVF (1.8 mM) and -6E (3.1 mM) in the presence of 36.5 μM UDP-GalNAc. Under the same conditions, peptide +3A was not glycosylated (Fig. 8).
In Vitro Glycosylation of Peptide Substrates-The HVF peptide and its derivatives (Table I) were used as substrates for in vitro glycosylation reactions with the purified transferase. Peptide -2E could not be recovered in sufficient amounts to be assayed, giving a total of 52 substrates. We found that reactions performed in the presence of sodium cacodylate proceeded more rapidly than in buffers containing either Tris, MES, or imidazole (Table IV). The peptides were assayed three times, with essentially the same results. There was no apparent difference between the two purified enzyme fractions, 1AI and 2BI, in their ability to glycosylate any of the peptides. The results of substrate assays using the transferase fraction 1AI are shown in Table I and Fig. 9. The HVF peptide was glycosylated at the rate of 0.1 μmol/min/mg under standard assay conditions. Although the other substrates differed from the HVF peptide by only one amino acid residue, they varied considerably in their ability to be glycosylated. Total GalNAc incorporation for 19 substrates was at least 50% that of the parent peptide. Thirteen peptides were poor substrates for the transferase, incorporating 10% or less of the [14C]GalNAc incorporated by the HVF peptide.
Amino acid substitutions at certain positions of the substrate appeared to have more influence on glycosylation than other positions (Fig. 9). For example, the transferase seemed to be much less tolerant to changes at positions -2 and -3 than at -4, -5, or -6 of the peptide. The importance of the position of a residue (relative to the threonine) is demonstrated by proline, valine, and glycine, since they occur at two positions in the substrate. Any substitution of the proline residue at position +3 resulted in almost no glycosylation of the substrate, whereas changing the proline at position -6 had little effect. The 2 valine residues in the parent peptide are at positions -1 and +l. Glutamic acid and arginine almost eliminate glycosylation of the substrate when they are substituted for valine at position -1, although they have no such effect at position +1. In contrast, glycosylation of the peptide is influenced equally by -1 and +1 substitutions when valine is replaced by isoleucine, alanine, or proline.
No amino acid had the same effect on the substrate at every position (Fig. 9). Glutamic acid substitution, for example, increased glycosylation when it was present at five positions but almost abolished glycosylation at four positions. The effects of arginine substitutions showed a similar distribution to those of glutamic acid, although the enhancement of glycosylation was less marked. The alanine-substituted peptides were most like those containing arginine, except that alanine at position -1 still yielded a functional substrate but at -5 did not. Proline appeared to have the most inductive effect, whereas isoleucine appeared to have the least influence; that is, none of the isoleucine substitutions led to increased glycosylation of the peptide, and it prevented glycosylation in only two cases. Together, the results confirm our previous finding that the properties of a transferase substrate can be markedly changed by single residue substitutions. The position of the substitution relative to threonine seems to be of primary importance; at some positions (for example, -3, -2, and +3) O-glycosylation was diminished by quite different types of residue. At other positions the transferase may be sensitive to particular amino acids; for example, position -1 was influenced by charged residues more than other kinds. There may also be positions that tolerate a wide range of amino acids (for example, +1 and +2).
Figure legend (fragment): column fractions were assayed using EPO-T for pools 1A and 2B. The major peak of transferase activity was collected from each column and designated 1AI and 2BI. Neither of the purified pools was able to glycosylate the serine-containing peptide EPO-S.
The set of erythropoietin-based substrates was assayed with both of the purified transferase pools (Table II). Neither the parent peptide (EPO-S) nor the version containing two serine residues (EPO-SS) could be glycosylated to any detectable extent. The threonine variant of the parent peptide (EPO-T) was readily glycosylated, and the peptide containing two threonine residues (EPO-TT) was the most rapidly glycosylated substrate of the set (0.37 μmol/min/mg for enzyme pool 1AI). The substrate that had serine followed by threonine in the sequence, EPO-ST, reacted similarly to the substrate that had threonine alone. Edman degradation of the glycosylated peptide EPO-ST confirmed that only threonine was an acceptor of GalNAc (Fig. 10). Interestingly, the peptide having the threonine-serine sequence (EPO-TS) was a poor substrate, perhaps because the threonine residue was positioned four residues away from proline instead of three. Similarly, in the peptide that had two threonine residues, almost all of the GalNAc was found to be associated with the C-terminal threonine (Fig. 10).

DISCUSSION
Ideally, studies of flanking sequence requirements for O-glycosylation would include all possible combinations of residues surrounding a hydroxyamino acid. Recently, novel techniques have been developed to synthesize and screen completely degenerate mixtures of peptides to identify sequences that bind the heat shock protein BiP (Flynn et al., 1991). Pilot experiments with this approach indicated that the presence of nonglycosylating peptides could effectively inhibit the glycosylation of normally good substrates when they were incubated together (data not shown). A similar approach, but using the peptides still attached to resin, was also found to be unsuitable; that is, substrates that were glycosylated well in solution could not be glycosylated on a solid support (Lam et al., 1991). Recombinant DNA techniques present an attractive way to synthesize a single degenerate mixture of substrates, but the effort involved in cloning, expressing, and then purifying a large number of individual substrates would be considerable. Synthetic peptides have been successfully used as substrates for GalNAc transferase for many years. Peptides can be made predictably in sufficient amounts and can be readily purified. The main drawback of producing large numbers of synthetic peptides is the cost involved; previous in vitro glycosylation studies have typically featured fewer than a dozen peptides (Young et al., 1979; Briand et al., 1981; Hughes et al., 1988; Cottrell et al., 1992). Our experimental approach was to study a single 12-amino acid substrate that is based on the sequence surrounding a naturally occurring glycosylated site and is typical of many known glycosylation sites. The strategy for the substrate synthesis was to make mixtures of the peptides having substitutions at the same positions. For example, the five peptides that had isoleucine, alanine, proline, glutamic acid, and arginine at position -5 were synthesized as a mixture.
Each peptide was individually separated from the mixtures by reverse phase HPLC. In this fashion, we were able to obtain substitutions at every position of a peptide, except threonine, while minimizing the actual number of syntheses.
The UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase from bovine colostrum was purified 15,000-fold to assay the peptide substrates. We purified the two peaks of enzyme activity eluted from the Affi-Gel Blue columns separately in the later steps, since we could not exclude the possibility that they represented two different transferases. However, the two enzyme pools described here, 1AI and 2BI, exhibited similar substrate specificities with respect to the peptides in this study. The disparity in the Km(app) for UDP-GalNAc between the two pools (4.4 and 7.6 μM) may be due to the difference in the purity of the pools (15,508-fold and 4151-fold). It is also possible that one pool represents a modification, such as a cleavage product, of the other. The purification of this enzyme has been reported before, although the extent of its substrate specificity was not described (Sugiura et al., 1982; Elhammer and Kornfeld, 1986). Previous purification of this enzyme relied on affinity chromatography with apomucin-Sepharose. Given the difficulties in isolating and deglycosylating homogeneous salivary gland mucin for an affinity resin, we chose to use the nucleotide sugar as an affinity ligand. The transferase was first purified 30-60-fold by standard chromatographic methods, including three passes through Affi-Gel Blue columns. The major purification was achieved by affinity chromatography on 5-mercuri-UDP-GalNAc thiopropyl-Sepharose; this step alone enhanced the purification almost 500-fold. While this manuscript was in preparation, another report was published that detailed the use of 5-mercuri-UDP-GalNAc thiopropyl-Sepharose to purify a transferase from porcine submandibular gland (Wang et al., 1992).
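The purification figures quoted above can be cross-checked with simple arithmetic. The sketch assumes that -fold purification is a ratio of specific activities, so that the overall value is the product of the purification before the affinity column and the step purification on it. It also assumes that the 15,508-fold and 4151-fold values refer to pools 1AI and 2BI in that order, paired with the 479- and 69-fold affinity steps reported in the Results. The function name is hypothetical.

```python
def fold_before_step(overall_fold: float, step_fold: float) -> float:
    """Purification already achieved before a step, if folds multiply."""
    return overall_fold / step_fold


# Overall and affinity-step purification as quoted in the text.
pools = {
    "1AI": {"overall": 15_508, "affinity_step": 479},
    "2BI": {"overall": 4_151, "affinity_step": 69},
}

prior_fold = {
    name: round(fold_before_step(p["overall"], p["affinity_step"]), 1)
    for name, p in pools.items()
}
print(prior_fold)
```

Under these assumptions the two pools were purified roughly 32-fold and 60-fold before the affinity step, in line with the 30-60-fold stated for the conventional chromatographic steps.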
Glycosylation assays of the HVF-based peptide and its derivatives confirm that single amino acid changes in the substrate can dramatically alter its capacity to be O-glycosylated (Fig. 9). At position +3, the substitution of isoleucine, alanine, glutamic acid, or arginine almost precludes glycosylation of the peptide. These results suggest that for this substrate, the proline at +3 is essential for glycosylation. When a second proline residue is placed at -1, +1, +2, or +4, the level of glycosylation of the peptide is markedly increased. Given their frequency near glycosylated residues, it seems that proline residues are particularly effective at facilitating glycosylation, but since the enhancement is not seen at every position, there may be limitations on the range of proline's influence (Gooley et al., 1991). Since most glycosylated sites do not have proline at position +3, other sequences are able to provide the necessary context for glycosylation to occur. Position -1 was also sensitive to amino acid substitutions, with the charged residues causing a substantial reduction in glycosylation. It is noteworthy that the charged residues do not inhibit glycosylation when present at many other positions of this peptide. The peptides that were substituted at position -3 were all poor substrates for the transferase, which is consistent with the prediction that this position is important for O-glycosylation, although there was no apparent selectivity among the different types of amino acids tested. Experimentally, amino acid substitutions at the -2 position of this peptide all substantially reduced glycosylation, an effect that was not predicted from the databases of known glycosylated sites.
Nonetheless, most changes at other points in the flanking sequence (-6, -5, -4, +1, +2, +4, and +5) were generally less able to influence glycosylation of the threonine residue, indicating that the transferase does not have strict requirements for all positions of the flanking sequence. From sequence analysis, it was expected that substitutions at the -6 position would not be tolerated, but perhaps its location at the N terminus of the peptide prevented it from exerting its usual effect. In general, the same types of amino acids (proline and charged residues) were found to have the most influence on substrate glycosylation by both statistical and experimental methods (O'Connell et al., 1991). Alanine substitutions caused somewhat less pronounced effects, whereas isoleucine produced the least change overall.
Surveys of the sequences surrounding glycosylated sites have revealed no difference in patterns related to serine and threonine (Wilson et al., 1991; O'Connell et al., 1991). However, some reports have stated that peptides with serine residues were not glycosylated in vitro (Hughes et al., 1988; O'Connell et al., 1991). Recently, it was shown that three serine-containing peptides, whose sequences were similar to porcine submandibular gland apomucin, could not be glycosylated by a transferase purified from porcine submandibular gland or by homogenates of porcine, bovine, and ovine submandibular glands (Wang et al., 1992). In our experiments, the transferase derived from bovine colostrum did not glycosylate a peptide based on the sequence about a glycosylated serine residue in human erythropoietin (Table II). Moreover, the level of serine-specific glycosylation was very low in the crude colostrum and throughout the enzyme purification (Figs. 3-6). The EPO-based peptide was not glycosylated when 2 serine residues were placed in tandem, nor was an adjacent threonine residue able to induce glycosylation of a serine. Hence, the evidence suggests that the activity which glycosylates serine residues is either unstable, inhibited, or separate from the threonine-specific activity. The observation that a threonine analogue of the erythropoietin-based peptide was readily glycosylated indicates that the flanking sequence requirements for the hydroxyamino acids may be the same.
It is unclear whether all of the findings presented here are due specifically to amino acid sequence or to changes in the substrate conformation. It has been proposed that the main determinants for O-glycosylation are the accessibility and local conformation of the acceptor site (Hagopian et al., 1971; Aubert et al., 1976; Hill et al., 1977; Eckhardt et al., 1987). Circular dichroism spectroscopy of peptides indicates that a random-type secondary structure is necessary for the substrate to be O-glycosylated (O'Connell et al., 1991). It is possible that the transferase has both structural and sequence requirements, superimposed on the need for the acceptor site to be physically accessible to the enzyme. The precise topography of potential glycosylation sites within proteins has yet to be determined.
It is important to note that the great majority of known O-glycosylated residues occur in clusters. The order, if any, in which clustered residues are glycosylated is unknown, so it is difficult to determine if modification of particular sites must precede the modification of other sites. The EPO-based peptide containing 2 threonine residues was glycosylated much more than the same substrate with only 1 threonine (Table II). The increased incorporation of GalNAc took place almost exclusively at the C-terminal threonine (Fig. 10), which is consistent with recently published data (Wang et al., 1992). However, in naturally occurring glycoproteins there is at least one example where only the N-terminal threonine of a pair is O-glycosylated and many cases where both threonines of a pair are acceptors (Murayama et al., 1982; Tomita and Marchesi, 1975; Putnam et al., 1981). Glycosylated residues are customarily identified by a blank cycle in the amino acid sequence of a protein, which does not accurately determine the extent to which a particular residue is glycosylated. It is possible that glycosylation is not initiated to the same extent at all the nominally modified sites of a protein (Gooley et al., 1991). Incomplete initiation of glycosylation could account for some of the heterogeneity that glycoproteins typically display. It is conceivable that more than one threonine-specific transferase exists having separate or overlapping substrate specificities (Hagopian et al., 1971). The presence of several transferases could explain why deglycosylated proteins are poorly glycosylated by a purified enzyme in vitro and why no consensus sequence has emerged (Sugiura et al., 1982; Hill et al., 1977). Examination of more complex multisite substrates will help to fully define the range of hydroxyamino acids that can be glycosylated by GalNAc transferases.
These and other issues must be addressed before O-linked glycoproteins can be synthesized with fidelity either in vitro or in heterologous systems.
It would be difficult to make broad conclusions about flanking sequence requirements for O-glycosylation from any study of peptides in vitro. A logical progression of this work is to examine the effects on glycosylation of systematic mutagenesis of substrates in vivo. Future studies should compare the tolerance for amino acid changes in the sequence surrounding glycosylated sites and should include multiple substitutions. The data presented here suggest that particular combinations of amino acids at sensitive positions are more likely to promote O-glycosylation of a threonine residue.