Localization of O-GlcNAc Modification on the Serum Response Transcription Factor*

A unique form of nucleoplasmic and cytoplasmic protein glycosylation, O-linked GlcNAc, [...] including many RNA polymerase II transcription factors (Jackson, S. P., and Tjian, R. (1988) Cell 55, 125-133). However, virtually nothing is known about the degree of glycosylation at individual sites, or, indeed, the actual sites of attachment of O-GlcNAc on transcription factors. In this paper we provide rigorous evidence for the occurrence and locations of O-GlcNAc on the c-fos transcription factor, serum response factor (SRF), expressed in an insect cell line. Fast atom bombardment mass spectrometry (FAB-MS) of proteolytic digests of SRF provides evidence for the presence of a single substoichiometric O-GlcNAc residue on each of four peptides isolated after sequential cyanogen bromide, tryptic, and proline-specific enzyme digestion: these peptides are 306VSASVSP312, 274GTTSTIQTAP283, 313SAVSSADGTVLK324, and 374DSSTDLTQTSSSGTVTLP391. Using an array of techniques, including manual Edman degradation, aminopeptidase, and elastase digestion, together with FAB-MS, the major sites of O-GlcNAc attachment were shown to be serine residues within short tandem repeat regions. The highest level of glycosylation was found on the SSS tandem repeat of peptide (374-391), which is situated within the transcriptional activation domain of SRF. The other glycosylation sites observed in SRF are located in the region of the protein between the DNA binding domain and the transcriptional activation domain. Glycosylation of peptides (274-283) and (313-324) was found to occur on the serine in the TTST tandem repeat and on serine 316 in the SS repeat, respectively. The lowest level of glycosylation was recovered in peptide (306-312), which lacks tandem repeats. All the glycosylation sites identified in SRF are situated in a relatively short region of the primary sequence close to or within the transcriptional activation domain, which is distant from the major sites of phosphorylation catalyzed by casein kinase II.
Transcriptional initiation in eukaryotes is often brought about by regulated activity of transcription factors, which bind specific sequences within enhancers and promoters. In many cases, transcription factor activity is modulated by post-translational modification. For example, phosphorylation has been shown to regulate either interaction with DNA (Boyle et al., 1991; Janknecht et al., 1992; Klausing and Knippers, 1989; Luscher et al., 1990; Manak et al., 1990; Marais et al., 1992) or the ability of DNA-bound factors to activate transcription (Gonzalez and Montminy, 1989; Hsieh et al., 1991; Pulverer et al., 1991). The role of other modifications is less clear. Many transcription factors for genes transcribed by RNA polymerase II are modified by glycosylation, probably via addition of O-GlcNAc moieties (Jackson and Tjian, 1988, 1989; Lichtsteiner and Schibler, 1989). O-GlcNAc, first discovered by Hart and colleagues (Torres and Hart, 1984; Holt and Hart, 1986; Hart et al., 1988; Hart et al., 1989a and 1989b; Haltiwanger et al., 1992), is a simple, unmodified monosaccharide moiety glycosidically linked through the side chain hydroxyls of serine or threonine, often occurring at multiple sites on the same protein. Because of its intracellular distribution (Holt and Hart, 1986), its presence on a number of nuclear and cytoplasmic proteins, and the existence in the cell of appropriate O-GlcNAc transferases and glycosidases, it has been postulated that O-GlcNAc is a regulatory modification analogous to phosphorylation (Hart et al., 1989a and 1989b; Holt et al., 1987a; Kearse and Hart, 1991; Haltiwanger et al., 1992). In addition, the presence of O-GlcNAc on various proteins known to form multimers, including the nuclear pore proteins, erythrocyte band 4.1, and α-crystallin, has prompted speculation that O-GlcNAc may be involved in the organization of multiprotein complexes (Holt et al., 1987a and 1987b; Hart et al., 1989a; Haltiwanger et al., 1992).
Serum response factor (SRF)¹ is a ubiquitous transcription factor that binds the serum response element (SRE), a regulatory element located in the promoters of many genes that are rapidly and transiently transcribed following stimulation of cells by growth factors (for review, see Treisman, 1990). Three lines of evidence suggest that SRF is directly involved in regulation of transcription by the SRE. First, mutations that reduce or prevent SRF binding have corresponding effects on SRE activity (Treisman, 1990); second, depletion of SRF from cell nuclei following antibody microinjection blocks inducibility of the c-fos SRE (Gauthier-Rouviere et al., 1991); third, introduction into cells of SRF derivatives with altered DNA binding specificities can confer growth factor inducibility on appropriate DNA binding sites.²
SRF is a 508-amino acid polypeptide. It binds to DNA as a dimer via a DNA binding/dimerization domain located between residues 133 and 222 (Norman et al., 1988), which is also required for the recruitment of accessory DNA binding proteins to the SRE (Mueller and Nordheim, 1991; Schroter et al., 1990). The C-terminal region of the protein contains a transcriptional activation domain.³ Several studies have investigated post-translational modifications of SRF, with a view to correlating such modifications with SRF function. SRF purified from HeLa cells is both phosphorylated (Prywes et al., 1988) and glycosylated (Schroter et al., 1990). The protein is phosphorylated in vivo at a conserved casein kinase II site N-terminal to the DNA binding domain (residues 77-85) (Janknecht et al., 1992; Manak et al., 1990; Marais et al., 1992) and at a further site at residue 103 (Janknecht et al., 1992). These modifications have only modest effects on DNA binding affinity; however, casein kinase II phosphorylation dramatically increases rates of SRF-DNA exchange (Janknecht et al., 1992; Marais et al., 1992).
¹ The abbreviations used are: SRF, serum response factor; FAB-MS, fast atom bombardment mass spectrometry; HPLC, high performance liquid chromatography; Prop, propionyl; SRE, serum response element; TFA, trifluoroacetic acid; TPCK, tosylphenylalanyl chloromethyl ketone.
The glycosylation of SRF has not been characterized in detail. SRF from HeLa cells is retained on lectin columns, probably due to modification by O-GlcNAc residues (Schroter et al., 1990). We previously produced recombinant SRF in insect cells using a baculovirus vector to facilitate biochemical characterization of the protein. The recombinant protein is efficiently phosphorylated at the N-terminal casein kinase II site and is as active as HeLa SRF in a HeLa-derived cell-free transcription system (Marais et al., 1992). In this report, we characterize glycosylation of recombinant SRF. We present rigorous structural evidence for the presence of a single substoichiometric O-GlcNAc residue at each of four specific sites within the C-terminal region of SRF.

MATERIALS AND METHODS
All reagents were of the highest quality available from Sigma, British Drug House, or Aldrich, unless otherwise stated. HPLC grade trifluoroacetic acid and acetonitrile were from Rathburn Chemicals. TPCK-trypsin was from Sigma and proline specific enzyme was from Seikagaku Kogyo Co.
Extraction and Purification of SRF-This was carried out as described in Marais et al., 1992.
Precipitation of SRF-SRF was precipitated away from the buffer salts present following purification, using an acetone precipitation method: double the volume of acetone was added to the SRF in buffer, and the mixture was incubated at -20 °C for 1 h, after which precipitation of the protein is normally complete. If the solution was still clear after the initial addition of acetone and incubation, a further smaller volume of acetone was added, followed by further incubation at -20 °C. Following precipitation, the protein was recovered from the solution by centrifugation at 4000 rpm (using a Denley BS 400 centrifuge) and removal of the supernatant. The protein was then dried under a stream of nitrogen. The dried SRF sample was then subjected to the digestion protocol described directly below.
Chemical and Enzymic Digestion of SRF-The precipitated SRF (approximately 5-10 nmol) was transferred to a 50-ml pear-shaped flask, after dissolving in the minimum amount of 70% formic acid. A few crystals of cyanogen bromide were then added to this solution, and the reaction mixture was incubated at room temperature in the dark for 5 h. The reaction was terminated by adding 5 volumes of water and freeze-drying. The products were digested with TPCK-treated trypsin (10 µg) for 6 h at 37 °C in 50 mM ammonium bicarbonate (pH 8.4), and the reaction was stopped by freeze-drying. The products were subsequently digested with proline-specific enzyme (10 µg) for 4 h in 100 mM ammonium bicarbonate (pH 7.8), the reaction again being stopped by freeze-drying.
² S. John and R. Treisman, manuscript in preparation.
³ C. Hill and R. Treisman, unpublished data.
HPLC Fractionation of the Products of Digestion-The SRF digest was fractionated by reverse phase HPLC using an Ultrasphere ODS column (25 cm × 4.6 mm) attached to a Kontron HPLC system equipped with an injector (Rheodyne), a data system 450, a 430 dual-wavelength detector, and a P-800 plotter-printer. The digestion products were loaded in 0.1% trifluoroacetic acid in milliQ water (solvent A) and eluted with a gradient from 0% solvent B to 60% solvent B over 50 min (solvent B was a mixture of acetonitrile and 0.1% aqueous trifluoroacetic acid, 9:1, v/v). The flow rate was 1 ml/min, and 1-min fractions were collected. The eluent was monitored at 214 and 280 nm. Each fraction was dried down and redissolved in 10 µl of 5% aqueous acetic acid prior to analysis of 1 µl by FAB-MS.
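The gradient described above implies a simple linear relationship between elution time and solvent composition. The following Python sketch illustrates that relationship under stated assumptions: the gradient is taken to be strictly linear with no delay volume, and fraction n is assumed to span minutes n-1 to n. The function names are illustrative and are not part of the original method.

```python
def percent_b(minute, start_b=0.0, end_b=60.0, duration_min=50.0):
    """Percent solvent B at a given time on the linear 0-60% B, 50-min gradient."""
    t = min(max(minute, 0.0), duration_min)  # clamp to the gradient window
    return start_b + (end_b - start_b) * t / duration_min


def acetonitrile_percent(minute, acn_fraction_in_b=0.9):
    """Percent acetonitrile, given that solvent B is 9:1 (v/v) acetonitrile/0.1% aqueous TFA."""
    return percent_b(minute) * acn_fraction_in_b


def fraction_window(fraction, fraction_min=1.0):
    """Assumed start and end time (min) of a 1-min fraction; fraction 1 spans 0-1 min."""
    return ((fraction - 1) * fraction_min, fraction * fraction_min)


if __name__ == "__main__":
    # Fractions mentioned in the text for the four glycopeptide-containing pools.
    for fraction in (16, 18, 19, 24):
        start, end = fraction_window(fraction)
        print(f"fraction {fraction}: {percent_b(start):.1f}-{percent_b(end):.1f}% B")
```

Because real systems have a gradient delay, the computed compositions are only a rough guide to where a given fraction sits on the gradient.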
Gal Transferase Labeling-HPLC fractions were solubilized in 45 µl of the following reaction mixture: 5 µl of 0.5 M sodium cacodylate (pH 6.5), 50 nm manganese chloride, 2 µl of pregalactosylated galactosyltransferase (50 milliunits, ~20 µg of protein; Sigma), and 38 µl of milliQ water. Labeling was initiated by addition of 1 µCi of UDP-[3H]galactose (40 Ci/mmol) in 5 µl of 25 mM 5'-adenosine monophosphate, and the mixture was incubated at 0 °C for 1 h and at 37 °C for 1 h. After this period, 5 µl of a 20 mM UDP-galactose solution in water was added, and the reaction mixture was incubated for a further 2 h at 37 °C. At the end of this incubation, the reaction was terminated by freeze-drying. Unreacted UDP-Gal and buffer salts were removed from the galactose-labeled product using the same HPLC procedure employed for the initial fractionation of the SRF digest.
Liquid Scintillation Counting-In order to locate the radioactively labeled glycopeptide, the amount of radioactivity present in a portion (typically 1/40) of each of the 1-ml fractions eluting from the HPLC purification was monitored using liquid scintillation counting. The aliquot removed from each fraction was taken up in 3 ml of Aquassure liquid scintillant (Du Pont-New England Nuclear), and the level of radioactivity present was counted using a Kontron-400 scintillation counter.
Carboxyesterification-Methanolic HCl was prepared by bubbling HCl gas into methanol until the solution was hot to the touch (approximately 1 M). The peptide or glycopeptide was esterified by incubation in this reagent for 30 min at room temperature. Reagents were removed under nitrogen.
Propionylation-The propionylation reagent was prepared by mixing trifluoroacetic anhydride with propionic acid (2:1, v/v) and was allowed to cool to room temperature. Propionyl derivatives were prepared by adding 10 µl of this reagent to samples previously dried down in glass screw-capped tubes. Aliquots (1 µl) were removed at various times and loaded directly into the FAB matrix. Reaction times of about 1 min gave the best data.
Elastase Digestion of Peptides and Glycopeptides-Peptide/glycopeptide samples were dissolved in 50 mM ammonium bicarbonate (pH 8.4) and digested with elastase (0.5 µg) for 5 h at 37 °C. The reaction was stopped by freeze-drying.
Leucine Aminopeptidase Digest-Samples were incubated with leucine aminopeptidase (0.5 µg) in 100 mM ammonium bicarbonate (pH 7.8) at 37 °C. Aliquots were removed at various time points over a 5-h period and loaded directly into the FAB matrix.
Edman Degradation-Samples for manual Edman degradation were dried down in glass screw-capped tubes and dissolved in 100 µl of milliQ water, to which 100 µl of a 5% solution of phenylisothiocyanate in pyridine was added. The tube was flushed with nitrogen and incubated at 45 °C for 1 h, after which the reagents were removed under a stream of nitrogen. Cleavage of the phenylisothiocyanate-labeled N-terminal amino acid was carried out by the addition of 100 µl of trifluoroacetic acid and incubation for 10 min at 45 °C. The products were dried in vacuum over potassium hydroxide. The truncated peptide/glycopeptide was extracted away from the majority of the Edman reagents using 3 × 100-µl aliquots of 5% aqueous acetic acid, which was removed by freeze-drying. In some experiments, the truncated sample was analyzed at this stage by FAB-MS. In other experiments, it was HPLC-purified prior to FAB-MS and/or scintillation counting.
Immobilized Manual Sequencing-The site of O-GlcNAc attachment on the [3H]GalGlcNAc-313SAVSSA(D/N)GTVLK324 glycopeptide was determined by manual Edman degradation after covalent attachment to arylamide-derivatized polyvinylidene difluoride membrane (Sequelon-AA, Milligan Corp.) according to the method of Sullivan and Wong (1991). An aliquot of the trifluoroacetic acid-cleaved product from each cycle was transferred to a scintillation vial, evaporated at 60 °C, neutralized with 0.5 M Tris (pH 6.8), and counted after the addition of scintillation mixture (Formula 963, Du Pont-New England Nuclear Research Products).
FAB-MS-FAB-MS was carried out using a VG Analytical ZAB-HF mass spectrometer fitted with an M-Scan FAB gun operated at 10 kV. The matrix used was thioglycerol, and samples were dissolved in either 5% aqueous acetic acid (native peptides and glycopeptides) or methanol (esterified samples), except for propionyl derivatives, which were directly aliquoted from the reaction medium. Spectra were recorded on oscillographic chart paper and manually counted.

FAB Mapping of SRF-SRF was first digested with cyanogen bromide in order to inactivate protease inhibitors introduced during the extraction procedure and to open up the protein to facilitate subsequent proteolytic digestion. The mixture of polypeptides produced by cyanogen bromide digestion was then sequentially digested with trypsin and proline-specific enzyme. Trypsin was selected because of the relatively high abundance of arginine and lysine in the predicted SRF sequence (Fig. 1), while proline-specific enzyme was expected to cleave near putative glycosylation sites, since O-GlcNAc addition appears to require nearby proline residues (Haltiwanger et al., 1990). The overall digestion strategy was thus designed to optimize the probability that glycopeptides would be sufficiently small (less than 3000 Da) to permit high sensitivity FAB-MS analyses (Morris et al., 1977). The product mixture was fractionated by reverse phase HPLC to produce simpler peptide/glycopeptide mixtures for subsequent analysis (data not shown). A portion (typically 10%) of each fraction was analyzed by FAB-MS, and the molecular ions observed were mapped onto the predicted protein sequence (Fig. 1), thereby verifying 94% of the sequence (Morris et al., 1983). Several fractions yielded spectra containing [...] (Fig. 2e). A terminal GlcNAc-specific Gal transferase was used in order to establish whether the 203 increments were indicative of glycosylation, or whether the signals were derived from peptides which fortuitously differed in mass by 203 Da. Fractions containing putative glycopeptides were incubated with Gal transferase and UDP-[3H]Gal followed by an excess of unlabeled UDP-Gal. After HPLC purification, the products were screened for incorporation of radioactivity and for shifts in molecular weight (Fig. 3). The signals at m/z 646, 931, 976, 1134, and 1796 were not affected by the Gal-labeling experiment. With the exception of m/z 1134, these mapped to expected digestion products, namely, 306VSASVSP312, 15GSALGGSLNR24, 274GTTSTIQTAP283, and 374DSSTDLTQTSSSGTVTLP391, respectively. The signal at m/z 1134 is 1 mass unit higher than the calculated value for 313SAVSSANGTVLK324. Subdigestion and esterification experiments, which are reported below in the section describing glycosylation of this peptide, demonstrated that the asparagine residue in this peptide is present as aspartic acid at the end of the digestion/purification protocol. In some digestions of SRF, a small amount of the Asn-containing peptide was recovered, and this gave the expected molecular ion at m/z 1133 accompanied by a signal 203 mass units higher at m/z 1336. The signals at m/z 849, 1179, 1337, and 1999, which are 203 mass units higher than m/z 646, 976, 1134, and 1796, respectively, each shifted by 162 mass units to m/z 1011, 1341, 1499, and 2161, respectively, after Gal labeling. These data were consistent with the incorporation of a single galactose residue into each putative glycopeptide and indicated that peptides 306VSASVSP312, 274GTTSTIQTAP283, 313SAVSSA(D/N)GTVLK324, and 374DSSTDLTQTSSSGTVTLP391 are glycosylated in SRF. Confirmation that these peptides are, indeed, glycosylated with O-GlcNAc, and information on sites of glycosylation, was provided by a variety of experiments, the results of which are presented separately below for each glycosylated peptide. The peptide 15GSALGGSLNR24 (m/z 931) is not glycosylated, but fractions containing this peptide also contain 313SAVSSADGTVLK324 (m/z 1134), which by chance has a mass difference of 203 mass units.
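The mass assignments above can be checked with simple integer arithmetic. The following Python sketch is an illustration, not the authors' procedure: it assumes that the reported m/z values are nominal (integer) masses of singly protonated [M + H]+ ions, and it uses standard nominal residue masses together with the 203 (GlcNAc) and 162 (Gal) increments quoted in the text. The function and variable names are hypothetical.

```python
# Nominal (integer) residue masses for the amino acids occurring in these peptides.
NOMINAL_RESIDUE = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "R": 156,
}
WATER, PROTON = 18, 1
GLCNAC, GAL = 203, 162  # increments quoted in the text


def nominal_mh(sequence, glcnac=0, gal=0):
    """Nominal m/z of the [M + H]+ ion of a peptide carrying the given sugar residues."""
    peptide = sum(NOMINAL_RESIDUE[r] for r in sequence) + WATER + PROTON
    return peptide + glcnac * GLCNAC + gal * GAL


PEPTIDES = {
    "306-312": "VSASVSP",
    "15-24": "GSALGGSLNR",
    "274-283": "GTTSTIQTAP",
    "313-324 (Asn)": "SAVSSANGTVLK",
    "313-324 (Asp)": "SAVSSADGTVLK",
    "374-391": "DSSTDLTQTSSSGTVTLP",
}

if __name__ == "__main__":
    for label, seq in PEPTIDES.items():
        print(label, nominal_mh(seq), nominal_mh(seq, glcnac=1), nominal_mh(seq, glcnac=1, gal=1))
```

Under these assumptions the peptide, GlcNAc-peptide, and GalGlcNAc-peptide values reproduce the signals listed in the text, including the coincidence that 15GSALGGSLNR24 and the Asp form of 313-324 differ by exactly 203 mass units.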
Characterization of Glycosylated 313SAVSSA(D/N)GTVLK324-The following data indicate the presence of O-GlcNAc at Ser316 in 313SAVSSADGTVLK324. The peptide (m/z 1134) and its putative glycosylated counterpart (m/z 1337) eluted mainly in fraction 19 (Fig. 2a) together with several additional peptides whose molecular ions are assigned in the figure legend. After Gal labeling, the majority of the peptides were recovered in fraction 19, while the peak of radioactivity eluted in fraction 18 (Fig. 4a). The latter fraction gave a major molecular ion at m/z 1499 (Fig. 3, ii) consistent with Gal addition to the m/z 1337 component in Fig. 2a. Fraction 18 was subjected to a single step of manual Edman degradation, and the products were HPLC-purified. A very minor amount of radioactivity was recovered in the void, but the majority of radioactivity eluted in fraction 26 (data not shown). The latter was shown by FAB-MS to be the truncated glycopeptide 314AVSSADGTVLK324 substituted with an expected phenylthiocarbamyl group on the lysine side chain (m/z 1547; Fig. 4, inset). These data ruled out significant glycosylation of Ser313. Further steps of manual Edman degradation were carried out, but these gave equivocal data due to the partial hydrolysis of the sugar moiety that occurred at each step. Therefore, information on the glycosylation site(s) [...]
A second preparation of SRF was digested with cyanogen bromide, trypsin, and proline-specific enzyme, and the products were purified by HPLC. In this HPLC run, all components eluted slightly later than corresponding components in the first HPLC experiment (data not shown), and the peptide/glycopeptide under study eluted in fraction 20. The FAB spectrum of fraction 20 contained a major signal at m/z 1134, accompanied by a minor signal 1 mass unit lower at m/z 1133 (data not shown), consistent with the presence of both the acid and amide forms of the peptide (313-324). An analogous pair of peaks was observed for the glycopeptide (m/z 1336/1337). An additional peptide (15GSALGGSLNR24) was present in this fraction, giving a major signal at m/z 931. The mixture was subjected to a short elastase digestion, and the products were analyzed by FAB-MS before (Fig. 5a) and after (Fig. 5b) esterification. The mass spectrum contained signals consistent with cleavage of SAVSSA(D/N)GTVLK between the adjacent serine residues to produce SANGTVLK (m/z 789) and SADGTVLK (m/z 790). No signals for undigested SAVSSA(D/N)GTVLK were observed. In contrast, the glycopeptide appeared to be largely unaffected by the enzyme, since the signals at m/z 1336 and 1337 remained prominent (Fig. 5a). The assignment of signals in Fig. 5a was confirmed by FAB-MS of the methyl-esterified derivative (Hunt and Morris, 1973) (Fig. 5b). The m/z 789/790 pair shifted to m/z 803 (monomethyl ester) and 818 (dimethyl ester), respectively, confirming the presence of a side chain carboxyl group in the m/z 790 component. Similarly, m/z 1336/1337 shifted to m/z 1350 and 1365. We postulate that resistance to elastase digestion of the glycopeptide under conditions that completely cleave the corresponding peptide between the 2 serines was most probably caused by glycosylation at 1 of these residues.
Additional evidence for glycosylation at this site was provided by data from aminopeptidase degradation of a portion of fraction 20. Aliquots were removed at several time points during the incubation and analyzed by FAB-MS. The peptide rapidly lost the first 6 residues to give DGTVLK (m/z 632; data not shown). The glycopeptide degraded more slowly, finally yielding a quasimolecular ion at m/z 1080, which shifted to m/z 1108 after formation of the dimethyl ester (data not shown), consistent with GlcNAc-substituted SSADGTVLK.
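The esterification shifts used in the two preceding paragraphs follow from counting free carboxyl groups. The sketch below is illustrative only: it assumes nominal [M + H]+ masses, a 14 mass unit increment per methyl ester, and one carboxyl group at the C terminus plus one per Asp or Glu side chain. The helper names are hypothetical.

```python
NOMINAL_RESIDUE = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "L": 113,
    "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129, "R": 156,
}
GLCNAC = 203
METHYL_ESTER = 14  # CH2 added per esterified carboxyl group


def nominal_mh(sequence, sugar=0):
    """Nominal [M + H]+ of a peptide plus an optional sugar increment."""
    return sum(NOMINAL_RESIDUE[r] for r in sequence) + 18 + 1 + sugar


def carboxyl_groups(sequence):
    """C-terminal carboxyl plus Asp/Glu side chain carboxyls."""
    return 1 + sequence.count("D") + sequence.count("E")


def esterified_mh(sequence, sugar=0):
    """Nominal [M + H]+ after complete methyl esterification."""
    return nominal_mh(sequence, sugar) + METHYL_ESTER * carboxyl_groups(sequence)


if __name__ == "__main__":
    for seq, sugar in [("SANGTVLK", 0), ("SADGTVLK", 0),
                       ("SAVSSANGTVLK", GLCNAC), ("SAVSSADGTVLK", GLCNAC),
                       ("SSADGTVLK", GLCNAC)]:
        print(seq, nominal_mh(seq, sugar), "->", esterified_mh(seq, sugar))
```

The Asn forms gain a single methyl group and the Asp forms gain two, which is the basis for distinguishing the m/z 789 and 790 components in the text.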
In one of our laboratories, we have been exploring the utility for O-GlcNAc studies of the immobilized sequencing methodology recently introduced by Sullivan and Wong (1991) for the analysis of phosphorylation sites in phosphopeptides. Their strategy involved immobilizing the 32P-labeled phosphopeptide on arylamide discs using water-soluble carbodiimide, subjecting the immobilized peptides to manual Edman degradation, and determining the amount of radioactivity released on each cycle. Having obtained very promising results when we used similar methodology to sequence several [3H]GlcNAc-substituted synthetic peptides, we decided to subject the small amount of sample remaining in fraction 20 to immobilized sequencing. Radioactivity was first incorporated into the sample by [3H]Gal labeling, and the purified glycopeptide was immobilized and subjected to 10 cycles of Edman degradation. The majority of the counts were released on the fourth cycle (Fig. 6), which corresponds to the 1st serine of the SS repeat. No significant loss of counts occurred at the 9th step, indicating that threonine 321 is not glycosylated.
FIG. 3 (partial legend). [...] d, HPLC fraction number 16 (i) and HPLC fraction number 15 (ii) from the purification of the products of Gal labeling of the fraction containing 306VSASVSP312 and its putative O-GlcNAc-modified glycopeptide (Fig. 2d); and e, HPLC fraction number 20 (i) and HPLC fraction number 19 (ii) from the purification of the products of Gal labeling of the fraction containing the peptide 15GSALGGSLNR24 and its putative glycopeptide counterpart (Fig. 2e).
FIG. 4 (partial legend). [...] fraction containing the peptide 313SAVSSADGTVLK324 and its putative glycopeptide counterpart; the UV absorbance was measured at 214 nm, and the level of radioactivity present in a 1/40 portion of the eluting fractions is expressed in counts/min; and b, partial FAB mass spectrum of the products of 1 cycle of manual Edman degradation carried out on the fraction containing the Gal-labeled glycopeptide 313SAVSSADGTVLK324 (GalGlcNAc) and the equivalent peptide (Fig. 3c, ii).
[...] During the initial HPLC fractionation of the SRF digest, the constituents that gave molecular ions at m/z 1796 and 1999, respectively (Fig. 2b), consistent with the peptide 374DSSTDLTQTSSSGTVTLP391 and its O-GlcNAc glycoform, eluted in fraction 24. After Gal labeling, the new glycopeptide signal (m/z 2161) was observed in the FAB spectrum of fraction 22 (Fig. 3b, ii). This fraction contained the bulk of the radioactivity (Fig. 7). The peptide was recovered mainly in fraction 24, while fraction 23 contained a mixture of peptide and glycopeptide. Elastase digestion of fraction 22 gave a mixture of products whose FAB mass spectrum contained [M + H]+ quasimolecular ions at m/z 530, 587, 738, and 1213, together with sodium adducts in some cases, corresponding to 387TVTLP391, 386GTVTLP391, 374DSSTDLT380, and GalGlcNAc-substituted 383SSSGTVTLP391, respectively (Fig. 8). This pattern indicates, among other peptide bonds, that the Thr380-Ser381 bond is sensitive to elastase, but cleavage within the serine triplet does not occur to an observable extent. Under the same conditions of elastase digestion, the peptide sample (fraction 24) afforded a FAB spectrum containing molecular ions at m/z 738, 761, and 848 (data not shown) corresponding to 374DSSTDLT380, 384SSGTVTLP391, and 383SSSGTVTLP391, respectively, showing that the peptide, unlike the glycopeptide, undergoes facile cleavage between Ser383 and Ser384. Taken together, these data suggest that glycosylation occurs on the SSS tripeptide. This conclusion was supported by experiments carried out on the second batch of digested SRF, which had not been subjected to Gal labeling (see experiments on SAVSSA(D/N)GTVLK above). Elastase digestion of HPLC fraction 24, which contained 374DSSTDLTQTSSSGTVTLP391 and its putative O-GlcNAc counterpart, together with some additional peptides, yielded products whose FAB spectra (data not shown) contained quasimolecular ions at m/z 738, 761, and 848 (see above). The last signal was accompanied by a peak [...] tandem repeat would most reasonably account for both the resistance to elastase digestion of the Ser383/Ser384 bond in glycopeptide 374-391 and also the resistance of glycopeptide 383-391 to leucine aminopeptidase digestion.
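The elastase fragment assignments for peptide 374-391 can be reproduced by a brute-force search over contiguous subsequences. The Python sketch below is an illustration under stated assumptions, not the authors' assignment procedure: it uses nominal [M + H]+ masses, allows each fragment to be either unmodified or to carry one GalGlcNAc unit (203 + 162 mass units), and ignores sodium adducts. The function name `assign` is hypothetical.

```python
NOMINAL_RESIDUE = {
    "G": 57, "S": 87, "P": 97, "V": 99, "T": 101, "L": 113, "D": 115, "Q": 128,
}
PEPTIDE = "DSSTDLTQTSSSGTVTLP"
FIRST_RESIDUE = 374
GALGLCNAC = 203 + 162


def assign(mz):
    """Return (start, end, sequence, glycosylated) for every fragment matching mz."""
    hits = []
    n = len(PEPTIDE)
    for i in range(n):
        for j in range(i + 1, n + 1):
            fragment = PEPTIDE[i:j]
            mh = sum(NOMINAL_RESIDUE[r] for r in fragment) + 18 + 1
            for glycosylated in (False, True):
                if mh + (GALGLCNAC if glycosylated else 0) == mz:
                    hits.append((FIRST_RESIDUE + i, FIRST_RESIDUE + j - 1,
                                 fragment, glycosylated))
    return hits


if __name__ == "__main__":
    for mz in (530, 587, 738, 1213, 761, 848):
        print(mz, assign(mz))
```

With these assumptions each of the six signals has a single candidate, and only m/z 1213 requires the GalGlcNAc increment, which is consistent with the sugar residing on the 383-391 fragment.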
Characterization of Glycosylated 274GTTSTIQTAP283-The following data indicate the presence of O-GlcNAc at Ser277 in SRF.
FIG. 9 (partial legend). [...] labeling of the HPLC fraction containing the peptide 274GTTSTIQTAP283 and its putative glycopeptide counterpart; the UV absorbance was measured at 214 nm, and the level of radioactivity present in a 1/40 portion of the eluting fractions is expressed in counts/min; and b, partial FAB mass spectrum of the products of propionylation of HPLC fraction number 18 (Fig. 3b, i).
The 274GTTSTIQTAP283 peptide and its putative monoglycosylated counterpart eluted in HPLC fraction 18 (Fig. 2c) and gave molecular ions at m/z 976 and 1179, respectively. A number of other peptides were also present in this fraction, and their molecular ions are assigned in the legend to Fig. 2c. After Gal labeling, the majority of the peptide was recovered in fraction 18, while the 3H-labeled glycopeptide eluted mainly in fraction 17 (Fig. 3c, i and ii). The HPLC trace is shown in Fig. 9a. The peptide-rich fraction gave a major molecular ion at m/z 976 together with a weak signal at m/z 1341 (Fig. 3c, i) consistent with GalGlcNAc-substituted GTTSTIQTAP. The latter assignment was confirmed by FAB-MS after propionylation of the remainder of this fraction (Fig. 9b). This derivatization step was included because previous work from our laboratory (Reason et al., 1991) had shown that serine/threonine-rich glycopeptides give better quality FAB spectra after protection of their hydroxyl groups with propionyl groups. Further, the derivatized glycopeptides preferentially desorb from a peptide/glycopeptide mixture. The FAB spectrum (Fig. 9b) contained a cluster of molecular ions exhibiting the pattern of 56 and 40 mass unit intervals which characterizes derivatives formed by reaction with a mixed trifluoroacetyl/propionyl anhydride (Reason et al., 1991). The derivatized glycopeptide gave a cluster of signals at m/z 1677 (6 Prop), 1717 (5 Prop, 1 TFA), 1733 (7 Prop), 1773 (6 Prop, 1 TFA), 1789 (8 Prop), and 1829 (7 Prop, 1 TFA). Fraction 17, which contained the bulk of the radioactivity, was subdigested with elastase, and the products were analyzed by FAB-MS before (Fig. 10a) and after (Fig. 10b) propionylation. All major signals observed in Fig. 10a, with the exception of m/z 944, can be rationalized as molecular ions of proteolytic fragments of peptides known to be present in the sample. They are assigned in the figure legend. The major signal at m/z 944 corresponds to GTTSTI with an attached GalGlcNAc moiety. Corroborative evidence for the glycopeptide was afforded by the propionylated derivative, which gave a cluster of ions near m/z 1400 whose masses were consistent with those expected for the glycopeptide (Fig. 10b; signals are assigned in the legend). These data provided firm evidence for the site(s) of glycosylation being located in the TTST tandem repeat.
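The 56 and 40 mass unit spacing in the propionylated cluster can be rationalized by treating each signal as the underivatized [M + H]+ plus a whole number of propionyl (56) and trifluoroacetyl (96) groups. The sketch below is illustrative: it assumes nominal masses, takes m/z 1341 as the underivatized GalGlcNAc-substituted GTTSTIQTAP ion, and limits the search to at most one trifluoroacetyl group, as in the assignments quoted in the text. The function name and the search limits are assumptions.

```python
PROPIONYL = 56        # C3H4O added per acylated group
TRIFLUOROACETYL = 96  # C2F3O minus H added per acylated group


def acyl_compositions(mz, underivatized_mh, max_tfa=1, max_prop=12):
    """Return every (n_propionyl, n_trifluoroacetyl) pair that accounts for mz."""
    solutions = []
    for n_tfa in range(max_tfa + 1):
        remainder = mz - underivatized_mh - n_tfa * TRIFLUOROACETYL
        if remainder >= 0 and remainder % PROPIONYL == 0:
            n_prop = remainder // PROPIONYL
            if n_prop <= max_prop:
                solutions.append((n_prop, n_tfa))
    return solutions


if __name__ == "__main__":
    for mz in (1677, 1717, 1733, 1773, 1789, 1829):
        print(mz, acyl_compositions(mz, underivatized_mh=1341))
```

Replacing one propionyl group by one trifluoroacetyl group raises the mass by 40, which is why the signals alternate between 40 and 16 mass unit gaps within each 56 mass unit step.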
The precise location(s) of the glycosylation were explored further using sequential manual Edman degradation. The products of each Edman cleavage were purified using the same HPLC conditions that were used to separate the peptides produced by proteolytic digestion of SRF. A portion of each HPLC fraction was counted, and the purified truncated glycopeptide was subjected to the next Edman cleavage. These steps were repeated until the counts were no longer eluting at the position expected for the truncated glycopeptide. Preliminary experiments on the GTTSTIQTAP peptide, in which the peptide/glycopeptide products of sequential Edman degradation were identified after HPLC purification (data not shown), showed that removal of the first 3 residues from GTTSTIQTAP produced only a small shift in the HPLC elution position (from fraction 18 to fraction 17). A similar small shift was expected for the glycopeptide, since data from a variety of O-GlcNAc glycopeptides (Reason et al., 1991; Roquemore et al., 1992), including those from SRF, have indicated that GlcNAc- and GalGlcNAc-substituted peptides elute very close to their nonglycosylated counterparts.
[Figure legend fragment: 261DTLKP265, 274GTTSTI279, 32GGGGTRGA39, 143RVKIK147]

FIG. 11. The total counts/min present in the HPLC fractions obtained from purification of the products of 1, 2, 3, and 4 cycles of sequential manual Edman degradation of GalGlcNAc-substituted 274GTTSTIQTAP283.
The results of the Edman experiment are shown in Fig. 11. After one step of Edman degradation (removal of N-terminal glycine), the majority of the counts were recovered in fractions 17 and 18, with a small portion being located in the void volume. It is probable that counts eluting in the void volume are due to a minor amount of hydrolyzed carbohydrate. We have previously reported the partial loss of O-GlcNAc during the Edman reaction (Reason et al., 1991). The counts were also partitioned between fractions 17 and 18 after the next two steps of Edman degradation (removal of Thr275 and Thr276); the retention of the major radioactive peak at 17 after these two steps provided strong evidence for the location of the sugar moiety on Ser277 or Thr278. After the fourth step of the Edman cleavage, most of the counts were lost from the predicted peptide/glycopeptide elution region (Fig. 11), and a major peak of radioactivity eluted in fraction 32, the late retention time being consistent with a phenylthiocarbamoyl derivative. These data suggest that Ser277 is the major glycosylation site in the GTTSTIQTAP peptide.
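The reasoning used to read a glycosylation site from sequential Edman data amounts to finding the first cycle at which the label leaves the truncated peptide and converting that cycle number to a residue position. The following Python sketch illustrates this bookkeeping only; the boolean observations are a simplified encoding of the results described above (label retained through cycles 1 to 3, lost at cycle 4), and the function names are hypothetical.

```python
def first_loss_cycle(label_retained):
    """Return the first Edman cycle (1-based) at which the label is lost, or None."""
    for cycle, retained in enumerate(label_retained, start=1):
        if not retained:
            return cycle
    return None


def residue_at_cycle(sequence, first_residue, cycle):
    """Residue letter and sequence position removed at a given Edman cycle."""
    return sequence[cycle - 1], first_residue + cycle - 1


if __name__ == "__main__":
    # 274GTTSTIQTAP283: counts stay with the peptide for cycles 1-3, leave at cycle 4.
    cycle = first_loss_cycle([True, True, True, False])
    print(residue_at_cycle("GTTSTIQTAP", 274, cycle))

    # 313SAVSSADGTVLK324 (immobilized sequencing): counts released on cycle 4,
    # and no significant release on cycle 9.
    print(residue_at_cycle("SAVSSADGTVLK", 313, 4))
    print(residue_at_cycle("SAVSSADGTVLK", 313, 9))
```

The same mapping applies to both experiments; they differ only in whether the label is followed on the truncated peptide (HPLC) or in the released product (immobilized sequencing).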
Characterization of Glycosylated 306VSASVSP312-The following data indicate the presence of a single O-GlcNAc at one or other of the first 2 serines in 306VSASVSP312.
This peptide (m/z 646) and its putative monoglycosylated counterpart (m/z 849) co-eluted in fraction 16 upon HPLC purification of digested SRF (Fig. 2d). After repurification of the products of Gal transferase labeling, under the same HPLC conditions, the bulk of the peptide again eluted in fraction 16 (Fig. 3d, i), coincident with the major UV absorption (Fig. 12a), while the putative glycopeptide, whose molecular ion had shifted from m/z 849 to m/z 1011 (Fig. 3d, ii), eluted in fraction 15, on the edge of the UV absorption but coincident with the peak of radioactivity (Fig. 12a). Fraction 15 also contained a significant amount of the unglycosylated peptide (m/z 646, Fig. 3d, ii). The amount of radioactivity incorporated was substantially less than that observed for the other three glycopeptides (see above). This was consistently observed in several preparations of SRF, indicating either that the VSASVSP sequence is poorly glycosylated in SRF, or that this region of the protein is particularly susceptible to endogenous hexosaminidases during isolation and purification of SRF. An attempt was made to locate the site of attachment of the carbohydrate by elastase subdigestion of fraction 15, followed by FAB-MS analysis of the propionylated products. The FAB spectrum (Fig. 12b) , 1 TFA)). Inspection of the sequence indicates that three possibilities exist for this composition, namely VSAS, SASV, or ASVS produced by elastase cleavage of the S-V, V-S, and S-P bonds, respectively, in VSASVSP. The ASVS sequence was ruled out because elastase is not expected to cleave a S-P bond. Therefore, these data indicate that the O-GlcNAc is attached to serine 307 or 309 in SRF. Because of the low level of glycosylation of this peptide, we were not able to carry out further experiments to differentiate between these two sites.
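The elimination argument for the elastase fragment can be checked by enumerating the candidate fragments. The sketch below assumes that the glycosylated fragment has the composition Ser2, Ala, Val; this composition is inferred from the three candidates named above (VSAS, SASV, ASVS) and is not quoted from the spectrum. All function names are hypothetical.

```python
from collections import Counter

PEPTIDE = "VSASVSP"
FIRST_RESIDUE = 306
# Assumed composition of the glycosylated elastase fragment, inferred from the
# three candidates named in the text (VSAS, SASV, ASVS).
COMPOSITION = Counter("SSAV")


def candidate_fragments(sequence, first_residue, composition):
    """Yield (fragment, start, end) for every contiguous window with the composition."""
    size = sum(composition.values())
    for i in range(len(sequence) - size + 1):
        window = sequence[i:i + size]
        if Counter(window) == composition:
            yield window, first_residue + i, first_residue + i + size - 1


def needs_ser_pro_cleavage(sequence, first_residue, end):
    """True if releasing the fragment requires cleaving a Ser-Pro bond at its C terminus."""
    i = end - first_residue
    return i + 1 < len(sequence) and sequence[i] == "S" and sequence[i + 1] == "P"


all_candidates = list(candidate_fragments(PEPTIDE, FIRST_RESIDUE, COMPOSITION))
allowed = [f for f in all_candidates
           if not needs_ser_pro_cleavage(PEPTIDE, FIRST_RESIDUE, f[2])]

# Any serine in any allowed fragment remains a possible attachment site.
serine_sets = [{start + k for k, aa in enumerate(frag) if aa == "S"}
               for frag, start, _ in allowed]
possible_sites = sorted(set().union(*serine_sets))

print("candidates:", [f[0] for f in all_candidates])
print("allowed:", [f[0] for f in allowed])
print("possible serine sites:", possible_sites)
```

Under these assumptions only VSAS and SASV survive the Ser-Pro exclusion, and both contain the same two serines, which is why the data cannot distinguish between positions 307 and 309.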

DISCUSSION
In this paper, we have examined recombinant SRF produced using a baculovirus vector to characterize glycosylation of an RNA polymerase II transcription factor. We used fast atom bombardment mass spectrometry (FAB-MS) in combination with chemical and proteolytic digestion to demonstrate unambiguously that the protein is modified by addition of O-GlcNAc moieties on 4 serine residues within its C-terminal region. Substoichiometric levels of glycosylation were observed at all four sites. This may be a consequence of overproduction of the protein in the baculovirus system. However, since the SRF gene is conserved in insects (Norman et al., 1988, and Footnote 5), it is likely that the nature and locations of the modifications that we observe reflect the situation in mammalian cells. Of course, rigorous confirmation of this awaits the analysis of SRF purified from HeLa cells. Alternatively, unknown amounts of O-GlcNAc are likely to have been removed by endogenous N-acetylglucosaminidase activities which are abundant in all eukaryotic cells (Haltiwanger et al., 1992).
The Gal-labeling experiment, in which partially purified digestion products were treated first with UDP-[3H]Gal followed by an excess of unlabeled UDP-Gal, enabled us to "tag" the O-GlcNAc-containing components so that they could be unambiguously identified and further characterized. Addition of galactose resulted in a complete conversion of terminal GlcNAc residues to GalGlcNAc disaccharides. This was achieved with a high level of incorporation of radioactivity (see, for example, Figs. 4a, 7, 9a, and 12a) which facilitated detection of the Gal-labeled products in a variety of subsequent reactions including subdigestion and Edman degradation. In mass terms, however, the bulk of the Gal labeling was effected with nonradioactive, rather than radiolabeled, reagent. Consequently, a shift in molecular weight of 162 occurred for each residue incorporated, and significant levels of tritium isotopes were not observed in the FAB spectra.
M. Levine and W. Gehring, personal communication.
Four quasimolecular ions were shown to shift by a single increment of 162 mass units after Gal labeling. These were the glycosylated counterparts of the peptides at m/z 646, 976, 1134, and 1796, which were assigned to 306VSASVSP312, 274GTTSTIQTAP283, 313SAVSSADGTVLK324, and 374DSSTDLTQTSSSGTVTLP391, each carrying a single O-GlcNAc residue. A variety of data suggest that the major attachment sites of O-GlcNAc are Ser307 or Ser309 in peptide (306-312), Ser277 in peptide (274-283), Ser316 of the Ser-Ser pair in peptide (313-324), and one of the Ser-Ser-Ser triplet in peptide (374-391). We found no evidence for the presence of more than one GlcNAc in any of these peptides. It is possible, however, that additional minor glycosylation could be present at levels undetectable by FAB-MS.
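The mass assignments can be reproduced with nominal-mass arithmetic. The sketch below is an illustration, not the calculation used in the paper: it assumes integer (nominal) residue masses, an increment of 203 for a GlcNAc residue, and an increment of 162 for a Gal residue, and the function name is hypothetical.

```python
# Nominal (integer) residue masses. These are standard values, but using them
# here is an assumption of this sketch, not a method stated in the paper.
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101,
                "L": 113, "I": 113, "D": 115, "Q": 128, "K": 128}
WATER, PROTON = 18, 1
HEXNAC, HEXOSE = 203, 162  # residue increments for GlcNAc and Gal

PEPTIDES = {
    "306-312": "VSASVSP",
    "274-283": "GTTSTIQTAP",
    "313-324": "SAVSSADGTVLK",
    "374-391": "DSSTDLTQTSSSGTVTLP",
}


def nominal_mh(sequence, n_hexnac=0, n_hexose=0):
    """Nominal m/z of the singly protonated peptide with optional sugar residues."""
    peptide = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + PROTON
    return peptide + n_hexnac * HEXNAC + n_hexose * HEXOSE


for label, seq in PEPTIDES.items():
    free = nominal_mh(seq)
    glcnac = nominal_mh(seq, n_hexnac=1)
    gal_glcnac = nominal_mh(seq, n_hexnac=1, n_hexose=1)
    print(f"{label}: peptide {free}, +GlcNAc {glcnac}, +GalGlcNAc {gal_glcnac}")
```

With these assumptions the unmodified peptides come out at the four m/z values listed above, and the 306-312 peptide gives the 849 and 1011 values reported for its GlcNAc and GalGlcNAc forms.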
All four O-GlcNAc glycopeptides co-eluted on reverse phase HPLC with their nonglycosylated counterparts. The relative abundance of the quasimolecular ions afforded by each peptide/glycopeptide pair suggested that the glycopeptides were only minor constituents of the mixture. Further information on the degree of substoichiometry of O-GlcNAc came from the Gal-labeled glycopeptides which eluted a little earlier than the corresponding peptides. Comparison of the UV and radioactivity profiles suggested that the level of glycosylation in the isolated protein was considerably less than 50% even in the most heavily glycosylated glycopeptide and was probably less than 10% in the VSASVSP sequence. It is possible that the low level of O-GlcNAc observed in each of the isolated peptides was a consequence of hexosaminidase action during purification of the SRF. In one experiment, we attempted to minimize hexosaminidase action by including p-aminophenyl-N-acetyl-β-D-thioglucosaminide (a hexosaminidase inhibitor) in the mixture of inhibitors added at the start of the SRF purification. However, this had no observable effect on the amount of O-GlcNAc recovered after digestion. The relative amounts of O-GlcNAc detected at each of the four sites did not vary significantly between different batches of SRF. In all experiments, the lowest level of glycosylation was found on 306VSASVSP312, and the highest on 374DSSTDLTQTSSSGTVTLP391, the latter being located in the transcription activation domain of SRF. The four glycopeptides isolated from SRF between them contain 22 possible sites of O-GlcNAc attachment. The majority of these serine and threonine residues were excluded as sites of significant glycosylation by employing a variety of subdigestion and Edman cleavage reactions, which were monitored, when sensitivity permitted, by FAB-MS.
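The figure of 22 possible attachment sites follows from counting the serine and threonine residues in the four glycopeptides. A minimal sketch of that count is given below; the helper name is hypothetical and the sequences are those given in the text.

```python
# Count the hydroxylated residues (Ser, Thr) that could carry O-GlcNAc in each
# of the four glycopeptides. The helper name is hypothetical.
GLYCOPEPTIDES = {
    306: "VSASVSP",
    274: "GTTSTIQTAP",
    313: "SAVSSADGTVLK",
    374: "DSSTDLTQTSSSGTVTLP",
}


def ser_thr_sites(sequence, first_residue):
    """Return [(amino acid, residue number)] for every Ser or Thr in the peptide."""
    return [(aa, first_residue + i) for i, aa in enumerate(sequence) if aa in "ST"]


sites = {start: ser_thr_sites(seq, start) for start, seq in GLYCOPEPTIDES.items()}
total_sites = sum(len(found) for found in sites.values())

for start, found in sites.items():
    print(start, len(found), found)
print("total possible sites:", total_sites)
```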
In principle, possible sites of O-GlcNAc attachment in each peptide could be identified by observing which regions of the initial glycopeptide carried or lacked carbohydrate after subdigestion and/or Edman cleavage. However, several factors served to complicate the structural studies of the O-GlcNAc glycopeptides. Because the glycopeptide was usually "contaminated" with high levels of its nonglycosylated counterpart, it was not possible to distinguish a carbohydrate-free degradation product whose precursor was the glycopeptide from the same fragment derived from the nonglycosylated peptide. Further, the absence of quasimolecular ions for glycosylated fragments in spectra which clearly showed data for the corresponding peptides was not necessarily attributable to the lack of glycosylation in those regions of the peptide. For example, the glycopeptide products might not be observed for sensitivity reasons, especially after a series of reactions on low picomole levels of material of which only a small percentage was glycosylated. Also, the O-GlcNAc moieties affected both the rates and sites of proteolytic cleavage. Therefore, nonglycosylated regions of a glycopeptide might not be released during subdigestion even though the same sequence was readily released from the free peptide. A β-elimination strategy followed by FAB-MS sequencing to identify the dehydroamino acids was expected to solve the substitution position question (Morris et al., 1978), but again the small quantities of glycopeptide available, together with possible problems with the rate of the elimination reaction, prevented the acquisition of data on the sample tested.
S. John and R. Treisman, unpublished data.
The disposition of the glycopeptides within SRF and their adjacent amino acid sequences are shown in Fig. 13. For comparison, the sequences of the previously identified attachment sites in the 62-kDa nuclear pore protein (D'Onofrio et al., 1988), the 65-kDa erythrocyte protein (Hart et al., 1989b), the Band 4.1 erythrocyte protein (Hart et al., 1989b; Inaba and Maede, 1989), and bovine αA-crystallin (Roquemore et al., 1992) are also included in Fig. 13. Three of the SRF glycosylation sites occur in a 50-amino acid stretch between the DNA binding and transcriptional activation regions of SRF. The fourth, and most heavily glycosylated, site is located within the transcriptional activation domain. Other than the prevalence of hydroxylated amino acids and the presence of at least 1 nearby proline residue, there are no obvious similarities between the sequences surrounding the attachment sites in SRF and the other peptides shown in Fig. 13. In in vitro assays, the O-GlcNAc transferase that we have recently purified (Haltiwanger et al., 1990) only recognizes peptides containing proline residues, which is consistent with the majority of the sites found thus far.
At present, the significance of glycosylation of SRF and other RNA pol II transcription factors is unclear. The sites that we have mapped in SRF are distinct from the parts of the protein thought to be involved in dimerization, DNA binding, or recruitment of accessory proteins. Moreover, the 306VSASVSP312 sequence is not conserved in the otherwise highly related Xenopus laevis homolog of SRF (Mohun et al., 1991), suggesting that it does not fulfill an essential function. Interestingly, this peptide showed the lowest level of O-GlcNAc when compared with the other three identified sites of glycosylation.
Because of the intracellular distribution of O-GlcNAc (Holt and Hart, 1986), its presence on a number of nuclear and cytoplasmic proteins, and the existence in the cell of appropriate O-GlcNAc transferases and glycosidases, it has been postulated that O-GlcNAc may be a regulatory modification analogous to phosphorylation (Hart et al., 1989a and 1989b; Holt et al., 1987a; Kearse and Hart, 1991; Haltiwanger et al., 1992). Further, it has been suggested that phosphorylation and glycosylation may compete for the same site within a protein backbone (Kearse and Hart, 1991). In SRF, the O-GlcNAc attachment sites that we have identified between residues 274 and 391 are well removed from the major casein kinase II phosphorylation site (residues 77-85). It is, however, conceivable that glycosylation and phosphorylation of SRF are interdependent via allosteric mechanisms.

The presence of O-GlcNAc on a number of proteins that are known to form multimers, including the nuclear pore proteins, erythrocyte Band 4.1, and crystallins, has prompted speculation that O-GlcNAc may be involved in the organization of multiprotein complexes (Holt et al., 1987a and 1987b; Hart et al., 1989a; Haltiwanger et al., 1992). Indirect evidence for the involvement of O-GlcNAc in transcriptional activation has been provided by earlier studies on the transcription factor Sp1, which demonstrate that the GlcNAc-binding lectin wheat germ agglutinin inhibited Sp1-mediated transcriptional activation (Jackson and Tjian, 1988). Also, Sp1 synthesized in Escherichia coli expression systems, which was therefore not glycosylated, only exhibited one-third of the activity of the native protein (Jackson and Tjian, 1988). Sequences in the C-terminal region of SRF are involved in transcriptional activation observed using a HeLa cell-derived in vitro transcription system. Our observation that one of the O-GlcNAc attachment sites in SRF is located within this transcriptional activation domain and three others in the linking domain between the DNA binding and transcriptional activation domains provides some further support for the proposition that O-GlcNAc may play a role in the control of transcription.
In in vitro assays, recombinant SRF produced in bacteria functions with an efficiency equivalent to that of the native HeLa cell protein (Manak et al., 1990; Marais et al., 1992), a finding that suggests that glycosylation is not necessary for the transcriptional activation function of the protein, at least in in vitro systems. However, based upon findings in reticulocyte lysates (Starr and Hanover, 1990), it is possible that in vitro addition of O-GlcNAc to bacterial SRF is catalyzed by enzymes in extracts used to study transcription, and thus the data obtained using recombinant SRF may have been afforded by glycosylated material. It is notable, however, that in a deletion analysis of the C-terminal transcriptional activation region of SRF, we found that deletion of sequences including the first three glycosylation sites has no effect on transcriptional activation, as measured in vitro. Possible in vivo functions of O-GlcNAc on transcription factors such as SRF include: regulation of their turnover or nuclear localization, control of their phosphorylation state, or mediation of their reversible association with other transcription factors and/or RNA polymerase II. Clearly, careful consideration of experimental design and in vitro assay systems will be required to rigorously evaluate these possibilities.
The work reported in this paper constitutes the first physical molecular characterization of O-GlcNAc in a transcription factor. The post-translational modification information determined here opens up a wide range of experiments, including site-directed mutagenesis to prevent glycosylation of the protein in vivo, that should lead to a clarification of the biological function of this unusual glycosylation.