Isolation of the GSY1 Gene Encoding Yeast Glycogen Synthase and Evidence for the Existence of a Second Gene

by the gene. The data are explained if S. cerevisiae has two glycogen synthase genes encoding proteins with significant sequence similarity. The protein sequence predicted by the GSY1 gene lacks the extreme NH2-terminal phosphorylation sites of the mammalian enzymes. The COOH-terminal phosphorylated region of the mammalian enzyme overall displayed low identity to the yeast COOH terminus, but there was homology in the region of the mammalian phosphorylation sites 3 and 4. Three potential cyclic AMP-dependent protein kinase sites are located in this region of the yeast enzyme. The regions of glycogen synthase likely to be involved in covalent regulation are thus more variable than the catalytic center of the molecule.

Though protein phosphorylation is a common and widely distributed mechanism in nature for the regulation of protein function (1, 2), relatively little is known about the evolution of phosphorylation controls. The enzyme glycogen synthase (EC 2.4.1.11) is believed to regulate synthesis of the storage polysaccharide glycogen, and the mammalian glycogen synthase is regulated by phosphorylation at multiple sites. The rabbit muscle enzyme contains at least nine sites per ~85,000-dalton subunit that are phosphorylated in vivo and that, in vitro, make the enzyme a substrate for 10 or so protein kinases (3, 4). The phosphorylation occurs in a hierarchical manner, in which phosphorylation at some sites is the prerequisite for modification of others (5, 6). The enzyme is therefore an interesting candidate for comparative studies of protein phosphorylation.
In Saccharomyces cerevisiae, glycogen is synthesized in response to growth limitation (7-10), and the activity of glycogen synthase increases when, for example, entry into stationary phase is imminent. Work by Cabib and colleagues (11-16) had established that the S. cerevisiae glycogen synthase shared several properties with its mammalian counterpart, including allosteric activation by glucose-6-P (11, 13) and reversible covalent phosphorylation (12, 15). The full details of the phosphorylation control were not worked out, and there was little structural information available for the yeast enzyme. There is biochemical evidence implicating cAMP-dependent protein kinase in the phosphorylation (17). Recent work on cyclic AMP-mediated signaling pathways in yeast has also shown that many mutants defective in the cyclic AMP pathway have aberrant glycogen accumulation, supporting a role for cyclic AMP in the regulation of glycogen metabolism (18-21). There is also evidence, however, for cyclic AMP-independent control of glycogen accumulation (19, 22). We report the purification of glycogen synthase from S. cerevisiae, the cloning of a glycogen synthase gene, GSY1, and evidence for the existence of another gene encoding glycogen synthase.

Purification of Yeast Glycogen Synthase-Initial experiments were designed to establish the relationship of glycogen synthase activity to the growth curve of S. cerevisiae. Before the onset of the stationary phase, glycogen accumulation increased considerably with a concomitant increase in the total glycogen synthase activity (not shown), results basically in agreement with published reports (8, 37). Based on these trials, larger scale cultures were grown to early stationary phase for purification of glycogen synthase as described under "Experimental Procedures." Analysis of the product by SDS-PAGE (Fig. 2) indicated two major protein species, Mr 85,000 and 77,000, whose presence correlated with enzyme activity eluting from the final column. The 85-kDa species accounted for 15-20% of the protein. A polypeptide contaminant of Mr 100,000 (Fig. 2), more prominent in the fractions eluting earlier from the final column, was shown to be glycogen phosphorylase. Nondenaturing gel electrophoresis of the most active glycogen synthase fractions followed by an in situ activity stain (39) showed a broad band of activity that corresponded to the Coomassie Blue-stained material in parallel gels (data not shown).

Initial Protein Sequence Analysis and Cloning of a S. cerevisiae Glycogen Synthase Gene, GSY1-The purified enzyme, containing both 77- and 85-kDa species, was submitted to protein sequence analysis, and a single NH2-terminal sequence was obtained, X-X-D-L-Q-N-?H-L-L-F-E-V-A-X-E-V-T-N-X-V-, which had identifiable homology to mammalian glycogen synthase. In other experiments, the 77- and 85-kDa species were analyzed separately after transfer to polyvinylidene difluoride paper. The 85-kDa polypeptide gave results consistent with the NH2-terminal sequence noted above over six cycles, whereas the 77-kDa polypeptide, though present in a greater amount, gave no signal. We suspect, therefore, that the 77-kDa species was NH2-terminally blocked.
Analysis of CNBr fragments of the mixture by SDS-PAGE revealed two well defined and prominent species of 26 and 24.5 kDa, as well as several more poorly resolved species. The NH2-terminal sequences of these two prominent CNBr fragments were found to be the same: X-X-K-G-V-N-F-V-Y-G-N-X-L-I-E-G-A-P-X-V-. The protein sequence information was used to design oligonucleotide probes to screen a yeast genomic library (see "Experimental Procedures"). One clone, carrying a ~17-kb insert (Fig. 3), contained a sequence that matched one of the probes. A 4.2-kb HindIII-HindIII fragment hybridizing to the probe was subcloned, and a segment of 2533 bp was sequenced from both strands (Fig. 4). A 2124-bp open reading frame would encode a protein of 707 amino acids with overall 50% identity (Fig. 5) to rabbit or human muscle glycogen synthase, or 62% allowing conservative replacements (40, 41). The initiator ATG has G at the +4 position and A at the -3 position, in keeping with Kozak's consensus sequence (42). There are no in-frame upstream ATGs that are not followed by stop codons. The next in-frame ATG downstream is at bp 241. The predicted NH2 terminus, based on initiation at the first ATG, corresponds exactly to the NH2-terminal sequence derived from the 85-kDa polypeptide. We therefore conclude that the start codon is at position 1 as numbered in Fig. 4. Codon usage in the gene was only slightly biased toward that expected of highly expressed genes, with a codon bias index of 0.17 calculated according to Bennetzen and Hall (43). We propose to name the gene GSY1.
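The relationship between the open reading frame length and the protein length, and the meaning of the -3 and +4 positions, can be checked with a short sketch. It assumes that the reported 2124 bp include the stop codon (which is what makes the figure agree with 707 residues); the function names and the toy sequence are illustrative and are not taken from the paper.

```python
from typing import Tuple


def residues_from_orf_length(orf_bp: int) -> int:
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the last codon is the stop codon


def kozak_context(seq: str, atg_index: int) -> Tuple[str, str]:
    """Return the bases at -3 and +4, where the A of the ATG is +1.

    There is no position 0, so -3 lies three bases upstream of the A,
    and +4 is the base immediately after the ATG.
    """
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("sequence too short to read positions -3 and +4")
    return seq[atg_index - 3], seq[atg_index + 3]


# ORF length reported in the text.
print(residues_from_orf_length(2124))

# Hypothetical toy sequence, NOT the GSY1 sequence.
toy = "TTAAAATGGCT"
print(kozak_context(toy, toy.find("ATG")))
```

With the toy sequence, the second call reports A at -3 and G at +4, the same context the text describes for the GSY1 initiator codon.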
Disruption of the GSY1 Gene-The pRS306-GSY1Δ1 plasmid (see "Experimental Procedures") formed the basis for disruption of the GSY1 gene by homologous recombination (44). The plasmid was linearized by digestion with HindIII (Fig. 1). The resulting DNA contains 5' and 3' segments of the GSY1 gene but with almost all of the intervening coding region replaced by pRS306 sequences, including the URA3 gene. Using the LiAc method (45), we transformed haploid [...] To confirm that the desired replacement had occurred in the recombinants, DNA from wild-type and transformed cells was analyzed by Southern hybridization using a SpeI-NdeI fragment, corresponding to the coding region of the GSY1 gene, as a probe (Fig. 6). The DNA from wild-type cells (Fig. 6, track 1) contained the 4.2-kb HindIII-HindIII fragment predicted by analysis of the GSY1 gene. In the haploid mutant, this fragment was absent and a 6.5-kb fragment (Fig. 6, track 3), as expected from the construction of the pRS306-GSY1Δ1 plasmid (Fig. 1), was observed, confirming that the GSY1 gene had been disrupted. [...]ing fragments of expected sizes for the wild-type and mutant strains (Fig. 6, tracks 2 and 4). Since viable gsy1Δ1 haploids could be isolated, the GSY1 gene must not be essential. In fact, the Southern analysis revealed a second weakly hybridizing HindIII-HindIII fragment of ~12 kb present in DNA from both wild-type (Fig. 6, track 1) and mutant (Fig. 6, track 3) cells. The presence of hybridizing fragments not due to GSY1 was also apparent when DNA from the mutant was digested with other restriction enzyme combinations (Fig. 6).

In the peptide sequences, X indicates a cycle at which no assignment could be made, ? is an uncertain assignment, and / separates alternative assignments.

Fig. 5. Identities with the S. cerevisiae sequence are denoted by solid dots; gaps are shown as dashes. Numbering of the rat sequence includes the initiator methionine, whereas the numbering of the muscle and yeast sequences does not. Y, yeast; L, rat liver; M, rabbit muscle.

Fig. 6. DNA from either wild-type (wt) or mutant (gsy1Δ1) yeast strains was digested with the indicated restriction enzymes and analyzed as described under "Experimental Procedures." The filter was hybridized with a 2.3-kb SpeI-NdeI fragment containing the GSY1 coding region. The DNA was digested as follows: track 1, HindIII; track 2, HindIII and BamHI; track 3, HindIII; track 4, HindIII and BamHI; track 5, HindIII and PstI; track 6, HindIII and SalI; track 7, HindIII and EcoRI.

When a haploid yeast strain bearing the gsy1Δ1 mutation was grown to stationary phase, glycogen synthase activity could be detected in cell extracts. Thus, a large-scale culture was grown and subjected to the glycogen synthase purification scheme developed for the wild type (see "Experimental Procedures"). Glycogen synthase activity followed through the purification essentially as had been observed using wild-type yeast (Table I). Recoveries at most steps were comparable to the wild type, except that the yield on the Mono-Q column was somewhat lower for unknown reasons. The overall purification, 1200-fold, was similar to that for wild type. Analysis of the final product by SDS-PAGE revealed a predominant 77-kDa polypeptide species, but the 85-kDa species was absent (Fig. 7). The presence of the 85-kDa species thus correlated with the presence of an intact GSY1 gene.
Protein Sequence Analysis of 77-kDa Polypeptide-We also attempted to explore the relationship between the 77- and 85-kDa polypeptides by analysis of the proteins. In one set of experiments, the 77- and 85-kDa polypeptides were transferred to nitrocellulose paper and subjected separately to trypsin digestion. Analysis of the digests by reverse-phase microbore HPLC indicated similarity but not total identity in the profiles (not shown). Peptides that appeared to be good candidates for protein sequence analysis were collected directly onto sample filters. Three tryptic peptides from the 77-kDa species were found to have sequences consistent with the sequence predicted by the GSY1 gene, L-X-D-L-L-D-X-K, M-Y/G-L-E-Y/Q-V/F-K, and S-L-E-N-T-V-X-E-V-T-X-S-I-G-K, although there were some ambiguities in the sequence assignments. One peptide derived from the 77-kDa polypeptides gave a sequence, A-T-Y-Q-N-E-V-D-I-L-D-, which matched the GSY1 protein sequence with only 7/11 identities. We were unsuccessful in obtaining a sequence from peptides generated from the 85-kDa polypeptide, which is present in much lower amounts. In another series of experiments, glycogen synthase purified from a wild-type strain of yeast was found to be phosphorylated by either bovine or yeast (see "Experimental Procedures") cyclic AMP-dependent protein kinase (data not shown). Both the 77- and 85-kDa polypeptides were modified to similar extents. Analysis of phosphoamino acids by partial acid hydrolysis of enzyme phosphorylated by yeast cyclic AMP-dependent protein kinase indicated that only serine residues were phosphorylated (not shown). Enzyme phosphorylated to 0.25 and 0.35 mol of phosphate/mol of polypeptide (in the 85- and 77-kDa species, respectively) was subjected to SDS-PAGE, transferred to polyvinylidene difluoride paper, digested with trypsin, and analyzed by microbore HPLC as described above.
Two phosphopeptides from the 77-kDa species, accounting for some 50% of the applied radioactivity, were successfully analyzed. Both peptides had the same NH2-terminal sequences, S-N-X-T-V-Y-M-X-P-G- and S-N?-S?-T-V-Y-M-X-P-, which could be aligned, starting at residue 659, with the GSY1 protein sequence.
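The statement that two sequencer reads are "the same" despite unassigned and uncertain cycles can be made concrete with a small sketch. It uses the paper's notation (X for no assignment, ? for an uncertain assignment, / for alternatives) and assumes, for illustration only, that an uncertain call is treated like a normal call and that an X is compatible with any residue. The function names are hypothetical.

```python
from typing import FrozenSet, List, Optional

# One sequencing cycle: a set of candidate residues, or None for "X".
Position = Optional[FrozenSet[str]]


def parse_read(read: str) -> List[Position]:
    """Parse a read such as 'M-Y/G-L-E-' into per-cycle candidate sets."""
    positions: List[Position] = []
    for token in read.strip("-").split("-"):
        token = token.replace("?", "")  # assumption: keep uncertain calls
        if token == "X":
            positions.append(None)  # no assignment at this cycle
        else:
            positions.append(frozenset(token.split("/")))
    return positions


def compatible(read_a: str, read_b: str) -> bool:
    """True if the reads agree at every cycle where both have a call."""
    for a, b in zip(parse_read(read_a), parse_read(read_b)):
        if a is None or b is None:
            continue  # an unassigned cycle cannot contradict anything
        if not (a & b):
            return False
    return True


# The two phosphopeptide reads quoted in the text.
print(compatible("S-N-X-T-V-Y-M-X-P-G-", "S-N?-S?-T-V-Y-M-X-P-"))
```

Under these assumptions the two phosphopeptide reads are compatible over their nine shared cycles. The same comparison could be applied to a read and a window of a predicted protein sequence written in the same hyphenated form.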

DISCUSSION
Yeast Glycogen Synthase Genes and Proteins-Glycogen synthase purified from wild-type S. cerevisiae contained two polypeptides of 77 and 85 kDa, and an obvious first question is whether and how these species are related. Separation of the two species was not achieved with the various chromatographic methods used or with nondenaturing gel electrophoresis, suggesting either physical association or similarity in properties. The 85-kDa species had an NH2-terminal sequence predicted by the GSY1 gene, and the 77-kDa species was present in yeast in which the GSY1 gene had been disrupted. However, the sequences of several peptides derived from the 77-kDa polypeptides were either consistent with, or in one case only slightly different from, sequences predicted by the [...] of yeast genomic DNA revealed a hybridizing fragment not predicted for the GSY1 gene, the data can best be explained if there are two glycogen synthase genes that code for proteins with significant primary structural similarity. This idea would also be consistent with the fact that, although numerous yeast mutants with aberrant glycogen accumulation characteristics have been identified, none of the mutations analyzed has been in a glycogen synthase gene. The chances of isolating glycogen synthase mutants are obviously much reduced if there are multiple copies of the gene.
Our hypothesis is that the gene we have cloned, GSY1, would code for the 85-kDa polypeptides while a second gene would code for the 77-kDa polypeptides, which may additionally undergo posttranslational modification that blocks the NH2 terminus. Whether there is a physiological rationale for the existence of two genes will have to await further work.

[...] Synthase-Definition of the phosphorylation sites in yeast glycogen synthase will require much more work. Our own initial efforts are complicated by the fact that we were successful in obtaining sequence information only from phosphopeptides derived from the 77-kDa polypeptide, which we do not think is coded by GSY1. However, the sequence obtained matched a COOH-terminal region of the predicted GSY1 product, starting at Ser-659, a residue preceded by an arginine. Assuming, as seems reason[...]

Amino Acid Sequence of Yeast Glycogen Synthase-The predicted mass of the yeast glycogen synthase coded by GSY1 is 80,501 daltons with a calculated pI of 5.6. The overall acidic character of the protein is similar to that found for mammalian glycogen synthases (40, 41, 46) and reflects a high proportion of Glu and Asp residues (13% in rabbit muscle and 11% in yeast). As observed with the mammalian enzymes, the COOH terminus carries a net negative charge, though aspartic acid predominates over glutamic acid in the yeast protein. The S. cerevisiae glycogen synthase sequence is 50% identical to that of the mammalian muscle enzymes and 47% identical to that of rat liver. Searches of protein sequence databases did not reveal strong similarities between the yeast glycogen synthase and other proteins. An alignment of yeast glycogen synthase with the NH2 terminus of E. coli glycogen synthase (47) can be made: 19.2% identity to residues 1-313 of the E. coli enzyme and 31.6% allowing conservative substitutions. This level of similarity is not in itself remarkable except that it includes residues potentially involved in substrate binding. Lys-38 of rabbit muscle glycogen synthase has been implicated in UDP-glucose binding (48, 49) and is in the local sequence -K-V-G-G-. The yeast enzyme has an arginine in this position with sequence -R-V-G-G-. Lys-15 of the E. coli enzyme was recently reported to be involved in binding its substrate ADP-glucose (50). The local sequence is -K-T-G-G-, and Furukawa et al. (50) have suggested that the motif -R/K-X-G-G- might be a feature of nucleoside sugar binding sites. Interestingly, the E. coli enzyme also contains a repeat including this motif toward its COOH terminus (-R-T-G-G-L-A-D, residues 378-383); perhaps this is part of another ligand binding site.
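The -R/K-X-G-G- motif discussed above translates directly into a regular expression. The sketch below is only an illustration: the pattern and function name are not from the paper, and the inputs are the short local sequences quoted in the text rather than full-length proteins, so the reported positions are relative to those fragments.

```python
import re
from typing import List, Tuple

# Arg or Lys, then any residue, then Gly-Gly (one-letter codes).
MOTIF = re.compile(r"[RK].GG")


def find_motif(seq: str) -> List[Tuple[int, str]]:
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(seq)]


# Local sequences quoted in the text, not complete protein sequences.
local_sequences = {
    "rabbit muscle (around Lys-38)": "KVGG",
    "yeast (equivalent position)": "RVGG",
    "E. coli (around Lys-15)": "KTGG",
    "E. coli (COOH-terminal repeat)": "RTGGLAD",
}

for name, seq in local_sequences.items():
    print(name, find_motif(seq))
```

Each of the four fragments yields one hit starting at position 1. A motif this short would be expected to occur by chance in many proteins, so a match on its own says little about function.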