Molecular cloning and expression of the regulatory (RG1) subunit of the glycogen-associated protein phosphatase.

DNA clones encoding the glycogen-binding (RG1) subunit of glycogen-associated protein phosphatase were isolated from rabbit skeletal muscle lambda gt11 cDNA libraries. Overlapping clones provided an open reading frame of 3327 nucleotides that predicts a polypeptide of 1109 amino acids with a molecular weight of 124,257. Northern hybridization of rabbit RNA identified a major mRNA transcript of 7.5 kilobases present in skeletal, diaphragm, and cardiac muscle, but not in brain, kidney, liver, and lung. Southern analysis of rabbit genomic DNA digested with various restriction endonucleases gave rise to a single hybridizing fragment, suggesting that a single gene is present. Expression of the complete RG1 subunit coding sequence in Escherichia coli generated a protein of apparent molecular weight on sodium dodecyl sulfate-polyacrylamide gel electrophoresis of approximately 160,000, similar to the size of the polypeptide detected by Western immunoblot in rabbit skeletal muscle extracts. The RG1 subunit shares significant homology with the Saccharomyces cerevisiae GAC1 gene product which is involved in activation of glycogen synthase and glycogen accumulation. The homology with GAC1 substantiates the role of this enzyme in control of glycogen metabolism. Hydropathy analysis of the RG1 subunit amino acid sequence revealed the presence of a hydrophobic region in the COOH terminus, suggesting a potential association with membrane. This result suggests that the same phosphatase regulatory component may be involved in targeting the enzyme both to membranes and to glycogen.

Protein phosphorylation is a major mechanism by which many cellular functions are regulated. The coordinated control of both types of interconverting enzymes, protein kinases and protein phosphatases, determines the phosphorylation and hence modulation of the activity of intracellular target proteins. Regulation of protein kinases has been extensively studied and for some forms it is reasonably well understood (1-6), whereas knowledge and understanding of the potential control of protein phosphatases is incomplete. The major cellular serine/threonine protein phosphatases have been classified into two categories, type 1 and type 2, according to their substrate specificities and their sensitivities to inhibitor proteins (7,8). The cDNAs coding for the catalytic subunits of four forms (type 1, 2A, 2B, and 2C) included in this classification have been cloned (9-16), and with the exception of the 2C protein phosphatase, all appear to be derived from the same gene family. At least four forms of type 1 phosphatases differing in the associated regulatory subunit have been identified: (a) the ATP-Mg-dependent phosphatase (17-19), (b) the glycogen-associated phosphatase (20), (c) the sarcoplasmic reticulum-associated phosphatase (21,22), and (d) the myosin-associated phosphatase (23,24). There is also evidence for nuclear-associated type 1 phosphatases (25,26). They all appear to share a similar 37-kDa catalytic subunit complexed to different regulatory subunits that are responsible for regulation and/or targeting of the enzymes to specific cellular locales (8,9). These regulatory subunits are (a) inhibitor-2, (b) the glycogen-binding or RG1 subunit, (c) the sarcoplasmic reticulum-binding subunit, and (d) a putative myosin-binding subunit.
A recent report has shown that the properties of the sarcoplasmic reticulum-associated phosphatase in rabbit skeletal muscle are similar to those of the glycogen-bound phosphatase, suggesting that the RG1 subunit might play a dual role in targeting type 1 phosphatase to two different subcellular locations, glycogen and membranes (22).
Up to 60% of rabbit skeletal muscle phosphorylase phosphatase is associated with glycogen (27,28). This glycogen-bound protein phosphatase was first identified and purified as a 137-kDa heterodimer, consisting of a 37-kDa catalytic subunit and a 103-kDa regulatory component, the RG1 subunit (20). However, the RG1 subunit is extremely sensitive to proteolysis. More recent analysis by Western immunoblotting indicated that the intact subunit has a Mr of 160,000-170,000 (28,29). Its function appears to be targeting the phosphatase to the glycogen particle, where several of the enzymes involved in glycogen metabolism, such as glycogen synthase and phosphorylase kinase, are located. The RG1 subunit is phosphorylated in vitro by the cAMP-dependent protein kinase at two sites (29). Phosphorylation of site 2 has been proposed to cause dissociation of the catalytic subunit (30). Studies from our (31) and Cohen's (32) laboratories also revealed a complex multisite phosphorylation of the RG1 subunit. Phosphorylation by cAMP-dependent protein kinase formed the recognition sites for other protein kinases such as glycogen synthase kinase-3 and casein kinase II. The cAMP-dependent protein kinase sites and one of the glycogen synthase kinase-3 sites (33) have been shown to be phosphorylated in vivo. Epinephrine is reported to enhance significantly the phosphate content of site 2 (33), whereas insulin leads to increased phosphorylation of site 1 (34).

1 The abbreviations used are: RG1, glycogen-binding regulatory subunit of type 1 protein phosphatase. This protein had previously been designated "G subunit," causing some confusion with GTP binding proteins. We propose the present nomenclature as a means to avoid this confusion. In addition, other type 1 phosphatase regulatory subunits can be named in a parallel fashion, RSR and RMY for the sarcoplasmic reticulum and myosin associated forms, respectively, and RI-2 for inhibitor-2. The other abbreviations used are: HX, random hexamer-primed; DT, oligo(dT)-primed; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; kb, kilobase(s); IPTG, isopropyl-β-thiogalactopyranoside; TLCK, 1-chloro-3-tosylamido-7-amino-2-heptanone; EGTA, [ethylenebis(oxyethylenenitrilo)]tetraacetic acid.
To investigate the native structure and the role of the RG1 subunit in the regulation of the glycogen-associated phosphatase, we have undertaken the molecular cloning of cDNAs coding for this subunit. This paper reports the first isolation and characterization of cDNA clones encoding the RG1 subunit. The deduced amino acid sequence of the entire translated region, the tissue-specific distribution, and the expression of the coding sequences in Escherichia coli are presented.

EXPERIMENTAL PROCEDURES
Materials-The bacteriophage T7 polymerase expression system was generously provided by Dr. F. W. Studier (Brookhaven National Laboratories). Oligonucleotides were synthesized in an Applied Biosystems DNA synthesizer model 380A. Restriction and other DNA modifying enzymes, M13 vectors, and agarose were purchased from Bethesda Research Laboratories. GeneScribe-Z vectors and random-primed DNA labeling kits were obtained from United States Biochemicals Corp. Deoxy- and dideoxynucleotide triphosphates were from Pharmacia LKB Biotechnology Inc. Radionucleotides were purchased from Du Pont-New England Nuclear, and 125I-protein A was from ICN. Other general reagents were from Bethesda Research Laboratories, Sigma, and Boehringer Mannheim.
Isolation and Sequence Determination of cDNA Clones-Two rabbit skeletal muscle random hexamer-primed λgt11 cDNA libraries containing, respectively, 10⁶ and 6 × 10⁶ independent recombinants and one rabbit muscle oligo(dT)-primed λgt11 cDNA library containing 10⁷ independent recombinants were constructed (35). To isolate cDNA clones encoding the RG1 subunit, two oligonucleotide probes were synthesized, corresponding to the amino acid sequences around the cAMP-dependent protein kinase phosphorylation sites: G-COMP1, GGCGGTGTGCACGTACACCTCCTC, and G-COMP2, TT(G/A)AA(G/C)CC(G/A)AA(G/A)TT(G/A)TCGGCGAA. The choice of the codons was based on the most frequent codon usage for rabbit (36). The two oligonucleotides, 5'-end-labeled with T4 polynucleotide kinase and [γ-32P]ATP, were used to screen on duplicate filters approximately 160,000 recombinants from the rabbit skeletal muscle random hexamer-primed unamplified library. Hybridization was performed at 50 °C in a solution containing 10 × Denhardt's (0.2% (w/v) each of Ficoll, polyvinylpyrrolidone, and gelatin), 6 × SSPE (1 × SSPE: 0.15 M sodium chloride, 10 mM sodium phosphate, and 1 mM EDTA, pH 7.4), 0.05% NaPPi, 0.1% SDS, and the radiolabeled probe (2 × 10⁶ cpm/ml). Nitrocellulose filters were washed in 6 × SSC (1 × SSC: 0.15 M sodium chloride and 15 mM sodium citrate, pH 7) at room temperature for 10 min twice followed by 20 min at 50 °C. Positive clones were plaque-purified by consecutive screening. DNA from positive recombinant phages was isolated (37) and the cDNA inserts were subcloned into the GeneScribe-Z vector, pTZ19U, for restriction endonuclease analysis and into M13 vectors for DNA sequencing (38). To obtain the entire coding sequences, confirmed cDNA fragments labeled by the random hexamer priming method (39) [...] the random hexamer-primed library and 160,000 recombinants from the oligo(dT)-primed library. Hybridization of the filters was carried out at 55 °C in the solution described above. Phages yielding positive signals were isolated and the cDNA inserts subcloned in pTZ19U for double-stranded DNA sequencing by the dideoxy chain termination method (40) utilizing vector- and cDNA-specific oligonucleotide primers.

2 [...] Roeske, P. J. Roach, and A. A. DePaoli-Roach, manuscript in preparation.
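The relationship between the two antisense probes and the peptide sequences they target can be illustrated with a short sketch. It assumes only the standard genetic code and the probe strings quoted above; the helper names (`parse_probe`, `expand`, `reverse_complement`, `translate`) are hypothetical and not part of the original work. Reverse-complementing each probe and translating in the first frame gives the peptide stretch it was presumably designed against.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Probe sequences as quoted in the text (written 5' to 3').
G_COMP1 = "GGCGGTGTGCACGTACACCTCCTC"
G_COMP2 = "TT(G/A)AA(G/C)CC(G/A)AA(G/A)TT(G/A)TCGGCGAA"


def parse_probe(text):
    """Split a probe such as 'TT(G/A)AA' into per-position base choices."""
    positions, i = [], 0
    while i < len(text):
        if text[i] == "(":
            close = text.index(")", i)
            positions.append(text[i + 1:close].replace("/", ""))
            i = close + 1
        else:
            positions.append(text[i])
            i += 1
    return positions


def expand(text):
    """List every concrete oligonucleotide encoded by a degenerate probe."""
    return ["".join(choice) for choice in product(*parse_probe(text))]


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons in frame 1, ignoring trailing bases."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


comp1_peptide = translate(reverse_complement(G_COMP1))
comp2_variants = expand(G_COMP2)
comp2_peptides = {translate(reverse_complement(v)) for v in comp2_variants}

print(comp1_peptide)        # sense-strand translation of G-COMP1
print(len(comp2_variants))  # number of oligonucleotides in the G-COMP2 mixture
print(comp2_peptides)       # peptide(s) encoded by every member of the mixture
```

Under these assumptions G-COMP1 decodes to a stretch beginning EEVYV, the last five residues of the synthetic peptide KPGFSPQPSRRGSESSEEVYV described under "Western Blot Analysis", and all members of the G-COMP2 mixture encode the same heptapeptide, as expected when degeneracy is confined to third codon positions.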
Isolation of Genomic DNA Clones-A rabbit genomic library constructed in λ phage Charon 4A (41), kindly provided by Dr. R. C. Hardison (The Pennsylvania State University), was screened (3.8 × 10⁵ plaque-forming units) with the cDNA fragment (995 bp) obtained from one clone (HX 1-1) and an oligonucleotide corresponding to nucleotides 61-79. One positive clone was identified. The DNA was isolated, digested with restriction enzymes, and analyzed by Southern blot (42). Fragments hybridizing with the cDNA probe were subcloned into pTZ19U vector for sequencing, utilizing vector- and cDNA-specific oligonucleotide primers.
Northern Blot Analysis-Total RNA from rabbit brain, kidney, liver, lung and skeletal, diaphragm, and heart muscle was isolated by the method of Chirgwin et al. (43). RNA samples (15 µg) were electrophoresed through a 0.8% agarose/formaldehyde gel and transferred to nitrocellulose membrane in 10 × SSC (44). Prehybridization was carried out at 55 °C in a solution containing 10 × Denhardt's, 6 × SSPE, 0.05% sodium pyrophosphate, 0.1% SDS, and 0.1 mg/ml Torula RNA. The membrane was hybridized at 55 °C with two 32P-labeled cDNA fragments (2 × 10⁶ cpm/ml) comprising a total of 3854 bp. After hybridization, the filter was washed two times for 10 min at room temperature followed by 30 min at 68 °C. Autoradiography was performed utilizing Du Pont Quanta III intensifying screens.
Primer Extension-A 19-mer synthetic oligonucleotide complementary to residues 61-79 of rabbit RG1 subunit cDNA was used as a primer. The 32P-end-labeled oligonucleotide (1.5 × 10⁶ cpm) was annealed to 25 µg of rabbit skeletal muscle, or 30 µg of diaphragm or lung, total RNA. The extension reaction was carried out at 37 °C for 1 h in a 50-µl volume containing first strand buffer (50 mM Tris-HCl, pH 8.3, 75 mM KCl, 3 mM MgCl2, and 10 mM dithiothreitol), 2.5 mM dNTPs, 500 units of Moloney murine leukemia virus reverse transcriptase (Bethesda Research Laboratories), and 625 µg of actinomycin D. The resulting cDNA was analyzed and compared with the sequence of the 6.6-kb SphI genomic DNA fragment on a 6% polyacrylamide sequencing gel.
Southern Analysis of Genomic DNA-Rabbit genomic DNA (20 µg) digested with the restriction enzymes EcoRI, HindIII, and XbaI was separated on a 1% agarose gel and transferred to a nitrocellulose membrane. Following prehybridization at 68 °C in a solution containing 10 × Denhardt's, 6 × SSPE, 0.05% sodium pyrophosphate, 0.1% SDS, and 0.1 mg/ml Torula RNA, the membrane was hybridized with the 995-bp cDNA insert of clone HX 1-1 labeled by the nick translation method (1 × 10⁶ cpm/ml) (45). The membrane was washed twice for 10 min at room temperature and once for 40 min at 68 °C in 6 × SSC, 0.1% SDS, and 0.05% sodium pyrophosphate, followed by a further wash at 68 °C in 2 × SSC for 30 min.
Construction of G.pET-8c Expression Vector-Three cDNA clones, HX 1-1 (995 bp), HX 5-1 (1412 bp), and HX 1-2 (2951 bp), were used to assemble the entire coding sequence. The 341-bp EcoRI-BglII portion of clone HX 1-1 was ligated with the 1102-bp BglII-EcoRI portion of clone HX 5-1 at the BglII site to generate a DNA fragment containing the 5'-most sequences. The resulting DNA fragment was digested with either RsaI and BglII or with SphI and BglII to produce a 340-bp RsaI-BglII and a 746-bp BglII-SphI DNA fragment. Two complementary oligonucleotides (18-mer and 14-mer), corresponding to the sequences from the start ATG to the RsaI site at nucleotide 28, were synthesized and annealed to generate an adaptor containing a cohesive NcoI site at the 5'-end and a RsaI site at the 3'-end. This adaptor was ligated to the RsaI-BglII fragment, which was subsequently joined to the BglII-SphI fragment. The resulting NcoI-SphI fragment was further ligated to a SphI-BamHI fragment derived from the HX 1-2 clone, giving rise to a 3872-bp NcoI-BamHI fragment, which was then inserted into the pET-8c expression vector (46, 47) cleaved at the NcoI and BamHI sites. This plasmid will be referred to as G.pET-8c. The nucleotide sequences in the promoter region and the adaptor oligonucleotides were confirmed by sequencing.

[...] by adding 0.5 or 1.0 mM isopropyl-β-D-thiogalactopyranoside (IPTG), and the cell growth was continued for 3 h at 30 °C. Cells were harvested by centrifugation at 7,000 × g for 15 min and resuspended in 10 volumes of buffer containing 50 mM Tris-HCl, pH 7.5, 1 mM EDTA, 0.5 mM phenylmethylsulfonyl fluoride, 0.1 mM TLCK, 2 mM benzamidine, 10 µg/ml leupeptin, 50 mM β-mercaptoethanol, and 1% Triton X-100. After freezing at -80 °C overnight, the cells were thawed and sonicated for 20 s twice. The lysate was centrifuged at 9,000 × g for 20 min.
A sample of 1.4 µl each of the whole cell lysate and the Triton-soluble fraction was analyzed by SDS-PAGE according to Laemmli (48) and by Western immunoblotting (49).

Western Blot Analysis-A synthetic peptide, KPGFSPQPSRRGSESSEEVYV, surrounding the cAMP-dependent protein kinase phosphorylation site 1 on the RG1 subunit was synthesized and used to raise antibodies (anti-RG1) in guinea pigs. The antibodies were affinity-purified on Sepharose 4B coupled to the peptide. Rabbit skeletal muscle was homogenized in 3 volumes of 50 mM Tris-HCl, pH 7.5, 1 mM EDTA, 1 mM EGTA, 0.5 mM phenylmethylsulfonyl fluoride, 2 mM benzamidine, 0.1 mM TLCK, 10 µg/ml leupeptin and centrifuged for 20 min at 10,000 × g. A 1-µl sample of the rabbit skeletal muscle soluble extract and samples of E. coli extract prepared as described above were subjected to 7.5% SDS-PAGE. For immunoblotting, gels were equilibrated for 20 min in 20 mM Tris, 190 mM glycine, 20% methanol, sandwiched with nitrocellulose membrane, and subjected to ice-cooled transverse electrophoresis at 100 V for 2 h. The nitrocellulose filter was blocked overnight at room temperature in 5% powdered milk in PBS-T (20 mM sodium phosphate, pH 7.4, 115 mM sodium chloride, and 0.1% Tween 20) and then incubated for 2 h with anti-RG1 antibody (5 µg/ml in PBS-T). Bound antibodies were detected by incubating the filter for 1 h in PBS-T containing 0.2 µCi/ml of 125I-protein A. After removal of unreacted protein A, the filter was subjected to autoradiography.

RESULTS
Isolation and Characterization of cDNA and Genomic DNA Clones-The initial screening of the rabbit skeletal muscle random hexamer-primed λgt11 cDNA library identified two positive clones that hybridized with both oligonucleotide probes G-COMP1 and G-COMP2. Nucleotide sequencing confirmed that clone HX 1-1 (995 bp) coded for the available amino acid sequences (50) of the RG1 subunit polypeptide.
This labeled cDNA fragment was used to rescreen the original filters and an additional 160,000 plaque-forming units. Clones isolated from this screen were used for subsequent screening of another random hexamer-primed and an oligo(dT)-primed cDNA library. A total of 630,000 independent recombinants were screened, and 74 positive clones were identified, out of which 23 were characterized. Complete nucleotide sequences were determined from six clones (HX 1-1, 1-2, 5-1, 11-1, 13-2, and DT 6-1), whereas partial sequences were obtained from other clones including HX 11-2, 15-1, 16-1, 18-1, 18-2, 18B-1, 21-1, and DT 3-1 (Fig. 1). These overlapping clones provided a combined sequence with an open reading frame of 3317 nucleotides and 537 nucleotides downstream of a stop codon. However, no in-frame ATG codon upstream of the known amino acid sequence was found. To search for the translational start site, the HX 1-1 cDNA (995 bp), containing the most 5' sequences, and a synthetic oligonucleotide from nucleotide 61 to 79 were used to screen a rabbit genomic library. Southern blot analysis of the single positive clone identified a 6.6-kb SphI genomic fragment which hybridized with the 995-bp cDNA fragment. Sequence analysis revealed that 108 bp of an intron were present at the 3'-end of the SphI fragment, which continued upstream in the coding region at nucleotide 799. By utilizing as primer an oligonucleotide corresponding to nucleotides 61-79, sequences were extended at the 5'-end by 291 bp. In this region there were several stop codons in all reading frames and no intron/exon boundary consensus sequence (51). Only two ATG codons were found, both in the same reading frame. The first was followed by an in-frame stop codon and the second was located 10 bp upstream from the cDNA 5'-end. The GCCCAATGG sequence around the second ATG is in reasonable agreement with Kozak's consensus sequence (52). Thus, it seems likely that this ATG is the translational initiation codon.
The COOH terminus is defined by a TAA codon which is followed by several other stop codons in all reading frames. No polyadenylation signal was found in the 537 bp of the 3'-untranslated region. Sequence data from all the clones establish an open reading frame of 3327 bases (Fig. 2) encoding a protein of 1109 amino acids with a Mr of 124,257. The amino acid sequences of six peptides obtained from purified rabbit skeletal muscle RG1 subunit, provided by Dr. Philip Cohen (University of Dundee, Dundee, Scotland), were present in the deduced sequences (Fig. 2) with only two mismatches, cysteine residues at positions 26 and 183 for tyrosine and serine, respectively. Hydropathy analysis by the method of Rao and Argos (53) indicated a region at the COOH terminus, between residues 1063 and 1097, that is rich in hydrophobic residues and predicts a transmembrane helix. Examination of the nucleotide sequences of all the cDNA clones characterized indicated the existence of two groups, clones HX 13-2, HX 1-2, HX 11-2, and HX 16-1 in one group and clones HX 1-1, HX 11-1, HX 5-1, and HX 15-1 in the second. The two groups differed in six nucleotides, at positions 703, 744, 1176, 1302, 944, and 1251. The last two caused changes in amino acid sequences at residue 311 from threonine to methionine and at residue 413 from asparagine to lysine, respectively. The other four differences were silent. Most likely this discrepancy is due to allelic variations.

[Fig. 2. Nucleotide sequence and deduced amino acid sequence of the RG1 subunit cDNA.]
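The six nucleotide differences between the two clone groups can be mapped onto codons with simple arithmetic. The sketch below assumes that nucleotide numbering starts at the transcriptional start point, so that the 12-nucleotide 5'-untranslated region places the A of the initiating ATG at nucleotide 13; that offset and the function name `codon_of` are assumptions made for illustration, not statements from the original figure.

```python
ORF_START = 13      # assumption: 12-nt 5'-untranslated region precedes the ATG
ORF_LENGTH = 3327   # open reading frame length reported in the text
N_RESIDUES = 1109   # deduced polypeptide length reported in the text


def codon_of(nt_position, orf_start=ORF_START):
    """Return (residue number, position within the codon, 1 to 3)."""
    offset = nt_position - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the start codon")
    return offset // 3 + 1, offset % 3 + 1


# The reading frame should contain a whole number of codons.
assert ORF_LENGTH == 3 * N_RESIDUES

for nt in (703, 744, 1176, 1302, 944, 1251):
    residue, codon_position = codon_of(nt)
    print(f"nt {nt}: residue {residue}, codon position {codon_position}")
```

With this numbering, nucleotides 944 and 1251 fall in codons 311 and 413, the two residues reported to change. A second-position change at codon 311 is compatible with threonine (ACG) to methionine (ATG), and a third-position change at codon 413 with asparagine (AAC or AAT) to lysine (AAA or AAG). Three of the four silent differences fall on third codon positions.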
Determination of the Transcriptional Start Site-Primer extension analysis was carried out in order to map the 5'-end of the rabbit RG1 subunit transcript. Utilizing total RNA prepared from rabbit skeletal muscle, a major transcriptional start point at a C located 12 nucleotides upstream from the translational start codon ATG was identified (Fig. 3). This putative cap site was observed also in RNA prepared from rabbit diaphragm and cardiac muscle but not from lung (data not shown), in which no homologous RG1 subunit appears to be expressed.
Tissue Distribution of the RG1 Subunit mRNA and Southern Analysis of Genomic DNA-The tissue distribution and the complexity of the RG1 subunit mRNA were investigated by Northern analyses. Total RNA from various rabbit tissues was hybridized with the labeled cDNA inserts from clone HX 1-1 (995 bp) and from clone HX 1-2 (2951 bp) as probes. A major hybridizing mRNA species at 7.5 kb was observed in skeletal, diaphragm, and cardiac muscle (Fig. 4). Minor species of approximately 3.5 kb were also observed at low stringency. The level of the mRNA appeared to be higher in skeletal than in cardiac muscle. However, none of the mRNA species was present in brain, kidney, liver, and lung, although staining of the gel with ethidium bromide indicated that comparable amounts of RNA were loaded in each track. Southern blot analysis of rabbit genomic DNA digested with different restriction enzymes and probed with the 995-bp cDNA fragment of clone HX 1-1 gave rise to a single hybridizing band (Fig. 5), suggesting the presence of only one gene.

Expression of Recombinant RG1 Subunit in E. coli Cells-The structure of the polypeptide encoded by the composite cDNA was examined by expression in E. coli. Construction of the expression vector G.pET-8c as described under "Experimental Procedures" is illustrated in Fig. 6. E. coli BL21(DE3) cells transfected with the G.pET-8c plasmid were grown and lysed as described under "Experimental Procedures." Analysis of the cell extracts by Western immunoblotting indicated the presence of three major immunoreactive polypeptides, one of which had an apparent Mr of approximately 160,000, similar to that observed in rabbit skeletal muscle extracts (Fig. 7). Similar results were also obtained when the E. coli cells were directly treated with 0.5% SDS at 100 °C for 5 min before Western analysis (not shown). The amount of immunoreactive material increased with increasing concentrations of IPTG from 0.5 to 1 mM (Fig. 7, panel B: lanes 2, 3, 5, and 6) and was absent in extracts from untransfected cells (lanes 7-9). The slight amount of polypeptides detected in extracts of G.pET-8c-harboring cells not induced by IPTG is attributed to the basal T7 RNA polymerase activity. Increasing time (data not shown) and induction by IPTG appeared to generate proportionally more of the lower molecular weight species.
When the cells were lysed in buffer without 1% Triton X-100, most of the immunoreactive material was associated with the particulate fraction (data not shown). Expression of RG1 subunit cDNA was achieved only when freshly transfected cells were used. Transfected BL21(DE3) cells stored at -80 °C in 15% glycerol or on agar plates significantly lost their ability to express immunoreactive polypeptides.

[Fig. 6. Construction of the G.pET-8c expression vector. The open arrow indicates the assembled RG1 subunit cDNA. The orientations of the ampicillin-resistance gene (bla) and of the replication origin (ori) are shown by arrows.]

DISCUSSION
We report the isolation and characterization of cDNAs encoding the regulatory (RG1) subunit of the rabbit skeletal muscle glycogen-associated type 1 protein phosphatase. Sequencing of 23 cDNA clones failed to provide a translational ATG start codon, which was obtained from the isolation of a rabbit genomic clone. Primer extension analysis indicated that the 5'-untranslated region is very short, 12 nucleotides. The procedure of Gubler and Hoffman (54) [...] 124,257. Immunoblot analysis of rabbit skeletal muscle extracts had estimated the apparent molecular weight of the RG1 subunit to be approximately 160,000 (28,29), which is clearly larger than that deduced from the nucleotide sequence. Two explanations could be advanced for this discrepancy: the first is that we did not have the complete coding region, and the second that the rabbit skeletal muscle RG1 subunit is post-translationally modified, leading to a lower mobility on SDS-PAGE. However, expression of the cDNA in E. coli demonstrates that the deduced primary structure is complete, since a polypeptide with the same electrophoretic mobility as that present in skeletal muscle extract could be detected. Since post-translational modification in E. coli is unlikely, the larger size estimated by gel electrophoresis must be explained by an intrinsic property of the protein. Interestingly, two other regulatory proteins of type 1 phosphatase, inhibitor-1 and inhibitor-2, have also been shown to migrate anomalously on SDS-PAGE, with apparent molecular weights of 26,000 instead of 19,000 (55,56) and 31,000 instead of 23,000 (28, 57, 58), respectively. In all three instances the polypeptides have been shown to be highly asymmetric, as evidenced by their large Stokes radius and small sedimentation constants (20, 55-58). In all cases, the size estimated from the amino acid sequence is approximately 70% of that determined by SDS-PAGE. However, despite sharing these properties, the three proteins show no resemblance in sequence. The lower molecular weight polypeptides, 58 and 46 kDa (Fig. 7), detected in extracts of E. coli transfected with RG1 subunit cDNA, confirm the extreme sensitivity of the polypeptide to proteolysis. The observation that the same species are also present when the cells are disrupted directly by 100 °C heat treatment in the presence of SDS (data not shown) suggests that degradation might be occurring inside the cell and not during processing of the extracts. Such an occurrence is not unusual and has been reported for other mammalian polypeptides expressed in bacteria (59, 60). RG1 subunit degradation products of similar molecular weight have been observed in preparations of rabbit skeletal muscle glycogen-associated phosphatase (27,61), which might indicate that specific regions of the polypeptide are especially sensitive to proteolytic cleavage. We can exclude that the 58- and 46-kDa forms are generated by initiation of translation at a downstream ATG, since the antibodies used for the detection were raised to the region corresponding to residues 37 to 56 and no other ATGs are present in this NH2-terminal region.
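The comparison between sequence-derived and SDS-PAGE sizes made in this paragraph can be checked with a few lines of arithmetic. The sketch uses only the values quoted in the text; the variable names are illustrative.

```python
# (sequence-derived Mr, apparent Mr on SDS-PAGE), as quoted in the text
sizes = {
    "RG1 subunit": (124_257, 160_000),
    "inhibitor-1": (19_000, 26_000),
    "inhibitor-2": (23_000, 31_000),
}

# Fraction of the apparent size accounted for by the amino acid sequence.
ratios = {name: calculated / apparent
          for name, (calculated, apparent) in sizes.items()}

for name, ratio in ratios.items():
    print(f"{name}: calculated size is {ratio:.0%} of the apparent size")
```

The three ratios fall between roughly 73% and 78%, in line with the "approximately 70%" figure given for all three proteins.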
Recently Hubbard et al. (22) reported that a sarcoplasmic reticulum-associated phosphatase contains a polypeptide similar, if not identical, to the RG1 subunit. Hydropathy analysis of the deduced amino acid sequence reported here indicated a potential transmembrane region between residues 1063 and 1097 (Fig. 2), which could be responsible for anchoring the protein to the membrane. Thus, the same regulatory subunit might function to target the phosphatase to membranes and glycogen. Several lines of evidence argue against the existence of distinct muscle isoforms. The low molecular weight polypeptides observed in glycogen-associated phosphatase purified from rabbit skeletal muscle are not always detected by Western immunoblotting analysis (Fig. 7), and Northern hybridization, utilizing 3,850 bp of cDNA sequence, indicated one major mRNA species. The minor differences observed in the two groups of cDNA clones, four silent nucleotide substitutions and two changing the amino acid residues, could be explained by allelic variations. Southern analysis also suggested a single gene. Studies in progress, with mutant protein in which the hydrophobic region has been deleted, should prove useful in addressing the question of how the same polypeptide might be directed to different cellular compartments.
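A sliding-window hydropathy profile gives a rough picture of why the COOH-terminal region stands out. The sketch below is not the Rao and Argos method used in the paper. It substitutes the widely used Kyte-Doolittle scale, and it is applied to a 32-residue COOH-terminal fragment transcribed from the Fig. 2 sequence, so both the scale and the fragment should be treated as assumptions. The window size of 9 and the function name are likewise illustrative.

```python
# Kyte-Doolittle hydropathy values (a stand-in for the Rao-Argos scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumed COOH-terminal fragment (final 32 residues, ending at residue 1109).
C_TERMINAL = "HYDLMIGLAFYLFSLYWLYWEEGRQKESVKKK"


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of each full window along the sequence."""
    return [sum(KD[aa] for aa in sequence[i:i + window]) / window
            for i in range(len(sequence) - window + 1)]


profile = hydropathy_profile(C_TERMINAL)
peak = max(range(len(profile)), key=profile.__getitem__)

print(f"most hydrophobic window starts at fragment position {peak + 1}: "
      f"{C_TERMINAL[peak:peak + 9]} (mean {profile[peak]:.2f})")
print(f"last window: {C_TERMINAL[-9:]} (mean {profile[-1]:.2f})")
```

On this fragment the profile is strongly positive over the leucine- and phenylalanine-rich stretch and strongly negative over the charged tail. That pattern is qualitatively consistent with a hydrophobic segment ending near residue 1097 followed by a short hydrophilic COOH terminus.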
Analysis of the tissue distribution of the RG1 subunit mRNA (Fig. 4) supports previous observations by Western immunoblot (28), which indicated that the polypeptide is specifically expressed in skeletal and cardiac muscle, but not in other tissues examined. This also suggests that the polypeptide responsible for targeting the phosphatase to glycogen in liver is not homologous to the muscle form, although their molecular weights appear to be very similar and both interact with a highly conserved catalytic subunit (62).
The rabbit skeletal muscle glycogen-associated phosphatase undergoes in vivo and in vitro phosphorylation at several sites, all of which are located near the NH2 terminus. The cAMP-dependent protein kinase sites are at residues 48 and 67, respectively, and the glycogen synthase kinase-3 sites at positions 40 and 44. Other potential recognition sites for cAMP-dependent protein kinase are threonine 498 and 978 and serine 636. Since purification of intact RG1 subunit has been difficult and some of the site identifications have been carried out on proteolyzed species (50), it is possible that additional phosphorylation sites might have been missed. Threonine 978, in the motif -Arg-Arg-Val-Thr-, is an especially strong candidate. Phosphorylation of this residue, similarly to serine 48, could form a recognition site for glycogen synthase kinase-3 (31).
Search of the Swiss protein (release 13) and the EMBL (release 21) data bases with PCGENE utilizing the FASTN and FASTP programs (63) revealed no significant homology between the RG1 subunit and other known sequences. However, a search of the protein data base assembled by Dr. Mark Goebl, at Indiana University, indicated significant homology with the product of the yeast gene, GAC1, isolated by Dr. Kelly Tatchell at North Carolina State University. Over a segment of 144 residues, the identity is 27% and the homology 38% if conservative replacements are taken into account. Utilizing the algorithm of Lipman and Pearson (63), the optimal alignment score between the RG1 and GAC1 amino acid sequences is 13 standard deviations over the mean of the optimal scores of 100 random shufflings of the GAC1 sequence. It is of significance that the similarity lies within the 40-kDa NH2-terminal portion of the protein, which is able to interact with glycogen and with the catalytic subunit of type 1 phosphatase (61). The GAC1 protein appears to be involved in activation of glycogen synthase and glycogen accumulation. Both of these functions are consistent with the GAC1 gene product being the yeast homologue of the RG1 subunit. Gene replacement in yeast should allow us to address this question. In addition, site-specific and deletion mutagenesis will provide a powerful tool to elucidate the physiological role and regulation of the glycogen-associated protein phosphatase.
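The shuffling statistic quoted above (an alignment score expressed in standard deviations over the mean of scores against shuffled sequences) can be sketched as follows. This is not the Lipman and Pearson implementation. A plain Smith-Waterman local alignment with an arbitrary match/mismatch/gap scheme stands in for it, and the subject sequence is a made-up variant of the peptide quoted under "Western Blot Analysis", used only to exercise the code. All function names and scoring parameters are assumptions.

```python
import random
import statistics


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    previous = [0] * (len(b) + 1)
    best = 0
    for x in a:
        current = [0]
        for j, y in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if x == y else mismatch)
            current.append(max(0, diagonal, previous[j] + gap, current[j - 1] + gap))
        best = max(best, max(current))
        previous = current
    return best


def shuffle_z_score(query, subject, n_shuffles=100, seed=0):
    """Standard deviations by which the real score exceeds shuffled-subject scores."""
    rng = random.Random(seed)
    observed = local_alignment_score(query, subject)
    shuffled_scores = []
    for _ in range(n_shuffles):
        letters = list(subject)
        rng.shuffle(letters)  # preserves composition, destroys order
        shuffled_scores.append(local_alignment_score(query, "".join(letters)))
    mean = statistics.mean(shuffled_scores)
    spread = statistics.stdev(shuffled_scores)
    return (observed - mean) / spread


QUERY = "KPGFSPQPSRRGSESSEEVYV"      # peptide quoted in the text
SUBJECT = "MKPGFSAQPSRKGSESTEEVYVH"  # hypothetical related sequence

z = shuffle_z_score(QUERY, SUBJECT)
print(f"z-score over 100 shuffles: {z:.1f}")
```

Shuffling preserves amino acid composition while destroying order, so a large z-score indicates similarity beyond what composition alone would produce. A real comparison would use a substitution matrix rather than simple identity scoring.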