Anglerfish Islet Pre-proglucagon II NUCLEOTIDE AND CORRESPONDING AMINO ACID SEQUENCE OF THE cDNA*

Glucagon is a 29-amino acid peptide hormone that regulates blood glucose concentrations. It is a member of a family of structurally related hormones that includes, in addition to glucagon, vasoactive intestinal peptide, gastric inhibitory peptide, and secretin. Like other peptide hormones, glucagon is synthesized as a larger precursor. Previously, we reported the amino acid sequence of an anglerfish pre-proglucagon of apparent Mr = 14,500, derived from the sequence of cloned cDNAs. Now we have determined the nucleotide sequence of cDNAs encoding a separate anglerfish pre-proglucagon of apparent Mr = 12,500 and have derived the complete amino acid sequence of this second precursor. The configurations of the two pre-proglucagons are similar. Each pre-proglucagon is a polyprotein that contains two glucagon-related peptides arranged in tandem, a 29-amino acid glucagon sequence and a 34-amino acid sequence which shows homology to glucagon and the other members of the glucagon family. These two peptides are linked in the precursors by lysine-arginine and intervening penta- or tetrapeptides. The glucagon sequences of 29 amino acids in the two precursors are closely homologous. Similarly, the 34-amino acid peptide sequences in the two precursors are highly homologous. Analyses of the genomic DNA prepared from the spleen of a single anglerfish show that these two pre-proglucagons are encoded by at least two separate genes. Analyses of the mRNAs in the anglerfish islets indicate that a single mRNA species encodes the Mr = 14,500 pre-proglucagon but suggest that two separate mRNAs contain coding sequences for the Mr = 12,500 pre-proglucagon. These studies indicate that in the anglerfish, glucagon is synthesized by way of the expression of at least two genes.
Glucagon is a hormone of 29 amino acids (molecular weight, 3,500) synthesized and secreted by "A" cells of the pancreatic islets (1) and involved in the regulation of plasma glucose concentrations (2). The hormone is a member of a family of structurally related hormones which includes gastric inhibitory peptide, vasoactive intestinal peptide, secretin, and glicentin (3). Analyses of labeled, newly synthesized proteins in intact islets have shown that glucagon is synthesized as a large precursor (4-6). Recently, we (7, 8) and Shields et al. (9) reported that islets of the anglerfish (Lophius americanus) contain poly(A) RNAs which direct the synthesis in cell-free systems of at least two precursors of glucagon (pre-proglucagons) of apparent molecular weights of 14,500 and 12,500. Hybridizations with islet poly(A) RNAs and cloned cDNAs prepared from total islet poly(A) RNAs demonstrated that the two pre-proglucagons are encoded by separate mRNAs (8). We previously described the cloning and nucleotide sequences of cDNAs encoding the Mr = 14,500 pre-proglucagon (10). We have now sequenced cDNAs encoding the Mr = 12,500 pre-proglucagon and here compare the nucleotide sequence and derived amino acid sequence of the precursor to those of the Mr = 14,500 pre-proglucagon. We also find by genomic blotting that the two precursors are encoded by at least two separate genes.

EXPERIMENTAL PROCEDURES
Isolation of Recombinant Plasmids Containing cDNA Encoding Precursors of Anglerfish Islet Glucagons-The details of the preparation of a cDNA library using the vector plasmid pBR322, the host Escherichia coli (strain χ1776), and poly(A) RNA prepared from the islets of the anglerfish L. americanus have been described previously (11). Initial screening of this cDNA library by hybridization arrest and hybridization selection and cell-free translations provided two cloned recombinant plasmids containing cDNAs encoding the two pre-proglucagon precursors of Mr = 14,500 and 12,500 (8). These two recombinant plasmids, labeled by nick translation (12) with [32P]dCTP, were used to screen by hybridization (13) 1,800 bacterial clones containing recombinant plasmids prepared from the anglerfish poly(A) RNA. Thereby, we identified 31 and 11 bacterial colonies containing coding sequences for anglerfish islet pre-proglucagon I and II, respectively.
Determination of the Nucleotide Sequences of cDNAs Encoding the Two Pancreatic Islet Pre-proglucagons-Plasmids were isolated from several of the clones described above using the cleared lysate technique (14) followed by two successive centrifugations in cesium chloride/ethidium bromide gradients. Two recombinant plasmids encoding the smaller and three encoding the larger of the two separate pre-proglucagons were selected for nucleotide sequence analysis. Complete nucleotide sequences of both the sense (coding) and nonsense (noncoding) strands were determined by the chemical sequencing method of Maxam and Gilbert (15). We have previously reported the nucleotide sequence of the cDNA encoding the anglerfish islet pre-proglucagon I (10).
Analyses of Pre-proglucagon Coding Sequences in Genomic DNA-Aliquots of DNA prepared from the spleen of a single anglerfish (16) were digested individually to completion with the restriction endonucleases EcoRI, BamHI, and HindIII. Ten micrograms of the digested DNA were separated by electrophoresis on 0.8% agarose gels and transferred to nitrocellulose (17). Duplicate nitrocellulose filters prepared from the same gel were hybridized individually with recombinant plasmids containing cDNAs encoding anglerfish pre-proglucagon I and pre-proglucagon II that had been nick translated with [32P]dCTP (12). The conditions of hybridization were 42 °C for 18 h in the presence of 5 X SSC (1 X SSC is 0.15 M NaCl, 0.015 M Na citrate, pH 7.0), 50 mM sodium phosphate, pH 7.0, 50% formamide, and 10% dextran sulfate. Autoradiograms (72-h exposure) were prepared from the nitrocellulose filters following hybridization.
Analyses of mRNAs Encoding Pre-proglucagons by Filter Hybridization (Northern Blot Analysis)-Aliquots of the poly(A) RNA used for the preparation of the cDNA libraries were analyzed by electrophoresis on 1% agarose gels. The RNAs were transferred from the gels to Gene Screen (New England Nuclear) by capillary transfer (17). Duplicate blots of the RNA on Gene Screen were hybridized individually to 32P-labeled cDNAs encoding the pancreatic pre-proglucagons I and II. Hybridizations were for 16 h at 42 °C in the presence of 6 X SSC, 50 mM Tris, pH 7.5, 0.1% sodium dodecyl sulfate, 10 X Denhardt's reagent (1 X Denhardt's reagent is 0.02% Ficoll 400, 0.02% bovine serum albumin, 0.02% polyvinylpyrrolidone 40), 50% formamide, and 0.01% sonicated denatured salmon sperm DNA.
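The working concentrations implied by the 5 X and 6 X SSC and 10 X Denhardt's recipes above follow from simple scaling of the 1 X definitions. The Python sketch below works through that arithmetic; the 1 X values are those given in the text, while the dictionary and function names are illustrative assumptions and not part of the original methods.

```python
# 1 X recipes as defined in the text (names of variables are illustrative).
SSC_1X = {"NaCl (M)": 0.15, "Na citrate (M)": 0.015}
DENHARDT_1X = {
    "Ficoll 400 (%)": 0.02,
    "bovine serum albumin (%)": 0.02,
    "polyvinylpyrrolidone 40 (%)": 0.02,
}


def scale(recipe_1x, fold):
    """Return each component's concentration at `fold` times the 1 X recipe."""
    # Rounding only suppresses floating-point noise in the printed values.
    return {name: round(conc * fold, 6) for name, conc in recipe_1x.items()}


southern_ssc = scale(SSC_1X, 5)             # genomic (Southern) hybridization
northern_ssc = scale(SSC_1X, 6)             # RNA (Northern) hybridization
northern_denhardt = scale(DENHARDT_1X, 10)  # 10 X Denhardt's reagent

for label, buffer in [
    ("5 X SSC", southern_ssc),
    ("6 X SSC", northern_ssc),
    ("10 X Denhardt's", northern_denhardt),
]:
    print(label, buffer)
```

Under these definitions the genomic hybridization contains 0.75 M NaCl and 0.075 M Na citrate, and the RNA hybridization contains 0.9 M NaCl and 0.09 M Na citrate.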

RESULTS AND DISCUSSION
The sequences of the cDNAs encoding the Mr = 14,500 (AFG I¹) and Mr = 12,500 (AFG II) pre-proglucagons reveal coding sequences of 124 (AFG I) and 122 (AFG II) codons, beginning with ATG initiator methionine codons and flanked by 5' and 3' untranslated regions (Fig. 1).² The overall configuration of the two pre-proglucagons is similar. The glucagon sequences reside in the midregion of each precursor with peptide extensions at the NH2 and COOH termini. At the NH2 terminus of each precursor, following the initiator methionine, are sequences of predominantly hydrophobic amino acids characteristic of signal or leader sequences found in precursors of other secreted hormones and proteins (18, 19). Following the putative signal sequences in the two precursors (Fig. 1) are, in succession, an NH2-peptide or prosequence, a Lys-Arg, a 29-amino acid glucagon sequence,³ a Lys-Arg, a tetra- or pentapeptide, a Lys-Arg, and a 34-amino acid COOH-terminal peptide which is highly homologous to glucagon and to the other hormones of the glucagon family (Fig. 2). The paired basic residues flanking the peptides are characteristic of sites in prohormones that are cleaved in the post-translational formation of secreted hormones (20). Thus, post-translational processing of the proglucagons at these sites would result in the formation of four peptides from each precursor: an NH2-peptide, a 29-amino acid glucagon, a short intervening peptide, and a 34-amino acid glucagon-related COOH peptide. The 29-amino acid glucagon sequences in the precursors are highly homologous to each other (89%) and to mammalian glucagon (Fig. 2). Likewise, the 34-amino acid glucagon-related COOH peptides in the precursors are highly homologous to each other (74%) (Fig. 1). These homologies indicate evolutionary pressure to conserve these sequences and are consistent with a specific biologic role of the four glucagon-related peptides.
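The predicted processing at paired basic residues, and the percent homology between aligned peptides, can be illustrated with a minimal Python sketch. It assumes that cleavage occurs at every Lys-Arg pair and that both basic residues are removed from the products, and it scores homology as simple positional identity between equal-length sequences. The function names and the toy sequences are hypothetical; they are not the anglerfish sequences of Fig. 1.

```python
def cleave_at_lys_arg(precursor):
    """Split a one-letter amino acid sequence at every Lys-Arg (KR) pair.

    Assumption: both basic residues are removed from the products.
    """
    return [peptide for peptide in precursor.split("KR") if peptide]


def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical toy prohormone: four segments joined by three Lys-Arg pairs.
toy_precursor = "MLLAVSGDEPKRHSEGTFKRDLMPKRHADGTY"
peptides = cleave_at_lys_arg(toy_precursor)
print(peptides)

# Hypothetical pair of aligned hexapeptides differing at one position.
identity = percent_identity("HSEGTF", "HSQGTF")
print(round(identity, 1))
```

With three Lys-Arg pairs the toy precursor yields four peptides, mirroring the four products predicted for each proglucagon.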
Predictions of the biologic role of the glucagon-related COOH-terminal peptides are speculative at this time. However, a gastric inhibitory peptide-like immunoreactant has recently been demonstrated in rat pancreatic A cells in the same secretory granules as glucagon (21, 22). The anglerfish glucagon-related COOH peptides show considerable homology with mammalian gastric inhibitory peptide (Fig. 2) and may represent the counterpart of the mammalian gastric inhibitory peptide-like immunoreactant in the fish. On the other hand, the glucagon-related COOH peptide of the AFG II pre-proglucagon has the sequence Ala-Gly-Arg-Gly-Arg-Arg-Glu which could serve as a post-translational processing site leading to the formation of either a 28- or a 31-amino acid peptide ending in alanineamide or arginineamide, respectively (23). The NH2-peptides and short intervening peptides in each precursor show little homology (33 and 25%, respectively). This lack of homology may indicate their limited functional role except as "spacer" peptides required for correct post-translational cleavages of the highly conserved glucagon-related sequences from the precursors.
¹ The abbreviations used are: AFG I, anglerfish glucagon I; AFG II, anglerfish glucagon II.
² We previously designated molecular weights of 14,500 and 12,500 to the proteins AFG I and AFG II, respectively, based on their electrophoretic migration on sodium dodecyl sulfate-polyacrylamide gels (10). The molecular weights calculated from the amino acid sequences shown above are 16,360 (AFG I) and 16,300 (AFG II), demonstrating that the proteins migrate anomalously on sodium dodecyl sulfate-polyacrylamide gels.
³ Sequencing of three cDNAs corresponding to AFG I revealed a one-nucleotide point mutation in one of the cDNAs which changes a glutamic acid residue to a valine at residue 3 of glucagon (Fig. 1) (10). This difference may represent an allelic or nonallelic variant or a reverse transcriptase error during preparation of the cDNA library.
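As a rough illustration of how a molecular weight is calculated from an amino acid sequence, as in the footnote comparing sequence-derived and gel-derived values for AFG I and AFG II, the following Python sketch sums average residue masses and adds one water for the free termini. The residue masses are standard average values and the example sequence is mammalian glucagon; neither is taken from the present data, and the function name is an illustrative assumption.

```python
# Average residue masses in daltons (amino acid minus one water).
# Standard textbook values, not taken from this paper.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water added for the free NH2 and COOH termini


def molecular_weight(sequence):
    """Approximate molecular weight of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


# Mammalian glucagon (29 residues), used only as a familiar example.
MAMMALIAN_GLUCAGON = "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"

glucagon_mw = molecular_weight(MAMMALIAN_GLUCAGON)
print(len(MAMMALIAN_GLUCAGON), round(glucagon_mw))
```

The value obtained for the 29-residue hormone is close to the molecular weight of 3,500 cited for glucagon. The same summation applied to a full precursor sequence gives the calculated value that is compared with the apparent molecular weight from gel electrophoresis.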
The finding of two structurally related but different mRNAs indicates that the anglerfish possesses two different pre-proglucagon genes. However, the cDNA library was prepared from pooled anglerfish islets, which raised the possibility that the two mRNAs might represent two marked polymorphic variants among fish. To determine the genomic representation of the two cDNAs, we performed Southern blot hybridizations (17) with DNA extracted from a single anglerfish spleen. Duplicate Southern blots of restriction enzyme digests of the DNA were prepared and individually hybridized with labeled cDNAs encoding either AFG I or AFG II. AFG I hybridized to different restriction fragments of the DNA than did AFG II in each of the three digests of genomic DNA (Fig. 3). This finding demonstrates that the two pre-proglucagon-encoding mRNAs are transcribed from different genes in a single individual. Surprisingly, despite an overall 74% homology in the nucleotide sequences of the two cDNAs, we observed no cross-hybridization of the cDNAs with the genomic fragments under these stringent (42 °C) conditions of hybridization (Fig. 3). In preliminary studies, however, we observed weak cross-hybridization of the two cDNAs to the corresponding genomic restriction fragments under conditions of lower stringency of hybridization (37 °C). Although our findings demonstrate that there are at least two different glucagon genes in the anglerfish, they do not exclude the possibility that there are additional glucagon genes which do not hybridize to either the AFG I or AFG II cDNAs.
To determine the sizes and complexity of the messenger RNAs encoding the two pre-proglucagons, we analyzed anglerfish islet poly(A) RNA by electrophoresis on agarose gels, followed by transfer of the RNA to filters and hybridization with the two 32P-labeled glucagon cDNAs (Northern blot analysis) (Fig. 4). By this procedure, a single mRNA species of 650 bases was identified by hybridization with the probe containing the coding sequence for anglerfish glucagon I. Two separate bands representing messenger RNAs of 630 and 670 bases were detected using the probe encoding the anglerfish glucagon II. As was found with the genomic blotting, there was no significant cross-hybridization between the two glucagon-cDNA hybridization probes. Thus, three separate mRNAs encoding the pre-proglucagons appear to exist. The two separate mRNAs encoding the anglerfish pre-proglucagon II may arise from separate genes encoding this particular glucagon precursor, by utilization of differential splicing mechanisms during the maturation of a single pre-mRNA to the mature mRNAs, or by differences in lengths of poly(A) tracts at the 3' end of the mRNAs.
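The mRNA sizes can be compared with the lengths of the coding sequences by simple arithmetic, which indicates how many bases of each mRNA remain for the 5' and 3' untranslated regions and the poly(A) tract. The Python sketch below assumes that the codon counts of 124 (AFG I) and 122 (AFG II) exclude the termination codon and treats the sizes estimated by Northern blotting as exact; both are simplifying assumptions, and the names used are illustrative.

```python
# Values taken from the text; variable names are illustrative.
CODONS = {"AFG I": 124, "AFG II": 122}
MRNA_BASES = {"AFG I": [650], "AFG II": [630, 670]}


def noncoding_bases(mrna_length, codons):
    """Bases not accounted for by the coding sequence (3 bases per codon)."""
    return mrna_length - 3 * codons


remainder = {
    name: [noncoding_bases(size, CODONS[name]) for size in sizes]
    for name, sizes in MRNA_BASES.items()
}

for name, values in remainder.items():
    print(name, "coding:", 3 * CODONS[name], "bases; remainder:", values)
```

On these assumptions the coding sequences account for 372 and 366 bases, so the two AFG II mRNAs differ only in the roughly 260 to 300 bases outside the coding region, which is compatible with, but does not establish, a difference in untranslated or poly(A) length.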
Our observations indicate that fish islet pre-proglucagons are polyproteins (proteins containing more than one peptide in the same precursor) as are precursors to several other hormones (23-25). Synthesis of polyprotein hormonal precursors may allow coordinate synthesis of multiple bioactive hormones in the same cell. Alternatively, tissue-specific cleavages of polyproteins may produce different biologically active peptides from the same precursor, as demonstrated for the pro-opiomelanocortin precursor (25). Based on our data on the configuration of the fish glucagon precursors and immunologic evidence about the mammalian glucagon precursors and products, it is tempting to speculate on the potential biologic importance of the processing of mammalian glucagon precursors. Mammalian glucagon precursors in islets (5, 6, 26), intestine (27, 28), and brain (29) appear to show a similar configuration to the fish precursors with glucagon in the midregion and peptide extensions at the NH2 and COOH termini. Glucagon, an NH2-terminal fragment of the glucagon precursor, and a gastric inhibitory peptide-like immunoreactant have all been reported to exist in mammalian pancreatic A cells (21, 22, 26, 30). These observations indicate that the mammalian counterpart of the fish glucagon precursor may be processed completely in mammalian pancreatic A cells to release multiple peptides: glucagon, the COOH-terminal peptide, an NH2-peptide, and the short intervening peptide. In contrast, intestinal "L" cells are reported to contain a gastric inhibitory peptide-like immunoreactant (21, 22, 30) and a large glicentin-like molecule (27, 28), consisting of glucagon covalently linked to a long NH2-terminal extension and a short COOH-terminal extension. This situation suggests that incomplete processing of the mammalian counterpart of the fish precursors in intestinal cells may release only the COOH-terminal peptide and the remaining prohormonal fragment.
FIG. 2.
Bars indicate identity of residues in the anglerfish peptides and mammalian glucagon. Circles around residues indicate identity of amino acids in the anglerfish COOH-terminal glucagon-related peptides and mammalian gastric inhibitory peptide, and squares indicate different amino acids which are conservative changes from gastric inhibitory peptide (34).

FIG. 3.
Autoradiogram of Southern hybridization of anglerfish spleen genomic DNA with 32P-labeled glucagon cDNA-containing plasmids which contain the sequences AFG I and AFG II (Fig. 1). The DNA was subjected to complete digestion with restriction endonucleases BamHI (B), EcoRI (E), and HindIII (H) (see "Experimental Procedures"). Duplicate filters were hybridized with 32P-labeled glucagon cDNA probes, AFG I and AFG II. Left, AFG I, lanes B, E, and H represent fragments of genomic DNA in BamHI, EcoRI, and HindIII restriction digests, respectively, which hybridized to 32P-labeled AFG I cDNA. Right, AFG II, lanes B, E, and H represent fragments in the same restriction digests which hybridized to 32P-labeled AFG II cDNA. m (left and right) represents molecular weight markers from EcoRI digests of λ DNA (arrows 1-5 point to DNA fragments of 7.5, 5.9, 5.5, 4.8, and 3.4 kilobases, respectively). Exposure of the autoradiogram was for 72 h.
Little is known about the biosynthesis of other members of the glucagon family of hormones, gastric inhibitory peptide, vasoactive intestinal peptide, secretin, and glicentin. Our observations, however, about the nature of the glucagon precursors raise the possibility that other members of the glucagon family may arise from similar polyprotein precursors.