Structure of an antifreeze polypeptide and its precursor from the ocean pout, Macrozoarces americanus.

Serum antifreeze polypeptides (AFP) from Newfoundland ocean pout have been resolved by ion exchange chromatography and reverse phase high performance liquid chromatography into at least 12 components. The protein sequences of three of the AFP were determined using a combination of Edman degradation and cDNA sequencing. The AFP cDNA encodes a preprotein of 87 amino acids with no obvious prosequence. Two of the AFP (SP1-A and SP1-C) are separate gene products with minor amino acid sequence differences. The sequence of the SP1-C precursor is MKSVILTGLLFVLLCVDHMTASQSVVATQLIPINTALTPAMMEGKVTNPIGIPFAEMSQIVGKQVNTPVAKGQTLMPNMVKTYVAGK. The third AFP (SP1-B) is a post-translational modification product of SP1-C. These experiments indicate that the ocean pout AFP are a multigene family with a protein structure different from that of any other known polypeptide antifreeze.

AFP from these species have been isolated and characterized. Those from the flounder and sculpin are rich in alanine (60 mol %) and high in α-helical content (5, 11). The sea raven AFP, on the other hand, are rich in cystine and possess β-structure (6). A third type of peptide antifreeze, represented by ocean pout AFP, is rich neither in alanine nor cystine and, in fact, has no unusual bias in its amino acid composition. This AFP is a mixture of at least 12 components which, individually, have similar compositions to the whole, and molecular weights in the range of 6000-7000 (7). Measurements of circular dichroism indicate that the ocean pout AFP has a distinct tertiary structure but does not contain any significant amounts of either α-helix or β-structure (12). Comparative studies have indicated that AFP of diverse chemical structures have similar biological activity. More recently, the primary structures of the major components from the flounder and sculpin have been elucidated (13-16). In order to further understand the structure and function relationship in peptide antifreezes and their basis of microheterogeneity, we have determined the structure of ocean pout AFP and its biosynthetic precursor using a combination of protein sequencing by Edman degradation and DNA sequencing of the cDNA clones.
* This work was supported by grants from the Medical Research Council and National Science and Engineering Research Council, Canada (to C. L. H. and P. L. D.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The abbreviations used are: AFP, antifreeze polypeptides; HPLC, high performance liquid chromatography.

RESULTS
Microheterogeneity and Size of Ocean Pout AFP-From our earlier (7) and present investigations, it is clear that the ocean pout AFP can be fractionated into two separate groups based on their binding to ion exchange resins, namely, the QAE- and SP- (or CM-) binding groups. The QAE-binding group eluted as a single peak from QAE-Sephadex (Fig. 1a), but the SP-binding group (Fig. 1b) was split into four peaks (SP1-4). On reverse phase HPLC, the AFP were resolved into 12 components (Fig. 2a). By running each of the peaks shown in Fig. 1 separately on the reverse phase column, it was possible to identify their constituents. Peak SP1 contained HPLC components 4, 5, and 6 (Fig. 2b). Peak SP2 contained HPLC 1, 2, 3, and 11, peak SP3 contained HPLC 8, 9, and 10, peak SP4 corresponded to HPLC 7, and the single QAE peak corresponded to HPLC 12 (not shown). Our earlier procedure used QAE-, CM-, and SP-ion exchange chromatography and reverse phase HPLC to fractionate these ocean pout AFP (7).
The present investigation has simplified the protocol by eliminating the need for CM-Bio-Gel chromatography.
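The peak assignments above can be tabulated and checked for completeness. The following Python sketch is only an illustration: the assignments are those stated in the text, while the dictionary name and the helper function are hypothetical and not part of the original procedure.

```python
# Ion exchange peak -> reverse phase HPLC components, as reported in the text.
# The name PEAK_TO_HPLC and this layout are illustrative assumptions.
PEAK_TO_HPLC = {
    "SP1": [4, 5, 6],
    "SP2": [1, 2, 3, 11],
    "SP3": [8, 9, 10],
    "SP4": [7],
    "QAE": [12],
}


def all_components(mapping):
    """Return the sorted list of every HPLC component across all peaks."""
    return sorted(c for comps in mapping.values() for c in comps)


components = all_components(PEAK_TO_HPLC)
# Each of the 12 HPLC components should be assigned to exactly one peak.
assert components == list(range(1, 13))
print(len(components), "HPLC components in", len(PEAK_TO_HPLC), "ion exchange peaks")
```

The check confirms that the five ion exchange peaks account for all 12 HPLC components with no component assigned twice.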
Despite their chromatographic heterogeneity, the ocean pout AFP all have molecular weights in the range of 6,000-7,000 as determined by gel permeation HPLC in the presence of 0.1% trifluoroacetic acid and 45% acetonitrile (Fig. 3). This value, which is lower than that obtained by previous estimation on gel permeation HPLC in neutral buffers (Mr 10,000-16,000), agrees with earlier molecular weight estimates based on electrophoresis in sodium dodecyl sulfate, analytical ultracentrifugation, and amino acid analysis (7).
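The 6,000-7,000 range can be compared with a mass calculated from sequence. The sketch below is a rough check under stated assumptions rather than a measurement: it takes the SP1-C preprotein sequence given in the abstract, assumes that the mature polypeptide spans residues 23-86 (N-terminal Gln-23 cyclized to pyroglutamate, C-terminal Lys-87 absent), and sums standard average residue masses. The function and variable names are hypothetical.

```python
# Standard average residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015    # added once for the free N and C termini
AMMONIA = 17.031  # lost when N-terminal Gln cyclizes to pyroglutamate

# SP1-C preprotein (87 residues) as given in the abstract.
PREPROTEIN = (
    "MKSVILTGLLFVLLCVDHMTASQSVVAT"
    "QLIPINTALTPAMMEGKVTNPIGIPFAEMSQIVGKQVNTPVAKGQTLMPNMVKTYVAGK"
)


def average_mass(seq, n_terminal_pyroglu=False):
    """Average mass of a linear peptide, optionally with N-terminal pyroglutamate."""
    mass = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    if n_terminal_pyroglu:
        if seq[0] != "Q":
            raise ValueError("pyroglutamate assumed to arise from N-terminal Gln")
        mass -= AMMONIA
    return mass


# Assumption: mature SP1-C = residues 23-86 of the preprotein (1-based).
mature_sp1c = PREPROTEIN[22:86]
mw = average_mass(mature_sp1c, n_terminal_pyroglu=True)
print(f"{len(mature_sp1c)} residues, calculated average mass about {mw:.0f} Da")
```

Under these assumptions the calculated mass falls within the 6,000-7,000 range obtained by gel permeation HPLC in trifluoroacetic acid and acetonitrile.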
Characterization of Three AFP Components: the SP-1 Group-To analyze the structure of ocean pout AFP, the SP-1 group (Fig. 1) was selected for sequencing. The three components in this group, corresponding to HPLC 4, 5, and 6 in Fig. 2a, are referred to here as SP1-A, SP1-B, and SP1-C, respectively (Fig. 2b). Not only do they co-elute on gel permeation and ion exchange chromatography, but they also have comparable thermal hysteretic activities and similar amino acid compositions (Table 1). Their structural homology was confirmed by tryptic peptide mapping (Fig. 4). SP1-B and SP1-C, which are almost identical, have the majority of tryptic peptides in common with SP1-A. A similar result was obtained from mapping chymotryptic peptides (not shown). The amino acid compositions of the major tryptic peptides from SP1-A, SP1-B, and SP1-C (Fig. 4) are listed in Table 2. Minor compositional differences occur in peptides 1, 1a, and 1b and peptides 8 and 8a. Subsequent protein structure determination and cDNA sequencing confirmed these differences. Peptide 1 is the C-terminal peptide of SP1-A with the sequence Thr-Tyr-Ala-Ala. In component SP1-B, there is an Ala to Val substitution in the sequence to give peptide 1a (Thr-Tyr-Val-Ala), and in component SP1-C the equivalent peptide (1b) has an additional residue (Gly) at its C terminus. The presence of peptide 1a in the tryptic map of SP1-C (Fig. 4c) is due to contamination of the starting materials with SP1-B. The difference between peptide 8 in SP1-A and peptide 8a in SP1-B and SP1-C corresponds to the replacement of Ile by Leu in position 75 of the preprotein.
N-terminal analyses indicated that the SP-1 group was blocked. Automatic Edman degradation reactions were done on chymotryptic peptides of SP1-A and tryptic peptides of SP1-B (Fig. 5). The large scale tryptic digest of SP1-B in Fig. 5b cannot be compared directly to the tryptic map in Fig. 4b because it was done with trypsin which was not treated with L-1-tosylamido-2-phenylethyl chloromethyl ketone. The amino acid compositions of the peptides used for automatic Edman degradation are presented in Table 3 and their sequences in Table 4. Subsequent successful treatment of SP1-C with pyroglutamate aminopeptidase indicates that pyrrolidone carboxylic acid, which arises from the cyclization of terminal glutamine, is the N-terminal residue. Once it is removed, the peptide is readily accessible to Edman degradation (Table 4).
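The logic of the tryptic comparison can be illustrated with an idealized in silico digest. The sketch below is an assumption-laden simplification, not a reconstruction of Fig. 4 or Table 2: it cleaves after every Lys or Arg that is not followed by Pro, ignores partial and nonspecific cleavages, and takes mature SP1-C to be residues 23-86 of the preprotein sequence given in the abstract. The function name is hypothetical, and the peptides it returns are not numbered as in the experimental maps.

```python
# SP1-C preprotein (87 residues) as given in the abstract.
PREPROTEIN = (
    "MKSVILTGLLFVLLCVDHMTASQSVVAT"
    "QLIPINTALTPAMMEGKVTNPIGIPFAEMSQIVGKQVNTPVAKGQTLMPNMVKTYVAGK"
)


def tryptic_digest(seq):
    """Idealized trypsin digest: cleave after K or R unless the next residue is P."""
    peptides = []
    start = 0
    for i, aa in enumerate(seq[:-1]):  # never cleave after the final residue
        if aa in "KR" and seq[i + 1] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])
    return peptides


# Assumption: mature SP1-C = residues 23-86 (1-based) of the preprotein.
mature_sp1c = PREPROTEIN[22:86]
peptides = tryptic_digest(mature_sp1c)
for peptide in peptides:
    print(peptide)
```

In this idealized digest the C-terminal peptide is Thr-Tyr-Val-Ala-Gly, in agreement with the sequence reported for the C-terminal tryptic peptide of SP1-C, and the Leu at preprotein position 75 lies in the peptide immediately preceding it.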
cDNA Cloning and DNA Sequencing-Poly(A)+ RNA from the 9-10 S region of a sucrose density gradient was used for cDNA cloning as described under "Experimental Procedures" (Fig. 6). After screening by colony hybridization, three clones (36, 69, and 77) were selected for DNA sequence determination. Clone 69, which has an insert size of 0.3 kilobase, was sequenced by the chemical procedure of Maxam and Gilbert (Fig. 7). The two larger clones, 36 and 77, were sequenced using the M13-dideoxy chain terminator procedure (Fig. 8). The peptide sequences in Table 4 were used to establish the correct orientation and reading frame for the DNA sequences. The insert in clone 77 is 491 base pairs long. It codes for an 87-residue preprotein, the C-terminal portion of which matches in composition (Table 1) and sequence (Table 4) [...] changes occurs in the coding region. This change, at base 145, is a silent one. Neither sequence contains the polyadenylation signal AATAAA at the 3'-end, perhaps as a result of S1 nuclease digestion during cDNA cloning.

The insert in clone 69 is severely truncated at the 5'-end and lacks the DNA coding for the signal sequence plus the N-terminal portion of the mature AFP. However, from a composite of the remaining DNA sequence, the peptide sequences in Table 4, and amino acid compositional data (Tables 1 and 3), this clone can be matched to component SP1-A. Specifically, it contains the Leu to Ile replacement at position 75 and the Val to Ala replacement in the C-terminal tryptic peptide, but no other changes.

DISCUSSION
The primary structures of three of the 12 ocean pout AFP components have been derived here. These studies confirm our earlier suggestion that ocean pout AFP represent a new and distinct class of macromolecular antifreezes. Furthermore, a computer search of a protein data bank did not indicate any sequence homology with other known proteins.
In deducing the structure of these AFP components, we have provided some insight into the molecular basis of their microheterogeneity. First of all, both protein and DNA sequence analyses have established that SP1-A and SP1-C are products of separate genes. These two polypeptides differ in at least two positions in their amino acid sequences (positions 75 and 84). Second, SP1-B and SP1-C probably differ as a result of post-translational modification. All three SP1 components lack the C-terminal lysine which is encoded in the DNA but, in addition, both SP1-A and SP1-B, but not SP1-C, are missing the penultimate C-terminal residue, glycine. Peptide SP1-B is likely derived from SP1-C by post-translational processing. Thermal hysteresis measurements indicate that the loss of the C-terminal glycine has no obvious effect on antifreeze activity. Similar findings have been observed with AFP isolated from the sera of winter flounder (4). Last, although both clone 36 and clone 77 code for SP1-C, they are distinct sequences with at least four nucleotide differences. Thus, the same gene product might be derived from two or more genes. This suggestion is supported by preliminary results from genomic Southern blots of ocean pout DNA which indicate that the number of AFP genes is far in excess of 12, which is the number of separable protein components (unpublished results).
Fig. 9. Predicted secondary structures of the ocean pout AFP precursor (a) and the winter flounder AFP precursor (14) (b). Residues are denoted as α-helix, β-sheet, or unstructured. A change in the direction of the chain indicates a β-turn. Residues are numbered from the amino termini. Predicted cleavage points for the removal of the signal peptides are shown by the arrows. The triangle in b denotes the junction between the pro segment and the mature AFP.
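The relationships among the three SP1 components can be summarized by deriving each one from the SP1-C preprotein sequence given in the abstract. The following Python sketch rests on several assumptions that should be read as such: the mature N terminus is taken to be Gln-23, SP1-A is assumed to differ from SP1-C only at positions 75 and 84 (the text states "at least two positions"), and all helper names are hypothetical.

```python
# SP1-C preprotein (87 residues) as given in the abstract.
PREPROTEIN_SP1C = (
    "MKSVILTGLLFVLLCVDHMTASQSVVAT"
    "QLIPINTALTPAMMEGKVTNPIGIPFAEMSQIVGKQVNTPVAKGQTLMPNMVKTYVAGK"
)


def substitute(seq, position, expected, replacement):
    """Replace the residue at a 1-based position after checking what is there."""
    idx = position - 1
    if seq[idx] != expected:
        raise ValueError(f"expected {expected} at {position}, found {seq[idx]}")
    return seq[:idx] + replacement + seq[idx + 1:]


def differences(seq_a, seq_b, offset):
    """List (preprotein position, residue_a, residue_b) where two aligned sequences differ."""
    return [(i + offset, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# SP1-C: residues 23-86 (Lys-87 absent).
sp1c = PREPROTEIN_SP1C[22:86]
# SP1-B: as SP1-C, but also lacking the C-terminal Gly-86.
sp1b = PREPROTEIN_SP1C[22:85]
# SP1-A: separate gene product; assumed here to carry only Leu-75 -> Ile and
# Val-84 -> Ala relative to SP1-C, and to lack Gly-86 and Lys-87.
prepro_a = substitute(substitute(PREPROTEIN_SP1C, 75, "L", "I"), 84, "V", "A")
sp1a = prepro_a[22:85]

for name, seq in (("SP1-A", sp1a), ("SP1-B", sp1b), ("SP1-C", sp1c)):
    print(name, len(seq), "residues, C terminus ...", seq[-5:])
print("SP1-A vs SP1-B:", differences(sp1a, sp1b, offset=23))
```

With these assumptions the C termini are Thr-Tyr-Ala-Ala for SP1-A, Thr-Tyr-Val-Ala for SP1-B, and Thr-Tyr-Val-Ala-Gly for SP1-C, matching the C-terminal tryptic peptides described under "Results".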
The ocean pout AFP precursor has a typical signal sequence at its N terminus, but the precise point of cleavage has not yet been established. According to the observations of Perlman and Halvorson (24), processing is likely to occur after the alanine at position 21. However, amino acid analyses on the whole components (Table 1) and N-terminal peptides (such as 13 in Table 2) as well as the amino acid sequence of the deblocked SP1-C suggest that Ser-22 is also missing. The serine residue might be removed by an aminopeptidase following the cleavage of the Ala-Ser bond. Alternatively, the Ser-Gln bond represents the actual cleavage site for the signal peptidase. The glutaminyl residue, once it becomes the new N terminus, cyclizes to form pyrrolidone carboxylic acid. There is no evidence for a "pro" sequence analogous to that found in the winter flounder AFP precursor (14, 15). This was confirmed by cell-free translation of ocean pout AFP mRNA in the presence of microsomal membranes (unpublished results).
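The two cleavage possibilities discussed above can be made concrete by splitting the preprotein sequence from the abstract at each candidate site. This is only an illustrative sketch: it does not predict the cleavage site, and the function name is hypothetical.

```python
# SP1-C preprotein (87 residues) as given in the abstract.
PREPROTEIN = (
    "MKSVILTGLLFVLLCVDHMTASQSVVAT"
    "QLIPINTALTPAMMEGKVTNPIGIPFAEMSQIVGKQVNTPVAKGQTLMPNMVKTYVAGK"
)


def cleave(seq, after_position):
    """Split a sequence after a 1-based position into (leader, remainder)."""
    return seq[:after_position], seq[after_position:]


# Candidate sites from the text: after Ala-21 (Ala-Ser bond) or Ser-22 (Ser-Gln bond).
for site in (21, 22):
    leader, remainder = cleave(PREPROTEIN, site)
    print(
        f"cleavage after {leader[-1]}{site}: "
        f"new N terminus {remainder[0]}{site + 1}, {len(remainder)} residues remain"
    )
```

Cleavage after position 21 leaves Ser-22 at the N terminus, so a further trimming step would be needed to expose Gln-23, whereas cleavage after position 22 exposes Gln-23 directly for cyclization.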
The secondary structure of the preprotein precursor to SP1-C based on the predictive method of Chou and Fasman (23) is shown in Fig. 9 and compared to that derived for the winter flounder AFP precursor. Whereas the flounder AFP is predominantly α-helix and sea raven AFP has β-structure (6, 11), there is no extensive stretch of either of these structures in the ocean pout AFP. This observation has been confirmed by direct circular dichroism studies (12). A mechanism has been proposed to explain the antifreeze activity of flounder AFP based on its repeating structure (10). The absence of such a structure in ocean pout AFP and its dissimilarity to sea raven AFP and to the glycoprotein antifreezes have made it difficult to generalize about the mode of action of these AFP. However, with the structural information contained here it will now be feasible to examine the structure and function of these AFP critically using techniques such as chemical modification and peptide synthesis.
G. Scott, C.-L. Hew, and P. L. Davies, unpublished results. N. Ng and C. L. Hew, unpublished results.
The isolation of cDNA clones for ocean pout AFP has given us access to their genes and the opportunity to study both the organization of this multigene family and its regulation.