Antifreeze Protein Genes of the Winter Flounder*


A genomic library of the winter flounder in λ phage Charon 4 was screened with antifreeze protein cDNA. Nine genomic clones were isolated from three haploid genome equivalents of phage and were grouped into three classes on the basis of restriction mapping and heteroduplex analysis. The three chromosomal regions are non-overlapping and each contains two antifreeze protein genes spaced from 3 to 7 kilobase pairs apart. Genomic Southern blots indicate that these six nonallelic genes represent only 10-20% of the complement of this large multigene family. One gene has been identified by sequencing as a variant of the gene encoding the most abundant antifreeze protein. This gene is 1.0 kilobase pair long and contains an intervening sequence of 0.6 kilobase pairs between the coding region for the bulk of the signal sequence and that coding for the proprotein.
The winter flounder Pseudopleuronectes americanus produces a set of serum antifreeze proteins (AFP) on a seasonal basis (1-3). Two proteins, components A and B in a 60:40 molar ratio, comprise the bulk of the antifreeze activity (4). Both are rich in alanine (>60 mol %) and their compositions differ in only 1 amino acid. Minor quantities of other alanine-rich AFP can be resolved from A and B by high performance liquid chromatography on reverse phase columns (5). On the basis of their amino acid compositions most of these minor components appear to be genetically distinct, although one of them is probably a cleavage product of component A.
Component A is initially synthesized in the liver as a preproprotein 82 residues long. After removal of the signal polypeptide, the proprotein can be detected in the circulation (6). The pro section is removed within 24-48 h of appearance there (6) and the C-terminal glycine residue is also cleaved off from the majority of molecules to produce a 37-residue protein (4). This protein contains three of the tandemly repeated 11-amino acid units previously identified in flounder AFP (7,8), and which are postulated to be the basis of their antifreeze activity (7).
Seasonal fluctuations in the levels of AFP in serum (3) and of AFP mRNA in liver (9) have been measured. In mid-winter when serum levels of the protein reach 10 mg/ml, 0.5-1.0% of the total liver RNA is AFP mRNA (9, 10). In late summer this level has declined 700-fold but the mRNA is still detectable by hybridization and cell-free translation (9). These seasonal changes in flounder AFP levels are known to be affected by both environmental factors and the pituitary gland (10-15). As part of our study of this system and its regulation, we report here the isolation and mapping of six non-allelic AFP genes, along with the sequence of one of them, which proved to be a variant of the gene for the A component.

EXPERIMENTAL PROCEDURES
Materials-Winter flounder testes were collected at the Marine Sciences Research Laboratory, St. John's, Newfoundland, Canada.
Restriction endonucleases, T4 DNA polymerase, and T4 polynucleotide kinase were purchased from Bethesda Research Laboratories.
The Klenow fragment of DNA polymerase I and calf intestinal phosphatase were obtained from Boehringer Mannheim, radioisotopes and terminal transferase from New England Nuclear, dideoxynucleoside triphosphates from P-L Biochemicals, and avian myeloblastosis virus reverse transcriptase from Life Sciences Inc. The oligodeoxyribonucleotide 5'-CTTCAGTGATTC-3' was synthesized by BIO LOGICALS Inc.
Construction of the Flounder Genomic Library-Genomic DNA was prepared from frozen flounder testis by the method of Blin and Stafford (16). It was digested completely and under five partial digestion conditions with EcoRI. Digested DNA (300 µg) was pooled and centrifuged on a 10-40% linear sucrose gradient at 27,000 rpm for 18 h at 20 °C in an SW 28.1 rotor (17). Fractions (1 ml) were analyzed on a 0.75% agarose gel and those containing 15-20-kb fragments were pooled and precipitated. Annealed λ Charon 4 arms were prepared as described by Maniatis et al. (17). The 15-20-kb flounder DNA was ligated to the Charon 4 arms and packaged by a modification of the method of Sternberg et al. (18). The individual particles (1.1 × 10^6) were amplified on Escherichia coli K802 on plates and the final library phage suspension was stored over chloroform at 4 °C.
Selection of Recombinant Phage-The flounder genomic library was screened by the method of Benton and Davis (19). Each 150-mm plate contained 10^4 recombinant phage. Duplicate imprints on nitrocellulose filters were probed with 32P-labeled AFP cDNA in the presence of sheared, denatured calf thymus DNA and poly(A). The cDNA was synthesized from purified AFP mRNA (20) and labeled by incorporation of [α-32P]dCTP of specific activity >600 Ci/mmol. Phage which showed positive hybridization signals were plaque-purified and rescreened using an AFP cDNA clone (4) as the hybridization probe. The cDNA clone was labeled to a specific activity of 5 × 10' dpm/µg by nick translation (21) with [α-32P]dCTP and 15 µM dATP, dGTP, and dTTP. The individual λ phage were purified by precipitation with polyethylene glycol (22) followed by banding in cesium chloride gradients (23).
Electron Microscopy-Heteroduplex mapping was done by the technique of Davis et al. (24) as modified by Kidd and Glover (25). The DNA was stained with uranyl acetate and rotary shadowed. Samples were photographed with a Forgeflow EM4 electron microscope. Markers used were the relaxed circular form of plasmid pAT153 as a double-stranded length (3.6 kb) and a snapback structure containing a single-stranded loop of 1800 nucleotides at one end of a double-stranded stem of 1.14 kb (26).
Southern Blotting-Flounder genomic DNA and the DNAs from genomic clones were exhaustively digested with a variety of restriction enzymes, electrophoresed on a 0.5% agarose gel, and blotted onto nitrocellulose according to the method of Southern (27). Hybridization was performed at 68 °C in 6 × SSC containing 0.5% sodium dodecyl sulfate, 5 × Denhardt's solution, 100 µg/ml of sheared calf thymus DNA, and a nick-translated AFP cDNA clone with specific activity of 3 × 10' dpm/µg. The entire hybridization solution was boiled for 10 min before addition to the blot. The final wash of the blots was done in 0.2 × SSC containing 0.5% sodium dodecyl sulfate at 68 °C for 2 h.
DNA Sequence Analysis-PstI fragments from subclone E3 were ligated into the PstI site of the M13 vector mp9 (28) which was then transfected into E. coli JM103. Single-stranded DNA was prepared from cultures of recombinant phage by phenol extraction and sequenced by the dideoxy chain termination method (29) using a synthetic primer (30).
These sequences were confirmed and extended by the method of Maxam and Gilbert (31). Double-stranded DNA fragments were 5' end labeled using either the Klenow fragment of DNA polymerase I and various [α-32P]dNTPs or T4 polynucleotide kinase and [γ-32P]ATP after prior removal of 5' phosphate groups by calf intestinal phosphatase. PstI and HgiAI fragments were 3' end labeled using terminal transferase and [α-32P]cordycepin triphosphate.
Primer Extension Experiments-A 5-fold molar excess of the primer 5'-CTTCAGTGATTC-3' was allowed to anneal with AFP mRNA by boiling for 2 min in 20 mM Tris-HCl (pH 7.9) containing 0.1 M NaCl and 0.1 mM EDTA followed by cooling to 0 °C over a period of 10 min. The primer was extended by reverse transcriptase under conditions for cDNA synthesis (20). When this short cDNA was used as a hybridization probe to detect restriction fragments containing 5'-exon sequences, it was labeled during synthesis by the incorporation of [α-32P]dCTP with specific activity of >600 Ci/mmol. To sequence the short cDNA, the primer was first 5' end labeled using T4 polynucleotide kinase and [γ-32P]ATP with specific activity of 3000 Ci/mmol. The end labeled cDNA was purified by electrophoresis on an 8% polyacrylamide gel in 8 M urea.

RESULTS
Isolation and Classification of Recombinant λ Phage Containing AFP Gene Sequences-Approximately 6 × 10^5 recombinant phage from the Charon 4 flounder genomic library were screened using cDNA to purified AFP mRNA as a hybridization probe. Based on a haploid genome size of 3 × 10^9 bp and an average insert size of 1.6 × 10^4 bp, this number amounts to three haploid genome equivalents. Those phage which gave strong signals in this assay were probed again with cDNA during plaque purification. Ten phage continued to give a positive response in this second round of screening and each of these hybridized strongly to a nick-translated AFP cDNA clone (4). One of the 10 recombinant phage (7-1) did not amplify well and was set aside. DNA from each of the remaining phage was digested with restriction endonucleases A d, BamHI, BglII, EcoRI, HindIII, KpnI, SalI, SstI, and SstII. From a comparison of the digests, restriction maps, and the patterns of hybridization on Southern blots the nine DNAs were subdivided into three homology groups (classes I, II, and III in Table I). Within class I, λ4-3, λ33-1, and λ37-1 were identical and are represented in Fig. 1 by λ33-1. In class II, λ15-1 and λ44-1 were the same, and in class III, λ37-3 and λ42-1 were the same. These two classes are represented in Fig. 1 by λ15-1 and λ37-3, respectively. Each of these genomic clones contained two regions, illustrated by the thick black lines in Fig. 1, which hybridized strongly to cloned AFP cDNA. The disposition of these putative AFP genes and the location of restriction sites within the inserts are quite different for the three classes.
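The coverage quoted above follows from simple arithmetic on the three numbers given in the text. A minimal sketch in Python (the function name is illustrative and not taken from the original work):

```python
def genome_equivalents(n_phage: float, insert_bp: float, genome_bp: float) -> float:
    """Total cloned DNA screened, expressed in multiples of the haploid genome."""
    return n_phage * insert_bp / genome_bp

# Values as stated in the text: phage screened, average insert size,
# and haploid genome size of the winter flounder.
coverage = genome_equivalents(n_phage=6e5, insert_bp=1.6e4, genome_bp=3e9)
print(f"{coverage:.1f} haploid genome equivalents")
```

The result is slightly above three, in line with the rounded figure of three haploid genome equivalents given in the text.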
Classification of λ29-7 and λ2-1-The remaining two genomic clones, λ29-7 and λ2-1, are variants of class I and class II types, respectively. λ29-7 shares a 4.6-kb EcoRI fragment with the class I clones. This is the central EcoRI fragment in λ33-1 and the left-hand one in λ29-7 (Fig. 1). The nature and extent of this homology is indicated by heteroduplex analysis in Fig. 2A. In this electron micrograph a loop of single-stranded DNA, representing the left-hand EcoRI fragment of the λ33-1 insert, lies adjacent to the long arm of Charon 4. There follows a 4.6-kb region of homology corresponding to the common EcoRI fragment, and then a region of nonhomology adjacent to the short arm of Charon 4. The short side of the mismatch is the right-hand EcoRI fragment of the λ33-1 insert and the long side the central and right-hand EcoRI fragments of the λ29-7 insert. In order to confirm that the homology did not extend beyond the common EcoRI fragment, the right-hand EcoRI fragment of the λ33-1 insert was labeled by T4 DNA polymerase and used to probe EcoRI digests of λ29-7 and λ33-1 on a Southern blot. No hybridization was observed to the central and right-hand EcoRI fragments of the λ29-7 insert. The most feasible explanation for the homology between λ29-7 and the other members of class I is a ligation artifact occurring during construction of the library. Either the right-hand EcoRI fragment in λ33-1 or the left-hand one in λ29-7 may have become adventitiously linked to the other portions of those inserts.
The homology between λ2-1 and λ15-1 is far more extensive (Fig. 1). The only obvious difference is a 1.6-kb stretch of DNA which is present towards the right-hand end of the λ2-1 insert but is absent from the equivalent EcoRI fragment in the λ15-1 insert. The site of insertion/deletion was located by heteroduplex analysis. Two examples of λ2-1 annealed to λ15-1 are present side by side in Fig. 2B. The larger loop of single-stranded DNA lies adjacent to the short arm of Charon 4 and corresponds to the right-hand EcoRI fragment of the λ15-1 insert. On the other side of this loop is a 1.8-kb region of homology followed by the smaller loop of single-stranded DNA. The latter represents the 1.6-kb stretch of DNA unique to λ2-1. The remainder of the inserts are homologous right up to the junction with the long arm of Charon 4.

FIG. 2. Heteroduplexes between λ phage recombinants in classes I and II. A, the representative structure obtained on annealing λ29-7 with λ33-1 shown as a sketch above the electron micrograph. B, two examples of the structure obtained on annealing λ2-1 with λ15-1. The example on the right-hand side is shown as a sketch. The distinction between single- and double-stranded regions is emphasized in the sketches by thin and thick lines, respectively. The junctions between inserts and Charon 4 arms are marked by arrows except for the junction between the long arm and λ2-1/λ15-1, which occurs in a region where the DNA is folded over itself.

Genomic Southern Blots-A genomic Southern blot was performed to establish the authenticity of the genomic clones (Fig. 3). Representatives of the three classes of genomic clones (λs 33-1, 37-3, and 44-1) were digested with EcoRI (lanes 4 and 5), electrophoresed alongside an EcoRI digest of flounder DNA (lane 3), and probed with a 32P-labeled AFP cDNA clone. The 12-kb insert in λ37-3 (a) corresponds to an EcoRI fragment in the genomic digest, as does the 9-kb EcoRI fragment of λ44-1 (b). Immediately below the latter is a genomic EcoRI fragment which hybridizes with the same intensity as the 9-kb fragment but does not have its complement in any of the genomic clones isolated. Conversely, the 10.6-kb insert of λ2-1, which migrates between a and b (not shown), does not have a matching band in the genomic digest. Fragments c and d (4.6 and 3.8 kb, respectively) both come from λ33-1. The counterpart to fragment c can be seen in lane 3 but the genomic fragment matching d is only visible on a much longer exposure of the blot or after less stringent washing.
Perhaps the most striking result of the genomic blot was that the bulk of the hybridization occurred to EcoRI fragments which were too long (>30 kb) to have been cloned into the Charon 4 library. The majority of these genes were contained in 8-9-kb BamHI fragments (lane 2) and 2.5-3.5-kb SstI fragments (lane 1). The Charon 4 library was relatively successful in representing those genomic EcoRI fragments which could be cloned.
Subcloning the AFP Gene Regions into pBR322-The 3.8- and 4.6-kb EcoRI fragments from λ33-1 were ligated into the EcoRI site of pBR322 to give subclones 11-3 and 5a, respectively. The 5.5-kb BamHI-EcoRI fragment that contains both the AFP genes of λ2-1 was inserted between these sites in subclone 21a. The same cloning strategy placed the left-hand portion of the λ37-3 insert into subclone E3 and the right-hand portion into subclone F2.
Additional restriction sites were mapped on these five subclones to help define the locations of the gene regions (Fig. 4). In each subclone PstI cut within or close by these regions. In E3 there are four PstI sites within a length of 400 bp and the order of the three fragments was only deduced from DNA sequencing. Subclone 5a contains 500- and 700-bp EcoRI fragments in addition to the 4.6-kb fragment. These smaller fragments are probably derived from the flounder DNA insert of λ33-1 used in subcloning, but their locations in the phage DNA have not been identified.
Recognition and Sequencing of the Component A-type AFP Gene-The sequence of the most abundant AFP in winter flounder (AFP component A) was derived previously from the DNA sequence of a cDNA clone (4). This cDNA sequence had two PstI sites 69 bp apart within the sequence corresponding to the mature protein. In an effort to identify a genomic sequence for AFP component A, the five subclones (5a, 11-3, 21a, E3, and F2) were screened on a 10% polyacrylamide gel for the presence of a 69-bp PstI fragment. Only subclone E3 had a fragment this size (not shown). The other two small PstI fragments that flank it were later found by DNA sequencing to be 104 and 213 bp long, respectively (Fig. 4).
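The screen for a 69-bp PstI fragment can be mimicked in silico once a sequence is available. The sketch below is an illustration only: the function name and the toy sequence are hypothetical, and the toy sequence is filler DNA constructed so that its PstI sites are spaced 104, 69, and 213 bp apart, as reported for E3. It is not the E3 sequence.

```python
import re

PSTI_SITE = "CTGCAG"   # PstI recognition sequence
CUT_OFFSET = 5         # PstI cuts the top strand after the A: CTGCA^G

def psti_fragment_lengths(seq: str) -> list[int]:
    """Fragment lengths (top strand) from a complete PstI digest of linear DNA."""
    seq = seq.upper()
    # A lookahead is used so that every site start is reported.
    cuts = [m.start() + CUT_OFFSET for m in re.finditer(f"(?={PSTI_SITE})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Hypothetical filler sequence with four PstI sites spaced 104, 69, and 213 bp apart.
toy = ("T" * 50 + PSTI_SITE + "A" * 98 + PSTI_SITE + "A" * 63
       + PSTI_SITE + "A" * 207 + PSTI_SITE + "T" * 40)

fragments = psti_fragment_lengths(toy)
print(fragments)
print("69-bp fragment present:", 69 in fragments)
```

On a real subclone the same function would be applied to the insert sequence, and a fragment of the expected size would mark a candidate for the component A gene, as it did for E3 on the gel.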
These small PstI fragments, all of which hybridized to AFP cDNA, were cloned into the PstI site of the M13 vector mp9. Of 32 recombinant M13 DNAs examined, 13 contained the 213-bp fragment, 11 the 104-bp fragment, and 8 the 69-bp fragment. The latter fragment was only found in one orientation, whereas both orientations were obtained for the two larger fragments. Representative clones were sequenced by the dideoxy method. A 950-bp HaeIII fragment which contained the three PstI fragments was isolated from subclone E3 (Fig. 5B). The DNA sequence of the PstI fragments was confirmed and extended by Maxam and Gilbert sequencing (Fig. 6) according to the strategy shown in Fig. 5B. This sequence spans the 3'-exon of the AFP gene from the intervening sequence into the DNA flanking the 3' end of the gene. The PstI site closest to the left-hand end of the HaeIII fragment marks the junction between the 5' end of the 3'-exon and the intervening sequence.

FIG. 3. Southern blot analysis of flounder genomic DNA. Lanes 1-3 each contained 15 µg of DNA from an individual flounder digested to completion with SstI, BamHI, and EcoRI, respectively. Lane 4 was loaded with a mixture of EcoRI-digested DNA from λ phages 33-1, 37-3, and 44-1. The amount of each phage DNA (0.66 ng) in this lane was adjusted to be twice the frequency of a single copy gene within the flounder genome. Lane 5 contained twice the amount of DNA loaded in lane 4. The gel (0.5% agarose) was electrophoresed at 18 V for 20 h and stained with ethidium bromide. A Southern blot of the gel was probed with nick-translated plasmid containing full length AFP cDNA as described under "Experimental Procedures." The blot was exposed to Kodak XAR-5 film for 40 h at -60 °C in the presence of an intensifying screen. Band a, the 12-kb insert in λ37-3; band b, the 9-kb EcoRI fragment from λ44-1; bands c and d, the 4.6- and 3.8-kb EcoRI fragments from λ33-1.

FIG. 6. DNA sequence of the AFP gene in subclone E3 (sequence listing not reproduced). The sequences, obtained by the strategies shown in Fig. 5, are numbered as a continuum with the first transcribed nucleotide represented as +1. Intervening sequence is written in lower case letters and is not numbered. The TATA box sequence (-30 to -24), the CAAT box sequence (-81 to -78), the termination codon (296 to 298), and the putative polyadenylation signal (365 to 370) are underlined, and the PstI sites are boxed. The four base changes seen in a cloned cDNA sequence (4) are displayed beneath the E3 gene sequence while the predicted sequence of the gene product is shown above in italics. The asterisk beneath residue 322 indicates a correction to the cDNA sequence at this point.
To locate the 5'-exon of the gene, a specific hybridization probe was constructed. The oligodeoxyribonucleotide 5'-CTTCAGTGATTC-3', which is complementary to nucleotides 105-116 (Fig. 6), was synthesized to prime on AFP mRNA. When this oligonucleotide was 5' end labeled with 32P and extended by reverse transcription, 95% of the elongated sequences were in a discrete product approximately 120 nucleotides in length (not shown). This cDNA proved to have a unique sequence (Fig. 6) which was read to within a few nucleotides of the cap site (nucleotide 1). The sequence confirmed that the oligonucleotide primed cDNA synthesis from the correct location on AFP mRNA and that the cDNA could, therefore, be used as a hybridization probe for the 5' end of the gene. With the aid of this probe the 5'-exon was located in an 800-bp HaeIII fragment (Fig. 5A) which can be oriented to the E3 subclone map in Fig. 4 by reference to the SstI, BglI, and AvaI sites. The 5'-exon, which is approximately 600 bp upstream of the 3'-exon, was sequenced by the Maxam and Gilbert method according to the strategy shown in Fig. 5A. This AFP gene is, therefore, 1 kb in length and encodes from nucleotide 50 to 295 an 82-residue antifreeze preproprotein precursor, which differs by 1 amino acid substitution (Ala → Asp) from the sequence described previously (4). The 600-bp intervening sequence interrupts the gene close to the junction between the DNA coding for the pro section and that coding for the signal peptide. The first two bases of the arginine codon (AGA) towards the end of the signal sequence (nucleotides 104 and 105) represent the terminal redundancy in the consensus splice junction sequences where AG is present at the 3' end of the 5'-exon, and the 3' end of the intervening sequence.
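The primer-extension reasoning can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper names are illustrative, and the only inputs are the primer sequence and the coordinates quoted in the text (primer complementary to nucleotides 105-116, cap site at nucleotide 1, AGA codon beginning at nucleotide 104).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

PRIMER = "CTTCAGTGATTC"               # 5'-CTTCAGTGATTC-3', from the text
PRIMER_START, PRIMER_END = 105, 116   # nucleotides the primer pairs with

# Sense-strand sequence (written as DNA) that the primer anneals to.
target = reverse_complement(PRIMER)
print("nucleotides 105-116:", target)

# The primer covers exactly the stated interval.
assert len(PRIMER) == PRIMER_END - PRIMER_START + 1

# The 5' end of the primer pairs with nucleotide 116, so a product extended
# all the way to the cap site (nucleotide 1) would be 116 nucleotides long.
full_length_product = PRIMER_END
print("expected full-length product:", full_length_product, "nucleotides")

# Consistency check: the AGA codon is said to start at nucleotide 104, so
# nucleotides 105 and 106 should read G and A.
print("nucleotides 105-106:", target[:2])
```

A predicted length of 116 nucleotides is compatible with the discrete product of approximately 120 nucleotides estimated from the gel.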

DISCUSSION
The AFP Multigene Family-The authenticity of all three classes of genomic AFP clones was established by the genomic Southern blot (Fig. 3). At least one type of clone from each class contained EcoRI fragments which matched, in length and strength of hybridization, bands in the genomic digest. However, variant sequences in classes I and II appear to have arisen as artifacts of the cloning procedure. In class I, the difference between λ29-7 and the other members of this group is most likely due to a spurious ligation event, but it is not possible to say which of the two insert types was rearranged during the cloning (Fig. 1). That three examples of the λ33-1 type of clone were isolated may just be a consequence of amplifying the library. In class II the 9-kb left-hand EcoRI fragment of λ15-1 (Fig. 1) is represented in the genome but not the equivalent 10.6-kb fragment from λ2-1. Although the two clones could possibly represent different alleles, a more plausible explanation for the occurrence of λ2-1 would be that it resulted from a DNA rearrangement.
While most of the genomic EcoRI fragments which contained AFP genes and were of a size suitable for insertion into Charon 4 DNA were represented in this library, the majority of the AFP genes could not be cloned because they were in EcoRI fragments which exceeded the 22-kb upper size limit. The bands a and b in Fig. 3 each contain two AFP genes, c and d each contain one. Therefore, in the genomic digest at least six genes are not a part of the major group towards the top of the blot. On the basis of hybridization intensities we estimate that the flounder genome contains at least 30-40 AFP genes.
The AFP Gene in Subclone E3-The AFP gene in subclone E3 is highly homologous to the cDNA sequence described previously (4) which corresponds to nucleotides 44-368 (Fig. 6). In this stretch of 315 base pairs there are only four nucleotide substitutions. There are silent base changes at nucleotide 73 in the valine codon of the signal sequence (GTC → GTT), and at nucleotide 217 in a leucine codon of the mature protein (CTC → CTT). The base change at nucleotide 258 changes an alanine codon to an aspartic acid codon (GCC → GAC). The fourth base change (G → A) is in the 3'-untranslated region at nucleotide 334. The insertion of a G at nucleotide 322, marked by the asterisk, represents a correction to the original cDNA sequence. The genomic sequence in Fig. 6 extends through the putative polyadenylation signal AATAAA (nucleotides 365-370) and beyond into the 3' flanking DNA. Twenty-two base pairs downstream from the centre of this sequence at nucleotide 389 is the start of an oligo(dA) tract 5 base pairs long which might constitute the poly(A) attachment site.
At the 5' end of the gene the transcription initiation site (nucleotide 1) has been identified by comparison with full length AFP cDNA clones of the component A and B types. These clones were made by the procedure of Land et al. (32) and all begin with the sequence 5'-ACCACATCTT...-3'.
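The classification of the three coding-region substitutions can be reproduced from the coordinates in the text. The sketch below assumes that the reading frame runs uninterrupted from nucleotide 50 to nucleotide 295 in the numbered sequence (the intervening sequence is not numbered); the function names are illustrative, and the codon table holds only the six codons named above rather than the full genetic code.

```python
CDS_START, CDS_END = 50, 295   # coding nucleotides, from the text

# Only the codons mentioned in the text; not a complete codon table.
CODON_TO_AA = {
    "GTC": "Val", "GTT": "Val",
    "CTC": "Leu", "CTT": "Leu",
    "GCC": "Ala", "GAC": "Asp",
}

def codon_coordinates(nt: int, cds_start: int = CDS_START) -> tuple[int, int]:
    """Return (codon number, position within codon), both 1-based."""
    offset = nt - cds_start
    return offset // 3 + 1, offset % 3 + 1

def classify(codon_a: str, codon_b: str) -> str:
    """Label a codon change as silent or missense."""
    return "silent" if CODON_TO_AA[codon_a] == CODON_TO_AA[codon_b] else "missense"

# Substitutions inside the coding region, keyed by nucleotide number.
changes = {73: ("GTC", "GTT"), 217: ("CTC", "CTT"), 258: ("GCC", "GAC")}

n_codons = (CDS_END - CDS_START + 1) // 3
print("codons in the preproprotein:", n_codons)

results = {}
for nt, (a, b) in changes.items():
    codon, pos = codon_coordinates(nt)
    results[nt] = (codon, pos, classify(a, b))
    print(f"nt {nt}: codon {codon}, position {pos}, {a} -> {b} is {results[nt][2]}")
```

Under that assumption the two silent changes fall at third codon positions and the Ala to Asp change at a second position, which agrees with the codons quoted in the text.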
Since the 5' untranslated regions of the A and B component clones and the corresponding sequence in Fig. 6 are all identical, the homology is likely to extend to the transcription start site. The AFP mRNA encoded by the gene in E3 would, therefore, be 389 nucleotides long without the poly(A) tract. Previous estimates for the mRNA and cDNA lengths were 520 and 405 nucleotides, respectively (20).
The only significant difference between this genomic clone and the cDNA sequence for the A component AFP is the base change at nucleotide 258 which changes an alanine codon for an aspartic acid codon. As a result the protein encoded by the gene in E3 will have 5 Asx and 1 Glu in its composition. This ratio of acidic residues has yet to be seen among the flounder AFP components which have been purified to date (4,5). The A component has 4 Asx and 1 Glu, and the B component 5 Asx and 0 Glu. However, there is certainly no indication that this AFP genomic clone is a pseudogene. It has conventional splice junction sequences (33), initiation and termination codons, a typical polyadenylation signal, and a conventional cap site (34). Exactly 31 bp upstream from the cap site is the start of a "TATA" box signal (34) TATAAA, and a further 51 bp upstream is the sequence CAAT which is believed to promote transcription (35). It is most likely that this gene is functional but unique, and that its protein is overshadowed by the products of the 30 or 40 other AFP genes, the majority of which presumably code for the more abundant A and B components.