Isolation of a cDNA clone for human antithrombin III.

Antithrombin III (ATIII) is an important plasma protease inhibitor with a central role in the coagulation system. On the basis of its protein sequence, ATIII is one member of a "super family" of protease inhibitors that includes alpha 1-antitrypsin and chicken ovalbumin. An increased risk of thromboembolism is associated with inherited ATIII deficiency. To study the structure and expression of the human ATIII gene, we have isolated complementary (cDNA) clones for ATIII from human liver mRNA. ATIII cDNA clones were identified by hybridization to a mixture of synthetic oligodeoxynucleotides encoding amino acids 251-256 of the ATIII protein sequence. The largest cDNA clone (1.4 kilobases) included the coding region of ATIII mRNA from codon 10 through a 3'-untranslated region. Comparison of ATIII cDNA clones from two different sources revealed a sequence polymorphism at an internal PstI restriction site. Analysis of both total genomic DNAs and an ATIII gene cloned in a bacteriophage Charon 4A showed that the ATIII gene is present once per haploid genome and is distributed over 10-16 kilobases of DNA. Computer-assisted comparison of the cDNA sequence with those for baboon alpha 1-antitrypsin and chicken ovalbumin revealed homologies consistent with their inclusion in the protease inhibitor superfamily.

super family" (7-9) and it has been proposed that divergence of these proteins from a common ancestor occurred more than 500 million years ago (9).
In order to study potential evolutionary relationships within this family and to understand the basis of inherited ATIII deficiency, we have initiated molecular analysis of the ATIII gene. As a first step, we have characterized ATIII complementary DNA (cDNA) clones from two human liver cDNA libraries. The genomic organization of ATIII sequences in cellular DNA has been studied and a recombinant λ-phage clone containing ATIII genomic sequences has been isolated. A restriction site polymorphism within the gene coding region has been found that may prove useful in the study of families with ATIII deficiency. Finally, a comparison of the sequences of the cDNAs of ATIII, α1-antitrypsin, and ovalbumin has revealed remnants of the proposed original homology.

EXPERIMENTAL PROCEDURES
Radioimmunoassay for ATIII in Human Liver-Approximately 1 g of tissue was minced in volumes of 0.1 M NaCl, 20 mM Tris-HCl, pH 7.5, 1 mM EDTA, and 50 µg/ml of PMSF (Sigma). The tissue was then hand homogenized and SDS was added to a final concentration of 0.1%. Following sonication on a Sonifier Cell Disruptor (Heat-Systems Ultrasonics, Inc., Plainview, NY) for 2 min, the homogenate was clarified at 3,000 × g for 10 min. The supernatant was again centrifuged at 200,000 × g for 60 min in a Beckman Ti-65 rotor and then dialyzed overnight at 4 °C in 0.1 × PBS (Gibco, Grand Island, NY) plus 5 µg/ml of PMSF. The dialysate was lyophilized and redissolved in one-tenth the original volume of water. The protein concentrations of the lysates were 60-150 mg/ml. Serial dilutions of the lysate or of purified ATIII (gift of R. D. Rosenberg, Harvard Medical School) were prepared in PBS and 3% BSA. 10 µl were incubated overnight at 4 °C with 5 µl of a rabbit antiserum prepared against highly purified ATIII (10). Approximately 2500 cpm of 125I-ATIII in 200 µl of PBS and BSA were then added and the incubation continued for an additional 60 min at 37 °C. IgG was precipitated with Staph A (IgG-sorb, The Enzyme Center, Boston, MA). The total amount of ATIII in each tissue was determined from a standard curve using purified ATIII. The sensitivity of the radioassay was such that 100 pg of the purified ATIII could be routinely detected.
Construction of a Human Adult Liver cDNA Library-Fresh adult liver was homogenized in the presence of SDS and the total RNA extracted with phenol (11). Poly(A)-containing mRNA was isolated by passage over oligo(dT)-cellulose and stored in liquid nitrogen. First strand cDNA synthesis was carried out in a final volume of 500 µl containing 25 µg of mRNA; 50 mM Tris-HCl, pH 8.2; 50 mM KCl; 6 mM MgCl2; 10 mM dithiothreitol; 400 µM of each of the dNTPs except for dCTP, which was 100 µM; 100 µCi of [α-32P]dCTP (2000 Ci/mmol, New England Nuclear); 25 µg of oligo(dT)12-18 (Collaborative Research, Waltham, MA); 5 µl of placental RNase inhibitor (RNasin, BioTec, Inc., Madison, WI); and 150 units of avian myeloblastosis virus reverse transcriptase (Life Sciences, St. Petersburg, FL). The efficiency of first strand synthesis was 12-30% following a 60-min incubation at 42 °C. The DNA was sized on an alkaline sucrose gradient and only material >500 nucleotides in length (60% of total DNA) was used for second strand synthesis.
Second strand synthesis was performed in a final volume of 1 ml containing 30 mM Tris-HCl, pH 7.5, 4 mM MgCl2, 70 mM KCl, 500 µM of each of the four dNTPs, 0.5 mM β-mercaptoethanol, and approximately 3 µg of 32P-labeled single-stranded cDNA. The Klenow fragment of Escherichia coli DNA polymerase I (Boehringer-Mannheim, Indianapolis, IN) was added to a final concentration of 100-150 units/ml and the reaction was incubated at 20 °C for 5-6 h. Following phenol extraction and dialysis, the resultant double-stranded cDNA was treated with 10 units/ml of S1 nuclease (Sigma) for 30 min. 50-70% of input counts were S1 resistant. The product was again sized on a neutral sucrose gradient to remove material of <500 bp in length. Following exhaustive dialysis and ethanol precipitation, dCMP tails were added as described (12). 1.8 µg of double-stranded cDNA was obtained from the original 25 µg of mRNA. The product was annealed to PstI-digested and dGMP-tailed plasmid vector pKT218 (13) and used to transform E. coli strain MC1061 (14) to tetracycline resistance. Transformation efficiency varied between 2 and 13 × 10^5 transformants/µg of double-stranded cDNA with less than 10% background. The average length of cDNA inserts was approximately 1000 bp. The recombinant colonies were scraped from their nitrocellulose filters and stored at -80 °C as a 25% glycerol stock without further amplification. A total of 2.5 × 10^5 independent colonies was obtained, sufficient to include nearly all liver mRNA species with >99% probability.
Synthesis of Oligodeoxynucleotide Probe-A series of 17-nucleotide long oligomers, corresponding to the eight possible coding sequences for amino acids 251-256 of ATIII, was synthesized by the triester method (15, 16). The mixture was 5'-end labeled with [γ-32P]ATP (7000 Ci/mmol, New England Nuclear) using polynucleotide kinase and used as an ATIII-specific probe in in situ hybridizations with recombinant bacterial colonies.
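The way a six-residue stretch gives rise to exactly eight 17-mers can be sketched in a few lines of Python. This is an illustration only, not part of the original methods. It assumes that amino acids 251-256 are Met-Met-Tyr-Gln-Glu-Gly, as implied by the sequence 5'-ATG-ATG-TAC-CAG-GAA-GG-3' reported for this region of pATIII-2, and that only the first two bases of the Gly codon were included. The names `CODON_CHOICES` and `coding_17mers` are hypothetical.

```python
from itertools import product

# Codon choices under the standard genetic code for the assumed residues
# Met-Met-Tyr-Gln-Glu-Gly (inferred from the reported cDNA sequence).
CODON_CHOICES = [
    ["ATG"],         # Met: single codon
    ["ATG"],         # Met: single codon
    ["TAT", "TAC"],  # Tyr: two codons
    ["CAA", "CAG"],  # Gln: two codons
    ["GAA", "GAG"],  # Glu: two codons
    ["GG"],          # Gly: first two bases only; third base is fully degenerate
]


def coding_17mers(choices=CODON_CHOICES):
    """Enumerate every coding-strand oligomer allowed by the codon choices."""
    return ["".join(parts) for parts in product(*choices)]


oligos = coding_17mers()
for oligo in oligos:
    print(oligo)
```

Three two-fold degenerate positions give 2 × 2 × 2 = 8 sequences, each 17 nucleotides long, and the sequence actually found in the cDNA is one of them.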
In Situ Colony Hybridization-Recombinant bacterial colonies were plated directly onto nitrocellulose filters (Millipore) and grown overnight on L broth containing 1.6% agar and 10 µg/ml of tetracycline. Following replica plating, the colonies were grown for an additional 4-6 h and then amplified overnight on medium containing 150 µg/ml of chloramphenicol. DNA was denatured and fixed to the filters as described by Woods et al. (17) and hybridized overnight at 40 °C in a solution containing 6 × SSC, 1 × Denhardt's, 100 µg/ml of tRNA, 0.05% Na pyrophosphate and 40 ng/ml of 32P-labeled 17-mer (specific activity ~10' dpm/µg). The filters were then washed exhaustively against several changes of 6 × SSC + 0.05% Na pyrophosphate at either 40 or 50 °C, dried, and exposed overnight on X-Omat AR film (Kodak) using a calcium tungstate intensifying screen.
Screening of Recombinant Phage Library-A human genomic DNA library (18), cloned in the λ-phage Charon 4A (19), was screened with nick-translated ATIII cDNA (20). Approximately 10^6 phage plaques were screened and three positives were identified.
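The choice of 40 and 50 °C washes can be put in context with a rough stability estimate for a short oligonucleotide. The sketch below uses the common rule of thumb Td ≈ 2(A+T) + 4(G+C) °C, often applied to short probes in high-salt buffers such as 6 × SSC. The text does not state that this rule was used, so it is an assumption here, and `wallace_td` is a hypothetical helper name. The input is the 17-mer sequence reported for this region of pATIII-2.

```python
def wallace_td(oligo: str) -> int:
    """Rule-of-thumb dissociation temperature in deg C: 2*(A+T) + 4*(G+C)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Perfectly matched 17-mer, taken from the sequence reported for pATIII-2.
perfect_match = "ATGATGTACCAGGAAGG"
td = wallace_td(perfect_match)
print(f"Estimated Td for {perfect_match}: {td} deg C")
```

Under this estimate the perfectly matched 17-mer sits close to the more stringent wash temperature, which is consistent with mismatched members of the mixture being removed at 50 °C. The rule is approximate and ignores sequence context.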
Restriction Enzymes-These were purchased from New England Biolabs (Beverly, MA) or Bethesda Research Labs (Gaithersburg, MD) and used according to the directions of the supplier.
DNA Sequencing-This was performed as described by Maxam and Gilbert (21).

RESULTS
ATIII Levels in Human Liver-Previous work indicated that liver is a rich source of ATIII (22) and thus presumably of ATIII mRNA. We first determined the level of immunoreactive ATIII in various human liver samples by radioassay to choose the most appropriate source for construction of a cDNA library for the cloning of ATIII sequences. Total liver lysate was prepared as described under "Experimental Procedures" and was free of proteolytic activity as judged by a failure of 125I-ATIII to be degraded following overnight incubation with the undiluted lysate. Serial 2-fold dilutions of lysate were assayed for ATIII and the results compared to the standard curve. ATIII represented approximately 0.007% of total adult liver protein, whereas it was about 10-fold and &fold less abundant in fetal liver samples of 12-14 and 18-20 weeks gestation, respectively (Table I). In view of the higher ATIII level in adult liver, we initially employed an adult liver cDNA library for the isolation of cDNA clones.
Screening of Adult Liver cDNA Library for ATIII Sequences-The adult liver cDNA library was screened for ATIII-specific sequences with a mixture of the eight synthetic heptadecanucleotides (17-mers) which could potentially encode amino acids 251-256 of ATIII (Fig. 1). Within the oligonucleotide mixture, only one 17-mer would be expected to be perfectly homologous to the ATIII mRNA sequence. End-labeled 17-mer mixture was employed for hybridization in situ. Upon screening 1.5 × 10^5 bacterial colonies containing recombinant plasmids, 32 positive colonies were initially identified after washing of filters in 6 × SSC at 40 °C. Twenty-four remained positive upon rescreening under more stringent washing at 50 °C. These presumptive ATIII cDNA clones represented 0.016% of the total colonies screened with the 17-mer mixture.
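The screening numbers can be checked with simple arithmetic. The sketch below recomputes the observed clone frequency from the counts given in the text and compares it with what would be expected if clone frequency simply mirrored the 0.007% protein abundance measured by radioassay. That equivalence of protein abundance, mRNA abundance and clone frequency is an assumption made for illustration, and the variable names are hypothetical.

```python
# Counts taken from the text.
colonies_screened = 1.5e5
positives_after_50C_wash = 24

# Observed frequency of presumptive ATIII clones, as a fraction and a percentage.
observed_frequency = positives_after_50C_wash / colonies_screened
observed_percent = 100 * observed_frequency

# Assumption: clone frequency equals the fraction of liver protein that is ATIII.
protein_fraction = 0.007 / 100
expected_positives = protein_fraction * colonies_screened

# Chance of recovering at least one such clone when screening this many colonies.
p_at_least_one = 1 - (1 - protein_fraction) ** colonies_screened

print(f"observed frequency: {observed_percent:.3f}%")
print(f"expected positives at 0.007%: {expected_positives:.1f}")
print(f"P(at least one clone): {p_at_least_one:.5f}")
```

The observed count is roughly twice the expectation under this crude assumption, so the two figures agree to within a small factor.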
The cDNA inserts contained in the positive colonies were examined after digestion of plasmid DNAs with PstI. Inserts ranged from 0.7-1.4 kb in length and shared common restriction sites for HpaI and PvuII. The largest of these clones, designated pATIII-2, was analyzed fully by restriction mapping and DNA sequencing.
DNA Sequence of pATIII-2-A restriction map for the insert and the strategy used for its nucleotide sequence determination are presented in Fig. 2. Complete nucleotide sequencing of the insert revealed that the ATIII protein sequence was encoded in continuous fashion from the codon for amino acid 10 through a termination codon following amino acid 432 (Fig. 3). The region chosen for oligonucleotide synthesis, which corresponded to amino acids 251-256, had the sequence 5'-ATG-ATG-TAC-CAG-GAA-GG-3'. The 3'-untranslated region was 87 nucleotides in length and contained a single poly(A) addition signal AAAAATAAA (23, 24) 24 nucleotides upstream from the actual site of poly(A) addition. Four termination codons were found, one of which was in the same reading frame as the coding sequence. A 53-nucleotide long poly(A) tail was followed by a C19 terminal sequence.
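As a rough consistency check, the lengths reported for the individual parts of the pATIII-2 insert can be summed and compared with the size of about 1.4 kb estimated for the largest clone. The sketch assumes that the 87-nucleotide 3'-untranslated region is counted separately from the termination codon and ignores the homopolymer tail at the 5' end, whose length is not given. Both points are assumptions, and the variable names are hypothetical.

```python
# Coding region present in the clone: codons 10 through 432 inclusive.
first_codon, last_codon = 10, 432
coding_nt = (last_codon - first_codon + 1) * 3

stop_codon_nt = 3   # in-frame termination codon
utr3_nt = 87        # 3'-untranslated region (assumed to exclude the stop codon)
poly_a_nt = 53      # poly(A) tail
c_tail_nt = 19      # terminal C19 homopolymer from tailing

total_nt = coding_nt + stop_codon_nt + utr3_nt + poly_a_nt + c_tail_nt
print(f"coding region: {coding_nt} nt")
print(f"approximate insert length: {total_nt} nt")
```

The sum comes to a little over 1.4 kb, in line with the insert size estimated from PstI digestion.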
The predicted amino acid sequence derived from the cDNA sequence differed in several regions from that previously reported (7) (see Table II). In most instances, this was due to apparent inability to distinguish between acidic amino acids and their reduced forms in the peptide sequencing. In addition, the cDNA sequence predicted the presence of an octapeptide (Val-Leu-Val-Asn-Thr-Ile-Tyr-Phe) between Leu 213 and Lys 214 (our Lys 222) that could not be unequivocally placed in the original amino acid sequence. The predicted length of the ATIII protein is therefore 432 rather than 424 amino acids and our sequence is numbered accordingly in Fig. 3.
Polymorphism in ATIII cDNA-50,000 additional recombinant E. coli from an independently constructed human fetal liver cDNA library (25) were also screened for ATIII sequences with the oligonucleotide mixture. Positive colonies were identified at approximately half the frequency observed with the adult liver cDNA library (not shown). Analysis of these plasmids indicated similar insert sizes as well as the HpaI and PvuII restriction sites seen in ATIII clones from the adult liver library. Sequence analysis showed that the largest fetal cDNA clone began with the codon for Gln 101 of ATIII and was continuous through the 3'-untranslated region (not shown).
Digestion with PstI, however, revealed the presence of an internal restriction site in each of the fetal cDNA inserts. DNA sequencing from this novel site indicated that it arose as the result of a translationally silent A→G transition in the third codon position of Gln 305 (nucleotide 901, Fig. 3). The presence of an internal PstI site in all fetal liver ATIII cDNA clones examined makes it highly unlikely that it was the consequence of an error during copying of the mRNA by AMV reverse transcriptase. Additional DNA sequencing also demonstrated the previously unrecognized octapeptide between Leu 213 and Lys 222 (not shown). Although we did not determine the complete sequence of a fetal liver ATIII cDNA clone, these data were most consistent with the presence of sequence polymorphism at Gln 305. Such polymorphism was confirmed in our analyses of genomic DNAs (see below).
Genomic Organization of ATIII Sequences-The normal arrangement of ATIII sequences in human DNAs was examined by Southern blot analysis (26) using a variety of restriction enzymes and nick-translated pATIII-2 as probe. As shown in Fig. 4A, a simple hybridization pattern was observed after EcoRI digestion with hybridizing bands seen at 11 and 4.1 kb. Since EcoRI fails to cleave the cloned cDNA insert (Fig. 2), the presence of more than a single hybridizing band implies either the existence of at least one intervening sequence in the cellular gene or more than one copy of the structural gene per haploid DNA. Digestion of human genomic DNAs with PstI revealed polymorphism with one of the three patterns shown in Fig. 4B. Normal individuals had either 3 hybridizing fragments (10.5, 2.5, and 1.8 kb), 4 fragments (5.5, 5.0, 2.5, and 1.8 kb) or 5 fragments (10.5, 5.5, 5.0, 2.5, and 1.8 kb).
From a partial EcoRI library of genomic DNA in the bacteriophage Charon 4A (18, 19), we isolated a clone that contains the ATIII-specific EcoRI fragments shown in Fig. 4A and PstI fragments of 5.5, 5.0, 2.5, and 1.8 kb (Fig. 4C).
DNA sequence analysis of this genomic clone from the PstI sites at the ends of the 5.0- and 5.5-kb fragments demonstrated the presence of the coding region neighboring the codon for Gln 305 (not shown). These findings establish that the internal PstI site in the coding region, first identified in a fetal liver cDNA clone, leads to the presence of the 5.5- and 5.0-kb fragments in Southern blots. Its absence, as in the adult cDNA clone, gives rise to the 10.5-kb fragment in some genomic DNAs. Individuals heterozygous for the polymorphic PstI site have 10.5-, 5.5-, and 5.0-kb fragments in addition to the invariant 2.5- and 1.8-kb fragments. In other studies we have shown that the PstI polymorphism is inherited as a simple Mendelian trait.
Fig. 5. ... and human α1-antitrypsin (B) cDNAs. The ovalbumin sequence was that described by McReynolds et al. (28). The α1-antitrypsin sequence was that described by Kurachi et al. (29). Individual groups of three nucleotides were compared between the sequences and matches expressed as a dot along the respective axes (30, 31). Significant homologies are expressed as diagonal arrays of dots. Small regions of such homologies are apparent.
Taken together, the presence of both ATIII-specific EcoRI genomic fragments in a single bacteriophage clone and the simple inheritance pattern of the PstI polymorphism indicates that the ATIII gene is represented only once per haploid genome. The coding region represented in pATIII-2 is distributed over more than 10.5 but less than 16 kb in cellular DNA. We also conclude that at least two intervening sequences are present in the cellular gene on the basis of PstI digestions which yield a minimum of three ATIII-specific genomic fragments.
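The relationship between genotype at the polymorphic PstI site and the Southern blot pattern described above can be written out explicitly. The following Python sketch is an illustration under the single-locus, two-allele model that the text argues for. The fragment sizes are those reported in the text, and `pst_fragments` is a hypothetical helper name.

```python
# Fragment sizes in kb, as reported for PstI digests probed with pATIII-2.
INVARIANT = (2.5, 1.8)      # present in every individual
SITE_ABSENT = (10.5,)       # allele lacking the internal PstI site
SITE_PRESENT = (5.5, 5.0)   # allele carrying the internal PstI site


def pst_fragments(allele_a_has_site, allele_b_has_site):
    """Return the hybridizing PstI fragments (kb, largest first) for a genotype."""
    fragments = set(INVARIANT)
    for has_site in (allele_a_has_site, allele_b_has_site):
        fragments.update(SITE_PRESENT if has_site else SITE_ABSENT)
    return sorted(fragments, reverse=True)


for genotype in [(False, False), (True, True), (True, False)]:
    print(genotype, pst_fragments(*genotype))
```

The two homozygous genotypes give the 3- and 4-fragment patterns and the heterozygote gives the 5-fragment pattern. The 5.5- and 5.0-kb fragments sum to 10.5 kb, as expected if a single additional site splits the larger fragment.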
Homologies among ATIII, Ovalbumin, and α1-Antitrypsin cDNAs-On the basis of similarities in amino acid sequences, ATIII, chicken ovalbumin, and α1-AT have been classified into a protease inhibitor super family in which divergence from a common ancestral gene is estimated to have occurred more than 500 million years ago (9). The availability of an ATIII cDNA sequence, along with those previously reported for ovalbumin and baboon α1-AT (27, 28), allowed us to examine potential homologies at the nucleic acid level. Using a computer-assisted direct base comparison, each cDNA sequence was aligned with the other two and the results of homologies expressed graphically in a "dot matrix" format (29, 30). Fig. 5 displays the comparison of human ATIII with chicken ovalbumin and baboon α1-AT sequences. Interrupted nucleic acid homologies were found, especially in the central portions of the cDNAs. Somewhat greater homology was observed between ATIII and α1-AT. Using a direct alignment program, a comparison of the cDNA of ATIII with those of α1-AT and ovalbumin revealed homologies of 43 and 39%, respectively. Protein and nucleic acid homologies did not always coincide. For example, amino acid residues 75-94 of ATIII were 50% homologous with the aligned residues of ovalbumin (9). The homology was only 35%, however, at the nucleic acid level. Conversely, while amino acid residues 166-190 of ATIII were 53% homologous with α1-AT at the cDNA level, only 29% of the amino acids were shared.
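The two kinds of comparison mentioned here, a dot matrix built from groups of three nucleotides and a percent homology over an alignment, can be sketched as follows. This is not the program used in the study. The sliding window of three nucleotides, the function names, and the short toy sequences are all assumptions chosen only to show the idea.

```python
def dot_matrix(seq_a, seq_b, window=3):
    """Return the set of (i, j) positions where the windows of seq_a and seq_b match."""
    words_a = [seq_a[i:i + window] for i in range(len(seq_a) - window + 1)]
    words_b = [seq_b[j:j + window] for j in range(len(seq_b) - window + 1)]
    return {
        (i, j)
        for i, word_a in enumerate(words_a)
        for j, word_b in enumerate(words_b)
        if word_a == word_b
    }


def percent_identity(aligned_a, aligned_b):
    """Percent of aligned, non-gap positions at which the two sequences agree."""
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


# Toy sequences (hypothetical, not taken from the cDNAs discussed in the text).
toy_a = "ATGGAAGGT"
toy_b = "CCATGGAAG"
dots = dot_matrix(toy_a, toy_b)
print(sorted(dots))
print(percent_identity("ATGC", "ATGA"))
```

In the toy example the matching windows fall on a single diagonal offset by two positions, which is how a shared segment shows up as a diagonal array of dots.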

DISCUSSION
Synthetic oligonucleotides have recently been employed successfully as hybridization probes for specific DNA sequences and have permitted the isolation of several cDNA clones for low abundance products (17, 25, 32-35). In this manner, enrichment of mRNA or cDNA preparations prior to screening of recombinants is not necessary. The difference in stability of perfectly matched and singly mismatched oligonucleotides (36) is sufficient to permit the use of oligonucleotide mixtures in direct screening of recombinant clones. In our study, we used a mixture of 8 heptadecanucleotides (17-mers) as direct in situ hybridization probes of adult liver cDNA colonies to isolate clones for human ATIII mRNA sequences. A preliminary radioimmunoassay of the adult liver tissue from which our cDNA library was constructed had indicated an abundance of ATIII of approximately 0.007%. The frequency with which positive colonies were identified by in situ hybridization (0.016%) was in reasonable agreement with this value.
The largest cloned ATIII insert we obtained was a nearly full length copy of the ATIII mRNA. The coding region was complete from codon 10 through a termination codon after amino acid 432. 84 nucleotides were also present in the 3'-untranslated region (Fig. 3). Several minor differences in the predicted amino acids from the reported protein sequence (7) were identified in addition to the presence of an internal octapeptide which had not been previously noted. On the basis of our cDNA sequence and the reported NH2-terminal sequence of the protein, the total number of amino acids in human ATIII is, therefore, 432 rather than 424. The 3'-untranslated region contained a typical poly(A) addition site (23, 24) as well as 4 termination codons, one of which was in the same reading frame as the coding sequence. The first 9 codons, an anticipated leader sequence, and the 5'-untranslated segment of the mRNA were not present in our largest cDNA clone. In subsequent studies, we have obtained additional clones encompassing these segments.
Studies of ATIII cDNA isolated from a fetal liver cDNA library led to the identification of a coding region polymorphism in the Gln 305 codon that leads to different hybridization patterns in Southern blot analysis of genomic DNAs. In other studies, we have shown that this polymorphism is present at high frequency in normal DNAs and can be used to examine the molecular basis of inherited ATIII deficiency (27). Initial analysis of a recombinant bacteriophage clone containing the cellular ATIII gene, in conjunction with this PstI polymorphism, permits us to conclude that only a single ATIII gene is present per haploid DNA. This gene encompasses more than 10 kb of DNA and is interrupted at least twice by intervening sequences. Further studies are underway to determine the fine structure organization of this locus and its evolutionary relationship with other genes of the proposed protease inhibitor super family.
A knowledge of ATIII gene structure and the frequency of restriction enzyme polymorphisms should also allow for a systematic approach to the study of inherited ATIII deficiency.
Hunt and Dayhoff (9) and subsequently Kurachi et al. (29) have compared the partial or complete amino acid sequences of ATIII, baboon α1-AT, and ovalbumin. Homologies between α1-AT and ATIII, α1-AT and ovalbumin, and ATIII and ovalbumin were 28, 24, and 31%, respectively (29). An examination of baboon α1-AT and ovalbumin amino acid sequences by dot matrix analysis has been reported to show homology along the entire lengths of both molecules (37).
Our cloning of a nearly complete ATIII cDNA has now allowed for a direct comparison with those for α1-AT and ovalbumin. We have found homology among these cDNAs (Fig. 5), thus confirming the classification of these three proteins into a protease inhibitor super family (9). The full implications of these conserved segments cannot be appreciated until more is known regarding the functional domains of these proteins. Recently, Leicht et al. (37) have compared the intron structure in the α1-AT and ovalbumin genes. Unlike gene families such as the globins and vitellogenins where the number of introns and their positions tend to be highly conserved (38, 39), the introns of α1-AT and ovalbumin show no such conservation. Indeed, the structure of these genes appears to more closely resemble that of the actin gene family in which, despite extensive exon homology, the numbers and positions of the introns are completely distinct among family members (40-42). Further analysis of cloned ATIII genomic sequences (Fig. 4C)