Cloning and expression of human apolipoprotein D cDNA.

The amino acid sequence of human apolipoprotein D, a component of high density lipoprotein, has been obtained from the cloned cDNA sequence. The 169-amino acid protein has no marked similarity to other apolipoprotein sequences, but has a high degree of homology to plasma retinol-binding protein and other members of the alpha 2u-globulin protein superfamily. Apolipoprotein D mRNA has been detected in human liver, intestine, pancreas, kidney, placenta, adrenal, spleen, and fetal brain tissue. Tissue culture cells transfected with the cloned cDNA secrete material that reacts with anti-apoD antibodies.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J02611.
cDNA Cloning-Human adult liver cDNA libraries in λgt10 were prepared and provided by Axel Ullrich and Lisa Coussens of Genentech, Inc. Recombinant phage were screened on nitrocellulose filters with single oligonucleotide probes ranging from 33 to 95 bases (9). DNA from cDNA and genomic clones was isolated and sequenced as described (9). Both strands of DNA were independently sequenced. A plasmid for expression of apoD was constructed using the expression plasmid p342E, which expresses hepatitis B surface antigen (17).
This plasmid consists of an SV40 origin and early promoter preceding the viral DNA encoding hepatitis B surface antigen and its transcription termination region. In addition, this plasmid contains the pML prokaryotic origin of replication and ampʳ gene, allowing its selection and propagation in Escherichia coli. After the EcoRI site 5' to the SV40 promoter was destroyed, this plasmid was digested with EcoRI and HpaI, which liberated only the hepatitis B surface antigen gene, and in its place was ligated an EcoRI-NotI-HpaI synthetic linker 17 nucleotides in length. This plasmid was digested with EcoRI to open it at this single synthetic EcoRI site, and the 698-bp EcoRI insert of cAPOD.8 was ligated in to create the intermediate plasmid pSAH.8.
pSAH.8 was linearized by digestion with SstII, and a 1713-bp SstII fragment containing the mouse dihydrofolate reductase gene (from plasmid pgD trunc dihydrofolate reductase; Ref. 8) was ligated in to give the final expression plasmid designated pSAHD.8. pSAHD.8, therefore, consists of the SV40 origin and early promoter leading into the apoD cDNA, which is followed by the hepatitis B surface antigen transcription termination region. This is followed by a second SV40 origin and early promoter, leading into the dihydrofolate reductase gene. The dihydrofolate reductase gene is followed by the origin and ampʳ gene of pML, giving pSAHD.8 a total length of 6599 bp.

RESULTS
Partial Protein Sequence, Oligonucleotide Probes, and cDNA Cloning-Purified human apoD protein was subjected to limited microsequence analysis to generate oligonucleotide probes for cDNA clone identification. The amino terminus of the plasma-derived protein was blocked, but protein sequence was obtained from several peptides derived from cleavage with trypsin or cyanogen bromide. Single oligonucleotides were synthesized to represent one possible codon choice for each of the four regions of available peptide sequence. These probes ranged in size from 33 to 95 nucleotides. About 2 million phage from an oligo(dT)-primed human adult liver cDNA library (11) in λgt10 were screened with triplicate sets of nitrocellulose filters hybridized with various combinations of 32P-labeled oligonucleotide probes (9). Two phage were recovered which produced detectable hybridization with each of the probes apod.3 (66-mer), apod.4 (45-mer), and apod.6 (95-mer). (The recovery of only two apoD clones from 2 million phage was below expectations and will be discussed below.) Rescreening of the library filters with these cDNA clones as probes yielded only sibling copies of the same clones. Phage DNA was prepared, and the cDNA inserts were analyzed by sequencing.
cDNA Sequence-The ~700-bp inserts of the cDNA clones cAPOD.6 and cAPOD.8 were subcloned into the EcoRI site of M13 vectors (12) and subjected to dideoxy chain termination sequencing (13). ApoD mRNA and cDNA clones are diagramed in Fig. 1 (Appendix), and the DNA sequence is presented in Fig. 2 (Appendix). The initial dideoxy sequence of cAPOD.6 revealed an insert of 708 bp (not including synthetic linker sequences), whereas cAPOD.8 contained 698 bp. The two cDNA clones differed slightly in length and contained three internal nucleotide differences. Neither clone contained a poly(A) tail. Translation of the DNA sequence reveals a methionine codon at sequence positions 62-64, followed by a continuous open reading frame. The initial methionine codon is preceded by CAAG and followed by G, in fair agreement with the consensus sequence surrounding translation start sites in eukaryotic mRNAs (14). It is followed by a peptide sequence rich in hydrophobic residues, consistent with an amino-terminal secretion signal peptide (15). The translated DNA sequence contains the regions of sequenced protein which were used in preparing the oligonucleotide probes. The derived protein sequence agrees with all 22 residues of the sequenced CNBr fragment represented in probe apod.3, with all 15 residues of the tryptic fragment of probe apod.4, and with all 32 residues of the partial tryptic fragment of probe apod.6 (which overlaps apod.3). This sequence also agreed with the protein sequence of 11 residues used for the apod.2 probe, even though that probe did not hybridize significantly during the library screen.
Other regions for which partial peptide sequence was available, or for which peptide data were obtained after cDNA cloning, were also found to correspond to the cDNA-derived sequence and are underlined in Fig. 2 (Appendix).
The availability of the cDNA sequence suggested a strategy for obtaining amino-terminal protein sequence information. The glutamine residue, 21 amino acids from the initiator methionine, was a likely candidate for the amino terminus of the mature protein for several reasons. It comes after the first charged residue following the hydrophobic "core" of the presumed signal peptide and immediately follows glycine, a common final residue of a leader peptide (15). Furthermore, blockage of amino termini is frequently caused by cyclization of glutamine residues. Therefore, plasma-derived apoD protein was treated with pyroglutamate aminopeptidase, which did unblock the amino terminus. Nine residues of amino-terminal sequence were then obtained, which agreed with the prediction that the mature protein begins at the suspected glutamine. Thus, we predict, on the basis of the cDNA sequence, that human apoD consists of a 169-amino acid mature sequence preceded by a 20-residue leader peptide.
The cDNA clone cAPOD.6 was 32P-labeled and hybridized to a Northern blot transfer of adult human liver RNA. A single hybridizing band of 900 ± 50 bases was detected (Fig. 3). This indicated that the sequenced cDNA clones were missing some nucleotides at either or both termini, although they appear to contain the complete protein coding region. The 61 base pairs 5' to the translation initiator may not extend to the message start site due to the procedures employed in cDNA synthesis. In addition, the 3'-untranslated region was incomplete. Neither clone possessed poly(A) tails or an AATAAA polyadenylation signal sequence (16) in the ~70 bases of the 3'-untranslated sequence. Both cDNA clones end in a very C-rich region: cAPOD.6 terminates shortly after a.
We surmised that such a sequence could present difficulties in reverse transcriptase or DNA polymerase synthesis of cDNA clones and account for the surprisingly low frequency of clones recovered from the library.
An apoD cDNA clone containing the complete 3'-untranslated region was obtained by screening a second liver cDNA library with a 300-bp BamHI-EcoRI fragment from the 3' portion of cAPOD.6. The sequence of this clone, designated cAPOD.16, coincides with the 3' portion of cAPOD.6 but extends to a poly(A) tail. Beginning with the TAA stop codon, there are 182 bases of 3'-untranslated sequence terminating in a poly(A) tail of about 100 nucleotides. A polyadenylation signal sequence AATAAA occurs 22 bases upstream of the tail. The total length of cloned cDNA is 810 bp, which, with poly(A) tails and the uncloned part of the 5'-untranslated region, is consistent with the mRNA length of ~900 bases determined by Northern blotting.
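The length bookkeeping in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and it assumes that the 182 bases of 3'-untranslated sequence are counted from the first base of the TAA stop codon, as the wording above suggests.

```python
# Bookkeeping check of the cloned cDNA length, using only numbers given in the text.
FIVE_PRIME_BASES = 61          # bases 5' to the translation initiator
LEADER_RESIDUES = 20           # predicted leader (signal) peptide
MATURE_RESIDUES = 169          # predicted mature apoD
THREE_PRIME_FROM_STOP = 182    # assumed to be counted from the TAA stop codon

# Coding bases for leader plus mature protein (stop codon not included here).
coding_bases = 3 * (LEADER_RESIDUES + MATURE_RESIDUES)

# First base of the initiator methionine codon, numbered from the 5' end.
atg_start = FIVE_PRIME_BASES + 1

total_cloned = FIVE_PRIME_BASES + coding_bases + THREE_PRIME_FROM_STOP

print("initiator codon:", atg_start, "-", atg_start + 2)
print("coding bases:", coding_bases)
print("total cloned cDNA:", total_cloned)
```

Under these assumptions the initiator codon falls at positions 62-64 and the total comes to 810 bp, matching the figures reported in the text.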
Expression of Recombinant ApoD-The cloned apoD coding sequences were expressed from the SV40 early promoter by inserting the EcoRI cDNA insert of cAPOD.8 into the vector plasmid pSLH (see "Materials and Methods"). The resulting plasmid, designated pSAHD.8, was introduced into Chinese hamster ovary cells and amplified by methotrexate selection (18). Supernatant from a clone of cells selected at 50 nM methotrexate was metabolically labeled with [35S]methionine for 6 h, immunoprecipitated with anti-apoD antiserum, electrophoresed, and autoradiographed. Fig. 3a shows that the cell line transfected with pSAHD.8 (lane 1), but not a cell line transfected with a control expression plasmid (lane 2), secretes a protein reacting with apoD antibodies that migrates at an apparent Mr ~33,000, in agreement with the reported gel mobility of the glycoprotein isolated from plasma (Refs. 1 and 2 and our data, not shown).
Tissue Distribution of ApoD mRNA-All of the apolipoproteins that have been characterized are synthesized in the liver, although other sites of synthesis occur for some of these proteins. Poly(A)+ RNA was isolated from a number of human tissues and cell lines, electrophoresed, blotted onto nitrocellulose, and hybridized with radiolabeled cloned apoD cDNA. Liver RNA contained a hybridizing 900-base species (Fig. 3b). Pancreas, adrenal gland, kidney, small intestine, and placenta also contain hybridizing apoD RNA of the same size. The signal in these lanes is more intense than that in the liver lane, indicating that apoD mRNA accounts for a higher percentage of message in these tissues. ApoD mRNA was not detected in white blood cells or in the monocyte-like U937 and HL60 cell lines. In addition, apoD mRNA was found in spleen and in fetal brain in amounts comparable to that in intestine (data not shown).
ApoD Homologies-The nucleic acid sequence of apoD cDNA and its predicted amino acid sequence were compared to other known gene and protein sequences present in the GenBank™ and Dayhoff sequence data bases. No significant amino acid homology was noted between the apoD sequence and those of the apolipoprotein A, B, C, and E groups. In addition, application of the Chou and Fasman algorithm (21) predicts less than 5% α-helix in apoD, in contrast to the substantial amounts of α-helix present in apolipoprotein groups A, C, and E (22,23).
Significant homology was found between apoD and a class of related proteins, including human retinol-binding protein, human α1-microglobulin, ungulate β-lactoglobulin, rodent α2u-globulin, and tobacco hornworm insecticyanin. These proteins make up the α2u-globulin superfamily, and their homologies to apoD extend throughout the length of the apoD molecule. Detailed comparison of apoD to human retinol-binding protein (25), for example, shows 5 out of 6 identical residues near the amino terminus (extending to 20 out of 25 homologous residues if conservative substitutions are allowed), 10 out of 16 identical residues in a central region (17 out of 18 for conservative substitutions), and 7 out of 9 residues (16 out of 17 for conservative substitutions) in a more carboxyl-terminal region. The overall amino acid homology is approximately 25%.
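To put the regional matches in perspective against the overall figure of about 25%, the reported counts can be converted to percentages. This is a small illustrative sketch; the region labels and the helper `percent` are our own naming, and the counts are taken as stated in the text without any realignment.

```python
def percent(matches, window):
    """Percentage of matching residues within a comparison window."""
    return 100.0 * matches / window

# (region label, (identical, window), (conservative matches, window)),
# using the apoD vs. retinol-binding protein counts reported in the text.
regions = [
    ("amino-terminal", (5, 6), (20, 25)),
    ("central", (10, 16), (17, 18)),
    ("carboxyl-terminal", (7, 9), (16, 17)),
]

for name, identical, conservative in regions:
    print(
        f"{name}: {percent(*identical):.1f}% identical, "
        f"{percent(*conservative):.1f}% allowing conservative substitutions"
    )
```

Note that the identity and conservative-substitution windows differ in length within each region, so the two percentages for a region are not directly comparable.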

DISCUSSION
The apoD protein sequence derived from translation of the cloned cDNA sequence substantially agrees with the amino acid composition of the plasma-derived protein, as determined by two other independent groups (2,19). The numbers of amino acids per molecule of mature apoD that we derived by composition analysis and by cDNA sequencing (in parentheses) are as follows: Asx, 22.4 (23); Thr, 10.1 (11); Ser, 7.4 (7); Glx, 20.8 (19); Pro, 12.0 (12); Gly, 9.7 (6); Ala, 11.9 (10); Val, 12.0 (12); Cys, 4.3 (5); Met, 2.8 (3); Ile, 9.7 (11); Leu, 13.5 (15); Tyr, 5.4 (7); Phe, 5.1 (7); Lys, 11.0 (11); His, 1.8 (2); Arg, 3.6 (4); and Trp, not determined (4). This similarity of the predicted and directly measured amino acid composition, plus the agreement of extended sequences of apoD from peptide sequencing with the cDNA sequence (Fig. 2, Appendix), and the immunoreactivity and similar gel mobility of plasma-derived and recombinant DNA-derived proteins (Fig. 3) demonstrate that the sequence presented here corresponds to full-length native apoD. The accuracy of the DNA-predicted protein sequence was enhanced by the sequencing of three independent cDNA clones plus a genomic clone. The difference between the predicted protein mass (~20 kDa) and gel mobility reflects the extensive glycosylation of apoD, as reported before (19) and as indicated by our digestion of the purified glycoprotein by N-glycanase (data not shown). In addition, asparagine-linked glycosylation at the two predicted sites was indicated by peptide sequencing.
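The comparison of measured and sequence-derived composition can be tabulated in code. The sketch below is a minimal illustration using the values listed above; the dictionary layout and variable names are ours, and Trp is handled separately because it was not determined by composition analysis.

```python
# residue: (value from composition analysis, count from the cDNA sequence)
composition = {
    "Asx": (22.4, 23), "Thr": (10.1, 11), "Ser": (7.4, 7),
    "Glx": (20.8, 19), "Pro": (12.0, 12), "Gly": (9.7, 6),
    "Ala": (11.9, 10), "Val": (12.0, 12), "Cys": (4.3, 5),
    "Met": (2.8, 3),   "Ile": (9.7, 11),  "Leu": (13.5, 15),
    "Tyr": (5.4, 7),   "Phe": (5.1, 7),   "Lys": (11.0, 11),
    "His": (1.8, 2),   "Arg": (3.6, 4),
}
TRP_FROM_CDNA = 4  # Trp was not determined by composition analysis

# The cDNA-derived counts should add up to the length of the mature protein.
total_from_cdna = sum(count for _, count in composition.values()) + TRP_FROM_CDNA

# Absolute difference between the two methods for each residue type.
deviations = {aa: abs(measured - count) for aa, (measured, count) in composition.items()}
worst = max(deviations, key=deviations.get)

print("residues from cDNA:", total_from_cdna)
print("largest discrepancy:", worst, round(deviations[worst], 1))
```

The cDNA-derived counts sum to 169, the predicted length of the mature protein, and the largest single discrepancy between the two methods is for Gly (9.7 measured versus 6 predicted).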
The tissue distribution of apoD mRNA is distinct from that of most other apolipoproteins. Several apolipoproteins are synthesized primarily in the liver and intestine, which are the tissues known to be most active in the secretion of intact lipoproteins (20). Whereas the liver synthesizes and secretes apoD (24) as well as lecithin:cholesterol acyltransferase, the present studies suggest that other sites are more active in apoD synthesis. The functions of this distribution are not known.
Both the amino acid sequence and the mRNA distribution of apoD place it apart from the other apolipoproteins characterized to date. Comparison of the protein and gene sequence of the major plasma apolipoproteins has shown that several of these are related in that they share a repeated, 21-residue sequence whose consensus describes regions of amphipathic helix separated by proline residues (29). In particular, apolipoproteins A-I, A-IV, and E show a strong relationship likely to be derived from origin in a common ancestral gene (30). Analysis of the sequence of apoD shows no apparent homology with these other apolipoproteins. Whereas it is a relatively minor component of total plasma apolipoprotein, apoD is found completely in association with lipids in plasma in lipoproteins also containing apoA-I, the major protein of human high density lipoproteins (1). Even after treatment with chaotropic solutions such as 3 M NaSCN, apoD remains associated with the major lipids found in high density lipoprotein (19). By accepted criteria, therefore, apoD has the properties of a plasma apolipoprotein. However, since it contains little if any predicted amphipathic helix, its association with lipids probably has a different basis than that found in apolipoproteins such as apoA-I.
In fact, the homology of apoD to retinol-binding protein suggests such a different interaction between apoD and hydrophobic moieties. The crystallographic structure of retinol-containing human retinol-binding protein has been elucidated (27), and it reveals that this protein binds retinol in a pocket delineated by β-sheet structure, designated a flattened cone (28). We suggest that apoD may be associated with lipid via an analogous structure. Most, or essentially all, of the enzyme lecithin:cholesterol acyltransferase in plasma is found associated in lipoproteins containing apoD (1-3). The major sequence homology reported here between apoD and plasma retinol-binding protein seems to strengthen the concept that apoD plays a role in lipid transfer related to the enzyme reaction. It seems possible that apoD functions to bind either the substrates of the lecithin:cholesterol acyltransferase reaction (lecithin or free cholesterol) or the product of the reaction (cholesteryl ester). Elucidation of the precise function of apoD and its relation to the enzyme activity will be facilitated by the availability of pure recombinant apoD. In addition, expression of the cloned apoD gene should make it possible to perform in vitro mutagenesis and thereby define regions of the molecule which might affect lecithin:cholesterol acyltransferase activity.

FIG. 2. Sequence of human apolipoprotein D cDNA. Nucleotides (left of each row) are numbered from the 5' terminus of cDNA clone cAPOD.6. The complete amino acid sequence of apoD is shown above the DNA sequence. Negative amino acid numbers (above residues) refer to the leader peptide, whereas positive numbers refer to the mature protein. The polyadenylation signal hexanucleotide is double-underlined. Predicted N-linked glycosylation sites are overlined. Fragments from which direct amino acid sequence data were obtained are underlined. Circled numbers below these lines indicate sequences used to construct oligonucleotide probes which are discussed in the text. Complete DNA sequence was obtained on the three cDNA clones and the 3' genomic fragment shown in Fig. 1. Where they overlap, the four sequences were in complete agreement with the following exceptions: clone cAPOD.8 contains C at nucleotide 8 (shown below line), whereas cAPOD.6 and cAPOD.16 contain T at this position; clone cAPOD.16 begins at nucleotide 5; clones cAPOD.8, cAPOD.16, and the single genomic clone sequenced contain C at nucleotide 449, whereas cAPOD.6 contains T at this position, resulting in a Leu to Phe amino acid substitution at residue 110. (The C to T, Phe codon substitution at this position appears to be a rare variant; two other cDNA clones, a genomic clone, and 18 chromosomes tested by oligonucleotide probing of genomic digests all contain a CTC Leu codon at residue 110.) The 3'-untranslated region corresponds to the genomic sequence and agrees with the cDNA clones with the following exceptions in the C-rich region which lies about 70 nucleotides 3' of the stop codon: in cAPOD.6, the sequence beginning with the T marked with an asterisk is TAC---ATAAAGAC, followed by the end of the clone (the dashes indicate a three-base deletion); in cAPOD.8, the sequence is TACCCCACCCC, followed by the end of the clone; and in cAPOD.16 the sequence contains a C to G substitution (indicated by parentheses).
These slight differences may represent polymorphisms or may be the result of artifacts in cDNA cloning occurring in this unusual poly(C) stretch.
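The numbering convention of Fig. 2 lets a nucleotide position be related to an amino acid residue. The following sketch is illustrative and relies only on values stated in the text (initiator codon at positions 62-64, a 20-residue leader, a 169-residue mature protein); the function name `residue_at` is our own.

```python
ATG_START = 62         # first base of the initiator methionine codon
LEADER_RESIDUES = 20   # leader peptide, numbered -20 .. -1 in Fig. 2
MATURE_RESIDUES = 169  # mature protein, numbered 1 .. 169
CODING_END = ATG_START + 3 * (LEADER_RESIDUES + MATURE_RESIDUES) - 1

def residue_at(nucleotide):
    """Map a cDNA nucleotide number to (residue number, position within codon).

    Residue numbers follow the Fig. 2 convention: negative for the leader
    peptide, positive for the mature protein, with no residue 0.
    """
    if not ATG_START <= nucleotide <= CODING_END:
        raise ValueError("nucleotide lies outside the protein coding region")
    offset = nucleotide - ATG_START
    codon_index = offset // 3          # 0 is the initiator methionine
    codon_position = offset % 3 + 1    # 1, 2, or 3
    residue = codon_index - LEADER_RESIDUES
    if residue >= 0:
        residue += 1                   # skip the nonexistent residue 0
    return residue, codon_position

print(residue_at(62))    # initiator methionine, first codon position
print(residue_at(449))   # site of the C/T difference in cAPOD.6
```

With these assumptions, nucleotide 449 maps to the first position of the codon for residue 110, which is consistent with the CTC (Leu) to TTC (Phe) difference described in the legend.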