Structure and Expression of the Human Apolipoprotein A-IV Gene*

We have isolated the human apolipoprotein (apo) A-IV gene from a cosmid library and determined its complete nucleotide sequence. The gene contains three exons of 162, 127, and 1180 nucleotides separated by two introns of 357 and 777 nucleotides. A sequence polymorphism has been identified in the 3' noncoding portion of the third exon. The human apoA-IV gene lacks an intron in the area encoding the 5' nontranslated region of its mRNA, which distinguishes it from all the other human apolipoprotein genes whose sequences are known. Comparison matrix analysis of the human apoA-IV gene sequence revealed evidence for an ancestral 11-nucleotide repeat unit that spans the third exon. These repeated sequences are much more highly conserved than those present in either rat apoA-IV or in any other human apolipoprotein. Optimal alignments of the 5' flanking regions of the rat and human apoA-IV genes disclosed multiple deletions in the rat sequence as well as a highly conserved region of 90 nucleotides (90% sequence identity) located within 170 nucleotides of the start site of transcription. The 5' flanking regions of the human and rat apoA-IV genes were ligated to the bacterial chloramphenicol acetyltransferase gene, then transfected into different cultured cells. The apoA-IV gene sequences elicited preferential expression of chloramphenicol acetyltransferase activity when introduced into intestinally derived Caco-2 cells and liver-derived Hep-G2 cells, consistent with the tissue specificity of the native gene. Analysis of deletion mutants of the human apoA-IV 5' flanking region indicated that regions from -293 to -233 and from -127 to -60 upstream of the transcription start site contain sequences required for maximum gene expression. These findings on the structure and expression of rat and human apoA-IV should prove useful in studying the control of the apoA-IV gene.
The primary translation product of human apolipoprotein (apo)¹ A-IV mRNA is a 396-residue preprotein (1, 2). After co-translational proteolytic processing of its 20-residue signal peptide, the protein is secreted in association with chylomicron particles from its principal site of synthesis, the intestine (reviewed in Ref. 3). The precise function of apoA-IV is not known, although it has been shown to be a potent activator of lecithin-cholesterol acyltransferase in vitro (4). One feature that distinguishes apoA-IV from other apolipoproteins is its low affinity for plasma lipoproteins (5-8). Apolipoprotein A-IV dissociates quickly from the chylomicron surface after this particle enters the circulation (9); some of the "free" apoA-IV may subsequently be integrated into high density lipoproteins (9). While the unassociated pool of apoA-IV exhibits reduced affinity for plasma lipoprotein particles compared with other well-characterized human apolipoproteins (A-I, A-II, B, C-I, C-II, C-III, and E), the structural basis for this distinctive property is not understood. Recent analyses of the human apoA-IV sequence (1, 10) have shown that it is composed of 14.5 tandemly repeated docosapeptides that have the potential to form amphipathic α-helices (1). Thus, a paradox exists: the number of such helical elements present in apoA-IV is greater than in any of the other plasma apolipoproteins listed above, yet it does not associate stably with the surfaces of plasma lipoproteins.

* This work was supported in part by Grants AM 31615, AM 30292, and HL 37063 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
[...] to the GenBank™/EMBL Data Bank with accession number(s) [...]
‖ Supported by Medical Scientist Training Program Grant GM 072000 and a Gerty T. Cori Predoctoral Fellowship from Sigma.
** An Established Investigator of the American Heart Association.
‡‡ To whom correspondence should be addressed.
¹ The abbreviation used is: apo, apolipoprotein.
Phylogenetic analyses of the apolipoproteins have shown that apoA-IV, A-I, and E diverged from a common ancestral sequence, and that they are more closely related to each other than to any of the other apolipoproteins (11-13). The structure of the rat apoA-IV gene has recently been described (12). As in the human apoA-I and E genes, sequences coding for the repeated amphipathic docosapeptides in rat apoA-IV span the last two exons (12). Unlike other known mammalian apolipoprotein genes, which contain four exons and three introns, the rat apoA-IV gene lacks the intron that splits the 5' nontranslated regions of these apolipoprotein mRNAs. The absence of this intron could have arisen by a gene conversion event (12). However, it was unclear whether this is a unique feature of the rat gene or a general property of mammalian apoA-IV genes.
We have determined the structure and expression of the human apoA-IV gene. Our analysis of this gene sequence has focused on the following questions. How does the organization of the human apoA-IV gene compare with that of the other human apolipoprotein genes? Are the repeated docosapeptide sequences encoded by this gene more conserved or less conserved than those present in the other human apolipoproteins? Is there evidence for an 11-nucleotide ancestral unit in this gene, as has been found in apoE (14)? Does the relative degree of conservation of the repeated docosapeptide sequences provide any clues as to why apoA-IV does not associate stably with lipid surfaces or as to the validity of the amphipathic helix as a structural explanation for lipid-protein interactions? Finally, we have examined the 5' flanking region of the apoA-IV gene to investigate its cell-specific expression as well as the functional domains that determine promoter activity.

EXPERIMENTAL PROCEDURES
Screening of Human Genomic Library-Five hundred thousand clones in a human cosmid genomic library (kindly provided by Dr. Chris Lau, University of California, San Francisco) were screened with a 32P-labeled human apoA-IV cDNA probe using the conditions described in Ref. 1. Three positive recombinants, with an average insert length of 34 kilobases, were identified. These DNAs were digested with a variety of restriction endonucleases, and Southern blots were prepared (15). When the 32P-labeled human apoA-IV cDNA (1) was used to probe these blots, it was apparent that all three cosmids had inserts that contained the complete apoA-IV gene, and that the restriction maps of the apoA-IV gene in each insert were identical (data not shown). Therefore, a single cosmid, pHAIVG52, was employed for all subsequent studies.
Restriction fragments derived from this cosmid DNA were subcloned in bacteriophage M13 mp18 and mp19 prior to subsequent nucleotide sequence analysis by the dideoxy chain termination method (16). Initially, the sequences of both ends of these fragments were defined using the universal M13 primer. The resultant partial DNA sequences were subsequently used for designing synthetic oligonucleotides for further sequence analysis. This strategy was followed for both strands of the human apoA-IV gene. All oligodeoxynucleotides were produced by an Applied Biosystems (Foster City, CA) Model 380A synthesizer.
Primer Extensions-To determine the initiation site for transcription, a 27-nucleotide primer was synthesized that was complementary to a region beginning 79 nucleotides upstream from the initiator methionine codon of apoA-IV mRNA (1). The 27-mer was labeled at its 5' end using polynucleotide kinase and was subsequently hybridized to 2.5 µg of poly(A)-containing RNA from human small intestine. The 10-µl hybridization reaction mixture contained, in addition to primer and template, 5 mM KCl, 5 mM MgCl2, 40 mM dithiothreitol, and 5 mM Tris at pH 8.3. The solution was incubated for 30 min at 60 °C and then for 15 min at 42 °C. After hybridization, 2 µl of 5 mM dATP, dTTP, dCTP, and dGTP, as well as 21 units of reverse transcriptase, were added to the reaction. The mixture was incubated for an additional 30 min at 42 °C. After ethanol precipitation, the sample was dissolved in 99% formamide containing bromphenol blue (0.03%) plus EDTA (0.75%) and heated for 3 min at 95 °C. The reaction products were analyzed on an 8% polyacrylamide sequencing gel (17).
Computer-assisted Comparative Sequence Analyses-All computations were carried out on a MicroVAX II computer (Digital Equipment Corp.). The programs NUCALN and PRTALN (17) were used to compute optimal alignments of nucleic acid and protein sequences, respectively. The FASTN program (18) was used to search the Genetic Sequence Data Bank (GenBank).
Construction and Characterization of Deletion Mutants-XbaI/SacI (-893 to 26) and BglII/XmnI (-800 to 21) restriction fragments were isolated from the human and rat apoA-IV genes, respectively. These fragments were ligated into the polylinker region of the plasmid pLSI (pLSI is a modified pTE (19) lacking the thymidine kinase promoter). Deletions in the 5' portions of the human apoA-IV sequences were introduced either by restriction endonuclease digestion or by digestion with Bal-31 and DNA polymerase I. The digested fragments were ligated to BglII linkers and inserted into the BglII/SacI site of pLSI, adjacent to the bacterial chloramphenicol acetyltransferase (CAT) gene (19). These constructions were introduced into different cell lines and examined for chloramphenicol acetyltransferase activity. A calcium phosphate co-precipitate containing 10 µg of DNA was added to different cultured mammalian cell lines. Cells were collected 48 h after addition of the DNA, and extracts were prepared by freezing and then heat shock at 60 °C (20). The reaction mixture (20) contained Tris (pH 7.8), 140 mM; acetyl coenzyme A, 0.44 mM; [14C]chloramphenicol (40-60 mCi/mmol, New England Nuclear), 0.2 mCi; and cell extract (50 µg of total protein). The reactions were allowed to proceed for up to 60 min. Samples were extracted with 1 ml of ethyl acetate. The solution was dried down, and the residue was redissolved in 10 µl of ethyl acetate and analyzed by ascending thin-layer chromatography using chloroform/methanol (95/5, v/v). The chromatograms were subjected to autoradiography. Chloramphenicol acetyltransferase activity was quantitated by counting scraped regions of the chromatograms in a liquid scintillation spectrometer.
All chloramphenicol acetyltransferase activity measurements were normalized for the differences in transfection efficiency between cells by co-transfecting the cells with a plasmid containing the β-galactosidase gene under the control of the Rous sarcoma virus promoter and measuring the total β-galactosidase activity in cell protein extracts as described (21).
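The normalization described above can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure: the function names, the sample readings, and the cell line labeled "other" are all hypothetical. It assumes that normalization means dividing the chloramphenicol acetyltransferase counts by the β-galactosidase activity measured in the same extract, and then expressing each value as a percentage of a reference line (Caco-2 here).

```python
def normalized_cat_activity(cat_counts, bgal_activity):
    """CAT counts per unit of co-transfected beta-galactosidase activity."""
    if bgal_activity <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return cat_counts / bgal_activity


def percent_of_reference(samples, reference):
    """Express each normalized activity as a percentage of the reference line."""
    ref = normalized_cat_activity(*samples[reference])
    return {
        name: 100.0 * normalized_cat_activity(cat, bgal) / ref
        for name, (cat, bgal) in samples.items()
    }


# Hypothetical readings: (acetylated chloramphenicol cpm, beta-gal activity).
example = {
    "Caco-2": (12000.0, 1.0),
    "Hep-G2": (9000.0, 0.9),
    "other": (500.0, 1.25),  # placeholder for a non-expressing cell line
}
relative = percent_of_reference(example, "Caco-2")
```

Dividing by the internal control first means that a cell line that simply takes up less DNA is not mistaken for one in which the promoter is less active.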

RESULTS AND DISCUSSION
Nucleotide Sequence Analysis-The strategy used to determine the complete nucleotide sequence of the human apoA-IV gene is shown in Fig. 1. A comparison of this sequence with the previously determined nucleotide sequence of human apoA-IV cDNA (1, 10) disclosed that the gene contains three [...] interrupts both genes at precisely the same place: within the glycine codon corresponding to position -4 of the signal peptide (Fig. 3A). Alignment of the two signal peptides disclosed a sequence identity of 81%.² Both the human and rat apoA-IV genes lack an intron within the 5' noncoding region of the corresponding mRNA, making them unique among apolipoprotein genes (see Ref. 12). In all other defined mammalian apolipoprotein gene sequences (12), an intron is located about 20 nucleotides upstream from the mRNA translation start site. The lack of this intron in both apoA-IV genes raises the possibility that it was deleted during the postulated duplication event(s) (13) that gave rise to the apoA-IV and apoA-I genes. Because of this common ancestor, the two introns in the human apoA-IV gene are located in positions similar to those in other known major apolipoprotein genes. To illustrate this observation, the structure of the human apoA-IV gene is compared with that of the apoC-III (24) and apoA-I (25) genes (Fig. 3B) to which it is linked closely (1, 26) on chromosome 11 (27). This similarity in structure is consistent with the notion that these apolipoprotein genes evolved from a common ancestral gene through a series of gene duplications, deletions, and chromosome translocations (11-13).

² PRTALN was used for this analysis. The alignment parameters selected included a k-tuple of 1, a window size of 10, and a gap penalty of 3.

[...] site of transcription, indicated by the arrow (↓), was determined by primer extension analysis (see "Experimental Procedures"). The TATA box is denoted by a solid rectangle in the 5' flanking region, and the AATAAA polyadenylation signal is denoted by the striped rectangle in the third exon. The beginning and end of the gene, corresponding to the mRNA product, as well as the exon-intron junctions are indicated by marks (A) below the sequence. Blanks at these positions and between amino acid codons are for clarity and do not indicate missing nucleotides. The single asterisk represents the start of the mature protein, and the three asterisks represent the stop codon.

FIG. 4. Intrasequence comparison matrix analyses of human apoA-IV.
A, a self-comparison of the human apoA-IV gene nucleotide sequence is shown. A span length of 11 nucleotides was used. The unitary matrix (31) was used to score comparisons between spans of nucleotides. Only spans reaching or exceeding a score that had a probability of occurring by chance alone of less than 1 in 10 were plotted. Spans exceeding the predetermined threshold were plotted with a point that indicates only the center of the aligned span. B and C, a self-comparison of the amino acid sequence of preapoA-IV using a span length of 23 residues is shown. The PAM250 mutation data matrix (31) was used to score comparisons between spans. A threshold was selected for plotting such that in infinite random sequences with the same amino acid composition as this protein, the probability of achieving the threshold score was less than 1 in 500 (B) or 1 in 100 (C).

Self-comparison statistics for human and rat apolipoprotein A-IV genes
Spans of 23 residues were used to generate comparison matrices. Matches along the main diagonal (representing identity) were not counted in calculating the score distributions. The number of spans in a comparison matrix equals the square of the sequence length minus a correction factor for incomplete spans located at the edges of the matrix (33, 34). Expected frequencies were calculated from the double matching probability (33, 34) with edge corrections (30). The comparison span score equals the sum of amino acid pair similarity scores across the span. These similarity scores are based on the PAM250 amino acid replaceability matrix (31).
The second intron of the human apoA-IV gene interrupts the codon for Asn-39 in the mature protein-coding portion of the gene. Thus, the second exon specifies the last 4 residues of the signal peptide and the first 39 residues of the mature plasma protein. The amino-terminal oligopeptide domains encoded by the second exons of the human and rat genes exhibit a higher degree of sequence identity (80%) than those encoded by their third exons (61%). The first and second introns in the human gene are, respectively, 80 and 104 nucleotides larger than those in the rat gene (see Fig. 3A). The human apoA-IV gene sequence was compared with the primate, rodent, and mammalian DNA libraries contained in GenBank. No Alu family sequences were found in any region of this gene or its defined flanking sequences. Such sequences have been previously noted in the human apoE gene sequence (28) and in the human apoC-II gene sequence (29).
Comparison of the human apoA-IV exonic sequences with the nucleotide sequence of human apoA-IV cDNA (1) revealed only one difference. The portion of the third exon that specifies the 3' nontranslated region of apoA-IV mRNA contained four tandem repetitions of the sequence TGTC (beginning at nucleotide position 2494, Fig. 2), whereas this sequence was repeated only three times in the cDNA sequence. The gene sequence was unambiguous, having been confirmed on both strands. In addition, a second apoA-IV gene clone (pHAIVG51, Ref. 1) from the same genomic library was examined and found to have the same nucleotide sequence in this region. The cDNA sequence was also unambiguous, since it was confirmed on both strands of three independent recombinants (1). Therefore, it is unlikely that this discrepancy represents a cloning artifact. Furthermore, the nucleotide sequence that we observed for the potentially polymorphic site in the gene agrees with the corresponding sequence of an apoA-IV mRNA that was determined independently for a different individual (10). Because the genomic and cDNA libraries were prepared from different individuals, the sequence variation probably represents a naturally occurring polymorphism.
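The TGTC difference between the gene and the cDNA amounts to a count of back-to-back copies of a short unit. A minimal Python sketch of such a count follows. The helper name and the flanking bases in the two test strings are hypothetical; only the repeat unit (TGTC) and the copy numbers (four in the gene, three in the cDNA) come from the text.

```python
import re


def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile("(?:%s)+" % re.escape(unit))
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical flanking bases; only the TGTC copy numbers come from the text.
gene_like = "AAGC" + "TGTC" * 4 + "CAGG"
cdna_like = "AAGC" + "TGTC" * 3 + "CAGG"

gene_copies = longest_tandem_run(gene_like, "TGTC")
cdna_copies = longest_tandem_run(cdna_like, "TGTC")
```

A count of this kind only flags the difference. Distinguishing a polymorphism from a cloning artifact still rests on the independent confirmations described in the paragraph above.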
Repeated Sequences in the Human ApoA-IV Gene-The human apoA-IV gene sequence was compared against itself using the comparison matrix algorithm of McLachlan (30) and a span length of 11 nucleotides (Fig. 4A). In this analysis, similarities between nucleotide spans of the chosen length are evaluated by determining similarity scores for each span (30, 31). The scores above a statistically significant threshold are plotted as a single point representing the center of the span. Because only the center of a span is plotted, the comparison matrix program requires that the span length be an odd integer. It is apparent in Fig. 4A that the third exon of this gene contains multiple repeats, indicated by the cluster of short diagonals offset from the main diagonal. A similar distribution was obtained when longer span lengths (23, 45, and 67 residues) were analyzed and when the threshold scores for plotting were raised (data not shown). Measurements of the displacements of these shorter diagonals from each other and from the main diagonal indicated that an 11-nucleotide repeat was present in this exon of the apoA-IV gene. These repeats cannot be detected in the second exon.
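The span-based self-comparison described above can be illustrated with a short Python sketch. It is not the program used in the paper: it scores each pair of spans with a simple identity count (a unitary matrix), applies a fixed score cutoff instead of a probability-derived threshold, and runs on a toy sequence built from a hypothetical 11-nucleotide unit. Function names are assumptions made for the illustration.

```python
def self_comparison_points(seq, span=11, threshold=9):
    """Return (i, j) span centers whose identity count meets the threshold.

    Only the upper triangle (j > i) is scanned, so the main diagonal,
    which trivially matches itself, is excluded.
    """
    if span % 2 == 0:
        raise ValueError("span length must be odd so each span has a center")
    half = span // 2
    last_start = len(seq) - span
    points = []
    for i in range(last_start + 1):
        for j in range(i + 1, last_start + 1):
            score = sum(1 for k in range(span) if seq[i + k] == seq[j + k])
            if score >= threshold:
                points.append((i + half, j + half))
    return points


def offset_histogram(points):
    """Count plotted points by their displacement from the main diagonal."""
    counts = {}
    for i, j in points:
        counts[j - i] = counts.get(j - i, 0) + 1
    return counts


unit = "GAGCTGAAGGC"  # hypothetical 11-nucleotide unit
toy = unit * 4        # four perfect tandem copies
hist = offset_histogram(self_comparison_points(toy, span=11, threshold=11))
```

In this toy case the points fall only on diagonals displaced by 11, 22, and 33 positions, which is the kind of regular spacing from which a repeat length is read off the matrix. Real sequences with diverged repeats need a lower threshold and a statistical criterion for choosing it.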
Analyses of repeated sequences in rat apoA-IV by our group (13, 32) and others (12) have used amino acid sequences because they permit a more sensitive analysis. This extra sensitivity is helpful because Luo et al. (12) have recently shown that the rate of nucleotide substitution among apolipoprotein genes is higher than the average rate observed in other mammalian genes. This difference may reflect the fact that selective pressures act only to conserve a specific pattern of generic lipophilic and hydrophilic side chains as opposed to the conservation of a specific sequence of amino acids. The net result is that codons that specify the components of amphipathic helices may evolve rapidly (12, 13), obscuring or eliminating evidence for repeats at the DNA sequence level in some regions of the apoA-IV gene. Therefore, to test further for repeat units, intrasequence comparison matrices of the human apoA-IV protein were generated (Fig. 4B) using a span length of 23 residues and several different plotting thresholds.
The results show that repeated sequences overlap the junction between the second and third exons, although the repeating pattern is weak in the second exon as compared with the third exon (compare B and C of Fig. 4). The diagonal offset spacing of the lines in the third exon indicates that homologous repeated sequences of 11 or 22 residues account for about 80% of its length. Taken together, these matrix plots suggest that repeat units in the second exon have evolved at a different rate than the repeat units of the third exon. Similar observations have been made by our group (13) and Luo et al. (12) for other human apolipoprotein genes.

[...] rat, and mouse apoA-IV genes. Optimal alignments were generated using NUCALN (17), a k-tuple of 3, a window size of 20, and a gap penalty of 7. Dashed lines, indicating hypothetical deletions, were placed in the sequences to achieve maximum homology. The numbers indicate the nucleotide positions of the human sequence upstream from the start of transcription. A colon between the rat and human sequences indicates an identity between their nucleotides. A colon between the mouse and rat sequences indicates a nucleotide position that is identical in mouse, rat, and human genes. A period between the mouse and rat sequences indicates an identity between nucleotides of only these two genes.
The relative sequence conservation among repeat units within the human and rat apoA-IV polypeptides was indicated by the probability distribution of comparison span scores observed in the self-comparison matrix analysis (Table I). A comparison score equals the sum of similarity scores (31) for amino acid pairs contained within the span. The frequency of an observed score was compared with the expected frequency of that score in randomly shuffled sequences having the same amino acid composition as authentic apoA-IV. The repeated sequences in human apoA-IV achieved higher overall comparison scores than the corresponding sequences in rat apoA-IV. For example, there were 34 spans in the human sequence that had comparison scores greater than 295, whereas this was the highest score achieved for spans in the rat apoA-IV intrasequence comparison matrix. The ratio of observed/expected scores for the highest scoring spans in rat apoA-IV (i.e. the four spans with scores of 295) was about 10-fold lower than the ratio for spans with the same score in human apoA-IV. These data indicate that the repeat units in human apoA-IV are much more highly conserved with respect to one another than the repeat units present in the rat protein.
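The comparison of observed and expected span scores can be sketched as follows. This is a simplified stand-in, not the calculation reported in Table I: the expected frequency is estimated by Monte Carlo shuffling and not from the double matching probability (33, 34), the scoring function is a plain identity score, not PAM250, and the toy peptide, span length, and function names are hypothetical.

```python
import random


def span_scores(seq, span, score):
    """Scores of all off-diagonal span pairs (i < j) in a self-comparison."""
    n = len(seq) - span + 1
    return [
        sum(score(seq[i + k], seq[j + k]) for k in range(span))
        for i in range(n)
        for j in range(i + 1, n)
    ]


def observed_vs_expected(seq, span, cutoff, score, shuffles=200, seed=0):
    """Count spans scoring >= cutoff, and the mean count in shuffled copies.

    Shuffling preserves composition, so the expected count reflects what
    composition alone would produce.
    """
    rng = random.Random(seed)
    observed = sum(s >= cutoff for s in span_scores(seq, span, score))
    letters = list(seq)
    total = 0
    for _ in range(shuffles):
        rng.shuffle(letters)
        total += sum(s >= cutoff for s in span_scores(letters, span, score))
    return observed, total / shuffles


def identity(a, b):
    """Unitary score: 1 for identical residues, 0 otherwise (not PAM250)."""
    return 1 if a == b else 0


toy_peptide = "ADEFGHIKLMN" * 3  # hypothetical 11-residue unit, three copies
observed, expected = observed_vs_expected(toy_peptide, span=5, cutoff=5,
                                          score=identity)
```

The two counts are returned separately, without dividing one by the other, because the shuffled estimate can be zero for a high cutoff. A ratio is meaningful only when enough shuffles are run to give a nonzero expectation.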
The self-comparison matrix statistics for human and rat apoA-IV were compared with the previously published analysis of human apoA-I and apoE (13). The degree of sequence conservation among the repeat units in human apoA-IV is considerably higher than in human apoA-I. None of the comparison span scores for human apoA-I exceeded 288 (13). Furthermore, the ratio of observed/expected scores for the highest scoring spans in apoA-I was over 300-fold lower than the ratio for spans with the same score in human apoA-IV. There was even less sequence conservation among repeats in human apoE than in human apoA-I (13). Thus, among all the mammalian apolipoproteins whose complete primary structures have been defined, human apoA-IV has the greatest number of repeat units as well as the highest degree of sequence conservation among those repeats.

[...] constructed, and the chloramphenicol acetyltransferase activity was determined as described under "Experimental Procedures." The conserved region present in the apoA-IV gene of humans and rats is represented by the solid box in the 5' flanking region.

Expression of rat and human apoA-IV promoter-directed chloramphenicol acetyltransferase (CAT) gene in cultured cells
Expression of apoA-IV promoter in cell culture was performed using the human and rat apoA-IV 5' flanking region constructs described under "Experimental Procedures." These constructs were introduced into various cell lines, and transient expression of chloramphenicol acetyltransferase activity was measured (see "Experimental Procedures"). The data are expressed as a percentage of activity found in the human intestine-derived cell line Caco-2. The Rous sarcoma virus (RSV)-β-galactosidase vector was used as an internal control for the comparison of the activities of chloramphenicol acetyltransferase vector products as described (37). The actual expression of Rous sarcoma virus-directed β-galactosidase in different cells as compared with that of Caco-2 cells is shown.

Relative expression of apoA-IV mRNA in rat and human tissue
The expression of apoA-IV gene in vivo was performed using aliquots of total cellular RNA taken from the listed tissues, applied to nitrocellulose filters, and examined by hybridization and autoradiograms as described previously (1).
The relatively high degree of sequence conservation should not be taken to mean that there are no structurally distinctive features in individual (amphipathic) docosapeptide repeat units. Weinberg and Spector (8, 35) have noted that the interaction between human apoA-IV and lipoproteins can be influenced by the composition of the particle surface as well as by its propensity to self-associate (with high affinity) into dimeric forms. They have calculated the average hydrophobicity and helical hydrophobic moment in the multiple α-helical segments that make up human and rat apoA-IV. They found that those amphipathic α-helical segments located in the amino-terminal and carboxyl-terminal regions of the human protein have relatively higher hydrophobicity and hydrophobic moment (35, 36). These observations suggest that these segments with distinctive hydrophobic α-helical properties could be responsible for apoA-IV's high affinity self-association. Moreover, their physical-chemical characteristics in the human (as compared with the rat) apolipoprotein could favor this intermolecular reaction over one involving apoA-IV and the surfaces of human plasma lipoproteins (36).

Analysis of the 5' Flanking Region of the ApoA-IV Gene-
Previously, we have used RNA blot hybridization to compare the expression of the apoA-IV gene in a variety of rat, marmoset, and human tissues (1). One obvious difference between rodents and primates is that apoA-IV mRNA is abundant in rat liver but not in the livers of the other two species. It was of interest, therefore, to compare the 5' nontranscribed regions of the human and rat genes using the following two approaches. Alignment of the rat and human 5' flanking region (Fig. 5) revealed a typical TATA box beginning at comparable positions in both genes. Two other features shown in the alignment are particularly noteworthy. First, a region extending from nucleotide -77 to -167 in the human gene exhibits striking sequence similarity (90% identity) to its rat homolog. This high level of extended sequence conservation in gene 5' flanking regions between species is unusual. A further comparison to the promoter region of the mouse apoA-IV gene (37) reveals the same striking homology in this domain. Second, the human gene contains a large insertion that spans positions -340 to -399, which is surrounded by regions of 49% homology in the flanking 50 nucleotides. One hypothesis arising from this comparison is that these various conserved domains and/or the multiple insertions and deletions could be important in determining the observed patterns of apoA-IV tissue-specific expression in rodents and primates. Therefore, we initiated a series of in vitro studies to identify functionally important domains in the 5' flanking region of the orthologous human and rat apoA-IV genes. DNA sequences containing the 5' flanking region of the human and rat apoA-IV genes (i.e. 893 and 800 nucleotides, respectively, upstream from the start of transcription) were fused to the 5' end of the CAT gene (Fig. 6). 
These recombinants were introduced into different cell lines, and the transient expression of the CAT gene, mediated by the promoter activity of the apoA-IV gene fragments, was measured (Table II). To assess potential differences in cell transfection efficiency, all cells were transfected with a plasmid containing the Escherichia coli β-galactosidase gene linked to the Rous sarcoma virus promoter (38). The β-galactosidase activity in the various cells examined did not vary by more than 30% (0.3-fold) from the activity in Caco-2 cells (Table II).
The results in Table II show that the 5' flanking regions of the rat and human apoA-IV genes exhibit the greatest promoter activity in cultured cells derived from tissues that have been shown previously (1, 39) to express the gene (Table III). Both the rat and human apoA-IV gene fragments were at least 10-fold more active in the human hepatoma cell line Hep-G2 and the intestinally derived cell line Caco-2 than in the other cells tested. These data are consistent with in vivo results in the rat (39), which show that the liver and intestine are the only organs that express significant levels of apoA-IV. The human apoA-IV promoter was expressed about 20% more efficiently than the rat apoA-IV promoter in the Caco-2 cells (data not shown). The finding that the human apoA-IV gene promoter elements function in Hep-G2 cells was not unexpected because apoA-IV mRNA was detected in these cells at a relative level of 15% of that of the small intestine (data not shown). However, the normal human liver has only a very low level of apoA-IV mRNA. These differences in the efficiency of the human apoA-IV promoter elements in normal liver and neoplastic hepatoma cells may reflect a number of physiological differences.
To define further the location of functionally significant sequences in the 5' flanking domain of the human apoA-IV gene, additional mapping studies were performed. Fig. 6 shows that deletions of nucleotides -893 to -293 resulted in no significant change in apoA-IV promoter activity in Hep-G2 cells. However, deletion of the region from -293 to -233 resulted in a 50% reduction in the promoter activity. Deletion of the region from -127 to -60 resulted in a dramatic reduction in the promoter activity. This encompasses most of the regions of the orthologous rat and human apoA-IV genes that show exceptionally high sequence conservation (Fig. 5). These data therefore support the hypothesis that this domain represents a region of functional importance for efficient gene expression.
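The logic of the deletion mapping can be made explicit with a small Python sketch. It takes promoter activity measured for constructs with progressively shorter 5' flanking regions and reports each interval whose removal lowers activity by at least a chosen fraction. The function name and the 25% cutoff are assumptions. The activity values are illustrative: the values at -893, -293, and -233 follow the text (no change, then a 50% reduction), whereas those at -127 and -60 are hypothetical placeholders for the "dramatic reduction" that the text does not quantify.

```python
def regions_required(activity_by_endpoint, min_fractional_loss=0.25):
    """Return (upstream, downstream) intervals whose deletion lowers activity.

    `activity_by_endpoint` maps the 5' endpoint of each construct (negative
    coordinates relative to the transcription start) to relative activity.
    """
    endpoints = sorted(activity_by_endpoint)  # most upstream endpoint first
    hits = []
    for upstream, downstream in zip(endpoints, endpoints[1:]):
        before = activity_by_endpoint[upstream]
        after = activity_by_endpoint[downstream]
        if before > 0 and (before - after) / before >= min_fractional_loss:
            hits.append((upstream, downstream))
    return hits


# Relative activity (% of the -893 construct). Values at -127 and -60 are
# hypothetical placeholders; the text reports only a "dramatic reduction".
activity = {-893: 100.0, -293: 100.0, -233: 50.0, -127: 50.0, -60: 5.0}
required = regions_required(activity)
```

Each interval is judged against the construct immediately upstream of it, so a region is flagged only when its own removal causes a further loss and not because of a drop inherited from an earlier deletion.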
The loss of chloramphenicol acetyltransferase activity in transfected human hepatoma cells associated with progressive deletion of the 5' flanking region of the apoA-IV gene suggests that specific control elements are located within 300 base pairs of the transcription start site and may be part of a positive regulatory system that modulates expression of this human apolipoprotein gene. Although our data are consistent with the notion that sequences affecting cell-specific expression are situated within 900 nucleotides of the transcription initiation site, further mapping studies employing experimental strategies similar to those described here will be required to delineate their location in the apoA-IV gene.