Identification of regulatory sequences in the gene for 5-aminolevulinate synthase from rat.

The housekeeping enzyme 5-aminolevulinate synthase (ALAS) regulates the supply of heme for respiratory cytochromes. Here we report on the isolation of a genomic clone for the rat ALAS gene. The 5'-flanking region was fused to the chloramphenicol acetyltransferase gene and transient expression analysis revealed the presence of both positive and negative cis-acting sequences. Expression was substantially increased by the inclusion of the first intron located in the 5'-untranslated region. Sequence analysis of the promoter identified two elements at positions -59 and -88 bp with strong similarity to the binding site for nuclear respiratory factor 1 (NRF-1). Gel shift analysis revealed that both NRF-1 elements formed nucleoprotein complexes which could be abolished by an authentic NRF-1 oligomer. Mutagenesis of each NRF-1 motif in the ALAS promoter gave substantially lowered levels of chloramphenicol acetyltransferase expression, whereas mutagenesis of both NRF-1 motifs resulted in the almost complete loss of expression. These results establish that the NRF-1 motifs in the ALAS promoter are critical for promoter activity. NRF-1 binding sites have been identified in the promoters of several nuclear genes encoding mitochondrial proteins concerned with oxidative phosphorylation. The present studies suggest that NRF-1 may co-ordinate the supply of mitochondrial heme with the synthesis of respiratory cytochromes by regulating expression of ALAS. In erythroid cells, NRF-1 may be less important for controlling heme levels since an erythroid ALAS gene is strongly expressed and the promoter for this gene apparently lacks NRF-1 binding sites.

(Received for publication, June 29, 1992)

Giovanna Braidotti, Iain A. Borthwick‡, and Brian K. May§
The first step of the heme biosynthetic pathway in animal cells is catalyzed by the mitochondrial matrix enzyme 5-aminolevulinate synthase (ALAS),¹ which converts glycine and succinyl-CoA to 5-aminolevulinate. In the liver and probably all other tissues, this enzyme is rate-limiting, and levels of ALAS will therefore determine the availability of cellular heme (1). Animal cells must synthesize heme for respiratory cytochromes and other hemoproteins. In the body, most of the total heme is synthesized by erythroid cells for assembly into hemoglobin, although substantial amounts of heme are also made by the liver for cytochrome P450 proteins, particularly when the levels of these proteins are induced by foreign chemicals (2).

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) X66736.

§ To whom correspondence should be addressed. Tel.: 61-8-228-5352.

¹ The abbreviations used are: ALAS, 5-aminolevulinate synthase; kb, kilobase(s); CAT, chloramphenicol acetyltransferase; bp, base pair(s); CHO, Chinese hamster ovary; NRF-1, nuclear respiratory factor 1.
Recently, it has been established that there are two ALAS isozymes encoded by distinct genes in the mouse (3), chicken (4), and human genome (5, 6). One gene encodes an isozyme that is expressed exclusively in erythroid cells (2, 6), and this gene is activated during erythropoiesis to provide the large amounts of heme needed for hemoglobin formation. The genes for mouse (3) and human (6) erythroid ALAS have been characterized and the human gene located to the X-chromosome (5, 7). The second ALAS gene encodes a housekeeping isozyme, and this gene is apparently expressed in all tissues (2). Clones for the housekeeping ALAS isozyme have been isolated from rat (8), chicken (9), and human (10) liver cDNA libraries, and the chicken ALAS gene has been characterized (11). The human gene has been localized to chromosome 3 (7, 12). The rate of transcription of the housekeeping ALAS gene is greatly increased in the liver following administration to animals of foreign chemicals such as phenobarbital, and this induction presumably meets the increased demand for heme by induced hepatic cytochrome P450 apoproteins (13).
We are investigating the molecular mechanisms which regulate basal expression of the housekeeping ALAS gene in all cell types and induction of this gene by foreign chemicals. In earlier work, we characterized the chicken housekeeping ALAS gene (11), and we now report the isolation of a genomic clone for rat housekeeping ALAS. By transient transfection experiments and gel shift assays, we have identified regions in the rat ALAS gene promoter and first intron that are important for expression in different cell types and have investigated in detail the functional roles of two nearly identical cis-acting elements in the early promoter.

Isolation, Characterization, and Sequencing of Genomic Clones-A rat genomic library was a kind gift from Dr. J. Bonner (14). The library was generated by a partial HaeIII digest of high molecular weight rat DNA. Digestion products were cloned into the λ-phage derivative Charon 4A following the addition of EcoRI linkers. A rat liver ALAS cDNA clone p101B1 (8) radiolabeled by nick translation using [α-32P]dATP was used to screen approximately 1.5 × 10⁶ plaques immobilized on nitrocellulose membranes by the filter hybridization method (15). After two rounds of screening, a single recombinant λ clone was purified as described (15), and a 13-kb EcoRI insert subcloned into pUC19 to produce pRG-1. Restriction mapping and Southern blot analysis were performed essentially as described by Maniatis et al. (15). Restriction fragments corresponding to the 5'-end of the structural gene and to the 5'-flanking region were isolated and ligated into M13mp18 and M13mp19. Single-stranded DNA was sequenced by the dideoxy chain termination procedure (16).
The extension reaction was performed at 42 °C for 1 h. The samples were then ethanol precipitated, and the pelleted nucleic acid was dried and resuspended in 5 µl of water and 5 µl of formamide load buffer. The extension products were analyzed by electrophoresis on 6% acrylamide, 7 M urea gels with molecular size markers of 32P-labeled HpaII-cut pUC19 (end-filled with [α-32P]dATP) and an M13mp18 DNA sequence ladder.
RNase Protection Analysis-The 1.4-kb PstI fragment corresponding to region -479 to 971 bp from the rat genomic clone was inserted into the PstI site of pSP64. This construct was linearized at the BamHI site, allowing radiolabeled anti-sense RNA transcripts, corresponding to the region 493-971 bp, to be generated with SP6 RNA polymerase and [α-32P]rUTP as described by Krieg and Melton (20). Full-length transcripts (478 nucleotides) were excised from a 6% acrylamide, 7 M urea gel and eluted with a solution containing 500 mM ammonium acetate, 1 mM EDTA, and 0.1% (w/v) sodium dodecyl sulfate for 3 h at 37 °C. Approximately 5 µg of rat liver poly(A)+ RNA and 20 ng of 32P-labeled transcript were ethanol precipitated and dissolved in 10 µl of 30 mM sodium acetate, pH 4.6, 50 mM NaCl, 1 mM ZnCl2, and 5% glycerol. This solution was heated to 65 °C for 5 min and allowed to cool slowly to 37 °C. Mung bean nuclease (either 10, 75, or 150 units) was added and the reaction incubated at 37 °C for 3 min. Protected products were resolved by electrophoresis on a 6% polyacrylamide, 7 M urea gel.
Plasmid Construction-A series of vectors containing the CAT reporter gene was prepared in the phagemid pIBI-76. A 1.65-kb HindIII/BamHI fragment from pSV2CAT (21) containing the CAT structural gene, the SV40 "t" intron, and the SV40 early region polyadenylation signal was end filled and cloned into the SmaI site of pIBI-76 in both orientations to generate pIBICAT S1 and S2 or into the HincII site to generate pIBICAT H1 and H2. To generate expression vectors containing the ALAS 5'-flanking region, a progenitor plasmid was prepared by cloning the 1.4-kb PstI fragment corresponding to region -479 to 971 bp of the ALAS gene into the PstI site of pIBICAT S2. ALAS coding sequence, including the ATG initiation codon, was removed from this progenitor plasmid by digestion at the unique restriction sites for SmaI (896 bp) and SalI (pIBI-76 polylinker) to produce the construct pACAT-479. This construct contains regions of the ALAS gene from -479 to 896 bp. Plasmid pACAT-2700 was generated by inserting the 2.3-kb PstI fragment (-2700 to -479 bp) from pRG-1 into pACAT-479 to produce a construct containing ALAS gene sequence from -2700 to 896 bp. Digestion of pACAT-2700 with HindIII and subsequent religation removed ALAS sequence from -2700 to -1189 bp to produce construct pACAT-1189. Digestion of pACAT-1189 with HindIII and SmaI followed by end filling and religation removed ALAS sequence from -1189 to -990 bp to produce construct pACAT-990. Construct pACAT-161 was prepared by cloning the StuI/SmaI ALAS fragment of 1057 bp located from -161 to 896 bp into the SalI site of pIBICAT S2. The orientation of each construct was confirmed by analytical restriction enzyme mapping. The positive control expression vector pSVCAT was constructed by cloning the 500-bp AccI/HindIII fragment containing the SV40 early promoter from pSV2CAT (21) into the SmaI site of pIBICAT H1.
Site-directed Mutagenesis of ALAS/CAT Constructs-Two intronless mutants were prepared by oligonucleotide site-directed mutagenesis of pACAT-2700 and pACAT-479 to produce pACAT-2700ΔI and pACAT-479ΔI. Oligonucleotide site-directed mutagenesis was performed by the procedure of Zoller and Smith (22). The mutagenesis oligonucleotide 5'-CGAGAGCCCGCGCAGGACCCTCGACTCTAG-P1, 5'-AGGGACTCGGGATAAGAATGGGC-3'; P2, 5'-GCGGAG-3' was designed to loop out the 811 nucleotides from 81 to 891 bp corresponding to intron 1. Mutant recombinant clones were identified initially by analytical restriction enzyme mapping, and removal of the intron was subsequently confirmed by double-stranded DNA sequencing (23). Two more constructs lacking the intron were prepared by modifying pACAT-2700ΔI and pACAT-479ΔI. Digestion of pACAT-2700ΔI with HindIII removed sequence spanning the region from -2700 to -1189 bp to produce pACAT-1189ΔI.
To synthesize the construct pACAT-161ΔI, the 1.8-kb StuI fragment from pACAT-479ΔI, containing 161 bp of promoter and the CAT gene, was purified and cloned into the SmaI site of pIBI-76.
The construct pACAT-1189ΔI/5'-240 was generated by end filling the NarI/BamHI intronic fragment from 258 to 497 bp and inserting this into the HindIII site of pACAT-1189ΔI. The orientation was determined by double-stranded DNA sequencing (23). The NRF-1 motifs at positions -59 and -88 bp of the ALAS promoter were mutagenized with oligonucleotides 5'-GGCCGACCCACAGTGGATCCGCAGCGGTCACC-3' and 5'-GCCGACTCCGGTGTGGGTCCGCGCGGCAGGCC-3', respectively. The promoter and CAT gene of mutant recombinant clones were subsequently sequenced to ensure the absence of random mutations generated during the mutagenesis procedure.
Cell Culture and Transfections-The human hepatoma cell line HepG2 and the monkey kidney cell line COS-1 were grown as monolayer cultures in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum in 150-cm² flasks. The Chinese hamster ovary cell line CHO-K1 was grown in Ham's F-12 medium supplemented with 10% fetal calf serum. All cells were grown at 37 °C with 5% CO2. Transfection of cells was performed by electroporation with the GenePulser and Capacitance Extender from Bio-Rad using a modification of the method of Chu et al. (24). Exponentially growing cells were harvested by trypsin treatment, and HepG2 and COS-1 cells were resuspended at a density of 10⁷ cells/ml in HBS buffer (20 mM Hepes, pH 7.05, 137 mM NaCl, 5 mM KCl, 0.7 mM Na2HPO4, 6 mM glucose) containing 250 µg of sonicated salmon sperm DNA as carrier. CHO-K1 cells were resuspended in phosphate-buffered saline (10⁷ cells/ml) without carrier DNA. In an electroporation cuvette (Bio-Rad), 500 µl of cell suspension (HepG2 or COS-1) or 800 µl of cell suspension (CHO-K1) was mixed with 1.7 pmol of construct DNA (equivalent to 7.5-10 µg) and 2 µg of pCH110. The latter contains the β-galactosidase gene under the control of the SV40 early promoter and was used to normalize the efficiency of individual transfections. Plasmid DNA was purified by two cycles of CsCl density gradient centrifugation, and the subsequent concentration of DNA was estimated spectrophotometrically and confirmed by ethidium bromide staining of known amounts of DNA following agarose gel electrophoresis. Chilled cells were exposed to a single 220-volt pulse at a capacitance of 960 microfarads (HepG2), to a 300-volt pulse at 250 microfarads (COS-1), or to a 1,300-volt pulse at 25 microfarads (CHO-K1). Electroporated cells were gently plated onto 60-mm dishes containing 5 ml of Dulbecco's modified Eagle's medium plus 10% fetal calf serum (HepG2 and COS-1) or Ham's F-12 plus 10% fetal calf serum (CHO-K1). Transfected cells were incubated at 37 °C for 24 h, after which the medium was replaced and the incubation continued for a further 24 h.
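The statement that 1.7 pmol of construct DNA corresponds to 7.5-10 µg can be checked with a short conversion. The sketch below is illustrative only: the average mass of about 650 g/mol per base pair is a common approximation rather than a value given in the text, and the plasmid sizes are hypothetical round numbers, not sizes reported for the pACAT constructs.

```python
# Approximate molar mass of one double-stranded DNA base pair (g/mol).
# Assumption: ~650 is a widely used rule of thumb, not a figure from the paper.
AVG_BP_MASS = 650.0


def pmol_to_micrograms(pmol: float, plasmid_bp: int) -> float:
    """Return the mass in µg of `pmol` picomoles of a double-stranded plasmid."""
    if pmol < 0 or plasmid_bp <= 0:
        raise ValueError("pmol must be non-negative and plasmid_bp positive")
    grams = pmol * 1e-12 * plasmid_bp * AVG_BP_MASS
    return grams * 1e6


# Hypothetical plasmid sizes, chosen only to illustrate the calculation.
for size_bp in (7000, 9000):
    mass = pmol_to_micrograms(1.7, size_bp)
    print(f"{size_bp} bp plasmid: 1.7 pmol is about {mass:.1f} µg")
```

Under these assumptions, a fixed molar amount of plasmids of roughly 7-9 kb falls inside the quoted mass range, which is why the mass varies from construct to construct.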
Assay for Chloramphenicol Acetyltransferase (CAT) and β-Galactosidase Activity-Transfected cells were harvested, washed in phosphate-buffered saline, and resuspended in 100 µl of 250 mM Tris-HCl, pH 7.6. Cells were lysed by three cycles of freeze-thawing, and the lysate was spun for 5 min to remove cellular debris. The protein concentration of the supernatant was determined using the Bradford protein assay (42), and all subsequent assays were performed with a constant amount of protein to obtain activities within the linear range of the enzyme assays. The supernatant was first assayed for β-galactosidase activity as described by Herbomel et al. (25) and expressed as (A420 × µg of protein⁻¹ × h⁻¹) × 100. To the remainder of the supernatant was added EDTA to a concentration of 5 mM, and the samples were incubated at 65 °C for 10 min and then spun for 5 min to remove deacetylase activity (26). CAT activity was determined by the method of Gorman et al. (21). Acetylated products of [14C]chloramphenicol were analyzed on thin layer chromatography plates (Merck). After autoradiography, the spots were cut out and the amount of radioactivity quantified by liquid scintillation counting. CAT activity was expressed as the amount of radiolabeled chloramphenicol acetylated by 1 µg of protein extract in 1 h. These numbers were then normalized for equal transfection efficiency. The correction factor was determined by adjusting the β-galactosidase activities to 1 unit of enzyme activity defined as (A420 × µg of protein⁻¹ × h⁻¹) × 100 = 1.00. CAT activities were then corrected by an equivalent factor to eliminate differences arising from unequal transfection efficiencies. Transfection experiments were repeated at least four times with at least two separate plasmid preparations for HepG2 cells. Subsequent transfection experiments in COS-1 and CHO-K1 cells were performed using DNA preparations which had been used in HepG2 cells to avoid discrepancies arising from differences in DNA preparations.
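The normalization described above can be sketched in a few lines. This is a minimal illustration of one reading of the procedure, not the authors' own calculation: the function names are hypothetical, and the example readings are invented purely to show the arithmetic.

```python
def beta_gal_units(a420: float, protein_ug: float, hours: float) -> float:
    """β-Galactosidase activity expressed as (A420 / µg protein / h) x 100."""
    if protein_ug <= 0 or hours <= 0:
        raise ValueError("protein amount and incubation time must be positive")
    return a420 / protein_ug / hours * 100.0


def normalize_cat(cat_activity: float, bgal_units: float) -> float:
    """Correct CAT activity by the factor that brings β-galactosidase to 1.00 unit."""
    if bgal_units <= 0:
        raise ValueError("β-galactosidase activity must be positive")
    correction_factor = 1.00 / bgal_units
    return cat_activity * correction_factor


# Hypothetical readings for a single lysate (not data from the paper).
units = beta_gal_units(a420=0.5, protein_ug=20.0, hours=1.0)
corrected = normalize_cat(cat_activity=10.0, bgal_units=units)
print(f"beta-gal units: {units:.2f}; corrected CAT activity: {corrected:.2f}")
```

Dividing by the co-transfected β-galactosidase activity means that a lysate from a poorly transfected dish is scaled up and one from a well transfected dish is scaled down, so that constructs can be compared across dishes.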
Gel Shift Assay-The sequences of the four complementary oligonucleotide pairs used in the gel retardation assays are as follows. To anneal the complementary oligonucleotides, 10 ng of 32P-labeled oligonucleotide was combined with 100 ng of unlabeled complementary oligonucleotide in 24 mM Tris-HCl, pH 7.6, containing 100 mM NaCl, and the mixture was heated to 100 °C for 3 min followed by 70 °C for 10 min and then allowed to cool to room temperature for 45 min.

Unlabeled oligonucleotides were also annealed as described above to give a final concentration of 10 ng/µl and used as specific competitors in the binding reactions. Nuclei from COS-1 cells were prepared essentially as described by Schreiber et al. (27). Nuclear proteins were extracted by constant agitation for 1 h at 4 °C in buffer D (0.5 M KCl, 50 mM Tris-HCl, pH 7.5, 10% sucrose, 5 mM MgCl2, 0.1 mM EDTA, 20% glycerol, and 2 mM dithiothreitol). Following centrifugation at 4 °C for 15 min, the supernatant was dialyzed against two changes of TM buffer (25 mM Tris-HCl, pH 7.6, 6.25 mM MgCl2, 0.5 mM EDTA, 0.5 mM dithiothreitol, 10% glycerol) containing 100 mM KCl. The protein concentration was determined using a protein microassay (Bio-Rad). Binding reactions were carried out at room temperature with 0.1 ng of radiolabeled probe, 10 µg of sonicated salmon sperm DNA, and 15 µg of nuclear protein extract in a final concentration of 50 mM KCl. Complex formation was detected on a 5% nondenaturing polyacrylamide gel run at 25 V/cm in Tris-glycine buffer, pH 8.5, at 4 °C.

Isolation and Analysis of a Rat Housekeeping ALAS Genomic Clone-We have previously isolated cDNA clones for rat liver ALAS from a library synthesized using mRNA from the livers of adult rats induced with drugs (8). One clone, p101B1, contained the complete coding sequence for ALAS precursor protein in a 2.0-kb insert. Using this clone as a specific hybridization probe, 1.5 × 10⁶ plaques from a λ Charon 4A rat genomic library were screened, and one strongly positive clone was identified and further analyzed. The genomic sequence in this clone was contained in a single EcoRI fragment about 13 kb in length, and this fragment was subcloned into pUC19, generating pRG-1. The restriction enzyme map of the 13-kb EcoRI fragment is shown in Fig. 1. To characterize pRG-1 further, the clone was digested with restriction enzymes and analyzed by Southern blotting using as probes the three consecutive PstI fragments of 76, 1,800, and 190 bp, which comprise p101B1 (8). The 190-bp PstI fragment containing sequence at the 3'-end of the ALAS cDNA clone, including the polyadenylation signal, failed to detect a corresponding region in pRG-1, indicating that the genomic clone lacked the 3'-end of the gene. The 76-bp PstI fragment of p101B1 identified the 5'-end of the gene. Additionally, the 1.8-kb PstI fragment of p101B1 hybridized to the 8.5-kb PstI/EcoRI fragment of pRG-1. Since this probe contains most of the ALAS coding sequence, the orientation of pRG-1 could be determined, with the 8.5-kb PstI/EcoRI fragment being localized 3' to the 1.4-kb PstI fragment as shown in Fig. 1. The remaining 3.5 kb of pRG-1, contained in the EcoRI/PstI fragment 5' to the 1.4-kb PstI fragment, did not hybridize to p101B1 and was deduced to contain 5'-flanking sequence.
An oligonucleotide (P1) was synthesized with sequences complementary to rat liver cDNA coding sequence (8) beginning 22 nucleotides downstream from the initiation ATG codon (Figs. 1 and 2). The oligonucleotide hybridized to the 1.4-kb PstI fragment in pRG-1, and this PstI fragment was sequenced (Fig. 2). The sequence at the 3'-end of this 1.4-kb fragment was identical to that found previously in p101B1 (8) and encoded the first N-terminal 20 amino acids of the rat liver ALAS precursor protein (Fig. 2). This confirmed the identity of the isolated genomic clone.
Mapping of the Transcription Start Site-Studies to locate the transcription start site of the rat ALAS gene were performed using primer extension analysis and the oligonucleotide P1. When P1 was end-labeled, annealed to poly(A)+ mRNA from rat liver, and extended upstream with reverse transcriptase, two major primer extension products of 146 and 143 nucleotides in length were detected (Fig. 3). An examination of the nucleotide sequence in the immediate upstream region from these putative initiation sites did not reveal sequence corresponding to any of the known control ... codon. The possibility was therefore considered that the transcription start sites predicted from the primer extension studies using P1 as a primer did not represent the true sites but corresponded to sequence contained within an intron. To investigate this possibility, a second synthetic oligonucleotide, P2, was designed which hybridized to sequence upstream from the predicted 5' donor splice site (see Figs. 1 and 2). When P2 was employed in the primer extension reaction, two major products of 40 and 43 nucleotides in length were observed (Fig. 3, lane 3). A third synthetic oligonucleotide (P3) was synthesized complementary to a sequence in the putative intron. This was used in primer extension reactions, but no major extension product was observed (Fig. 3, lane 2). From these results we concluded that there is an intron in the 5'-untranslated region and that the G nucleotides at positions 1 and 4 (Fig. 2) represent the transcription start sites for the rat ALAS gene. Since the extension products from P1 include the initiation ATG codon, it can be deduced that the 5'-untranslated region of the ALAS mRNA is either 101 or 98 nucleotides in length depending upon which of the two major transcription start sites is employed.
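The deduction of the 5'-untranslated region length from the P1 extension products can be written out as simple arithmetic. The sketch below rests on two assumptions that are an interpretation rather than something stated explicitly: that P1 is the 23-nucleotide sequence 5'-AGGGACTCGGGATAAGAATGGGC-3' listed for it in the methods, and that "beginning 22 nucleotides downstream from the initiation ATG codon" means 22 nucleotides of mRNA lie between the A of the ATG and the primer site.

```python
# Assumption: P1 sequence as listed in the methods text (23 nucleotides).
P1_SEQUENCE = "AGGGACTCGGGATAAGAATGGGC"
# Assumption: nucleotides between the A of the ATG and the P1 site.
P1_OFFSET_FROM_ATG = 22


def utr_length(extension_product_nt: int, primer_nt: int, offset_nt: int) -> int:
    """5'-UTR length = extension product minus the primer and the coding offset."""
    return extension_product_nt - primer_nt - offset_nt


for product in (146, 143):
    utr = utr_length(product, len(P1_SEQUENCE), P1_OFFSET_FROM_ATG)
    print(f"{product}-nt product -> 5'-UTR of {utr} nt")
# prints 101 nt for the 146-nt product and 98 nt for the 143-nt product
```

Under these assumptions the two products reproduce the 101- and 98-nucleotide values quoted in the text.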
Mapping the Intron in the 5'-Untranslated Region of the Rat ALAS Gene-The location of the intron in the 5'-untranslated region was investigated further using RNase protection experiments. The 1.4-kb PstI fragment (from position -479 to 971 bp in Fig. 2) spanned the putative 3'-acceptor splice site and was subcloned into the PstI site of pSP64. This construct was linearized at the BamHI site (497 bp) contained within the intron, and RNA transcripts labeled with [32P]rUTP were synthesized with SP6 RNA polymerase. Following hybridization of the transcripts to poly(A)+ mRNA from rat liver and digestion with mung bean nuclease, a protected fragment of 86 nucleotides was observed (Fig. 4). Comparison of known consensus 3'-acceptor splice sites with ALAS sequence in this region suggested that the intron boundary most likely occurs at position 891 bp and would result in a predicted RNase-protected fragment of 84 nucleotides. The discrepancy of two nucleotides between the expected and the experimentally derived site most likely reflects the difficulties encountered when sizing RNA against DNA. The location of the 5'-donor splice site was deduced to be 811 bp upstream from the 3'-acceptor splice site at position 80 bp. Hence the ALAS gene contains an 811-bp intron which interrupts the 5'-untranslated region from position 80 to 891 bp.
Deletion Analysis of the Rat ALAS 5'-Flanking Region-We next determined whether the 5'-flanking region of the ALAS gene has promoter activity. To do this, convenient restriction enzyme fragments containing increasing lengths of the ALAS 5'-flanking region, including the intron in the 5'-untranslated region, were linked to the bacterial CAT gene. These fusion constructs contained a 5'-flanking sequence ranging in length from 161 to 2700 bp and terminated at the SmaI site at position 896 bp in the 5'-untranslated region. All of the constructs therefore contained the intron and 85 bp of 5'-untranslated region. The promoter activity of these constructs was determined by transient expression analysis in human hepatoma (HepG2), monkey kidney (COS-1), and Chinese hamster ovary (CHO-K1) cell lines. Constructs were co-transfected into these cells together with pCH110, which contains the lacZ gene under the control of the viral SV40 early promoter; to correct for any variation in transfection efficiency, CAT activity in individual cell lysates was standardized to β-galactosidase activity.
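The figure of 85 bp of 5'-untranslated region in the constructs follows from the intron coordinates and the position of the SmaI site. The sketch below assumes the intron occupies positions 81 to 891 inclusive (the span given for the loop-out mutagenesis) and that the constructs retain sequence up to position 896; the function names are illustrative.

```python
def intron_length(first_intron_bp: int, last_intron_bp: int) -> int:
    """Number of nucleotides in an intron given its first and last positions."""
    return last_intron_bp - first_intron_bp + 1


def utr_in_construct(first_intron_bp: int, last_intron_bp: int, end_bp: int) -> int:
    """Exonic 5'-UTR retained in a construct that starts at +1 and ends at end_bp."""
    exon1 = first_intron_bp - 1          # positions 1 .. (first_intron_bp - 1)
    exon2_part = end_bp - last_intron_bp  # positions after the intron up to end_bp
    return exon1 + exon2_part


# Coordinates from the text; the inclusive 81-891 reading is an assumption.
INTRON_FIRST, INTRON_LAST, SMAI_SITE = 81, 891, 896

print(intron_length(INTRON_FIRST, INTRON_LAST))                # 811
print(utr_in_construct(INTRON_FIRST, INTRON_LAST, SMAI_SITE))  # 85
```

The 85 bp thus split into 80 bp of exon 1 upstream of the intron and 5 bp of exon 2 between the 3'-acceptor site and the SmaI site.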
The results from the transfection studies in all three cell lines are shown in Fig. 5. The shortest 5'-flanking region of the gene tested, from position -161 to 1 bp, has promoter activity and can direct expression of CAT activity in all three cell lines. The CAT activity expressed by this construct (pACAT-161) was taken arbitrarily to be 100% in each of the cell lines. Increasing the length of the 5'-flanking region to -479 bp (pACAT-479) consistently resulted in an approximate %fold increase in the levels of CAT activity, indicating the presence of additional positive control elements between -479 and -161 bp. When the length of the 5'-flanking region was further increased (pACAT-990), the level of CAT activity relative to that detected with pACAT-479 was reduced about 50%, and the addition of another 200 bp of 5'-flanking sequence (pACAT-1189) resulted in a further decrease. No further effect on expression was observed when 5'-flanking sequence from position -2700 to -1189 bp was included (pACAT-2700). It can be seen from these results that the ALAS 5'-flanking region shows a similar pattern of expression in the three unrelated cell lines, which is in keeping with the predicted housekeeping role of this gene, and that both positive and negative control elements located in the first 1189 bp of the 5'-flanking region contribute to basal expression.
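Expressing each construct as a percentage of pACAT-161 is a simple rescaling, sketched below. The function name is hypothetical and the activity values are invented placeholders chosen only to show the calculation; they are not measurements from Fig. 5.

```python
def relative_to_reference(activities: dict, reference: str) -> dict:
    """Express each normalized CAT activity as a percentage of the reference construct."""
    ref_value = activities[reference]
    if ref_value <= 0:
        raise ValueError("reference activity must be positive")
    return {name: 100.0 * value / ref_value for name, value in activities.items()}


# Placeholder values in arbitrary units (NOT data from the paper).
example = {"pACAT-161": 2.0, "construct X": 5.0, "construct Y": 1.0}

for name, percent in relative_to_reference(example, "pACAT-161").items():
    print(f"{name}: {percent:.0f}%")
```

Because the reference is set to 100% separately in each cell line, the percentages compare the shape of the deletion profile between cell lines rather than absolute expression levels.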
Effect of the First Intron on Expression of the ALAS Gene-Four constructs with different lengths of 5'-flanking sequence were generated in which the first intron of the ALAS gene was precisely deleted by mutagenesis (Fig. 5). These constructs were tested for promoter activity by transient transfection analysis in HepG2, COS-1, and CHO-K1 cell lines. Fig. 5 shows that the expression of CAT activity from these constructs was consistently decreased in all three cell lines by at least 50% compared with the corresponding construct containing the intron. These results indicate that although the ALAS 5'-flanking region alone is sufficient to drive expression of the CAT gene, the intron in the 5'-untranslated region can enhance this expression.
To investigate further the possible role of the first intron in regulating the ALAS gene, attempts were made to remove various restriction enzyme fragments from within the intron, but these experiments repeatedly resulted in extensive deletions of the constructs. In an alternative approach, an intronic NarI/BamHI fragment of 239 bp (position 258-497) was cloned in the inverted orientation relative to the transcription initiation site, at position -1189 bp, in the intronless construct pACAT-1189ΔI. (Attempts to clone this fragment in the other orientation were unsuccessful.) The resulting construct (pACAT-1189ΔI/5'-239) was transfected into all three cell lines and CAT activity compared with that obtained from pACAT-1189 and pACAT-1189ΔI. The results in Fig. 5 show that the presence of the intronic fragment results in a marked increase in CAT activity, with the levels measured being somewhat greater than those obtained with constructs containing the entire intron. This result indicates that the intron contains DNA element(s) in the NarI/BamHI fragment which can activate the ALAS promoter.
Sequence Analysis of the ALAS 5'-Flanking Region-The DNA sequence upstream from the transcription start sites of the ALAS gene contains several characteristic features (Fig. 2). The motif 5'-TATATTA-3', typical of a TATA box, is located at position -30 to -24 bp. Unlike many other promoters for housekeeping genes (29), there are no obvious GC boxes despite the high G+C content (59%) of the region from position -479 to 1 bp. Interestingly, however, there are two elements in the upstream vicinity of the TATA box, each with striking sequence similarity to the consensus binding site for the trans-acting factor, nuclear respiratory factor 1 (NRF-1) (30-32). A functional binding element for NRF-1 has been located in the promoters of several nuclear genes encoding mitochondrial components, and it has been proposed that NRF-1 might provide a mechanism for co-ordinating expression of some nuclear genes the products of which are required for mitochondrial biogenesis (30-32). The NRF-1 consensus sequence as proposed by Evans and Scarpulla (30) is 5'-[T/C]GCGCA[T/C]GCGC[A/G]-3'. The first of the two putative NRF-1 binding sites in the ALAS promoter, located at position -59 to -47 bp (see Fig. 2), contains a T/C to A mismatch at the first residue of the consensus. The second putative NRF-1 binding site, at position -88 to -76 bp, deviates from the consensus by an A to G mismatch at the 6th residue (Fig. 2). Further upstream in the ALAS 5'-flanking region there is a putative CCAAT box at position -330 to -325 bp and a possible binding site for Oct-1 (33) at position -402 to -395 bp.
Gel Shift Analysis of the Putative NRF-1 Binding Sites-The roles of the two putative NRF-1 binding sites in the ALAS promoter were examined in detail. Protein binding activity was first investigated since the two sequences deviated from the NRF-1 consensus as described above. A series of four complementary oligonucleotides (R-59, R-88, "59, and CYTC) were synthesized, annealed, and tested for protein binding activity using the gel shift assay (Fig. 6). The oligonucleotides R-59 and R-88 encompassed one of the two putative ALAS NRF-1 sites and spanned regions -65 to -42 bp and -94 to -71 bp, respectively (Fig.  2). "59 (Ei'AGTGGATCCGCA3') is a mutant homologue of R-59 with three base pair changes (underlined) and designed to prevent motifs. Four oligonucleotides were tested for complex formation in a gel shift assay. Oligonucleotide CYTC (lane 1 ) corresponds to the region from -173 to -147 bp of the rat somatic cytochrome c promoter containing the NRF-1 binding site (31). Oligonucleotides R-59 (lane 2 ) and R-88 (lane 4 ) correspond to ALAS promoter sequence from -65 to -42 bp and -94 to -71 bp respectively, and each contains one of the two NRF-1 motifs. Oligonucleotide "59 (lane 3) is a mutant version of R-59 containing three base changes to the NRF-1 consensus. The four oligonucleotides were end labeled and combined with 15 pg of crude COS-1 nuclear protein extract. Complex formation was determined by gel electrophoresis in a 5% polyacrylamide gel. NRF-1 binding (31). The oligonucleotide CYTC contains the sequence found in the functionally active NRF-1 binding site located in the rat somatic cytochrome c promoter (30-32). Each set of oligonucleotides was radiolabeled with [y3*P] ATP and incubated with crude nuclear protein extract from COS-1 cells in the presence of salmon sperm DNA as competitor and the samples analyzed by polyacrylamide gel electrophoresis. 
The CYTC probe produced a single retarded band consistent with the reported binding activity of this sequence (30, 31), whereas the two ALAS probes, R-59 and R-88, each produced a single retarded band of the same size as the CYTC complex (Fig. 6). Under these assay conditions, the oligonucleotide M-59 repeatedly failed to produce a retarded complex, and, since this sequence contained mismatches known to prevent binding of NRF-1 (31), the result shows that complex formation by R-59 requires a conserved NRF-1 motif. Retardation of the CYTC, R-59, and R-88 probes was abolished by inclusion of poly(dI-dC) in the binding reaction instead of salmon sperm DNA (data not shown). Since poly(dI-dC) acts as a specific competitor of NRF-1 binding, because of the GC-rich nature of the NRF-1 consensus sequence (30, 31), our observations are consistent with the proposal that the NRF-1 motif is the target sequence in the observed protein-DNA interactions.
To investigate whether the same or a similar protein species is involved in complex formation with CYTC, R-59, and R-88, a series of competition experiments was performed. Binding of COS-1 nuclear protein extract to the radiolabeled CYTC, R-59, or R-88 probes was performed in the presence of a 100-fold excess of each of the unlabeled competitor oligonucleotides. As shown in Fig. 7, retardation of CYTC could be abolished by either of the ALAS oligonucleotides, suggesting that CYTC, R-59, and R-88 bind the same or a very similar protein species. In keeping with this finding, unlabeled CYTC competed with both ALAS probes. The mutant M-59 oligonucleotide failed to compete with any of the other three probes. To determine the specificity of the competition seen with the competitor oligonucleotides, gel shift experiments were performed in which unlabeled competitor oligonucleotide was titrated. Fig. 8 shows that complex formation with the CYTC probe was severely reduced by very low amounts of R-59 and R-88 cold competitor oligonucleotide, ranging from 4- to 40-fold excess. Competition of CYTC probe with unlabeled R-59 or R-88 oligonucleotides revealed that almost four times more R-88 was required to completely abolish complex formation compared with R-59. These results suggest that the NRF-1 motif at position -59 bp of the ALAS promoter has a higher affinity for protein binding in a gel shift assay than the NRF-1 motif at position -88 bp. To provide further evidence that the observed DNA-protein complexes with CYTC and R-59 involve the same or similar protein species, thermal inactivation studies were performed. Nuclear protein extract from COS-1 cells was incubated at various temperatures for 10 min, the denatured protein removed by centrifugation, and the supernatant used in gel shift assays.
With CYTC and R-59, a similar profile of diminished protein binding activity with extract exposed to increasing temperature was observed, and complex formation was completely abolished using extract which had been incubated at 50 °C (data not shown). The latter temperature is 10 °C lower than that reported by Evans and Scarpulla (31), but this discrepancy may reflect differences in the procedures used for the preparation of crude nuclear protein extracts.
Overall, our data suggest that the nuclear protein species in COS-1 cells which binds to CYTC is able to interact specifically with R-59 and R-88 and that this binding involves an NRF-1 motif. Although it is assumed that the protein present in our complexes is NRF-1, experiments with purified NRF-1 (32) will be required to establish this unequivocally. The presence of NRF-1 binding activity was investigated in a variety of cell lines using R-59 as the probe. In addition to COS-1 cells, substantial NRF-1 binding activity was observed in HepG2 (human hepatoma), C2C12 (mouse myoblast), J2E-1 (mouse erythroid), WIL-2 (human B cell), and JK-1 (human erythroid) cell lines, indicating the ubiquitous distribution of this protein.

Role of the NRF-1 Motifs in Expression of the ALAS/CAT Constructs-Transient transfection experiments were undertaken in COS-1 cells to determine whether either or both of the identified NRF-1 motifs were functional in the promoter. Site-directed mutagenesis was employed to introduce into each NRF-1 motif the same three base pair changes that were shown to abolish protein binding in a gel shift assay. The NRF-1 motifs in the promoter of pACAT-479 were mutated individually and in combination. In addition, pACAT-479 wild type and mutant constructs were synthesized without the first intron. Each construct was transfected into COS-1 cells and expression compared with that of the wild type pACAT-479 construct containing the first intron. Mutation of the -59 NRF-1 site consistently resulted in at least an 80% decrease in expression of CAT activity, and removal of the first intron extended this decrease to 90% (see Fig. 9). When the -88 NRF-1 site was mutated, expression of the CAT gene was decreased by about 50% and, once again, removal of the first intron led to a further reduction in expression. When the ALAS promoter contained mutations in both NRF-1 sites, expression of CAT activity was virtually abolished, in the presence or absence of the first intron, with measured CAT activity being only slightly greater than that from the promoterless CAT construct. Therefore, the NRF-1 motifs are essential for basal expression, with each motif contributing to expression. Since the absence of both functional NRF-1 motifs results in the almost complete loss of expression even in the presence of the first intron, gene activation by the intron must be dependent on the presence of at least one functional NRF-1 motif.
FIG. 9. [...] activity. ALAS/CAT fusion constructs were mutated as described under "Experimental Procedures." Each of the two NRF-1 motifs was mutated individually and in combination, and this is represented by a cross. Constructs with and without the first intron of the ALAS gene, represented as the striped box, were also tested. The CAT activities were standardized for variation in transfection efficiency using β-galactosidase activity as an internal control. The normalized CAT activities of the different constructs are compared with pACAT-479 set at 100%. In addition, a construct containing the CAT gene but lacking a promoter (pIBICAT) was also tested. Each value represents the average of three or more independent transfection experiments in which 1.7 pmol of ALAS/CAT construct DNA and 0.4 pmol of pCH110 were used.
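The normalization described for the CAT assays (β-galactosidase as internal control, replicate transfections averaged, pACAT-479 set at 100%) can be sketched as follows. This is a minimal illustration only: the raw readings and the label `mut-59` are invented for the example, and the order of operations (normalize each transfection, then average) is an assumption, since the text does not specify it.

```python
from statistics import mean


def relative_cat_activity(measurements, reference="pACAT-479"):
    """Express normalized CAT activity as a percentage of a reference construct.

    measurements: dict mapping construct name -> list of (cat, bgal) readings,
    one pair per independent transfection.
    """
    # Correct each transfection for efficiency, then average the replicates
    normalized = {
        name: mean(cat / bgal for cat, bgal in replicates)
        for name, replicates in measurements.items()
    }
    ref_value = normalized[reference]
    return {name: 100.0 * value / ref_value for name, value in normalized.items()}


# Hypothetical raw readings (arbitrary units), NOT data from the paper
example = {
    "pACAT-479": [(200.0, 2.0), (150.0, 1.5), (100.0, 1.0)],
    "mut-59": [(40.0, 2.0), (20.0, 1.0), (30.0, 1.5)],
}
result = relative_cat_activity(example)
print(result)
```

Dividing by the β-galactosidase reading before averaging keeps a single inefficient transfection from dominating the mean; other orders of operation are possible and would give slightly different values.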

DISCUSSION
We have isolated a genomic clone for rat ALAS using a rat liver cDNA clone (8) as a specific hybridization probe. Since the cDNA clone when used as a probe in Northern blot analysis detects a 2.3-kb ALAS mRNA in all rat tissues examined (8), this genomic clone represents the housekeeping ALAS gene. We have reported previously the structure of the chicken housekeeping ALAS gene which spans 6.9 kb and contains 10 exons (13). Whereas the isolated rat genomic clone contains 10 kb of gene sequence but lacks an unknown amount at the 3'-end, our recent analysis of the human housekeeping ALAS gene indicates that this gene is about 20 kb in length. A feature of the rat (and human) housekeeping ALAS genes is the presence in the 5'-untranslated region of a single intron which is absent in the chicken gene.
DNA regions which contribute to expression of the rat housekeeping ALAS gene have been investigated by transient transfection experiments. These experiments established that the sequences which direct maximal expression in COS-1, HepG2, and CHO-K1 cells reside in the ALAS promoter region from -479 to 1 bp and also in the first intron. Putative control elements identified in the promoter include a TATA box at position -30 bp, two adjacent NRF-1 sites at positions -59 and -88 bp, a CAAT box at -330 bp, and an OctI site at -402 bp. In the present work, we have focused on the NRF-1 motifs and have investigated whether they are important for expression of the ALAS promoter. Functional NRF-1 sequences, characterized by an alternating GC motif, were first identified by Evans and Scarpulla (30) in the promoter for somatic cytochrome c and subsequently in the promoters of several other nuclear genes encoding mitochondrial proteins (31, 32). The two putative NRF-1 elements in the ALAS promoter differed from each other and from the proposed consensus sequence (30) by a single nucleotide. Gel shift assays showed that each of these elements and an NRF-1 binding site from the cytochrome c promoter bound a protein species of the same mobility. Moreover, binding to these ALAS elements was competed by the authentic NRF-1 in competition assays. On this basis the two ALAS promoter elements were considered to be binding sites for NRF-1, and evidence indicated that the element at -59 bp has a greater affinity for this protein.
Transient transfection experiments established that the two NRF-1 binding sites were important for expression of the ALAS promoter in COS-1 cells. Mutagenesis of the NRF-1 site at position -88 bp resulted in approximately a 50% loss of CAT activity, whereas mutagenesis of the NRF-1 site at -59 bp gave an approximate 80% loss. The greater loss of CAT expression with the altered NRF-1 at position -59 bp may reflect the higher affinity of this site for NRF-1. Since the level of CAT expression with both native NRF-1 sites in the promoter is greater than that expected from the combined contributions of the individual NRF-1 sites, there may be a synergistic interaction of the two NRF-1 protein species with the transcriptional machinery. Following mutation of both NRF-1 elements in the promoter, expression of CAT activity was virtually abolished in COS-1 cells, confirming the critical roles that the two NRF-1 binding sites play in directing gene expression. We have also identified NRF-1 binding sites in the promoters for both the human and chicken housekeeping ALAS genes,¹ further emphasizing the importance of the motifs in expression of this gene.
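The synergy argument can be checked with simple arithmetic. The sketch below uses the approximate losses quoted in the text (about 80% for the -59 mutant, about 50% for the -88 mutant) and assumes, as a simplification not made explicit in the paper, that the activity remaining in each single mutant represents the contribution of the one intact site and that background activity is negligible.

```python
def additive_expectation(loss_59_mutant, loss_88_mutant, wild_type=100.0):
    """Sum the activities left when only one NRF-1 site is intact (percent)."""
    remaining_with_88_only = wild_type - loss_59_mutant  # -59 site mutated
    remaining_with_59_only = wild_type - loss_88_mutant  # -88 site mutated
    return remaining_with_88_only + remaining_with_59_only


# Approximate values quoted in the text (percent loss of CAT activity)
expected = additive_expectation(loss_59_mutant=80.0, loss_88_mutant=50.0)
excess = 100.0 - expected  # activity not explained by simple addition
print(expected, excess)
```

With these round numbers the two sites acting independently would account for roughly 70% of wild-type activity, leaving about 30% unexplained. Because the -59 loss is reported as at least 80%, the additive estimate is an upper bound and the unexplained portion could be larger.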
The identification of binding sites for NRF-1 in the promoters of several nuclear genes encoding mitochondrial proteins concerned with oxidative phosphorylation has led to the proposal that this protein may be important for coordinating expression of these genes in response to cellular energy demands (31). The critical role played by NRF-1 in directing expression of the ALAS promoter is in keeping with this proposal, and NRF-1, by enhancing expression of the rate-limiting enzyme of heme biosynthesis, would ensure an adequate supply of heme for respiratory cytochromes. It is noteworthy that of the four housekeeping genes (34-37) so far characterized for enzymes of the heme biosynthetic pathway, including mitochondrial ferrochelatase (37), only the promoter of the housekeeping ALAS gene contains binding sites for NRF-1. Jacob et al. (38) have reported a DNA-binding protein, termed α-PAL, which binds to a sequence almost identical to the NRF-1 consensus and is important for expression of the cytosolic α-subunit of translation initiation factor 2 (eIF-2α). It was recently demonstrated that purified NRF-1 interacts with one of the α-PAL motifs in the eIF-2α promoter. This finding suggests that α-PAL and NRF-1 are the same or a similar protein species. On this basis, it may be speculated that NRF-1 co-ordinates expression of oxidative phosphorylation genes with the energy demands of cytosolic translation (32). Transient transfection experiments in the present study also established that the first intron of the ALAS gene contributes to expression since deletion of this intron resulted in a significant decrease in expression of CAT activity.
¹ G. Braidotti, I. A. Borthwick, and B. K. May, unpublished data.
When both NRF-1 sites were altered by mutagenesis, the expression of CAT was virtually abolished even in the presence of the first intron, indicating that gene activation by the first intron is dependent on the presence of at least one functional NRF-1 binding site. When a portion of the first intron was transferred to a site upstream in an intronless ALAS/CAT construct, CAT expression was regained, and even slightly increased, in the different cells compared with a construct containing the entire intron in its normal position. These data provide preliminary evidence for the existence of an enhancer element in the intron. Transient transfection studies also demonstrated that the region of the ALAS promoter between -1189 and -479 bp negatively affected expression in different cell types, and the identity of these elements is under investigation.
Housekeeping genes can be broadly divided into two classes (28). The promoters of genes in the first class lack a TATA box, are G+C-rich, have several Sp1 binding sites and multiple transcription start sites, but apparently lack other regulatory sequences (39, 40). In the second class of housekeeping genes the promoters have a TATA box, are G+C-rich, and contain an array of different regulatory elements (41). Genes in the latter class exhibit both basal expression and tissue-specific regulation (28). The promoters of the housekeeping genes for uroporphyrinogen decarboxylase (34) and porphobilinogen deaminase (35, 36), both enzymes of the heme biosynthetic pathway, are G+C-rich and apparently only contain binding sites for Sp1. These genes are examples of the first class of housekeeping genes. By contrast, the rat ALAS housekeeping gene can be assigned to the second class of housekeeping genes. This gene is expressed at a basal level in all tissues to provide essential heme for the respiratory cytochromes and other hemoproteins but is substantially induced in the liver of animals treated with porphyrinogenic drugs when additional heme is needed for cytochrome P450 formation (13). There is also evidence that transcription of the housekeeping ALAS gene can be negatively regulated by heme in liver and perhaps other tissues (8). Current studies are aimed at identifying control elements, in addition to NRF-1, that are important for basal and regulated expression of the ALAS gene in different cell types.