Nonconservative Utilization of Aldolase A Alternative Promoters*

Recently, analysis of the sequence and expression of the human aldolase A gene revealed the unique arrangement of three tandem promoters and exons preceding a common coding sequence. A muscle-specific promoter (M) and two flanking widely used promoters (N and H) produce mRNA species which, in their mature forms, differ only in the sequence of their 5'-untranslated regions. We have isolated a mouse aldolase A gene and investigated its expression. This mouse gene was shown to be functional by sequence analysis, recombinational screening, and transfection into C2C12 cells. Although there is a high degree of sequence similarity between the mouse and the human gene in the region of the alternative first exons, we have been unable to detect functional utilization of the 5'-most promoter (N) in the mouse. Steady state mRNAs isolated from a variety of adult tissues and cultured cells were analyzed by RNase protection and primer extension to identify first exon utilization. Consistent with previous reports, exon M is found only in skeletal muscle and exon H, the "housekeeping" exon, is utilized in every tissue where aldolase A is expressed. Under identical conditions we fail to see any evidence of the N exon. Therefore, although sequence homology exists between rodents and primates in the N region, the absence of selective pressure to preserve its primate pattern of expression may have resulted in functional promoter extinction.

Aldolase is a glycolytic enzyme whose function is indispensable to normal cellular metabolism. The aldolase isoenzymes, A, B, and C, and their respective mRNAs are each uniquely distributed in specific tissues in a strict developmentally programmed fashion (for a review see Ref. 41). In every tissue of the developing embryo, A and C are the only forms present (1,27). Maturation of fetus to adult is accompanied by the establishment of tissue-specific expression of each isoenzyme. Aldolase B is found in the liver, kidney, and small intestine (34), and aldolase C is restricted to the tissues of the brain and central nervous system while aldolase A retains a somewhat ubiquitous distribution. In addition, the A isoenzyme is exceptionally abundant in adult muscle where it constitutes over 5% of the soluble protein content (26). The mammalian isoenzymes are encoded by three evolutionarily related but unlinked genes (45). These genes have nearly identical intron-exon arrangements within and among species. The interspecies conservation of the protein coding sequence for each isoenzyme is also quite remarkable, particularly for aldolase A (22).

* This research was supported by Grant 5-560 from the March of Dimes. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBank/EMBL Data Bank with accession number(s) J05517.
Aldolase A, unlike aldolases B and C, exhibits an additional level of complexity in regulating its expression during development and differentiation. Multiple mRNA species for aldolase A have been reported either in different tissues or in the same tissue at different times during development. Sequence differences among these mRNAs occur only in their 5'-untranslated region. Subsequent sequence analysis of a rat aldolase A genomic clone revealed the presence of tandem promoters and exons whose alternative usage could account for the observed rat aldolase A cDNAs (20). The most 5' promoter, promoter M, directs transcription of an exon sequence found only in muscle tissue while the more 3' or proximal promoter, promoter H, directs transcription of an alternate 5' exon in all tissues and developmental stages (Fig. 1). Activation of the M promoter occurs during fetal myogenesis and, as we have shown previously, an analogous activation of the muscle-specific promoter occurs during myogenesis of C2C12 cells in culture (8).
Surprisingly, analysis of the human aldolase A gene revealed the presence of a third promoter, promoter N, just upstream of the muscle-specific promoter (Fig. 1) (18, 32). This promoter is utilized in a similar fashion to that of the most 3' proximal promoter, H, but is much less active, producing lower amounts of steady state mRNA. Thus, the resultant human aldolase A promoter region consists of a muscle-specific promoter flanked by two relatively nonspecific or "housekeeping" promoters all clustered within 1.6 kbp of one another.
In the present study we have isolated the mouse aldolase A gene and analyzed its transcription. By utilizing two different screening regimens we have rigorously selected and verified the identity of the mouse aldolase A gene. The functionality of the putative promoter region was demonstrated by its ability to direct expression of a reporter gene in culture. Sequence comparisons show a high degree of similarity between the mouse gene and the rat gene and somewhat less between the mouse and human aldolase A sequences. Expression of the mouse A gene was found to be identical to that of the rat in the various mouse tissues assayed; however, no mRNAs produced from the N promoter, as seen in the human, were detected. This suggests that although sequence homology exists between rodents and primates in this particular promoter region, necessary selective pressures for conservation of expression could be absent. This, therefore, may be the first example of promoter extinction between two closely related species.

RESULTS

Isolation of the Aldolase A Genes-As a primary step in analyzing the genomic structure of aldolase A in the mouse, we examined the hybridization pattern of mouse genomic DNA to a mouse brain aldolase A cDNA. Genomic DNA was digested with BamHI, EcoRI, and HindIII and then hybridized to either an aldolase A or B cDNA clone as shown in Fig. 2 (see Miniprint Supplement). The aldolase B probe showed hybridization to single bands in each enzyme digest, consistent with the presence of that gene as a single copy as it is found within the human (46), rat (48), and chicken (6) [...] copies of aldolase A or aldolase A-like sequences. Identical results were obtained by repeating the analysis using exon 3- and exon 9-specific probes (data not shown), suggesting a repeat is not associated with any particular region of the aldolase A cDNA.
The finding of multiple aldolase A-like sequences within the mouse genome is consistent with reports of multiple pseudogenes both in rat (20) and human (45). Therefore, some bands may be due to pseudogenes for mouse aldolase A. Nevertheless, the possibility still exists that mouse aldolase A is encoded by more than one gene, each with its own unique first exon, as opposed to being encoded by a single gene capable of producing all anticipated mRNA variants. In order to distinguish among these possibilities, a comprehensive approach to cloning the mouse aldolase A gene(s) was taken.
The possibility of multiple genes was explored by using homologous recombination in vivo to preferentially select only those clones most similar to coding region probes (11,23,25,43,47). A mouse genomic library was constructed in the vector Syrinx 2A and was allowed to undergo recombination with probe plasmids bearing exons 3 or 9. Recombinants were selected for on the proper hosts, and 72 plaques from each trial were picked and analyzed by hybridization to each probe. The results are shown in Table I. We further selected for clones that were most likely to span the entire gene by examining only those clones hybridizing to both exons 3 and 9. Forty-four were restriction-digested, and six sisters were identified. Three sisters predominated, constituting 34 of 44 cases. All three were found to have overlapping restriction maps which proved to encompass the entire aldolase A gene (Fig. 3A, Miniprint Supplement).
To assess the possibility of a single unique sequence, 70 aldolase A clones, isolated independently by hybridization to a mouse brain aldolase A cDNA probe, were screened with oligonucleotides specific to the optional exons N, M, and H (Fig. 3C, Miniprint Supplement). One clone, λ16, showing hybridization to all three oligonucleotide probes was selected for further analysis. This clone proved identical by restriction mapping to the most frequently retrieved clones obtained by homologous recombination (Fig. 3A, Miniprint Supplement).
In an effort to show these cloned sequences were representative of a single locus, an internal restriction fragment corresponding to the intronic region between exons M and H was isolated and used as a hybridization probe against genomic DNA digested with EcoRI, HindIII, X&I, or SacI. In addition it was also hybridized against the three most prevalent recombination clones to further substantiate identity. As can be seen in Fig. 4 (Miniprint Supplement), a single genomic DNA fragment was identified in each digest, and, furthermore, the sizes of the SacI and HindIII genomic fragments correspond exactly to the fragments produced from the recombinant clones. Therefore, we conclude that the aldolase A sequences cloned represent the genomic sequences most similar to a mouse brain aldolase A cDNA and contain all sequence information needed to encode anticipated alternate 5' exons. Most importantly, these cloned sequences all overlap an identical single copy sequence.
N-specific Sequences Are Not Detected in a Variety of Mouse Tissues-In order to determine first exon utilization in steady state mRNA populations in the mouse, and in particular to look for the presence of the N exonic sequences, both RNase protections and primer extension analysis were performed. Initially, genomic fragments representing alternative exon sequences were subcloned into vectors containing T3 and T7 bacteriophage promoters such that uniformly labeled cRNA probes could be generated. These subclones, designated pAN2, pAM, and pAB, their relative lengths, positions in the genomic clone, and the sizes of their anticipated protected fragments are shown in Fig. 6. A variety of mouse tissues was selected for analysis including those which correspond to tissues in the human where N was shown to be expressed at the highest levels, such as adult spleen and muscle. Mouse liver and mRNA from cultured hepatoma cells were also included as this was the site where N transcripts were first described in the human (40).

The H exon was specifically protected when hybridized to probe pAB and produced two sets of protected fragments in almost all tissues where the H exon is expressed (Fig. 6). This corresponds to the two transcriptional start sites, approximately 50 bp apart, previously identified by us in the mouse (8) and also seen in the rat (20) and human (18,32). Bands of the greatest intensity were seen in mRNAs isolated from the brain (lane 14) and hepatoma cell line BWTG3 (44) (lane 17). Lower levels of exon H were detected in muscle (lane 15) and spleen (lane 16). Interestingly, we see in muscle a greater relative abundance of the second protected fragment size, suggesting more frequent utilization of the second start within the H exon. This confirms our previous observation that a small portion of transcripts in adult mouse do indeed arise from the H promoter and use this second cap site (8).
There also appears to be a preferential utilization of the 5' start site in hepatoma in contrast to brain, where both starts are roughly equivalent. We were unable to detect the expression of H exon sequences by RNase protection using 50 µg of adult mouse liver (lane 18) where aldolase B is normally expressed. This corroborates Northern blot analysis which also fails to detect aldolase A mRNAs in adult mouse liver. In contrast with exon H, hybridization of mRNAs to probe pAM detected exon M only in muscle tissue (see lanes 9-13) as would be anticipated.
From these results we conclude that our assay conditions, our probes, and the integrity of our RNA were adequate to reveal the distribution of exons H and M. The distribution of these sequences in the mouse is conserved relative to the rat and human. However, such is not the case for expression of the N exon.

Concurrent RNase protections utilizing aliquots of the same mRNA preparations and probe pAN2 did not show protected fragments in either adult spleen (lane 3) or muscle (lane 2), where, in similar amounts of mRNA, this exon was detected in the human. Furthermore, we failed to detect the presence of this sequence in adult brain (lane 1) or liver (lane 5) or in the mouse hepatoma cell line BWTG3 (lane 4). Additionally we have examined mRNA from fetal muscle and the mouse myoblastic cell line C2C12 and again saw no evidence of any expression of the N exon (data not shown). In order to verify the competency of our probe pAN2 to detect N exon sequences, we prepared an additional genomic subclone which included the last 37 nucleotides of the N2 exon along with the 3'-flanking sequence. Unlabeled RNAs generated in vitro from this subclone were hybridized to labeled pAN2 cRNAs as a control for RNase protection.
The protection product, as seen in lane 7, was the appropriate size and was clearly detectable. In addition, titration studies (not shown) with similarly generated in vitro RNA from the same control subclone demonstrate that we can successfully detect a 1000-fold lower level of N-type transcripts than that shown in Fig. 6, lane 7. The above RNase protection experiments were repeated using a probe spanning exon N1 (data not presented). The findings corroborated the results obtained using probe pAN2 and eliminated alternative splicing as the determinant of N2 absence.

Two different oligonucleotides were utilized for the primer extension assays to again check for the existence of N-specific exons within steady state mRNA. The first was designed to hybridize to sequences located downstream of the AUG sequence in exon 2 (Fig. 7) and thus should hybridize to any aldolase A mRNA irrespective of first exon. Using RNA from spleen, brain, and muscle as seen in lanes 1-4 of Fig. 7, extension products from this exon 2 primer were detected in each tissue. Two start sites within exon H (extension products of 210 and 172 nucleotides long) were seen with spleen and brain mRNA, confirming the RNase protection data. Although the levels of aldolase A were low in spleen, longer exposure of the autoradiographs (compare lanes 1 and 2) shows the characteristic double start sites. An additional extension product of 137 nucleotides was seen in the brain. We have not found that band to be consistently reproducible (see Ref. 8) and attribute it to "stuttering" of the reverse transcriptase caused by extensive secondary structure in the 5'-untranslated region of the mRNA.

Fig. 7. 32P-end-labeled primers, either the exon 2 primer (lanes 1-4) or N29 (lanes 6-9), were hybridized to 2 µg of muscle RNA (lanes 4 and 6), 5 µg of brain RNA (lanes 3 and 7), or 5 µg of spleen RNA (lanes 1, 2, and 8).
mRNAs isolated from muscle resulted in an extension product of 119 nucleotides, indicative of utilization of the M exon. Although we detect small amounts of H exon-specific sequences with much longer exposures as seen previously (not shown, Ref. 8), the major aldolase A mRNA species in muscle is associated with the M exon.
A second oligonucleotide, N29, was also hybridized to samples of the same RNA preparations as the exon 2 primer. This oligonucleotide, previously utilized in our initial screening of λ genomic clones to pick out those with homology to the N exons of the human gene, is specific for hybridization to exon N2. Consistent with our RNase protections, we see no extension products with any of the tissue RNAs (Fig. 7, lanes 6-8). As a further control for the competency of the N29 primer to hybridize to N-specific sequences, we synthesized exon N2 mRNA in vitro from the pAN2 subclone containing the mouse N2 exon. This mRNA gave an appropriately sized extension product (lane 9), demonstrating that the primer was capable both of hybridizing to N-specific sequences and of being extended.
From the above results we concluded that the N exon is not expressed in the mouse either in similar tissues or at similar levels as in the human. In fact, we have yet to detect expression of these sequences in any mouse tissues, cell lines, or at any time during development.
The Cloned Aldolase A Gene Is Transcriptionally Functional-In order to demonstrate that we had indeed cloned a functional gene, the putative promoter regions were subcloned into pBLCAT3 (30). As shown in Fig. 8, the construct 7K82CAT contained 7 kbp of the aldolase A sequence beginning 2 bp upstream of the ATG initiation codon in exon 2 and ending 5 kbp upstream of exon M. This construct allows any splice initiated by M, H, or putative N exons to be completed to the common splice acceptor found in exon 2. Additionally, an "H-less" construct was created by deleting a region spanning from 195 bp 5' to 500 bp 3' of the H exon, thus deleting the H exon and all proximal promoter elements including three "GC" boxes, one "CAATT" box, and all H-associated "TATA" boxes (12,13,21). The resultant construct contained 5 kbp of the 5' exon M flanking region, exon M, a 1.1-kbp hybrid intron, and 21 bp of exon 2, allowing splicing of exon M to exon 2. The myogenic cell line C2C12 was chosen as we have previously shown that both promoters are functional in these cells with the M promoter specifically activated as differentiation occurs (8).
Transient transfections were timed to coincide with myoblast withdrawal from the cell cycle, just as myotube differentiation was beginning. As shown in Fig. 8, [...] the activity of 7KΔH82CAT was detected, possibly due to enhanced promoter M activity. This suggests the higher 7K82CAT activity may be the product of both "residual" promoter H activity and induced promoter M activity. These data demonstrate both that the cloned aldolase A promoters are transcriptionally active and that promoter M can function independently of promoter H.

Fig. 8. Constructs are schematized with open boxes representing exons as labeled from the mouse aldolase A gene and pBLCAT3. Arrows indicate cap sites, and angled lines show anticipated splicing patterns. Percent conversion was calculated as follows: [(cpm acetylated [14C]chloramphenicol (AC-CM))/(total cpm [14C]chloramphenicol (CM))] × 100, as excised from the thin layer chromatogram pictured to the right.
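The percent conversion defined in the Fig. 8 legend is a simple ratio of counts. A minimal sketch of that arithmetic follows, assuming the two inputs are the cpm excised from the acetylated spots and the total cpm recovered for the lane; the function name and the example counts are hypothetical and are not data from this study.

```python
def percent_conversion(acetylated_cpm: float, total_cpm: float) -> float:
    """Percent of [14C]chloramphenicol recovered in acetylated form.

    Implements (cpm acetylated / total cpm) x 100 as stated in the legend.
    Both arguments are assumed to be background-corrected counts.
    """
    if total_cpm <= 0:
        raise ValueError("total_cpm must be positive")
    if not 0 <= acetylated_cpm <= total_cpm:
        raise ValueError("acetylated_cpm must lie between 0 and total_cpm")
    return 100.0 * acetylated_cpm / total_cpm


# Hypothetical counts, for illustration only.
example = percent_conversion(acetylated_cpm=1500.0, total_cpm=10000.0)
print(f"{example:.1f}% conversion")
```

Comparing constructs this way only makes sense when the lanes were run with the same amount of extract and incubation time, since the ratio saturates as the substrate is used up.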

DISCUSSION
The aldolase A gene serves as an excellent system from which much can be learned about the developmentally regulated interactions of closely spaced promoter elements and multiple transcription initiation sites. The aldolase A gene in humans is known to have four and possibly as many as five such initiation sites regulated by three independent promoters producing four to five different mRNAs (18,32). To study the expression of this gene in the mouse we isolated an aldolase A cDNA from a mouse brain library (36). Subsequent use of this probe on genomic Southern blots shows that aldolase A in the mouse is part of a multisequence family. This correlates with reports of one to two aldolase A pseudogenes in the human (45) and four to five in the rat (20). It has not been conclusively proven, in these species, that all of these aldolase A-like loci are nonfunctional. We left open such a possibility in selecting for genomic sequences responsible for encoding the brain-specific cDNA by using two completely different independent approaches to isolate the gene. The homologous recombination screen, using coding region probes, unambiguously selected the most aldolase A-like clones from a genomic library by virtue of their high relative recombination frequencies. Analysis of these clones identified the bulk (>90%) as carriers of a common genomic region. This sequence was shown to be single copy and identical to a clone isolated by oligonucleotide hybridization. That clone contained all necessary information needed to encode all aldolase A mRNA variants. The inherent stringency and thoroughness of the combined screening techniques ensure the cloned mouse aldolase A sequence is the only mouse gene sequence capable of encoding the observed mouse aldolase A cDNAs (36, 38).
Recent reports of three functional promoters for the human aldolase A gene (18,32) led us to ask if such a situation existed in the mouse genome. Expression of the individual noncoding exons in a tissue-specific manner was investigated using both RNase protection and primer extension assays. Exon H was found in the highest levels in the brain followed by the hepatoma cell line BWTG3, muscle, and spleen. This exon shows two different "TATA"-directed start sites that are used differentially as is exemplified by muscle, hepatoma, and brain tissues. Exon M, as expected, was only found in muscle. Exons N1 and N2 were not found in mRNAs from any mouse tissues at any stage of development, suggesting that promoter N is not functioning in mouse as it does in the human. The absence of N-containing mRNAs is surprising since critical homologies such as the conserved splice junction consensus sequence GT-AG (5) and a "TATA" box exist in the N exon region along with other significant sequence similarities. Therefore, these sequences may represent "pseudoexons" similar to the IE-like sequence described for the αA-crystallin gene (19). Given that this assumption is correct, we can speculate as to how this situation may have arisen. Perhaps all three promoters were active prior to the rodent-primate divergence and, due to increased fitness bestowed by promoters M and H, N received less selective pressure against deleterious promoter mutations. Alternatively, constitutive promoter N activity might have been down-regulating promoters M and H passively by transcriptional interference (9). Thus selection against the weaker 5' N promoter allowed an overall net increase in aldolase A transcriptional output. As alluded to under "Results," there are conflicting reports as to the exact start site of N transcripts (18,32,40).
Interestingly, the "TATA" box corresponding to the extended N1 exon start site, indicated by the primer extension results of the human liver N-type cDNA (40), is highly conserved among all species. The more downstream "TATA" box, directing the transcription start site identified by Maire et al. (32), is not. Therefore, the exact contribution of such possible promoter mutations is hard to assess. The result in either case appears to be extinction of N promoter function and the release of exons N1 and N2 to undergo a higher rate of sequence drift, as evidenced by the greater degree of divergence in these pseudoexons when compared with the M and H exons. Last, we acknowledge that it is possible that promoter N has acquired a different tissue specificity for which we have not yet properly assayed. However, we believe that tissue-specific regulation would be unlikely given the ubiquitous distribution of N sequences in the human (32).
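The sequence features discussed above, a "TATA" motif upstream of an exon and the GT-AG dinucleotides at intron boundaries, can be checked in a cloned sequence with a simple scan. The sketch below is illustrative only: the function names and the toy sequences are hypothetical, the literal string TATA stands in for a real promoter consensus, and nothing here reproduces the sequence comparison performed in this study.

```python
import re


def find_tata_boxes(seq: str) -> list[int]:
    """Return 0-based start positions of every 'TATA' occurrence.

    A lookahead is used so that overlapping matches are reported too.
    """
    return [m.start() for m in re.finditer(r"(?=TATA)", seq.upper())]


def has_gt_ag_boundaries(intron: str) -> bool:
    """True if the intron begins with GT and ends with AG (GT-AG rule)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


# Toy sequences, invented for illustration; not aldolase A sequence.
toy_promoter = "GGCCTATAAAGGCTCAGT"
toy_intron = "GTAAGTCCTTTCAG"

print(find_tata_boxes(toy_promoter))
print(has_gt_ag_boundaries(toy_intron))
```

The presence of these motifs is necessary but not sufficient for promoter or splice site function, which is consistent with the observation that the mouse N region retains them without detectable expression.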
As a test for transcriptional competency of the cloned gene sequence, the promoter region was fused into a CAT expression vector. Expression was obtained when 7K82CAT (the wild-type promoter) was transfected into the myogenic cell line C2C12. The H promoter was deleted from this construct, creating 7KΔH82CAT, which also showed expression in C2C12 cells. Preliminary time course data (not shown) and the data shown suggest that the construct 7KΔH82CAT contains those elements necessary for myotube-specific expression and that promoter H may be down-regulated upon induction of promoter M during myogenesis. This supports the concept of promoter interference, which has been shown to occur between the alternative promoters in the Drosophila melanogaster alcohol dehydrogenase gene (9). Alternatively, down-regulation may be specifically mediated by negative factors. These constructs will be used for deletion analysis in conjunction with the C2C12 cell line as a system to delineate the essential promoter elements regulating aldolase A expression and to subsequently identify and purify trans-acting factors that modulate aldolase A gene expression.
In summary, we have isolated a functional mouse aldolase A gene and have investigated both its structure and the pattern of expression of its alternative first exons. This investigation has led us to conclude that the mouse aldolase A gene has two alternative functional promoters and exon one sequences and one set of pseudoexons. The pseudoexons appear to be the result of a promoter extinction event. Future experiments will address this more fully and should determine essential cis-acting regions involved in promoter M and H utilization.