The Mouse Ribosomal Protein L7 Gene
ITS PRIMARY STRUCTURE AND FUNCTIONAL ANALYSIS OF THE PROMOTER REGION*

The expressed gene coding for mouse ribosomal protein L7 (rpL7) was structurally and functionally characterized.
It consists of seven exons, spans 3107 base pairs, and its coding sequence initiates within exon 1. The primary structure of mouse rpL7 (270 amino acids), as inferred from the nucleotide sequence of the exons of the gene, and from the cDNA, is 12 residues longer than the rat counterpart.
The rpL7 gene shares common structural features with most other mammalian ribosomal protein genes analyzed thus far. These include the lack of a canonical TATA box and a major transcription initiation site at a cytidine residue embedded in a stretch of 14 pyrimidines, flanked by C + G-rich regions. Transient expression assays revealed that the promoter region of the rpL7 gene bears several regulatory elements, both upstream of the cap site and within the transcribed portion of the gene. One internal regulatory element was assigned to the first intron and a second one to a 20-base pair region spanning the first exon-intron junction. The activity of a deletion mutant of the rpL32 gene, lacking its internal elements, can be rescued by insertion, in the sense orientation, of the corresponding elements from the rpL7 gene. The unique spatial organization of the regulatory elements in the rpL7 gene, as well as in other murine ribosomal protein genes examined thus far, might indicate that this common architecture is involved in the mechanism coordinating their expression.
* This research was supported by grant 86-00070/2 from the United States-Israel Binational Science Foundation (BSF), Jerusalem, Israel and by the Sir Zelman Cowen Universities Fund. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The GenBank accession numbers of mouse rpL7 gene, mouse rpL7 cDNA, mouse r-protein L7 and the 3' end of mouse rpL7 processed gene are M29015, M29016, M29017, and M29018, respectively.
¹ The abbreviations used are: rp, ribosomal protein; ICE, internal control element; kb, kilobase(s); bp, base pair(s); CAT, chloramphenicol acetyltransferase.

The biosynthesis of mammalian ribosomes requires an equimolar accumulation of four RNA molecules and over 70 different ribosomal protein (rp)¹ species. This stoichiometry is achieved by coordinate regulation at various levels of gene expression and under diverse physiological conditions (for review see Ref. 1). Thus, the steady state levels of different rp mRNAs in proliferating murine cells are relatively uniform (2), resulting from comparable transcriptional efficiencies (3). During differentiation of mouse myoblasts, the transcription of rp genes is coordinately repressed (4). In contrast, the abundance of various rp mRNAs is elevated in regenerating rat liver (5). Translation of rp mRNA is also coordinately controlled, as was shown in mouse lymphosarcoma cells upon glucocorticoid treatment (6) and in differentiating (4) or insulin-treated mouse myoblasts (7).
A comparative analysis of the promoter region of three mouse rp genes (8-11), and to a lesser extent of two different human rp genes (12, 13), has revealed several common features including (a) the lack of a canonical TATA box, and (b) a major site of transcriptional initiation located at a cytosine residue which is embedded in a pyrimidine tract, flanked by sequences of high G + C content. More refined characterization of the elements comprising the promoters of mouse rpL30, rpL32, and rpS16 genes disclosed a similar general organization, particularly the presence of elements downstream of the cap site (3, 14-17). Yet the trans-acting factors interacting with the cis-regulatory elements of rpL30 and rpL32 are clearly distinct from those interacting with the corresponding elements in rpS16. Moreover, the latter lacks transcription regulatory elements downstream of exon 1 (3). To verify whether other rp genes can be categorized according to their promoter architecture into one of these types (L30/L32 or S16), or belong to yet another distinct type, it is essential to carry out a comparative study of the structure, organization, and regulatory properties of additional rp genes. Screening of a mouse genomic library resulted in the isolation of seven genes encoding mouse rpL7, of which only one (L7-16b) exhibits a high degree of homology with L7 mRNA and contains introns, whereas the others appear to be processed pseudogenes (18).
In this report we present detailed structural and functional analyses of the transcriptionally active mouse rpL7 gene (L7-16b). Our data indicate that this gene is 3.1 kilobase pairs long and encodes a protein of 270 amino acids which is 12 residues longer than the rat counterpart.
The rpL7 gene is composed of seven exons, exhibits distinctive structural features unique for rp genes and requires sequences within exon 1 and intron 1 for its expression. The intragenic regulatory elements coevolved in several mammalian rp genes (3, 14-17) and might be involved in the coordination of rp gene expression.

EXPERIMENTAL PROCEDURES

Primary Structure of rpL7 Gene-The isolation of an intron-containing gene (L7-16b) and its preliminary restriction [...] initiates with a cytosine residue embedded in a tract of 14 consecutive pyrimidines. Furthermore, the 5' region of the rpL7 gene exhibits another feature of mouse rp genes, namely the absence of a canonical TATA box about 25 nucleotides upstream of the transcriptional start site. This region, however, contains a TTTAA sequence flanked by stretches of high (>70%) cytosine-plus-guanine content. Mapping of the 3' end of the rpL7 gene was inferred from the respective sequence of an L7 processed gene (L7-18), isolated and mapped previously (18) [...] an ATG codon within exon 1 following 24 nucleotides of 5'-untranslated region (Fig. 1). The translational reading frame terminates in a TAA codon at position 2678 within exon 6, leaving 63 nucleotides of 3'-untranslated region. A comparison of the fused exonic sequence of mouse rpL7, and of the deduced amino acid sequence of the corresponding protein, with the respective rat sequences revealed a significant divergence (Fig. 4). Moreover, the mouse rpL7 exon sequences contain 30 nucleotides between positions 82 and 111 which are entirely absent from the rat mRNA (Fig. 4). To further confirm this divergence between the mouse and rat L7 sequences, and to exclude the possibility of cloning artifacts, we have also determined the nucleotide sequence of a previously cloned mouse L7 cDNA (2). The sequence of this 727-base-long cDNA is identical to that located between positions 30 and 756 of the fused exonic sequences (Fig. 4). Thus, mouse rpL7 (270 amino acids) is 12 residues longer than the rat counterpart due to the presence of 10 extra codons between positions 20 and 29 and two at the carboxyl terminus (positions 269 and 270). Moreover, the nucleotide differences result in replacement of ten amino acids, of which seven represent conservative substitutions.
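The coordinate bookkeeping in this paragraph can be checked with a short sketch. It assumes, as stated above, a 24-nucleotide 5'-untranslated region, so that the A of the ATG is nucleotide 25 of the fused exonic sequence. The function and constant names are illustrative and not part of the original analysis.

```python
# Coordinates refer to the fused exonic (mRNA-like) sequence, 1-based.
UTR5_LENGTH = 24  # nucleotides preceding the ATG, as stated in the text


def codon_to_nt_range(codon_index, utr5_length=UTR5_LENGTH):
    """Return the (first, last) nucleotide positions of a 1-based codon."""
    first = utr5_length + 3 * (codon_index - 1) + 1
    return first, first + 2


def span_length(start, end):
    """Length of an inclusive 1-based interval."""
    return end - start + 1


# The 10 extra codons (20 to 29) should cover nucleotides 82 to 111.
extra_start = codon_to_nt_range(20)[0]
extra_end = codon_to_nt_range(29)[1]
print(extra_start, extra_end, span_length(extra_start, extra_end))

# The cDNA matching positions 30 to 756 should be 727 bases long.
print(span_length(30, 756))
```

Under these assumptions the codon and nucleotide coordinates quoted in the text agree with each other: codons 20 to 29 map onto the 30 nucleotides at positions 82 to 111.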
Expression of rpL7 Gene-The sequence homology between the mouse L7 cDNA and the rpL7 gene (L7-16b) does not necessarily indicate that the latter is an expressed member of this multigene family. To directly address this question, a specific probe for the first intron was isolated and hybridized with size-fractionated nuclear poly([...] 2) and from exon 1 (Fig. 1) (Fig. 5 DNA), hybridized to a set of discrete RNA components which were also detected by the cDNA probe (Fig. 5 RNA). These data indicate that the rpL7 gene is indeed transcribed. It should also be noted that the size of the largest RNA component (3.4 kb) detected by both probes is in good agreement with the expected size of the primary transcript of the rpL7 gene, taking into account the size of the gene (3107 bp) and the assumption that the poly(A) tail on the pre-mRNA is 200 nucleotides long (43). The apparent discrepancy between the number of introns in the rpL7 gene (six) and the number of RNA components revealed by the cDNA probe (ten) can be explained by the presence of several alternative intron excision pathways, as was shown previously for other rp pre-mRNAs (8,9). Nevertheless, unless additional intron-specific probes are used, we cannot exclude the less likely possibility that the cDNA probe also detects processing intermediates of another intron-containing L7 gene(s).

Functional Characterization of L7 Promoter-Functional analyses of promoters of mouse rp genes have revealed that expression of rpL30 and rpL32 genes is regulated by elements spanning the 5'-flanking region, the relatively short (38 and 46 bp, respectively) noncoding exon 1 and sequences within the 5' end of intron 1 (3, 14). The rpS16 gene, however, possesses a relatively long first exon (101 bp) containing the translational start site and lacks transcriptional regulatory elements within intron 1 (17). Exon 1 of the rpL7 gene resembles those of rpL30 and rpL32 genes in its size (38 bp), yet, like that of rpS16, it bears the translational initiation codon. To verify whether the rpL7 promoter region is of the L30/L32 type, the S16 type, or yet another distinct variant, we linked various portions of this region to a promoterless CAT gene. These chimeric genes were transfected into mouse Ltk- cells and the effectiveness of the different L7 promoter segments was evaluated by measurements of CAT activity in transient expression (Fig. 6). We defined the activity of the -300 to +248 construct as 100%. Deletion of sequences 5' to -87 reduced the CAT activity to about 70% (pL7CAT6). Further deletion to -56 decreased activity to 40% (pL7CAT9). The 3' boundary of the rpL7 promoter was determined by constructing progressive 3' deletions starting at position +248 and ending at +27 (Fig. 6). Deletion of intronic sequences between +46 and +248 resulted in about 50% reduction in CAT activity (compare pL7CAT6 with pL7CAT8), suggesting that the boundary of a regulatory element lies within this region, as previously demonstrated for rpL30 and rpL32 genes (3,13). A further deletion to position +27, within exon 1, almost completely abolished CAT activity, indicating that another important element resides between positions +28 and +46. Extending the 5'-flanking region up to position -300 could not compensate for the missing intragenic sequences (see pL7CAT4 in Fig. 6).
Interestingly, the deleted fragment contains the sequence 5'-CAGCCTCC-3', which is located in the anti-sense orientation at positions +34 to +27. This sequence seems to be related to similar sequences, which have been shown previously to comprise the binding site of the δ-factor in rpL30 (CGGCCATC at positions +15 to +22) (3) and in rpL32 (CTGCCATC at positions +30 to +37) (16). The involvement of this motif in transcriptional regulation has been demonstrated by base substitution within the δ-factor binding site of rpL30, which led to a decreased binding of the δ-factor, with concomitantly reduced promoter activity (3). To examine whether this or other sequences in its close vicinity might be involved in transcriptional regulation of rpL7, we have introduced them inversely within a 23-bp fragment (+24 to +46), downstream of the deleted L7 promoter (pL7asL7CAT4, Fig. 6). Clearly, despite the opposite orientation of this sequence, it exhibits a twofold stimulatory effect on the CAT activity. This stimulation might reflect a component of the internal control element which is orientation independent and therefore should function at the DNA level.
Since the mouse c-myc gene has a positive transcriptional element at the 3' end of exon I (65), a region that also contains a binding site for the rpL30/rpL32 δ-factor (14), we have examined its ability to complement the L7 deletion mutant. A 59-bp fragment (+521 to +579), spanning the exon I-intron I boundary of the c-myc gene and containing the motif CAGCCTTC, which is similar to that found in the δ-factor binding sites of rpL30 and rpL32, was cloned downstream of the rpL7 deleted promoter (pmyc-L7CAT4). Fig. 6 shows that the myc transcriptional regulatory element causes about 4-fold induction in the CAT activity, quantitatively resembling its effect on the activity of the enhancerless SV40 early promoter (65).
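The relationship among the motifs discussed above can be made concrete with a small comparison. The sketch below uses only the four octamers quoted in the text. Counting mismatches position by position is an illustrative choice, not the criterion used in the original study, and the helper names are hypothetical.

```python
# Octamers quoted in the text (5' to 3', as written there).
MOTIFS = {
    "rpL7 (+34 to +27, anti-sense)": "CAGCCTCC",
    "rpL30 (+15 to +22)": "CGGCCATC",
    "rpL32 (+30 to +37)": "CTGCCATC",
    "c-myc (exon I-intron I)": "CAGCCTTC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


l7 = MOTIFS["rpL7 (+34 to +27, anti-sense)"]
for name, seq in MOTIFS.items():
    print(f"{name}: {seq}  mismatches vs rpL7 motif = {mismatches(l7, seq)}")

# Because the rpL7 motif lies on the anti-sense strand, the sense strand
# reads as its reverse complement over the same positions.
print(reverse_complement(l7))
```

By this simple count the rpL7 octamer differs from the c-myc motif at one position and from each of the rpL30 and rpL32 motifs at three positions, with the central GCC shared by all four.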
In most constructs the first AUG codon is the L7 initiation codon at +25, immediately followed by a TAA termination codon at +40, except for pL7CAT4 and pL7CAT7, where the stop codon is located in the 5'-untranslated region (UTR) of the CAT transcript. Differences in CAT activity can, theoretically, be achieved by interference at any possible level during gene expression. Thus, the abrupt drop in CAT activity, achieved by deleting the intragenic sequences from +248 to +27, could be attributed to impaired translation of the transcript with the shorter 5' UTR, rather than decreased abundance of the respective mRNA. To address this question we stably transfected mouse L cells with pL7CAT3 (-300 to +248) and pL7CAT4 (-300 to +27) and evaluated the relative abundance of the CAT DNA sequences and the respective transcripts (Fig. 7). Northern blot analysis revealed that the CAT transcript in cells transfected with pL7CAT3 was readily observed, as compared to the barely detectable CAT transcript (at least 100-fold lower) in cells transfected with pL7CAT4 (Fig. 7a). Equal amounts of RNA from both transfected cultures were loaded on the gel, as evident by the similar abundance of the endogenous L7 mRNA detected in the parallel pair of lanes in the same blot (Fig. 7b). It is worth noting that the ratio of CAT activity in these transfected cells paralleled that of the abundance of the respective mRNA (data not shown). A Southern blot of genomic DNAs from the respective cultures was hybridized with a probe containing CAT sequences as well as unique sequences from intron 3 of mouse rpL32, serving as an internal quantitative reference (Fig. 7c). Densitometric scanning of the autoradiogram revealed six copies of L7CAT3 per cell as compared with one to two copies of L7CAT4. These results clearly suggest that deletion of intragenic sequences (+27 to +248) exerts its repressive effect on CAT expression by diminishing the abundance of CAT mRNA.
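Because the two stably transfected cultures carry different numbers of integrated copies, the transcript difference is easier to interpret when expressed per gene copy. The sketch below uses only the figures quoted above: an at least 100-fold difference in CAT transcript, six copies of L7CAT3 and one to two copies of L7CAT4. The per-copy normalization is an illustrative calculation that assumes expression scales linearly with copy number, which the text does not establish.

```python
def per_copy_fold_difference(fold_difference, copies_a, copies_b):
    """Fold difference in transcript per integrated copy (a relative to b).

    fold_difference: abundance in culture a divided by abundance in culture b.
    Assumes expression scales linearly with copy number (an assumption).
    """
    if copies_a <= 0 or copies_b <= 0:
        raise ValueError("copy numbers must be positive")
    return fold_difference * copies_b / copies_a


MIN_FOLD = 100       # CAT transcript, pL7CAT3 over pL7CAT4 (lower bound)
COPIES_L7CAT3 = 6    # copies per cell from the Southern blot

for copies_l7cat4 in (1, 2):
    ratio = per_copy_fold_difference(MIN_FOLD, COPIES_L7CAT3, copies_l7cat4)
    print(f"L7CAT4 copies = {copies_l7cat4}: at least {ratio:.1f}-fold per copy")
```

Under this assumption the per-copy difference would still be at least roughly 17- to 33-fold, so the copy-number difference alone would not account for the drop in transcript abundance.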
Internal Regulatory Elements of rpL7 Gene Can Rescue an rpL32 Deletion Mutant-Having demonstrated that the spatial organization of regulatory elements, in the 5'-flanking and internal sequences of the rpL7 promoter region, is similar to those of rpL30 and rpL32, it was interesting to verify whether the internal elements are also functionally equivalent. It has been shown previously that deletion of internal control elements (ICEs) of the rpL32 promoter results in a substantial decrease in transcriptional activity (13,14). An even greater repressing effect is apparent upon deletion of internal sequences (from +27 to +248) of the L7 promoter (compare pL7CAT3 and pL7CAT4 with pL32CAT5 and pL32CAT9 in Fig. 8). When the L7 ICE was inserted within a 225-bp fragment (+24 to +248) or a 97-bp fragment (+24 to +120) in the 5' to 3' orientation, immediately downstream of the L32 sequences in an L32 mutant lacking its own ICE (+12 to +77), it substituted for the missing element (Fig. 8, pL7-L32CAT9a and pL7-L32CAT9c, respectively). However, when these two L7 fragments were placed in the reverse orientation, the deletion mutant not only could not be rescued, but was even further repressed (Fig. 8, pL7-L32CAT9b and pL7-L32CAT9e). It is noteworthy that replacement of the L32 ICE by the L7 ICE renders the hybrid promoter more active than the intact L32 promoter (compare pL32CAT3 with pL7-L32CAT9a and pL7-L32CAT9c, Fig. 8). These results are consistent with the apparent difference between L7 and L32 promoters (pL7CAT3 and pL32CAT5, Fig. 8). Interestingly, insertion in the sense orientation of a 23-bp segment (+24 to +46), which contains the dominant component in the rpL7 ICE (Fig. 6), can only partially complement the L32 mutant (pL7-L32CAT9d, Fig. 8). However, unlike the two longer segments (225 and 97 bp), when placed in the antisense orientation, it still exerts a stimulatory effect (pL7-L32CAT9f, Fig. 8).
It seems that the L7 and the L32 promoters are differentially affected by the L7 ICEs. Thus, the L7 promoter activity is mostly impaired by deletion of a 20-bp segment (+28 to +46 in Fig. 6), whereas that of the L32 mutant is mostly impaired by deletion of the downstream L7 intron sequences spanning positions +47 to +248 (compare pL7-L32CAT9a with pL7-L32CAT9d in Fig. 8).
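The fragment sizes quoted in this section follow directly from their end coordinates. The sketch below assumes the usual convention that positions are numbered relative to the cap site with no position 0 (-1 is immediately followed by +1). The function name is illustrative.

```python
def fragment_length(start, end):
    """Length in bp of an inclusive interval in cap-site-relative coordinates.

    Assumes there is no position 0: upstream positions are negative and the
    transcription start site is +1.
    """
    if start == 0 or end == 0 or start > end:
        raise ValueError("coordinates must be nonzero with start <= end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the interval crosses the missing position 0
    return length


# L7 ICE fragments inserted into the L32 deletion mutant
for start, end in [(24, 248), (24, 120), (24, 46)]:
    print(f"+{start} to +{end}: {fragment_length(start, end)} bp")

# Promoter fragment whose activity was defined as 100% (-300 to +248)
print(fragment_length(-300, 248))
```

The three ICE fragments come out at 225, 97, and 23 bp, matching the sizes stated in the text.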

DISCUSSION
Evolution of rpL7 Gene-A comparison of the amino acid sequence of mouse rpL7, as inferred from the nucleotide sequence, with that of the rat demonstrates differences in composition, as well as in size. The 258 amino acids of rat L7 exhibit 96% homology with the mouse counterpart, as compared with 100% homology among mouse, rat, and human L32 (8,44,45) or S6 (46-48) and between mouse and rat L30 (9,49) or human and Chinese hamster S14 (13,50). However, even more striking is the occurrence of 10 residues between positions 20 and 29 in mouse L7 which are entirely absent from the rat L7. A closer examination of the respective nucleotide sequence of the mouse L7 mRNA discloses that the 30 nucleotides (positions 82-111), encoding the 10 extra codons, comprise the middle segment of a series of three tandem repeats which exhibit 70-87% homology between each of the segments (Fig. 9, mouse). All three repeats are encoded in a single exon (exon 2). The rat L7 mRNA, however, includes only two repeats (Fig. 9, rat), corresponding to the two outer ones (segments A and C) in the mouse mRNA. It seems that this region evolved by two duplication events of a 30-bp segment. One occurred prior to the divergence of these two rodent species and the second one in the mouse after speciation.
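The 96% figure can be reproduced from counts given in the Results, assuming it refers to the 258 rat residues aligned with the mouse sequence and to the ten amino acid replacements reported there. This is a back-of-the-envelope check, not the authors' stated method, and the names used are illustrative.

```python
def percent_identity(aligned_length, substitutions):
    """Percentage of aligned positions that are identical."""
    if not 0 <= substitutions <= aligned_length:
        raise ValueError("substitutions must lie between 0 and aligned_length")
    return 100.0 * (aligned_length - substitutions) / aligned_length


RAT_L7_RESIDUES = 258   # length of rat L7
REPLACEMENTS = 10       # amino acid replacements between mouse and rat
INSERTED_CODONS = 10    # extra residues at positions 20 to 29 in mouse
EXTRA_C_TERMINAL = 2    # extra residues at positions 269 and 270 in mouse

print(f"{percent_identity(RAT_L7_RESIDUES, REPLACEMENTS):.1f}% identical")

# Mouse L7 length implied by the rat length plus the extra residues
print(RAT_L7_RESIDUES + INSERTED_CODONS + EXTRA_C_TERMINAL)
```

The same counts also recover the 270-residue length of mouse rpL7 from the 258 rat residues plus the 12 additional ones.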
It should be noted that a similar duplication in the coding sequence of Xenopus laevis rpL1 led to the generation of two L1 molecules, within the same species, which differ in size by 5 residues (51). The apparent difference between mouse and rat rpL7 should result in altered relative mobility of these proteins in two-dimensional gel electrophoresis. The absence, to the best of our knowledge, of any report of such variation may simply reflect the lack of rigorous comparison between the electrophoretic mobility of rps from these two species. Alternatively, it might be the result of the presence of both L7 variants in each species, but one in a lower abundance and hence not readily detectable, as in the case of L1a and L1b in X. laevis (51). The latter possibility cannot be readily ruled out, as we have not investigated the entire gene family encoding L7 and hence cannot exclude the presence of more than one expressed rpL7 gene in murine cells.
The Expression of rpL7 Gene Is Controlled by External and Intragenic Regulatory Elements-Functional analysis of the promoter region of rpL7 has disclosed an organization with striking similarity to those of rpL32 and rpL30 (3,14). Thus, the rpL7 promoter possesses at least two positive elements within the 5'-flanking region: one or more upstream of position -87 and another one residing between positions -87 and -56. The latter includes a nine-nucleotide segment (5'-TTTCCGGCT-3') spanning positions -78 to -70. Closely related sequences appear in the same orientation but slightly downstream (-58 to -50) in the rpL30 promoter (5'-CTTCCGGTC-3') or reversely oriented in a similar position (-79 to -71) of rpL32 (5'-CTTCCGGCT-3').
These segments in rpL30 and rpL32 promoters include a domain which also comprises a binding site for a common factor, as evident by the efficient cross-competition (3,14). It is worth noting that although we have not examined the possible existence of a more proximal upstream element, the occurrence of such an element has been demonstrated in all other mouse rp promoters studied thus far (3,14,15).
The rpL7 promoter, like those of rpL30, rpL32, and rpS16 (3, 14, 15), extends into the transcribed portion of the gene and contains at least two ICEs. One resides in the first intron (between positions +46 and +120); another one is included in a 20-bp-long segment spanning the first exon-intron junction (positions +27 to +46).
The occurrence of intragenic regulatory elements which control gene expression at the transcription level has been demonstrated for both cellular and viral genes (52-64). Some internal regulatory elements exert their effect as typical enhancers (X-54,57,58).
Others, however, have been shown to function only in a position- and orientation-dependent manner (57, 58). The L7 ICE, with its two components, cannot be referred to as an enhancer-like element, since it functions only in its sense orientation.
Likewise, analysis of the intron regulatory element of rpL32 revealed that its activity is also repressed by translocation or inversion (15). Nevertheless, the short L7 regulatory segment (+24 to +46) seems to operate in an orientation-independent manner when linked to the L32 promoter.
The contribution of the L7 ICE to CAT expression clearly cannot be attributed to a nonspecific effect of the upstream AUG codon within the L7 sequence for the following reasons: (a) the L7 initiation codon is present both in pL7CAT3, which we have considered to include the fully active promoter, and in pL7CAT4, which exhibits only 1% of the activity of the former; (b) a similar stimulatory effect of the 23-bp segment, spanning positions +24 to +46, can be obtained when positioned in the sense or the anti-sense orientation with respect to the L32 mutant promoter. This is evident despite the fact that the resulting transcript of the latter lacks the AUG codon; (c) removal of most of the intronic sequences (from position +46 to +248) leads to a marginal effect on the L7 promoter (compare pL7CAT8 with pL7CAT6, Fig. 6), but a very dramatic one on the L32 promoter (compare pL7-L32CAT9a with pL7-L32CAT9d, Fig. 8). This prominent discrepancy is apparent, although the upstream L7 initiation codon is similarly positioned with respect to the CAT sequence in both pL7CAT8 and pL7-L32CAT9d. Northern blot analysis indicates that deletion of the L7 ICE from pL7CAT4 greatly impaired the accumulation of the CAT transcript (Fig. 7). Although we have not directly measured the relative rate of transcription, several lines of circumstantial evidence support the notion that these ICEs function at the transcriptional level. (a) The requirement of the intron promoter element of rpL32 for maximal transcriptional activity was demonstrated by run-on assay in isolated nuclei (15). This element, together with the rpL32 exon regulatory element, can be efficiently substituted by the L7 ICE (+24 to +120). (b) A 23-bp segment, spanning the first exon-intron junction of the rpL7 gene, contains a motif similar to that found in the rpL30/rpL32 δ-factor binding site.
This segment can exert a small, yet reproducible, stimulatory effect on both L7 and L32 promoter mutants, when placed in the anti-sense orientation. These results support the notion that an element within this sequence is orientation-independent and therefore likely to operate at the DNA level. (c) A 59-bp segment containing the mouse c-myc transcriptional regulatory element, which also bears a similar motif, can partially complement the L7 promoter mutant. (d) The internal elements of rpL30 and rpL32 genes bind similar nuclear factors, as evident by cross competition between these elements (3). Similarly, the L7 ICE specifically binds to a factor(s) in nuclear extracts and competes with the L30 ICE for binding to nuclear factor(s).³ In light of these facts it seems that the internal elements of rpL30 and rpL7 genes constitute a binding site(s) for some common factor(s), which conceivably is involved in the transcriptional regulation of these genes. It is noteworthy that the rpS16 gene bears a distinctive set of cis-regulating elements, which do not cross compete with those of rpL30 and rpL32 (17). Yet, the ICEs of the rpL32 gene can be adequately replaced by internal sequences (downstream to position +29) of the rpS16 gene (11). Thus, it seems that despite differences in the nature of the cis-regulating elements, the general architecture of the promoter modules of the rp genes can account for their coordinate expression.