A purine-rich exon sequence enhances alternative splicing of bovine growth hormone pre-mRNA.

A previous study has demonstrated that deletion of a region within the last exon of bovine growth hormone (bGH) pre-mRNA results in almost complete retention of the upstream intron (Hampson, R. K., LaFollette, L., and Rottman, F. M. (1989) Mol. Cell. Biol. 9, 1604-1610). We now demonstrate that insertion of a simple purine-rich element (GGAAG), which is present within the deleted region, activates intron splicing upon expression in transfected cells. Moreover, several repeats of the GGAA(G) sequence restore splicing to near wild-type levels and direct the binding of a factor present in HeLa cell nuclear extracts. Mutation of the 5'-splice site toward U1 small nuclear RNA complementarity eliminates dependence on the downstream exon sequence for splicing. These results support a model for alternative intron retention in which purine-rich sequences function as part of an "exonic splicing enhancer" to complement a weak 5'-splice site and thereby facilitate intron removal. As a result, the majority of bGH mRNA is processed to remove intron D while still allowing a fraction of bGH mRNA containing the intact intron to reach the cytoplasm.

Alternative pre-mRNA splicing is an important mechanism in gene regulation and can include the use of alternative 5'- and/or 3'-splice sites, exon skipping, mutual exon exclusion, and/or intron retention (1-4). Although intron retention is common in viral pre-mRNAs, it appears to be a relatively rare form of alternative splicing in vertebrates (5). It is believed that intron-containing mRNAs normally are prevented from being transported to the cytoplasm due to the formation of the spliceosome complex, which commits the pre-mRNA to the splicing pathway (1, 6-8). However, there are examples of alternative intron retention in vertebrates that result in novel proteins due to translation of the intron sequences (9-12).
Therefore, there must be a mechanism by which a fraction of the mRNA molecules, containing an intact intron, escape the splicing pathway. Because both the 5'- and 3'-splice sites are required for spliceosome commitment complex formation (13), it is possible that suboptimal splice sites may prevent complex formation, thereby allowing introns to be retained and intron-containing mRNAs to be transported to the cytoplasm. Bovine growth hormone (bGH) pre-mRNA undergoes alternative splicing in which the last intron (intron D) is retained in a small fraction of the cytosolic mRNA (11). An open reading frame is maintained through the intron and into the last exon, leading to production of a bGH-related protein differing from normal bGH at the carboxyl terminus. The human growth hormone V gene, which contains an open reading frame in intron D with a high degree of sequence similarity to bGH, also retains intron D in a fraction of the human growth hormone V mRNA in the placenta (14), suggesting a possible physiological role for the protein resulting from intron retention. In previous studies (15), we identified a 115-nucleotide FspI-PvuII region (FP sequence) within the last exon (exon 5) of bGH pre-mRNA that is required for efficient splicing of intron D upon expression in transfected cells. Interestingly, this 115-nucleotide fragment enhances splicing in either orientation, which led us to focus on a 10-base pair palindromic sequence (CTTCCGGAAG) present within the FP sequence. Furthermore, when bovine prolactin (bPRL) intron D is placed immediately upstream of bGH exon 5, splicing occurs irrespective of the presence of the downstream FP sequence, suggesting that a specific component(s) in bGH intron D necessitates the presence of the FP sequence (15).
* This work was supported by United States Public Health Service Grant DK32770 (to F. M. R.) from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed. Tel.: 216-368-3420; Fax: 216-368-3055.
The abbreviations used are: bGH, bovine growth hormone; FP, FspI-PvuII; bPRL, bovine prolactin; CHO, Chinese hamster ovary; snRNA, small nuclear RNA.
In this report, we identify the purine-rich portion of the 10-base pair palindrome as a critical sequence required for stimulation of intron D splicing and show that a protein present in nuclear extracts binds specifically to multiple copies of this sequence. We show that a suboptimal 5'-splice site leads to the requirement for the FP sequence in efficient splicing. This simple purine sequence (GGAAG), perhaps in conjunction with other purine sequences found in exon 5, compensates for the weak 5'-splice site in intron D while allowing a fraction of the bGH mRNA to retain intron D.
MATERIALS AND METHODS
DNA Transfection of CHO Cells and S1 Nuclease Mapping of RNA-CHO cells were maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum, nonessential amino acids, and antibiotics (penicillin and streptomycin). Cells were transfected using Lipofectin (Life Technologies, Inc.) under the conditions recommended by the manufacturer. Cells were harvested, and polysomal RNA was prepared 48 h after transfection as described (15). Polyadenylated RNA was prepared by oligo(dT)-cellulose chromatography (16).
Probes for S1 nuclease protection experiments were 3'-end-labeled with T4 DNA polymerase and [α-32P]dCTP. S1 nuclease analysis of poly(A)+ RNA was carried out as described earlier (15). Scanning of S1 nuclease-protected fragments was carried out using a PhosphorImager (Molecular Dynamics Inc.), and the resulting peaks were integrated by area.

linked as previously described (17), with the exception that the reactions were treated with a combination of RNases A and T1 after crosslinking.

Initial Characterization of 10-Base Pair Palindrome That Stimulates Splicing of bGH Intron D-Earlier studies (15) demonstrated that deletion of a 115-nucleotide FP sequence within the last exon (exon 5) of bGH pre-mRNA results in marked inhibition of splicing of the upstream intron (intron D) of bGH mRNA. A 10-base pair palindromic sequence (CTTCCGGAAG) within the FP exon sequence was identified, which when inserted into the deleted region of pFPD, restored splicing to levels that were intermediate between pFPD and wild-type bGH mRNA (15). However, this palindrome is not the exclusive activating element contained in the FP sequence because deletion of just this 10-nucleotide sequence resulted in only partial diminution of splicing relative to that of the entire FP deletion. Only when other sequences within the FP element that resemble the palindrome were also deleted did intron D retention approach the levels observed with pFPD (15). Therefore, we sought to define the critical component(s) of this palindromic sequence responsible for the positive influence on splicing of intron D.
To determine whether the palindromic sequence must be uninterrupted for splicing stimulation to occur, the palindrome was disrupted by insertion of GGCC in the center of this sequence (CTTCCGGCCGGAAG; see "Materials and Methods"). Transient expression revealed that this extended palindrome gives rise to splicing levels nearly identical to those observed with the wild-type palindromic sequence (Fig. 1B, lanes 3 and 4), suggesting that the precise sequence of the entire palindrome is not critical. There is another sequence within the 115-nucleotide FP element that is identical to the palindrome at 8 out of 10 residues. Therefore, we reasoned that multiple copies of the activating sequence may be necessary to obtain wild-type levels of splicing. To test this hypothesis, a DNA fragment containing three copies of the 10-base pair palindrome was inserted in place of the FP region. These copies were separated by 11 and 12 base pairs of "nonspecific spacer" sequences designed to be nonpalindromic in nature and to minimize potential secondary structure using an RNA secondary structure prediction algorithm (19-21). To minimize further any influence of the spacer sequences, the triple repeat-containing fragment was inserted in both orientations. The results of this experiment (lanes 5 and 6) indicate that splicing of intron D was not improved significantly by insertion of three copies of the palindrome relative to insertion of a single copy (lane 3). These results suggest that the precise sequence of the palindrome is not required and that simple repetition of the entire palindrome cannot account for wild-type splicing levels, although multiple copies of the activating sequence(s) may still be required (see "Discussion").
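As an illustration of why these sequences are described as palindromic, the following minimal Python sketch checks whether a DNA sequence equals its own reverse complement. The function names are hypothetical and introduced only for this example; the sequences are those quoted in the text. It is not part of the original analysis.

```python
# Illustrative sketch only: function names are assumptions, not from the paper.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """A DNA palindrome reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    sequences = {
        "wild-type palindrome": "CTTCCGGAAG",
        "GGCC-extended palindrome": "CTTCCGGCCGGAAG",
        "purine-rich half": "GGAAG",
    }
    for name, seq in sequences.items():
        print(f"{name}: {seq} palindromic={is_palindromic(seq)} "
              f"revcomp={reverse_complement(seq)}")
```

Under this check, the GGCC insertion preserves the palindromic character of the element, and the purine-rich half (GGAAG) is the reverse complement of the pyrimidine-rich half (CTTCC), so a palindromic element presents a GGAAG sequence on the transcribed strand in either orientation.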
Palindrome Does Not Stimulate Intron D Splicing through U1 snRNA Complementarity-There are reports indicating that the binding of U1 small nuclear ribonucleoprotein to exon sequences can influence splicing in either a positive (22) or negative (23) manner. These exon sequences exhibit complementarity to U1 snRNA, and inspection of the bGH palindromic sequence in its normal context revealed complementarity to the U1 snRNA consensus sequence-binding site ((C/A)AGGU(A/G)AGU) (24) at 6 out of 9 nucleotides. This level of U1 complementarity is comparable to the examples cited above, suggesting that U1 small nuclear ribonucleoprotein may bind to the 10-base pair palindrome. To assess the importance of the U1 snRNA complementarity in bGH exon 5 to intron D splicing stimulation, the palindrome was mutated to either enhance or disrupt this potential interaction. The palindromic sequence plus the 2 adjacent 3'-nucleotides in its normal context (CTTCCGGAAGGA) were mutated away from the U1 consensus sequence (CCGCAAGCA) or toward the consensus sequence (CAGGAAAGT) and inserted into the deleted region of pFPD (Fig. 2). As predicted by the above hypothesis, mutation away from the U1 binding consensus sequence (lane 4) attenuated splicing relative to the palindromic sequence (lane 3). However, mutation toward the U1 binding consensus sequence also decreased splicing efficiency in comparison to the wild-type palindromic sequence (lane 5), instead of stimulating splicing as predicted. Thus, it appears that the influence of the bGH palindromic sequence on intron D splicing is not mediated through binding of U1 small nuclear ribonucleoprotein to the palindromic sequence, although we cannot rule out its involvement through a noncanonical interaction(s).

Purine-rich Exon Sequence Is Responsible for Positive Influence on Splicing of bGH Intron D-The palindromic sequence (CTTCCGGAAG) contains separate pyrimidine- and purine-rich domains. To determine if both of these domains are necessary for the splicing activation, they were inserted separately into the deleted region of pFPD (Table I). Insertion of the pyrimidine-rich domain (CTTCC) had little effect on splicing of bGH intron D compared to pFPD. However, insertion of the purine-rich half (GGAAG) of the palindrome into pFPD markedly stimulated intron D splicing compared to pFPD. Moreover, extending the purine sequence to contain multiple tandem copies of GGAA(G) (Table I) resulted in further stimulation of splicing. In contrast, extending the pyrimidine-rich sequence in an analogous manner, if anything, reduced splicing relative to pFPD. These data suggest that only the GGAAG half of the palindrome is involved in stimulation of splicing and that multiple GGAAG-like sequences within the FP element may be required to produce wild-type levels of intron D splicing.
... entire FP sequence, we synthesized various 32P-labeled RNAs that contained the FP sequence, FP-deleted exon 5, or FP-deleted exon 5 into which was inserted G7A6, C7T6, or the 10-base pair palindrome (Fig. 3A). These RNAs were incubated in a HeLa cell nuclear extract in the presence or absence of Mg2+-ATP, UV-irradiated, and treated with RNases A and T1 (Fig. 3B). As observed previously, the FP sequence cross-linked to a 55-kDa and a 35-kDa protein (lanes 1 and 2), whereas FP-deleted exon 5 cross-linked only to the 55-kDa protein (lanes 9 and 10). Since the 35-kDa protein bound to the FP sequence and not the FP-deleted sequence, insertion of the palindromic, G7A6, or C7T6 sequence into FP-deleted exon 5 was designed to determine whether the 35-kDa protein binds specifically to the purine-rich sequences. Somewhat surprisingly, the palindromic sequence did not cross-link specifically to any protein (lanes 3 and 4).
This may be because the palindrome has only one copy of the GGAA(G) sequence, which may not allow the protein to bind with high affinity in vitro. Interestingly, the G7A6 sequence (lanes 5 and 6), but not the C7T6 sequence (lanes 7 and 8), cross-linked strongly to a protein doublet. This doublet is larger than the 35-kDa protein that cross-linked to the FP sequence (lanes 1 and 2) and may or may not be the same protein(s) (see "Discussion").

Weak 5'-Splice Site Is Required for bGH Alternative Intron Retention-We previously reported (15) that the FP sequence of exon 5 is required for bGH intron D splicing, but not for splicing of another constitutively spliced intron. Deletion of the FP sequence, which results in a marked diminution of bGH intron D splicing, has no effect on splicing when bGH exon 5 is placed downstream of heterologous bPRL intron D (15). This suggests that a specific component(s) in bGH intron D necessitates the presence of the FP element in exon 5. To identify this component, the 5'- and 3'-portions of bGH intron D were replaced with corresponding regions of bPRL intron D, and splicing was examined in the presence and absence of the downstream FP sequence (Fig. 4). The goal was to define a small region of bPRL that could replace a component in bGH intron D and cause splicing of intron D to become independent of the FP sequence.
Pre-mRNA in which the 3'-portion of bGH intron D was substituted with the corresponding bPRL sequences (including both the branch point and splice acceptor site sequences) still required the presence of the downstream FP sequence for splicing (Fig. 4A, lanes 3 and 4). In contrast, when the 5'-portion of bGH intron D was replaced with bPRL intron D sequences, splicing no longer required the FP element (lanes 5 and 6).
Inspection of the 5'-region of bGH intron D revealed that the 5'-splice site deviates from the consensus sequence and that the region contains an intriguing 21-nucleotide palindromic sequence, located 49 nucleotides downstream of the 5'-splice site, that is capable of forming a perfect 9-base pair stem-loop structure. The stem loop does not appear to be involved in splicing of bGH intron D because disruption of this putative structure does not result in splicing that is independent of the FP sequence (Fig. 4B, lanes 3 and 4). In addition, inclusion of the stem loop in the 5'-bGH/bPRL intron D chimera did not result in restoration of dependence on the FP element (lanes 1 and 2).
Mutation of the splice donor site to the perfect consensus sequence, however, resulted in splicing irrespective of the presence of the FP sequence. The splice donor sequence of bGH intron D (CGG/GUGGGG) matches the mammalian consensus sequence ((C/A)AG/GU(A/G)AGU) (24) at only 6 out of 9 positions, while the bPRL donor site (CAG/GUGAGC) matches the consensus site at 8 out of 9 nucleotides. We reasoned that bGH intron D may possess a suboptimal donor site, lowering the efficiency of splicing and thereby requiring the additional positive acting signals in the FP element. To test this hypothesis, three point mutations were introduced into the bGH splice donor site, making it identical to the consensus sequence. This mutation resulted in constitutive splicing of bGH intron D, independent of the presence of the FP sequence (Fig. 4B, lanes 5 and 6).
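The match counts quoted above can be reproduced with a simple position-by-position comparison against the degenerate consensus. The sketch below is illustrative only: the helper names are hypothetical, and the plain match count is a simplification assumed for this example rather than a scoring method taken from the original work.

```python
# Illustrative sketch: helper names and the simple match-count score are
# assumptions for this example, not the authors' method.

# Mammalian 5'-splice site consensus (C/A)AG/GU(A/G)AGU, one entry per position;
# each entry lists the bases allowed at that position.
CONSENSUS = ["CA", "A", "G", "G", "U", "AG", "A", "G", "U"]


def mismatch_positions(site: str) -> list[int]:
    """Return 1-based positions where the 9-nt site deviates from consensus."""
    site = site.upper().replace("/", "")
    if len(site) != len(CONSENSUS):
        raise ValueError("expected a 9-nucleotide splice donor sequence")
    return [i + 1 for i, (base, allowed) in enumerate(zip(site, CONSENSUS))
            if base not in allowed]


def consensus_matches(site: str) -> int:
    """Number of positions (out of 9) that agree with the consensus."""
    return len(CONSENSUS) - len(mismatch_positions(site))


if __name__ == "__main__":
    for name, site in [("bGH intron D", "CGG/GUGGGG"),
                       ("bPRL intron D", "CAG/GUGAGC")]:
        print(f"{name}: {consensus_matches(site)}/9 matches, "
              f"mismatches at positions {mismatch_positions(site)}")
```

The three mismatched positions reported for the bGH donor site correspond to the three point mutations needed to convert it to the consensus sequence.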

DISCUSSION
There is a growing body of evidence to suggest that sequences other than those previously defined at the splice donor, lariat branch point, polypyrimidine tract, and splice acceptor sites play a critical role in the selection of splice sites (reviewed in Refs. 5 and 22). Most of these auxiliary sequences are found within exons, and with few exceptions, the exact nature and role of these sequences in splicing remain unclear. In this report, we provide evidence that a purine-rich sequence within bGH exon 5, even as short as GGAAG, is capable of stimulating intron D splicing. Repetition of this simple sequence restores splicing to near wild-type levels. Furthermore, these results suggest that this GGAAG sequence is part of an "exonic splicing enhancer" (ESE) that functions by compensating for a suboptimal (i.e. weak) splice donor site in intron D in vivo. In this context, we define the ESE as being contained within the 115-nucleotide FP fragment and the GGAAG sequence as a core element of the ESE. Mutation of the splice donor site toward the consensus sequence completely eliminates dependence of intron D splicing on the presence of the ESE in exon 5.
Stimulation of bGH intron D splicing by the GGAAG sequence offers an explanation for several other results in this study. Disruption of the 10-base pair palindrome by insertion of a GGCC fragment (Fig. 1B, lane 4) did not alter the GGAAG sequence and therefore did not affect splicing. Mutation of the palindrome away from the U1 consensus sequence inhibited splicing (Fig. 2, lane 4), perhaps because the GGAAGGA sequence was disrupted by insertion of 2 C residues. In contrast, the pre-mRNA containing the mutation of the palindrome toward the U1 consensus sequence was spliced more efficiently than pFPD RNA (Fig. 2, lane 5), presumably due to the fortuitous introduction of a 7-nucleotide GGAAG-like sequence in this mutant. We cannot explain why multiple copies of the complete 10-base pair palindrome did not improve splicing over the stimulation observed with a single copy of this sequence (Fig. 1B), whereas repetition of the GGAAG portion of the palindrome improved splicing over a single GGAAG sequence (Table I). One possibility is that repetition of the entire palindrome also included the pyrimidine-rich portion, which may be inhibitory. Alternatively, two of the three copies of the palindrome may base-pair with one another to form a strong secondary structure, which prevents stimulation of splicing above that observed with a single copy.
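One rough way to see the purine-rich stretches referred to above is to extract the longest uninterrupted run of purines (A or G) from each inserted sequence. The following Python sketch does this for the sequences quoted in the text. The function name is hypothetical, and the longest-run measure is an assumption made for illustration; it is not a predictor of splicing efficiency and was not part of the original analysis.

```python
# Illustrative sketch: the longest-purine-run measure is an assumption chosen
# for this example, not an analysis performed in the original study.
PURINES = set("AG")


def longest_purine_run(seq: str) -> str:
    """Return the first longest uninterrupted stretch of A/G in the sequence."""
    best = current = ""
    for base in seq.upper():
        if base in PURINES:
            current += base
            if len(current) > len(best):
                best = current
        else:
            current = ""  # a pyrimidine interrupts the run
    return best


if __name__ == "__main__":
    inserts = {
        "palindrome in context": "CTTCCGGAAGGA",
        "mutated away from U1 consensus": "CCGCAAGCA",
        "mutated toward U1 consensus": "CAGGAAAGT",
        "pyrimidine-rich half": "CTTCC",
    }
    for name, seq in inserts.items():
        run = longest_purine_run(seq)
        print(f"{name}: {seq} longest purine run={run!r} ({len(run)} nt)")
```

In this simple view, the mutation away from the U1 consensus breaks the 7-nucleotide purine stretch of the wild-type context into runs of at most 3 nucleotides, whereas the mutation toward the consensus still contains a 7-nucleotide purine run.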
Involvement of purine-rich exon sequences has been suggested in efficient and/or alternative splicing of other pre-mRNAs (22, 25). In these reports, it is argued that the purine-rich sequences are required for the recognition and/or selection of a weak 3'-splice site (22, 25) and do not compensate for a weak downstream 5'-splice site in exon definition (25). However, splicing of bGH intron D is unique in that the influence of the ESE appears to involve the upstream splice donor site. The results presented here suggest a mechanism of action for an ESE sequence in which a protein enhances spliceosome complex formation by binding to the ESE and interacting in some fashion with the 5'-splice site rather than exclusively with the 3'-splice site, as appears to be the case in other systems (22, 25). Formation of a spliceosome complex is believed to commit an intron to the splicing pathway and thereby prevent transport of intron-containing mRNAs to the cytoplasm (1, 6-8). Attenuation of 5'- and/or 3'-splice sites may provide a mechanism whereby assembly of early spliceosome commitment complexes (13) is rendered inefficient, allowing transport of intron-containing RNAs to the cytoplasm. An enhancer sequence that increases the efficiency of complex formation could provide an effective means of modulating this process. The dependence of spliceosome complex formation in vitro (17) on the presence of the FP element containing an ESE sequence is consistent with this hypothesis.
Presumably, the ESE sequence functions through binding of a trans-acting factor. We previously demonstrated that a 35-kDa protein(s) specifically cross-links to the FP fragment, which contains an ESE and is required for efficient in vitro splicing of intron D (17). In the present study, we show that cross-linking of a protein doublet is dependent on the GGAA(G) repeat. Furthermore, the cross-linking of this doublet is greatly diminished in S100 fractions (data not shown), indicative of serine/arginine-rich splicing factors. This protein doublet appears to be larger than the 35-kDa protein that cross-links to the FP fragment (Fig. 3B). Two possible explanations may account for this increased size. The 35-kDa factor and the doublet protein(s) may be the same protein, but with altered gel mobilities due to varying lengths of RNA cross-linked to the protein following RNase digestion. If the UV-cross-linked material is treated with RNase A alone (data not shown), the apparent difference in mobility between the 35-kDa protein and the protein doublet is even greater (compared to digestion with RNases A and T1; Fig. 3B), suggesting that the RNase A-resistant, purine-rich sequence in E5/G7A6 RNA is causing the protein doublet to migrate more slowly. Alternatively, different proteins could bind to the FP element and the GGAA(G) repeat. The exact nature of these proteins awaits future characterization. In either case, these results are consistent with a model in which a protein factor(s) binds to the ESE sequence in exon 5 and compensates for the weak bGH 5'-splice site in spliceosome complex formation.
In conclusion, we demonstrate here that the alternative splicing (i.e. intron retention) of bGH intron D results from a balanced interplay between a weak 5'-splice donor site and a downstream exonic splicing enhancer sequence. A GGAA(G) motif, which represents the core of this ESE, can by itself substantially enhance intron D splicing. Multiple copies of this short motif are probably necessary to provide the full ESE effect. A similar balanced interplay between a weak splice site and an ESE may be generally employed to control other alternative pre-mRNA splicing events.