Recognition of Exon-Intron Boundaries by the Halobacterium volcanii tRNA Intron Endonuclease

The intron-containing tRNATrp precursor from Halobacterium volcanii, like many intron-containing archaebacterial precursor tRNAs, can assume a structure in which the two intron endonuclease cleavage sites are localized in two three-nucleotide loops separated by four base pairs. To investigate the role of this structure in cleavage by the halophilic endonuclease, a series of mutant tRNATrp RNAs were prepared and evaluated as substrates. We find that alterations in this structure result in the loss of cleavage at both 5' and 3' sites. Cleavage of a 35-nucleotide model RNA substrate, containing only these features, demonstrates that sequences and structures present at the exon-intron boundaries are sufficient for recognition and cleavage. We have also examined the mechanism used by the halophilic endonuclease to identify the cleavage sites. Addition of a single base, or a base pair in the anticodon stem above the cleavage sites, does not affect the cleavage site selection. The addition of nucleotides between the two cleavage sites significantly decreases cleavage efficiency and has an effect on the cleavage site selection. These results demonstrate that the halophilic endonuclease requires a defined structure at the exon-intron boundaries and does not identify its cleavage sites by a measurement mechanism like that employed by eukaryotic tRNA intron endonucleases.

In eukaryotes, introns have been shown to occur in all gene classes, mRNA, rRNA, and tRNA (for reviews see . Some evidence suggests that the nuclear mRNA and the organellar group I and II introns are related (2). However, the relationship of the nuclear-encoded tRNA introns to the other classes remains an enigma. The recent descriptions of interrupted tRNA genes in the genomes of the archaebacteria extend the range of organisms that have interrupted tRNA genes and raise the question of whether these introns are yet another class of tRNA introns.
Interrupted tRNA genes have been reported in the genomes of Halobacterium volcanii, Halobacterium mediterranei, and Halobacterium cutirubrum (5)(6)(7), Sulfolobus solfataricus (8,9), Thermoproteus tenax (10), Desulfurococcus mobilis (11), and Thermofilum pendens (12). The complexity of these introns is similar to that of the eukaryotic nuclear tRNA introns, which range in size from 14 to 60 nucleotides (3,4). Unlike nuclear-encoded tRNA introns, which are always located two nucleotides 3' to the anticodon, between positions 37 and 38 of the mature tRNA, the archaebacterial tRNA introns are not restricted to a single location. Some introns are located at this position, but others have been found in the anticodon stem (10), the anticodon itself (10), and in the extra arm (12). Variability in intron location is not unique to the archaebacterial tRNA genes. Chloroplast tRNA introns are found at several locations (13). However, these introns are not related to the nuclear-encoded tRNA introns; they are similar to the group I and II introns (13)(14)(15).
The conservation of tRNA intron location plays a central role in intron removal by the eukaryotic endonuclease. The processing of intron-containing pretRNAs in the nucleus takes place in two separate protein-catalyzed reactions: endonucleolytic cleavage at the exon-intron boundaries, releasing the intron, followed by exon ligation (16). In vitro and in vivo studies have shown that the endonuclease requires mature tRNA tertiary structure in the precursor and interacts with the highly conserved U8 and C56 residues in the mature region of the precursor (17-24). Cleavage sites are identified by a measurement mechanism which senses the distance from the center of the molecule, possibly the top of the anticodon stem, to the 5' and 3' cleavage sites (21-23).
Since only some of the archaebacterial introns are located in the same position as the eukaryotic tRNA introns, these organisms must use a different mechanism, and possibly a different endonuclease, to identify their cleavage sites.
We have previously shown that the H. volcanii tRNA endonuclease does not require the presence of mature tertiary structure in the intron-containing precursor or a complete intron; only those sequences and structures present at the exon-intron boundaries are required for recognition and cleavage (25). Comparative sequence analysis of archaebacterial intron-containing pretRNAs revealed that the cleavage sites were often located in two three-nucleotide bulge loops separated by four base pairs, suggesting that this structure could function as a recognition element (26).
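The proposed recognition element (two three-nucleotide bulge loops on opposite strands, separated by four base pairs) can be stated as a simple structural test. The following is a minimal sketch, assuming the secondary structure is supplied in dot-bracket notation; the function names and the toy structures are hypothetical illustrations and are not the tRNATrp precursor structure or any analysis performed in this work.

```python
def pair_table(db):
    """Map each paired position to its partner for a dot-bracket string."""
    stack, pairs = [], {}
    for idx, ch in enumerate(db):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            partner = stack.pop()
            pairs[partner] = idx
            pairs[idx] = partner
        elif ch != ".":
            raise ValueError("unexpected character: " + ch)
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def find_two_bulge_motif(db, loop=3, helix=4):
    """Return (5' loop positions, 3' loop positions) for every place where a
    `loop`-nt bulge on the 5' strand and a `loop`-nt bulge on the 3' strand
    flank a helix of exactly `helix` consecutive base pairs."""
    pairs = pair_table(db)
    hits = []
    for i, j in sorted(pairs.items()):
        if i > j:
            continue  # visit each base pair once, from its 5' member
        five_loop = range(i + 1, i + 1 + loop)
        if any(k in pairs for k in five_loop):
            continue  # the 5' loop must be unpaired
        start5 = i + 1 + loop
        # The helix must stack directly on pair (i, j) along the 3' strand,
        # which makes the 5' loop a bulge rather than an internal loop.
        if not all(pairs.get(start5 + k) == j - 1 - k for k in range(helix)):
            continue
        last3 = j - helix  # 3' partner of the last helix pair
        three_loop = range(last3 - loop, last3)
        if any(k in pairs for k in three_loop):
            continue  # the 3' loop must be unpaired
        # The next pair must follow the helix directly on the 5' strand.
        if pairs.get(start5 + helix) != last3 - loop - 1:
            continue
        hits.append((list(five_loop), list(three_loop)))
    return hits


# Hypothetical toy structures (0-based positions), for illustration only.
toy_motif = "(((" + "..." + "((((" + "(((" + "...." + ")))" + "..." + "))))" + ")))"
toy_no_3prime_loop = "(((" + "..." + "((((" + "(((" + "...." + ")))" + "))))" + ")))"

motif_hits = find_two_bulge_motif(toy_motif)
no_loop_hits = find_two_bulge_motif(toy_no_3prime_loop)
```

In this sketch the second toy structure, in which the 3' loop is absent and the helix is extended, no longer matches the motif. This is only a structural bookkeeping check; it says nothing about whether a given RNA is actually cleaved.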
In this report we describe experiments in which mutations introduced into a synthetic H. mediterranei intron-containing tRNATrp gene were used to analyze the role of this proposed structure in cleavage by the halophilic tRNA intron endonuclease. Results from these studies establish that the halophilic enzyme requires the presence of bulge loops at the cleavage sites and that identification of these sites does not involve a mechanism which measures the distance from the top of the anticodon stem to the 5' and 3' cleavage sites.
The tRNATrp gene has a relatively large intron, 104 nucleotides, so we have used a two-step procedure to reconstruct this gene. First, a smaller version of this gene, containing sequences for both exons and 22 nucleotides of the native intron, was assembled from nine overlapping oligonucleotides (Fig. 1 and Table 1). In addition to the gene sequences, this construction contained a T7 RNA polymerase promoter located immediately upstream of exon 1, a BstNI restriction site at the 3' end of exon 2, and single-stranded termini corresponding to BamHI overhang sequences for use in cloning. These annealed oligonucleotides were cloned into the pUC18 plasmid to give the gene 0167. To complete the tRNATrp gene, an 82-base pair HinPI fragment, which was absent in the 0167 construction, was reintroduced into the HinPI restriction site of 0167, yielding the complete tRNATrp gene, 016 (Fig. 1). Transcripts generated from the 016 and 0167 genes, following cleavage with the restriction enzyme BstNI, should begin at the first nucleotide of exon 1 and terminate at their 3' ends with CCA residues. Transcripts from these genes migrated as single bands in denaturing gels, although we have noted some heterogeneity at the 5' and 3' termini.
Portions of this paper (including "Materials and Methods," Table 1,
Both 0167 and 016 RNAs were substrates for the halophilic endonuclease; both produced products corresponding in size to excised intron and exons 1 and 2 (Fig. 2). No evidence of ligation, in the presence or absence of ATP, was observed. End group analysis of 5' end-labeled intron and exon 2 products, produced by post-cleavage labeling, demonstrated that both substrates were cleaved accurately (0167, data not shown; see Fig. 7 for 016). The cleavage of 0167 was less efficient than that of 016, and this difference was reflected in the apparent Km values for these RNAs, 85-115 and 25-40 nM for 0167 and 016, respectively. A small amount of an RNA which migrated as a two-thirds sized product was detected in some reactions. However, this RNA did not undergo further cleavage and appeared to be a dead-end product.

Mutations in the Loop Structure-In previous studies we established that the mature tRNA tertiary structure and complete intron of the pretRNATrp molecule are not required by the halobacterial endonuclease for cleavage (25). It appeared that sequences or structures present at the exon-intron boundaries were the primary sites for recognition. A comparison of the two halobacterial tRNA intron-containing genes, tRNATrp and tRNAMet, revealed that the cleavage sites for these RNAs were localized in two three-nucleotide bulge loops separated by four base pairs (5,7). The presence of a similar structure in other archaebacterial intron-containing tRNAs (26) suggested that this structure may play a role in substrate recognition. To test whether this structure was required, the intron sequence 5'-GGAG-3', which participates in the four base pairs between the two loops, was removed from 016 (Fig. 3). This mutation leaves the sequences at both cleavage sites intact, but places them in an asymmetric internal loop. If recognition and cleavage site identification were dependent only on sequence or structures above and below the cleavage sites, 016-ΔGGAG RNA would be expected to remain an active substrate. When assayed, this RNA was not cleaved by the endonuclease (Fig. 3), indicating a requirement for structure at the exon-intron boundaries.
Fig. 3. Analysis of structural alterations at the exon-intron boundary junctions of 016 RNA. Top panels are secondary structures of exon-intron boundary junctions of substrate RNAs generated from the wild-type and mutated 016 genes. Boldface type indicates nucleotide insertions, and arrows indicate predicted cleavage sites for the endonuclease. Products from the cleavage assays for these RNA substrates are shown in the panels below. Identities of RNA products are indicated. The - and + designations indicate absence or presence of the halophilic endonuclease in the assay.
To determine whether both cleavage sites must be present in loop structures, and whether the 5' and 3' sites are recognized independently, three mutant genes were constructed. In the first gene, the 3' cleavage site loop sequence (5'-AUA-3') was deleted, extending the anticodon helix and leaving the 5' cleavage site in its original bulge loop structure (Fig. 3). When this RNA was incubated with endonuclease, no cleavage products were observed (Fig. 3). A second mutation was constructed in which three nucleotides complementary to the 5' cleavage loop sequence (5'-UCU-3') were added to the intron opposite the loop (Fig. 3). Transcripts from this gene retained the 3' cleavage site structure and sequence; however, the 5' cleavage sequence, while present, was in a base-paired region. Like the 3' cleavage site deletion mutation, this RNA was not cleaved by the endonuclease (Fig. 3). From these mutants it appeared that both cleavage sites must be present in loop structures before the enzyme will cleave at either site. To determine whether bulge loops were specifically required, an additional mutant was constructed. In this mutant, three nucleotides identical to the 5' cleavage loop (5'-AGA-3') were added to the intron in the region adjacent to the 5' cleavage loop, placing the 5' cleavage site in an internal loop. As observed for the other mutations, this RNA was not cleaved by the endonuclease (Fig. 3). Therefore, productive interaction between the endonuclease and the substrate requires that both cleavage sites be located in bulge loops.
Based on these findings a model gene was constructed which would produce a 35-nucleotide RNA transcript containing the sequences and structures at the exon-intron boundaries of the tRNATrp precursor (Fig. 4). In the presence of endonuclease, three products corresponding in size to exon 1 + intron, intron, and exon 1 were produced (Fig. 4). The exon 2-sized RNA (seven nucleotides) was detected but could not be recovered stoichiometrically from the reaction mixtures. The presumed intron band produced in the reaction was 5' end-labeled with polynucleotide kinase and [γ-32P]ATP and subjected to RNA sequence analysis (Fig. 5). The sequence of this product revealed that cleavage took place at the expected site for the exon 1-intron boundary. The 3'-terminal nucleotide of the intron could not be determined from this analysis, but the length of the sequence was consistent with correct cleavage at the intron-exon 2 boundary. As observed for 0167, a partial cleavage product corresponding in size to cleavage at the intron-exon 2 boundary was also present. We have not yet determined whether this RNA is the product of accurate cleavage at the intron-exon 2 boundary; its stability suggests that it may not undergo further cleavage.
Identification of Cleavage Sites-The yeast and vertebrate endonucleases use a measurement mechanism which essentially counts down the anticodon stem to the 5' and 3' cleavage sites (21-23). While the halobacterial enzyme requires that the 5' and 3' cleavage sites be present in bulge loop structures, it was still possible that a measurement mechanism like that of the eukaryotic enzyme was utilized. If such a mechanism were used, the addition of a base pair to the anticodon stem would shift the cleavage sites, resulting in an intron one nucleotide larger at each terminus. Addition of a C-G pair to the anticodon stem (C29:1-G146:1) had only a minor effect on the cleavage efficiency (Fig. 6). Analysis of the 5'-terminal nucleotides of the end-labeled intron and exon 2 products revealed that each began with an A residue, demonstrating that the cleavage took place at the normal positions (Fig. 7). In addition, we noted that the single C addition to the anticodon stem of the 5' exon (C29:1), used in the preparation of the base pair mutant, had little effect on cleavage efficiency (Fig. 6) and no effect on cleavage accuracy (Fig. 7). The sequences or structures involved in localizing the cleavage sites must therefore lie below this point in the anticodon stem.
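The logic of the anticodon stem insertion experiment can be illustrated with a small sketch that contrasts the two predictions: a ruler that counts a fixed distance from the top of the anticodon stem, and sites that simply travel with the local structure at the exon-intron boundaries. The sequence, coordinates, and names below are hypothetical toy values chosen for illustration; they are not the 016 sequence or its numbering.

```python
from dataclasses import dataclass


@dataclass
class Precursor:
    seq: str
    stem_top_5: int  # index of the 5' nucleotide at the top of the anticodon stem
    stem_top_3: int  # index of its 3' pairing partner
    cut5: int        # index of the first intron nucleotide (5' cleavage site)
    cut3: int        # index of the first exon 2 nucleotide (3' cleavage site)


def add_stem_pair(p, pos5, pos3, pair="CG"):
    """Insert one base pair into the stem above both cleavage sites.
    pair[0] goes before index pos5 (5' strand), pair[1] before pos3 (3' strand)."""
    assert p.stem_top_5 < pos5 <= p.cut5 and p.cut3 < pos3 <= p.stem_top_3
    seq = p.seq[:pos5] + pair[0] + p.seq[pos5:pos3] + pair[1] + p.seq[pos3:]
    # Both boundary sites move one position 3'; the 3' stem top moves two.
    return Precursor(seq, p.stem_top_5, p.stem_top_3 + 2, p.cut5 + 1, p.cut3 + 1)


def ruler_sites(wild_type, mutant):
    """Sites predicted if the enzyme counts fixed distances from the stem top."""
    d5 = wild_type.cut5 - wild_type.stem_top_5
    d3 = wild_type.stem_top_3 - wild_type.cut3
    return mutant.stem_top_5 + d5, mutant.stem_top_3 - d3


def structure_sites(mutant):
    """Sites predicted if recognition follows the boundary structure itself."""
    return mutant.cut5, mutant.cut3


# Hypothetical toy precursor: exon 1 + intron + exon 2.
exon1, intron, exon2 = "GGCUCCU", "AGACCGGAG", "AUAGGAGCC"
wt = Precursor(exon1 + intron + exon2, stem_top_5=1, stem_top_3=23,
               cut5=len(exon1), cut3=len(exon1) + len(intron))
mut = add_stem_pair(wt, pos5=3, pos3=22)

r5, r3 = ruler_sites(wt, mut)
s5, s3 = structure_sites(mut)
ruler_intron = mut.seq[r5:r3]
structure_intron = mut.seq[s5:s3]
```

Under the ruler assumption the toy intron gains one nucleotide at each terminus, whereas under the structure-following assumption it is unchanged. The end group analysis described above is the experimental readout that distinguishes these two outcomes.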
Since both cleavage sites were required to be in bulge loops, it was possible that the enzyme used a measurement mechanism that sensed the distance between the two loops. To examine this, a C-G pair, C36:1-G136:1, was added to the 5' end of the helix which separates the two cleavage site loops. If this type of measurement mechanism were used, cleavage of this RNA would yield an intron with a 5'-terminal G residue and an exon 2 with a 5'-terminal U residue. When assayed, this mutant RNA was cleaved with decreased efficiency compared with 016 RNA (Fig. 6). Analysis of the 5'-terminal residues of end-labeled intron and exon 2 products indicated that no shift in the cleavage sites had occurred; both terminated with A residues (Fig. 7). An additional mutant was constructed in which a C residue was inserted after position 36 of exon 1; this led to the formation of a four-nucleotide 5' cleavage loop, 5'-CAGA-3'. Cleavage of this RNA was also reduced and less specific (Fig. 6). End group analysis of the 5' end-labeled exon 2 product demonstrated that cleavage occurred at the predicted site for the intron-exon 2 boundary (Fig. 7). Analysis of the intron produced indicated that the majority of cleavage took place at the correct position. However, a small amount of intron terminating with a G residue was also observed (Fig. 7). It appears that the endonuclease senses the distance between the two cleavage sites. However, the ability of the enzyme to find the correct cleavage sites in these mutants suggests that specific nucleotide binding may also be involved in cleavage site identification.
Cleavage of 016 RNA by Yeast Endonuclease-A requirement by the halophilic endonuclease for a defined structure at the exon-intron boundaries was consistent with our earlier observation that this enzyme would not cleave the yeast pretRNAPhe (25). Despite this, it was possible that pretRNATrp RNA would be a substrate for the yeast endonuclease since it had mature tRNA structure and its intron was in the conserved location, two nucleotides 3' to the anticodon. When a crude extract from yeast was examined for its ability to cleave yeast and halophilic precursors, it was found to efficiently cleave and ligate the yeast tRNAPhe precursor but was unable to cut the halophilic precursor 016 (Fig. 8). It is unlikely that this inability to cleave the halophilic precursor is due to the large size of this intron, since large inserts have been introduced into eukaryotic precursors without loss of cleavage activity (35). In addition, the substrate 0167, which has an intron similar in size to that of the typical yeast precursor (22 nucleotides), was an inefficient substrate for the yeast enzyme; less than 10% of the RNA was converted to products (data not shown). Since the yeast enzyme acts on most tRNA precursors with no apparent sequence requirements, the exception being a preference for a purine residue at the 3' end of exon 1 (22), which is also present in the 016 and 0167 RNAs, it appears that structural features of the halophilic precursor prevent cleavage by this enzyme.

DISCUSSION
Based on RNA folding predictions and preliminary structure probing, the intron cleavage sites of the H. volcanii intron-containing pretRNATrp RNA were predicted to be present in two three-nucleotide loops separated by four base pairs. The conservation of this structure in the two characterized halophilic intron-containing precursors (5,7), and the presence of related structures at the exon-intron boundaries of other archaebacterial interrupted genes, suggested that this structure may be required for cleavage by the intron endonuclease. To test this possibility, we examined the cleavage potential of four mutant RNAs, derived from a synthetic H. mediterranei tRNATrp gene, which lacked the ability to form the bulge loop structures. Cleavage was eliminated when the four nucleotides of the intron which participate in the base pairing between the loops were removed (ΔGGAG), when the 3' cleavage site loop was deleted (ΔAUA), and when the 5' cleavage loop was changed to a base-paired structure (∇UCU) or to an internal loop structure (∇AGA) (Fig. 3). These data indicate that the halophilic endonuclease requires a structural element defined by the presence of two bulge loops; sequence alone or sequence presented in an internal loop is not sufficient. We cannot exclude the possibility that these mutations have resulted in an alternative secondary structure or tertiary folding patterns other than the proposed bulge loop structure. However, preliminary structure probing experiments with the ΔGGAG intron deletion mutant RNA reveal the appearance of a new S1 cleavage site at the endonuclease cleavage site, which is consistent with the proposed structure. Further support for this structure comes from the observation that the 35-nucleotide Trp-Model substrate, which has far fewer potential structures, is also a substrate for the endonuclease.
Although the recognition of bulge loops by proteins has been suggested for the interaction of ribosomal proteins with rRNAs, only recently has a defined interaction between a protein and a bulge nucleotide been described (36). The precise structure of bulge loops in RNA helices is not fully understood.
Studies on bulge loop structures in DNA helices have shown that these structures cause sharp bends or kinks in the molecule, and the magnitude of these changes is dependent on the size of the bulge loop (37)(38)(39). Studies with model RNAs indicate that bulge loops in RNA helices also result in kink formation (40), and that the stability of the resulting structure may be dependent on the sequence of the bulge loop and its surrounding base pairs (36,41). By analogy, the two-bulge loop structure of the 016 RNA might be expected to cause two kinks in the RNA, forming a zigzag structure. If such a structure was required to properly align the cleavage sites on the enzyme, then those mutations which changed one or both of the loops would affect this structure and possibly the contacts with the active site.
The strict requirement by the halophilic endonuclease for a defined structure at the intron-exon boundaries is in sharp contrast to the properties of the eukaryotic nuclear endonucleases. All nuclear-encoded tRNA introns are localized in the anticodon loop two bases 3' to the anticodon (3,4), and in general, they play a passive role in substrate recognition and cleavage. In many precursors, but not all, intron sequences base pair with the anticodon loop, extending the anticodon helix. This interaction is not essential since mutations which eliminate these interactions do not block processing by the yeast endonuclease (22,42). However, some structures at the exon-intron boundaries have a major effect on cleavage; these changes usually limit or block processing. For example, base pairing of the 3' cleavage site of yeast SUP32 resulted in a loss of cleavage at both sites by the yeast endonuclease (43), and removal of the 3' cleavage site of yeast tRNAPhe (23), or changes in the base pairing between the cleavage sites of yeast tRNACI'*"' (42), decreased cleavage by the Xenopus endonuclease. Differences have also been noted in the overall efficiency with which an endonuclease acts on heterologous substrates; some Schizosaccharomyces pombe pretRNAs are poor substrates for the Saccharomyces cerevisiae endonuclease (44,45), and the wheat germ nuclear endonuclease is incapable of cleaving S. cerevisiae, X. laevis, and human pretRNAs (46). The observation that the yeast endonuclease does not cleave the halophilic 016 precursor is further evidence that structure at the exon-intron boundaries plays some role in cleavage and that the endonucleases from various organisms may have different structural requirements. Just as the halophilic endonuclease differs from the eukaryotic endonuclease in having a structural requirement at the exon-intron boundaries, it also differs in the mechanism it uses to identify the cleavage sites.
The conserved location of introns in eukaryotic nuclear tRNA precursors ensures that the cleavage sites in each molecule, regardless of sequence, will be in a similar position. It is this conservation which is the basis for cleavage site identification for the yeast and X. laevis endonucleases (21-23). Using base pair insertions into the anticodon stem, it was established that these enzymes identify cleavage sites by a measurement mechanism which senses the distance from the top of the anticodon stem to the 5' and 3' cleavage sites (22,23). When a base pair was inserted into the anticodon stem of the 016 gene, there was only a minor effect on the cleavage efficiency and no effect on the cleavage accuracy. Therefore, measurement from the top of the anticodon stem is not a criterion for cleavage site identification by the halophilic enzyme. When the two cleavage sites are present in the proposed two-loop structure, both cleavage sites are located the same distance (four nucleotides) from the center of the helix which separates them. If this distance were used for identification, a base pair insertion might be expected to shift the cleavage by one nucleotide at each site. When such a mutant was examined, a decrease in cleavage was observed, but the cleavage which took place was in the correct position. A decrease in cleavage efficiency was also observed when one nucleotide was added to the 5' loop, increasing the distance from the center of the helix to the 5' cleavage site. In this case the major product was cleaved accurately; however, some slippage had occurred since a minor amount of intron which terminated in G was observed. It appears that the distance between the cleavage sites plays an important role in recognition, but the ability of the endonuclease to locate the correct sites in these mutants suggests that additional factors, such as nucleotide recognition, may also be involved.
Further mutations in the loops and the base pairs between the loops will be needed to determine the role of individual nucleotides in substrate alignment.
The relationship between the archaebacterial tRNA introns and their eukaryotic counterparts is an interesting question. Chloroplast and nuclear tRNA introns are readily distinguished from each other by their complexity and processing mechanisms. The archaebacterial tRNA introns appear to share some properties with both groups. Their relatively small size is similar to that of the nuclear-encoded tRNA introns; however, their variability in location is more like that observed for the chloroplast introns. Since archaebacterial tRNA introns are found in various locations, their processing enzymes, by necessity, must use a mechanism different from that used by the nuclear endonuclease to identify the cleavage sites. The requirement for a defined structure at the exon-intron boundaries would provide the flexibility needed to identify introns at different locations, provided the conserved structure was present. In support of a conserved intron endonuclease in the archaebacteria, we have found that other archaebacteria have endonuclease activity capable of processing the tRNATrp precursor. Variability in intron location also suggests that the introns in the archaebacterial tRNA genes were not derived from a single ancestral intron-containing tRNA species; more likely these introns were inserted, possibly as self-cleaving mobile introns, into preexisting tRNA genes. It has been proposed that tRNA introns were originally capable of self-cleavage and that modern tRNA introns are the result of independent solutions to the transition from self-cleaving introns to protein-assisted or protein-mediated processing (47). The properties of the halophilic endonuclease, and the structural properties of archaebacterial tRNA introns, support this proposal.
Further studies on the archaebacterial endonuclease proteins may reveal the origin of the protein components of these processes.