The RNA component of the Bacillus subtilis RNase P. Sequence, activity, and partial secondary structure.

The gene defining the catalytic RNA component of RNase P in Bacillus subtilis 168 was cloned into bacteriophage lambda and plasmid vectors. The nucleotide sequence of the gene and its surroundings was determined from the cloned DNA and by directly sequencing or reverse transcribing the RNase P RNA. The B. subtilis RNase P RNA sequence (400-401 nucleotides) is remarkably different from that of Escherichia coli (377 nucleotides) (Reed, R. E., Baer, M. F., Guerrier-Takada, C., Donis-Keller, H., and Altman, S. (1982) Cell 30, 627-636; Sakamoto, H., Kimura, N., Nagawa, F., and Shimura, Y. (1983) Nucleic Acids Res. 11, 8237-8251). At best the two are less than 50% similar in sequence. To verify that the RNase P RNA gene was analyzed, a modified, putative gene was cloned adjacent to a bacteriophage T7 promoter and various transcripts were tested for RNase P activity. The intact gene transcript, but not fragments, showed full activity. Full catalytic activity was restored upon mixing the fragments. The extensive differences between the B. subtilis and E. coli RNase P RNAs precluded full covariance analysis of secondary structure, but phylogenetically consistent foldings for portions of both molecules could be derived.

Institutes of Health. DNA synthesis was partially funded by a grant from the Indiana Corp. for Science and Technology to the Institute for Molecular and Cellular Biology, Indiana University. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
i.e. elucidating its secondary and tertiary foldings. Altman and his colleagues (5, 6) have determined the nucleotide sequence of the Escherichia coli RNase P RNA (M1 RNA).
They proposed a possible secondary structure based on minimum energy calculations and chemical and enzymatic structure mapping data (7). However, there are credible alternatives to the proposed folding which also are consistent with the structure mapping data.
At this time, the best a priori method for determining the secondary structure of large RNAs is the phylogenetic comparative approach (8). That is, possible helices in an RNA, as indicated by complementary sequences, are tested by seeking the equivalent pairing in the homologous RNA from another organism in which the sequence varies. Helical regions are indicated by covariance in compared sequences; mutations compensate one another to maintain complementarity.
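The covariance test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the inclusion of G-U wobble pairs, the four categories, and the toy sequences are hypothetical and are not taken from the analysis in this paper.

```python
# Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(x, y):
    """True if nucleotides x and y can form a base pair."""
    return (x, y) in PAIRS


def covariation_support(seq_a, seq_b, stem):
    """Classify each proposed pair (i, j) of a helix in two aligned sequences.

    seq_a, seq_b: aligned RNA sequences of equal length.
    stem: list of (i, j) 0-based alignment columns proposed to pair.
    """
    counts = {"covarying": 0, "conserved": 0, "single_change": 0, "disrupted": 0}
    for i, j in stem:
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        if not (can_pair(*pair_a) and can_pair(*pair_b)):
            counts["disrupted"] += 1      # pairing lost in at least one sequence
        elif pair_a == pair_b:
            counts["conserved"] += 1      # no sequence change, so no test of the helix
        elif pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]:
            counts["covarying"] += 1      # both positions changed, pairing preserved
        else:
            counts["single_change"] += 1  # one position changed, e.g. G-C to G-U
    return counts


# Toy 4-base-pair hairpin in two hypothetical "organisms".
toy_a = "GCAGAAACUGC"
toy_b = "GGUGAAACACC"
toy_stem = [(0, 10), (1, 9), (2, 8), (3, 7)]
toy_result = covariation_support(toy_a, toy_b, toy_stem)
```

In this toy case two of the four pairs are conserved and two covary; only the covarying pairs count as positive evidence for the helix.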
The RNase P RNAs from E. coli and Bacillus subtilis are homologous in function; the RNA and protein subunits from each organism will complement those from the other in the low salt, holoenzyme reaction (2). Therefore, the structural elements involved in catalysis and RNase P protein interactions likely are similar in the two RNAs. In order to analyze the RNase P RNA structure by phylogenetic comparisons, we have cloned the B. subtilis 168 RNase P RNA gene and report here the determination of its nucleotide sequence. In addition, we describe the manipulation of the gene into an efficient in vitro expression vector and show that the gene product is enzymatically active.

DISCUSSION
The isolation of the RNase P RNA gene from B. subtilis proved unusually difficult. Not only was the genomic region containing the gene subject to deletion in the vectors and hosts employed, but also the low cellular abundance of the RNA rendered questionable the purity of hybridization probes. We estimate, based on recoveries of the RNA during purification, that only about 20-50 copies of the RNase P RNA are present in each B. subtilis cell. Additionally, the termini of the RNase P RNA are not good substrates for polynucleotide kinase and RNA ligase, which were used for isotopically labeling the RNA for hybridization experiments. Taken together, these facts meant that RNAs contaminating the RNase P RNA region of preparative gels, even at low levels, provided troublesome signals during Southern blot analyses and searches of genomic clone libraries. The inclusion of nonradioactive ribosomal RNA (rRNA) in hybridizations, to compete with labeled rRNA fragments contaminating the labeled RNase P RNA, was essential for analyses and clone searches. When enough sequence information for the RNase P RNA was gathered, the preparation of more specific probes, via primer extension reactions from synthetic oligonucleotides complementary to regions of the RNA, facilitated further analyses.
Portions of this paper (including "Materials and Methods," "Results," and Figs. 1 and 3-5) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass.
The RNase P RNA gene contains elements common to many bacterial transcriptional units. Examination of sequences adjacent to the 5' end of the mature domain of the RNase P RNA gene (Fig. 2) reveals a potential promoter region with a canonical -10 box, and a -35 box with two departures from the B. subtilis consensus sequence (26). There is also an oligo(A) tract near the 5' side of the -35 box, a feature that appears in other B. subtilis genes (26).
The ostensible RNase P RNA promoter is positioned properly for transcription initiation at the 5'-terminal G residue of the mature molecule.
The inverted repeats, which in an RNA transcript would be capable of forming a hairpin consisting of a 9-base pair stem and a 7-nucleotide loop. Such a structure would be similar to numerous ρ-independent transcription termination sites (27). We have no evidence that this structure serves to terminate transcription of the gene. If it does, however, then the mature 3' end of the RNase P RNA must be generated by some processing event, as is the case in E. coli (28).
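As an illustration of what a 9-base pair stem with a 7-nucleotide loop means at the sequence level, the following Python sketch scans an RNA string for perfect stem-loops of those dimensions. It is a simplification under stated assumptions: only Watson-Crick pairs are allowed (no G-U pairs or mismatches), the U-rich tract that usually follows such terminators is not checked, and the function name and test sequence are hypothetical rather than taken from the B. subtilis sequence.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def find_hairpins(rna, stem_len=9, loop_len=7):
    """Return (start, stem5, loop, stem3) for each perfect stem-loop found."""
    hits = []
    span = 2 * stem_len + loop_len
    for start in range(len(rna) - span + 1):
        stem5 = rna[start:start + stem_len]
        loop = rna[start + stem_len:start + stem_len + loop_len]
        stem3 = rna[start + stem_len + loop_len:start + span]
        # Every base of the 5' arm must pair with the 3' arm read backwards.
        if all(COMPLEMENT.get(a) == b for a, b in zip(stem5, reversed(stem3))):
            hits.append((start, stem5, loop, stem3))
    return hits


# Hypothetical transcript: 2 leading nt, a 9-bp/7-nt hairpin, then a U tract.
toy_rna = "AA" + "GGCGCAUCC" + "UUUUAAA" + "GGAUGCGCC" + "UUUU"
toy_hits = find_hairpins(toy_rna)
```

Relaxing the test to tolerate G-U pairs or a single mismatch would be a natural extension, at the cost of more chance hits.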
An 80-amino acid open reading frame (ORF*) begins 97 base pairs distal to the 3' end of the RNase P RNA gene. It is preceded by a short, G-rich region, corresponding in location to a Shine-Dalgarno sequence (29). There is, however, no apparent promoter between the presumptive transcription termination site described above and the beginning of this ORF. The possibility that an RNA defining this potential protein might be cotranscribed with RNase P RNA is intriguing.
A second ORF, starting 287 base pairs distal to the 3' end of the RNase P RNA mature domain, extends 53 potential codons to the end of the region sequenced. The lack of a compelling Shine-Dalgarno sequence, the overlap with the first ORF, and the lack of an upstream promoter make the significance of this ORF questionable.
* The abbreviation used is: ORF, open reading frame.
The sequences of the RNA components of RNase P from
We have aligned the B. subtilis and E. coli sequences on the basis of similar potential secondary structures in the terminal regions of the two molecules (below) and maximizing primary structural homology in the intervening regions (not shown). The overall sequence similarity (32) in this alignment is 43%, compared with 76% for an alignment of their 16 S rRNA sequences. There occur large regions in one sequence with no homolog in the other, at least in part owing to the significant length difference in the B. subtilis and E. coli RNase P RNAs.
If the effects of these regions on the homology are limited by counting only the first five consecutive alignment gaps, the net sequence similarity is still only 49%.
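The effect of capping long gaps on a similarity score can be illustrated with a short Python sketch. The exact scoring procedure behind the 43% and 49% figures (32) is not reproduced here; the function name, the rule of counting every gap column as a non-match, and the toy alignment are assumptions made for illustration only.

```python
def percent_similarity(aln_a, aln_b, max_gap_run=None):
    """Percent identical columns in a pairwise alignment ('-' marks a gap).

    Gap columns count as non-matches. If max_gap_run is given, only the first
    max_gap_run columns of each consecutive run of gaps are counted.
    """
    matches = 0
    counted = 0
    gap_run = 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" and y == "-":
            continue                      # column empty in both sequences: ignore
        if x == "-" or y == "-":
            gap_run += 1
            if max_gap_run is None or gap_run <= max_gap_run:
                counted += 1
            continue
        gap_run = 0                       # an aligned column ends the gap run
        counted += 1
        if x == y:
            matches += 1
    return 100.0 * matches / counted if counted else 0.0


# Toy alignment: 8 aligned columns (7 identical) and one 7-column gap.
toy_a = "ACGU-------ACGU"
toy_b = "ACGAUUUUUUUACGU"
uncapped = percent_similarity(toy_a, toy_b)               # 7 of 15 columns
capped = percent_similarity(toy_a, toy_b, max_gap_run=5)  # 7 of 13 columns
```

Capping the gap run raises the score only modestly, which mirrors the point made in the text: long unmatched regions alone do not account for the low similarity.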
The secondary structure similarities of the B. subtilis and E. coli RNase P RNA terminal regions mentioned above are illustrated in Fig. 7. Except for the pairing of the termini, these foldings are different from those proposed previously for the E. coli RNase P RNA (7). Each of the stems in the models shown in Fig. 7 includes pairs of sequence positions which differ in primary structure between the two species, but which covary so as to preserve their complementarity. These examples of nucleotide covariance suggest a biological importance of the sequence complementarities and hence provide evidence for the secondary structural elements which contain them. Unfortunately, the unexpectedly low homology of the B. subtilis and E. coli sequences has made the alignment of the sequences sufficiently uncertain to preclude extension of the B. subtilis structure model further at this time, without undue speculation. In this vein, we have not unambiguously located the B. subtilis homolog to the presumptive pairing between E. coli sequence positions 152-156 and 161-165, which is supported by a covariance relative to the Salmonella typhimurium sequence (33).
It has been pointed out that the E. coli RNase P RNA (M1 RNA) contains several short sequence repeats (5). The base composition of the B. subtilis RNase P RNA is 28:20:28:23 (per cent A:C:G:U), much more uniform than that of the E. coli sequence (23:27:35:15). An expected consequence of this pattern of nucleotide usage is a decrease in the number of random recurrences of short sequences (34). Indeed, there are far fewer directly repeated, short (6 to 20 nucleotide) sequences in the B. subtilis RNA than in that of E. coli (data not shown). It is not clear that random sequence similarities, owing to the base composition, completely explain the sequence repeats in the E. coli RNase P RNA, but their absence in B. subtilis suggests that they are not functionally important.
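The link between base composition and chance repeats can be made concrete with a small Python sketch. It assumes a simple model in which each position is drawn independently from the base composition; this model, the function names, and the use of 400 and 377 nucleotides as lengths are illustrative assumptions and not the treatment given in (34).

```python
from collections import Counter
from math import comb


def match_probability(composition):
    """Chance that two independently drawn bases are identical.

    composition: relative amounts of A, C, G, U (normalized internally).
    """
    total = sum(composition)
    return sum((c / total) ** 2 for c in composition)


def expected_repeat_pairs(length, k, composition):
    """Rough expected number of pairs of identical k-mers in a random sequence."""
    n_kmers = length - k + 1
    return comb(n_kmers, 2) * match_probability(composition) ** k


def repeated_kmers(seq, k):
    """Return the k-mers that occur more than once in seq, with their counts."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


# Base compositions quoted in the text (per cent A:C:G:U).
b_subtilis = [28, 20, 28, 23]
e_coli = [23, 27, 35, 15]

q_bsu = match_probability(b_subtilis)
q_eco = match_probability(e_coli)
pairs_bsu = expected_repeat_pairs(400, 6, b_subtilis)
pairs_eco = expected_repeat_pairs(377, 6, e_coli)
```

Under this model the more uniform composition gives a lower per-position match probability, and the difference is amplified as the repeat length k grows, because the probability enters as its k-th power.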
Finally, we point out that the experiment shown in Fig. 5, which reconstructs fully active RNase P RNA from fragments, suggests that the RNase P RNA tertiary structure, not a simple nucleotide sequence, is required for complete activity. Although the experiment does not prove the point, since the EcoRI site might interrupt the active site, it shows that the regions of the RNase P RNA required for activity can be sought by in vitro recombination of abbreviated fragments.
Preparation of RNase P RNA -- RNase P RNA from B. subtilis strain
RNA sequencing -- RNase P RNA, dephosphorylated by treatment with bacterial alkaline phosphatase, was labeled at its 5' terminus with [γ-32P]ATP and polynucleotide kinase or at its 3' terminus with [5'-32P] 3',5'-cytidine bisphosphate and RNA ligase (10). The labeled products were purified by electrophoresis on 8% polyacrylamide gels and sequenced by standard enzymatic methods (11).

analysis of the 5' terminus of the RNA was carried out using the same protocol, but omitting dideoxynucleotides.
probe -- Primer extension reactions were used to synthesize a probe, complementary to RNase P RNA, which was used for screening plasmid recombinant clones. Typically, the DNA primer (250 ng) was hybridized to the RNA template (1 μg) in a 5-μl reaction containing 50 mM KCl, 25 mM Tris-HCl, pH 8.5, at 90°C for 1 min, and allowed to cool slowly over 10 min to 25°C. The hybridized primer-template mixture was added to 300 μCi of dried
The 5' end of the RNase P RNA was seen to be homogeneous during RNA
identification of the 5' terminus by RNA sequencing was consistent
Labelling of the molecule at its 3' end, using [5'-32P] 3',5'-cytidine bisphosphate, yielded two RNA populations, differing in length by one nucleotide. This was revealed as doublet bands in RNA sequencing gels. Based on their relative incorporation of the 3' end-label, about half the molecules terminate after U400, the remainder include U401 (see below).
The above results are summarized in Fig. 1, which indicates the restriction sites relevant to the cloning strategies, the extents of synthetic oligomer primers from which extensions were carried out.
We were concerned about its physiological relevance. Although we had the RNase P activity (above), it remained conceivable that the true enzyme was a minor component of that preparation, hence was not cloned or sequenced. We therefore deemed it important to demonstrate that transcripts from the cloned gene are active in the RNase P reaction.
Initial attempts to join the two fragments of the ostensible RNase P RNA gene and clone them as the single HindIII-PstI unit were unsuccessful. Possibly, the regions flanking the gene at both the 5' and 3' termini undergo some sort of recombinational event that renders the fragment unstable. In this regard, it is noteworthy that early attempts at cloning the intact gene in bacteriophage λ and plasmid vectors (pUC19 and pBR322) were unsuccessful. Although primary clones were recovered in all cases, they segregated during subsequent growth, presumably through rearrangement and deletion of cloned material. The same phenomenon occurred with the intact RNase P RNA gene from Bacillus stearothermophilus (unpublished).
In seeking to circumvent the deletion problem during cloning, a strategy was devised to eliminate genomic sequences flanking the coding region. Sequences at the 5' side of the gene were removed by cleavage with PstI, which also deleted 23 nucleotides of the mature domain of the molecule. Nucleotides removed from the gene were added back in the form of a synthetic oligonucleotide. Positions 2 and 5 in the mature RNA sequence were altered to introduce a BamHI restriction site at the 5' end of the gene. The oligonucleotide also included a HindIII site 5'-adjacent to the BamHI site. The abbreviated 5' fragment was subcloned as a HindIII-EcoRI fragment in pUC19.
The replicative form of the M13 clone containing the 3' EcoRI-PstI fragment was digested with EcoRI and DraI endonucleases, generating a 128-base pair fragment containing 12~nucleotides of the 3' end of the mature molecule. This fragment was subcloned into the EcoRI and +I sites of pUC19.
Both of the abbreviated fragments of the ostensible RNase P RNA gene were isolated, the 5' portion of the mature domain as a BamHI-EcoRI fragment (construction described above), and the 3' EcoRI-DraI fragment (Fig. 1) as an EcoRI-HindIII (from the pUC19 polylinker) fragment, and then ligated into the BamHI and HindIII sites adjacent to the phage T7 promoter, in the expression plasmid vector pMT71 (supplied by Dr. Olke Uhlenbeck). This construct contains the full length of the coding region of the RNase P RNA gene, essentially devoid of flanking sequences, and it proved to be stable to maintenance in E. coli HB101. The isolated 5' and 3' fragments also were subcloned in expression vectors pT72 and pT71 (from U.S. Biochemical Corp.), as HindIII-EcoRI and EcoRI-HindIII fragments, respectively.
Activity of in vitro transcripts -- The described plasmid DNAs were isolated, linearized with an appropriate restriction endonuclease (legend, Fig. 3) and incubated in the presence of T7 RNA polymerase and the four ribonucleoside triphosphates, to produce run-off transcripts. The nucleotide sequences of these transcripts are given in Fig. 3 and a gel electrophoretic analysis of them is shown in Fig. 4.
The various run-off transcripts were assayed for their abilities to accurately process a precursor of tRNAHis, as detailed in Materials and Methods. The results of such an experiment are shown in Fig. 5. The full length transcripts accurately cleave the tRNAHis precursor, yielding the same products generated by the RNase P RNA isolated from cells. The activity of the in vitro transcripts establishes that the B. subtilis RNase P RNA gene was successfully cloned. In fact, the RNA produced in vitro is more active than an equivalent mass of that isolated from cells, probably because the cellular RNA is contaminated with other RNAs or was damaged in some way during isolation. The 5' however upon mixing the fragments, activity is restored. This
in vivo, the RNA isolated from B. subtilis cells; intact (X) and intact (H), transcripts of the cloned gene linearized with =I and HindIII, respectively; 5' 70%, 5' 70% of the gene, ending at the EcoRI site; 3' 30% (X) and 3' 30% (H), the cloned 3' end of the gene up to the X k I and HindIII sites, respectively.