The Primary Structure of Rat Ribosomal Protein L5
A COMPARISON OF THE SEQUENCE OF AMINO ACIDS IN THE PROTEINS THAT INTERACT WITH 5 S rRNA*

(Received for publication, March 31, 1987)
Yuen-Ling Chan, Alan Lin, James McNally, and Ira G. Wool
From the Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637
The covalent structure of rat ribosomal protein L5, which associates with 5 S rRNA in the organelle, was deduced from the sequence of nucleotides in a recombinant cDNA (pL5-6-4) and confirmed from the sequences of amino acids in portions of the protein. Ribosomal protein L5, encoded by pL5-6-4, contains 296 amino acids and has a molecular weight of 34,298. However, a second recombinant cDNA, pL5-8-5, encodes a protein with an additional methionyl residue at position 236 and may be the product of a second active L5 gene. Rat L5 is homologous to yeast YL3 and to Halobacterium cutirubrum HL13, proteins that also bind to 5 S rRNA. No significant structural similarity, however, was found between rat L5 and other 5 S rRNA-binding proteins: not with a second H. cutirubrum protein, HL19, nor with the Escherichia coli ribosomal proteins L5, L18, or L25, nor with the Xenopus laevis transcription factor IIIA. H. cutirubrum HL19, however, has structural identity with E. coli L5 and seems to be related to yeast YL3 and, hence, may be an evolutionary link between the prokaryotic and eukaryotic 5 S rRNA-binding proteins. A group of ribosomal proteins not known to be associated with 5 S rRNA is also related to rat L5. They include: rat L39, Euglena gracilis chloroplast S7, Saccharomyces cerevisiae L31 and L46, Homo sapiens L32 and, perhaps, several others as well. There is an especially close interrelationship between rat L5, rat L39, yeast L46, human L32, and mouse L32. These results, and others, suggest that ribosomal proteins form an extended family and that L5 may contain in its structure traces of this affinity.
It is a truism that what is required to solve the structure of the ribosome is information on the chemistry of the constituents and knowledge of how the proteins and nucleic acids interact. Ribosomal 5 S RNA and the proteins that associate with it provide a convenient and tractable means for studying the latter problem since the nucleic acid is small and only a few proteins bind to it.
Moreover, ribonucleoprotein complexes containing 5 S rRNA appear to be discrete subparticles of the organelle (1-8). The complexes can be reconstituted independently of other ribosomal components and they can be removed from, and added back to, the ribosome as a unit. The hope is that the rules that govern the interactions in this simpler case will be easier to determine and will be generally applicable.
* This work was supported by National Institutes of Health Grants GM 21769 and 33702. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
What is essential for this analysis is a means of determining with some precision the region of 5 S rRNA that is in contact with the proteins that bind specifically to the nucleic acid. For this purpose, use has been made of the cytotoxic ribonuclease α-sarcin (9). The toxin has proven nearly ideal for footprinting RNA, something that had been heretofore both tedious and difficult. Thus, the binding sites on 5 S rRNA for the Escherichia coli ribosomal proteins L5, L18, and L25 (10), for the rat ribosomal protein L5 (11), and for the Xenopus laevis transcription factor IIIA (12) have been defined precisely. A second necessity for this analysis is the structure of the proteins that associate with 5 S rRNA and, ultimately, the identity and the conformation of the amino acids in the portion of the protein in contact with the binding site on the nucleic acid. We report here, as a part of the undertaking, the primary structure of rat ribosomal protein L5, which associates with 5 S rRNA in the organelle. The sequence of amino acids in L5 was inferred from the sequence of nucleotides in a recombinant cDNA and confirmed by sequencing portions of the protein. A comparison has been made of the sequence of amino acids in rat L5 and in the other proteins that bind to 5 S rRNA. The purpose of this search was to determine if there are common conserved structural features in the several proteins that might lead to the identification of the nucleic acid binding site.
EXPERIMENTAL PROCEDURES

RESULTS AND DISCUSSION
The Sequence of Nucleotides in a Recombinant cDNA Encoding Rat Ribosomal Protein L5-A cDNA library of 30,000 independent transformants was constructed from poly(A)+ mRNA prepared from regenerating rat liver. The library was screened for clones that hybridized to an oligonucleotide probe that was synthesized to be complementary to the sequence of nucleotides predicted to be present in the portion of the mRNA that encoded 9 consecutive amino acids near the NH2 terminus of rat ribosomal protein L5 (Fig. 1). A random selection of colonies from the original cDNA library was screened with the L5 probe; 15 gave a positive hybridization signal. The DNA from the plasmids of the 15 transformants was isolated, digested with restriction endonucleases, and analyzed by gel electrophoresis. These clones had inserts that ranged in length from 0.35 to 1.1 kilobases, and Southern blot hybridization with the oligonucleotide probe confirmed that all of the inserts contained cDNA for L5. The anticipated length of the L5 coding sequence, calculated from the molecular weight of the protein, is 885 nucleotides. For this reason, the 2 clones with the largest inserts (1.1 and 1.0 kilobases, designated pL5-6-4 and pL5-8-5, respectively) were selected and the nucleotide sequence determined. Nucleotide sequences from both strands of the DNA, and overlapping sequences for each restriction site, were obtained (Fig. 5).
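The figure of 885 nucleotides can be reproduced with a short calculation. The sketch below is one plausible route, not necessarily the one used here: it assumes the gel estimate of 32,500 for the molecular weight (quoted later in the text) and an average residue mass of 110 daltons, a common rule of thumb that the text itself does not state.

```python
# Assumption: an average amino acid residue mass of about 110 Da.
# The text does not say which value was used; this is a rule-of-thumb figure.
AVG_RESIDUE_MASS_DA = 110.0


def estimated_coding_length(protein_mass_da, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Estimate residue count and coding length (nucleotides) from a protein mass."""
    residues = int(protein_mass_da // avg_residue_mass)  # whole residues only
    return residues, residues * 3                        # three nucleotides per codon


residues, nucleotides = estimated_coding_length(32_500)
print(residues, nucleotides)
```

Under these assumptions the estimate is 295 residues and 885 nucleotides, which is why only the two longest inserts (1.0 and 1.1 kilobases) were candidates for a full-length coding sequence.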
The cDNA insert in pL5-6-4 contains 1102 nucleotides and includes the homopolymer linkers, the 5' and 3' noncoding sequences, a terminal poly(A) tract, and a single open reading frame (Fig. 6). In the other two reading frames the sequence is interrupted by many termination codons. The open reading frame of 888 nucleotides begins at an ATG codon at position 82 and ends at a termination codon (TAA) at position 970; it encodes a protein of 296 amino acids (Fig. 6). The hexamer AATAAA (position 1006-1011), presumed to be the recognition sequence directing post-transcriptional cleavage-polyadenylation of the 3' end of pre-mRNA (44), is located 13 nucleotides upstream of the start of the poly(A) stretch. This hexamer is generally found 10-30 nucleotides from the site of the initiation of polyadenylation. The length of the poly(A) tail is 67 nucleotides. The 5' noncoding sequence is 56 nucleotides in length. The context in which the initiation codon occurs, CAGATGG, differs from the optimum (ACCATGG) at positions -3, -2, and -1 but has the crucial guanosine at +4 (45).
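The coordinates and the initiation context quoted above can be checked mechanically. The following is a minimal sketch; the function names are illustrative only, and the numbering convention (A of the ATG is +1, with no position 0) is the usual one for initiation contexts and is assumed here.

```python
def orf_end(start, n_codons):
    """Return the 1-based position of the first nucleotide after the coding codons."""
    return start + 3 * n_codons


def context_mismatches(observed, optimum, upstream=3):
    """Positions (A of ATG = +1, no position 0) at which two contexts differ.

    Both strings must be the same length and carry `upstream` nucleotides
    before the ATG.
    """
    diffs = []
    for i, (obs, opt) in enumerate(zip(observed, optimum)):
        rel = i - upstream                  # -3, -2, -1, 0, 1, 2, 3
        pos = rel if rel < 0 else rel + 1   # skip the non-existent position 0
        if obs != opt:
            diffs.append(pos)
    return diffs


# 296 codons starting at position 82 place the termination codon at 970.
print(orf_end(82, 296))
# The observed context differs from the optimum only upstream of the ATG.
print(context_mismatches("CAGATGG", "ACCATGG"))
```

The mismatches fall at -3, -2, and -1, and position +4 (the final G) is shared, in agreement with the description in the text.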
The sequence of nucleotides in the cDNA insert in pL5-8-5 differs from that in pL5-6-4 in two respects. First, the former (pL5-8-5) lacks the codons for the two carboxyl-terminal amino acids, the 3' non-transcribed region, and the poly(A) tail. Second, and more importantly, pL5-8-5 encodes an additional methionine at position 236. Thus, the sequence in pL5-6-4 is -Pro-Asp-Met-Glu-Glu-Met-, whereas the sequence in pL5-8-5 is -Pro-Asp-Met-Met-Glu-Glu-Met-. It is, of course, conceivable that the additional methionine codon was inserted in pL5-8-5 as the result of a copying error (i.e. the enzyme stuttered) during the synthesis of either the first or second strand of the cDNA prior to construction of the library. This is conceivable but not likely since the enzymes, reverse transcriptase and E. coli polymerase I, are not known to have trouble copying ATG sequences (polymerase I sometimes makes errors with G or C repeats and reverse transcriptase with A or T repeats); moreover, the error would need to be an insertion of exactly three nucleotides. A second possibility is that there is a single active gene for ribosomal protein L5 but that there are two species of mRNA, perhaps as a result of a splicing error during processing of the precursor. This possibility would be more plausible if the methionines in question were at an exon-intron junction. Finally, one can imagine that there are two active L5 genes and that they differ only in having 1 or 2 methionyl residues in the relevant region. Unfortunately, we cannot decide between these possibilities from the data on hand since the analysis of the number of methionyls in L5 is not sufficiently precise. However, there is indirect evidence for 2 methionines and it comes from an alignment of rat L5 and the homologous yeast protein YL3 (Fig. 7). Although the yeast protein contains neither 1 nor 2 methionines at this position, 2 residues rather than 1 are needed to properly align the surrounding region (cf. Fig. 7 and the discussion later). This does not militate against the possibility that there are two different mRNAs.
Preliminary experiments suggest that the difference between pL5-8-5 and pL5-6-4 is not the reflection of a cloning artifact, but rather that there are two species of mRNA for L5. We synthesized two oligodeoxynucleotides (43-mers) complementary to the region in question. The first encodes 1 methionine as in pL5-6-4 and is referred to as L5(M1); the second has 2 methionine codons as in pL5-8-5 and is referred to as L5(M2). L5(M1) has 22 nucleotides on the 5' side of the methionine codon and 18 on the 3' side; L5(M2) has 22 nucleotides on the 5' side of the 2 methionines and 15 on the 3' side. The two probes were hybridized at 40 °C to poly(A)+ mRNA fixed to nylon filters and then washed at 50, 68, 77, and 85 °C; 50, 68, and 77 °C are the melting temperatures (Tm) for hybrids having 11, 5, or 1 mismatches, whereas at 85 °C all hybrids will melt. The hybrids formed with L5(M1) and L5(M2) were stable at 50, 68, and 77 °C; of course, neither was stable at 85 °C. At the conclusion of the experiment the probes were rehybridized to the filters containing mRNA at 40 °C and washed at 50 °C. Since rehybridization of the probes was as great as in the first cycle we conclude that little or no mRNA was lost from the filters even at 85 °C. Inasmuch as both probes bound to the poly(A)+ mRNA at 77 °C, where even one mismatch should have caused a hybrid to melt, it is likely that there are two populations of mRNAs for L5. It now needs to be determined whether there are two functional genes for L5 whose only difference is whether they encode 1 or 2 methionines near position 236, as seems probable to us, or whether a splicing mishap occurs near this position during processing of the primary transcript.
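The design of the two probes can be verified by simple bookkeeping: each is a 43-mer, so the extra methionine codon in L5(M2) is offset by a flank that is shorter by three nucleotides. A minimal sketch (the function name is illustrative only):

```python
CODON_LENGTH = 3  # nucleotides per codon


def probe_length(five_prime_flank, n_met_codons, three_prime_flank):
    """Total probe length: 5' flank + methionine codon(s) + 3' flank."""
    return five_prime_flank + CODON_LENGTH * n_met_codons + three_prime_flank


l5_m1 = probe_length(22, 1, 18)  # one Met codon, as in pL5-6-4
l5_m2 = probe_length(22, 2, 15)  # two Met codons, as in pL5-8-5
print(l5_m1, l5_m2)
```

Both lengths come to 43, consistent with the description of the probes as 43-mers.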
The Primary Structure of Rat Ribosomal Protein L5-The reading frame in pL5-6-4 flanked by initiation and termination codons encodes a protein of 296 amino acids (Fig. 6). This protein was identified as rat ribosomal protein L5 in the following manner. The recombinant cDNA clone (pL5-6-4) was selected using an oligonucleotide probe that was complementary to the codons for a sequence of 9 amino acids near the NH2 terminus of L5 (Fig. 1). Moreover, the amino acid sequence derived from the sequence of nucleotides in the cDNA corresponded exactly to the NH2-terminal 18 residues of L5 determined directly from the protein (Figs. 1 and 6), to the carboxyl-terminal 3 residues determined with carboxypeptidase (Fig. 4), and to a pentapeptide (-Arg-Phe-Pro-Gly-Tyr-) derived from L5 (Fig. 6; positions 179-183 in the amino acid sequence and 616-630 in the nucleotide sequence). The amino acid composition inferred from the cDNA is, in general, similar to that previously (29) derived from a hydrolysate of purified L5 (Table I). There are, nonetheless, discrepancies (for example, in the number of glycyl, histidyl, isoleucyl, leucyl, methionyl, and tyrosyl residues) that we are at a loss to account for, other than to attribute them to experimental error in the original amino acid analysis.
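The correspondence between amino acid and nucleotide positions for the pentapeptide follows from the position of the initiation codon. The sketch below assumes 1-based numbering with the ATG beginning at position 82, as stated for pL5-6-4; the function name is illustrative only.

```python
ORF_START = 82  # position of the A of the initiating ATG in pL5-6-4


def residue_span_to_nucleotides(first_residue, last_residue, orf_start=ORF_START):
    """Map a 1-based residue range to the 1-based nucleotide range of its codons."""
    first_nt = orf_start + 3 * (first_residue - 1)  # first base of the first codon
    last_nt = orf_start + 3 * last_residue - 1      # third base of the last codon
    return first_nt, last_nt


# Pentapeptide -Arg-Phe-Pro-Gly-Tyr- at residues 179-183
print(residue_span_to_nucleotides(179, 183))
```

The result, nucleotides 616 to 630, matches the positions given for the pentapeptide.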
The molecular weight of rat ribosomal protein L5, calculated from the sequence of amino acids deduced from the nucleotide sequence of pL5-6-4, is 34,298, close to that of 32,500 estimated from the migration of the purified protein on sodium dodecyl sulfate gels (29). It is to be noted that since pL5-8-5 encodes an additional methionine the number of residues in L5 could be 297 and the molecular weight 34,447. Moreover, the NH2-terminal residue in L5 is glycyl (Fig. 1); thus, the initial methionine is removed and the number of amino acids in the processed protein is 295 or 296 and the molecular weight is 34,149 or 34,298, or both, depending on whether there are 1 or 2 methionyls near position 236 and whether there are one or two distinct varieties of L5 in the cell.
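The four combinations described above (1 or 2 methionyls near position 236, with or without the initiating methionine) can be tabulated as follows. This is a bookkeeping sketch only: the increment of 149 per methionine is taken from the difference between the two values quoted in the text (34,447 and 34,298), not from an independent mass calculation.

```python
BASE_RESIDUES = 296   # encoded by pL5-6-4
BASE_MASS = 34_298    # molecular weight calculated for the pL5-6-4 product
MET_INCREMENT = 149   # assumption: 34,447 - 34,298, the step used in the text


def l5_variants():
    """Residue count and mass for each (extra Met, initiator removed) combination."""
    table = {}
    for extra_met in (0, 1):
        for initiator_removed in (False, True):
            delta = extra_met - (1 if initiator_removed else 0)
            table[(extra_met, initiator_removed)] = (
                BASE_RESIDUES + delta,
                BASE_MASS + MET_INCREMENT * delta,
            )
    return table


for key, value in l5_variants().items():
    print(key, value)
```

With the initiating methionine removed, the processed protein has 295 residues (34,149) when there is 1 methionyl near position 236 and 296 residues (34,298) when there are 2.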
Protein L5 has an excess of basic (26 arginyl, 33 lysyl, and 5 histidyl) over acidic (15 aspartyl and 20 glutamyl) residues; a total of 64 or 22% of the former and 35 or 12% of the latter. The protein has a strikingly large number of tyrosyl and of methionyl residues, 21 and 10 or 11, respectively. It needs to be remarked that basic amino acids and tyrosine are frequently involved in binding to nucleic acids (quoted in Ref. 46). There are 4 cysteinyl residues (at positions 61, 76, 100, and 144) but whether they are linked in disulfide bridges is not known. The basic and acidic residues are not uniformly or randomly distributed in L5. There are 3 groups of 4 consecutive basic amino acids (at positions 21-24, 195-198, and 261-264) and two regions where they are concentrated. The first is of 20 basic residues in a stretch of 53 near the NH2 terminus (positions 5-58) and the second, near the carboxyl terminus (positions 193-283), is of 27 residues in a sequence of 89. The clustering of basic amino acids in eukaryotic ribosomal proteins has been remarked on before (47, 48) but the significance of the observation, if, indeed, there is any, is not known. There is, perhaps, one region where acidic residues are frequent; at positions 184-237, 13 of 53 of the amino acids are acidic.
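The proportions of basic and acidic residues follow directly from the counts given above and the length of 296 residues encoded by pL5-6-4. A minimal sketch of the arithmetic:

```python
TOTAL_RESIDUES = 296  # protein encoded by pL5-6-4

basic = {"Arg": 26, "Lys": 33, "His": 5}
acidic = {"Asp": 15, "Glu": 20}


def count_and_percent(counts, total=TOTAL_RESIDUES):
    """Return the summed count and its percentage of the total, rounded to an integer."""
    n = sum(counts.values())
    return n, round(100 * n / total)


print(count_and_percent(basic))
print(count_and_percent(acidic))
```

The totals are 64 basic residues (22%) and 35 acidic residues (12%), as stated.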
The Interaction of Ribosomal Proteins with 5 S rRNA-A purpose in determining the sequence of amino acids in L5 was to facilitate an analysis of the nature of the interaction of the ribosomal protein with 5 S rRNA. There is a great deal of data concerning the structure of the regions of 5 S rRNA that interact with proteins (10-12, 49-51). The comparison ranges from the proteins that associate with the nucleic acid in E. coli (10, 50), to those in Halobacterium cutirubrum (52), in Saccharomyces cerevisiae (53), and in rat (11). What has been gleaned from these studies is the rather strong impression of the conservation of the structure of the attachment site for ribosomal proteins on 5 S rRNA (cf. Fig. 8). The similarity is all the more remarkable inasmuch as the number of proteins with an affinity for 5 S rRNA varies. For example, one can effect the release of discrete ribonucleoprotein complexes from the large subunit of ribosomes that contain 5 S rRNA and one to three ribosomal proteins. In the ribonucleoprotein particle from E. coli ribosomes (6) there are three proteins (L5, L18, and L25), in the H. cutirubrum complex (7) there are two proteins (HL13 and HL19), and in the assembly from yeast (YL3) (8) or rat (L5) (1, 5) there is only one. It is curious that the number of ribosomal proteins in primary contact with 5 S rRNA has decreased from three in prokaryotic, to two in archebacterial, to one in eukaryotic ribosomes, although there have been only modest changes in the structure of 5 S rRNA and in the attachment sites for the proteins (cf. Fig. 8). One explanation is that a small number of amino acid domains (each of moderate size) are in contact with the nucleic acid and that these domains can occur in one, two, or three proteins.
Comparison of the Sequence of Amino Acids in the Proteins That Bind to 5 S rRNA-We sought to determine if there had been conservation of the nucleic acid-binding domain in proteins that associate with 5 S rRNA. The first comparison was of rat ribosomal protein L5 and Xenopus transcription factor IIIA. Although it is not a ribosomal protein, the latter forms a complex with 5 S rRNA (54-56). The binding of factor IIIA to an intragenic control region is necessary for the initiation of transcription of 5 S rRNA genes by RNA polymerase III (57-60). During oogenesis in Xenopus, factor IIIA also binds to the transcript, i.e. to 5 S rRNA, to form a ribonucleoprotein particle (54-56) that is part of an inventory that is used later for ribosome biogenesis (61, 62). Thus, transcription factor IIIA binds to 5 S rDNA and to 5 S rRNA. The sequence of amino acids in factor IIIA was deduced by Ginsberg et al. (31) from the sequence of nucleotides in a recombinant cDNA. However, it was Miller et al. (63) who uncovered an extraordinary feature of the protein. Transcription factor IIIA has nine repeats of a 30-residue unit. The most prominent motif in the consensus sequence abstracted from the repeats is two invariant pairs of cysteinyl and histidyl residues. The suggestion (63) is that the 30-residue repeat constitutes a "finger" in which pairs of cysteines and histidines tetrahedrally coordinate one zinc molecule and an extended loop of amino acids forms a DNA-binding region in contact with one-half turn of the helix. This proposal is, of course, most directly pertinent to the binding of factor IIIA to the 5 S gene, and the relevance, if any, that it has for the association with 5 S rRNA is not known, although it has been proposed that factor IIIA binds to a similar helical structure in both 5 S rDNA and rRNA (12). More to the point, the binding site for factor IIIA on 5 S rRNA closely approximates that for L5 (Fig. 8) and, hence, one might anticipate similarities in their structure.
Despite this expectation no identity between factor IIIA and rat ribosomal protein L5 was found (Table II). Protein L5 has no repeat sequences, it does not have an unusually large number of cysteinyl or histidyl residues (Table I), and there is no indication of the presence of a sequence of amino acids related to the consensus derived from the factor IIIA repeat. The latter is not surprising since there is no evidence that L5 coordinates zinc and, hence, it has no need for the pairs of cysteines and histidines that define the factor IIIA repeats. Finally, there are no similarities in the structure of factor IIIA and any of the other ribosomal proteins that bind to 5 S rRNA (Table II). There is not available a complete sequence of the amino acids in any other eukaryotic 5 S rRNA-binding protein. Nazar et al. (8, 32) have determined 119 of the approximately 318 residues in yeast YL3: 30 at the NH2 terminus (which we refer to as YL3-CN1), and 89 in a cyanogen bromide fragment of 102 residues (YL3-CN2) derived from the carboxyl end of the protein. Even with the limited sequence information available for the yeast ribosomal protein it is apparent that YL3 and L5 are homologous. The term is used here in the strict sense to connote derivation from a common ancestral gene. The score for the comparison of the NH2 termini of YL3-CN1 and L5 is 5.0 standard deviation units (Table II), where 3 S.D. is generally taken to indicate homology (64). For YL3-CN2 the score is 4.8 (Table II). The relatedness is apparent in an alignment of the NH2 termini of rat L5 and yeast YL3 (Fig. 7); there are 18 identical or similar amino acids in a sequence of 27 consecutive residues (67% identity). For the YL3-CN2 fragment there are 14 identities in 29 comparisons (48% identity) (Fig. 7). Note that 2 methionyl residues are required at positions 235 and 236 to properly align the 2 amino acid segments.
Table legend: The comparison was made using RELATE. The values are the number of standard deviations of the real score above the random score; three standard deviations correspond to p < 0.001 and are taken to indicate that the proteins are likely to be homologous.
Smith et al. (7) have determined the sequence of the amino acids at the NH2 termini of the two proteins from H. cutirubrum ribosomes, HL13 (30 residues) and HL19 (28 residues), that bind to 5 S rRNA. In the comparison of rat L5 with the archebacterial ribosomal proteins one gets a mixed signal. The extent to which the NH2-terminal portion of HL13 is related to the same region of rat L5 is impressive; the score is 6.2 (Table II). However, there is no indication that the NH2-terminal residues of HL19 are related to rat L5; the score is -0.9 (Table II). These results are, of course, provisional since only a portion of the sequence of proteins HL13 and HL19 is available. Nonetheless, the similarity of the structure at the NH2 termini of HL13 and rat L5 is striking when the alignment is examined (Fig. 7); there are 15 identities (counting conservative substitutions) in 27 comparisons (56% identity). It can be seen that HL13 is related to YL3 as well (Fig. 7); there are 12 identities in 21 comparisons (57% identity).
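The percentages quoted for the alignments count conservative substitutions (isoleucine/leucine/valine, aspartic acid/glutamic acid, serine/threonine, and lysine/arginine) together with strict identities. The sketch below shows how such a count might be made on a pair of pre-aligned sequences. It is illustrative only: the two sequences are made up, "-" is assumed to mark a gap, and this is not the procedure used to produce Fig. 7.

```python
# Conservative substitution groups named in the paper (one-letter codes).
CONSERVATIVE_GROUPS = [set("ILV"), set("DE"), set("ST"), set("KR")]


def is_match(a, b):
    """True for a strict identity or a substitution within one conservative group."""
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def count_matches(aligned_a, aligned_b):
    """Return (matches, comparisons) for two aligned sequences; gap columns are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = comparisons = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        comparisons += 1
        if is_match(a, b):
            matches += 1
    return matches, comparisons


def percent(matches, comparisons):
    """Percentage rounded to the nearest integer."""
    return round(100 * matches / comparisons)


# Hypothetical aligned fragments, not taken from any real protein.
m, n = count_matches("MKVLDSRW-A", "MRILETKGQA")
print(m, n, percent(m, n))
```

Applied to the counts in the text, `percent(18, 27)`, `percent(14, 29)`, `percent(15, 27)`, and `percent(12, 21)` give the quoted 67, 48, 56, and 57%.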
The comparison of the structure of rat ribosomal protein L5 with the E. coli 5 S rRNA-binding proteins L5, L18, and L25 (43) using the computer program RELATE failed to reveal a relationship (Table II). Nazar et al. (8) had suggested, from inspection of the amino acid sequences, that the NH2 termini of YL3 and of HL13 were related to E. coli L18 and that fragment CN2 of YL3 was similar to E. coli L5. We do not find evidence of either of these identities when the comparison is made using the program RELATE (Table II). We do support, with a more rigorous analysis, another suggestion (7), namely, of a relationship of H. cutirubrum HL19 to E. coli L5. The score for the comparison is 6.6 (Table II) and in the alignment (Fig. 7) there are 11 identities in 20 comparisons (55% identity). Moreover, there is a suggestive relationship of H. cutirubrum HL19 to yeast YL3-CN1; the score is 2.5. Once again one must be cautious because only a portion of the structure of HL19 has been determined. Nevertheless, it does appear that E. coli L5 and HL19 are, at the least in part, related. It is striking that one of the 5 S rRNA-binding proteins in H. cutirubrum ribosomes, HL13, is related to the eukaryotic proteins, yeast YL3 and rat L5, whereas the second one, HL19, is related to the E. coli protein L5 and perhaps to yeast YL3 as well. This adds support to the surmise that archebacterial ribosomes are evolutionary transition particles between eukaryotes and prokaryotes and have properties of both (65).
We undertook these comparisons in the hope that they might reveal conserved domains in the proteins that could be considered candidate 5 S rRNA-binding sequences. Although we do find conserved domains we are not yet able to associate them with binding to 5 S rRNA because the amino acid sequences of several of the relevant proteins are not complete, because the number of proteins for which there is sequence data is not sufficiently large, and because there are inconsistencies. For example, there is no relation that we can uncover in the sequence of amino acids in rat L5 and in Xenopus transcription factor IIIA despite the observation that they bind to the same site on 5 S rRNA. There are several possible ways to resolve this paradox. There may be subtle similarities in the primary structure of the two proteins, transcription factor IIIA and L5, that have escaped our notice, or different sequences of amino acids might be able to bind to the same structure in 5 S rRNA, or, what is perhaps most likely, there are similarities in the secondary or tertiary structures of the two proteins that account for their association with the same site on the nucleic acid. It is not easy to account for the lack of a relationship between rat L5 and the E. coli proteins L5, L18, and L25. It may be that the differences in the secondary structure of prokaryotic and eukaryotic 5 S rRNA, no matter how trivial they seem, are sufficient to require a different sequence of amino acids in the nucleic acid-binding site on eukaryotic and prokaryotic ribosomal proteins, or, once again, the similarities may be in the higher order structure of the proteins.
In the counts of identities we include both strict identities and conservative substitutions: isoleucine/leucine/valine, aspartic acid/glutamic acid, serine/threonine, and lysine/arginine.
There is encouragement in the finding that a number of the ribosomal proteins that bind to 5 S rRNA are related: rat L5 to yeast YL3 and to H. cutirubrum HL13, and H. cutirubrum HL19 to E. coli L5. When the amino acid sequences of the yeast and H. cutirubrum proteins are completed there may be obvious conserved domains that will allow one to postulate the chemistry of the nucleic acid-binding site on the proteins.
Comparison of the Sequence of Amino Acids in Rat L5 with that in Ribosomal Proteins from Other Species-The sequence of amino acids in the rat ribosomal protein L5 was compared, using the computer program RELATE (30), to the sequence of amino acids in 280 other ribosomal proteins contained in a library that we have compiled, and that includes the complete set of 52 from E. coli (43). High scores were obtained for several proteins (Table III). A score, computed as the distance in standard deviations between the actual comparison and the mean of 100 comparisons of randomized sequences, of at least 3.0 is ordinarily required to assign significance to a relationship (64). The ribosomal proteins related to rat L5 include the following: rat L39, with a score of 4.3 standard deviations; Euglena gracilis chloroplast S7, 3.8; S. cerevisiae L46, 3.5; S. cerevisiae L31, 3.3; and Homo sapiens L32, 3.0. A remarkable aspect of the results of the comparison is the indication that pairs of ribosomal proteins from the same organism are related, i.e. rat L5 and L39, and that a single rat protein (L5) may be related to two proteins in a second organism, i.e. to yeast L46 and to yeast L31. There is, furthermore, a set of ribosomal proteins that gives scores between 2.9 and 2.6 that we interpret as indicating the possibility of a relationship to rat L5 (Table III). We attempted to align the several proteins (Table III) with rat L5 using the program ALIGN (30) but we were not successful.
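The score described above is the distance, in standard deviations, between the real comparison and the mean of 100 comparisons of randomized sequences. The sketch below illustrates that idea only. It is not the RELATE program: the scoring function (the best count of identical residues over all ungapped offsets) is a deliberately simple stand-in, the fixed seed is there for reproducibility, and the test sequence is hypothetical.

```python
import random
import statistics


def best_ungapped_score(a, b):
    """Highest number of identical residues over all ungapped offsets of b against a."""
    best = 0
    for shift in range(-(len(b) - 1), len(a)):
        score = 0
        for j, residue in enumerate(b):
            i = shift + j
            if 0 <= i < len(a) and a[i] == residue:
                score += 1
        best = max(best, score)
    return best


def z_score(a, b, n_random=100, seed=0):
    """(real score - mean of shuffled scores) / standard deviation of shuffled scores."""
    rng = random.Random(seed)
    real = best_ungapped_score(a, b)
    random_scores = []
    for _ in range(n_random):
        shuffled = list(b)
        rng.shuffle(shuffled)  # same composition as b, random order
        random_scores.append(best_ungapped_score(a, "".join(shuffled)))
    sd = statistics.stdev(random_scores)
    if sd == 0:
        raise ValueError("randomized scores show no spread; z-score undefined")
    return (real - statistics.mean(random_scores)) / sd


# Hypothetical sequence compared with itself: the score should be far above 3.
seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
print(z_score(seq, seq))
```

Shuffling preserves amino acid composition, so a high score indicates that the order of residues, and not merely the composition, is shared by the two sequences.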
There is a group of eukaryotic ribosomal proteins, rat L5 and L39, yeast L46, mouse L32, and human L32, that appear to be closely related (Table IV). Apart from rat L5 none of these proteins are known to be associated with 5 S rRNA.
Mouse and human L32 have identical amino acid sequences as had been noted before (36) just as it was known that yeast L46 and rat L39 are homologous (34). What is apparent now is that all of these ribosomal proteins are related to rat L5. There is, for example, one segment of 15 amino acids beginning at position 13 in rat L5 that is very similar to sequences that begin at position 74 in human and mouse L32, at position 8 and again at 33 in rat L39, and at position 34 in yeast L46 (Fig. 9). A second similar fragment, this time of 19 amino acids, is at position 254 in rat L5, at position 24 in human and mouse L32, at position 32 in rat L39, and at position 33 in yeast L46 (Fig. 9). Note that a fragment of rat L39 (starting near position 32) can be aligned with two parts of rat L5 and that two sequences in L39 (beginning at positions 8 and 33) can be aligned with a single segment in L5 (Fig. 9). These results, and others (66, 67), suggest that ribosomal proteins form an extended family and that L5 may contain in its structure traces of the affinity.