Amino Acid Signature Enables Proteins to Recognize Modified tRNA

Human tRNALys3UUU is the primer for HIV replication. The HIV-1 nucleocapsid protein, NCp7, facilitates htRNALys3UUU recruitment from the host cell by binding to and remodeling the tRNA structure. Human tRNALys3UUU is post-transcriptionally modified, but until recently, the importance of those modifications in tRNA recognition by NCp7 was unknown. Modifications such as the 5-methoxycarbonylmethyl-2-thiouridine at anticodon wobble position-34 and 2-methylthio-N6-threonylcarbamoyladenosine, adjacent to the anticodon at position-37, are important to the recognition of htRNALys3UUU by NCp7. Several short peptides selected from phage display libraries were found to also preferentially recognize these modifications. Evolutionary algorithms (Monte Carlo and self-consistent mean field) and assisted model building with energy refinement were used to optimize the peptide sequence in silico, while fluorescence assays were developed and conducted to verify the in silico results and elucidate a 15-amino acid signature sequence (R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G, where X can be most amino acids, and Pho is hydrophobic) that recognized the tRNA’s fully modified anticodon stem and loop domain, hASLLys3UUU. Peptides of this sequence specifically recognized and bound modified htRNALys3UUU with an affinity 10-fold higher than that of the starting sequence. Thus, this approach provides an effective means of predicting sequences of RNA binding peptides that have better binding properties. Such peptides can be used in cell and molecular biology as well as biochemistry to explore RNA binding proteins and to inhibit those protein functions.

There are three human isoaccepting tRNAs for the amino acid lysine, htRNA Lys1,2,3 . The three human tRNA Lys decode the two lysine codons, AAA and AAG. Two of the isoacceptors, htRNA Lys1,2 CUU with the anticodon CUU, decode AAG, but only one, htRNA Lys3 UUU with the anticodon UUU, responds to the cognate codon AAA and wobbles to AAG. Besides its important role in protein synthesis, htRNA Lys3 UUU serves as the primer of reverse transcription in the replication of the lentiviruses, including human immunodeficiency virus type 1 (HIV-1). 1,2 During the replication of HIV-1, the host cell htRNA Lys3 UUU is recognized and bound, and its structure destabilized, by nucleocapsid protein 7 (NCp7). 3−5 This destabilization allows the relaxed U-rich anticodon stem loop (hASL Lys3 UUU ), as well as the acceptor stem (Figure 1), to be annealed to the HIV viral RNA. During the subsequent infection, htRNA Lys3 UUU is the primer for HIV reverse transcriptase.
htRNA Lys3 UUU is one of the most uniquely processed tRNAs, having chemically rich post-transcriptional modifications that are important to the conformation and function of the tRNA during protein synthesis. 6 Until recently, the role(s) these modifications play in the tRNA's interaction with NCp7 and in viral replication were not known. The naturally occurring modifications, 5-methoxycarbonylmethyl-2-thiouridine (mcm 5 s 2 U 34 ), at tRNA's wobble position-34, and 2-methylthio-N 6 -threonylcarbamoyladenosine (ms 2 t 6 A 37 ), at position-37, 3′-adjacent to the anticodon in the loop of the hASL Lys3 UUU , are both chemically rich and constitute a unique combination in human tRNAs (Figure 1). 7,8 These modifications enhance NCp7's ability to recognize and bind to the RNA, suggesting that these modifications are an important discrimination factor for recognition by NCp7. 9 The presence of these modifications increases NCp7 affinity for hASL Lys3 almost 10-fold (K d = 0.28 ± 0.03 μM for modified and K d = 2.30 ± 0.62 μM for unmodified ASL). 9 NCp7 is critical to HIV replication because it binds and relaxes the htRNA Lys3 structure, facilitating annealing of the tRNA to the viral genomic RNA and packaging of the genomic RNA into the viral capsid. 4 Fifteen- and 16-amino acid peptides have been selected to mimic NCp7's preferential recognition of the fully modified hASL Lys3 UUU . 9,10 These peptides can be used to study modification-dependent protein recognition of RNAs, in general, and recognition and annealing of htRNA Lys3 UUU to the HIV viral RNA, specifically. One peptide, P6 (sequence RVTHHAFLGAHRTVG), was also shown to mimic NCp7. P6 not only binds hASL Lys3 UUU but also destabilizes the ASL structure, as does NCp7. 9 The ability of peptides to mimic NCp7 makes it possible to engineer a peptide with a signature amino acid sequence that can be used as a tool in future studies of protein recognition of RNAs, particularly those with unique modification chemistries.
Herein, we report the development of a signature amino acid sequence for the recognition of htRNA Lys3 UUU . An algorithm was developed that optimizes the amino acid sequence by combining self-consistent mean field (SCMF) and Monte Carlo (MC) approaches. The resulting peptides were then validated as binders through empirical experimentation in order to corroborate the computer-aided approach. Amino acid substitutions in silico indeed enhanced the modification-dependent binding of the peptide ligand with high affinity and selectivity in vitro. The peptide sequences predicted by the algorithms preferentially bound the modified hASL Lys3 UUU with affinities at or higher than P6, and with greater specificity. The signature sequence provides insight into peptide and protein recognition of the modified tRNA Lys3 UUU .

■ EXPERIMENTAL PROCEDURES
Materials and Reagents. All materials, buffers, and reagents were of RNA grade quality and RNase free. The modified and unmodified hASL Lys3 UUU were chemically synthesized by Thermo Scientific. The modified hASL Lys3 UUU was synthesized with the nucleoside phosphoramidites that were 2′-protected with tert-butyldimethylsilyl-ether. 9 The unmodified hASL Lys3 UUU was synthesized with "ACE" chemistry. 11 All fluorescein labeled peptides were obtained from Sigma-Aldrich (PEPscreen) with sequence verification by mass spectrometry.
In Silico Evolution of Peptide Sequences. A random initial sequence that satisfies the constraints on hydration properties is generated ( Figure 2). For the search described here, we started with the 15-amino-acid sequence of peptide P6, RVTHHAFLGAHRTVG, found experimentally to bind selectively to the modified hASL Lys3 UUU . 9 The peptide backbone conformation is determined via atomistic simulation of the peptide-hASL Lys3 UUU complex and then held fixed with respect to the hASL Lys3 UUU conformation throughout the search. The binding free energy for the complex is then evaluated. Subsequently, a random number is generated to determine whether to mutate one amino acid or to exchange two amino acids. If one amino acid is to be mutated, one site along the peptide sequence is chosen randomly. The amino acid at that site is then mutated to another amino acid of the same residue type. The best rotamer for the new amino acid is chosen to substitute for the old amino acid in this mutation step. If an exchange step is chosen, two random sites along the chain and their corresponding amino acids are chosen for a mutual exchange attempt. In this exchange step, we calculate the effective potential of all the possible rotamers and perform the self-consistent mean field (SCMF) procedure 12 described below to obtain the best rotamer combination for the exchanged amino acids. Regardless of whether one amino acid was mutated or two amino acids were exchanged, the new generated peptide sequence is evaluated further by calculating the new binding free energy and accepted or rejected according to the Metropolis criterion. After a total of 10,000 evolution steps, the best peptide sequences with the lowest binding free energy are identified.
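The search loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the residue grouping in `CATEGORIES`, the move probability `p_mutate`, the temperature factor `kT`, and the placeholder `toy_energy` are all hypothetical (the real categories are in Table 1, and the real score is the binding free energy of the peptide-ASL complex). Rotamer selection and SCMF repacking are omitted here.

```python
import math
import random

# Illustrative grouping only (an assumption); Table 1 holds the real assignment.
CATEGORIES = ["LVIMFYW", "DE", "RK", "STNQ", "HCAP", "G"]
GROUP_OF = {aa: grp for grp in CATEGORIES for aa in grp}


def propose(seq, rng, p_mutate=0.5):
    """Return a trial sequence: a same-category mutation or a two-site exchange."""
    s = list(seq)
    if rng.random() < p_mutate:
        i = rng.randrange(len(s))
        choices = [aa for aa in GROUP_OF[s[i]] if aa != s[i]]
        if choices:  # glycine has no alternative within its own group
            s[i] = rng.choice(choices)
    else:
        i, j = rng.sample(range(len(s)), 2)
        s[i], s[j] = s[j], s[i]
    return "".join(s)


def metropolis_accept(dg_old, dg_new, kT, rng):
    """Accept downhill moves always, uphill moves with Boltzmann probability."""
    if dg_new <= dg_old:
        return True
    return rng.random() < math.exp(-(dg_new - dg_old) / kT)


def evolve(seq, energy, steps=10_000, kT=0.6, seed=0):
    """Run the Monte Carlo search and return the lowest-energy sequence seen."""
    rng = random.Random(seed)
    current, dg = seq, energy(seq)
    best, best_dg = current, dg
    for _ in range(steps):
        trial = propose(current, rng)
        dg_trial = energy(trial)
        if metropolis_accept(dg, dg_trial, kT, rng):
            current, dg = trial, dg_trial
            if dg < best_dg:
                best, best_dg = current, dg
    return best, best_dg


def toy_energy(seq):
    """Placeholder score, NOT a binding free energy: rewards W at even indices."""
    return -float(sum(1 for k, aa in enumerate(seq) if aa == "W" and k % 2 == 0))
```

Because a mutation stays within its category and an exchange only permutes residues, both move types leave the per-category residue counts unchanged, which is how a search of this kind keeps the hydration properties of the starting sequence.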
In our use of the SCMF, a trial exchange between two amino acids at randomly chosen sites is implemented (Figure 3). The conformational probability matrix P = P 0 is set initially for the two amino acids so that all possible rotamers have equal conformational probability. The initial conformational matrix P 0 is then used to calculate the effective potential of each amino acid in each rotamer state. Once the effective potentials for all the rotamer states are known, new conformational probabilities of the rotamers are obtained according to the Boltzmann law so as to constitute a new conformational matrix P 1 . Next, the absolute error between P 1 and P 0 is calculated. If the absolute error is less than 10 −3 , the best rotamers with the highest conformational probability for the two amino acids are selected from P 1 to repack the side chains. Otherwise, the conformational matrix P is updated by employing a self-consistent iteration. The updated conformational matrix P is stored as the old conformational matrix P 0 for the next round of evaluation. The conformational matrix is iterated until the absolute error between P 1 and P 0 is less than 10 −3 . Eventually, the best combination of rotamers is found, thereby repacking the side chains on the fixed backbone.
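The SCMF iteration for two exchanged sites can be illustrated as follows. This is a sketch under assumptions rather than the published code: the energy inputs `self_e` and `pair_e`, the temperature factor `kT`, and the damping factor `mix` used for the self-consistent update are hypothetical, and the "absolute error" is taken here to be the largest element-wise difference between P 1 and P 0 .

```python
import math


def boltzmann(energies, kT):
    """Convert effective potentials into normalized Boltzmann probabilities."""
    e_min = min(energies)  # shift for numerical stability
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def scmf_two_sites(self_e, pair_e, kT=0.6, mix=0.5, tol=1e-3, max_iter=1000):
    """Mean-field rotamer selection for two sites.

    self_e[s][a] : energy of rotamer a at site s with the fixed environment
    pair_e[a][b] : interaction of rotamer a (site 0) with rotamer b (site 1)
    """
    n0, n1 = len(self_e[0]), len(self_e[1])
    p0 = [[1.0 / n0] * n0, [1.0 / n1] * n1]  # uniform initial matrix P0
    for _ in range(max_iter):
        # Effective potential: self term plus probability-weighted partner term.
        eff0 = [self_e[0][a] + sum(p0[1][b] * pair_e[a][b] for b in range(n1))
                for a in range(n0)]
        eff1 = [self_e[1][b] + sum(p0[0][a] * pair_e[a][b] for a in range(n0))
                for b in range(n1)]
        p1 = [boltzmann(eff0, kT), boltzmann(eff1, kT)]
        err = max(abs(x - y) for new, old in zip(p1, p0) for x, y in zip(new, old))
        if err < tol:
            break
        # Damped update; the result becomes P0 for the next round.
        p0 = [[mix * x + (1.0 - mix) * y for x, y in zip(new, old)]
              for new, old in zip(p1, p0)]
    best = tuple(max(range(len(p)), key=p.__getitem__) for p in p1)
    return best, p1
```

The returned tuple holds the index of the most probable rotamer at each of the two sites, which is the combination used to repack the side chains.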
Figure 2. Search algorithm flow strategy. An initial peptide sequence is chosen (in this instance peptide P6). Random numbers were generated to determine whether to mutate one amino acid or not ("No" or "Yes"). If yes, then one amino acid from the sequence was randomly changed to an amino acid from the same residue category (Table 1). If no, then two amino acids from the sequence were randomly exchanged regardless of the residue category. The SCMF algorithm was then used to determine the lowest-energy rotamer combination. The MC algorithm was used to accept or reject the newly generated peptide sequence based on the calculation of binding free energy (ΔG binding ).
The absolute binding energy for a ligand and a receptor is defined to be the difference between the free energy of the complex and the free energies of the ligand and the receptor prior to binding. 13 It was calculated according to the following equation:
$$\Delta G_{\mathrm{binding}} = G^{\mathrm{TOT}}_{\mathrm{complex}} - G^{\mathrm{TOT}}_{\mathrm{ligand}} - G^{\mathrm{TOT}}_{\mathrm{receptor}} \tag{1}$$
where G TOT complex , G TOT ligand , and G TOT receptor represent the total free energies of the complex, the ligand, and the receptor in solution, respectively. The total free energy G TOT of the molecular complex in solution was calculated as follows:
$$G^{\mathrm{TOT}} = U_{\mathrm{INT}} + U_{\mathrm{VDW}} + U_{\mathrm{ELE}} + G_{\mathrm{EGB}} + G_{\mathrm{GBSUR}} \tag{2}$$
where U INT , U VDW , U ELE , G EGB , and G GBSUR indicate the internal energy (INT), van der Waals energy (VDW), electrostatic energy (ELE), the polar solvation energy (EGB), and the nonpolar solvation energy (GBSUR). The internal energy U INT is defined as the potential energy associated with the random, disordered motion of the molecule itself, including the vibration of bonds, bond angles, and the torsion of dihedral angles. The van der Waals energy U VDW between two nonbonded atoms adopts a typical 12−6 Lennard−Jones equation. The electrostatic energy U ELE between two nonbonded atoms follows Coulomb's law. The polar solvation energy G EGB is calculated based on the generalized Born model, which is an analytical way to evaluate the electrostatic contribution to the solvation free energy. The nonpolar solvation energy G GBSUR is approximated by a pairwise potential that occurs at the incompatible interface of two distinct molecular species. In solution, it is proportional to the solvent-accessible surface area of the solute molecules. The expressions for the energies in eq 2 are as follows:
(a) internal energy U INT
$$U_{\mathrm{INT}} = \sum_{\mathrm{bonds}} K_b (b - b_0)^2 + \sum_{\mathrm{angles}} K_\theta (\theta - \theta_0)^2 + \sum_{\mathrm{dihedrals}} \frac{V_n}{2}\left[1 + \cos(n\phi - \delta)\right]$$
where K b and K θ are the harmonic force constants, b 0 and θ 0 are the equilibrium bond length and bond angle, respectively, b and θ are the actual bond length and bond angle, respectively, V n is an energy constant, n is the dihedral multiplicity, φ is the actual torsion angle, and δ is the phase shift. All of the values of K b , b 0 , K θ , θ 0 , V n , and δ come from the Amber force field.
(b) van der Waals energy U VDW
$$U_{\mathrm{VDW}} = \sum_{i<j} \varepsilon_{(ij)} \left[ \left(\frac{r_{0(ij)}}{r_{ij}}\right)^{12} - 2\left(\frac{r_{0(ij)}}{r_{ij}}\right)^{6} \right]$$
where ε (ij) and r 0(ij) are the well depth and the characteristic distance for a pair of atoms i and j, respectively, and r ij is the actual distance between two nonbonded atoms i and j. Each atom has a depth of the potential well ε, and a characteristic distance r 0 at which the potential reaches its minimum. For any pair of atoms i and j, ε (ij) = (ε (i) ·ε (j) ) 1/2 and r 0(ij) = r 0(i) + r 0(j) . All of the values of ε and r 0 come from the Amber force field.
(c) electrostatic energy U ELE
$$U_{\mathrm{ELE}} = \sum_{i<j} \frac{q_i q_j}{r_{ij}}$$
where q x (x = i or j) represents the charge quantity of atom x, and r ij is the actual distance between two nonbonded atoms i and j. All of the values of q x come from the Amber force field.
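(d) polar solvation energy G EGB
$$G_{\mathrm{EGB}} = -\frac{1}{2}\left(1 - \frac{1}{\varepsilon_r}\right) \sum_{i,j} \frac{q_i q_j}{f_{\mathrm{GB}}(r_{ij}, \alpha_i, \alpha_j)}$$
where q x (x = i or j) represents the charge quantity of atom x, ε r is the relative dielectric constant (in this work, ε r = 80.0 for water), r ij is the actual distance between atoms i and j (thus, r ii = 0.00), α x (x = i or j) is the effective Born radius of atom x, and f GB is a smoothing function, of which a common form is
$$f_{\mathrm{GB}} = \left[ r_{ij}^2 + \alpha_i \alpha_j \exp\left(-\frac{r_{ij}^2}{4\alpha_i \alpha_j}\right) \right]^{1/2}$$
(e) nonpolar solvation energy G GBSUR
$$G_{\mathrm{GBSUR}} = \gamma \sum_i \sigma_i + b$$
where γ is surface tension (in this work, γ = 0.0072 kcal/mol/Å 2 for the interface between water and biomolecules), σ i is the solvent-accessible surface area of atom i, and b is an offset parameter (in this work, b is set to 0.00).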
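As a rough illustration of how the nonbonded and solvation terms described above combine, the following Python sketch evaluates each term for small inputs. It is not the Amber implementation. The function names are hypothetical, and the conversion constant `COULOMB_K` (about 332 kcal·Å/(mol·e²) for charges in units of e) is an assumption, since the text does not state the unit convention for the charges. The generalized Born form used is the common smoothing function with a double sum that includes the i = j self terms.

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2); assumed unit conversion


def lj_12_6(eps_i, eps_j, r0_i, r0_j, r):
    """12-6 Lennard-Jones term using the combining rules given in the text."""
    eps_ij = math.sqrt(eps_i * eps_j)   # geometric mean of the well depths
    r0_ij = r0_i + r0_j                 # sum of the characteristic distances
    x6 = (r0_ij / r) ** 6
    return eps_ij * (x6 * x6 - 2.0 * x6)


def coulomb(q_i, q_j, r, k=COULOMB_K):
    """Coulomb energy between two point charges separated by r."""
    return k * q_i * q_j / r


def f_gb(r, a_i, a_j):
    """Smoothing function; equals the Born radius when r = 0 and i = j."""
    return math.sqrt(r * r + a_i * a_j * math.exp(-r * r / (4.0 * a_i * a_j)))


def gb_polar(charges, radii, coords, eps_r=80.0, k=COULOMB_K):
    """Generalized Born polar solvation energy; the sum includes i = j."""
    total = 0.0
    for qi, ai, xi in zip(charges, radii, coords):
        for qj, aj, xj in zip(charges, radii, coords):
            total += qi * qj / f_gb(math.dist(xi, xj), ai, aj)
    return -0.5 * k * (1.0 - 1.0 / eps_r) * total


def nonpolar(sasa, gamma=0.0072, b=0.0):
    """Surface-area term: gamma times the summed per-atom SASA, plus offset."""
    return gamma * sum(sasa) + b


def binding_free_energy(g_complex, g_ligand, g_receptor):
    """Complex free energy minus the free energies of the separated partners."""
    return g_complex - g_ligand - g_receptor
```

At the characteristic distance `r = r0_i + r0_j`, the Lennard-Jones term evaluates to the negative of the pair well depth, which is a quick sanity check on the chosen functional form.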
Fluorescein-Labeled Peptides. Fifteen-amino acid peptides were selected from the sequences predicted in silico and were chemically synthesized, each with fluorescein (Flc) at the N-terminus. This set of peptides included the original P1 and P6 sequences to be used as an internal control. 9 The lyophilized peptide set was reconstituted via standard suggestions from the manufacturer (80% DMSO/20% H 2 O, v/v). Concentrated peptide stocks were stored in 25 μL aliquots at −80°C for later use. Working concentrations were diluted for each experiment and kept on ice or stored at −20°C.
Figure 3. Self-consistent mean field (SCMF) procedure. A trial exchange between two amino acids is implemented. The conformational probability matrix P 0 is set initially so that all possible rotamers at any one site have equal probabilities. The effective potential experienced by each rotamer at each site is calculated, and the Boltzmann law is used to determine new conformational probabilities of the rotamers for each amino acid and hence a new conformational probability matrix P 1 . If the absolute error between P 0 and P 1 is less than 10 −3 , the rotamer combination with the highest conformational probability is selected from P 1 to repack the side chains. Otherwise, the conformational matrix P is updated by employing a self-consistent iteration until the absolute error falls below a certain tolerance.
Fluorescent Assays. Fluorescent assays were conducted in phosphate buffer (10 mM Na 2 HPO 4 and 10 mM KH 2 PO 4 , pH 6.8) in low volume 384 well plates. All buffers, peptides, and RNA were pipetted into wells via a liquid handling robot (Janus, PerkinElmer). All plates were read using a plate reader fitted with fluorescein-specific filters (PerkinElmer EnVision), which were optimized for each plate and peptide before each experiment. Initial validation screens were conducted by obtaining fluorescent signals for each peptide (0.50 μM) alone (FS 0 ) and in the presence of a 2-fold excess of modified or unmodified hASL Lys3 UUU (FS 1 ). Percent change in the fluorescent signal was calculated (% change = 100·(FS 1 /FS 0 )). A decrease (quench) in fluorescent signal in the presence of RNA indicated a binding event between the peptide and ASL. In control wells, H 2 O, phosphate buffer, and ASLs were individually tested for any inherent fluorescent signal. Water and buffer signals served as blank background signals. The hASL Lys3 UUU had a negligible inherent signal, which was taken into account when calculating the overall signal quench. All controls and experimental sets were repeated in triplicate within a single plate, and signals are an average of each triplicate. For the subset of peptides that were further studied to obtain binding constants, the fluorescent binding assay was completed as described above. The peptide fluorescent signal was monitored throughout an increasing ASL concentration (0−3 μM). The percent quench in signal was plotted against the hASL concentration. Binding (dissociation) constants (K d ) were calculated using the single-site, nonlinear regression function within SigmaPlot. Because the peptide concentration was 0.50 μM, K d values between 0.05 and 0.13 μM are considered approximate (≈). Experiments were performed in triplicate within a single plate, and fluorescent signals are an average of each triplicate. The binding of peptides P6, P27, P31, and P35* to the unmodified hASL Lys3 UUU resulted in little change in fluorescence and could not be satisfactorily fitted with a single-site, nonlinear regression function. Thus, these K d values were considered indeterminable.
Results could not be curve fitted, and a K d could not be calculated from the data ( Table 4).
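The single-site analysis described above can be illustrated with a small, dependency-free Python sketch. It assumes the simple hyperbolic model, quench = Q max ·[ASL]/(K d + [ASL]), which may differ in detail from the regression function used in SigmaPlot, and it ignores ligand depletion (one reason K d values well below the 0.50 μM peptide concentration are only approximate). The function names, the K d search range, and the grid size are hypothetical choices.

```python
import math


def percent_change(fs1, fs0):
    """Signal with RNA as a percentage of the peptide-alone signal."""
    return 100.0 * fs1 / fs0


def _profile(kd, conc, quench):
    """For a fixed K_d, return (squared error, best-fit Qmax) by linear least squares."""
    f = [c / (kd + c) for c in conc]
    qmax = sum(q * fi for q, fi in zip(quench, f)) / sum(fi * fi for fi in f)
    sse = sum((q - qmax * fi) ** 2 for q, fi in zip(quench, f))
    return sse, qmax


def fit_single_site(conc, quench, lo=1e-3, hi=1e2, n_grid=501):
    """Fit K_d and Qmax: coarse log-spaced scan, then golden-section refinement."""
    step = (math.log10(hi) - math.log10(lo)) / (n_grid - 1)
    logs = [math.log10(lo) + k * step for k in range(n_grid)]
    k_best = min(range(n_grid), key=lambda k: _profile(10 ** logs[k], conc, quench)[0])
    a = logs[max(k_best - 1, 0)]
    b = logs[min(k_best + 1, n_grid - 1)]
    g = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        c, d = b - g * (b - a), a + g * (b - a)
        if _profile(10 ** c, conc, quench)[0] < _profile(10 ** d, conc, quench)[0]:
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    kd = 10 ** ((a + b) / 2)
    return kd, _profile(kd, conc, quench)[1]
```

A quick check with synthetic, noise-free data (values invented for illustration, not taken from Table 4): generating quench values from K d = 0.5 μM and Q max = 60% over 0 to 3 μM and passing them to `fit_single_site` should return parameters close to those inputs. A titration that shows almost no change in signal gives a nearly flat error profile, consistent with the indeterminable K d values reported for the unmodified ASL.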

■ RESULTS
Development of an Algorithm That Optimizes a Peptide Sequence for Binding RNA. Previous studies demonstrated the feasibility of selecting peptides with modification-dependent recognition of tRNAs' anticodon stem and loop domains, ASLs. 18,19 The peptides were selected from completely and partially randomized phage display libraries. 9,10 However, optimizing 15- and 16-amino acid peptide sequences using this approach is not feasible since there are over 3.3 × 10 23 possible sequences. Because of the exorbitantly high costs of creating and screening millions of peptides, even with the benefit of phage display, we turned to computer algorithms 20 and assisted model building with energy refinement (AMBER) simulations to pare down the number of possibilities before performing in vitro assays. We developed a novel optimization strategy 21 that combines MC with SCMF to evolve amino acid sequences. The peptide P6 sequence RVTHHAFLGAHRTVG 9 was the starting point from which an optimized peptide was sought to bind the modified hASL Lys3 UUU with the highest specificity and affinity. The canonical 20 amino acids were categorized into six distinct groups according to hydrophobicity, polarity, size, and charge (Table 1A). These hydration properties were necessary to ensure the peptide did not become too hydrophobic (and thus insoluble) or so hydrophilic that binding to hASL Lys3 was inhibited. The overall charge of the peptide was chosen to be slightly positive to ensure interaction with the negatively charged ASL. By adjusting the number of amino acids in each category, via N pho , N neg , N pos , N pol , N oth , N gly , we maintained hydration properties similar to those of the original P6 sequence while evolving the sequences. 9 The peptide sequence was optimized using the following computational procedure. The stable structure for the complex between the original P6 sequence and ASL was determined using AMBER.
The structure of the fully modified hASL Lys3 UUU was taken from the high resolution solution structure, 6 providing a restrained structure to which the peptide would bind in silico. Once the stable structure of the peptide P6 with the ASL Lys3 was determined, the peptide's amino acid sequence was evolved and optimized while keeping the backbone fixed. Each peptide sequence evolved in two types of "moves": (1) a single randomly chosen amino acid in the peptide sequence was mutated to a different amino acid from the same residue category (Table 1); or (2) two randomly chosen amino acids in the peptide sequence were exchanged regardless of their residue category (Figure 2) using SCMF. 22 SCMF finds the optimal rotamer combination with the lowest binding energy for the two exchanged amino acid residues (Figure 3) based on the preferred, distinct side chain conformations in Lovell's rotamer library. 23 The sequences were subjected to continued rounds of optimization (Figure 2). By comparing the changes in binding energy before and after each of the two types of moves, the peptide sequence was evolved to those with the lowest binding energies and thus increased binding affinity to the modified hASL Lys3 UUU .
Evolved Peptides Have Increased Specificity for ASL Modifications. The initial P6 sequence was subjected to several hundred thousand rounds of evolution, yielding 15-amino acid peptide sequences that, based on binding energies, should recognize and bind modified hASL Lys3 UUU with a similar or higher affinity than that of P6. Initial results from the in silico selection suggested two optimized peptide sequences, P26, R- These sequences exhibited binding energies to the hASL Lys3 UUU lower than that of P6 (P26, −22.55 kcal/mol, and P27, −22.07 kcal/mol, respectively, vs P6 −21.26 kcal/mol). On the basis of these initial results, we developed three distinct peptide sequence cases. The three cases varied within the six residue categories (Table 1B).
Although different, each of the three cases is still within the overall desired levels of moderate hydration and charge properties (slightly positively charged). P6, the initially evolved sequences P26 and P27, and three of the top 10 sequences from each of cases 1 and 2 have lower binding energies than those of case 3 (Table 2). This is likely due to the increased allowance in cases 1 and 2 for positively charged and hydrophobic residues (Table 2). The binding energies calculated for P26 and P27 and cases 1 and 2, but not 3, are on par with or lower than the binding energy for P6. This suggests a potential increase in their binding affinity for modified hASL Lys3 UUU versus that of P6.
In Vitro Fluorescence Screens Validate in Silico Methods: Qualitative Analysis. Sequences predicted during the in silico optimizations to have the lowest binding energies and thus potentially higher affinity for modified hASL Lys3 were selected for validation with a fluorescence assay. 9 Fifteen peptides (Table 3) were chemically synthesized with fluorescein at the N-terminus to allow for very sensitive, low volume detection of peptide−RNA binding interactions. P1 and P6 from phage display selections, 9 the initially evolved sequences, P26 and P27, and the best binders from each of the cases 1, 2, and 3 were synthesized. Variants of these sequences that had one or two amino acid changes were also synthesized (Table 3). During the initial validation assay, changes in the amount of fluorescence were monitored to determine whether the peptide was binding to the modified and/or unmodified hASL Lys3 UUU and to what relative degree (Figure 4). In this screening assay, P6 behaved as expected.
When bound by the modified hASL Lys3 UUU , P6 fluorescence was altered more than that when bound by the unmodified hASL Lys3 UUU . Other peptides (P27, P31, and P35*) behaved similarly. Still others demonstrated nonspecific binding in that the degree of fluorescence did not significantly differ between the binding of modified and unmodified hASL Lys3 UUU (Figure 4: P30 and P32). Three peptides exhibited a greater degree of change in fluorescence quenching when binding the unmodified RNA versus the modified RNA; their binding could have been negatively affected by modifications (P26, P28, and P37). The stacking interactions of fluorescein with nucleobases can be nonspecific and result in fluorescence quenching that is mistakenly attributed to a selective binding of a ligand to a nucleic acid. We have employed a direct comparison of peptide binding of the modified hASL Lys3 UUU to binding of the control that is identical in sequence except for the modified nucleosides in the anticodon loop. If nonspecific stacking interactions were to occur, they were most likely limited to the stacked and identical stem regions of the control and experimental RNAs. Therefore, the observable differences in quenching could be attributed to the peptide ligand's affinity for the modifications.
Three peptides of the 15 exhibited fluorescence enhancement when binding the modified RNA but exhibited quenching when binding the unmodified RNA (Figure 4: P34, P36, and P38), whereas a fourth exhibited the opposite (P1). This result is counter to the better understood binding of P6. 9 The result is also counterintuitive. Fluorescein stacks with the nucleobases of nucleic acids resulting in a reduced fluorescence. The quenching of fluorescein fluorescence is one of its advantageous properties in studying the interactions of fluorescein-coupled ligands with nucleic acids. Without a detailed investigation of the mode of binding by these four peptides, interpretation of these results is difficult at this time.
Three peptides, P27 (one of the two initially evolved), P31 (from case 1), and P35* (a variant of P35 from case 2), preferentially bound the modified hASL Lys3 UUU as compared to their binding of the unmodified hASL Lys3 UUU , as did the phage selected P6. P27, P31, and P35* showed a very high selectivity for the modified hASL Lys3 UUU . When these peptides were tested for their binding to unmodified hASL Lys3 UUU , little or no change in fluorescence was observed (Figure 4; Table 4). However, with the addition of the modified hASL Lys3 UUU , the fluorescent signals of P27, P31, P35*, and P6 were quenched. Peptides P32 and P33 had the best binding properties among the case 3 family of peptides. Even so, they had significantly weaker affinities and a lack of specificity for the modified hASL Lys3 UUU as compared to those of P27, P31, and P35* (Figure 4). Thus, peptides P27, P31, and P35*, evolved in silico from P6 as having high affinities for the modified hASL Lys3 UUU , also appear from the screening assay in vitro to have higher affinities and higher specificities for the modified RNA.

Table 2. Binding Energies for Selected, Optimized Peptide Sequences a
a Using the search algorithm, peptide sequence candidates are evolved on the basis of a backbone scaffold of structure. For each binding structure, an explicit binding energy is generated. This search and selection process does not quantify the error in the binding energy.
b Peptides are numbered in accordance with the previous report of phage display selected sequences. 9 P6 is from the original selection, and P26−P38 were chosen based on predictions in silico. P35* with the sequence RWNHCQFWSGWRANG has a single amino acid serine change from P35. Peptide sequences A and B from cases 2 and 3, respectively, were not selected for chemical synthesis and analysis.

Table 3. Synthesized Peptides Based on Predictions in Silico a
a Peptides named following the peptides from original phage display library screens. 9 P1 and P6 (bold font) are from the original screen. P6 has been characterized. 9 In addition, the binding of the modified and unmodified hASL Lys3 UUU by P1, P27, P31, and P35* (bold font and shaded row) are characterized in this article.

Biochemistry Article dx.doi.org/10.1021/bi401174h | Biochemistry 2014, 53, 1125−1133
Evolved Peptides with Lower Free Energies of Binding Also Have Higher Affinity for Modified hASL Lys3 UUU : Quantitative Analysis. Five of the 15 peptides were chosen for detailed analysis of their binding because the data collected and presented in the initial screen ( Figure 4) was indicative of these peptides distinguishing modified from unmodified RNA. Also, these peptides were chosen for detailed analyses because in their binding of the modified hASL Lys3 UUU , they exhibited sizable changes in fluorescence that could be accurately quantified (Figure 4 and Table 4). The peptide fluorescence signal was monitored with an increasing hASL concentration, and the percent quench in signal was plotted against the hASL concentration. Binding constants (K d ) were calculated using the single binding site, nonlinear regression function. Peptides P6, P27, P31, and P35* having exhibited qualitatively the highest affinity and specificity for the modified substrate ( Figure 4) were subjected to a quantitative analysis of their binding to both the modified and the unmodified hASL Lys3 UUU . P1 was also subjected to this analysis though the peptide exhibited a decreased fluorescence when binding the modified hASL Lys3 UUU and an increased fluorescence in binding the unmodified hASL Lys3 UUU . Equilibrium binding constants (as the dissociation constant K d ; Table 4) were determined from the concentration-dependent fluorescence quenching with the binding of the modified and unmodified hASL Lys3 UUU . This data was compared to that for P6 from which these peptides were evolved in silico (Table 4). Little to no fluorescence quenching was observed when P27, P31, P35*, and P6 were titrated with the unmodified hASL Lys3 UUU ; thus, binding constants could not be extracted from the data ( Table   4). In contrast, the fluorescence of P27 was quenched considerably with increasing amounts of the modified hASL Lys3 UUU . 
The peptide bound the RNA with high affinity (K d ≈ 0.05 ± 0.10 μM) (Table 4). P27 exhibited the highest affinity for the modified hASL Lys3 UUU . P31 bound the modified hASL Lys3 UUU with 10-fold lower affinity (K d = 0.50 ± 0.10 μM), and its affinity for the hASL was equivalent to that of P6 (Table 4). P35* bound the modified hASL Lys3 UUU with the lowest affinity of these peptides (K d = 1.87 ± 1.00 μM). The fluorescence of P1 was altered by both the modified and unmodified hASL Lys3 UUU . In binding the modified hASL Lys3 UUU , P1 exhibited a quench in fluorescence from which we extracted a modest binding constant (K d ≈ 0.13 ± 0.02 μM). However, the fluorescence change in binding the unmodified hASL Lys3 UUU resulted in an equivalent affinity (K d ≈ 0.15 ± 0.04 μM). Thus, in contrast to P27, P31, P35*, and P6, P1 lacked specificity for the modified hASL Lys3 UUU .
Computational Analysis of Peptide P27 Binding of the Modified hASL Lys3 UUU . Peptide P27 has the highest affinity coupled with the highest specificity for binding of the modified hASL Lys3 UUU . The calculated structure bound to the modified hASL Lys3 UUU at equilibrium reveals how this affinity and specificity could be achieved and thus suggests future experimentation (Figure 5A). Amino acids throughout P27 are engaged with the extensive chemistries of the two modifications unique to the tRNA Lys3 UUU . The two arginines, R 1 and R 12 , bracket the threonylcarbamoyl-group of ms 2 t 6 A 37 (Figure 5B). At the middle of the peptide, F 7 is closely associated with the hydrophobic methyl of the threonyl-side chain. The indole ring of W 11 lies above the methyl-ester of the 5-methoxycarbonylmethyl-moiety of mcm 5 s 2 U 34 . As evidenced by calculations for each of the 15 amino acids, R 1 , R 12 , F 7 , and W 11 contribute to ΔG Binding (Figure 5C). The binding energy contributed by each of the nucleosides of the modified hASL Lys3 UUU has also been calculated.
The binding energy is concentrated in the anticodon loop, as opposed to the stem. However, the two modifications, particularly ms 2 t 6 A 37 , provide the most significant binding energies.

■ DISCUSSION
An Amino Acid Signature Sequence Important for the Recognition of Uniquely Modified RNA. Protein recognition of RNA has been well-defined for two families of RNA binding proteins. The Pumilio protein was first identified as a Drosophila protein critical for segmentation of the posterior, 24 and the FBF protein was found responsible in Caenorhabditis for mediating the sperm/oocyte switch. 25 Other RNA binding domains or RNA recognition motifs have been studied extensively, revealing conserved sequence/structure relationships, yet these structures differ by having subtle distinctions, some of which are dynamic. 26 However, insight into protein recognition of RNA's modified nucleoside chemistries has not been pursued and thus has yet to be characterized.
The primary goal of this study was to demonstrate that a signature amino acid sequence can be identified as binding a uniquely modified RNA with high affinity and specificity. We reached this signature sequence using a combination of computational simulations to obtain optimized amino acid sequences that were then confirmed by binding studies in vitro. By comparing peptide sequences which specifically bound the modified hASL Lys3 UUU to those which did not, we were able to derive an amino acid signature that should be useful for protein/peptide recognition of RNA with modifications. Focusing primarily on those peptides which showed the highest affinity and specificity for the modified hASL Lys3 UUU , the amino acid signature emerged: R-W-Q/N-H-X 2 -F-Pho-X-G/A-W-R-X 2 -G (where X can be most amino acids, and Pho is hydrophobic) ( Table 5).
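For readers who want to screen candidate sequences against the signature, the pattern can be expressed as a regular expression. The sketch below is illustrative only: the hydrophobic set `PHO` is an assumption (the text does not enumerate which residues count as Pho), and "X can be most amino acids" is simplified here to any of the 20 canonical residues.

```python
import re

AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical amino acids
PHO = "AVLIMFWY"             # assumed hydrophobic set for "Pho"
X = f"[{AA}]"                # simplification: X matches any canonical residue

# R-W-Q/N-H-X2-F-Pho-X-G/A-W-R-X2-G, written with spaces for readability.
SIGNATURE = re.compile(
    f"R W [QN] H {X}{X} F [{PHO}] {X} [GA] W R {X}{X} G".replace(" ", "")
)


def matches_signature(seq):
    """Return True if the whole 15-residue sequence fits the signature."""
    return SIGNATURE.fullmatch(seq.upper()) is not None
```

With these assumptions, the P35* sequence quoted in the Table 2 footnote (RWNHCQFWSGWRANG) fits the pattern, whereas the starting P6 sequence (RVTHHAFLGAHRTVG) does not, since it lacks the tryptophan at position 2.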
The evolution of peptide sequences in silico is rapid relative to screening at the bench. We have developed an algorithm with the potential to simulate the binding of a very high number of the theoretical sequence combinations of a 15-amino acid peptide (>3.3 × 10^23) to each substrate. In our algorithm, all 20 amino acids are considered. However, as a concession to speed, we group them by side chain properties when describing their hydration, which allows the peptide evolution to proceed more quickly. The algorithm proved to be a powerful tool for predicting peptides that would bind specifically to the modifications of hASL Lys3 UUU. We believe that the accuracy of in silico predictions can be improved by running simulations in tandem that look more closely at nonspecific binding of the peptide to other small RNAs and/or unmodified tRNAs or ASLs. A cross-check performed by a parallel screen assessing the binding energies of peptides to different ASLs could potentially eliminate nearly all false positives before moving to in vitro and/or in vivo experiments. The validation screens in vitro revealed that, although the computer algorithms were not 100% correct in predicting peptide sequences with both high affinity and specificity, the selection in silico was a valuable tool for predicting binding trends and for quickly screening many peptide sequence combinations.
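The general idea of evolving a peptide sequence by stochastic search can be illustrated with a Metropolis Monte Carlo loop over single-residue substitutions. This is a minimal sketch under stated assumptions and is not the algorithm used in this study: the scoring function `toy_energy` is a placeholder that stands in for a computed binding energy, and its target sequence, the function `evolve`, and the `steps`, `temperature`, and `seed` parameters are all hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_energy(peptide: str) -> float:
    """Placeholder score (lower is better); NOT a physical binding energy.

    It only counts agreement with a hypothetical target so the sketch can run.
    A real application would substitute a computed binding energy here.
    """
    target = "RWQHAAFLAGWRAAG"  # hypothetical sequence, for illustration only
    return -float(sum(a == b for a, b in zip(peptide, target)))


def evolve(start, energy, steps=5000, temperature=0.5, seed=0):
    """Metropolis Monte Carlo search over single-residue substitutions."""
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for _ in range(steps):
        # Propose a mutation at one randomly chosen position.
        pos = rng.randrange(len(current))
        candidate = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        e_candidate = energy(candidate)
        delta = e_candidate - e_current
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


start_peptide = "A" * 15
best_peptide, best_score = evolve(start_peptide, toy_energy)
print(best_peptide, best_score)
```

Occasionally accepting unfavorable substitutions lets the search escape local minima, which is the usual reason for preferring this scheme over a purely greedy one. The parallel cross-check against unmodified ASLs suggested in the text could be sketched by scoring each candidate against a second energy function and penalizing sequences that score well on both.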
Amino Acid Signature Sequence R-W-Q/N-H-X 2 -F-Pho-X-G/A-W-R-X 2 -G. The signature amino acid sequence offers an opportunity to investigate the mode or mechanism of binding in future research. The sequence offers clues and surprises as to why the optimized peptides from cases 1 and 2 bind the modified hASL Lys3 with high affinity. Interestingly, the amino-terminal sequence is more hydrophilic (R, Q, and H) than the center (F and Pho) or the carboxyl terminus (G). Conventional thought would have the two positively charged arginine residues (positions 1 and 12) preferentially engaged with the negatively charged phosphate linkages via charge−charge interactions and/or with the hydrophilic sugars. Here, the two arginine residues also interact with mcm 5 s 2 U 34 and ms 2 t 6 A 37 through van der Waals (VDW) energy (Figure 5B). The increased number of hydrophobic residues, specifically tryptophan (position 11) and phenylalanine (position 7), contributes to the overall binding specificity through VDW interactions.
The fluorescence changes observed during the binding of the peptides to the hASL Lys3 UUU are smaller than expected for fluorescein stacking or intercalating with nucleobases of a duplex stem. Our results are not consistent with fluorescein-labeled peptides recognizing the identical stem regions of the modified and unmodified hASL Lys3 UUU. However, the results are consistent with the modest effect on fluorescence that could be expected of a differential interaction of peptides P6, P27, P31, and P35* with the single-stranded nucleosides of the modified hASL Lys3 UUU loop. For instance, one could expect that the phenyl ring of phenylalanine would intercalate within the 3′-base stack of the anticodon domain. The N 6 -threonylcarbamoyl group of ms 2 t 6 A 37 is known to enhance base stacking. 6 Phenylalanine has been observed to intercalate between anticodon nucleosides of tRNA Lys in the cocrystal structure of lysyl-tRNA synthetase and tRNA Lys. 27 Instead of the expected intercalation, in the computed structure F 7 interacts with the threonyl side chain, contributing to the affinity and specificity of the peptide (Figure 5B). Though the signature sequence and the selected peptide sequences P27 and P31 have the highest affinity and specificity for the modified hASL Lys3 UUU and have two arginines each, there is little sequence homology with RNA binding proteins that are rich in arginine 28−32 or with single-stranded RNA binding proteins. 33−37
Potential for Modification-Dependent, RNA Binding Peptides. The optimization of RNA binding peptides to recognize the unique chemistries of modified nucleosides and the contributions they make to local structure affords the opportunity of inhibiting RNA binding proteins studied in vitro and possibly in vivo. The benefits of modification-dependent signature peptides are manifold. 
First, an amino acid signature peptide that uniquely recognizes a specific RNA modification or combination of modifications becomes a tool in the study of RNA binding proteins that interact with RNA in a modification-dependent manner. Modifications are most often found in the terminal and internal loops of RNA structures. 38 There the modifications negate intraloop hydrogen bonding and can enhance or even decrease the possibility of base stacking. 39 Peptides that recognize the ubiquitous anticodon domain modification N 6 -threonylcarbamoyladenosine can be used as a tool to study other modified tRNA−protein interactions, for instance, those between tRNAs and their modification enzymes and/or aminoacyl-tRNA synthetases.
Second, the amino acid signature sequence has the potential to shed light on the residues necessary in protein recognition of modified RNA and hence aid in future studies of modified RNA−protein interactions. Indeed, pBLAST searches for this signature find hits in other RNA-binding proteins such as RNA polymerase sigma factor RpoE and threonyl- and prolyl-tRNA synthetases. The synthetases recognize tRNA substrates having t 6 A 37. Whether these amino acids confer specific recognition of substrate modifications has yet to be determined. There are many instances in which a particular tRNA modification is known to exist, its presence determined by mass spectrometry, 2-dimensional thin layer chromatography, or another appropriate method (RNA MDB), with very few clues about which enzyme(s) are responsible for the modification.
Third, peptides directed against specific modifications could also help in identifying incompletely modified RNAs. Complex modifications typically occur in an organized stepwise fashion. 40 Even the earliest analyses demonstrated that many tRNA modification enzymes recognize a partially modified RNA. 41,42
Fourth, knowledge of signature peptide sequences for recognition of specific modified nucleosides could provide clues as to how best to narrow bioinformatic searches for modification enzyme functions.
Fifth, optimized peptides can be used as competitive inhibitors of protein−RNA interactions that are related to human disease, symptoms, or causative factors. The enhanced binding activity of the peptides reported here could inhibit the recruitment of htRNA Lys3 UUU and its annealing to the HIV genome, thus inhibiting HIV replication.
Sixth, the amino acid signature sequence or the methodology reported here could also be used to derive a signature in the engineering of new proteins and the design of peptide-based therapeutics. 43