De Novo Single-Stranded RNA-Binding Peptides Discovered by Codon-Restricted mRNA Display

RNA-binding proteins participate in diverse cellular processes, including DNA repair, post-transcriptional modification, and cancer progression through their interactions with RNAs, making them attractive for biotechnological applications. While nature provides an array of naturally occurring RNA-binding proteins, developing de novo RNA-binding peptides remains challenging. In particular, tailoring peptides to target single-stranded RNA with low complexity is difficult due to the inherent structural flexibility of RNA molecules. Here, we developed a codon-restricted mRNA display and identified multiple de novo peptides from a peptide library that bind to poly(C) and poly(A) RNA with KDs ranging from micromolar to submicromolar concentrations. One of the newly identified peptides is capable of binding to the cytosine-rich sequences of the oncogenic Cdk6 3′UTR RNA and MYU lncRNA, with affinity comparable to that of the endogenous binding protein. Hence, we present a novel platform for discovering de novo single-stranded RNA-binding peptides that offer promising avenues for regulating RNA functions.

incubated again at 37℃ for 1 h. An equal amount of 2x Laemmli sample buffer (Bio-Rad; Hercules, CA, USA) was then added to the reaction, and the mixture was centrifuged at 10,000 × g for 3 min to remove the precipitates. The supernatant was collected and loaded on an SDS-Urea-PAGE gel (3.5% w/v SDS-PAGE stacking gel, 10% w/v SDS-6 M Urea-PAGE resolving gel). Electrophoresis was conducted at 30 mA for the stacking gel and at 50 mA for the resolving gel. Note that a 4-12% w/v gradient SDS-PAGE gel (TEFCO; Hachioji, Tokyo, Japan) was used instead of the SDS-Urea-PAGE gel from round 3 to round 6 of the VMM/VRR codon libraries, since it separated the mRNA-tag and mRNA-peptide conjugates with higher resolution. For the 4-12% w/v gradient SDS-PAGE gel, electrophoresis was conducted at 60 mA for 90 min. Finally, the mRNA-peptide conjugates were isolated by excision as previously described 2 and purified by gel-purification.

Gel-purification.
A gel slice containing the desired band was purified through electroelution followed by ethanol precipitation. For electroelution, a Model 422 Electro-Eluter (Bio-Rad; Hercules, CA, USA) was used along with a glass tube, a frit, and a membrane cap (MWCO 12 kDa) from Bio-Rad. After the apparatus was assembled, the kneaded gel slice was loaded into the glass tube filled with 1x TBE buffer. The sample was eluted at 10 mA per glass tube for 45 min. At the end of elution, the polarity of the current was reversed for 30 s to dissociate the purified products from the dialysis membrane. The eluate was then carefully collected into a microcentrifuge tube. For ethanol precipitation, sodium acetate (3 M) equivalent to 10% v/v of the obtained volume was added to the eluate and mixed well. A volume of pre-chilled ethanol (99.5%, v/v) (FUJIFILM Wako Pure Chemical; Osaka, Japan) equal to three times the total sample volume was then added to the mixture. The sample mixture was centrifuged at 4℃, 15,000 × g for 1 h. After removing the supernatant, 1 mL of chilled 70% v/v ethanol was added to the microcentrifuge tube. The dislodged pellet in ethanol (70%, v/v) was centrifuged again at 4℃, 15,000 × g for 15 min. Subsequently, the supernatant was removed, and the pellet was air-dried at room temperature to evaporate the residual ethanol. Finally, the pellet was resuspended in nuclease-free water, and the mRNA concentration was measured with a NanoDrop™ 2000c spectrophotometer (Thermo Fisher Scientific; Waltham, MA, USA).
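The reagent volumes for the ethanol precipitation step follow directly from the eluate volume. The short Python sketch below works through that arithmetic. It assumes that "total sample" means the eluate plus the added sodium acetate, which the text does not state explicitly; the function name and the 400 µL example volume are illustrative only.

```python
def ethanol_precipitation_volumes(eluate_ul: float) -> dict:
    """Return reagent volumes (in µL) for one ethanol precipitation.

    Assumption: the ethanol volume is three times the combined volume
    of eluate and sodium acetate.
    """
    sodium_acetate_ul = eluate_ul / 10          # 3 M sodium acetate, 10% v/v of eluate
    total_ul = eluate_ul + sodium_acetate_ul    # sample volume before ethanol
    ethanol_ul = 3 * total_ul                   # pre-chilled 99.5% ethanol
    return {
        "sodium_acetate_ul": sodium_acetate_ul,
        "ethanol_ul": ethanol_ul,
        "final_volume_ul": total_ul + ethanol_ul,
    }


if __name__ == "__main__":
    # Hypothetical eluate volume, chosen only to illustrate the calculation.
    for name, value in ethanol_precipitation_volumes(400.0).items():
        print(f"{name}: {value:.0f}")
```

Because the final volume is 4.4 times the eluate volume under this assumption, the calculation also shows whether the mixture still fits in a single microcentrifuge tube.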

Negative selection of mRNA-tag by the affinity with poly RNA.
Negative selection was performed before in vitro translation to eliminate any mRNA products that had accidentally acquired complementary bases through PCR or transcriptional error. First, 20 µL of 25-µm strep-tactin-immobilized magnetic beads (MagStrep "type3" XT Beads; IBA Lifesciences; Göttingen, Germany) was captured on a magnetic stand (DynaMag-2 Magnet; Thermo Fisher Scientific; Waltham, MA, USA) and washed twice with RNA-binding buffer (Tris-HCl (10 mM; pH 7.4), EDTA (0.5 mM), NaCl (500 mM), Tween 20 (0.05%, v/v)). The poly RNA was then immobilized on the magnetic beads: biotinylated poly RNA (2.5 µM) in 200 µL of the RNA-binding buffer was added to the magnetic beads and mixed at 4℃ for 30 min with a rotator (80 rpm). After discarding the supernatant, the magnetic beads were washed three times with the RNA-binding buffer. Subsequently, 25 µL of the mRNA-tag (100 ng/µL) in the RNA-binding buffer (Tris-HCl (10 mM; pH 7.4), EDTA (0.5 mM), NaCl (500 mM), Tween 20 (0.05%, v/v)) was added to the magnetic beads and mixed for 30 min at room temperature with a rotator (80 rpm). Finally, the supernatant was collected for the following in vitro translation.

Peptide selection based on the affinity with poly RNA.
mRNA-peptide conjugates were selected for affinity to poly RNA that arises from the peptide rather than from the mRNA-tag portion. First, two sets of strep-tactin-immobilized magnetic beads (20 µL) were washed twice with the RNA-binding buffer on a magnetic stand. Next, the target poly RNA was immobilized on one set of magnetic beads as described in the previous section. Subsequently, 100 ng of mRNA-peptide conjugates (1.7 pmol, 1.0 × 10¹² molecules) was prepared in the RNA-binding buffer (200 µL; Tris-HCl (10 mM; pH 7.4), EDTA (0.5 mM), NaCl (500 mM), Tween 20 (0.05%, v/v)), and the solution was first added to the intact magnetic beads and mixed for 10 min at room temperature with a rotator (80 rpm) to remove the conjugates with affinity to the surface of the beads. EDTA was included in the RNA-binding buffer to avoid the effect of metal cations on the peptide-RNA interaction. The supernatant, now largely depleted of nonspecifically binding mRNA-peptide conjugates, was then recovered and used for the affinity selection with the target RNA. The entire supernatant was added to the target RNA-immobilized magnetic beads and mixed for 30 min at room temperature with a rotator (80 rpm). After the removal of the supernatant, the magnetic beads were washed three times with 1x Buffer W (IBA Lifesciences; Göttingen, Germany). Then, 25 µL of elution buffer (Tris-HCl (100 mM; pH 8.0), NaCl (150 mM), EDTA (1 mM), Biotin (50 mM)) was added and vigorously mixed. Finally, the mixture was incubated for 10 min at room temperature. The eluate was collected and used for reverse transcription polymerase chain reaction (RT-PCR).
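The input quantities quoted for the selection (100 ng, 1.7 pmol, about 1.0 × 10¹² molecules) can be cross-checked with Avogadro's constant. The sketch below is only a consistency check. The molar mass it derives is implied by the stated mass and amount and is not a value reported in the text; the function names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def pmol_to_molecules(amount_pmol: float) -> float:
    """Convert an amount in picomoles to a number of molecules."""
    return amount_pmol * 1e-12 * AVOGADRO


def implied_molar_mass(mass_ng: float, amount_pmol: float) -> float:
    """Molar mass (g/mol) implied by a mass in ng and an amount in pmol."""
    return (mass_ng * 1e-9) / (amount_pmol * 1e-12)


if __name__ == "__main__":
    molecules = pmol_to_molecules(1.7)
    print(f"1.7 pmol corresponds to {molecules:.2e} molecules")
    print(f"implied molar mass: {implied_molar_mass(100, 1.7):.0f} g/mol")
```

The result agrees with the stated 1.0 × 10¹² molecules to two significant figures.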

Reverse transcription PCR (RT-PCR).
The eluted mRNA-peptide conjugates (10 µL) were used as a substrate for the RT reaction using ReverTra Ace-α- (TOYOBO; Osaka, Japan) with RT-PCR-F and RT-PCR-R primers (each 0.2 µM) following the manufacturer's protocol. The RT reaction (20 µL) was started by adding the reverse transcriptase (reverse transcription reaction conditions: 50℃, 20 min; 99℃, 5 min; and 4℃, 5 min). The reaction was then mixed with 20 µL of Q5 High-Fidelity 2x Master Mix (New England BioLabs; Ipswich, MA, USA) and aliquoted into a total of six tubes to recover samples at different PCR cycles (0, 10, 15, 20, 25, and 30 cycles) to check the amplification efficiency. PCR conditions were set as follows: initial denaturation at 98℃ for 30 s, followed by 30 cycles of 98℃ for 10 s, 65℃ for 10 s, and 72℃ for 10 s, then final extension at 72℃ for 30 s. To avoid over-amplification of DNA, these reaction samples were run on a 3% w/v TAE agarose gel stained with SYBR Gold, and the optimal amplification cycle was determined based on the band intensity and the absence of non-specific amplified products. Finally, the RT-PCR reaction (40 µL) was conducted once again with the optimal number of PCR cycles, and the product was purified using the NucleoSpin™ Gel and PCR Clean-up Kit (Macherey-Nagel) for the next round of selection and sequencing.

Sequence analysis.
Sequence files were obtained from the Illumina MiSeq platform as FASTQ files. An in-house pattern search program written in Perl was used to extract the coding region, which is the region between the sequence upstream of the start codon and the fixed sequence (8 nucleotides) of the random region. Then, using the count option of FASTAptamer 3, the number of identical sequences was counted and ranked for each round. Since the number of reads acquired differed between rounds, we followed the progression of enrichment based on the normalized reads per million (RPM) value. For round 7, we performed clustering analysis on peptide sequences with RPM values of 10 or higher to create clusters based on their sequence similarity. Here, we used the cluster option of FASTAptamer and defined a cluster as a group of peptide sequences with a distance of 7 or less between peptide sequences.
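The counting, RPM normalization, and distance-based clustering described above can be illustrated with a small Python sketch. This is not the FASTAptamer implementation. It assumes that the distance is the Levenshtein edit distance and that clusters are built greedily around the most abundant unassigned sequence, and all function names are hypothetical.

```python
from collections import Counter


def reads_per_million(sequences):
    """Count identical sequences and normalize counts to reads per million."""
    counts = Counter(sequences)
    total = sum(counts.values())
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def greedy_cluster(rpm, max_distance=7, min_rpm=10):
    """Group sequences around the most abundant unassigned seed sequence."""
    unassigned = sorted((s for s, v in rpm.items() if v >= min_rpm),
                        key=lambda s: (-rpm[s], s))
    clusters = []
    while unassigned:
        seed = unassigned[0]
        members = [s for s in unassigned
                   if edit_distance(seed, s) <= max_distance]
        clusters.append(members)
        assigned = set(members)
        unassigned = [s for s in unassigned if s not in assigned]
    return clusters


if __name__ == "__main__":
    # Toy reads, invented purely for illustration.
    reads = ["AAAA"] * 6 + ["AAAT"] * 3 + ["CCCC"]
    rpm = reads_per_million(reads)
    print(rpm)
    print(greedy_cluster(rpm, max_distance=1))
```

Normalizing to RPM before applying the threshold keeps the cutoff of 10 comparable across rounds with different sequencing depths.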
The sequence logo of the selected RNA-binding peptides was created using the WebLogo 3 server 4, and log2-fold changes of amino acid frequencies after selection were calculated from the average amino acid frequencies in rounds 7 and 0.
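A minimal sketch of the log2-fold change calculation is given below. It assumes that "average amino acid frequencies" means per-peptide frequencies averaged over all peptides in a round, and it simply skips residues absent from either round rather than adding a pseudocount. Both choices are assumptions, and the function names are illustrative.

```python
import math
from collections import Counter


def average_aa_frequencies(peptides):
    """Average the per-peptide amino acid frequencies over all peptides."""
    totals = Counter()
    for peptide in peptides:
        for aa, n in Counter(peptide).items():
            totals[aa] += n / len(peptide)
    return {aa: value / len(peptides) for aa, value in totals.items()}


def log2_fold_changes(selected, naive):
    """log2(frequency after selection / frequency before selection)."""
    f_sel = average_aa_frequencies(selected)
    f_naive = average_aa_frequencies(naive)
    shared = f_sel.keys() & f_naive.keys()  # residues present in both rounds
    return {aa: math.log2(f_sel[aa] / f_naive[aa]) for aa in sorted(shared)}


if __name__ == "__main__":
    # Toy peptides, invented purely for illustration.
    round0 = ["AQ", "AQ"]
    round7 = ["QQ", "AQ"]
    print(log2_fold_changes(round7, round0))
```

Positive values indicate residues enriched by selection, and negative values indicate depleted residues.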

Principal component analysis.
The amino acid frequencies within each peptide were calculated by an in-house program written in Perl. Principal component analysis was performed on the calculated amino acid frequencies using R ver 4.0.2 (https://www.r-project.org/), and the loading plots were drawn using the first and second principal component axes. The peptide sequence dataset consisted of 100 peptide sequences randomly sampled from round 0 and the top peptide sequences of the top 30 enriched clusters in round 7.

Calculation of hydrophobicity and net charge of peptides.
The hydrophobicity and net charge index of peptides were calculated using the "Peptides" ver 2.4.3 package 50 of R ver 4.0.2 5. The GRAVY score (grand average of hydropathy), which is a hydrophobicity index of peptides, was calculated based on Kyte and Doolittle's amino acid hydrophobicity index 6. The net charge index at pH 7.4 was calculated with the Henderson-Hasselbalch equation using Lehninger's pKa scale 7.
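Both indices are simple to compute, as the Python sketch below illustrates. It is not the "Peptides" R package. The Kyte-Doolittle hydropathy values are the standard published scale, whereas the pKa values are entered from memory as an approximation of the Lehninger scale and should be verified against the package before any quantitative use.

```python
# Kyte-Doolittle hydropathy index per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# ASSUMPTION: approximate Lehninger pKa values, quoted from memory.
PKA_POSITIVE = {"Nterm": 9.69, "K": 10.5, "R": 12.4, "H": 6.0}
PKA_NEGATIVE = {"Cterm": 2.34, "D": 3.86, "E": 4.25, "C": 8.33, "Y": 10.0}


def gravy(peptide):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def net_charge(peptide, ph=7.4):
    """Net charge from the Henderson-Hasselbalch equation, with free termini."""
    charge = 0.0
    for group in ["Nterm", "Cterm", *peptide]:
        if group in PKA_POSITIVE:
            # Fraction of a basic group that is protonated (+1).
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
        elif group in PKA_NEGATIVE:
            # Fraction of an acidic group that is deprotonated (-1).
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge


if __name__ == "__main__":
    # Toy peptide, invented purely for illustration.
    peptide = "QNQKK"
    print(f"GRAVY: {gravy(peptide):.2f}")
    print(f"net charge at pH 7.4: {net_charge(peptide):.2f}")
```

Each ionizable group contributes a fractional charge, so the net charge is generally not an integer.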

Enrichment analysis of combinatorial amino acid pairs.
The frequency of combinatorial dipeptide motifs in the peptide libraries from round 0 and round 7 was calculated, and the enrichment score was derived from the following equation. Hierarchical clustering was then used to visualize the enrichment scores of dipeptide motifs as a heat map.
Figure S1. Schematic overview of codon-restricted mRNA display. The method starts with a designed codon-restricted DNA library (VMM/VRR or HHY codon library) encoding 33-amino acid random peptides. The DNA library is transcribed in vitro to an mRNA library, which is ligated to the puromycin-DNA-tag. The tagged products are gel-purified and subjected to negative selection. The remaining tagged products are translated in vitro using a cell-free system, and the resulting mRNA-peptide conjugates, with a diversity of 10¹² sequences, are gel-purified and subjected to affinity selection against poly RNA. The selected mRNA-peptide conjugates are reverse-transcribed to a DNA library and used in the next round of selection. In addition, DNA libraries from each round (round 0 to round 7) are subjected to Illumina MiSeq sequencing.

Figure S2. Codon translation table. The codons used by each library are shown in the codon table. (A) Codons corresponding to "VMM" and "VRR" in the VMM/VRR codon library are

Figure S3. Synthesis of DNA library and addition of puromycin-DNA-tag to mRNA. (A) Gel images taken after synthesis of the DNA libraries and electrophoresis in a 3% w/v TAE agarose gel. (B) Schematic overview of the Y-ligation reaction. (C) Gel photograph of mRNA and the Y-ligation reaction mixture after electrophoresis on an 8 M urea 6% w/v TBE gel (left: SYBR Gold staining, right: FITC fluorescence detection).

Figure S4. Synthesis of mRNA-peptide conjugates by in vitro translation. In vitro translation reactions were performed with puromycin-DNA-tagged mRNA synthesized in each round and electrophoresed on SDS-polyacrylamide gels (stacking gel: 3.5% w/v polyacrylamide gel, separation gel: 10% w/v polyacrylamide gel). For the VMM/VRR codon library, 4-12% w/v polyacrylamide gels were used from round 3 onwards. Subsequently, the images were captured by FITC fluorescence: (A) VMM/VRR codon library and (B) HHY codon library.

Figure S5. cDNA synthesis using the selected mRNA-peptide conjugates. cDNAs were synthesized from the mRNA-peptide conjugates selected by their targeted RNA binding capability. mRNA-peptide conjugates were reverse transcribed and amplified by polymerase chain reaction. The synthesized cDNA was electrophoresed on a 3% w/v TAE agarose gel: (A) VMM/VRR codon library and (B) HHY codon library.

Figure S6. Principal component analysis of amino acid frequencies. For round 0, 100 peptide sequences were randomly selected, while for round 7, the representative peptide sequences of the

Figure S7. Thermodynamic plots of the identified RNA-binding peptides with polyRNA/DNA. The dissociation constants between polyRNA/DNA and AP2/CP1 were determined by spectral shift (SpS) experiments: (A) thermodynamic plots of AP2 with poly(A) RNA/DNA and (B) thermodynamic plots of CP1 with poly(C) RNA/DNA. The error bar represents the mean ± SE from three independent experimental replicates.

Figure S8. Thermodynamic plots of the identified RNA-binding peptides with hybridized RNAs. The dissociation constants between poly RNA and AP2/CP1 were determined in the presence of the antisense oligo DNA by spectral shift (SpS) experiments: thermodynamic plots of (A) AP2 with Cy5-labeled poly(A) RNA in the presence of poly(T) DNA and (B) CP1 with Cy5-labeled poly(C) RNA in the presence of poly(G) DNA. The error bar represents the mean ± SE from three independent experimental replicates.

Figure S9. Thermodynamic plots of AP2 peptide and its mutated variants. The dissociation constants between AP2 variants and poly(A)/(C) RNA were determined by spectral shift (SpS) experiments. The native peptide (AP2) was compared to the mutated variants whose QNQ sequence was mutated to (A) TTT or (C) TNH, while (B) the truncated AP2 (position 6 to 26) was compared to the mutated variants whose Q was mutated to S. The error bar represents the mean ± SE from three independent experimental replicates.

Figure S10. Thermodynamic plots of CP1 peptide and its mutated variants with poly(C) RNA. The dissociation constants between CP1 variants and poly(C) RNA were determined by spectral shift (SpS) experiments. The native peptide (CP1) was compared to the mutated variants whose (A) TN sequence was mutated to TS or (B) NH sequence was mutated to SH. The error bar represents the mean ± SE from three independent experimental replicates.

Figure S13. Proposed hydrogen bonding between Gln/Asn and adenine/cytosine. Green: Watson-Crick-type hydrogen bonding interaction between Gln/Asn and adenine/cytosine. Blue: Hoogsteen-type hydrogen bonding interaction between Gln/Asn and adenine. The common side chain structure of Gln and Asn is shown here. The dashed line indicates hydrogen bonding.

Figure S14. Thermodynamic plots of polyQ/N peptide with polyRNAs. Microscale thermophoresis was employed to determine the dissociation constants of polyQ/N peptide (6-mer) with poly(A)/(C) RNA. (A) polyQ peptide and (B) polyN peptide. The error bar represents the mean ± SE from three independent experimental replicates.