Design and Application of Mini‐libraries of miRNA Probes for an Efficient and Versatile miRNA‐mRNA Cross‐linking

Abstract
MicroRNAs constitute a class of endogenous, non-coding RNAs that influence various processes within the cell. By base-pairing to partially complementary sites located in the 3' untranslated region of target messenger RNAs, microRNAs participate in post-transcriptional regulation of the majority of human protein-coding genes. Their dysregulation has been related to many pathological processes and diseases. Thus, an in-depth understanding of the microRNA mechanisms of action is crucial. Here, we present a new concept of probe design to achieve an efficient and sequence-independent miRNA-mRNA cross-linking. The new strategy is based on the utilization of a controlled mixture of probes for a chosen miRNA, in which a trioxsalen moiety is introduced at the N4-position of a selected cytidine through short oligoethylene glycol-based linkers. In vitro photo-cross-linking experiments with mini-libraries of probes for microRNAs of interest showed variable cross-linking efficiencies, demonstrating a general applicability of the presented approach.


Introduction
MicroRNAs (miRNAs) are endogenous, highly conserved RNAs acting as powerful regulators of a wide array of physiological processes in a variety of organisms. [1] They bind to conserved sites in the 3'-untranslated regions (3'-UTRs) of target messenger RNAs (mRNAs) and induce their degradation [2] or repress their translation. [3] In canonical miRNA targeting, target recognition occurs by base-pairing between the "seed" region (nucleotides 2-7 counting from the 5'-end) of the miRNA and its mRNA target. However, miRNAs also participate in a wide variety of non-canonical interactions, for example those that involve regions other than their seeds, [4] that contain bulged [5] or mismatched pairing, [6] that occur at regions outside the 3'-UTR, [7] or that involve alternative classes of cellular RNAs. [8] Alterations to miRNA expression patterns have been linked to numerous diseases including cardiovascular disorders, [9] atherosclerosis, [10] hepatitis C infections [11] and cancer. [12] In order to clarify the functions of a miRNA, new methods are needed to identify its targetome in cells. In the past few years, various computational and experimental approaches have been employed to identify the targetomes of miRNAs (reviewed in reference [13]). Our laboratory introduced the miRNA cross-linking and immunoprecipitation (miR-CLIP) method, which employs chemically synthesized miRNA probes that are conjugated with photo-reactive trioxsalen and biotin groups. [8c] After transfection of the probe into cells, mild irradiation cross-links the miRNA to its targets (Figure 1a). MiRNA-mRNA products are isolated on streptavidin beads, and targets are identified by RNA sequencing. A strength of miR-CLIP is that non-canonical targets, i.e. new functions of miRNAs that are difficult to identify by other means, can be identified. [8c,14] Trioxsalen is a psoralen derivative, which is widely used for cross-linking RNAs, including miRNAs. [15] It covalently cross-links pyrimidines (preferentially uridines) in double-stranded regions of RNA upon irradiation with UV light (λ = 365 nm). [16]
In previous miR-CLIP studies, [8c,14] we prepared and tested miR-CLIP probes for miR-106a, miR-132 and miR-124. Additionally, miR-106a and miR-132 mono-labelled analogues bearing trioxsalen at distinct sites were cross-linked to short, complementary counter-strands that mimic miRNA targets in the cell and were characterized using a dedicated in vitro photo-cross-linking HPLC assay. Trioxsalen was conjugated to the 2'-O-position of an adenosine via copper(I)-catalysed azide-alkyne cycloaddition (CuAAC; 'click') reaction (Figure 1b). We found that the efficiency of the cross-linking reaction, which occurs through the minor groove in the miRNA/target duplex, was variable and target-sequence dependent; for some sequences, the data suggested that intra-strand cross-linking was a significant by-product. We hypothesized that this was at least partially due to a paucity of accessible uridines in the target strands that could be reached by trioxsalen connected through a rigid triazole-containing linker of defined length.
Here, we introduce a new strategy designed to capture miRNA targets in a sequence-independent fashion. The approach employs probes in which trioxsalen is positioned in the major groove of a miRNA-target duplex, with the cross-linker conjugated via ethylene glycol-based linkers of different lengths to cytidines (Figure 1c). We tested the approach using model probes of miR-124 in the in vitro photo-cross-linking assays and found superior cross-linking efficiencies compared to first-generation probes prepared with the use of click chemistry. Nevertheless, cross-linking efficiency was dependent on linker length, and since it was not possible to predict a priori the ideal linker composition, we experimented with mixtures of probes bearing oligoethylene glycol linkers of different lengths, with the expectation that at least some library members would cross-link sufficiently well to ensure efficient capture of the entire targetome. We validated the approach on different mini-libraries of six miRNA probes. Moreover, the miR-124 probes were accepted into the RNA-induced silencing complex (RISC) and acted as miRNA mimics in a cellular reporter assay.

Results and Discussion
First-generation probes ORN-1 and ORN-3 (Figure 1d) were synthesized following the original protocol. [8c] They were then tested for cross-linking efficiency in the aforementioned in vitro assay. Thus, duplexes D1 and D2 were irradiated for 15 min (λ = 365 nm), and cross-linking yields were assessed using reverse-phase HPLC (RP-HPLC). Peaks with the lowest and the highest retention times corresponded to the unmodified counter-strand (ORN-2) and trioxsalen-labelled probes (ORN-1, ORN-3), respectively, as verified by mass spectrometry (Figure 1e). Small amounts of cross-linked products appeared on the chromatograms between the parent compounds, concomitant with the decrease in the peak intensities of the single strands. Side-products of identical mass to the probe strand may have derived from intra-strand cross-linking.
In the new probe design, ethylene glycol (EG)-based linkers were employed for a cross-linking reaction through the major groove of the miRNA-target duplex. We have shown previously that small functional groups in the major groove do not significantly perturb the miRNA-induced silencing complex (miRISC). [17] We considered oligoethylene glycol groups as ideal linkers since they are flexible, are chemically inert and have been widely used in drug discovery, thanks to their advantageous pharmacokinetic properties. [18] Trioxsalen analogues with amino-linkers containing up to four ethylene glycol units were obtained following previously reported methods (Figure 2). [19] Thus, trioxsalen was converted to the 4'-chloromethyltrioxsalen (1), [19a] and then substituted with the appropriate n-ethylene glycol to produce 2a-2d [19b] (Figure 2a). The hydroxyl groups were then mesylated and reacted with sodium azide to give 4a-4d. In the final step, the azide derivatives were reduced to amines (5a-5d) with a Staudinger reaction. Purification of the amines was performed by acid-base extraction. We found that for longer linkers (n = 3, 4), extended reaction times were often required. Excess amounts of tri- and tetraethylene glycol were needed for quantitative conversions in the pegylation step, though the removal of the excess glycols required extensive aqueous washing steps and double purification by column chromatography. 4'-Aminomethyltrioxsalen (6) was obtained from 1 by the Gabriel synthesis (Figure 2b). [19c]

Chemistry-A European Journal
Full Paper doi.org/10.1002/chem.202101171

The trioxsalen derivatives were conjugated at specific cytidines in the probe sequences using the "convertible nucleoside" approach [20] (Figure 1c), which we have used previously for site-specific functionalization of miRNA mimics. [17] In this method, O4-triazolyluridine [21] is incorporated into the oligonucleotide during solid-phase synthesis. After synthesis of the full-length sequence, but before oligonucleotide release from the solid support, the triazole leaving group is displaced by the amine-bearing trioxsalen. Thus, O4-triazolyluridine phosphoramidite (7, Figure S1) was employed at position 6 of miR-124 (ORN-4, Table 1), and five equal portions of this solid support were then transferred into vials and treated with amines (5a-5d, 6) in the presence of DBU. Reaction mixtures were shaken for 6 h at 60 °C, as these conditions appeared to be optimal in test reactions (data not shown). Reaction suspensions were then concentrated, and oligonucleotides were deprotected and detritylated under standard conditions. RP-HPLC purification resulted in the clean separation of the products, although for some of the compounds more than one purification round was needed. In total, five ORN-4 homologues modified at the N4-position of cytidine C6 with a trioxsalen attached through EG0-EG4 linkers were obtained (ORN-4a to ORN-4e, Table 1, Table S1).
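The choice of attachment sites can be illustrated with a short Python sketch that lists the cytidines of a miRNA and flags those lying in the seed (nucleotides 2-7). The sequence used here is an assumption: it is taken to be the mature hsa-miR-124-3p sequence from miRBase, is not given in the text, and should be checked against the database before use. The function name `cytidine_sites` is hypothetical.

```python
# Assumed mature miR-124 sequence (hsa-miR-124-3p, miRBase); not stated in
# the text, so verify against the database before relying on it.
MIR_124 = "UAAGGCACGCGGUGAAUGCC"

# Seed region: nucleotides 2-7, counted from the 5'-end (1-based).
SEED = range(2, 8)


def cytidine_sites(mirna):
    """Return (1-based position, in_seed) for every cytidine in the miRNA."""
    return [
        (pos, pos in SEED)
        for pos, nt in enumerate(mirna.upper(), start=1)
        if nt == "C"
    ]


for pos, in_seed in cytidine_sites(MIR_124):
    where = "seed" if in_seed else "outside seed"
    print(f"C{pos}: {where}")
```

With this assumed sequence, positions 6 and 8 are both cytidines, and only C6 falls within the seed as defined above.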
In order to confirm that the cross-linker groups would not perturb hybridization of the probes to RNA targets, their melting temperatures (Tm values) with a 15-nt complementary sequence (ORN-2) were measured (Table 1, Figure S2a). The experimentally obtained Tm values for the probes were in general slightly lower than that of the unmodified duplex (-3.6 °C for D8, comprising ORN-4e). However, for duplex D4 comprising the probe with the EG0 linker (ORN-4a), the Tm value (62.1 ± 0.3 °C) was lower than that of the unmodified parent duplex D3 by 15 °C. Such a significant drop implied that the base-pairing at this position was inhibited. [22] Previous studies describe that mono-substitution on the exocyclic N4-amine of (deoxy)cytidine does not abolish Watson-Crick base-pairing in DNA and RNA helices, [17,20a,22,23] if the appended residue protrudes away from the Watson-Crick face, towards the major groove of a duplex. [20a,24] In the case of ORN-4a, the linker was possibly too short so that the trioxsalen residue disturbed the local duplex structure. Therefore, probes with the EG0 linker were not prepared for other miRNAs in further studies. We also performed circular dichroism (CD) measurements on the probes (Figure S2b). A strong positive band at 260 nm, a negative band at 210 nm and a small local maximum at 225 nm were consistent with A-form helical conformations similar to unmodified RNAs. Taken together, the biophysical data provided supporting evidence that the new probe design would not alter the conformation of a duplex, thereby helping ensure that it would not affect miRISC functions in cells.
Next, in vitro photo-cross-linking experiments were performed with single probes of the ORN-4 series to assess how they might perform in miR-CLIP experiments. Each of the probes was annealed to its counter-strand ORN-2 and irradiated (λ = 365 nm) for 15 min. Samples were then examined by RP-HPLC and masses of the isolated product peaks were analysed by LC-MS. The parent duplex D3 served as a negative control to ensure that no radical-based cross-linking independent of trioxsalen occurred. Pleasingly, clean cross-linking was observed for duplexes D6, D7 and D8. The cross-linking was most efficient for the probe with an EG2 linker (ORN-4c, D6; Figure 3d, Table 2), which appeared to be almost quantitative on the chromatogram. Cross-linking efficiency gradually decreased with the increased length of the linker, reaching ~34 % and ~24 % for D7 and D8 with EG3 and EG4 linkers, respectively (Figure 3e-f, Table 2).
The cross-linking efficiencies observed for the probes carrying the modification at the N4 position were significantly higher in comparison to those with the trioxsalen attached at the 2'-O-position of the same cytidine C6 (data not shown). This might have been due to shifting the modification from the minor to the major groove, and/or the higher flexibility of the linker chains. Both of these significantly affect the reach of the trioxsalen, possibly facilitating intercalation and subsequent cross-linking. Nevertheless, we observed substantial differences in the cross-linking efficiency as a function of linker length, with EG2 being the most effective reagent for the miR-124 sequence modified at C6. Since C6 is located in the seed region of miR-124, all canonical mRNA targets of miR-124 share the same subsequence in the immediate region of the expected cross-linking. However, it was apparent that for non-canonical miR-124 targets, or a change to a different miRNA, the cross-linking profile of the probes with oligoethylene glycol linkers would be altered. Furthermore, we had no easy means to predict which linker lengths might provide the most efficient cross-linking yields. To help circumvent this constraint, we proceeded to investigate the use of a controlled mixture of probes, i.e. with linkers of different lengths.
Based on the similar melting temperatures of ORN-4b to ORN-4e, we hypothesized that most of the probes would hybridize equally well to miR-124 targets in cells, creating a heterogeneous population of cross-linked duplexes that differ only in the lengths of the trioxsalen linkers (represented schematically in Figure 1f). Thereafter, irradiation would be expected to produce a mixture of cross-linked products with varying composition, all of which would serve equally well to identify the RNA targets of the miRNA in the miR-CLIP protocol. Therefore, equimolar (2 μM) amounts of ORN-4a to ORN-4e were mixed with the counter-strand (ORN-2; 10 μM) and irradiated. Chromatograms for the samples prior to irradiation (Figure 4a, top panel) showed a clean separation of the individual oligonucleotides. Post irradiation, the deconvoluted mass of the main newly formed peak (marked with an asterisk in Figure 4a) matched the expected masses of the cross-linked duplexes D6 and D7. Concomitantly, the intensity of the peak corresponding to ORN-4c (containing the EG2 linker) decreased. The intensities of ORN-4d and ORN-4e were also moderately reduced, reflecting the lower cross-linking efficiency of these probes, in alignment with the data of Figure 3 and Table 2. The data confirmed that the application of the probes as a mixture was not detrimental to cross-linking efficiency.
In order to determine whether such promising results could be also obtained when the trioxsalen is conjugated at other positions in miR-124, we prepared a second set of probes, in which the oligoethylene glycol linkers were moved from C 6 to C 8 (ORN-5a to ORN-5d). Purities of the oligonucleotides in the ORN-5 series were similar to the purities of the members of the ORN-4 series (Table S1). ORN-5a to ORN-5d were annealed with ORN-2 and subjected to photo-cross-linking in an analogous fashion to ORN-4a to ORN-4e. Once again, the shifted retention times in the RP-HPLC chromatogram, consumption of the counter-strand and the mass analysis of the newly-formed peaks were consistent with duplex cross-linking (Figure 4b), but because of the presence of various ionic adducts in the MS deconvolution spectra, we were unable to identify which of the individual probes within the mixture had reacted preferentially (deconvoluted masses from the cross-linking experiments are presented in the Supporting Information, section IV).
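The way salt adducts complicate the assignment of deconvoluted masses can be illustrated with a small Python sketch that enumerates candidate species for an observed mass. It rests on two assumptions: a photo-cross-link adds or removes no atoms, so a cross-linked duplex has the summed mass of probe and counter-strand while an intra-strand product has the mass of the probe itself; and each sodium or potassium adduct replaces one proton by one cation. The function names, the tolerance and all masses in the usage are hypothetical and are not values from this study.

```python
# Monoisotopic mass shift (Da) when one proton is replaced by one cation.
ADDUCT_SHIFTS = {"Na": 21.98194, "K": 37.95588}


def adduct_series(max_adducts):
    """Yield (description, mass shift) for the bare species and its adducts."""
    yield "no adduct", 0.0
    for metal, shift in ADDUCT_SHIFTS.items():
        for n in range(1, max_adducts + 1):
            yield f"+{n} {metal}", n * shift


def assign_peak(observed, probe_masses, counter_mass, tol=0.5, max_adducts=3):
    """List every (species, adduct) whose expected mass is within tol of observed.

    probe_masses maps a probe name to its mass in Da.
    """
    hits = []
    for name, probe_mass in probe_masses.items():
        species = {
            f"{name} alone (unreacted or intra-strand)": probe_mass,
            f"{name} + counter-strand (cross-linked duplex)": probe_mass + counter_mass,
        }
        for label, base in species.items():
            for adduct, shift in adduct_series(max_adducts):
                if abs(observed - (base + shift)) <= tol:
                    hits.append((label, adduct))
    return hits


# Hypothetical masses: two probes differing by one ethylene glycol unit
# (C2H4O, about 44.03 Da) and one counter-strand.
probes = {"EG2 probe": 7000.00, "EG3 probe": 7044.03}
for hit in assign_peak(11844.00, probes, counter_mass=4800.00):
    print(hit)
```

In this hypothetical case the observed mass matches both the EG3 duplex without adducts and the EG2 duplex carrying two sodium ions, because two sodium-for-proton exchanges (about 43.96 Da) nearly equal one ethylene glycol unit. This is offered as one plausible contributing factor to ambiguous assignments within a mixture, not as an established explanation.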
To expand the analysis to different miRNAs, we prepared an analogous set of four probes each for miR-10b, miR-146a, miR-208a and miR-155-3p (Table 3, Table S1). The purities of the synthesized oligoribonucleotides were, with a few exceptions, higher than 95 % (Table S1). The new probes were also irradiated as mixtures using the in vitro assay in an analogous fashion to that of miR-124. Cross-linked products were identified based on their masses and their retention times, and showed a range of reactivity profiles. Specifically, duplexes comprising miR-10b probes yielded cross-linking products mostly from the EG3 and EG4 linkers (ORN-6c and ORN-6d, respectively; Figure 4c), which were confirmed unambiguously by their masses. MiR-146a probes comprising EG2, EG3 and EG4 linkers (ORN-8b to ORN-8d) cross-linked to ORN-9 with almost complete consumption of the probes (Figure 4d), whereas for miR-208a, cross-linking to ORN-11 was most efficient with the longer linker lengths, i.e. with ORN-10c and ORN-10d (Figure 4e). For miR-155-3p, trioxsalen was conjugated to position 15 of the sequence (ORN-12a to ORN-12d), i.e. outside of the seed region. In this example, two new faster-migrating peaks were detected in the HPLC chromatogram, for which the mass analysis corresponded to starting material (Figure 4f). This was consistent with no capture of the complementary strand ORN-13, but rather with intra-strand cross-linking events, possibly at dinucleotides U11U12 or U17U18 in ORN-12. It should be noted that position 15 is the last base-pair of the ORN-12/ORN-13 duplex, which may also have adversely influenced cross-linking with proximal uridines, since the last base-pair may be susceptible to duplex "breathing".
Taken together, these results validate the concept of using controlled mixtures of probes with variable linker lengths to increase the overall cross-linking efficiency and at least partly circumvent the sequence-dependent covalent capture of target mRNA by miRNA probes.
Having demonstrated the potential advantageous properties of using mini-libraries of this new class of probes for miR-CLIP experiments, we assessed their compatibility with the RNA interference mechanism in cells. It has been reported that the substitution of miRNA mimics with functional groups, particularly in their seed regions, can lead to their inactivation in miRNA mechanisms of silencing, presumably due to a failure to be taken into the miRISC. [8c,25] To assess the functionality of the new probes, the miR-124 probe series were tested in HEK293T cells using a reporter assay that we employed in the original miR-CLIP study. [8c] For this purpose, a fully complementary binding site to miR-124 was cloned into the 3'-UTR of the Renilla luciferase gene, encoded on a dual reporter plasmid. Next, HEK293T cells were co-transfected with the plasmid and with duplexes containing probes (ORN-4b to ORN-4e and ORN-5a to ORN-5d), native miR-124 (ORN-4) and a positive control (siRNA targeting the Renilla gene) or a negative control (randomized duplex). [26] After readout, all of the trioxsalen-bearing probes showed similar concentration-dependent inhibition of reporter gene expression, with effects comparable to the positive controls (Figure 5). Probes with the modification at position C6 of the sequence (ORN-4 series) demonstrated slightly stronger target repression than probes substituted at C8 (ORN-5 series). These results provided confirmatory evidence that this new class of probe can be functional and incorporated into miRISC machinery in cells.

Conclusion
In conclusion, we have presented a new strategy for the design and use of a novel class of trioxsalen-labelled probes conjugated to cytidines of a miRNA through oligoethylene glycol linkers using the convertible nucleoside approach. These probes cross-linked in vitro to complementary synthetic targets with varying efficiencies depending on the length of the linker and the relative location of reacting uridines in the target strands. By employing an equimolar mixture of probes with different linkers for a given miRNA, the overall yields of successful cross-linking could be maximized while, at the same time, attenuating sequence-dependent effects governed by the miRNA sequence. This strategy is therefore well suited to the miR-CLIP method, where all of the targets of a given miRNA can be captured and identified in cells. We validated the strategy on six model miRNAs, of which only one (miR-155-3p) failed to produce a cross-linking reaction. Reasons for this were unclear, but may well have been related to the technical set-up of the assay, i.e. it is possible that in a native environment, miR-155-3p probes (ORN-12 series) would also be effective. The convertible nucleoside approach is a very effective method to introduce modifications into the major groove of RNA duplexes. The versatility of our method can be increased further by adapting it also to the N6 position of adenosine. Experiments in this direction, as well as miR-CLIP experiments using these probes, are underway.

General experimental details
Trioxsalen (CAS Number: 3902-71-4) was purchased from TCI. Thin-layer chromatography (TLC) was done on silica gel 60 Å F254 aluminium sheets (Merck). Synthesized compounds were purified by preparative flash column chromatography (Isolera One flash chromatography system, Biotage), which was carried out on silica gel, 60 Å (Fluka). Solvents for TLC, column chromatography, and extractions were commercial grade. NMR spectra were recorded on a Bruker Av400 at a resonance frequency of 400 MHz (for 1H NMR), 101 MHz (for 13C NMR) or 162 MHz (for 31P NMR). Solvent signals were used as internal standards. Chemical shifts (δ) are given in ppm.

Oligonucleotide synthesis
The sequences of the synthetic miRNAs were designed based on the miRNA sequences reported in the freely available miRBase database. [27]
General information

Post-synthetic triazole substitution
20 mg portions of the CPG with ORN-4, ORN-5, ORN-6, ORN-8, ORN-10 and ORN-12 modified with the O4-triazolyluridine were prepared in screw-cap tubes. Amines (5a-5d or 6; 10-18 mg/reaction, depending on the linker length) were dissolved in a mixture of 150 μL of dry ACN and 50 μL of DBU and added to the CPG. The suspension was then incubated for 6 h at 60 °C with shaking (ThermoMixer C, Eppendorf) and concentrated in the SpeedVac (miVac duo SpeedVac, Genevac). Further deprotection and purification were performed according to the standard procedure for the modified oligoribonucleotides (described above). Some samples required double purification by RP-HPLC.

In vitro photo-cross-linking assay
The in vitro photo-cross-linking with single probes: The in vitro photo-cross-linking assay was performed following the published protocols with minor changes. [8c] Calculated volumes of the trioxsalen-modified ORN and its unmodified counter-strand were mixed, dried, and re-dissolved in 200 μL of the phosphate buffer (2.5 mM Na2HPO4, 5 mM NaH2PO4, 100 mM NaCl and 0.1 mM Na2EDTA) so that the final concentration of each strand was 3 μM. For annealing, the solution was heated to 95 °C, held at that temperature for 5 min and cooled down to room temperature over 2 h. Samples were irradiated for 15 min (Bio-Link BLX, Vilber; UV source: 5 × 8-watt lamps, λ = 365 nm; distance of the plate from the lamps: 5 cm) in an open 24-well plate placed on ice. Then, the samples were purified by RP-HPLC [XBridge OST C-18 column (10 × 50 mm, 2.5 μm; Waters) at 65 °C with a flow rate of 5 mL/min] using a gradient of 1-60 % D in 12 min. Collected fractions were dried (miVac duo SpeedVac, Genevac), re-dissolved in ultrapure water and analysed by LC-MS [Acquity OST C-18 column (2.1 × 50 mm; Waters); column oven at 65 °C; flow rate: 0.3 mL/min] with a gradient of 5-60 % D in 14 min. For some of the oligonucleotides modified with the ethylene glycol linkers, the analysis by LC-MS was made more difficult by the presence of salt adducts (usually sodium or potassium cations), especially in the case of cross-linked duplexes.
The in vitro photo-cross-linking with a mixture of probes: Samples were prepared according to the standard protocol described above with a small modification: calculated volumes (corresponding to 2 μM concentrations) of the trioxsalen-modified ORNs were taken and mixed with the calculated volume (corresponding to a 10 μM or 8 μM concentration) of the unmodified counter-strand. The mixture was concentrated, re-dissolved in 200 μL of the phosphate buffer (2.5 mM Na2HPO4, 5 mM NaH2PO4, 100 mM NaCl and 0.1 mM Na2EDTA), annealed and cooled down to room temperature. After irradiation, samples were analysed according to the standard procedure.
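The "calculated volumes" in the mixture protocol follow from the dilution relation C1·V1 = C2·V2, which the following Python sketch works through. It assumes that the counter-strand is used equimolar to the sum of the probes, which is one reading of the 10 μM and 8 μM values quoted for five and four probes at 2 μM each. The stock concentrations, probe names and function names are hypothetical.

```python
def stock_volume_ul(final_conc_um, final_volume_ul, stock_conc_um):
    """Stock volume (uL) that gives final_conc_um in final_volume_ul (C1*V1 = C2*V2)."""
    if stock_conc_um <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc_um * final_volume_ul / stock_conc_um


def mixture_plan(probe_stocks_um, counter_stock_um,
                 probe_conc_um=2.0, final_volume_ul=200.0):
    """Return (counter-strand concentration, volumes per component in uL).

    Assumption: the counter-strand matches the summed probe concentration,
    e.g. 5 probes x 2 uM -> 10 uM, 4 probes x 2 uM -> 8 uM.
    """
    counter_conc_um = probe_conc_um * len(probe_stocks_um)
    plan = {
        name: stock_volume_ul(probe_conc_um, final_volume_ul, stock)
        for name, stock in probe_stocks_um.items()
    }
    plan["counter-strand"] = stock_volume_ul(
        counter_conc_um, final_volume_ul, counter_stock_um
    )
    return counter_conc_um, plan


# Hypothetical 100 uM stocks for a five-membered mini-library.
stocks = {f"probe-{i}": 100.0 for i in range(1, 6)}
counter_conc, volumes = mixture_plan(stocks, counter_stock_um=100.0)
print(counter_conc, volumes)
```

Because the strands are dried and re-dissolved in 200 μL of buffer, the pipetted volumes only set the amount of each strand; the final concentrations are fixed by the re-dissolution volume.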