The contribution of pseudouridine to stabilities and structure of RNAs

Thermodynamic data are reported revealing that pseudouridine (Ψ) can stabilize RNA duplexes when replacing U and forming Ψ-A, Ψ-G, Ψ-U and Ψ-C pairs. Stabilization is dependent on type of base pair, position of Ψ within the RNA duplex, and type and orientation of adjacent Watson–Crick pairs. NMR spectra demonstrate that for internal Ψ-A, Ψ-G and Ψ-U pairs, the N3 imino proton is hydrogen bonded to the opposite strand nucleotide and the N1 imino proton may also be hydrogen bonded. CD spectra show that general A-helix structure is preserved, but there is some shifting of peaks and changing of intensities. Ψ has two hydrogen donors (N1 and N3 imino protons) and two hydrogen bond acceptors because the glycosidic bond is C-C rather than C-N as in uridine. This greater structural potential may allow Ψ to behave as a kind of structurally driven universal base because it can enhance stability relative to U when paired with A, G, U or C inside a double helix. These structural and thermodynamic properties may contribute to the biological functions of Ψ.


INTRODUCTION
Pseudouridine (5-ribosyluracil, Ψ) is one of the most abundant modified nucleotides (1). For example, it is found in transfer, ribosomal, small nuclear and small nucleolar RNAs (snoRNAs), and is likely important for structure and function. Isomerization of the glycosidic bond from N1 to C5 of uracil (Figure 1) enhances base rotation (1,2) and allows both N1 and N3 imino protons to serve as hydrogen bonding donors. It was postulated that Ψ could increase the thermodynamic stability of RNA duplexes due to formation of an additional hydrogen bond and/or better stacking (3,4). In comparison to uridine, Ψ favors the C3′-endo conformation of the ribose, which is correlated with the anti conformational preference of nucleobases (3). Published results demonstrate structural rigidity of Ψ within single- and double-stranded RNA regions, but the basic topology of U- and Ψ-containing RNA is usually similar (1,3-10). In a model for the U2 snRNA involved in splicing of pre-mRNA, however, Ψ induced significant changes in the architectural landscape of the spliceosomal branch site of RNA compared with an unmodified counterpart (11,12).
Ψ is found in the TΨC loop of almost all tRNAs. For most tRNAs, Ψ is also found in the D stem and/or in the anticodon stem and loop (1). Ψ contributes to stabilization of specific structural motifs. For example, Ψ is phylogenetically conserved in major spliceosomal snRNAs (U1, U2, U4, U5 and U6) and mostly located in regions involved in RNA-RNA and RNA-protein interactions essential for assembly of the spliceosome and for the splicing process (12-14). The 5′ end of U2 snRNA binds directly to pre-mRNA splice sites and is particularly rich in Ψ residues (12). Structural studies of models for RNAs involved in splicing indicate that a Ψ-A pair adjacent to a splice site stabilizes the helix formed between U2 snRNA and pre-mRNA and induces bulging out of an adenosine to initiate splicing (4). The function of Ψ in snoRNAs is unknown. Ψ is also present in functionally important regions of ribosomal RNAs (small subunit, large subunit, 5.8S and 5S), including the peptidyl transferase center (PTC) (1). In eukaryotic large subunit rRNAs, Ψ accounts for 0.9-1.4% of the nucleotides (1). Positions of many Ψ residues in rRNAs are highly conserved, suggesting importance for structure and function. For example, Escherichia coli rRNA helix 69 (region 1906-1924) and human rRNA region 3722-3740 contain three and five conserved Ψ residues, respectively (1,15,16). Spectral studies on a bacterial hairpin containing helix 69 show that Ψ residues exhibit a range of effects on RNA stability and structure, depending on their location (5,7,8,17). Ψ residues placed in the single-stranded loop destabilize the hairpins, whereas Ψ in the helical stem increases stability.
Here, we present results concerning the influence of Ψ residues on thermodynamic stabilities of RNA duplexes containing Ψ-A, Ψ-G, Ψ-U and Ψ-C pairs at internal and terminal positions. The contribution of Ψ as 5′- and 3′-terminal dangling ends is also reported. Additionally, the influence of A-U and G-C adjacent pairs on the thermodynamic stabilities of RNA duplexes with a central Ψ-A or Ψ-G pair is reported. Circular dichroism (CD) spectra of Ψ-containing RNA duplexes were also measured. Thermodynamic and CD results for Ψ are compared to those of isosequential duplexes with U instead of Ψ. NMR spectra of imino proton regions of RNA duplexes containing Ψ indicate that the N3 proton is usually hydrogen bonded in Ψ-A, Ψ-G and Ψ-U pairs.

Choice of oligonucleotide sequences
The oligonucleotide sequences were selected to allow placing of Ψ at internal and terminal positions with the same nearest neighbor to minimize influence of surrounding Watson-Crick base pairs on stabilities, structures and interactions of Ψ pairs. Duplexes with Ψ-A, Ψ-G, Ψ-U and Ψ-C pairs were compared to otherwise identical duplexes with Ψ replaced by U.

Synthesis and purification of oligonucleotides
Oligonucleotides were synthesized on Applied Biosystems DNA/RNA and MerMade12 (Bioautomation) synthesizers, using standard β-cyanoethyl phosphoramidite chemistry on solid support (23). Commercially available A, C, G, U (GenePharma) and Ψ (ChemGenes) phosphoramidites with 2′-O-tert-butyldimethylsilyl protection were used for synthesis of RNA. The syntheses of all oligonucleotides were performed on universal Unylinker synthesized for this purpose. The details of deprotection and purification of oligoribonucleotides were described previously (24,25). Thin-layer chromatography (TLC) purification of the oligonucleotides was carried out on Merck 60 F254 TLC plates with the mixture 1-propanol/aqueous ammonia/water = 55:35:10 (v/v/v). Purified oligonucleotides were characterized by mass spectrometry (MALDI-TOF).

UV melting experiments
Oligonucleotides were melted in buffer containing 1 M NaCl, 20 mM sodium cacodylate, 0.5 mM Na2EDTA, pH 7. Oligonucleotide single strand concentrations were calculated from absorbance above 80 °C, and single strand extinction coefficients were approximated by a nearest-neighbor model (26,27). The extinction coefficient of Ψ nucleoside at 260 nm is ~20% smaller than for U but was assumed to be equal to U (28). This results in only 1-2% error in duplex concentration. Absorbance versus temperature melting curves for nine concentrations of RNA duplexes ranging between 10⁻³ and 10⁻⁶ M were measured at 260 nm with a heating rate of 1 °C/min from 0 to 90 °C on a Beckman DU 640 or JASCO V-650 spectrophotometer with a thermoprogrammer. Melting curves were analysed and thermodynamic parameters calculated from a two-state model with the program MeltWin 3.5, which includes standard deviation estimates of errors (29). For all sequences, except 5′UCACUGAGU/3′AGUGGCUCA, the ΔH° derived from T_M⁻¹ versus ln(C_T/4) plots is within 15% of that derived from averaging the fits to individual melting curves, as expected if the two-state model is reasonable. For the exception, two transitions were observed, so the two-state approximation is not valid. Comparisons between duplexes used values from T_M⁻¹ versus ln(C_T/4) plots fit by linear least squares. Standard deviations were propagated by taking the square root of the sum of the squares of the individual standard deviations. The correlation coefficients between ΔH° and ΔS° were 0.99995 for duplexes with and without Ψ, as calculated from T_M's (24).
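
The T_M⁻¹ versus ln(C_T/4) analysis and the error propagation described above can be illustrated with a short sketch. For a non-self-complementary duplex in the two-state model, 1/T_M = (R/ΔH°) ln(C_T/4) + ΔS°/ΔH°, so a linear least-squares fit gives ΔH° from the slope and ΔS° from the intercept. This is a minimal illustration and not the MeltWin implementation; the function names and the synthetic melting temperatures are hypothetical and were not taken from the measurements reported here.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)


def fit_vant_hoff(total_conc_molar, tm_kelvin):
    """Linear least-squares fit of 1/Tm versus ln(Ct/4).

    Returns (dH in kcal/mol, dS in kcal/(mol*K), dG at 37 C in kcal/mol),
    assuming a two-state, non-self-complementary duplex.
    """
    x = [math.log(c / 4.0) for c in total_conc_molar]
    y = [1.0 / t for t in tm_kelvin]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # equals R / dH
    intercept = mean_y - slope * mean_x  # equals dS / dH
    dH = R / slope
    dS = intercept * dH
    dG37 = dH - 310.15 * dS
    return dH, dS, dG37


def propagate_sd(*standard_deviations):
    """Square root of the sum of squares of individual standard deviations."""
    return math.sqrt(sum(s * s for s in standard_deviations))


def synthetic_tm(dH, dS, total_conc_molar):
    """Hypothetical helper: ideal two-state Tm (K) for given dH, dS and Ct."""
    return 1.0 / ((R / dH) * math.log(total_conc_molar / 4.0) + dS / dH)


if __name__ == "__main__":
    # Nine log-spaced concentrations between 1e-6 and 1e-3 M (synthetic data).
    concs = [10 ** (-6 + 3 * i / 8) for i in range(9)]
    tms = [synthetic_tm(-80.0, -0.220, c) for c in concs]
    print(fit_vant_hoff(concs, tms))
    print(propagate_sd(0.10, 0.15))
```

With real data, the individual T_M values would come from the fitted melting curves at each strand concentration, and the propagated standard deviation would be used when two duplexes are compared.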

CD spectroscopy
CD spectra of RNA duplexes were measured in triplicate from 205 to 350 nm at 10 °C and 25 °C on a JASCO 815 spectropolarimeter. The buffer was 1 M NaCl, 20 mM sodium cacodylate and 0.5 mM Na2EDTA at pH 7, and duplex concentration was 0.12 mM. The measured CD spectra were averaged, the spectrum of the buffer was subtracted and the result was converted into molar ellipticity per nucleotide (Δε).
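
The CD processing steps (averaging replicates, subtracting the buffer spectrum, converting to a per-nucleotide value) can be sketched as follows. The sketch assumes raw spectra in millidegrees and uses the standard relation θ(mdeg) = 32980 × ΔA to obtain Δε in M⁻¹ cm⁻¹; the path length, the number of nucleotides per duplex and the function name are assumptions for illustration, since the cuvette path length is not stated in the text.

```python
def cd_delta_epsilon(replicates_mdeg, buffer_mdeg, duplex_conc_molar,
                     nucleotides_per_duplex, path_length_cm):
    """Average replicate spectra, subtract buffer, convert mdeg to delta-epsilon.

    All spectra are lists sampled at the same wavelengths.
    The result is per nucleotide, in M^-1 cm^-1.
    """
    n_rep = len(replicates_mdeg)
    n_points = len(buffer_mdeg)
    # Point-by-point average of the replicate scans.
    averaged = [sum(rep[i] for rep in replicates_mdeg) / n_rep
                for i in range(n_points)]
    # Buffer baseline subtraction.
    corrected = [a - b for a, b in zip(averaged, buffer_mdeg)]
    # Concentration expressed per nucleotide rather than per duplex.
    nucleotide_conc = duplex_conc_molar * nucleotides_per_duplex
    # theta (mdeg) = 32980 * delta-A, and delta-epsilon = delta-A / (c * l).
    return [theta / (32980.0 * nucleotide_conc * path_length_cm)
            for theta in corrected]


if __name__ == "__main__":
    # Hypothetical two-point spectra; 0.12 mM duplex, 18 nt, 0.1 cm path (assumed).
    scans = [[10.0, 20.0], [12.0, 22.0], [14.0, 24.0]]
    baseline = [2.0, 2.0]
    print(cd_delta_epsilon(scans, baseline, 0.12e-3, 18, 0.1))
```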

Nuclear magnetic resonance
All RNAs used for NMR measurements were dissolved in buffer containing 150 mM NaCl, 10 mM sodium phosphate and 0.1 mM EDTA, pH 6.8. For studies of exchangeable protons, the solvent was H2O/D2O (9:1, v/v) and for assignment of aromatic protons the solvent was 99.990% D2O (Sigma Aldrich). All the samples were annealed by heating at 90 °C for 5 min, then slowly cooled to room temperature and stored at 4 °C.
NMR spectra were measured on a Bruker AVANCE III 700 MHz spectrometer, equipped with a QCI CryoProbe. The 3 mm sample tubes were used with a final sample volume of 210 µl. The 1D proton spectra in H2O/D2O (9:1, v/v) were collected with water suppression using excitation sculpting with gradients (30) and in D2O, with presaturation, from 128 scans. Assignment of resonances was based on homonuclear NOESY (30) and heteronuclear 1H-15N HSQC spectra (31-34). Spectra were processed and prepared with TopSpin 3.0 Bruker Software.

RESULTS
Thermodynamic data for model RNA duplexes with Ψ and, in parentheses, with Ψ replaced by U are presented in Table 1. The thermodynamic effects of adding terminal nucleotides were measured by extending core duplexes 5′CAGUCAGU/3′GUCAGUCA and 5′UCAGUCAG/3′AGUCAGUC (Table 2). The thermodynamic effects of replacing U with Ψ were measured in duplexes of the type 5′UCAXMYAGU/3′AGUYNXUCA, where X-Y is a Watson-Crick pair, M is U or Ψ and N is A, C, G or U (Tables 1 and 3). Table 2 summarizes thermodynamic effects of adding terminal nucleotides U or Ψ as a 5′-dangling end on the C of a CG pair. U and Ψ enhanced free energy of duplex formation (ΔΔG°37) by similar increments of 0.78 and 0.66 kcal/mol, respectively, at 37 °C. Both stability increments are more favorable than the 0.1 kcal/mol measured for the 5′-dangling U of (UCCGGp)2, where p represents a 3′-phosphate (39). Either U or Ψ as a 3′-dangling end on the G of a GC pair favors duplex (ΔG°37) by 0.97 kcal/mol, within experimental error of the 0.6 kcal/mol measured for CCGGUp (40). It is commonly postulated that enhancement of RNA duplex stability by single 5′- and 3′-dangling ends is due to the stacking ability of the unpaired nucleotide (40-42). Based on NMR structural studies it is postulated that Ψ stacks better than U, presumably due to better rotational lability of the C1′-C5 glycoside bond in Ψ over the C1′-N1 in U (3). However, those conclusions were formulated for Ψ in an internal position of helical RNA.

Contributions of terminal unpaired and paired Ψ on thermodynamic stabilities of RNA duplexes
Larger and sequence dependent stability increments are measured when a 5′- or 3′-terminal U or Ψ is paired with A, G, U or C. A 5′-terminal U-A in a 5′UC/3′AG nearest-neighbor sequence enhances duplex stability by 1.88 kcal/mol at 37 °C, identical within error limits to the 1.81 kcal/mol measured for the 5′-terminal 5′ΨC/3′AG sequence (Table 1). A similar negligible difference is seen when U is replaced by Ψ in the 3′-terminal 5′GU/3′CA sequence, where addition of the terminal U-A or Ψ-A pairs provides stability increments of -2.18 and -2.10 kcal/mol, respectively. The increments of -1.88 and -2.18 kcal/mol for U are similar to increments of (-2.35+0.45) = -1.90 and (-2.24+0.45) = -1.79 kcal/mol expected from a nearest-neighbor model involving only Watson-Crick pairs (24). The increment of -1.81 kcal/mol for the terminal Ψ-A pair is similar to the value (-2.49+0.31) = -2.18 kcal/mol expected from a nearest-neighbor model, but the measured -2.10 kcal/mol is smaller than the predicted (-3.29+0.31) = -2.98 kcal/mol (19).
Adding a U-G and Ψ-G pair at the 5′ end enhanced stability by 1.78 (predicted is 1.80 kcal/mol, (43)) and 1.53 kcal/mol, respectively, which is within experimental error of the values for U-A and Ψ-A (Table 2). Placing U-G and Ψ-G pairs at the 3′ end enhanced stabilities by 2.44 (predicted is 2.15 kcal/mol) and 3.22 kcal/mol, respectively, suggesting some extra stabilization from the 3′-terminal Ψ-G. Adding a U-U or Ψ-U pair at the 5′ end enhanced stability by 0.86 and 1.41 kcal/mol, respectively, similar to values of 1.05 and 1.83 kcal/mol for the additions at the 3′ end. Here Ψ-U appears more stable than U-U at both ends. Adding a U-C or Ψ-C pair at the 5′ end enhanced stability by 1.12 and 0.81 kcal/mol, respectively, which are close, considering experimental errors. The values at the 3′ end are 1.36 and 1.52 kcal/mol, respectively, also within experimental error.
The results summarized in Table 2 reveal that in most cases the thermodynamic effect of substitution of Ψ for U at terminal positions is within experimental error. The exceptions are 3′-terminal U-G and U-U pairs in the 5′GU/3′CG and 5′GU/3′CU nearest-neighbor motifs, respectively, where replacement of U with Ψ enhances stability by 0.8 kcal/mol. In all but two cases, the terminal stability increments are also consistent with previous measurements. One difference is that the 5′-dangling end U in the 5′UC/3′G motif had a stability increment of -0.78 kcal/mol compared to a previous single measurement of -0.1 kcal/mol. It is possible that the enhanced stability reflects a dynamic pairing ensemble of the 3′ G with 5′ UC relative to that possible with the previous measurement on the shorter duplex (UCCGGp)2. Another possibility is an unexpected effect from the 3′ phosphates in (UCCGGp)2. The other case where the measurement is not predicted within experimental error is the 3′-terminal 5′GΨ/3′CA motif, where the measured stability increment is -2.10 kcal/mol compared to a nearest-neighbor prediction of -2.98 kcal/mol. The nearest-neighbor parameters for Ψ-A pairs, however, are currently based on only five occurrences of each (19).
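
The nearest-neighbor sums quoted for the terminal U-A and Ψ-A pairs can be reproduced with a few lines of arithmetic. In the sketch below, the numerical values are those given in the text; reading the second term of each sum (0.45 or 0.31 kcal/mol) as a terminal-pair penalty is an interpretation, and the labels and function names are hypothetical.

```python
# (label, nearest-neighbor term, terminal term, measured increment), kcal/mol.
# Values are copied from the text; "Psi" stands for pseudouridine.
TERMINAL_PAIRS = [
    ("5'UC/3'AG, terminal U-A", -2.35, 0.45, -1.88),
    ("5'GU/3'CA, terminal U-A", -2.24, 0.45, -2.18),
    ("5'PsiC/3'AG, terminal Psi-A", -2.49, 0.31, -1.81),
    ("5'GPsi/3'CA, terminal Psi-A", -3.29, 0.31, -2.10),
]


def predicted_increment(nn_term, terminal_term):
    """Predicted increment as the sum of the two nearest-neighbor model terms."""
    return round(nn_term + terminal_term, 2)


def compare(pairs):
    """Return (label, predicted, measured, measured - predicted) for each motif."""
    rows = []
    for label, nn_term, terminal_term, measured in pairs:
        predicted = predicted_increment(nn_term, terminal_term)
        rows.append((label, predicted, measured, round(measured - predicted, 2)))
    return rows


if __name__ == "__main__":
    for label, predicted, measured, diff in compare(TERMINAL_PAIRS):
        print(f"{label}: predicted {predicted:+.2f}, "
              f"measured {measured:+.2f}, difference {diff:+.2f} kcal/mol")
```

Laid out this way, the 5′GΨ/3′CA motif stands out as the only one of the four where measured and predicted increments differ by close to 1 kcal/mol.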

Contribution of internal paired Ψ on thermodynamic stabilities of RNA duplexes
Somewhat larger effects were measured when U was replaced by Ψ in Ψ-A, Ψ-G, Ψ-U and Ψ-C pairs at internal positions (Table 3). In most cases, replacing U with Ψ clearly enhanced thermodynamic stabilities (i.e. gave more negative ΔG°37). Most enhancements in stabilities upon replacing U with Ψ to give internal Ψ-A, Ψ-G and Ψ-U pairs ranged from 0.3 to 0.8 kcal/mol (Table 3). The two outliers are the motifs 5′CΨG/3′GAC and 5′GΨC/3′CGG, where Ψ substitution for U enhances duplex stability by 2.43 and 1.40 kcal/mol, respectively. In these cases, flipping the Watson-Crick base pair orientation gives enhancements of 0.71 and ~0.4 kcal/mol, respectively, which are within the usual range of ΔG°37's. Presumably, the outliers reflect favorable stacking and hydrogen bonding interactions of Ψ relative to U. As shown below, there is no NMR evidence for hydrogen bonding involving the H1 imino proton of Ψ in the 5′GΨC/3′CGG duplex. The results suggest that 3D structures and computational studies of the outlier duplexes might be interesting because little is known about interactions favoring stacking.
Because nearest-neighbor parameters are available for U-A and Ψ-A pairs adjacent to Watson-Crick pairs, it is possible to compare the measured and predicted effects of replacing a middle U-A pair with Ψ-A (Table 3). In two cases, the difference between measured and predicted exceeds 1 kcal/mol. The measured versus predicted values are -2.43 versus -0.8 kcal/mol for the 5′CΨG/3′GAC central motif and -0.27 versus -3.5 kcal/mol for the 5′AΨU/3′UAA central motif. The 5′AΨU/3′UAA motif occurs in the duplex predicted least well by nearest-neighbor parameters (Table 1). It also contains the two Ψ-A nearest-neighbor parameters (19) that differ most from their U-A (24) equivalents, with Ψ enhancements of -1.70 and -1.81 kcal/mol.

NMR reveals at least one hydrogen bond of pseudouridine pairing with A, G or U
Pseudouridine is potentially more versatile in its hydrogen bonding than uridine because two imino protons are available to serve as hydrogen bond donors (Figure 1). NMR spectroscopy was used to compare hydrogen bonding in a central U-A pair with that with a central Ψ paired with A, C, G or U (Figure 2).

Table footnotes: Differences in measured free energies relative to analogous RNA duplexes with uridine (U) instead of pseudouridine (Ψ); values in parentheses are differences predicted on the basis of nearest-neighbor parameters (38) and (19); thermodynamic values used are from T_M⁻¹ plots. c: Two transitions were observed when the central pair was U-G, so the two-state approximation is not valid.

NMR of duplexes with a U-A or Ψ-A central pair
The 1D proton spectrum of the duplex containing a U5-A14 pair revealed seven imino proton resonances. The G and U imino protons were distinguished based on analysis of 15N chemical shifts in 1H-15N HSQC spectra and by characteristic NOE cross-peaks to C amino protons for G and to AH2 protons for U. Assignments were made with 2D NOESY spectra (data not shown), which revealed typical intra- and interstrand correlations. No imino resonances were detected for the terminal U's. Non-exchangeable protons were assigned with standard methods (44). The NOESY spectra recorded in D2O exhibited typical features of double-stranded sequential connectivities and are consistent with a duplex structure in which all nucleotides are involved in Watson-Crick base pairs.
The 1D proton spectrum of the Ψ-A duplex (Figure 2) shows eight major imino resonances and is similar to that of the U-A duplex except for the appearance of a new signal at 10.25 ppm. This high field resonance was assigned to the Ψ5 N1H imino proton and its chemical shift is in agreement with previous observations of Ψ-A base pairs (8,17,45). Assignments of Ψ5 N1H resonances throughout this work were based on the fact that they exhibit strong NOEs to upfield peaks assigned as Ψ5 CH6. The Ψ5 N3H imino proton is shifted upfield by ~0.19 ppm relative to the U5 N3H imino proton of the U-A duplex. The presence of NOE cross-peaks between Ψ5 N3H and G4 N1H and G13 N1H was helpful in their unambiguous assignments. The observed strong NOE cross correlation between Ψ5 N3H and the A14 H2 proton (data not shown) demonstrated that pseudouridine Ψ5 pairs with adenosine A14 in a Watson-Crick manner involving Ψ5 N3H-A14 N1 and Ψ5 O2-A14 N6H hydrogen bonds (Figure 1).
Spectra were similarly measured, assigned and interpreted for duplexes with a central Ψ-A in triplets 5′CΨG/3′GAC, 5′AΨU/3′UAA and 5′UΨA/3′AAU (Figure 2). In all cases, the Ψ-A pair is in a Watson-Crick conformation. The spectra for the 5′AΨU/3′UAA and 5′UΨA/3′AAU duplexes show eight resonances (this includes an overlap of U15 N3H and Ψ5 N3H for 5′AΨU/3′UAA), with the Ψ5 N1H resonance upfield between 10 and 11 ppm. The spectrum of the 5′CΨG/3′GAC duplex, however, has two resonances between 10 and 11 ppm, suggesting some conformational equilibrium involving Ψ5 N1H. The appearance and upfield chemical shift of the Ψ5 N1H is consistent with those protons being partially protected from exchange with water and potentially in a weak hydrogen bond.

NMR of duplexes with a Ψ-G, Ψ-U or Ψ-C central pair
The 1D proton spectrum of the Ψ-G duplex (Figure 2) shows eight imino proton resonances because the resonances of G4 and U12 are overlapped at 13.50 ppm. A strong imino-imino cross-peak was observed between resonances at 11.99 and 11.85 ppm and the chemical shifts were attributed to Ψ5 N3H and G14 N1H, respectively. This type of interaction is distinctive of wobble pair formation because of the short distance between the two imino protons (Figure 1).
The 1D proton spectrum (Figure 2) of the Ψ-U duplex shows eight imino proton resonances. The Ψ5 N3H imino proton was distinguished from U14 N3H by observation of an NOE cross-peak between the amino proton of C6 and Ψ5 N3H (data not shown). A strong NOE cross-peak between Ψ5 N3H and U14 N3H protons was observed, indicating formation of a Ψ-U base pair. Based on NOESY data alone, however, it was not possible to distinguish between the types of Ψ-U base pairs shown in Figure 1.
The proton spectrum of the Ψ-C duplex (Figure 2) shows only six imino proton resonances and all were assigned to Watson-Crick A-U and G-C pairs. No imino protons of Ψ were detected, presumably due to exchange with water. Evidently, Ψ and C do not hydrogen bond together. Nevertheless, a continuous set of H8/H6-H1′ NOE connectivities, including Ψ5 and C14, was determined for both strands, so both residues stack into the duplex. Additionally, the presence of aromatic-aromatic NOE cross-peaks between G4 H8/Ψ5 H6, Ψ5 H6/C6 H6, G13 H8/C14 H6 and C14 H6/C15 H6 supports inward orientation of Ψ5 and C14. The lack of hydrogen bonding in the Ψ-C pair could explain why the 5′GΨC/3′CCG duplex is the least stable of those with a central 5′GΨC/3′CNG motif.

CD spectra of RNA duplexes containing pseudouridine
The influence of Ψ residues on global helical structure was investigated by measuring CD spectra for all the duplexes listed in Table 1. Spectra at 10 °C for some of the duplexes with internal Ψ are shown in Figure 3 (see also Supporting Information). The spectra have shapes consistent with A-form RNA helices (46,47), although some shifting and changing of intensity of peaks, shifting of crossover points and appearance of some spectral shoulders were observed, consistent with literature reports (3,5,7,8,17). CD spectra of duplexes with U instead of Ψ exhibited similar variety. Evidently, internal substitution of Ψ for U has little effect on global conformation.
Surprisingly, substitution of Ψ for U as a 5′-dangling end diminished by ca. half the molar ellipticity of the positive peak at 266 nm and reduced even more the intensity of the negative peak at 209 nm (Supporting Information). The CD spectra with and without a 5′-dangling U were essentially identical. When Ψ or U was placed as a 3′-dangling end, the CD spectra were nearly identical. Less intense CD spectra were also observed when Ψ replaced the 5′ U in U-A, U-G and U-U pairs. All the spectra with 5′-terminal Ψ had ellipticity of ~1 or less near 265 nm. Most other spectra had ellipticities between 1 and 2 in this region. No duplexes with ellipticity >2 near 265 nm had internal or 5′-terminal Ψ. The CD spectra did not reveal any structural reasons for the thermodynamic results.

DISCUSSION
It has been shown that replacement of U with Ψ thermodynamically stabilizes RNA duplexes (3,8,9,18). Usually stronger stacking of Ψ in comparison to U was postulated as the major origin of enhancement of stability (3). Recently, Znosko's team quantified the effect for Ψ-A pairs relative to U-A pairs by determination of nearest-neighbor parameters for Ψ-A on the basis of optical melting of 24 duplexes (19). They found that on average at internal and terminal positions, substitution of Ψ-A for U-A enhanced duplex stability by 1.7 and 1.0 kcal/mol, respectively. As shown in Table 1, the nearest-neighbor parameters for Ψ-A and for Watson-Crick pairs predict within 1 kcal/mol the measured ΔG°37's for five of the six duplexes with Ψ-A pairs, when experimental error is considered. The exception has a central motif of 5′AΨU/3′UAA and is predicted to be more stable than measured by 3 kcal/mol at 37 °C. This duplex contains the two Ψ-A nearest neighbors with parameters that differ most from those of U-A, suggesting an unusual context dependence for stabilities of 5′AΨ/3′UA and 5′ΨA/3′AU nearest neighbors.
Tables 1-3 present for the first time thermodynamic effects of substituting Ψ for U to give Ψ-G, Ψ-U and Ψ-C pairs. Such pairs are often found in pseudouridylated RNA duplexes, particularly of spliceosomal snRNAs (13). Considering experimental error, substituting Ψ for U in these mispairs can enhance stability between ~0 and 1 kcal/mol, with an average of 0.5 kcal/mol (Tables 2 and 3). The largest stabilization after Ψ-A is for Ψ-G, whereas Ψ-C has little effect. For the six duplexes with Ψ-G, enhancements of stabilities relative to U-G duplexes range between 0 and 1.4 kcal/mol (Tables 2 and 3). These initial results imply that nearest-neighbor parameters for Ψ-G pairs will be dependent on the neighbor, as already known for Watson-Crick (24), Ψ-A (19) and U-G (43) pairs. NMR studies of duplexes with internal Ψ-A, Ψ-G and Ψ-U pairs revealed hydrogen bonding via the N3 imino proton. The spectra are consistent with Ψ and U hydrogen bonding to A, G and U in similar ways (Figure 2). Resonances are also seen for the ΨN1H proton in duplexes with a Ψ-A pair, suggesting ΨN1H may be involved in a weak hydrogen bond. In contrast, no imino proton resonances were observed for Ψ-C, implying a lack of hydrogen bonding. This correlates with Ψ-C duplexes being the least stable. CD spectra are consistent with expectations from NMR spectra in that the overall global conformation is A-form with or without Ψ.

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.