Challenges in RNA Regulation in Huntington’s Disease: Insights from Computational Studies

Abstract: Novel therapeutic approaches are being developed to tackle neurodegenerative diseases, owing to the limited efficacy of the known druggable targets. For Huntington’s disease, a promising approach is the regulation of the RNA product. This target would allow selective and effective inhibition of the toxic effects exerted by both the final nucleic acid product and the encoded protein. In this review, the current state of the art of RNA regulation is discussed, with a brief view of novel plausible targets. Next, successful computational and experimental approaches for modeling and regulating aberrant RNA behavior are presented. Finally, the applications and limitations of current computational methods are discussed, and possible avenues for improvement are outlined.


Introduction
Huntington's disease (HD) was first recognized as a distinct disease in 1872. [1] A century later, the identification of a marker on chromosome 4 led to the subsequent elucidation of an unstable CAG trinucleotide expansion located in exon 1 of the huntingtin gene (HTT) as the single cause of the disease. [2] The mutant mRNA transcripts are translated into an expanded glutamine tract (polyQ) in the mutant huntingtin protein (HTT). In turn, the polyQ tract is responsible for an increased aggregation propensity of mutant HTT, [3] which can interconvert between monomeric, oligomeric, and fibrillar forms in vitro and in vivo. [4] In addition to the HTT protein-dependent pathogenicity, a toxic gain of function of the mutant HTT mRNA has also been observed. [5,6] This is not unique to HD. During the last decade, it has become increasingly evident that abnormalities in the functionality of protein-coding and non-coding RNAs, as well as in RNA-binding protein function, represent a common feature among many neurodegenerative diseases. [7,8] Recent progress in next-generation sequencing has enabled researchers to study the most abundant transcripts, alternative splicing processes, and non-coding RNA molecules of nervous tissue, [9] thus furthering our knowledge of RNA pathways.
In this review, we describe the aberrant RNA behavior in HD and highlight recent computational advances in modeling RNA regulation in HD.

RNA Behavior in HD
The concept of "RNA toxicity" denotes the direct ability of the mutant transcript to induce pathogenesis. In particular, this effect has been described as the result of aberrant interactions between RNA and other proteins. The RNA can also induce toxicity indirectly by being translated into toxic proteins. [10] This is the case in HD, where RNA toxic effects coexist with the ones at the protein level, thus rendering them more challenging to assess and to understand.
Before we delve into the many ways RNA contributes to progressive neurodegenerative illnesses, we give a brief overview of RNAs and focus on those typically implicated in HD.

The Versatile Functional Roles of RNA
The human transcriptome consists of coding messenger RNA (mRNA) and multiple classes of non-coding RNAs (ncRNA), including ribosomal RNA, transfer RNA and a myriad of other types. [11] While about 90 % of the human genome is transcribed to RNA, only approximately 1.2 % is protein-coding, which suggests that ncRNA has functions in the regulation of physiological processes in all cell types. [12] mRNA is initially transcribed as pre-mRNA (Figure 1). Mature mRNA, ready for translation, is formed by splicing pre-mRNA, thereby removing intronic sequences. [13] Alternative splicing patterns result in different mature mRNAs originating from the same pre-mRNA. To export mature mRNA from the nucleus into the cytoplasm, a messenger ribonucleoprotein particle (mRNP) is formed. In the cytoplasm, the mRNP associates with the ribosomal machinery and, together with transfer RNA (tRNA), facilitates translation into the corresponding protein. During transcription, processing, transport, and eventual degradation, mRNA is handled not only by a host of RNA-binding proteins but also by several ncRNAs.
Besides defective mRNA, mutations in ribosomal RNA (rRNA) and tRNA play roles in mammalian brain development, neurological syndromes, and neurodegeneration, [14] which highlights the importance of RNA in biological processes.

MicroRNA Dysregulation in HD
MicroRNAs (miRNAs) are a class of small ncRNA (typically 20-30 nucleotides long) that have recently received renewed interest in the context of aging and neurodegeneration. [9] They regulate gene expression post-transcriptionally, e. g. by reducing fluctuations in protein expression, [15] and are crucial for neuronal function and disease. [16] For an in-depth description of miRNA biogenesis, the reader is referred to other works. [17][18][19][20] HD appears to affect both miRNA biogenesis and expression, since:
1. Reduced levels of the ribonucleases Drosha and Dicer were noted in mouse models of HD. [21]
2. Postmortem cortex samples of HD patients showed dysregulation of specific miRNAs, such as upregulation of miR-29a and miR-330 and downregulation of miR-132, affecting the activity of the RNA-induced silencing complex (RISC). [22,23]
Some works indicate that mutant HTT is also involved in the regulation of RNA processes. [24] All these aspects can be further affected by the presence of oxidative stress in HD, [25] which is tightly intertwined with the miRNA machinery and can further exacerbate disease progression. [26,27]

mRNA CAG Expansion as a Primary Driver in HD
Several lines of evidence demonstrate that RNA toxicity in HD is triggered by expanded CAG repeats in mRNA and not by other trinucleotide repeats that also encode glutamine (i. e. CAA). Experiments in Drosophila showed that CAG repeats interrupted with CAA manifest clearly less pronounced toxic effects. [28] A similar result was found in human neuronal cells. [29,30] Moreover, while both CAG and CAA encode glutamine, the secondary RNA structures linked to each of them differ. Whereas expanded CAG repeats produce a hairpin-like structure, CAA tracts or CAG stretches interrupted with CAA do not fold into a hairpin. [31] In addition, CAG hairpin structures were shown to bind to proteins in a length-dependent manner, [32] thereby contributing to HD pathogenicity.
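The difference between pure and CAA-interrupted tracts can be made concrete with a few lines of code. The following is a minimal sketch, with hypothetical function names and toy sequences chosen only for illustration: both tracts translate to the same polyQ stretch, but they differ in the longest uninterrupted CAG run, which is the quantity relevant at the RNA level.

```python
# Both codons encode glutamine (Q) in the standard genetic code.
GLN_CODONS = {"CAG": "Q", "CAA": "Q"}

def codons_of(rna: str):
    """Split an in-frame RNA tract into codons."""
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]

def translate_gln_tract(rna: str) -> str:
    """Translate a tract composed only of CAG/CAA codons."""
    return "".join(GLN_CODONS[c] for c in codons_of(rna))

def longest_cag_run(rna: str) -> int:
    """Number of repeats in the longest uninterrupted in-frame CAG run."""
    best = current = 0
    for codon in codons_of(rna):
        current = current + 1 if codon == "CAG" else 0
        best = max(best, current)
    return best

# Toy tracts (assumptions for illustration, not patient-derived sequences)
pure = "CAG" * 12
interrupted = "CAGCAGCAGCAA" * 3  # every fourth codon is CAA

same_protein = translate_gln_tract(pure) == translate_gln_tract(interrupted)
```

Here `same_protein` is true, while `longest_cag_run` returns 12 for the pure tract and 3 for the interrupted one.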
Indeed a feature of HD is the presence of nuclear RNA inclusions (nuclear foci) loaded with several proteins having affinity for the CAG mRNA hairpin. [33] The number of foci per nucleus correlates with CAG repeat expansion. [34] Presumably, RNA repeat inclusions are trapped in the nucleus as a consequence of RNA overloading with proteins. This, in turn, decreases the amount of available RNA-binding proteins while increasing the likelihood of alterations in splicing and expression of other mRNA. In this way, the expanded CAG repeats have effects on the subcellular localization of the mutant transcript, as well as in the binding dynamics of specific proteins.

RNA Binding Proteins Show Aberrant Interactions with Mutant HTT mRNA
In HTT mRNA with multiple CAG repeats, the formation of the hairpin-like structure leads to abnormal interactions with proteins, impeding their normal functions. Researchers probed the mutant HTT RNA interactome using mass spectrometry and found that dysregulated splicing is pivotal for RNA-induced neurotoxicity in HD. Accordingly, they found that the majority of proteins captured by mutant HTT RNA are part of the spliceosome pathway. [35] In this section, we discuss a few key RNA-protein interactions relevant to HD recently described in the literature.
1. MID1, when acting as an E3 ubiquitin ligase, catalyzes the degradation of target proteins such as PP2A, and also allows the activity of mTOR, a PP2A-opposing kinase. Thus, mTOR enhances the activity of S6K, a kinase translational regulator. In this way, the activation of MID1 is linked to an increase of translation. [32] In HD, the MID1 complex has been described to bind HTT mRNA in a length-dependent manner. [36]
2. Nuclear RNA foci co-localizing with muscleblind-like splicing regulator 1 (MBNL1) were found in fibroblasts of patients with HD and SCA3. [37] Furthermore, alternative splicing defects of MBNL1 target genes were detected in HD fibroblasts and other human cell lines expressing expanded CAG repeats, [37] suggesting a role for MBNL1 sequestration and alternative splicing defects in polyQ diseases.
3. Binding of serine/arginine-rich splicing factor 6 (SRSF6/SRp55) to the CAG repeats is suggested to lead to SRSF6 loss of function and subsequent splicing defects in specific targets, although this mechanism has not yet been completely clarified. [38,39]
4. Nucleolin, a protein of the nucleolus in control of rRNA transcription, shows impaired activity in HD, which in turn triggers apoptosis via the p53 pathway. [40] It was shown by Tsoi et al. [41,42] that CAG-repeat RNA induces nucleolar stress and is able to bind and sequester nucleolin, thus disturbing ribosome homeostasis.
The examples above illustrate that CAG mRNA significantly modifies several biological pathways. Thus, blocking these aberrant interactions can be pursued as a therapeutic strategy towards restoring normal protein function.
In the past years, two main strategies were followed to reach this aim: either directly targeting the CAG mRNA to hamper its aberrant interactions, or inhibiting RNA-binding proteins by directly targeting their RNA-binding motifs. In the following sections, we discuss advances and achievements following both strategies.

The CAG Hairpin: Insights into Structure, Stability, and Dynamics
Although the complete structural characterization of the HTT mRNA remains elusive, various RNA structural motifs have been identified for this macromolecule. Studies have consistently demonstrated that HTT mRNA transcripts containing CAG repeats can assume a stable secondary structure and fold into duplex helices and hairpins, [43,44] and that interruptions in the normal HTT gene destabilize the smaller hairpin structure, [45] suggesting a relevant role for the RNA structure towards inducing a toxic effect.
Within a normal CAG repeat number, in the HTT mRNA structure, a nearby CCG polymorphic trinucleotide repeat pairs with the polyCAG section, forming contiguous and small bulges with an A-C mismatch in a hairpin. However, for mutant HTT mRNA, the CAG region interacts both with the polyCCG section and with itself, forming a secondary bulge. [46] The number of repeats is linked to the evolution from the bulge into a proper secondary hairpin, the latter thus being the one associated with the toxicity of RNA in HD. [6] Notably, sections of the HTT mRNA covering duplex helices have been experimentally obtained, and the isolated sections have been analyzed via NMR and X-ray crystallography (Table 1).
The first crystal structure, reported by Kiliszek and coworkers [52] in 2010, described the purine-purine mismatch due to the A-A interaction for the sequence r(-GG(CAG)2CC-)2, with a weak hydrogen-bonding interaction between the two nucleobases. Purely electrostatic interactions were suggested to be mainly responsible for the affinity of RNA-binding proteins, such as MBNL1, towards the RNA. Later studies [50] also highlighted that the A-A mismatch widens the major groove, changing from a pure A-type RNA to a hybrid between A- and B-type RNA (with a C1'-C1' distance between complementary riboses of around 10.3 Å for C-G contacts, but 11.3 Å for A-A contacts). [50] This widening was suggested to increase the accessibility for binding proteins or small molecules. [50] Tawani and Kumar also analyzed the latter sequence via NMR and restrained molecular dynamics (MD) studies. [50] Unlike the previous report of Yildirim et al., [51] the internal CAG A-A pairs displayed the anti-anti conformation, but the external CAG A-A pairs displayed the anti-syn conformation (Figure 2B).
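To see where the A-A mismatches arise in such a model duplex, one can pair the self-complementary strand with an antiparallel copy of itself. The following minimal sketch (the function name is hypothetical, and G-U wobble pairs are deliberately ignored) lists every base pair of r(-GG(CAG)2CC-)2 and flags the non-canonical ones.

```python
# Canonical Watson-Crick pairs; G-U wobble pairs are ignored in this sketch.
WATSON_CRICK = {("C", "G"), ("G", "C"), ("A", "U"), ("U", "A")}

def antiparallel_pairs(strand: str):
    """Pair a strand (5'->3') with an antiparallel copy of itself.

    Returns tuples (position, base, partner, kind) with 1-based positions.
    """
    n = len(strand)
    pairs = []
    for i, base in enumerate(strand):
        partner = strand[n - 1 - i]
        kind = "WC" if (base, partner) in WATSON_CRICK else "mismatch"
        pairs.append((i + 1, base, partner, kind))
    return pairs

duplex = antiparallel_pairs("GGCAGCAGCC")  # one strand of r(-GG(CAG)2CC-)2
mismatches = [(pos, a, b) for pos, a, b, kind in duplex if kind == "mismatch"]
```

For this sequence, the only non-canonical contacts are the two A-A pairs at positions 4 and 7, each flanked by C-G and G-C pairs.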
Later, Pan et al. [53] suggested that spurious conformations could occur in the external CAG A-A pairs because they were flanked by CC or GG nucleotides, which were shown to be the least stable of the 10 dinucleotide steps. [54] Because these steps are never present in a genuine (CAG)n tract, their presence could bias the conformation of adjacent A-A pairs.
Figure 2. The CAG motif from experimental structures. The colors encode cytidine (orange), adenine (yellow) and guanosine (cyan). The CAG motif is shown as a triplet (A, PDB ID: 5VH7), [49] as a single motif (B, PDB ID: 2MS5), [50] and as a single motif in complex with myricetin (C, PDB ID: 5XI1). [48] The main conformations for the A-A mismatch are also highlighted (B, PDB ID: 4J50). [51]
Review Isr. J. Chem. 2020, 60, 681-693
It was found that anti-anti conformations, followed by anti-syn conformations, were energetically preferred by a very small energetic difference (~1 kcal/mol), which could explain all the previous findings. Transitions between the A-A conformations were also found to be frequent. They also suggested that anti-anti interactions are indeed connected with a wider major groove and a substantial decrease of the inclination angle, while anti-syn interactions favor the canonical A-RNA form. [53] Chen and coworkers [49] independently published a study with NMR and restrained MD, which indicated that one- and zero-hydrogen-bond states are the most frequent structures observed in the CAG A-A interactions, and that the A-A and U-U pairs induced distortions to the A-form RNA of their helices, therefore corroborating the previous results (Figure 2A).
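A quick estimate shows why an energy gap of about 1 kcal/mol is compatible with observing both conformations experimentally. The sketch below assumes an idealized two-state equilibrium at 300 K (both the two-state picture and the temperature are assumptions made here for illustration) and converts the gap into Boltzmann populations.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def two_state_populations(delta_g_kcal: float, temperature_k: float = 300.0):
    """Populations of a lower state and of a state delta_g_kcal above it."""
    weight = math.exp(-delta_g_kcal / (R_KCAL * temperature_k))
    p_low = 1.0 / (1.0 + weight)
    return p_low, 1.0 - p_low

# anti-anti taken as the lower state, anti-syn about 1 kcal/mol higher
p_anti_anti, p_anti_syn = two_state_populations(1.0)
```

Under these assumptions the populations are roughly 84 % and 16 %, so the minor conformation remains substantially populated and small changes in sequence context or experimental conditions could plausibly shift which one is observed.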

Small Molecules Binding to CAG Hairpins: Experiments and Computations
In this section, computational and experimental approaches regarding CAG-small molecule interactions are described. In this respect, molecular simulation has proven to be an increasingly powerful tool to predict the structures of RNA/ligand adducts. [55][56][57][58] In particular, all-atom molecular dynamics (MD) simulations reproduce rather accurately the structure, (sub-)microsecond conformational fluctuations and energetics of RNA oligos alone [59] and in complex with ligands. [60,61]

Probing the Druggability of CAG Triplet
Kumar and coworkers [62] probed small molecules binding to the CAG motif, specifically at the A-A mismatch, with a combination of virtual screening and chemical similarity searching, based on known nucleic acid binders. A small molecule, 4-guanidinophenyl-4-guanidinobenzoate (hereafter D6), was found to inhibit the CAG-MBNL1 complex in vivo, with D6 binding to the CAG hairpin with high affinity.
In their 2013 study, Yildirim et al. [51] also suggested that the anti-syn conformation is quite flexible, and showed a plausible binding pocket near A-A base pairs, thus reinforcing the viability of a binding pocket in the CAG motif.
The first structure experimentally confirming the interaction between the CAG motif and a small molecule was obtained by Khan and coworkers, [48] showing the interaction of r(-CCG(CAG)CGG-)2 with myricetin using NMR spectroscopy and restrained MD (Figure 2C). The structure shows the flavonoid located in the A-A mismatch region, binding as an intercalator. As a result, one of the adenines is forced towards the solvent. Further π-π stacking and hydrogen bond interactions stabilize the bound pose.
Studies by Mukherjee et al. [47] showed the binding of cyclic bis-naphthyridines to r(-G(CAG)2C-)2 by biophysical and X-ray studies. Interestingly, the results showed a ratio of two ligands per CAG triplet, where the ligands are located inside the RNA helix and mimic nucleobases. The ligands are located at the A-A mismatch, as in the flavonoid example, but in this case the neighboring guanosine and cytosine are flipped out, widening and kinking the helix.
Overall, in both experimentally resolved complexes the ligand is located at the A-A mismatch, but the induced binding conformations differ considerably. These results suggest a large plasticity of the CAG hairpin, which can be exploited when designing new small-molecule binders of CAG motifs.

Docking Compounds into CAG Triplet Pockets
The analysis of the atomic details of CAG RNA-ligand interactions was greatly advanced by the availability of the high-resolution structures of CAG RNA-ligand complexes discussed above. However, the experimental structure determination of RNA and its complexes is challenging and currently cannot be accomplished in a high-throughput manner. Therefore, the development and implementation of computer software for modeling RNA-ligand complexes is becoming a key technology in this field.

[...] to find bioactive compounds towards the CAG motif. After identifying the top candidates, docking studies suggested the respective binding poses and the docking binding strength.
Further testing analyzed the activity of these compounds towards alleviating the polyQ-mediated pathogenesis in HD cellular models and cells derived from HD patients.
In another study, [64] a pyridocoumarin family of derivatives was also tested against the CAG motif. Docking studies were also used to model the putative binding mode, and further ITC and fluorescence studies were performed.
Taken together, when modeling the interaction between a small molecule and RNA, docking should be taken as a complementary procedure or as an educated guess, due to the lack of proper force field parameterization (vide infra).

Molecular Dynamics Simulations and Free Energy Calculations for CAG Repeats Binding Ligands
Bochicchio and coworkers [65] used computational tools to predict the binding pose and affinity of D6 (vide supra) towards the CAG-repeat oligo r(-GG(CAG)2CC-)2. The free energy surface (FES) was calculated with MD by well-tempered metadynamics [66] as a function of three collective variables: the distance between the centers of mass of the RNA and D6 (dCM), the number of intermolecular hydrogen bonds (nHB), and the number of intermolecular hydrophobic contacts (nHC).
Although the starting pose was obtained with a docking procedure, the authors noticed that the initial pose changed significantly, indicating that the docked pose was far from the real minimum. The obtained in silico model suggested a non-intercalating binding mode for D6, with salt bridges between the guanidine tails of D6 and the phosphate backbone, and parallel (β ring) and T-shaped (α ring) π-stacking interactions with two adenines of the CAG repeats.
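The idea behind well-tempered metadynamics can be illustrated with a toy example. The sketch below is not the setup of the cited study: it uses a single collective variable instead of three, arbitrary energy units, and hypothetical parameter values. It only shows the two defining relations of the method in their standard form, namely that the height of each new Gaussian hill decays with the bias already deposited, w = w0 exp(-V(s)/(kB ΔT)), and that the free energy is estimated as F(s) = -((T + ΔT)/ΔT) V(s).

```python
import math

class WellTemperedBias1D:
    """Toy one-dimensional well-tempered metadynamics bias (illustrative)."""

    def __init__(self, w0, sigma, kT, bias_factor):
        self.w0 = w0              # initial hill height
        self.sigma = sigma        # Gaussian hill width
        self.kT = kT              # thermal energy, same units as w0
        self.gamma = bias_factor  # gamma = (T + dT) / T, must be > 1
        self.hills = []           # deposited hills as (center, height)

    def bias(self, s):
        """Accumulated bias potential V(s)."""
        return sum(h * math.exp(-((s - c) ** 2) / (2.0 * self.sigma ** 2))
                   for c, h in self.hills)

    def deposit(self, s):
        """Add a hill at s; its height decays with the bias already there."""
        # kB*dT equals kT * (gamma - 1)
        height = self.w0 * math.exp(-self.bias(s) / (self.kT * (self.gamma - 1.0)))
        self.hills.append((s, height))
        return height

    def free_energy(self, s):
        """Long-time estimate F(s) = -(gamma / (gamma - 1)) * V(s)."""
        return -self.gamma / (self.gamma - 1.0) * self.bias(s)

# Hypothetical parameters; depositing repeatedly at the same CV value
wt = WellTemperedBias1D(w0=1.0, sigma=0.1, kT=0.6, bias_factor=10.0)
heights = [wt.deposit(0.0) for _ in range(5)]
```

The successive `heights` shrink monotonically, which is what lets the bias converge smoothly instead of growing without bound as in standard metadynamics.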
Based on this information, Matthes et al. [67] identified a set of CAG-repeat binder candidates by in silico methods. One of those, Furamidine, decreases the protein level of HTT in an HD cell line model, demonstrating for the first time the activity of an RNA binder against mutant HTT protein in living cells. Free energy calculations were also performed on furamidine-CAG interaction, which agreed with experimental results. Moreover, furamidine reduces MID1 binding, as well as the binding of other RNA binding proteins, to HTT mRNA in vitro. [67] A comparison of the two binding poses is shown in Figure 3.
Up to now, the affinity of ligands binding to RNA CAG repeats has been measured for oligos of nonpathogenic length. [62,64] Knowledge of the binding poses of these ligands may help design new molecules, which eventually bind to CAG repeats with pathogenic length.

Modulating Protein-RNA Interactions
Coding and non-coding RNA are handled and processed by several RNA-binding proteins (RBPs), which often feature evolutionarily conserved RNA binding domains (RBDs), e. g. RNA recognition motifs (RRMs). [68] A recently compiled evolutionary-oriented database of RRMs (RRMdb) was put forward by Nowacka et al. [69] Also, high throughput RNA interactome capture (RIC) experiments performed by Hentze et al. have significantly expanded our knowledge of existing RBPs. [70] Interestingly, they found that many proteins detected in screens for RBPs lack a structured Pfam domain, while detected RBDs were enriched in intrinsically disordered regions (IDRs). This suggests that many RNA-protein interactions in RBPs occur via non-structured proteins or regions. [71]

In Silico Approaches to Detect RNA Binding Proteins
Encouragingly, experimental methods have been augmented with computational approaches to reliably predict RBPs. Such algorithms can be classified as either sequence- or template-based. The former primarily depend on evolutionary information and physicochemical similarity, e. g. RBPPred, [72] whereas template-based methods also incorporate structural information of RBPs, e. g. focusing on electrostatic complementarity. [73] In this regard, support vector machine (SVM) models were successfully used in both mentioned examples.
SVMs were also trained using large-scale mass spectrometry data, exploiting the strong tendency of RBPs to interact with each other, in an algorithm known as SONAR (Support vector machine Obtained from Neighborhood Associated RBPs), which showed good predictive ability. [74] SPOT-Seq-RNA represents an alternative template-based technique that predicts protein-RNA complex structures and their associated binding affinities [67] using a knowledge-based energy function, DRNA, derived from known protein-RNA complexes. [75,76]
Apart from RBPs, there is considerable interest in the computational development of algorithms that map the RNA-interacting surface of proteins to predict the RNA-binding residues that facilitate protein-RNA interactions. [77,78] Miao and Westhof reported a large-scale assessment of 19 web servers and 3 stand-alone programs on 41 datasets including more than 5000 proteins derived from 3D structures of protein-nucleic acid complexes. [79] An updated overview is presented by Wan et al. [80]
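As an illustration of what a sequence-based predictor starts from, the sketch below computes a few simple physicochemical descriptors of a protein sequence. It is not the feature set of RBPPred or of any other cited tool; the descriptor names and the toy sequence are assumptions made here. In a real pipeline, vectors of this kind (usually combined with evolutionary profiles) would be passed to a classifier such as an SVM.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def sequence_features(protein: str) -> dict:
    """Amino-acid composition plus simple charge descriptors."""
    protein = protein.upper()
    counts = Counter(protein)
    n = len(protein)
    features = {f"frac_{aa}": counts[aa] / n for aa in AMINO_ACIDS}
    # Lys/Arg carry positive charge, Asp/Glu negative charge at neutral pH
    features["frac_positive"] = (counts["K"] + counts["R"]) / n
    features["frac_negative"] = (counts["D"] + counts["E"]) / n
    features["net_charge_per_residue"] = (
        features["frac_positive"] - features["frac_negative"]
    )
    return features

feats = sequence_features("MKRRGSKKAE")  # hypothetical toy sequence
```

For this toy sequence, half of the residues are lysine or arginine, giving a net charge per residue of about +0.4, the kind of signal that electrostatics-oriented predictors aim to capture.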

Predicting RNA-protein Complexes
Experimental RBP prediction in combination with computational approaches will further our understanding of mutant HTT synthesis, regulation, modification, and degradation. In turn, these insights can guide the rational design of agents that hamper the interaction of mutant HTT RNA with HD-relevant RBPs, thus representing an intriguing therapeutic strategy.
In this respect, the development of RNA-protein docking algorithms has been hampered by many obstacles. Although researchers initially just extended algorithms for protein-protein docking to protein-nucleic acid complexes, the high flexibility often associated with RNA molecules rendered search algorithms using only static RNA structures questionable at best. Additionally, while small molecule-protein docking programs benefited from the ever-growing amount of structural protein data, the number of experimentally elucidated RNA-protein complexes still lags behind. [77] Several programs have been developed to mitigate this situation (see footnote 1).
Due to the growth of high-throughput experimental technologies, like ChIP-chip and ChIP-seq, many RNA-protein interactions have been revealed, which enables data-driven inference of RNA-protein associations. [91] Various machine learning methods have been employed for the prediction of RBP-RNA interactions. In this regard, methods based on SVMs such as BindN/BindN+, [92,93] the PiRaNhA web server, [94] or PSSMs-SVMs [95] have been developed. Other algorithms rely on decision trees, e. g. NAPS, [96] Naïve Bayes classifiers, e. g. RNABindR, [97] or random forests, e. g. PRBR. [98] Notable recent advancements include DisoRDPbind, a method to predict interactions of intrinsically disordered proteins with RNA, [99] and long short-term memory (LSTM) neural network approaches for generative models that construct single-stranded nucleic acids which target proteins with high affinity. [100]

Challenges and Opportunities of RNA Modeling
To date, only short segments of the pathogenic RNA have been characterized by structural studies, which leaves open important questions regarding the behaviour of the toxic RNA with more than four repeats, or its interaction with relevant binding proteins. For this reason, computational approaches in the form of molecular dynamics or molecular docking have been used for several years in many applications to successfully model RNA structure and dynamics.
Below, some of the current options for RNA modeling are discussed, along with current limitations and challenges.

RNA-small Molecules Docking Approaches
After the development of the first scoring function specific for RNA-ligand complexes, by Morley et al. [101] for their proprietary high-throughput docking program RiboDock, several novel methodologies tailored for ligand-RNA docking were proposed.
Among these, we can find: (i) Knowledge-based scoring functions for evaluating RNA-ligand complexes, like DrugScoreRNA [102] or LigandRNA, [103] both of which use the analysis of RNA-ligand contact distances to evaluate the stability of the complexes. (ii) Grid-based scoring functions, like the ones used in DOCK6 [56] or MORDOR, [104] which implement the non-bonded terms of AMBER and other molecular mechanics force fields.
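The principle behind knowledge-based scoring functions can be sketched with the generic inverse-Boltzmann relation E(r) = -kT ln(p_obs(r)/p_ref(r)), which converts contact-distance statistics into a distance-dependent pair potential. The code below is a minimal illustration of this general idea and not the actual formulation of DrugScoreRNA or LigandRNA; the bin counts, the reference state, and the pseudocount are hypothetical.

```python
import math

def pair_potential(observed, reference, kT=0.593, pseudocount=1e-6):
    """Inverse-Boltzmann potential per distance bin (kT in kcal/mol)."""
    total_obs = sum(observed)
    total_ref = sum(reference)
    energies = []
    for n_obs, n_ref in zip(observed, reference):
        # The pseudocount avoids log(0) for empty bins
        p_obs = n_obs / total_obs + pseudocount
        p_ref = n_ref / total_ref + pseudocount
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies

# Hypothetical contact counts for one atom-type pair in four distance bins
observed = [5, 40, 30, 25]
reference = [25, 25, 25, 25]  # uniform reference state (assumption)
energies = pair_potential(observed, reference)
best_bin = energies.index(min(energies))
```

Distances seen more often than expected receive a favorable (negative) energy, rarely seen distances are penalized, and the score of a docked pose is then obtained by summing such terms over all RNA-ligand atom pairs.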

Atomistic Force Fields
Atomistic molecular dynamics (MD) simulation is a powerful tool for characterizing, at the atomistic level, the conformational changes undergone by proteins. The application of such tools to RNA structures, however, has proven more challenging, mostly because the physical models (force fields) available for MD simulations of RNA molecules are considered substantially less accurate in many aspects than those currently available for proteins. [105] The stability of nucleic acids depends on a delicate balance of electrostatic and van der Waals (vdW) forces, which has not been properly treated in current force fields. Commonly mentioned problems [105][106][107][108][109] are the following:
- torsional parameters on the RNA backbone and glycosidic bonds.
- overstabilization of nucleobase stacking by electrostatic and vdW interactions.
- underestimation of base-pairing strength, which can lead to a destabilization of the proper native state.
- excessive stabilization of the unfolded single-stranded RNA ensemble by intramolecular base-phosphate and sugar-phosphate interactions.
- conformational sampling of the 2'-hydroxyl group on the sugar.
- improper handling of charged species, such as phosphate groups and polyvalent ions.
Footnote 1: Currently, popular programs and web servers include RosettaDock server, [81] 3D-Garden, [82] HADDOCK, [83] HEX server, [84] SwarmDock, [85] ZDOCK server, [86] ATTRACT, [87] pyDOCKSAXS, [88] InterEvDOCK, [89] and HDOCK. [90]
The AMBER force field [107,110] is currently the most widely used force field for MD simulations of RNA systems. Initial efforts to improve the accuracy of the AMBER RNA force field have aimed largely at improving the description of canonical double-helical RNA structure, by refining backbone and glycosidic torsion parameters on the basis of quantum mechanical (QM) calculations [107][111][112][113][114][115] or experimental data. [116] Also, van der Waals parameters were optimized to avoid the overstabilization of nucleobase stacking and the underestimation of base-pairing strength. [117][118][119] This improved the modeling of RNA tetraloops, but did not improve the accuracy for other RNA systems, like single-stranded RNA.
Recently, the D. E. Shaw group further modified the electrostatic, vdW, and torsional parameters of the AMBER ff14 RNA force field [110] by using a combination of ab initio and empirical methods. The new RNA force field showed a more accurate reproduction of the energetics of nucleobase stacking, base pairing, and key torsional conformers for extended and structured RNA systems, including short and long single-stranded RNAs, RNA duplexes, tetraloops, and riboswitches. [105] Kührová et al. [106] suggested a different approach, by selectively fine-tuning H-bonds. Their work on the gHBfix potential highlights that conventional reparameterization of dihedral potentials or non-bonded terms can lead to major undesired side effects, and that the addition of these extra terms improves the force field performance while avoiding the introduction of major new imbalances.
Force fields usually consist of several empirical energy terms, including short-range bonded interactions and non-bonded interactions such as dispersion and electrostatics. In particular, non-polarizable force fields use fixed point charges to represent electrostatic interactions. The main limitation of these force fields is the absence of polarization, i. e., the response of the charge distribution to the environment. This is particularly relevant when dealing with small molecules interacting with highly charged biomolecules, like RNA. Also, atom-centered point charge models differ markedly from realistic charge distributions, which are usually smooth and anisotropic. Moreover, these models cannot reproduce charge penetration effects that occur when atomic electron clouds overlap. [120,121] Such effects are critical for determining the equilibrium geometry and energy of molecular complexes. [122][123][124] To address these shortcomings, some strategies have been followed: (i) Charge anisotropy can be reproduced by the adoption of higher-order multipolar electrostatics models or the addition of off-center charge sites. [125][126][127] In this regard, atomic multipoles truncated at the quadrupole term have been shown to model phenomena such as σ-holes, lone pairs, and π-bonding. (ii) Charge penetration can be modeled as a softening of the electrostatic interaction at short range, usually with the aid of empirically determined damping functions. [128] Charge penetration models can be combined with the above anisotropic electrostatic model, by placing charge densities on bonds or lone pairs, [129] or by using damping functions for atomic multipoles. [128]
Advances in incorporating electron polarization in RNA simulations are represented by the polarizable Drude force field for RNA [108] (an extension of the Drude-2017 force field), [130] or the AMOEBA force field for nucleic acids. [131] It is worth mentioning that although the accuracy of the description of the biomolecular system often increases, the computational cost also rises. In this respect, both Drude and AMOEBA have their own OpenMM implementation, which enables GPU computing, [132][133][134] and AMOEBA also has a Tinker-HP version for parallel computing. [135] Overall, atomistic force fields are suitable for modeling short RNA-paired sequences, RNA-small molecule interactions, and RNA-protein interactions.
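The difference between a fixed point-charge description and a short-range softened one can be sketched numerically. In the example below, the Coulomb term is the standard expression used by non-polarizable force fields, whereas the damping factor (1 - exp(-αr)) and the value of α are illustrative assumptions made here to show the shape of a damping function; they are not the functional forms of the charge penetration models cited above.

```python
import math

COULOMB_CONST = 332.0637  # kcal*Angstrom/(mol*e^2)

def coulomb_point(q1: float, q2: float, r: float) -> float:
    """Fixed point-charge Coulomb energy in kcal/mol (r in Angstrom)."""
    return COULOMB_CONST * q1 * q2 / r

def coulomb_damped(q1: float, q2: float, r: float, alpha: float = 2.0) -> float:
    """Point-charge energy softened at short range (illustrative damping)."""
    return coulomb_point(q1, q2, r) * (1.0 - math.exp(-alpha * r))

# Hypothetical charges: a phosphate-like (-1) and a guanidinium-like (+1) site
distances = [0.5, 1.0, 2.0, 4.0, 8.0]
point = [coulomb_point(-1.0, 1.0, r) for r in distances]
damped = [coulomb_damped(-1.0, 1.0, r) for r in distances]
```

At long range the two curves coincide, while at short range the damped interaction stays finite instead of diverging; this is the regime where point-charge models are least reliable for ligands in close contact with the phosphate backbone.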

Coarse-grained Force Fields
Coarse-grained force fields have gained attention as potential functions that can sample the conformational space of larger molecules in a reduced computational time, at the cost of a less detailed atomic description. In these models, several atoms are grouped into a single bead (pseudo-atom), which reduces the total number of particles in the system.
These potentials make it possible to study biologically interesting processes such as the folding of RNA structures, as in the TAR hairpin, [136] which usually develop over very long time scales, where atomistic force fields would require prohibitive amounts of computational power.
Several coarse-grained force fields exist, ranging from very coarse definitions that use 1 (NAST) [137] or 3 [138,139] pseudo-atoms per nucleotide to more detailed models that use 5 to 7 pseudo-atoms per nucleotide (RACER, [136] SimRNA, [140] HiRE-RNA, [141] MARTINI [142]). In general, work on the more detailed CG models agrees that higher resolution in the backbone or in the nucleobases improves the overall behavior of the simulations.
However, major issues for coarse-grained RNA force fields are the lack of directionality in key molecular interactions such as hydrogen bonds and the lack of an explicit description of non-canonical base pairs, both of which are critical for RNA structure. [143]
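The bead mapping described in this section can be sketched in a few lines. The example below assumes a three-bead-per-nucleotide scheme (phosphate, sugar, base) with each bead placed at the centre of mass of its heavy atoms. The atom-to-bead assignment is hypothetical and does not reproduce the mapping of any of the published models cited above.

```python
from collections import defaultdict

# Approximate atomic masses (u) of the heavy elements found in RNA.
MASS = {"C": 12.011, "N": 14.007, "O": 15.999, "P": 30.974}

# Hypothetical three-bead mapping, chosen for illustration only.
PHOSPHATE_ATOMS = {"P", "OP1", "OP2", "O5'", "O3'"}
SUGAR_ATOMS = {"C1'", "C2'", "C3'", "C4'", "C5'", "O2'", "O4'"}


def bead_of(atom_name):
    """Assign a PDB-style heavy-atom name to one of three beads."""
    if atom_name in PHOSPHATE_ATOMS:
        return "phosphate"
    if atom_name in SUGAR_ATOMS:
        return "sugar"
    return "base"  # every remaining heavy atom belongs to the nucleobase


def coarse_grain(atoms):
    """Map one nucleotide onto beads located at centres of mass.

    `atoms` is an iterable of (atom_name, (x, y, z)) for heavy atoms only.
    Returns a dict {bead_name: (x, y, z)}.
    """
    total_mass = defaultdict(float)
    weighted_sum = defaultdict(lambda: [0.0, 0.0, 0.0])
    for name, xyz in atoms:
        mass = MASS[name[0]]  # element symbol is the first letter of the name
        bead = bead_of(name)
        total_mass[bead] += mass
        for axis in range(3):
            weighted_sum[bead][axis] += mass * xyz[axis]
    return {
        bead: tuple(component / total_mass[bead] for component in weighted_sum[bead])
        for bead in total_mass
    }


if __name__ == "__main__":
    # Made-up coordinates for a fragment of one nucleotide.
    fragment = [
        ("P", (0.0, 0.0, 0.0)),
        ("OP1", (1.0, 0.0, 0.0)),
        ("C1'", (0.0, 2.0, 0.0)),
        ("N1", (0.0, 0.0, 3.0)),
        ("C2", (0.0, 0.0, 5.0)),
    ]
    for bead, position in coarse_grain(fragment).items():
        print(bead, position)
```

Here five atoms collapse into three interaction sites. Finer models in the 5 to 7 bead range would split the backbone or the base into additional groups, at the cost of more particles.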

Current Therapeutic Approaches and Perspectives
Here we give a brief overview of the therapeutics that are currently approved or in clinical trials for HD, as well as their biological rationale. [144,145]

Small Molecule Approaches
Some small molecules targeting excitotoxicity, such as Tetrabenazine, Deutetrabenazine, and Memantine, have shown efficacy in humans. [146] Another strategy is to hamper HTT proteolysis by inhibiting caspase-1 and -3 mRNA upregulation, e.g., with Minocycline. [147] Since mitochondrial dysfunction plays a major role already at an early stage of HD pathogenesis, [148] researchers have investigated Meclizine and Cystamine, with success in mouse and fly models. [149,150] Furthermore, targeting transcriptional dysregulation with histone deacetylase inhibitors, e.g., sodium phenylbutyrate or suberoylanilide hydroxamic acid, proved effective in mouse models. [151,152] Also, as shown by Matthes et al. [67] and Kumar and coworkers, [62] it is possible to decrease mRNA toxicity, as well as hamper its aberrant binding to proteins, by directly targeting the CAG hairpin in the HTT mRNA with small molecules such as D6 or Furamidine.

Small Biologics
Yamamoto et al. used a conditional mouse model to show the potential of silencing mutant HTT expression as a putative treatment option for HD. [153] Moreover, they demonstrated that a continuous influx of mutant protein is essential to maintain inclusions and symptoms. Thus, silencing the mutant HTT gene is an interesting therapeutic approach.
In general, RNA-targeted treatments in HD are based on RNA interference (RNAi), antisense oligonucleotides (ASO), or small-molecule splicing modulators.
RNA interference. RNAi relies on either duplex RNAs (dsRNAs) or chemically modified single-stranded RNAs (ssRNAs), [154][155][156] which are introduced into the body for robust and sustained suppression of the target gene product. [157,158] Since the miRNAs used for this purpose are known to produce only a partial knockdown in the transduced region, further elucidation of the precise mechanism behind miRNA-based HTT silencing in the brain with computational biology is an intriguing line of research. [159][160][161]
Antisense Oligonucleotides (ASO). These single-stranded artificial DNA molecules bind to their target RNA and either trigger its degradation, interfere with ribosomal attachment, or modulate RNA splicing. [144,162] However, they have some disadvantages, such as complicated administration procedures. [163] HD patients with two mutant alleles are rare, and limited data suggest a similar age of disease onset but a more severe clinical progression, [164] while HTT has been associated with neuroprotective effects. [165] Therefore, allele-selective ASOs, which target only mutant HTT, were investigated by Carroll et al. Their ASOs exploit single-nucleotide polymorphisms (SNPs) to increase selectivity for the mutant allele. [166] They further showed that therapies targeting as few as three SNPs may benefit 85 % of patients with HD of northern European and indigenous South American descent.
On the other hand, Tominersen (formerly known as RG6042 and IONIS-HTTRx) is a non-allele-selective ASO consisting of a synthetic 20-nucleotide DNA-like sequence, in which oxygen-to-sulphur backbone substitutions and 2'-O-methoxyethyl modifications improve efficacy and the pharmacokinetic profile. [167] Tominersen was the first agent to be assessed in clinical trials, and researchers estimated a 55-85 % reduction in cortical mHTT and a 20-50 % reduction in caudate mHTT, while no dose-limiting toxicity was noted. [144,168]
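The base-pairing logic behind ASOs, and the way a single SNP can distinguish two alleles, can be illustrated with a short sketch. The 20-nucleotide sequences below are invented for illustration and are not taken from HTT or from any ASO discussed here. Chemical modifications, secondary structure, and binding thermodynamics, which all matter in practice, are ignored.

```python
# Watson-Crick partner (as a DNA base) of each RNA base.
DNA_PARTNER_OF_RNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_rna):
    """Return the DNA strand (5'->3') fully complementary to an RNA target (5'->3')."""
    return "".join(DNA_PARTNER_OF_RNA[base] for base in reversed(target_rna.upper()))


def count_mismatches(aso, target_rna):
    """Count positions where the ASO does not Watson-Crick pair with the target."""
    perfect_match = antisense_dna(target_rna)
    if len(aso) != len(perfect_match):
        raise ValueError("ASO and target site must have the same length")
    return sum(a != b for a, b in zip(aso.upper(), perfect_match))


if __name__ == "__main__":
    # Hypothetical target sites that differ at a single SNP position.
    mutant_site = "ACGUACGGAUCCAGUUGCAU"
    wild_type_site = "ACGUACGGAUUCAGUUGCAU"

    aso = antisense_dna(mutant_site)  # designed against the mutant allele
    print("ASO:", aso)
    print("mismatches vs mutant allele:   ", count_mismatches(aso, mutant_site))
    print("mismatches vs wild-type allele:", count_mismatches(aso, wild_type_site))
```

In this toy picture, an ASO designed against the SNP-containing mutant site pairs perfectly with the mutant transcript and carries one mismatch against the other allele, whereas a non-allele-selective ASO would target a site shared by both alleles.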

Outlook
HD is caused by CAG repeat expansions in the HTT gene, which encodes the huntingtin protein. Although no disease-curing or disease-slowing treatment currently exists for HD, the development of therapies that target HTT transcription and the translation of its mRNA is of great interest and is currently under intense investigation.
Here we have reviewed the current knowledge of RNA pathogenicity in HD and how it has been exploited, with both experimental and computational approaches, to propose novel therapeutic strategies. So far, three main approaches to reducing HTT mRNA toxicity have emerged: ASOs, RNAi, and small molecules that either act as splicing modulators or directly target the HTT mRNA, specifically the toxic CAG hairpin.
Clinical trials using ASOs targeting HTT mRNA are currently underway, and future ASO trials are planned, including agents that seek to lower mutant HTT selectively. On the other hand, the absence of human trials for RNAi reflects the distribution and delivery challenges associated with this approach: RNA does not distribute well throughout brain tissue, so stereotactic surgery is needed to deliver the agent via a viral vector, at the cost of increased invasiveness and risks associated with long-term toxicity.
For these reasons, small-molecule RNA-targeting compounds are a viable strategy: a brain-penetrant, orally bioavailable small molecule that acts on protein production to lower HTT expression, or on the toxic CAG hairpin to reduce its aberrant ability to recruit proteins, is appealing, particularly in terms of ease of delivery.
Computational methods are becoming key tools for advancing our understanding of biomolecular interactions and for developing novel, potent modulators of biological pathways. Overall, some advances have been made towards inhibiting protein-RNA interactions with small molecules, peptides, and nucleic acids. However, more research is required to improve the quality of structural predictions made with force fields or statistical learning tools.
We expect that, as the development of new modulators accelerates, novel data-intensive tools from the machine learning field will contribute further, and that the resulting models will benefit both computational and experimental research in tackling these challenging neurodegenerative diseases.