Journal of Molecular Biology
Simulating RNA Folding Kinetics on Approximated Energy Landscapes
Introduction
Ribonucleic acid performs diverse and important functions in the cell. Messenger RNAs, transfer RNAs and ribosomal RNAs function as integral components of the protein synthesis machinery whose workings can be understood at nearly atomic detail,1 while other RNAs are involved in the mRNA maturation as part of the spliceosome.2 Recent work has highlighted the ability of noncoding and plant microRNAs to base-pair with complementary regions of mRNAs that target them for degradation.3 Another class of folded RNA domains, termed riboswitches, are found in the 5′ untranslated regions of mRNAs and function as molecular switches by binding a specific metabolite that changes the structure of the mRNA; this, in turn, controls transcription termination/anti-termination or translation initiation.4 Detailed structural and physicochemical studies of metabolite-sensing riboswitches have led to the hypothesis that the function of these domains may be kinetically controlled, with regulation of the flavin mononucleotide riboswitch, in particular, controlled by the relative rates of flavin mononucleotide binding and RNA transcription under the prevailing intracellular concentrations of metabolite.5 Thus, this ligand-controlled conformational switch may operate under kinetic control, with the relative stabilities of the two, often mutually exclusive, conformational states (with and without ligand) perhaps less important.
Earlier work on the control of Escherichia coli ColE1 plasmid copy number and translation initiation in bacterial and bacteriophage mRNAs hinted strongly that the kinetics of folding might be capable of dictating biological function. For example, accelerating the refolding rate of RNA II can increase the ColE1 plasmid copy number.6, 7 In the control of translation initiation, mutations of IS10 transposase mRNA that slowed the folding kinetics of structure formation increased the rate of ribosome binding resulting in a higher expression of IS10 transposase.8 For the bacteriophage MS2 maturation protein, the mRNA is efficiently translated only when the region around the Shine–Dalgarno sequence, which base-pairs with the 16S ribosomal RNA, is found in an unpaired or “open” conformational state. Since it is closed (paired) in the native state, this can only happen before folding finishes. The longer the RNA remains in the open “metastable” state, the higher the gene expression rate. Recently coined RNA thermometers associated with the heat-shock response in α- and γ-proteobacteria are thought to function in the same way, by controlling translation initiation via a temperature-dependent conformational change around the Shine–Dalgarno sequence.9, 10 These few examples highlight the importance of developing a robust computational method that can be used both to study the global properties of RNA folding and to provide more detailed insights of the kinetics of folding RNA substructures as well as the rates of interconversion between these substructures as a function of temperature.
A number of computational approaches have been developed to investigate the kinetics of RNA folding. For instance, folding pathways have been identified using both Monte Carlo (MC)-based algorithms11, 12, 13 and genetic algorithms.14, 15, 16 Other approaches utilize dynamic programming to calculate the partition funtion Q = ∑sexp(−ΔG(s)/kT) over all secondary structures s to gain insight into RNA folding kinetics.17 The ViennaRNA package implements McCaskill's algorithm in addition to an expanded set of nearest-neighbor free energies to account for noncanonical pairings.18 Ding and Lawrence extended this algorithm to generate statistical samplings of RNA structures based on the partition function.19 Some methods incorporate computation of the RNA folding energy landscape. Chen and Dill used matrices to compute the partition function over all possible structures in order to approximate the complete folding landscape.20 Wuchty modified Zuker's algorithm to generate all secondary structures within some given free energy of the native structure,21 and Clote developed an algorithm to generate all secondary structures below a threshold with respect to the Nussinov–Jacobson energy model.22 Flamm11 and Wolfinger et al.23, 24 extended this algorithm to identify local minima within some energy threshold of the native state and connect them via energy barriers, with the resulting energy barrier tree representing the energy landscape. In order to calculate the energy barrier, these authors used a flooding algorithm that is exponential in the size of the RNA; it is therefore impractical for large RNAs. Waldispühl and Clote created an algorithm to generate all saturated secondary structures having k base pairs.25 Statistical mechanical methods have also been used to study RNA folding kinetics. For example, the master equation has been used to compute the population kinetics of the folding landscape, with a matrix of differential equations used to calculate the probability of a transition between different conformations.26, 27, 28 Once solved, the dominant modes of the solution describe the general folding kinetics.29 In addition, one can also approximate the folding kinetics by performing a statistical analysis on an ensemble of MC folding pathways.30, 31, 32
In this work, we present a novel suite of computational tools that can be used to approximate the folding energy landscape and extract both global properties and detailed features of the folding process. The key advantage of our approach over the other computational techniques discussed above is that it is fast and efficient while providing both macroscopic and microscopic properties of the folding kinetics, e.g., population kinetics and low-resolution folding details. Our method first builds a map, or model, of the RNA folding energy landscape. We then adapt standard energy landscape analysis tools to analyze the energy landscape represented by our map; the solution of the master equation on our map (map-based master equation or MME) allows us to investigate properties of the ensemble during the time course of folding. We also present another technique, termed a map-based Monte Carlo (MMC) simulation, to extract microscopic folding pathways stochastically from our maps. These tools allow us to investigate the formation rates of transition states and therefore provide a comprehensive picture of the rates at which folding intermediates are formed and lost, as well as the final equilibrium distribution of folded conformers. A new statistical sampling method ensures that our method scales well and is applicable to large RNAs containing hundreds of nucleotides. Finally, we validate our method against traditional MC simulation methods that require a complete energy landscape, as well as against existing experimental data in two systems.
Section snippets
Summary of the method
Our work provides computational tools to approximate the RNA folding energy landscape and extract global properties and detailed features of the folding process. The key advantage of our approach over other computational techniques is that it is fast and efficient while bridging the gap between high-level folding events and low-level folding details. Our method builds an approximate representation (called a map) of the RNA's folding energy landscape and then uses specialized analysis techniques
Conclusions
We present a new model and new map-based analysis tools that can be used to study RNA folding kinetics and provide specific insights into both local and global features of the process. These new tools enable us to study larger RNAs than before, up to hundreds of nucleotides. The method is validated against known experimental data for two classic cases in detail. We anticipate that our method will be a valuable tool for discovering such relationships for other RNAs that have not yet been
General computational methods
Our method first constructs a map that approximates the energy landscape and then uses a number of map-based tools to analyze the approximated energy landscape. In our current implementation, we are able to generate three types of maps based on complete BPE, SPE and PBS. While a BPE map describes the complete energy landscape, it is not feasible for large RNA, e.g., more than approximately 40 nucleotides. SPE maps are a subset of BPE maps and are 1 or 2 orders of magnitude smaller but are still
Acknowledgements
We acknowledge grants from the NSF (EIA-0103742, ACR-0081510, ACR-0113971, CCR-0113974, ACI-0326350, CRI-0551685 to N.M.A.), the NIH (AI040187 to D.P.G.), the Department of Energy and the Hewlett-Packard Foundation (to N.M.A.). Ms. Thomas was supported in part by an NSF Graduate Research Fellowship, a PEO scholarship, a Department of Education Graduate Fellowship (GAANN), and an IBM TJ Watson PhD Fellowship. Ms. Tapia was supported in part by an NIH Molecular Biophysics Training Grant (T32
References (40)
- et al.
The ribosome in focus: new structures bring new insights
Trends Biochem. Sci.
(2007) The spliceosome: caught in a web of shifting interactions
Curr. Opin. Struct. Biol.
(2007)MicroRNAs: genomics, biogenesis, mechanism, and function
Cell
(2004)Riboswitches and the role of noncoding RNAs in bacterial metabolic control
Curr. Opin. Chem. Biol.
(2005)- et al.
The speed of RNA transcription and metabolite binding kinetics operate an FMN riboswitch
Mol. Cell
(2005) - et al.
The computer simulation of RNA folding pathways using a genetic algorithm
J. Mol. Biol.
(1995) - et al.
RNA folding pathway functional intermediates: their prediction and analysis
J. Mol. Biol.
(2001) - et al.
Biphasic folding kinetics of RNA pseudoknots and telomerase RNA activity
J. Mol. Biol.
(2007) - et al.
Self-induced structural switches in RNA
Biochimie
(2002) - et al.
RNA secondary structure and their prediction
Bull. Math. Biol.
(1984)
The influence of a metastable structure in plasmid primer RNA on antisense RNA binding kinetics
Nucleic Acids Res.
RNA structure and the regulation of gene expression
Plant Mol. Biol.
Control of translation by mRNA secondary structure: the importance of the kinetics of structure formation
Mol. Microbiol.
RNA thermometers
FEMS Microbiol. Rev.
FourU: a novel type of RNA thermometer in Salmonella
Mol. Microbiol.
RNA Folding at Elementary Step Resolution
RNA
RNA secondary structure: physical and computational aspects
Q. Rev. Biophys.
Prediction and statistics of pseudoknots in RNA structures using exactly clustered stochastic simulations
Proc. Natl Acad. Sci. USA
The massively parallel genetic algorithm for RNA folding: MIMD implementation and population variation
Bioinformatics
The equilibrium partition function and base pair binding probabilities for RNA secondary structure
Biopolymers
Cited by (50)
Cotranscriptional folding of RNA pseudoknots with different rates
2021, Chemical Physics LettersCitation Excerpt :Denesyuk and Thirumalai used coarse-grained models to study RNA folding [14], which can not deal with complicated RNA. Several computational methods based on coarse-grained kinetic model was developed [15,16]. This method can effectively reduce the number of conformations.
Efficient computation of co-transcriptional RNA-ligand interaction dynamics
2018, MethodsCitation Excerpt :Since barriers relies on an exhaustive enumeration of low energy structures, the computational effort grows exponentially with sequence size, which limits the length of RNAs that can be studied to a little over 100 nt. Recently, a number of heuristic approaches have been reported that attempt to raise this limit based on flooding techniques [34,24] or sampling of local minima [27,28,20,21]. RNA-ligand interactions are crucial for the functioning of cis-acting ribo-regulators, but are outside the conventional RNA secondary structure model.
Predicting RNA secondary structures from sequence and probing data
2016, MethodsCitation Excerpt :Since barriers relies on an exhaustive enumeration of low energy structures, it is in practice applicable only to RNAs of less than 100 nt. Recently, a number of heuristic approaches have been reported that attempt to raise this limit based on flooding techniques [134,71] or sampling of local minima [113,114,56]. Experimental technologies to elucidate RNA structure by means of chemical and enzymatic probing were established long before the first computational approaches toward RNA structure prediction have become available [105].
Review of machine learning methods for RNA secondary structure prediction
2021, PLoS Computational BiologyProbing the RNA Structure-Dependent RNA Regulations and Functions
2023, RNA Technologies
- †
This work was completed while he was a Ph.D. student in the Department of Computer Science at Texas A&M University.