Skip to main content

Faster Algorithms for RNA-Folding Using the Four-Russians Method

  • Conference paper
Algorithms in Bioinformatics (WABI 2013)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 8126))

Included in the following conference series:

Abstract

The secondary structure that maximizes the number of non-crossing matchings between complimentary bases of an RNA sequence of length n can be computed in O(n 3) time using Nussinov’s dynamic programming algorithm. The Four-Russians method is a technique that will reduce the running time for certain dynamic programming algorithms by a multiplicative factor after a preprocessing step where solutions to all smaller subproblems of a fixed size are exhaustively enumerated and solved. Frid and Gusfield designed an \(O(\frac{n^3}{\log n})\) algorithm for RNA folding using the Four-Russians technique. In their algorithm the preprocessing is interleaved with the algorithm computation.

We simplify the algorithm and the analysis by doing the preprocessing once prior to the algorithm computation. We call this the two-vector method. We also show variants where instead of exhaustive preprocessing, we only solve the subproblems encountered in the main algorithm once and memoize the results. We give a simple proof of correctness and explore the practical advantages over the earlier method.

The Nussinov algorithm admits an O(n 2) time parallel algorithm. We show a parallel algorithm using the two-vector idea that improves the time bound to \(O(\frac{n^2}{\log n})\).

We have implemented the parallel algorithm on graphics processing units using the CUDA platform. We discuss the organization of the data structures to exploit coalesced memory access for fast running times. The ideas to organize the data structures also help in improving the running time of the serial algorithms. For sequences of length up to 6000 bases the parallel algorithm takes only about 2.5 seconds and the two-vector serial method takes about 57 seconds on a desktop and 15 seconds on a server. Among the serial algorithms, the two-vector and memoized versions are faster than the Frid-Gusfield algorithm by a factor of 3, and are faster than Nussinov by up to a factor of 20.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Akutsu, T.: Approximation and exact algorithms for RNA secondary structure prediction and recognition of stochastic context-free languages. J. Comb. Optim. 3(2-3), 321–336 (1999)

    Article  MathSciNet  MATH  Google Scholar 

  2. Andronescu, M., Condon, A., Hoos, H.H., Mathews, D.H., Murphy, K.P.: Efficient parameter estimation for RNA secondary structure prediction. Bioinformatics 23(13), i19–i28 (2007)

    Google Scholar 

  3. Andronescu, M., Condon, A., Hoos, H.H., Mathews, D.H., Murphy, K.P.: Computational approaches for RNA energy parameter estimation. RNA 16(12), 2304–2318 (2010)

    Article  Google Scholar 

  4. Arlazarov, V., Dinic, E., Kronrod, M., Faradzev, I.: On economical construction of the transitive closure of a directed graph (in Russian). Dokl. Akad. Nauk. 194(11) (1970)

    Google Scholar 

  5. Backofen, R., Tsur, D., Zakov, S., Ziv-Ukelson, M.: Sparse RNA folding: Time and space efficient algorithms. Journal of Discrete Algorithms 9(1), 12–31 (2011)

    Article  MathSciNet  MATH  Google Scholar 

  6. Chang, D.-J., Kimmer, C., Ouyang, M.: Accelerating the nussinov RNA folding algorithm with CUDA/GPU. In: ISSPIT, pp. 120–125. IEEE (2010)

    Google Scholar 

  7. Ding, Y., Lawrence, C.E.: A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Research 31(24), 7280–7301 (2003)

    Article  Google Scholar 

  8. Do, C.B., Woods, D.A., Batzoglou, S.: CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics 22(14), e90–e98 (2006)

    Google Scholar 

  9. Dowell, R., Eddy, S.: Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure prediction. BMC Bioinformatics 5(1), 71 (2004)

    Article  Google Scholar 

  10. Durbin, R., Eddy, S.R., Krogh, A., Mitchison, G.: Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press (1998)

    Google Scholar 

  11. Frid, Y., Gusfield, D.: A simple, practical and complete O(n 3)-time algorithm for RNA folding using the Four-Russians speedup. Algorithms for Molecular Biology 5, 13 (2010)

    Article  Google Scholar 

  12. Frid, Y., Gusfield, D.: A worst-case and practical speedup for the RNA co-folding problem using the four-russians idea. In: Moulton, V., Singh, M. (eds.) WABI 2010. LNCS, vol. 6293, pp. 1–12. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  13. Frid, Y., Gusfield, D.: Speedup of RNA pseudoknotted secondary structure recurrence computation with the four-russians method. In: Lin, G. (ed.) COCOA 2012. LNCS, vol. 7402, pp. 176–187. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  14. Hamada, M., Kiryu, H., Sato, K., Mituyama, T., Asai, K.: Prediction of RNA secondary structure using generalized centroid estimators. Bioinformatics 25(4), 465–473 (2009)

    Article  Google Scholar 

  15. Hofacker, I.L.: Vienna RNA secondary structure server. Nucleic Acids Research 31(13), 3429–3431 (2003)

    Article  Google Scholar 

  16. Knudsen, B., Hein, J.: Pfold: RNA secondary structure prediction using stochastic context-free grammars. Nucleic Acids Research 31(13), 3423–3428 (2003)

    Article  Google Scholar 

  17. Lu, Z.J., Gloor, J.W., Mathews, D.H.: Improved RNA secondary structure prediction by maximizing expected pair accuracy. RNA 15(10), 1805–1813 (2009)

    Article  Google Scholar 

  18. Markham, N.R., Zuker, M.: UNAFold. Bioinformatics 453, 3–31 (2008)

    Article  Google Scholar 

  19. Mathews, D.H., Disney, M.D., Childs, J.L., Schroeder, S.J., Zuker, M., Turner, D.H.: Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. PNAS 101(19), 7287–7292 (2004)

    Article  Google Scholar 

  20. Nussinov, R., Jacobson, A.B.: Fast algorithm for predicting the secondary structure of single-stranded RNA. PNAS 77(11), 6309–6313 (1980)

    Article  Google Scholar 

  21. Nussinov, R., Pieczenik, G., Griggs, J.R., Kleitman, D.J.: Algorithms for loop matchings. SIAM Journal on Applied Mathematics 35(1), 68–82 (1978)

    Article  MathSciNet  MATH  Google Scholar 

  22. Reuter, J., Mathews, D.: RNAstructure: software for RNA secondary structure prediction and analysis. BMC Bioinformatics 11(1), 129 (2010)

    Article  Google Scholar 

  23. Rizk, G., Lavenier, D.: GPU accelerated RNA folding algorithm. In: Allen, G., Nabrzyski, J., Seidel, E., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2009, Part I. LNCS, vol. 5544, pp. 1004–1013. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  24. Stojanovski, M.Z., Gjorgjevikj, D., Madjarov, G.: Parallelization of dynamic programming in nussinov RNA folding algorithm on the CUDA GPU. In: Kocarev, L. (ed.) ICT Innovations 2011. AISC, vol. 150, pp. 279–289. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  25. Tinoco, I., et al.: Improved Estimation Of Secondary Structure In Ribonucleic-Acids. Nature-New Biology 246(150), 40–41 (1973)

    Article  Google Scholar 

  26. Venkatachalam, B., Frid, Y., Gusfield, D.: Faster algorithms for RNA-folding using the Four-Russians method. UC Davis Technical report (2013)

    Google Scholar 

  27. Wexler, Y., Zilberstein, C.B.-Z., Ziv-Ukelson, M.: A study of accessible motifs and RNA folding complexity. Journal of Computational Biology 14(6), 856–872 (2007)

    Article  MathSciNet  Google Scholar 

  28. Zakov, S., Goldberg, Y., Elhadad, M., Ziv-Ukelson, M.: Rich parameterization improves RNA structure prediction. In: Bafna, V., Sahinalp, S.C. (eds.) RECOMB 2011. LNCS, vol. 6577, pp. 546–562. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  29. Zakov, S., Tsur, D., Ziv-Ukelson, M.: Reducing the worst case running times of a family of RNA and CFG problems, using valiant’s approach. In: Moulton, V., Singh, M. (eds.) WABI 2010. LNCS, vol. 6293, pp. 65–77. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  30. Zuker, M.: Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research 31(13), 3406–3415 (2003)

    Article  Google Scholar 

  31. Zuker, M., Stiegler, P.: Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Research 9(1), 133–148 (1981)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Venkatachalam, B., Gusfield, D., Frid, Y. (2013). Faster Algorithms for RNA-Folding Using the Four-Russians Method. In: Darling, A., Stoye, J. (eds) Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science(), vol 8126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40453-5_11

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-40453-5_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40452-8

  • Online ISBN: 978-3-642-40453-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics