Adaptive Ensemble Biomolecular Applications at Scale

  • Review Article
SN Computer Science

A Publisher Correction to this article was published on 28 September 2023

This article has been updated

Abstract

Recent advances in both theory and methods have created opportunities to simulate biomolecular processes more efficiently using adaptive ensemble simulations. Ensemble-based simulations are widely used to compute many individual simulation trajectories and to analyze statistics across them. Adaptive ensemble simulations offer a further level of sophistication and simulation efficacy by enabling high-level algorithms to control simulations based on intermediate results. Novel high-level algorithms for adaptive simulations require sophisticated approaches to manage the ensemble members and to use the intermediate data at runtime. Thus, there is a need for scalable software systems to support adaptive ensemble-based methods. We describe the operations involved in executing adaptive workflows, classify different types of adaptations, and describe the challenges in implementing them in software tools. We establish the design considerations for software systems that support the requirements of adaptive ensemble applications at extreme scale. We use Ensemble Toolkit (EnTK) and its associated task execution runtime system (RADICAL-Pilot), both middleware building blocks, to implement a scalable adaptive ensemble execution system. We implement two high-level adaptive ensemble algorithms, multiwalker expanded ensemble and Markov state modeling, and execute up to \(2^{12}\) ensemble members on thousands of cores on three distinct HPC platforms. We highlight scientific advantages enabled by the novel capabilities of our approach. To the best of our knowledge, this is the first attempt at describing and implementing multiple adaptive ensemble workflows using a common conceptual and implementation framework.
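
The control pattern described in the abstract (run ensemble members, analyze intermediate results, then let a high-level algorithm decide what to run next) can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the authors' implementation: it uses neither EnTK nor RADICAL-Pilot, a 1-D random walk stands in for a molecular dynamics simulation, and the names run_member, choose_restarts and adaptive_ensemble are hypothetical. The adaptation rule (restart from the least-visited states) is one simple choice in the spirit of count-based adaptive sampling.

```python
import random
from collections import Counter


def run_member(start, n_steps, rng):
    """Toy 'simulation' (assumption): a 1-D random walk on states 0..9."""
    x, trajectory = start, []
    for _ in range(n_steps):
        # Step left or right, clamped to the allowed state range.
        x = min(9, max(0, x + rng.choice((-1, 1))))
        trajectory.append(x)
    return trajectory


def choose_restarts(counts, n_members):
    """Adaptive step: pick start states, least-visited first (ties by state id)."""
    ranked = sorted(counts, key=lambda state: (counts[state], state))
    return [ranked[i % len(ranked)] for i in range(n_members)]


def adaptive_ensemble(n_members=8, n_iterations=5, n_steps=20, seed=0):
    rng = random.Random(seed)
    starts = [0] * n_members  # every member begins in state 0
    counts = Counter()        # intermediate data shared across iterations
    for _ in range(n_iterations):
        # 1. Execute all ensemble members (serially here; a real system
        #    would run them concurrently on an HPC resource).
        trajectories = [run_member(s, n_steps, rng) for s in starts]
        # 2. Analyze the intermediate results.
        for trajectory in trajectories:
            counts.update(trajectory)
        # 3. Adapt: the next iteration depends on what was observed so far.
        starts = choose_restarts(counts, n_members)
    return counts


if __name__ == "__main__":
    visit_counts = adaptive_ensemble()
    print(sorted(visit_counts.items()))
```

In this sketch the adaptation only changes the start states of the next iteration; the number of members and the number of iterations are fixed in advance.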

Acknowledgements

We acknowledge support from NSF 1440677, 1639694 and 1835449. XSEDE computational resources were made available via XRAC allocation TG-MCB090174. On behalf of all authors, the corresponding author states that there is no conflict of interest.

Author information

Corresponding author

Correspondence to Shantenu Jha.

Additional information

This article is part of the topical collection “Software Challenges to Exascale Computing” guest edited by Amit Majumdar and Ritu Arora.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Balasubramanian, V., Jensen, T., Turilli, M. et al. Adaptive Ensemble Biomolecular Applications at Scale. SN COMPUT. SCI. 1, 104 (2020). https://doi.org/10.1007/s42979-020-0081-1

  • DOI: https://doi.org/10.1007/s42979-020-0081-1
