Abstract
Recent advances in both theory and methods have created opportunities to simulate biomolecular processes more efficiently using adaptive ensemble simulations. Ensemble-based simulations are widely used to compute a number of individual simulation trajectories and analyze statistics across them. Adaptive ensemble simulations offer a further level of sophistication and simulation efficacy by enabling high-level algorithms to control simulations based on intermediate results. Novel high-level algorithms for adaptive simulations require sophisticated approaches to manage the ensemble members and use the intermediate data at runtime. Thus, there is a need for scalable software systems to support adaptive ensemble-based methods. We describe the operations involved in executing adaptive workflows, classify different types of adaptations, and describe the challenges of implementing them in software tools. We establish the design considerations for software systems that support the requirements of adaptive ensemble applications at extreme scale. We use Ensemble Toolkit (EnTK) and its associated task execution runtime system (RADICAL-Pilot), both middleware building blocks, to implement a scalable adaptive ensemble execution system. We implement two high-level adaptive ensemble algorithms, multiwalker expanded ensemble and Markov state modeling, and execute up to \(2^{12}\) ensemble members on thousands of cores on three distinct HPC platforms. We highlight scientific advantages enabled by the novel capabilities of our approach. To the best of our knowledge, this is the first attempt at describing and implementing multiple adaptive ensemble workflows using a common conceptual and implementation framework.
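As an illustration of the pattern the abstract describes (simulate an ensemble, analyze intermediate results, adapt the next round), the following is a minimal sketch and not the EnTK or RADICAL-Pilot implementation. The toy random-walk "simulation", the least-visited restart rule, and every function and parameter name here are assumptions introduced only for illustration.

```python
import random
from collections import Counter


def run_member(start, n_steps, rng):
    """Toy stand-in for one simulation: an unbiased 1D random walk."""
    state, trajectory = start, [start]
    for _ in range(n_steps):
        state += rng.choice((-1, 1))
        trajectory.append(state)
    return trajectory


def choose_restarts(counts, n_members):
    """Hypothetical adaptation rule: restart from the least-visited states."""
    ranked = sorted(counts, key=lambda s: (counts[s], s))
    return [ranked[i % len(ranked)] for i in range(n_members)]


def adaptive_ensemble(n_members=8, n_iterations=5, n_steps=20, seed=0):
    rng = random.Random(seed)
    counts = Counter()
    starts = [0] * n_members
    for _ in range(n_iterations):
        # Simulation stage: members are independent and could run concurrently.
        trajectories = [run_member(s, n_steps, rng) for s in starts]
        # Analysis stage: aggregate intermediate results across the ensemble.
        for trajectory in trajectories:
            counts.update(trajectory)
        # Adaptation: intermediate results decide the next iteration's inputs.
        starts = choose_restarts(counts, n_members)
    return counts, starts


if __name__ == "__main__":
    counts, next_starts = adaptive_ensemble()
    print(len(counts), "states visited; next restarts:", next_starts)
```

In a production setting, the simulation stage would dispatch real molecular dynamics tasks to an HPC resource and the analysis stage would run a method such as Markov state model estimation; the loop structure is the only part this sketch is meant to convey.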
Change history
28 September 2023
A Correction to this paper has been published: https://doi.org/10.1007/s42979-023-02168-3
Acknowledgements
We acknowledge support from NSF 1440677, 1639694 and 1835449. XSEDE computational resources were made available via XRAC allocation TG-MCB090174. On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
This article is part of the topical collection “Software Challenges to Exascale Computing” guest edited by Amit Majumdar and Ritu Arora.
About this article
Cite this article
Balasubramanian, V., Jensen, T., Turilli, M. et al. Adaptive Ensemble Biomolecular Applications at Scale. SN COMPUT. SCI. 1, 104 (2020). https://doi.org/10.1007/s42979-020-0081-1
DOI: https://doi.org/10.1007/s42979-020-0081-1