Multiscale Biomolecular Simulations in the Exascale Era

The complexity of biological systems and processes, spanning molecular to macroscopic scales, necessitates the use of multiscale simulations to get a comprehensive understanding. Quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations are crucial for capturing processes beyond the reach of classical MD simulations. The advent of exascale computing offers unprecedented opportunities for scientific exploration, not least within life sciences, where simulations are essential to unravel intricate molecular mechanisms underlying biological processes. However, leveraging the immense computational power of exascale computing requires innovative algorithms and software designs. In this context, we discuss the current status and future prospects of multiscale biomolecular simulations on exascale supercomputers with a focus on QM/MM MD. We highlight our own efforts in developing a versatile and high-performance multiscale simulation framework with the aim of efficient utilization of state-of-the-art supercomputers. We showcase its application in uncovering complex biological mechanisms and its potential for leveraging exascale computing.


Introduction
Biological processes extend across wide scales in space and time due to the hierarchical organization of biological matter [1].The characteristic dimensions span from the molecular scale of a few ångströms with ultrafast electronic processes on the order of atto-and femtoseconds and rapid chemical reactions that occur within pico-to microseconds, to the macroscopic scale of cells and organs that are visible to the naked eye and where processes extend to seconds and even days and years.The complexity in living organisms largely stems from this hierarchical structure, where a local process may trigger a cascade of events across multiple spatial and temporal scales.Thus, multiscale approaches integrating different resolutions and methodologies are essential for capturing the entire spectrum of biological events [2,3].
The continuous development of multiscale methods in computational structural biology is driven by simultaneous advancements in algorithms, software implementations, and hardware technology that push the boundaries of molecular simulations in terms of accessible time scales and system sizes, capturing biological systems from the atomistic to the cellular level.In particular, quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations have become increasingly important in the last decades for studying processes where a description of electronic degrees of freedom is paramount and thus beyond the capabilities of classical MD based on standard analytical force fields.Examples of such events encompass all types of chemical reactions, including proton-coupled electron transfers [4,5], photochemistry [6], and the manifold of chemical transformations observed in enzymatic reactions [7,8], especially those involving transition metals [9].A detailed characterization of these phenomena, including reactants, transition states, intermediates, and products, as well as the involved relative free energies, reaction rates, and binding affinities in both electronic ground and excited states, also has direct implications for drug design [10].In QM/MM models of biomolecular systems, a smaller part (the QM subsystem), such as the active site of an enzyme or a chromophore embedded in a protein or lipid bilayer, is described at the quantum mechanical level, while the remainder (the MM subsystem) is modeled using molecular mechanics.This multiscale strategy balances a detailed and accurate but computationally costly description of the essential part of a system with a coarser but computationally expedient approach for the much larger MM part.The dynamic aspect of QM/MM MD is crucial to accurately capture the behavior and function of complex proteins [11,12].However, QM/MM MD simulations have a substantially higher computational cost than classical (i.e., MM) MD simulations, severely limiting the accessible time scales.Typically, for a density functional theory (DFT)-based QM/MM MD simulation with around 100 atoms in the QM subsystem, accessible time scales are limited to a few hundred picoseconds [8,13].
The advent of exascale computing marks a pivotal moment for all simulation-based scientific fields [14,15].This remarkable technological achievement was accomplished by connecting thousands of computing nodes through high-speed network interconnects.Each node combines traditional general-purpose central processing units (CPUs) with powerful graphics processing units (GPUs).However, exascale supercomputer architectures also introduce new challenges since programming software applications for these heterogeneous machines requires a judicious decomposition of the computational work to exploit each component of the machines optimally [16].Therefore, to fully exploit the computational power of present and future heterogeneous HPC architectures, a high degree of concurrent parallelism is needed, where different parts of the supercomputer work on different subdomains of the computational model, each described within a different theoretical method.Together, exascale computing and the development of novel computational methods and software can enable longer and more accurate simulations on larger and more complex systems.This opens up new opportunities for discovery and innovation in the life and health sciences, potentially revolutionizing areas such as drug design and bioengineering.
The heterogeneous CPU/GPU technology highlighted above is taken to the next level with modular supercomputer architectures that integrate a variety of hardware technologies into interconnected partitions [17,18].A prime example of this is the LUMI supercomputer [19], along with the upcoming exascale JUPITER supercomputer [20], both procured by the European HPC Joint Undertaking (EuroHPC JU).Currently, LUMI consists of two primary number-crunching partitions: a general-purpose CPU partition and a high-performance GPU-accelerated partition.Additionally, it features an interactive data analytics partition and an accelerated storage partition.Looking ahead, the integration of prospective technologies like quantum and neuromorphic computing into traditional HPC infrastructure is expected [17,18].Indeed, the procurement of a quantum computing partition for LUMI is already underway [21], and the Jülich Supercomputing Center in Germany is already experimenting with this kind of integration [22].Modular supercomputer architectures offer substantial benefits, particularly in allowing calculations to run on the most suitable hardware for the specific problem.Moreover, existing software packages that have yet to be optimized for new hardware remain useful for solving important scientific problems.Fully exploiting the capabilities of complex modular supercomputers for exceptionally challenging scientific problems, which require the full computing power of the machine, necessitates a new computational paradigm.Here, we highlight an approach recently introduced in the field of multiscale biomolecular simulations.
Multiscale methods, especially the QM/MM approach, have proven indispensable for exploring complex biological phenomena.Interestingly, QM/MM is not only a robust technique in itself but also offers a route to overcome some of the existing limitations in the development of software for atomistic simulations on modern hybrid computer architectures [14].Beyond QM/MM, multiscale methods can also be extended to incorporate multiple layers ranging from coarse-grained (CG) to continuum models of matter where parts of a system, e.g., membrane regions that are sufficiently distant from an embedded protein of interest, are treated by techniques usually employed for meso-and macroscopic systems [23][24][25][26][27].
To reach the full potential offered by exascale supercomputers, it is imperative that multiscale interfaces scale efficiently, fully leveraging the extensive network of CPUs and GPUs.That is one of the main objectives of MiMiC (multiscale modeling in computational chemistry), a highperformance and versatile framework for multiscale simulations [28].The program-agnostic MiMiC framework is designed to combine virtually any QM and MM (or other) program without compromising the computational efficiency and scalability of the simulation.Indeed, MiMiC, although still in its nascent stages, has demonstrated its ability to efficiently scale QM/MM MD simulations across thousands of CPU-based computing nodes [13,29].Here we will showcase the MiMiC framework for biomolecular simulations, illustrating its potentiality in leveraging state-of-the-art supercomputing resources and its capability to address complex biochemical and biophysical problems.
The interface between QM and MM components can be classified as tightly or loosely coupled, reflecting the degree of interdependence between the components.Loose coupling is preferred for its flexibility and ease of maintenance, enabling straightforward integration of independent programs.This approach supports the quick incorporation of new functionalities and advancements within individual programs without necessitating alterations to other coupled components.Crucially, loose coupling permits independent optimization of each program for peak performance.However, this flexibility can come at the cost of computational efficiency due to slower inter-program communication, particularly when compared to the tight coupling approach that allows for direct in-memory data sharing.
There are three communication mechanisms for exchanging data between programs, namely the file-, library-, and network-based approaches.Most commonly utilized for general interfaces and integrative frameworks is the file-based approach, which does not require modifications to the source code of the interfaced programs, making it the simplest to implement.Here, the main driver program generates input files for the interfaced programs, executes them, and subsequently reads their output files.However, it is notably the least efficient communication mechanism, primarily due to slow disk write/read operations and overhead associated with executing and terminating the interfaced programs.Both drawbacks are avoided in the library-based approach, where the interfaced programs are converted into a library that is linked to the main program, thus enabling efficient in-memory data exchange.The disadvantages are the rather intrusive modifications needed in the source codes of the interfaced programs, and the risk of ending up with a tight coupling focused on a few specific programs.Moreover, a critical aspect of multiscale implementations is parallel scalability and efficiency.The optimal parallel algorithms for QM and MM components, or even among different QM methods, may differ significantly and may not seamlessly integrate within the more tightly coupled environments.
The network-based approach, on the other hand, offers a solution that potentially circumvents the limitations inherent in both file-and library-based methods.It achieves a balance between flexibility and communication efficiency by enabling data exchange over a network.This facilitates seamless communication between programs running on different processors (CPUs and GPUs), across computing nodes, or even among supercomputer partitions.Such an approach retains the benefits of loose coupling, i.e., ease of integration and maintenance, while surpassing file-based approaches in performance through the use of fast network protocols and remote direct memory access (RDMA) technologies.Crucially, it can be implemented so as not to disrupt the normal execution of interfaced programs, thus preserving their performance.However, the network-based approach requires a more sophisticated design and management strategy to reduce latency and enhance throughput.Despite these challenges, with proper implementation, the network-based approach substantially improves the efficiency and scalability of multiscale simulations by facilitating high-speed communication between diverse computational models.

Toward Multiscale Biomolecular Simulations on Exascale Architectures
For multiscale simulations to run efficiently on state-of-the-art supercomputers, first and foremost, it is imperative that the programs dedicated to a specific methodology, such as QM and MM, are able to take advantage of the strengths of both CPUs and GPUs.Moreover, these programs must be capable of using the vast parallel processing capabilities of exascale architectures.Considerable efforts are underway to push QM-and MM-based programs towards exascale [60][61][62][63][64][65][66][67] with support from EuroHPC JU through its HPC Centres of Excellence [68] and the Exascale Computing Project (ECP) led by the US Department of Energy [69].
For a multiscale simulation framework to fully exploit the capabilities of these QM and MM programs, it must seamlessly integrate with their optimal parallelization strategies without introducing inefficiencies.This entails enabling the concurrent execution of interfaced programs, eliminating the need for their repeated startup and shutdown, and reducing communication overhead to a minimum.Crucially, the calculation of subsystem interactions, such as those between QM and MM particles, needs to be highly parallel and efficient to prevent it from becoming a computational bottleneck.Implementing effective and automatic load balancing is also critical to ensure that computational resources are utilized optimally, thereby maximizing the throughput of individual simulations.
The versatile and high-performance MiMiC framework for multiscale simulations was designed with the aim of addressing the criteria outlined above [28].The strategy used by the MiMiC framework is illustrated in Figure 1.On one side, MiMiC connects to a simulation driver that manages the overall simulation process, including the integration of the equations of motion and the maintenance of temperature and pressure.On the other side, it interfaces multiple external programs using a client-server approach combined with a multiple-program multiple-data (MPMD) model.Each external program is tasked with calculations belonging to a given subsystem, while MiMiC calculates subsystem interactions.Importantly, the external programs run concurrently on separate computational resources using their own optimal parallelization capabilities.Communication between MiMiC and the external programs is facilitated by the MiMiC Communication Library (MCL), a dedicated lightweight library that ensures efficient network-based communication and simplifies the interfacing with external programs.An illustration of a MiMiC-based simulation workflow is shown in Figure 2. The scope of the MiMiC framework is broad, extending beyond QM/MM.This includes QM/QM, which integrates different levels of QM theory, QM/MM/CG, and even models that incorporate machine-learning (ML) techniques like ML/MM or QM/ML.The MiMiC framework has been used to implement a DFT-based electrostatic embedding QM/MM method, employing the CPMD program [70] as both the MD driver and QM engine, and GRO-MACS [61] as the MM engine [28,29].In electrostatic embedding, the (external) electric field from the point charges in the MM subsystem is included in the Hamiltonian of the QM subsystem, thus directly polarizing its electronic density.The calculation of electrostatic QM/MM interactions is based on a dense grid representation of the electronic density, which is computationally expensive, especially for large systems, because the number of integrals over the electron density scales linearly with the number of atoms in the MM subsystem.The MiMiC framework has implemented an efficient approach that substantially speeds up the calculation essentially without compromising the accuracy of the forces, thus reducing the computational cost by about 80 % for small systems (e.g., small solvated proteins) and up to 99 % for larger systems (e.g., membrane-embedded proteins) [28].Furthermore, MiMiC implements a hybrid shared-and distributed-memory (OpenM-P/MPI) parallelization strategy to ensure that these calculations remain efficient and do not hinder performance, even under highly parallel conditions [29].A powerful example of this is shown in Figure 3, where the extreme scalability of the CPMD program is harnessed to achieve strong scalability beyond 80,000 CPU cores with a parallel efficiency of 70 % for a single MiMiC-based QM/MM MD simulation of the human isocitrate dehydrogenase-1 (IDH1) [13].Table 1 shows examples of the computational performance and cost of MiMiC-based QM/MM MD simulations using CPMD and GROMACS.It is clear that the simulation throughput depends first and foremost on the chosen exchange-correlation functional and the size of the QM subsystem.This reflects the fact that the QM calculation is by far the most computationally demanding part of a QM/MM MD simulation.Hybrid exchange-correlation functionals are particularly expensive in a plane-wave basis set such as the one employed by the CPMD program.Still, due to the excellent parallelism in CPMD, the simulation throughput can be pushed to 4.8 ps/day for the small QM region (46 atoms) and 0.7 ps/day for the larger QM subsystem (142 atoms).Using instead a non-hybrid functional pushes the performance to 21 and 5.4 ps/day for the small and large QM subsystems, respectively.Table 1.Computational performance and cost of MiMiC-based QM/MM MD simulations.All simulations were run with a 0.5 fs timestep on the CPU partition of JUWELS [71] at the scaling limit (parallel efficiency ≥ 70%).The systems are p38 mitogen-activated protein kinase (169,550 atoms) and human isocitrate dehydrogenase-1 (130,828 atoms).We refer to the original work for full computational details [13].

Biological Applications
The MiMiC framework has enabled a number of recent computational studies that demonstrate its utility and efficiency across a broad spectrum of systems with biological relevance.In this section, we showcase select examples that highlight the impactful contributions of MiMiC-based QM/MM MD simulations in advancing our understanding of complex biological processes.Among the pioneering uses of MiMiC-based QM/MM MD simulations were studies of CLC proteins, a large family of anion channels and transporters.Chiariello et al. studied the molecular mechanism of fluoride inhibition of the anion/proton exchanger ClC-ec1 from E. coli (Fig. 4A) [72].The use of QM/MM for this study was mandatory as ion translocation involves proton transfer processes.On the basis of QM/MM MD and well-tempered metadynamics (wtMTD) simulations at the B3LYP and BLYP levels of theory, Chiariello et al. were able to report proton affinities of the fluoride ion and the gating glutamate residue E148, thus providing valuable insights into transport inhibition.In a second study, the mechanisms of proton transfer and release by the fluoride/proton antiporter CLC F -eca were investigated (Fig. 4B) [73].Employing the same simulation techniques, it could be shown that a triad is formed between fluoride, glutamate E318, and the gating glutamate E118, eventually releasing protons and fluoride as hydrogen fluoride.
In a recent study, the ligand iperoxo (routinely used in neuroimaging) targeting the human muscarinic acetylcholine receptor 2 was investigated [75].The work focuses on the calculation of the drug unbinding rate constant  of f , a very difficult parameter to correctly estimate with force field-based approaches.In fact, while methods based on modern force fields are nowadays able ClC-ec1 anion/proton antiporter embedded in a solvated lipid bilayer with a total of 150,925 atoms (19 QM atoms) [72].Reprinted with permission from Ref. [72].Copyright 2020 American Chemical Society.B: CLC F -eca fluoride/proton antiporter embedded in a solvated lipid bilayer with a total of around 174,000 atoms (36 QM atoms) [73].Reprinted with permission from Ref. [73].Copyright 2021 American Chemical Society.C: AMPAR cation channel embedded in a solvated lipid bilayer where the transmembrane domain was included in the simulations (22 QM atoms) [74].to predict accurate binding free energies and affinities, this is often not the case for rate constants such as  of f , which also require a correct estimation of the free energy of transition states.The study shows how sensitive this estimation is to values of the partial charges of the ligand and that while results obtained with a QM/MM MD simulation are in good agreement with experimental findings, standard force field-based procedures lead to qualitatively wrong results, most likely due to the lack of explicit electronic polarization and charge transfer, which are included in QM/MM.
Another important application of QM/MM MD simulations is in scenarios where parts of a system cannot be adequately described using simplified analytical force fields.Notable examples are metal ions, in particular transition metals or divalent alkaline-earth ions.The absence of accurate parameters for the latter hinders a thorough understanding of many biological processes related to, e.g., ion channels, transporters, and pumps.Recently, Schackert et al. studied the mechanism of calcium permeation in -amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptors (AMPARs), which are key to rapid synaptic transmission in the central nervous system (Fig. 4C) [74].In that study, the calcium-binding sites within the channel, initially identified through classical MD, were further confirmed using QM/MM MD.This step was crucial for validating a newly developed classical force field specifically designed for calcium ions.Interestingly, they observed charge transfer between the calcium ion and the water molecules in the first solvation shell, which underlines the importance of a QM/MM description for such systems.

Outlook
The exascale era is poised to radically change life sciences.With the ability to conduct longer and more accurate simulations on larger and more complex molecular systems than ever before, we will be able to tackle scientific questions that are currently beyond our reach, deepening our understanding of biological processes.This opens up new opportunities for discovery and innovation, potentially revolutionizing fields such as drug design and bioengineering.However, fully realizing the potential of exascale computing is contingent upon overcoming substantial challenges in software design, algorithm optimization, and the efficient utilization of modular and heterogeneous HPC architectures.
The versatility of the MiMiC framework makes it ideal for pushing multiscale biomolecular simulations towards the exascale.It is capable of exploiting the results of the large-scale initiatives by EuroHPC JU and ECP for individual domains such as QM and MM.Indeed, work is already underway to couple a diverse set of QM programs to MiMiC, namely, Quantum ESPRESSO, CP2K, and DFT-FE, all of which are being developed to exploit state-of-the-art HPC technology [67].Additionally, it is likely that MD simulations will benefit greatly from future ML-based force fields [76,77], which MiMiC is fully able to integrate into advanced simulation workflows [78].For instance, hybrid ML/MM models facilitate long simulations of biological systems using ML-based force fields, achieving near QM/MM precision at a substantially reduced computational cost [79], while ML-enhanced free-energy methods significantly accelerate QM/MM calculations of ligand binding affinities for drug discovery [80].These and other novel methods are expected to aid in accurately predicting key biophysical properties like drug-protein binding free energies and complete free-energy profiles.We anticipate that the integration of MiMiC-based multiscale simulations with ML and enhanced sampling techniques will further catalyze a quantum leap in the fundamental atomistic understanding of biological processes.

Figure 1 .Figure 2 .
Figure 1.Illustration of the strategy used by the MiMiC framework

Figure 3 .
Figure 3. Parallel scalability and efficiency of MiMiC-based QM/MM MD simulations of IDH1.The IDH1 system has a total of 130,828 atoms with 142 QM atoms [13].The simulations were run on the CPU partition of JUWELS [71].The speedup is given in terms of the time per MD step normalized against a reference run using seven nodes.Adapted from Raghavan et al. [13], licensed under CC BY 4.0.

Figure 4 .
Figure 4. Illustrations of biological systems that have been studied using the MiMiC framework.A:ClC-ec1 anion/proton antiporter embedded in a solvated lipid bilayer with a total of 150,925 atoms (19 QM atoms)[72].Reprinted with permission from Ref.[72].Copyright 2020 American Chemical Society.B: CLC F -eca fluoride/proton antiporter embedded in a solvated lipid bilayer with a total of around 174,000 atoms (36 QM atoms)[73].Reprinted with permission from Ref.[73].Copyright 2021 American Chemical Society.C: AMPAR cation channel embedded in a solvated lipid bilayer where the transmembrane domain was included in the simulations (22 QM atoms)[74].Figure by Schackert et al. [74], licensed under CC BY 4.0.
Figure 4. Illustrations of biological systems that have been studied using the MiMiC framework.A:ClC-ec1 anion/proton antiporter embedded in a solvated lipid bilayer with a total of 150,925 atoms (19 QM atoms)[72].Reprinted with permission from Ref.[72].Copyright 2020 American Chemical Society.B: CLC F -eca fluoride/proton antiporter embedded in a solvated lipid bilayer with a total of around 174,000 atoms (36 QM atoms)[73].Reprinted with permission from Ref.[73].Copyright 2021 American Chemical Society.C: AMPAR cation channel embedded in a solvated lipid bilayer where the transmembrane domain was included in the simulations (22 QM atoms)[74].Figure by Schackert et al. [74], licensed under CC BY 4.0.