Simulating the Long-timescale Structural Behavior of Bacterial and Influenza Neuraminidases with Different HPC Resources

Understanding the conformational dynamics that affect ligand binding in Neuraminidases is needed to improve the in silico selection of novel drug candidates targeting these pathogenicity factors and to adequately estimate the efficacy of potential drugs. Conventional molecular dynamics (MD) is a powerful tool to study conformational sampling, drug-target recognition and binding, but requires significant computational effort to reach biologically relevant timescales. In this work, advances in computing power and specialized architectures were evaluated for simulating long MD trajectories of the structural behavior of Neuraminidases. We conclude that modern GPU accelerators enable calculations at timescales that would previously have been intractable, providing routine access to microsecond-long trajectories in daily laboratory practice. This opens an opportunity to move away from the “static” affinity-driven strategies in drug design towards a deeper understanding of ligand-specific conformational adaptation of target sites in protein structures, leading to a better selection of efficient drug candidates in silico. However, the performance of modern GPUs is still far behind that of the deeply specialized supercomputers co-designed for MD. Further development of affordable specialized architectures is needed to move towards the much-desired millisecond timescale for routine simulation of large proteins.


Introduction
Co-infection of the human lower respiratory tract with viruses and pathogenic bacteria can lead to life-threatening complications of influenza, including pneumonia. The mechanism that facilitates this synergism is a major focus of scientific research, and thus Neuraminidases, which are key virulence factors of both pathogens, are considered important drug targets. Despite intensive studies, no inhibitors targeting bacterial Neuraminidases have been approved for clinical use so far. Four drugs targeting the influenza Neuraminidases are available, but their efficacy is challenged by evolution of the pathogen and emerging resistance [1]. It is therefore important to design new drugs before the currently available ones become ineffective. A likely cause of the poor success rate in delivering approved drugs to the market is that drug discovery is usually driven by optimizing the binding affinity and selectivity of a drug candidate against a static crystallographic structure of the target protein. Previous studies indicated that structures of bacterial and influenza Neuraminidases contain flexible regions which can be involved in a conformation-specific accommodation of ligands [2,5,7]. A deeper understanding of the conformational flexibility and dynamics in Neuraminidases that can affect ligand binding can help to improve the selection of novel drug candidates in silico.
Conventional molecular dynamics (MD) is a powerful computational method that can account for protein flexibility when studying drug recognition and binding to a target site [3]. The key problem is that the computational cost of a simulation is very high even for a small protein, and much higher for the large multi-domain/chain Neuraminidases; thus, the average MD trajectory so far has been limited to tens of nanoseconds. For example, in one of the most recent studies of the influenza Neuraminidase, ten MD trajectories, each of at most 100 ns, were calculated [2]. These computations can help to understand protein flexibility and function, but overall are far below the timescales relevant for the majority of functionally important structural rearrangements. The use of CPU-based computing clusters can provide some speed-up, but the most promising solution so far is the emergence of specialized architectures. Anton-1 and Anton-2 are special-purpose supercomputers for MD built by D. E. Shaw Research, a privately held biochemistry company [6]. The Anton machines run calculations entirely on specialized application-specific integrated circuits and can accommodate only custom-designed MD software, presenting a successful example of a deep co-design of software and the hardware that executes it. Although some access to this unique and apparently very expensive hardware is provided to the scientific community in the US, its overall availability so far has been very limited. Alternatively, graphics processing units (GPUs) are mass-produced specialized architectures which offer an affordable speed-up to MD.
In this work, advances in computer technologies were evaluated for simulating long MD trajectories of Neuraminidases. The article is organized as follows. Section 1 describes the molecular modeling protocol and the hardware and software setup. In Section 2, we report the results and their analysis. The Conclusion section summarizes the study.

Hardware and Software
The computations were carried out on the "Lomonosov-2" supercomputer [4].

Structure Preparation and the Molecular Modeling Protocol
The molecular systems of H1N1 influenza virus Neuraminidase (PDB 3B7E) and Neuraminidase A (NanA) from Streptococcus pneumoniae bacteria (PDB 2YA8) were prepared and simulated in the FF15IPQ force field with the SPC/Eb water model as recently described [5]. The homotetrameric structure of the H1N1 Neuraminidase was reconstructed based on the BIOMT record in PDB. Only the catalytic domain (including the insertion domain) of the NanA Neuraminidase was selected for this study as the key catalytically competent part of the structure [5]. The size of the influenza and bacterial Neuraminidases in a water box was ∼150000 and ∼65000 atoms, respectively. To compare the speed of MD on contemporary GPUs with the special-purpose Anton family of supercomputers, a molecular system of Human Dihydrofolate Reductase (PDB 2C2T) in water, comprising ∼24000 atoms, was constructed, equivalent to the one previously used to benchmark the Anton machines [6].
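The reconstruction of an oligomer from BIOMT records can be illustrated with a minimal sketch. It is not necessarily the procedure used in this work: it assumes a single biomolecule block in the REMARK 350 section (so that operator serial numbers are unique), and the function names and the example records are hypothetical, not taken from 3B7E. Each operator is a 3x3 rotation matrix plus a translation vector that is applied to every atom of the asymmetric unit.

```python
# Illustrative sketch: rebuild a biological assembly from PDB "REMARK 350 BIOMT" records.
# Assumption: one biomolecule block only, so operator serial numbers are unique.

def parse_biomt(pdb_lines):
    """Return {serial: [row1, row2, row3]}; each row is [r1, r2, r3, t]."""
    ops = {}
    for line in pdb_lines:
        fields = line.split()
        if (line.startswith("REMARK 350") and len(fields) >= 8
                and fields[2].startswith("BIOMT")):
            row = int(fields[2][-1]) - 1      # BIOMT1/2/3 -> matrix row 0/1/2
            serial = int(fields[3])           # operator number
            ops.setdefault(serial, [None] * 3)[row] = [float(v) for v in fields[4:8]]
    return ops


def apply_op(op, xyz):
    """Rotate and translate one (x, y, z) coordinate with a BIOMT operator."""
    x, y, z = xyz
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in op)


def build_assembly(ops, coords):
    """Return {serial: transformed copy of coords} for every operator."""
    return {serial: [apply_op(op, c) for c in coords]
            for serial, op in sorted(ops.items())}


# Hypothetical records: identity plus a 90-degree rotation about z (NOT from 3B7E).
EXAMPLE_REMARKS = [
    "REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000",
    "REMARK 350   BIOMT2   1  0.000000  1.000000  0.000000        0.00000",
    "REMARK 350   BIOMT3   1  0.000000  0.000000  1.000000        0.00000",
    "REMARK 350   BIOMT1   2  0.000000 -1.000000  0.000000        0.00000",
    "REMARK 350   BIOMT2   2  1.000000  0.000000  0.000000        0.00000",
    "REMARK 350   BIOMT3   2  0.000000  0.000000  1.000000        0.00000",
]

if __name__ == "__main__":
    operators = parse_biomt(EXAMPLE_REMARKS)
    copies = build_assembly(operators, [(1.0, 0.0, 0.0)])
    for serial, atoms in copies.items():
        print(serial, atoms)
```

For a homotetramer, the file would be expected to list four such operators, and each generated copy would additionally need a distinct chain identifier before the system is solvated.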

Results
The speed of MD simulation of the H1N1 influenza virus Neuraminidase and the bacterial NanA in a water box was compared between classical Intel Xeon Gold CPUs and two types of GPUs: the Tesla K40, based on the Kepler architecture, and the recently introduced Tesla P100, based on the Pascal architecture. The maximum speeds reached on CPUs and on Tesla K40 GPUs were comparable; however, it took at most 4-6 GPU nodes to match the performance of 20+ CPU nodes (Fig. 1). Therefore, CPU-based acceleration of MD is possible, but not efficient. The computational yield of the Pascal-based cards was not only superior in absolute values, but also qualitatively different from that of their Kepler-based predecessor. The speed of MD on a single Tesla P100 was 1.95-2.14 times higher than the top output from 4-6 nodes equipped with Tesla K40 (Fig. 1, c). However, the performance of the Tesla P100 did not improve when the simulation was executed on multiple nodes; instead, it degraded by 22-28%. The use of multiple Tesla P100 cards to run MD in parallel was efficient only within a single node in the peer-to-peer mode, in which the GPUs communicate directly via the PCI-E bus, providing a speed-up of 1.31-1.41 times with a yield of 73-150 ns/day. Therefore, a microsecond-long trajectory can be obtained in at most 1-2 weeks. The performance of the above-mentioned hardware was compared with that of the special-purpose supercomputers using a previously described benchmark based on a small protein system (see Section 1.2 of Methods). The peak performance of the Anton-1 and Anton-2 machines was ∼40 and ∼228 times higher, respectively, than that of two Tesla P100 GPUs in the peer-to-peer mode (Fig. 2).
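The estimate of the wall-clock time needed for a microsecond-long trajectory follows directly from the reported yield of 73-150 ns/day. The short sketch below reproduces this arithmetic and extends it to a millisecond; it assumes a constant simulation rate with no queueing or restart overhead, and the function and constant names are illustrative only.

```python
# Back-of-the-envelope conversion from MD throughput (ns/day) to wall-clock time.
# Assumption: the rate stays constant and there is no queueing or restart overhead.

NS_PER_MICROSECOND = 1_000.0
NS_PER_MILLISECOND = 1_000_000.0

# Lower and upper yields reported above for Tesla P100 GPUs in the peer-to-peer mode.
REPORTED_RATES_NS_PER_DAY = (73.0, 150.0)


def days_to_reach(target_ns, ns_per_day):
    """Return the number of days needed to simulate target_ns at ns_per_day."""
    if ns_per_day <= 0:
        raise ValueError("ns_per_day must be positive")
    return target_ns / ns_per_day


if __name__ == "__main__":
    for rate in REPORTED_RATES_NS_PER_DAY:
        us_days = days_to_reach(NS_PER_MICROSECOND, rate)
        ms_years = days_to_reach(NS_PER_MILLISECOND, rate) / 365.0
        print(f"{rate:.0f} ns/day: 1 us in {us_days:.1f} days, "
              f"1 ms in {ms_years:.1f} years")
```

At these rates a microsecond takes roughly 7 to 14 days, consistent with the 1-2 weeks stated above, whereas a millisecond would take decades on the same hardware.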

Discussion and Conclusions
Molecular dynamics has the potential to become a particularly important tool in drug discovery for evaluating ligand recognition and binding to a flexible target site, but its routine use has been limited by a significant computational cost. We have shown that MD simulation of the structural dynamics of large Neuraminidases on modern GPUs can reach the microsecond timescale. This opens an opportunity to move away from the "static" affinity-driven strategies towards a deeper understanding of conformational dynamics in protein structures and of the mechanisms by which a drug candidate associates with and dissociates from a target site, leading to a better selection of promising compounds in silico to be further profiled experimentally. With the development of more powerful video cards, the use of more than one GPU node per simulation is becoming inefficient, which leaves GPUs still far behind the deeply specialized Anton supercomputers. This supports the recent trend towards co-design of software and hardware for a particular purpose, of which the Anton machines are an example, as a strategy to accelerate calculations. Further development of affordable specialized architectures is needed to move towards the much-desired millisecond timescale for routine simulation of the structural behavior of large proteins.