Computing Clinically Relevant Binding Free Energies of HIV-1 Protease Inhibitors

The use of molecular simulation to estimate the strength of macromolecular binding free energies is becoming increasingly widespread, with goals ranging from lead optimization and enrichment in drug discovery to personalizing or stratifying treatment regimes. In order to realize the potential of such approaches to predict new results, not merely to explain previous experimental findings, it is necessary that the methods used are reliable and accurate, and that their limitations are thoroughly understood. However, the computational cost of atomistic simulation techniques such as molecular dynamics (MD) has meant that until recently little work has focused on validating and verifying the available free energy methodologies, with the consequence that many of the results published in the literature are not reproducible. Here, we present a detailed analysis of two of the most popular approximate methods for calculating binding free energies from molecular simulations, molecular mechanics Poisson–Boltzmann surface area (MMPBSA) and molecular mechanics generalized Born surface area (MMGBSA), applied to the nine FDA-approved HIV-1 protease inhibitors. Our results show that the values obtained from replica simulations of the same protease–drug complex, differing only in initially assigned atom velocities, can vary by as much as 10 kcal mol–1, which is greater than the difference between the best and worst binding inhibitors under investigation. Despite this, analysis of ensembles of simulations producing 50 trajectories of 4 ns duration leads to well converged free energy estimates. For seven inhibitors, we find that with correctly converged normal mode estimates of the configurational entropy, we can correctly distinguish inhibitors in agreement with experimental data for both the MMPBSA and MMGBSA methods and thus have the ability to rank the efficacy of binding of this selection of drugs to the protease (no account is made for free energy penalties associated with protein distortion leading to the over estimation of the binding strength of the two largest inhibitors ritonavir and atazanavir). We obtain improved rankings and estimates of the relative binding strengths of the drugs by using a novel combination of MMPBSA/MMGBSA with normal mode entropy estimates and the free energy of association calculated directly from simulation trajectories. Our work provides a thorough assessment of what is required to produce converged and hence reliable free energies for protein–ligand binding.

1 follow the method set out in Sadiq et al. 1 . Ensembles of 20 replica systems were run to produce 4 ns of production trajectory for each of the nine FDA approved inhibitors bound to the HXB2 wildtype HIV-1 protease. The trajectories were analysed using MMPBSA and normal modes to produce binding affinities and the protonation state with the most attractive (negative) was chosen for each inhibitor. Table S1 shows the results for all systems with the assigned protonation state emboldened for each inhibitor.

MMPBSA/MMGBSA Average Time Evolution
The per nanosecond averages of ∆G MMPBSA and ∆G MMGBSA for 50 replica ensembles of wildtype HIV-1 protease bound to LPV are shown in Figure S1. In both cases the averages show only small changes over the 10 ns of production simulation. A drift to less negative values is however detectable. This is likely due to the incursion of individual water molecules into the active site which is poorly handled by the MMPBSA and MMGBSA methods.

MMPBSA/MMGBSA Component Conversion
The variation of the bootstrap statistics, σ boot , for each of the components of ∆G MMPBSA and ∆G MMGBSA is shown in Figure S2.    Table S3 and Table S4 show the thermodynamic decomposition of the binding affinity estimates for all nine FDA approved inhibitors to HIV-1 protease produced by MMPBSA and MMGBSA calculations respectively. Two ensembles were run for lopinavir, the second is labelled LPV2 .

Compared to Experimental Values
The free energy of association ∆G Assoc was proposed by Swanson et al. 6 to incorporate a component of solute entropy to MMPBSA calculations and provide a link to the standard state concentration. Table S5 shows the values obtained for each of the ligands simulated and compares the combination of ∆G Assoc and MMPBSA and MMGBSA results with values obtained from experiment.

19
Ten replicas of the ritonavir (RTV) ensemble were extended to 20 ns to investigate whether increased sampling might result in a better estimate of the binding free energy. Whilst the binding affinities calculated for most of the simulations remained stable one exhibited a marked change. Figure S11 shows that ∆G MMPBSA altered suddenly at the same point in the trajectory as a conformational change was observed in residue D29.  Figure S11: Measurements of the changes seen in a single replica of RTV bound to wildtype HIV-1 protease. In this simulation residue D29 moves from a position interacting with residue R8 at the entrance to the protease binding site to a flipped out position which allows additional water access to the drug, see Figure S12. This change coincides with a significant lowering of the binding affinity measured by MMPBSA as shown by tracks over time of (a) the separation of the D8 δ carbon with the ζ carbon of R8 and (b) ∆G MMPBSA . Figure S12 shows the location of D29 and the other residues in the P3 pocket of the RTV bound HIV-1 protease before and after the conformational change of D29. After the conformational change a water molecule is observed to interact with the drug; this is also shown in Figure S12. This in particular greatly enhanced our ability to run normal mode calculations by allowing us to combine many single core analysis runs into a single job, easing the management of large numbers of tasks and making more effective use of resources designed primarily for larger runs.

BigJob: A Pilot-based Framework
BigJob is a pilot-job implementation which provides a framework for running many types of distributed applications -including but not limited to very-large scale parallel simulations, many small high-throughput simulations, or ensemble-based workflows. 9 BigJob (BJ) provides a unified runtime environment for pilot-jobs on heterogeneous infrastructures. For this purpose, BigJob provides a higher-level, unifying interface to heterogeneous and/or distributed data and compute resources. The framework is accessed via the Pilot-API.
Applications can specify their resource requirements using a Pilot description. In the compute case, the user typically specifies the application to run as well as the number of cores required by their application. Pilots are started via the Pilot-Compute Service. BigJob eliminates the need to interact separately with different kinds of compute resources, e.g. batch-style HPC/HTC resources as well as cloud resources, and provides a unified abstraction for allocating resources.

SAGA: Interoperability Layer
In order for BigJob to work on heterogeneous resources, it requires an interoperability layer which provides access to a variety of middleware. This is achieved through the use of the Simple API for Grid Applications (SAGA). 7 SAGA defines a high-level access mechanism for distributed infras-tructure components like job schedulers, file transfers, and resource provisioning services. Given the heterogeneity of distributed infrastructures, SAGA provides a much needed interoperability layer that lowers the complexity and simplifies the use of distributed infrastructure whilst enhancing the sustainability of distributed applications, services, and tools.
SAGA is an Open Grid Forum (OGF) recognized standard (GFD.90). It allows developers of distributed applications to construct higher-level functionality and abstractions, such as gateways, workflows, application management systems, and runtime environments. One key advantage to running with SAGA is that users do not need to be concerned about the individual batch queueing systems implemented on the various machines. Using the SAGA API and appropriate job adaptors, the different submission mechanisms for these queueing systems is handled on the SAGA backend, which is transparent to the user.
The SAGA API has been used to provide almost complete coverage over nearly all grid and distributed computing middleware/systems, including but not limited to Condor, Genesis, Globus, UNICORE, SGE, LSF/PBS/Torque, and Amazon EC2.

Deployment of BigJob into User Space
SAGA and BigJob are lightweight in the sense that they can be easily installed into the home directory of a user via the Python Package Index (PyPi). SAGA is packaged within BigJob, so users do not have to worry about installing two separate modules. Further information is provided on the BigJob website.

Resources Used
The simulations presented in this work were performed on a wide range of resources, some of which form (or formed) part of the EU PRACE (http://www.prace-ri.eu) and US XSEDE networks (https://www.xsede.org). These included six conventional CPU based clusters and one GPU machine all of which are listed below. Simulations run on the CPU based machines typically ran on 64 cores which provided optimal scaling (or 48 in the case of Kraken). On EMERALD a single node with 12 CPU cores and 2 GPUs was used. This provided performance of approximately 2.5 hours or wallclock time per simulated nanosecond. On five machines (Ranger, Kraken, HECToR, HLRB II and Huygens) we ran simulations using the BigJob architecture. When this was the case, a workload of sixteen one nanosecond simulations was selected to be run as a single BigJob. Each system had a SAGA BigJob submitted to it with 256 cores, and a wall-clock time of twelve hours. Each BigJob runs four 1 ns simulation sections on 64 cores each simultaneously, repeated four times to produce a total of 16 ns of trajectory per BigJob. The job length was sufficient to allow the simulations to complete (assuming ideal behaviour of four sequential sets of four simulations) and to also fit in the queue at all of the sites targeted. Further details of how BigJob was used to run these systems has been published previously by Kenway et al. 10 .
The MMPBSA, MMGBSA and normal mode computations were all performed on Kraken or small local clusters. All of these analyses are serial compute runs requiring only a single core. On Kraken BigJobs containing up to 40 of these single core jobs were created and executed. Typically 24 MMPBSA and MMGBSA combined runs took approximately one hour, normal mode computations up to 12 hours.