Introducing the Automated Ligand Searcher

The Automated Ligand Searcher (ALISE) is designed as an automated computational drug discovery tool. To approximate the binding free energy of ligands to a receptor, ALISE uses a three-stage workflow in which each stage applies an increasingly sophisticated computational method: molecular docking, molecular dynamics, and free energy perturbation, respectively. To narrow the number of candidate ligands, poorly performing ligands are progressively filtered out. The performance and usability of ALISE are benchmarked in a case study containing known active ligands and decoys for the HIV protease. The example illustrates that ALISE filters out the decoys successfully and demonstrates that the automation, comprehensiveness, and user-friendliness of the software make it a valuable tool for improved and faster drug development workflows.

The Automated Ligand Searcher (ALISE) can search for potential ligands in the PubChem database 1,2 through the PubChem Power User Gateway-Representational State Transfer (PUG-REST) interface. 3,4 This search is based on chemical similarity, which can be described as follows. A molecule can be characterized by a binary string (molecular fingerprint) whose bits identify whether it contains a specific molecular fragment (1) or not (0). 5 The similarity between two ligands with molecular fingerprints A and B is quantified by the Tanimoto score, K, which is the ratio of the number of fragments shared by both molecules (bits equal to 1 in both A and B) to the number of fragments present in at least one of them (bits equal to 1 in A or B): 6

$$K = \frac{|A \cap B|}{|A \cup B|}$$

To fetch the ligands similar to a given input ligand, the user must supply its molecular formula or simplified molecular input line entry system (SMILES) string, 7 as well as the similarity threshold and the number of desired ligands.
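As a minimal sketch of the similarity measure described above, the Tanimoto score can be computed directly from two fingerprint bit strings. The function name `tanimoto`, the example fingerprints, and the convention that two all-zero fingerprints score 1.0 are assumptions made for illustration; they are not taken from ALISE or PubChem.

```python
def tanimoto(a: str, b: str) -> float:
    """Tanimoto score of two equal-length binary fingerprint strings."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    # Fragments present in both molecules (bit is 1 in A and in B).
    both = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    # Fragments present in at least one molecule (bit is 1 in A or in B).
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")
    # Assumed convention: two fingerprints without any set bit are identical.
    return 1.0 if either == 0 else both / either


# Hypothetical six-bit fingerprints: two shared fragments, four in total.
fp_a = "110100"
fp_b = "100110"
score = tanimoto(fp_a, fp_b)
```

With these example fingerprints, two bits are set in both strings and four bits are set in at least one, so the score is 2/4. A similarity threshold supplied by the user would then be compared against this value.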

AutoLigand
Based on the ligand with the most rotatable bonds and the receptor, ALISE can automatically determine a suitable search space using AutoLigand. 8 AutoLigand identifies the continuous volume in the receptor with the highest interaction energy. A grid with 1 Å spacing that covers the entire receptor is created using AutoGrid4; 9,10 each grid point stores the potential between the receptor and any atom type in the ligand 8 at that point.
AutoLigand performs three steps: flood fill, local migration, and ray-casting neighborhood search. 8 The flood fill is initiated from one grid point, and the neighboring grid point with the highest potential is added sequentially until a continuous collection of grid points (an envelope) of a predefined size is formed. Flood fills are initiated from every fourth grid point unless the grid point is part of an existing flood fill. 8 In the local migration, the ten envelopes with the highest accumulated potentials are optimized by substituting the point in the envelope with the lowest potential, V_old, with the neighbor grid point of highest potential, V_new, see Fig. S1A. The ray-casting neighborhood search investigates a larger neighborhood for better potential. Rays of up to ten grid points are cast in six directions from all points in the envelope, see Fig. S1B. If the points in a ray have higher potential than the same number of lowest-potential points in the envelope, the low-potential points are substituted by the points in the ray.
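The flood fill and local migration steps can be illustrated with a small sketch. This is not the AutoLigand implementation: the grid is a plain dictionary of hypothetical potentials, six-neighbor connectivity is assumed, "highest potential" is interpreted as largest magnitude (following the |V_new| > |V_old| criterion stated for Fig. S1A), and each migration step only tries the single weakest point against the single best neighbor. All function names are invented for the example.

```python
from typing import Dict, Iterator, Set, Tuple

Point = Tuple[int, int, int]
Grid = Dict[Point, float]
# Six face-sharing neighbors of a grid point (connectivity is an assumption).
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def neighbours(p: Point, grid: Grid) -> Iterator[Point]:
    """Yield the neighbors of p that exist on the grid."""
    for dx, dy, dz in STEPS:
        q = (p[0] + dx, p[1] + dy, p[2] + dz)
        if q in grid:
            yield q


def frontier(points: Set[Point], grid: Grid, exclude: Set[Point]) -> Set[Point]:
    """Grid points adjacent to `points` that are not in `exclude`."""
    return {q for p in points for q in neighbours(p, grid)} - exclude


def flood_fill(grid: Grid, seed: Point, size: int) -> Set[Point]:
    """Grow an envelope from seed by adding the strongest neighbor each step."""
    envelope = {seed}
    while len(envelope) < size:
        candidates = frontier(envelope, grid, envelope)
        if not candidates:
            break  # nothing left to add
        # sorted() makes tie-breaking deterministic.
        envelope.add(max(sorted(candidates), key=lambda q: abs(grid[q])))
    return envelope


def is_connected(points: Set[Point], grid: Grid) -> bool:
    """True if all points form one continuous cluster."""
    if not points:
        return True
    seen, stack = set(), [next(iter(points))]
    while stack:
        p = stack.pop()
        if p in seen:
            continue
        seen.add(p)
        stack.extend(q for q in neighbours(p, grid) if q in points)
    return seen == points


def migrate_once(grid: Grid, envelope: Set[Point]) -> Set[Point]:
    """Try to swap the weakest envelope point for the strongest neighbor."""
    worst = min(sorted(envelope), key=lambda p: abs(grid[p]))
    rest = envelope - {worst}
    candidates = frontier(rest, grid, envelope)
    if not candidates:
        return envelope
    best = max(sorted(candidates), key=lambda q: abs(grid[q]))
    trial = rest | {best}
    # Accept only if |V_new| > |V_old| and the envelope stays continuous.
    if abs(grid[best]) > abs(grid[worst]) and is_connected(trial, grid):
        return trial
    return envelope


def local_migration(grid: Grid, envelope: Set[Point]) -> Set[Point]:
    """Repeat migration steps until no substitution is accepted."""
    while True:
        updated = migrate_once(grid, envelope)
        if updated == envelope:
            return envelope
        envelope = updated


# Hypothetical potentials along a single row of grid points.
values = [-0.1, -0.2, -0.3, -0.9, -1.0, -0.8]
grid = {(x, 0, 0): v for x, v in enumerate(values)}
filled = flood_fill(grid, seed=(1, 0, 0), size=3)
final = local_migration(grid, filled)
```

In this toy grid the flood fill started at x = 1 yields the envelope x = 1 to 3, and local migration then shifts it to x = 3 to 5, where the potentials have the largest magnitude. The ray-casting neighborhood search is not covered by the sketch.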

Virtual Screening Setup Options
In ALISE, a virtual screening experiment can be performed by supplying only a receptor and the ligands of interest. However, to leave expert users as much flexibility as possible, virtually every aspect of ALISE can be adjusted from the summary page. The summary page is divided into six sub-menus:
General options. A summary of the settings requested in the preliminary steps of ALISE, presented in Fig. 3 and the section Computational Realization of ALISE in the main paper. Additionally, the computational resource can be switched here.
Input files setup. Options for how the receptor and ligand files are set up, see Fig. S2.
Search parameters. Revisits the criteria used to search for ligands in the PubChem database, 1,2 see step 2 of Fig. 3 in the main paper.
MD options. Figure S4 shows the MD setup options. The parameters used by the generalized Born implicit solvent functionality to compute solvation free energies can be changed. Furthermore, standard MD settings (simulation time, time step, and system temperature) and the criteria for calculating interaction energies, e.g., the cutoff and switching distance, can be updated here. Finally, the output frequencies for trajectory and energy files and the number of MD simulations to perform in the virtual screening experiment can be modified.
Figure S4: Sub-menu for modifying how the molecular dynamics (MD) simulations are performed. The options affect how solvation free energies and interaction energies are computed. Furthermore, the simulation length and resolution can be modified here.
The FEP options include how many equilibration and simulation steps to perform in each step of the FEP transformation and how many such steps to perform, expressed through the lambda window size. MD settings, e.g., time step, cutoff and switching distance, and output frequencies, can be updated. Finally, the number of FEP simulations to perform in the virtual screening experiment can be modified.
Figure S5: Sub-menu for modifying how the FEP simulations are performed. Typical MD settings, e.g., time step, system temperature and pressure, and output frequencies, can be defined. Additionally, the quality of the FEP simulations can be adjusted through the number of equilibration and simulation steps performed in each step of the FEP transformation.
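To make the relation between the lambda window size and the simulation cost concrete, the following sketch divides the transformation from lambda = 0 to lambda = 1 into equal windows and counts the total number of MD steps. The function names, the assumption of equally sized windows, and the numbers used (a window size of 0.05 with 1000 equilibration and 4000 simulation steps) are illustrative only and do not reflect ALISE defaults.

```python
from typing import List, Tuple


def lambda_windows(window_size: float) -> List[Tuple[float, float]]:
    """Split the interval [0, 1] into equal lambda windows."""
    if window_size <= 0.0 or window_size > 1.0:
        raise ValueError("window_size must lie in (0, 1]")
    n = round(1.0 / window_size)
    # Assumption: the window size divides the interval evenly.
    if abs(n * window_size - 1.0) > 1e-9:
        raise ValueError("window_size must divide [0, 1] evenly")
    return [(i / n, (i + 1) / n) for i in range(n)]


def total_steps(window_size: float, equil_steps: int, sim_steps: int) -> int:
    """Total MD steps if every window runs equilibration plus simulation."""
    return len(lambda_windows(window_size)) * (equil_steps + sim_steps)


# Hypothetical settings chosen only to illustrate the arithmetic.
windows = lambda_windows(0.05)
steps = total_steps(0.05, equil_steps=1000, sim_steps=4000)
```

Halving the window size doubles the number of windows and therefore the total number of steps, which is the trade-off between FEP quality and cost that the sub-menu exposes.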

Figure S1: The AutoLigand method. Possible ligand binding sites on the receptor are represented by collections of grid points (envelopes). A) In a migration step, the interaction energy of an envelope is improved by substituting low-potential points (red) with higher-potential neighbor points (green). B) In the following step, the neighborhood is investigated by casting rays of points (yellow) from each point in the envelope. If the search improves the potential of the envelope, a ray substitutes existing points in the envelope. Migration and ray casting are repeated until an envelope of converged potential is found, as illustrated in C.
In the local migration step (Fig. S1A), substitution is allowed only if |V_new| > |V_old| and the envelope remains continuous. Local migration stops when no substitution will improve the potential of the envelope.
Figure S2 shows the input files setup options. Here, the user can choose which types of molecular bonds should be flexible in the ligands and whether any amino acid side chains should be flexible during docking. Furthermore, it is possible to choose whether water molecules and irregular residues, i.e., non-amino acids, should be removed from the receptor.

Figure S2: Sub-menu for modifying how the receptor and ligand files are set up.

Figure S3: Sub-menu for modifying how the docking is performed and how many docked ligands are presented on the result page.