Identification of non-covalent SARS-CoV-2 main protease inhibitors by a virtual screen of commercially available drug-like compounds

tease (Mpro, also known as 3CLpro) and its complex with an α-ketoamide inhibitor was reported. 4 Mpro is known to be a critical component of viral replication, making it an extremely attractive target for antivirals. 5 Unsurprisingly, several computational studies have been conducted to identify inhibitors of Mpro, focusing primarily on existing drugs and/or natural products. 6 In contrast, we decided to conduct a virtual screen of all commercially available drug-like molecules, giving a ligand pool of millions of candidates. Because the screen is restricted to compounds that are already on the market, any identified hits can be purchased directly rather than synthesized in-house.
We began our virtual screen by selecting the crystal structure of Mpro bound to an α-ketoamide inhibitor 3 (PDB 6y2f) as our protein target (Fig. 1). We selected this structure because it was the only structure of SARS-CoV-2 Mpro with a bound drug-like inhibitor available at the time (early 2020). The protein structure was prepared using ChimeraX 7 to remove the inhibitor and water molecules, leaving only the protein chain. The remaining structure was then adjusted to physiological pH (7.4) and converted to the PDBQT format using Open Babel. 8
We next prepared a library of virtual compounds using the ZINC15 database, which contains more than 230 million compounds. 9 We filtered this initial library by selecting only compounds that were commercially available and then applied a series of Lipinski filters to retain only drug-like molecules (molecular weight 200-500 g/mol, log P ≤ 5, number of rotatable bonds ≤ 7, total polar surface area ≤ 150 Å², H-bond donors ≤ 5, and H-bond acceptors ≤ 10). We further filtered the library to remove PAINS scaffolds, 10 then adjusted the pH of all compounds to 7.4 and compiled the library in PDBQT format. The final library consisted of 9,779,510 compounds.
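The property cutoffs listed above can be expressed as a simple predicate. The following is a minimal sketch only, not the authors' actual pipeline: it assumes the descriptors have already been computed for each compound (for example, as exported from a database), and the `Descriptors` record, its field names, and the example compounds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one compound (hypothetical record layout)."""
    mol_weight: float          # g/mol
    log_p: float
    rotatable_bonds: int
    polar_surface_area: float  # square angstroms
    hbd: int                   # H-bond donors
    hba: int                   # H-bond acceptors


def passes_druglike_filter(d: Descriptors) -> bool:
    """Return True if every cutoff quoted in the text is satisfied."""
    return (
        200.0 <= d.mol_weight <= 500.0
        and d.log_p <= 5.0
        and d.rotatable_bonds <= 7
        and d.polar_surface_area <= 150.0
        and d.hbd <= 5
        and d.hba <= 10
    )


# Made-up example compounds, for illustration only.
library = {
    "example-A": Descriptors(342.4, 2.1, 4, 88.0, 2, 5),
    "example-B": Descriptors(612.7, 3.0, 6, 120.0, 3, 9),  # molecular weight too high
    "example-C": Descriptors(298.3, 5.8, 3, 60.0, 1, 4),   # log P too high
}

kept = [name for name, d in library.items() if passes_druglike_filter(d)]
print(kept)  # ['example-A']
```

The PAINS removal and protonation steps are not covered by this sketch; they require substructure matching and a cheminformatics toolkit.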
Since several inhibitors of Mpro have been reported, we next decided to test which of the three docking programs available to us (AutoDock Vina 11 , iDock 12 and Smina 13 ) gave the best performance using these known inhibitors as a benchmark. We selected a test set of seven inhibitors with IC 50 values spanning three orders of magnitude 13,14 and then used the Spearman correlation coefficient (R s ) to rank the performance of each docking program (see supporting information, Table S1). We found that the iDock program performed the best (R s = 0.745) and selected it for our virtual screen. We then set the center of our search space to the center of mass of the bound inhibitor (x = 10.832, y = -0.291, z = 20.731) and defined the search space as a 22 Å × 22 Å × 22 Å cube to encompass the full binding surface. All other parameters of the iDock program were left at their default settings, as a recent benchmarking study found that increasing the settings to make the calculations more time-intensive gives little to no improvement. 15
After concatenating the results, we selected the top 75 compounds as an initial arbitrary cutoff. These 75 compounds were then uploaded to DataWarrior 16 where we applied additional drug-like filters (Druglikeness score > 0, DrugScore > 0.25) and removed any compounds with identified toxicophores, resulting in 45 hit compounds. We next applied a Tanimoto filter to these 45 compounds based on the FP2 path-based fingerprint 17 using a threshold value of 0.4 to eliminate compounds with high similarity, resulting in a total of 28 unique hit compounds (see supporting information, Fig. S1). The ZINC IDs and several select properties of the top 28 hits (named CP-1, CP-2, etc.) as calculated by DataWarrior are shown in Table 1.
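The benchmark step above ranks docking programs by how well their scores track measured potency. The sketch below shows how a Spearman coefficient could be computed for one program: it ranks both lists (averaging ranks for ties) and takes the Pearson correlation of the ranks. The docking scores and IC50 values are made up for illustration and are not the values in Table S1. The sign convention is also an assumption: here a more negative score paired with a lower IC50 yields a positive coefficient.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical benchmark: docking scores (kcal/mol) and IC50 values (µM)
# for seven known inhibitors. These numbers are invented for the example.
docking_scores = [-8.9, -8.1, -7.6, -7.9, -6.8, -7.0, -6.1]
ic50_values = [0.05, 0.4, 0.9, 2.0, 8.0, 15.0, 40.0]

rho = spearman(docking_scores, ic50_values)
print(f"R_s = {rho:.3f}")
```

Because the coefficient depends only on rank order, it makes no difference whether IC50 values are used directly or on a logarithmic scale.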
With our hit compounds identified, we began to order them from the associated vendors as indicated by the ZINC database. Due to the SARS-CoV-2 pandemic, however, some commercial vendors reported significant disruption in their transportation networks. As a result of these complications, we were unable to obtain eight of our hit compounds (CP-1, CP-4, CP-6, CP-15, CP-17, CP-18, CP-19, and CP-21). Rather than waiting for these compounds to become available, we decided to proceed to in vitro studies with the remaining 20 compounds.
With our compounds in hand, we next tested their inhibitory activity against Mpro. Briefly, a pH 7.3 buffer prepared with 20 mM Tris-HCl, 100 mM NaCl, 1 mM EDTA and 1 mM DTT was used for the in vitro inhibition assay of the hit compounds against SARS-CoV-2 Mpro. A substrate labeled with Dabcyl and Edans at the N- and C-termini, respectively, and comprising the cleavage site of Mpro ({Dabcyl}-KTSAVLQ↓SGFRKM-E{Edans}-NH2, cleavage site indicated by the arrow {↓}, GL Biochem, Shanghai) was employed in the FRET (fluorescence resonance energy transfer) based cleavage assay. Dequenching of the Edans fluorescence resulting from cleavage of the substrate by Mpro was monitored at emission and excitation wavelengths of 460 nm and 360 nm, respectively, using an FLx800 fluorescence spectrophotometer (BioTek). All compounds were dissolved in DMSO to prepare the stock solutions. Initially, 2.5 μL of Mpro at a final concentration of 0.5 μM was pipetted into a well with 20 μL of buffer solution, followed by the addition of 2.5 μL of the corresponding compound stock solution in DMSO, resulting in a final concentration of 50 μM in each well (2.5 μL of DMSO was used as the negative control). The protease and compound mixture was incubated at 37 °C for 10 min. Afterwards, 25 μL of FRET substrate dissolved in the reaction buffer at an overall concentration of 20 μM was added to the corresponding wells containing the protease and compound mixture to initiate the reaction. The change in relative fluorescence units per unit of time (ΔRFU/s) from the linear section of the curve was used to calculate the inhibition rate.
The inhibition rate was determined using the formula below, where V o represents the initial velocity of reaction of the negative control (DMSO only) and V I represents the initial velocity of reaction with inhibitor:
$$\text{Inhibition rate}\ (\%) = \left(1 - \frac{V_I}{V_o}\right) \times 100$$
All experiments were performed in triplicate and the values are presented as the means ± the standard deviation (Table 2).
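As an illustration of the calculation described above, the sketch below computes a triplicate inhibition rate and reports it as mean ± standard deviation. It assumes the conventional form of the inhibition rate, (1 − V_I / V_o) × 100, and that each replicate is normalized against the mean control velocity; both are assumptions, and the slope values are made up rather than taken from Table 2.

```python
from statistics import mean, stdev


def inhibition_percent(v_inhibitor: float, v_control: float) -> float:
    """Percent inhibition from initial velocities (assumed conventional form)."""
    if v_control <= 0:
        raise ValueError("control velocity must be positive")
    return (1.0 - v_inhibitor / v_control) * 100.0


# Made-up initial velocities (ΔRFU/s) from the linear part of each curve.
control_slopes = [12.0, 12.4, 11.6]   # DMSO-only wells
compound_slopes = [6.1, 5.8, 6.4]     # wells with a hypothetical compound

# Normalize each replicate against the mean control velocity (an assumption).
v_control = mean(control_slopes)
rates = [inhibition_percent(v, v_control) for v in compound_slopes]

# stdev() is the sample standard deviation (n - 1 in the denominator).
print(f"inhibition = {mean(rates):.1f} ± {stdev(rates):.1f} %")
```

A compound that is itself fluorescent at the detection wavelength can distort the measured slopes, so a large standard deviation from this calculation may reflect assay interference rather than variable inhibition.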
Since compounds CP-13 and CP-25 showed modest activity, we decided to subject them to molecular dynamics (MD) simulations carried out using AMBER18 to examine their ability to remain bound to the putative binding site under equilibrating conditions and to provide a calculated binding energy at this site. 18 We did not select CP-27 for MD given the large SD associated with its observed inhibition due to autofluorescence. Protein-ligand complexes of CP-13 and CP-25 with Mpro were subjected to 50 ns production runs at 300 K. Examination of the protein-ligand trajectories showed the complexes to be stable throughout the production runs, with minimal reorganization of the ligands at the binding site and general stability of the protein structure. RMSD and potential energy graphs are provided for CP-25 (Fig. S2), demonstrating that equilibration occurs within the first 35 ns, although a significant decrease in potential energy is not observed in this system. The RMSD-potential energy profile for CP-25 is representative of CP-13 with relative inhibitory activity, supporting the putative interactions of CP-13 and CP-25 with Mpro as shown in Fig. 2 and Fig. 3. Additionally, the trajectories were analyzed by the MM/GBSA and MM/PBSA methods to compute binding energies. 19 ΔG binding was estimated for CP-13
In conclusion, our virtual screen of nearly 10 million commercially available compounds targeting SARS-CoV-2 Mpro and the subsequent viral protease inhibition assay identified two compounds with modest inhibitory activity. The hit compounds, CP-13 and CP-25, are noncovalent inhibitors that bind with minimal reorganization and interact with protein residues primarily through non-polar interactions, as indicated by MD simulation. While these compounds are not currently potent enough to advance to the clinic, they provide two additional candidates for development as antivirals to combat COVID-19. These compounds could also serve as templates for the synthetic attachment of electrophilic "warheads", an approach that has led to the development of powerful covalent inhibitors of Mpro. 4 While additional improvement in potency is needed, we believe these findings help address the critical need for new treatment options to combat the ongoing pandemic: commercially available compounds with inhibitory activity against Mpro are a useful starting point for further optimization, through either purchasing or synthesizing additional analogues.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.