Identification of novel and potent inhibitors of SARS-CoV-2 main protease from DNA-encoded chemical libraries

In vitro screening of large compound libraries by automated high-throughput screening is expensive and time-consuming and requires dedicated infrastructure. Conversely, the selection of DNA-encoded chemical libraries (DECLs) can be performed rapidly with routine equipment available in most laboratories. In this study we identified novel inhibitors of SARS-CoV-2 main protease (Mpro) through the affinity-based selection of the DELopen library (open access for academics), containing 4.2 billion compounds. The identified inhibitors were peptide-like compounds containing an N-terminal electrophilic group able to form a covalent bond with the nucleophilic Cys145 of Mpro, as confirmed by X-ray crystallography. This DECL selection campaign enabled the discovery of the unoptimized compound SLL11, displaying an IC50 of 30 nM, showing that the rapid exploration of large chemical spaces enabled by DECL technology allows for the direct identification of potent inhibitors, avoiding several rounds of iterative medicinal chemistry. Compound MP1, a close analogue of SLL11, showed antiviral activity against SARS-CoV-2 in the low micromolar range when tested in Caco-2 and Calu-3 (EC50 = 2.3 µM) cell lines. As peptide-like compounds can suffer from low cell permeability and metabolic stability, the cyclization of the compounds as well as the substitution of selected residues with D-enantiomers will be explored in the future to improve the antiviral activity of these novel compounds.


1-Introduction:
Four years after the COVID-19 pandemic started, infections are driven by the emergence of new SARS-CoV-2 variants of concern (1). Although mRNA vaccines have been instrumental in reducing severe disease and hospitalization, achieving long-term immunity appears challenging and periodic boosting is required. The now dominant Omicron subvariants present over 30 mutations in the spike protein (S gene), conferring resistance to neutralizing antibodies induced by previous mRNA vaccines, bivalent vaccines or infection with a previous variant (2)(3)(4). The efficacy of monoclonal antibodies used to prevent SARS-CoV-2 infection (5), disease progression and death (6,7) can also be reduced by emerging variants presenting new mutations in the spike gene (6,7). In this scenario, antiviral options for the prevention and treatment of SARS-CoV-2 infection in immunosuppressed and high-risk subjects are necessary to further reduce the COVID-19 burden. Moreover, broad-spectrum antivirals active against several coronaviruses will be instrumental for preventing or mitigating the next pandemic by reducing early transmission and providing a starting point for the development of more potent compounds if necessary. SARS-CoV-2 3-chymotrypsin-like cysteine protease (3CL protease), also known as main protease (Mpro) (8), mediates the maturation of viral proteins by cleaving the two viral polyproteins pp1a and pp1ab at 11 sites (9). Mpro represents an attractive target for the development of antivirals against SARS-CoV-2 as it is essential for the viral life cycle and structurally conserved among alpha and beta coronaviruses (10), allowing the development of potent pan-coronavirus protease inhibitors (11)(12)(13). The feasibility of Mpro inhibitors as prophylaxis or treatment against SARS-CoV-2 infection has already been proven. Two orally administered Mpro inhibitors, nirmatrelvir (approved by the FDA and EMA) and ensitrelvir (approved in Japan by the Ministry of Health, Labour and Welfare (MHLW)) (14), are currently approved for emergency treatment of COVID-19. SARS-CoV-2 inhibitors have been identified through the development of substrate-derived peptide-like compounds (8,15,16) as well as screening of large libraries of compounds and fragments using different techniques such as in silico screening (11), high-throughput screening (17) and crystallographic screening (18).
Two studies have also reported the identification of novel Mpro inhibitors, with IC50 and EC50 in the low micromolar range (19,20), using DNA-encoded chemical libraries (DECLs). DECLs are large collections of compounds, each tagged with a unique DNA barcode, that are screened by affinity to select binders for a target of interest (21). Following the selection process, the chemical structures of the binders are elucidated through the sequencing of their unique DNA tags. This technology is particularly attractive because the combinatorial nature of DECLs combined with affinity screening allows for rapid and cost-efficient exploration of large portions of chemical space in vitro.
Here we report a new class of peptide-like inhibitors of Mpro identified using the commercially available DELopen platform (WuXi AppTec), which provides a library of 4.2 billion compounds (screened by the user) as well as the services necessary for the sequence-to-structure decoding of the binders' DNA tags. The most potent compound (MP1) inhibited the activity of SARS-CoV-2 Mpro with IC50 = 24 nM and SARS-CoV-2 infection with EC50 = 2.3 µM in cell-based assays.

2.1-Bead capture test
Immobilization of Mpro-Avi on Dynabeads MyOne Streptavidin T1 paramagnetic beads was tested before affinity selection according to a protocol provided with the DELopen kit. All reagents were added to 1.5 ml Eppendorf DNA LoBind tubes during the bead capture test and affinity selection. In brief, paramagnetic beads were washed three times with 1X wash buffer (50 mM Tris-HCl pH 7.5, 150 mM sodium chloride, 0.05% Tween-20, 1 mM dithiothreitol) using a DynaMag-2 magnetic stand (Invitrogen). Mpro-Avi (6 µg) was diluted in 1X selection buffer (50 mM Tris-HCl pH 7.5, 150 mM sodium chloride, 0.05% Tween-20, 1 mM dithiothreitol, 0.1 mg/ml sheared salmon sperm DNA) and an aliquot was collected as the input sample. The remaining Avi-tagged protein (5 µg) was immobilized on Dynabeads MyOne Streptavidin T1 paramagnetic beads at room temperature for 30 minutes. The flowthrough was collected and the beads were washed once with selection buffer. The beads were suspended in 1X selection buffer, a 'beads' sample was collected, and the remaining bead suspension was heated at 95 °C for 10 minutes. The beads were collected using a magnetic stand and the solution was collected as the heated eluate sample. The beads were then resuspended in selection buffer (heated beads). To visualize the relative amount of immobilized and eluted protein, the bead capture test samples were separated by SDS-PAGE. Samples were denatured and reduced using NuPAGE LDS Sample Buffer (4X) (Invitrogen) and NuPAGE Sample Reducing Agent (10X) (Invitrogen) by heating at 95 °C for 5 minutes. Samples were separated on NuPAGE 4-12% Bis-Tris Mini Protein Gels (Invitrogen) using a matched electrophoresis run chamber at 200 V for 35 minutes. Proteins on the gels were visualized using InstantBlue Coomassie Protein Stain (Abcam) and destained with distilled water.

2.2-DECL affinity selection
Three rounds of affinity selection were performed using the 3rd generation DELopen DECL (WuXi AppTec; hits.wuxiapptec.com/delopen) according to the provider's instructions. Before each round, Mpro-Avi (5 µg) was immobilized on Dynabeads MyOne Streptavidin T1 paramagnetic beads at room temperature for 30 minutes in 1X selection buffer. Reversible small molecule inhibitors were added at a concentration of 20 µM for the last 10 minutes of immobilization and included in the DEL incubation mixtures at a final concentration of 20 µM. Covalent inhibitors at 5 µM were preincubated with the immobilized target for 10 minutes at room temperature but not added to the DEL incubation. Bead-immobilized target at a final concentration of 1.7 µM was incubated with the DELopen compound library in 1X selection buffer (50 mM Tris-HCl pH 7.5, 150 mM sodium chloride, 0.05% Tween-20, 1 mM dithiothreitol, 0.1 mg/ml sheared salmon sperm DNA) for one hour at room temperature with gentle rotation. Three wash cycles were performed with 1X selection buffer (50 mM Tris-HCl pH 7.5, 150 mM sodium chloride, 0.05% Tween-20, 1 mM dithiothreitol) after each round. Compounds were released by heating for 10 minutes at 98 °C after each round. Samples collected after the third selection round were stored at -80 °C before transfer to WuXi AppTec for post-selection quality control and deep sequencing.

2.3-Protease expression and purification
SARS-CoV-2 3CL protease (Mpro) used for enzymatic assays and co-crystallization experiments was produced as previously described (22). The construct used contained nucleotide sequences corresponding to SARS-CoV-2 Mpro residues S1-Q306 (Chinese isolate, NCBI accession number YP_009725301). A detailed protocol is provided in the supporting information.
Avi-tagged SARS-CoV-2 Mpro (batch nr. MPRO_p009 GP-AVI) protease used for affinity selection of the DECL was kindly provided by Martin Walsh, Diamond Light Source, UK. A detailed protocol for the expression and purification of the enzyme is provided in the supporting information. SARS-CoV-2 Mpro Washington strain (WA1, accession number MT246667), WT (Mpro-WT) or carrying the E166V mutation (Mpro-E166V), was expressed and purified as described in (23).

2.4-Protein-ligand co-crystallization and x-ray data collection:
Compounds SLL11, SLL12 and MP9 were added at 27-fold, 34-fold and 13-fold excess, respectively, to a 5.2 mg/ml SARS-CoV-2 protease solution in 20 mM HEPES (pH 7.5) and 50 mM NaCl. Before co-crystallization, the non-dissolved ligand was spun down by centrifugation at 13,000 rpm for 30 seconds in a Hettich 200 R microcentrifuge.
For ligand SLL11, sitting-drop co-crystallization in 96-well Corning 3550 plates was performed by mixing equal amounts (150 + 150 nl) of ligand-incubated protein with a solution consisting of 18% w/v PEG3350, 0.2 M potassium thiocyanate and 0.1 M bis-tris propane pH 8.5 using a mosquito robot. The first crystals of the SLL11 complex appeared after 53 days in the well solution and were harvested on day 98. For ligand SLL12, we performed hanging-drop co-crystallization in 24-well VDXm plates by manually mixing equal amounts (1 + 1 µl) of ligand-incubated protein with a solution consisting of 20% w/v PEG3350, 0.2 M potassium thiocyanate and 0.1 M bis-tris propane pH 8.5. The first crystals of the Mpro-SLL12 complex appeared after 35 days and were harvested on day 98. For ligand MP9, we performed sitting-drop co-crystallization in 96-well MRC plates by mixing equal volumes (150 + 150 nl) of ligand-incubated protein with a solution consisting of 18% w/v PEG3350, 0.2 M potassium thiocyanate and 0.1 M bis-tris propane pH 8.7 using a mosquito robot. The first crystals of the MP9 complex appeared after 5 days and were harvested on day 9. All crystals were cryoprotected by adding a cryo solution, consisting of a solution supplemented with 15-30% glycerol and 50 mM NaCl, to the crystal droplet right before the crystals were picked up using a Dual-Thickness MicroLoop and flash-frozen in liquid nitrogen. X-ray diffraction datasets were collected at a cryogenic temperature of 100 K, using a wavelength of 0.9763 Å at the MAX IV BioMAX beamline (24) in Lund, Sweden (compounds SLL11, PDB ID 9EO6, and SLL12, PDB ID 9EOR) and a wavelength of 0.9795 Å at the Diamond Light Source (25) i04 beamline in Oxfordshire, UK (compound MP9, PDB ID 9EOX). All complexes crystallized in space group P1 with six SARS-CoV-2 Mpro molecules in the asymmetric unit. We collected 360 degrees of data using a single crystal for each complex and processed the datasets using XDS (26) as part of XDSAPP (27). The crystals diffracted better in some directions, and we therefore applied elliptical data truncation using the staraniso webserver (staraniso.globalphasing.org), capturing the best diffraction data at the expense of completeness in the highest resolution shells.
The structures were solved by molecular replacement using our in-house determined apo structure (7PFL) as the search model. For data scaling, molecular replacement and refinement, software from the CCP4 suite (28) was used, such as aimless (29), phaser (30), refmac5 (31) and coot (32). Refinement dictionaries for the ligands were generated from a ligand SMILES string using the grade web server (grade.globalphasing.org), Acedrg (33) and other recently developed ligand tools (34) in CCP4 to model the ligand-protein covalent bond. We used non-crystallographic symmetry (NCS) restraints throughout refinement, which ended when the R/Rfree values did not improve further. The structures of compounds SLL11, SLL12 and MP9 have 97.0/96.2/96.5% of residues in favored regions, 2.7/3.3/2.7% in allowed regions, and 0.3/0.5/0.8% in outlier regions of the Ramachandran plot. The SLL11, SLL12 and MP9 structures have been deposited in the Protein Data Bank with accession codes 9EO6, 9EOR and 9EOX, respectively, with data collection and refinement statistics presented in Supplementary Table 1.

2.5-In vitro enzymatic assay:
All experiments were carried out as previously reported (22) in black flat-bottomed 96-well plates (Nunc, Thermo Fisher Scientific) in a final volume of 100 µl. Different concentrations of the compounds were incubated with recombinant SARS-CoV-2 protease (final concentration 100 nM) in assay buffer (20 mM HEPES pH 7.5, 0.01% Triton X-100) for 10 minutes at room temperature. The enzymatic reaction was started by addition of the substrate DABCYL-Lys-Thr-Ser-Ala-Val-Leu-Gln-Ser-Gly-Phe-Arg-Lys-Met-Glu-EDANS at a final concentration of 20 µM. The fluorescence emission was monitored every 60 seconds for 40 minutes using a Tecan infinite M200 PRO plate reader (Tecan Trading AG, Switzerland) with the excitation wavelength set to 355 nm and the emission wavelength set to 538 nm.
For the initial screening, compound stocks (10 mM in 100% DMSO) were diluted to 10 µM in assay buffer and then further diluted ten times to the final concentration of 1 µM by transferring 10 µl into the assay wells (final volume 100 µl). The final concentration of DMSO was kept at 0.01% (v/v) in all wells, including the control wells. All compounds and controls were tested in triplicate.
For the SAR study, the IC50 of each compound was determined with a 12-point concentration series composed of one series of six 1:5 dilutions ranging from 4 to 0.00128 µM (final concentration in the well) and a second series of six 1:5 dilutions ranging from 2 to 0.00064 µM (final concentration in the well). Average IC50 and standard error of the mean were calculated from two independent experiments with each compound concentration tested in triplicate. DMSO concentrations were always kept ≤ 0.04% (v/v) in all wells.
For resistance testing, MP6, MP9, and nirmatrelvir were tested at final concentrations ranging from 4 µM to 0.00005 µM (eight 1:5 serial dilutions). The recombinant SARS-CoV-2 Mpro (WA1, Washington strain) carrying the E166V mutation (Mpro-E166V) was used at a final concentration of 500 nM. The same compounds were also tested at concentrations ranging from 100 µM to 0.78 µM (eight 1:2 serial dilutions) against Mpro-WT (final concentration 100 nM), also from the WA1 strain, for comparison. Average IC50 and standard error of the mean were calculated from two independent experiments with each compound concentration tested in triplicate. DMSO concentrations were always kept at ≤ 0.04% (v/v) in all wells when testing the compounds against Mpro-E166V and at 1% (v/v) when testing the compounds against Mpro-WT. The same concentration of DMSO was also present in control wells.
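The two interleaved dilution series of the SAR study can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name `serial_dilution` is hypothetical, and the snippet simply recomputes the final well concentrations stated in the text.

```python
def serial_dilution(top, factor, points):
    """Return `points` concentrations starting at `top`, each `factor`-fold lower."""
    return [top / factor**i for i in range(points)]

# Two 1:5 series of six points each (final well concentrations, in µM).
series_a = serial_dilution(4.0, 5, 6)   # 4 ... 0.00128 µM
series_b = serial_dilution(2.0, 5, 6)   # 2 ... 0.00064 µM

# Interleaving both series gives the 12-point curve used for IC50 fitting.
twelve_point = sorted(series_a + series_b, reverse=True)
```

Offsetting the second series by a factor of two halves the spacing between neighbouring points on a logarithmic axis, which should give a denser sampling of the dose-response curve than a single 1:5 series.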
The relative fluorescence units (RFU) per second were plotted and the initial velocities were calculated, normalized to the controls (untreated protease control wells = 0% inhibition, control wells with no substrate = 100% inhibition) and expressed as % inhibition of enzyme activity. The half maximal inhibitory concentration (IC50) was calculated by nonlinear regression fitting of the normalized 12-point dose-response curve to the model "log(inhibitor) vs. normalized response - Variable slope" with the equation Y = 100/(1 + 10^((LogIC50 - X)*HillSlope)). The data analysis was conducted in GraphPad Prism (v. 9.5, GraphPad Software, La Jolla, California, USA).
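The data reduction described above can be sketched in Python as follows. This is not the analysis actually used (the fitting was done in GraphPad Prism); the function names are hypothetical, and the snippet only illustrates how an initial velocity, a normalized % inhibition and the stated model equation relate to each other.

```python
def initial_velocity(times_s, rfu):
    """Least-squares slope (RFU per second) of a progress curve."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_r = sum(rfu) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times_s, rfu))
    return sxy / sxx

def percent_inhibition(v, v_untreated, v_no_substrate):
    """Normalize a velocity: untreated protease = 0 %, no substrate = 100 %."""
    return 100.0 * (v_untreated - v) / (v_untreated - v_no_substrate)

def hill_model(log_conc, log_ic50, hill_slope):
    """Y = 100 / (1 + 10^((LogIC50 - X) * HillSlope)), as given in the text."""
    return 100.0 / (1.0 + 10.0 ** ((log_ic50 - log_conc) * hill_slope))
```

With Y expressed as % inhibition, the curve rises with concentration for a positive Hill slope and passes through 50% at X = LogIC50, which is the defining property of the IC50 in this model.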

2.6-CPE-based antiviral assay:
Calu-3 cells were grown in DMEM (Gibco, 41966029) supplemented with 10% FBS (Gibco, 10500064) and 1× penicillin-streptomycin (Sigma-Aldrich, PA333) and incubated at 37 °C in a 5% CO2 atmosphere. The compound MP1 was tested at concentrations ranging from 20 to 0.156 µM (eight 1:2 serial dilutions) in two independent experiments, with each concentration tested in triplicate. One day prior to the assay, Calu-3 cells were seeded at a density of 20,000 cells/well in a 96-well plate in a final volume of 100 µl of cell medium (DMEM supplemented with 2% FBS and 1× penicillin-streptomycin, from now on referred to as DMEM-2). After overnight incubation (37 °C, 5% CO2 atmosphere), cells were pretreated for two hours with 100 µl of DMEM-2 with CP-100356 (MedChemExpress, HY-108347) added at a final concentration of 4 µM. After two hours, the cell medium containing CP-100356 was discarded, and cells were washed with 100 µl of PBS and infected with 200 plaque-forming units of SARS-CoV-2 (isolate from Sweden (35)), corresponding to an MOI of ~0.01. After one hour the cell medium with the virus was discarded, and cells were washed with 100 µl of PBS and treated by adding 100 µl of DMEM-2 containing different concentrations of MP1 and CP-100356 at a final concentration of 4 µM. After 48 h, the cell medium supplemented with MP1 and CP-100356 was replaced with 100 µl of fresh DMEM-2, to which 10 µl of a 5 mg/ml MTT (Sigma-Aldrich, M2128) solution in PBS was added. Following 4 h of incubation, formazan crystals were solubilized overnight by adding 100 µl of a 10% SDS, 0.01 M HCl solution. Optical density (OD) at 570 and 690 nm was read using a Tecan infinite M200 PRO plate reader (Tecan Trading AG, Switzerland). Throughout the assay, cell controls (uninfected cells treated or not treated with MP1 at different concentrations) and infection controls (infected wells not treated with MP1) were also treated with CP-100356 (4 µM), and the DMSO concentration was kept constant at 0.25% (v/v) in all wells. OD readings at the two wavelengths were subtracted, the resulting values were normalized to the controls, and EC50 values were determined by nonlinear regression analysis using GraphPad Prism (v. 9.5, GraphPad Software, La Jolla, California, USA).
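The readout processing described above can be illustrated with a short Python sketch. The exact normalization formula is not given in the text, so the one below (infection control = 0%, uninfected cell control = 100%) is an assumption, and all function names are hypothetical.

```python
def corrected_od(od570, od690):
    """Subtract the reference reading (690 nm) from the formazan signal (570 nm)."""
    return od570 - od690

def percent_protection(od_sample, od_infection_ctrl, od_cell_ctrl):
    """Assumed normalization: infection control = 0 %, uninfected cells = 100 %."""
    return 100.0 * (od_sample - od_infection_ctrl) / (od_cell_ctrl - od_infection_ctrl)

def multiplicity_of_infection(pfu, cells):
    """MOI as plaque-forming units per seeded cell."""
    return pfu / cells

# 200 PFU on 20,000 seeded cells, as stated in the protocol.
moi = multiplicity_of_infection(200, 20_000)
```

The MOI computed this way is based on the number of cells seeded the day before infection, so it is only an approximation of the value at the time of infection, in line with the "~" used in the text.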
The reaction mixture contained 12.5 µl of reaction buffer (a buffer containing 0.4 mM of each dNTP and 3.2 mM MgSO4), 0.5 µl of SuperScript III RT/Platinum Taq Mix, 0.5 µl of each primer (10 μM stock concentrations), 0.25 µl of probe (10 μM stock concentration), 2.4 µl of 25 mM magnesium sulfate, 3.35 µl of nuclease-free water, and 5 µl of RNA template. The RT-qPCR assay was performed on a CFX96 Touch Real-Time PCR Detection System (Bio-Rad Laboratories, Hercules, CA) under the following conditions: reverse transcription at 55 °C for 30 min and 95 °C for 3 min, followed by 45 cycles of denaturation at 95 °C for 15 s, extension at 57 °C for 30 s, and collection of the fluorescence signal at 68 °C for 30 s. All samples were run in triplicate. The corresponding number of copies for each Ct was calculated from a standard curve prepared with synthetic DNA gene fragments (gBLOCKs; IDT, San Jose, CA) with a five-base-pair deletion in the amplified regions of the viral genome, diluted in deionized, nuclease-free water to concentrations of 10^3 to 10^5 copies per µl. The five base pairs were deleted to make it possible to distinguish between viral RNA and gBLOCKs during sequencing. The LODs for both genes were 10^1 copies per µl. The relative fluorescence unit (RFU) data were obtained from the CFX Maestro Software (Bio-Rad CFX Maestro for Mac 1.1 Version 4.1.2434.0214, Bio-Rad Laboratories, Hercules, CA). Quantified viral RNA from infected wells treated with different concentrations of the compounds was normalized to the controls using GraphPad Prism (v. 9.5, GraphPad Software, La Jolla, CA).
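The conversion of Ct values to copy numbers through a standard curve can be sketched as below. This is a generic illustration, not the authors' script: the Ct values of the standards are invented for the example, the standard concentrations assume the range is read as 10^3 to 10^5 copies per µl, and the function names are hypothetical.

```python
import math

def fit_standard_curve(copies_per_ul, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies_per_ul]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain copies per µl for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical standards: concentrations (copies/µl) and made-up Ct values.
standards = [1e3, 1e4, 1e5]
standard_cts = [30.0, 26.7, 23.4]
slope, intercept = fit_standard_curve(standards, standard_cts)
```

A slope near -3.3 cycles per tenfold dilution corresponds to roughly 100% amplification efficiency, so inspecting the fitted slope is a simple sanity check on the standards before samples are quantified.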

2.8-Compounds synthesis:
All compounds were synthesized by solid-phase peptide synthesis (SPPS) on 2-chlorotrityl chloride resin (2CTC) using different Fmoc-protected natural and unnatural amino acids. All compounds prepared were either dipeptides or tripeptides terminated by a carboxylic acid as capping agent at the N-terminus. A general synthetic procedure describing the preparation of a carboxylic acid-capped tripeptide follows below. The syntheses of the corresponding capped dipeptides were carried out in an analogous fashion using one fewer amino acid coupling. To a 3 ml syringe with a frit was added 2CTC resin (63 mg, 0.1 mmol, 1 equiv.), after which a solution of Fmoc-protected amino acid #1 (0.2 mmol, 2 equiv.) and N,N-diisopropylethylamine (DIPEA, 0.05 ml, 0.30 mmol, 3 equiv.) in dichloromethane (DCM, 1 ml) was prepared and subsequently aspirated into the syringe. The resulting mixture was agitated at room temperature for 2 h to allow for coupling of amino acid #1 to the resin, after which the reaction solution was ejected and the 2CTC resin was washed with dimethylformamide (DMF, 2x1 ml) under agitation for 30 seconds. To deactivate the remaining 2CTC functionalities on the resin, a solution of DCM/methanol/DIPEA (ratio 85:15:5, 1.05 ml) was aspirated into the syringe, followed by agitation for an additional 15 min. The 2CTC resin was then washed with DMF (2x1 ml) and DCM (3x1 ml) before being subjected to Fmoc deprotection. Removal of the Fmoc group was done by treating the 2CTC resin with piperidine in methanol (80%, 1 ml) for 20 min under agitation at room temperature. The reaction solution was then ejected from the syringe, and the 2CTC resin was carefully washed using DMF (3x1 ml), methanol (2x1 ml), DCM (2x1 ml) and DMF (2x1 ml) before coupling of the next amino acid. The following method was used for coupling of amino acid #2 (0.15 mmol) and was subsequently repeated for the coupling of amino acid #3 (0.15 mmol) and the capping carboxylic acid (0.40 mmol with 3.8 equiv. HATU). The 2CTC resin was treated in a 3 ml syringe with a frit with a solution of amino acid #2 (0.15 mmol) and DIPEA (0.07 ml, 0.4 mmol, 4 equiv.) in DMF (0.6 ml). A solution of HATU (72 mg, 0.19 mmol, 1.9 equiv.) in DMF (0.4 ml) was then aspirated into this mixture, and the resulting solution was agitated at room temperature for 45 min, after which the resin was washed carefully with DMF (3x1 ml), methanol (2x1 ml), DCM (2x1 ml) and DMF (2x1 ml). Following coupling of amino acids #2 and #3, removal of the Fmoc group was done by treating the 2CTC resin with piperidine in methanol (80%, 1 ml) for 20 min under agitation at room temperature. The reaction solution was then ejected from the syringe, and the 2CTC resin was carefully washed using DMF (3x1 ml), methanol (2x1 ml), DCM (2x1 ml) and DMF (2x1 ml) before coupling of the next amino acid (or carboxylic acid in the last reaction). After completed synthesis, the target peptides were cleaved from the 2CTC resin using a mixture of hexafluoroisopropanol/DCM (700 µl/300 µl), followed by agitation for 10 minutes. The final product (purity >95%) was isolated after preparative HPLC purification using the mobile phase 20-60% MeCN in H2O with 0.05% formic acid over 8 CV at a flow of 30 ml/min.
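The reagent equivalents quoted in the procedure can be cross-checked with a small calculation. The molar masses and the DIPEA density below are standard literature values that do not appear in the text and are therefore assumptions of this sketch, as are the helper names.

```python
SCALE_MMOL = 0.1        # resin loading used as 1 equiv. (from the procedure)

# Assumed literature constants (not stated in the procedure).
HATU_MW = 380.23        # g/mol
DIPEA_MW = 129.25       # g/mol
DIPEA_DENSITY = 0.742   # g/ml

def mmol_from_mass(mass_mg, molar_mass):
    """Amount in mmol from a mass in mg and a molar mass in g/mol."""
    return mass_mg / molar_mass

def mmol_from_volume(volume_ml, density_g_per_ml, molar_mass):
    """Amount in mmol of a neat liquid from its volume, density and molar mass."""
    return volume_ml * density_g_per_ml * 1000.0 / molar_mass

def equivalents(amount_mmol, scale_mmol=SCALE_MMOL):
    """Equivalents relative to the resin loading."""
    return amount_mmol / scale_mmol

hatu_equiv = equivalents(mmol_from_mass(72, HATU_MW))
dipea_equiv = equivalents(mmol_from_volume(0.07, DIPEA_DENSITY, DIPEA_MW))
```

Under these assumptions, 72 mg of HATU and 0.07 ml of DIPEA come out at about 1.9 and 4.0 equivalents, matching the values given in the procedure.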

3.1-DNA-encoded-chemical library screening
The in vitro affinity selection of binders from the DELopen library (WuXi AppTec) was performed against recombinant Avi-tagged SARS-CoV-2 Mpro coupled to paramagnetic beads coated with streptavidin (Figure 1). The efficiency of Mpro coupling to the paramagnetic beads and the activity of Mpro bound to the magnetic beads were tested prior to the selection experiments, ensuring that the protease was stably bound to the beads while retaining its active conformation (Supplementary Figures 1 and 2). The selection of binders was performed with or without addition of the SARS-CoV-2 Mpro inhibitors X77 (20 µM) or GC376 (5 µM), which are known to bind and block the active site of Mpro. Therefore, compounds selected in the presence of X77 or GC376 were regarded as possibly binding to an allosteric pocket. Selection was also performed against empty beads to filter out promiscuous binders of no interest. After three rounds of selection, four highly enriched binders, of which one (SLL11) was a possible allosteric inhibitor, were identified by next generation sequencing (NGS) and selected for off-DNA synthesis (NGS and synthesis of compounds were provided by WuXi AppTec). The inhibitory activities of the synthesized compounds against recombinant SARS-CoV-2 Mpro (Table 1) were confirmed using a FRET-based enzymatic assay. Three out of the four synthesized compounds were active, with submicromolar IC50 values ranging from 30 to 140 nM. The two most enriched binders against Mpro, SLL11 and SLL12, were also the most potent (SLL11 IC50 = 30 nM, SLL12 IC50 = 53 nM), while the least enriched binder (SLL08) was not active at the tested concentrations (IC50 > 50 µM). The active compounds displayed high selectivity for SARS-CoV-2 Mpro and had no off-target activity against human Cathepsin S when tested up to a concentration of 50 µM (Supplementary Figure 3). The two most potent compounds were synthesized and co-crystallized with Mpro to determine their binding pose and binding site. SLL11 and SLL12 are peptide-like compounds (Figure 2) composed of four non-natural amino acids, of which P1' presents an electrophilic group. Both compounds were found to bind to the Mpro active site with similar binding poses, proving that SLL11 was not an allosteric inhibitor despite being selected and enriched also in the presence of the Mpro inhibitors X77 and GC376. The similar binding poses were not surprising, as SLL11 and SLL12 share three out of four groups.
The compounds interacted with Mpro mainly via a covalent bond formed between the electrophilic P1' group and the catalytic Cys145 close to the S1' subsite and a hydrogen bond formed between the P3 group and His163 in the S1 subsite, while the P2 and P3 hydrophobic residues occupied the S2 and S4 subsites. Hydrogen bonds were also formed between the backbone of the compounds and Glu166 and between the C-terminal amine of SLL11 and the Mpro Asn142 side chain.

3.2-Structure activity relationship study
Compound MP1 was obtained by substituting the C-terminus of compound SLL11 (IC50 = 30 nM) with a carboxylic acid group to increase the compound's solubility (Table 2). This structural difference caused no loss of activity and most likely did not affect its binding mode (Supplementary Figure 4); thus compound MP1 (IC50 = 25 nM) was used as the starting point and reference for the design of analogues and comparison of inhibitory activity (Table 2). The inhibitory activity of analogues against recombinant SARS-CoV-2 Mpro was first screened at a concentration of 1 µM, and analogues inhibiting Mpro activity by 80% or more were further tested to determine their IC50 (dose-response curves are shown in Supplementary Figure 5).
We assumed that the covalent bond formed between the N-terminal electrophilic group and Cys145 was essential for MP1 activity and started designing analogues with modifications at the C-terminus of the molecule. First, analogues presenting variations of the P3 and P2 residues were designed to assess whether the size of MP1 could be reduced while retaining activity. As observed from the crystallographic binding poses, the P3 residue occupies the S1 subsite of Mpro and forms a hydrogen bond with His163. Accordingly, removing the P3 residue (compound MP2) or substituting P3 with groups lacking a hydrogen bond acceptor (compounds MP3 and MP4) was poorly tolerated, reducing the compounds' inhibitory activity to ~2% and ~30%, respectively. Removing the side group of the P2 residue while maintaining the portion of backbone connecting P3 to the P1 residue (compound MP5) also sharply reduced the inhibitory effect to ~8%. Since both the P3 and P2 groups were essential for binding, and attempts to shorten MP1 were unsuccessful, we next assessed the contribution of the P1' electrophilic group to the binding of MP1 to the active site of SARS-CoV-2 Mpro. As expected, substituting the P1' residue with a less reactive analogue (compound MP6) was detrimental (~18% residual inhibitory activity), while a highly reactive aldehyde (compound MP7, IC50 = 24 nM) was equipotent to the original electrophilic group of MP1. A bulkier electrophilic group at P1' (compound MP8) also reduced the inhibitory activity against Mpro to ~18%. Lastly, the analogues MP9, MP10 and MP11 were designed to evaluate variants of the P1 residue. Exchanging the hydrophobic P1 group for other hydrophobic aliphatic (MP9 IC50 = 139 nM) or planar aromatic groups (MP10 IC50 = 71 nM and MP11 IC50 = 69 nM) reduced the potency of the analogues up to 4-fold. However, all the analogues maintained submicromolar activity with IC50 values in the nanomolar range.
Overall, all residues constituting MP1 seemed to be well optimized, so we next explored the post-modification of the C-terminus with various amide groups. Analogues with polar amide substituents (compounds MP13 IC50 = 47 nM, MP14 IC50 = 45 nM and MP16 IC50 = 47 nM) were ~2-fold more potent than the analogue with a non-polar amide group (MP12 IC50 = 106 nM), suggesting that this substituent might take part in hydrogen bonding with residues located near the S1 subsite of Mpro.

3.3-Antiviral activity of MP1 against SARS-CoV-2 in cell-based assays
MP1 was chosen to test the antiviral activity of the novel scaffold discovered in this study. The compound's cell permeability was first evaluated in Caco-2 cells (human colon epithelial cells) cultured on a permeable filter. MP1 showed good apical to basolateral (A-B) permeability (Papp A-B = 5.4 × 10⁻⁶ ± 3.6 × 10⁻⁶ cm/s) and lower basolateral to apical (B-A) permeability (Papp B-A = 2.5 × 10⁻⁶ ± 0.6 × 10⁻⁷ cm/s) with no observable efflux of the compound (Papp B-A / Papp A-B = 0.5). Accordingly, MP1 was active in Caco-2 cells and showed dose-dependent inhibition of viral replication as measured by RT-qPCR (Figure 3A). Since Caco-2 cells are highly permissive for SARS-CoV-2 infection but do not develop a cytopathic effect (CPE), we tested the capacity of MP1 to inhibit CPE development and increase cell viability in infected Calu-3 cells (human lung epithelial cells). Unexpectedly, MP1 had no protective effect in infected Calu-3 cells at concentrations as high as 20 µM (data not shown). Based on previous studies reporting the P-glycoprotein (P-gp) mediated efflux of peptide-like Mpro inhibitors (12,37,38), we assumed that MP1 was a substrate of P-gp. When tested in combination with the P-gp inhibitor CP-100356 (4 µM), MP1 showed dose-dependent inhibition of CPE development with EC50 = 2.3 ± 1.1 µM and observable cytotoxic effect (Fig. 3B).
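The efflux ratio quoted for the Caco-2 permeability experiment follows directly from the two apparent permeability coefficients. The sketch below recomputes it; the function name is hypothetical, and the exponent of 10⁻⁶ cm/s is an assumed reading of the values (the ratio itself does not depend on the shared power of ten).

```python
def efflux_ratio(papp_b_to_a, papp_a_to_b):
    """Ratio of basolateral-to-apical over apical-to-basolateral permeability."""
    return papp_b_to_a / papp_a_to_b

# Mean Papp values for MP1 (cm/s), read as multiples of 1e-6.
papp_a_to_b = 5.4e-6
papp_b_to_a = 2.5e-6

ratio = efflux_ratio(papp_b_to_a, papp_a_to_b)
```

The result is about 0.46, which rounds to the reported 0.5. An efflux ratio above roughly 2 is a commonly used indicator of active efflux (an assumption here, not a criterion stated in the text), so a value below 1 is consistent with no observable efflux in this cell line.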

3.4-Effect of the E166V Mpro variant on MP1 and MP7 inhibitory activity
The inhibitory activity of compounds MP1 and MP7 was tested against recombinant Mpro carrying the E166V variant (Mpro-E166V), known to confer resistance against nirmatrelvir. Compounds MP1, MP7 and nirmatrelvir, used as a reference, had comparable IC50 values in the low nanomolar range against the recombinant wild-type Mpro (Mpro-WT). The inhibitory activity of the three compounds was significantly decreased against Mpro-E166V, with IC50 values of 36.6 µM (MP1), 21.1 µM (MP7) and 15.7 µM (nirmatrelvir) (Figure 4).

4-Discussion:
In this study we report the identification of novel inhibitors of SARS-COV-2 M pro from the commercially available "DELopen" DNA encoded chemical library (DECL).
Three potent compounds (SLL07, SLL11 and SLL12) with IC50 in the low nanomolar range were directly identified through the cost-efficient affinity screening of 4.2 billion combinatorial molecules.The selected peptide-like compounds were formed by three unnatural amino acids capped at the N-terminus by a carboxylic acid derivative carrying a reactive electrophilic group.Crystallographic studies of SLL11 and SLL12 binding poses showed that side groups of all four residues (P1', P1, P2 and P3) occupied one of the major subsites of M pro active site (S1', S1, S2 and S4) with a novel arrangement of the peptide-like backbone to fit the binding site.The backbone of compounds SLL11 and SLL12 in the crystal structures assumed an unprecedented semi-cyclic conformation with the N-terminus and C-terminus stacked and possibly forming a T-shaped non-bonded π-π interaction.(Figure 2).In this conformation the N-terminal electrophilic group binds Cys145 close to the S1' pocket while the C-terminal pyridine group binds in the S1 pocket.This C-terminus to N-terminus orientation is opposite to the orientation of the natural substrate of M pro (39) and its substrate-derived inhibitors.This results in the peptide backbone of SLL11 and SLL12 assuming the typical conformation of a D-peptide with the amino groups presented on the left side of the backbone alpha carbon as opposed to conventional L-amino acid backbones displaying them on the right side.This unusual confirmation becomes more evident when the binding poses of SLL11 and nirmatrelvir bound to the M pro of SARS-CoV-2 are compared as shown in the Supplementary Figure 6.
Attempts to further improve the compounds' potency by systematically varying one building block residue at a time resulted in the design of several analogues with IC50 values lower than 100 nM. However, SLL11 (and its close analogue MP1) remained the most potent inhibitor (IC50 ≤ 30 nM), showing that the screening of large combinatorial libraries can identify well-optimized compounds without the need for several iterative cycles of medicinal chemistry.
Reducing the molecular weight, and thereby improving the physicochemical properties of the compounds, proved to be difficult. The inhibitory activity of analogues with a less reactive electrophile at P1' was almost completely abrogated, suggesting that a covalent interaction with Mpro was essential. Co-crystallization of SLL11 and SLL12 with SARS-CoV-2 Mpro confirmed that the N-terminal nitrile electrophilic group formed a covalent bond with Cys145 of the enzyme's catalytic dyad, as reported for several other peptide-like inhibitors of Mpro (12,38,40,41). Likewise, removing the whole P2 residue or the P3 side group was not possible without causing a drop in the compounds' inhibitory activity. P3 formed an essential hydrogen bond with S2 His163 that mimics the interaction with the glutamine residue of the natural substrate (42). The side group of the P2 residue did not form any hydrogen bond with residues in the S4 pocket but was still essential for activity, suggesting that P2 contributes more than simply connecting P3 to the rest of the molecule.

Table 1: List of compounds selected from DECL libraries and relative in vitro activity.
a The enrichment score quantifies how abundant a compound was after selection, e.g. a compound with enrichment score = 100 was 100 times more abundant

Table 2: List of MP1 analogues and relative inhibitory activity against SARS-CoV-2 Mpro.
a Compounds were tested at a concentration of 1 µM in triplicate (n = 3 replicates). Average inhibition of Mpro ± standard deviation (SD) is shown only for compounds that were tested in two independent experiments.
b Average IC50 and standard error of the mean (SEM) were calculated from two independent experiments in which each compound concentration was tested in triplicate (n = 3 replicates).
c The IC50 of compound MP12 was determined from a single experiment in which each compound concentration was tested in triplicate (n = 3 replicates); standard deviation (SD) is reported instead of SEM.

Figure 1: Workflow of hit identification and validation using DNA-encoded chemical libraries. Novel inhibitors of SARS-CoV-2 Mpro were identified from the DELopen library containing 4.2 billion DNA-tagged compounds. After three rounds of affinity selection, the unique DNA tags of the binders were sequenced, allowing the identification of highly enriched compounds by their numeric building block codes. The molecular structures of the most promising compounds were subsequently disclosed by WuXi AppTec. Four compounds were synthesized off-DNA and tested to confirm their in vitro inhibitory activity against Mpro and their antiviral activity against SARS-CoV-2 in cell-based assays.

Figure 2: General structure of the compounds and their interactions with SARS-CoV-2 Mpro. (A) Crystal structures of compounds SLL11 (provisional PDB ID 9EO6), shown in orange, and SLL12 (provisional PDB ID 9EOR), shown in purple, bound to the SARS-CoV-2 Mpro active site. The two residues forming the enzyme's catalytic dyad (His41 and Cys145) and the residues found interacting with SLL11 and SLL12 are shown in red. The subsites of the Mpro active site are labeled and predicted hydrogen bonds are drawn in green. (B) Electron density maps (2Fo-Fc) covering the ligands SLL11 (yellow) and SLL12 (orange), contoured at the 1.2 sigma level.

Figure 3: Antiviral effect of MP1 against SARS-CoV-2. (A) The inhibitory effect on viral replication was assessed using a yield reduction assay. Caco-2 cells were infected with SARS-CoV-2 (MOI 0.01) and treated with 5, 0.5 or 0.05 µM of MP1. Supernatants were collected 48 h post infection and the viral copy number per ml was quantified by RT-qPCR. The reduction in viral copy number per ml from two independent experiments is reported as inhibition of viral replication (%) ± SD. (B) Inhibitory effect on the development of the cytopathic effect (CPE) induced by SARS-CoV-2 infection. Calu-3 cells were infected (MOI 0.01) and treated with different concentrations of MP1; cell viability was assessed by MTT assay 48 h post infection. The average EC50 from two independent experiments is reported ± the standard error of the mean.
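The conversion of viral copy numbers into percent inhibition in a yield reduction assay can be sketched as follows. This is an illustrative calculation only: it assumes inhibition is computed relative to an untreated infected control, which the caption does not state explicitly, and all copy numbers below are hypothetical placeholders, not data from this study.

```python
from statistics import mean, stdev


def percent_inhibition(treated_copies_per_ml: float,
                       untreated_copies_per_ml: float) -> float:
    """Percent reduction in viral copy number relative to an untreated control."""
    if untreated_copies_per_ml <= 0:
        raise ValueError("untreated control must be positive")
    return 100.0 * (1.0 - treated_copies_per_ml / untreated_copies_per_ml)


# HYPOTHETICAL copy numbers (copies/ml) from two independent experiments.
experiments = [
    {"untreated": 1.0e8, "treated": 2.0e6},
    {"untreated": 8.0e7, "treated": 4.0e6},
]

inhibition = [percent_inhibition(e["treated"], e["untreated"]) for e in experiments]

# Mean and sample standard deviation across the independent experiments.
avg_inhibition = mean(inhibition)
sd_inhibition = stdev(inhibition)
print(f"inhibition of viral replication: {avg_inhibition:.1f} % ± {sd_inhibition:.1f} (SD)")
```

With only two independent experiments the sample standard deviation rests on a single degree of freedom, so it should be read as a rough indication of spread.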

Figure 4: Effect of the E166V variant on the inhibitory activity of compounds MP1 and MP7. Dose-response curves of MP1 and MP7 tested against recombinant wild-type Mpro (Mpro-WT) and mutated Mpro carrying the E166V substitution (Mpro-E166V). IC50 values are reported as average ± SEM from two independent experiments performed in triplicate. Nirmatrelvir was also tested for comparison.
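To illustrate how an IC50 reported as average ± SEM from independent experiments can be derived from dose-response data, the sketch below estimates the IC50 of each experiment by linear interpolation on a log10 concentration axis and then averages the estimates. This is a simplified stand-in and not the fitting procedure used in the study, which is not described in this section; dose-response curves are more commonly fitted with a four-parameter logistic model. The function names and all data points are hypothetical.

```python
import math
from statistics import mean, stdev


def ic50_by_interpolation(concs_nm, inhibition_pct):
    """Estimate IC50 by interpolating % inhibition linearly against log10(concentration)."""
    points = sorted(zip(concs_nm, inhibition_pct))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        # Find the pair of neighbouring concentrations that brackets 50 % inhibition.
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    raise ValueError("50 % inhibition is not bracketed by the data")


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# HYPOTHETICAL dose-response data: concentrations in nM, mean % inhibition
# of triplicate wells at each concentration, for two independent experiments.
concs_nm = [1.0, 10.0, 100.0, 1000.0]
experiment_1 = [5.0, 30.0, 70.0, 95.0]
experiment_2 = [10.0, 40.0, 80.0, 98.0]

ic50_values = [ic50_by_interpolation(concs_nm, y) for y in (experiment_1, experiment_2)]
ic50_mean, ic50_sem = mean_and_sem(ic50_values)
print(f"IC50 = {ic50_mean:.1f} ± {ic50_sem:.1f} nM (mean ± SEM, n = 2)")
```

Interpolating on the log axis matches the way such curves are normally plotted. Because the SEM here comes from only two experiments, it equals half the difference between the two estimates.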