Docking and molecular dynamics studies of human ezrin protein with a modelled SARS-CoV-2 endodomain and their interaction with potential invasion inhibitors

Graphical abstract


Introduction
Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) is a single-stranded positive-sense RNA beta coronavirus of the coronaviridae family which include SARS-CoV-1 and the Mid-dle Eastern respiratory syndrome coronavirus (MERS-CoV) (Paules et al., 2020;de Wit et al., 2016). This highly pathogenic member of the coronaviridae family targets the upper respiratory tract (URT) with pneumonia-like symptoms which was termed Coronavirus Disease 2019  which was first detected in Wuhan in the Hubei Province of China (Zhu et al., 2019). The pandemic has so far involved 535 million infections with a death count of 6.3 million as on 16th of June 2022 (WHO, n.d.).
The spike membrane domain is 1255 amino acids long which is further classified into N-terminal ectodomain (1211 amino acids), along with smaller regions that form the transmembrane domain (TM) (23 amino acids), and the C-terminal endodomain (39 amino acids) (Li et al., 2005;Lu et al., 2013) (Fig. 1a). In particular, the endodomain is made up of both cysteine-rich and charge-rich motifs (DEDDSE) (Petit et al., 2007). The cysteine-rich motif is comprised of 4 cysteine clusters, namely, Cl1, Cl2, Cl3 and Cl4 which are principally involved in cell-cell membrane fusion and are accompanied by S protein-protein interaction (shown in Fig. 1b). The charge rich motif is involved in virion assembly, effected through the interaction of S and M proteins (Fehr and Perlman, 2015). It plays a crucial role in virus entry after the formation of the fusion pore (Millet et al., 2012). Based on the literature survey, the F1 lobe of the human ezrin protein interact with the endodomain of SARS-CoV-1 which in turn belongs to the ERM family (Millet et al., 2012). They are mainly involved in organizing the membrane domain through their association with actin (filamentous), transmembrane protein and lipid rafts (Fehon et al., 2010). However, the crystal structure of this part of the endodomain is still to be elucidated.
Ezrin is comprised of three functional domains classified into the FERM domain (1-296), central domain (297-496), and Cterminal domain (C-ERMAD) (497-586) (Fiévet et al., 2007). The FERM domain with three lobes (F1, F2 and F3) prefers to interact with membrane and signaling proteins (Smith et al., 2003;Bretscher et al., 2002). The F1 lobe of the ezrin protein interacts with the SARS-CoV-1 endodomain (Millet et al., 2012). The coiled central domain interacts with p85 and the C-ERMAD interacts with F-actin (Gautreau et al., 1999). Such binding of ezrin with the SARS-CoV endodomain restricts the membrane fusion entry of the virus and S dependent early events of infection, which appears to alter the efficacy of fusion (Millet et al., 2012). Based on the literature review, a single amino acid substitution is observed between the SARS-CoV-1 and 2 endodomain wherein alanine at position 13 of SARS-CoV-1 is substituted by cysteine in SARS-CoV-2 (Xia, 2021). The impact of this substitution within the endodomain and their role in ezrin binding remains elusive till date.
A search for potential drugs to treat the patients with COVID-19 infection is underway (National Institute of Health., Singh et al., 2020;Kumar et al., 2020). Two bis-piperazine based inhibitors (DHHC9) were recently reported to inhibit the palmitoylation in SARS-CoV resulting in lower fusion and infection (Ramadan et al., 2022).
This in-silico based study examines the difference in binding affinity of the ezrin protein with SARS-CoV-1 and 2 endodomain and drug molecules.

3D modelling of the protein complexes
In this study, the 3D protein models of human actin, human ezrin, and the endodomain of SARS-CoV-1 and 2 were generated. Firstly, the actin protein sequence of Homo sapiens was downloaded from the Uniprot database with Accession Number P63267 (Bairoch and Apweiler, 1997). This sequence was submitted to the SWISS-MODEL server to identify a potential template for generating actin protein (Schwede et al., 2003). A potential template with better query coverage and resolution was selected for model building and energy minimization processes using the GROMACS standalone software (Berendsen et al., 1995). Secondly, the ezrin protein sequence of Homo sapiens was retrieved from the Uniprot database (Accession Number: P15311) and subjected to similar approaches. Finally, an endodomain sequence for SARS-CoV-1 and 2 of 39 residues was retrieved from the National Center for Biotechnology Information database (NCBI, 1988) (SARS-CoV-1: YP0098250, SARS-CoV-2 MT050493), and a structural search was initiated. With no reported hits from the Protein Data Bank, the I-TASSER online server was considered for model building (Berman, 2000;Yang and Zhang, 2015). The five largest structure clusters itemized five models, each with C-and TM-scores. The highest C-score confirms the highest confidence score, and the highest TM score ensures the accuracy of the topology of the model. Furthermore, the amino acid alanine in SARS-CoV-1 was replaced by cysteine to generate the endodomain of SARS-CoV-2. Finally, all the minimized 3D structures were subjected to structural validation using the PROCHECK online server (Laskowski et al., 1993) as well as the ProSA-Web server (Wiederstein and Sippl, 2007). These validated structures were further subjected to docking studies.

. Optimization of the ligand molecules
Based on the literature review, drugs listed as being of interest for potential repurposing for the treatment of COVID-19 were retrieved from Protein Data Bank (Berman et al., 2000) and Pub-Chem Database (Kim et al., 2016). Drug molecules retrieved from the PDB are listed in Supplementary Table 1  server for the generation of stable conformers (Miteva et al., 2010). Regarding the procedural settings, the ''input type file" was 1D2D and the ''input drug" description was SDF. The ''output format" was in PDB formats. Regarding the produce setting, here we opted for ''single" for a single energetically reasonable conformation taken from several passes for multi-conformer generations.

Protein-protein docking simulations
The human ezrin protein was docked with actin protein using the HADDOCK online server (van Zundert et al., 2015) to identify their exact region of interaction. For the docking, the requisite active and passive residues were provided as needed. For the actin protein, the active residues include Gly24, Asp26, Ser349, Thr352, Thr149, and Glu168 . Residues of the passive were defined automatically through the checkbox. For the ezrin protein, based on the literature review, the last 34 amino acids of C-ERMAD from the Cterminal responsible for binding are considered to be the active site region (Turunen et al., 1994). Passive residues were again selected automatically. Next, the endodomains of SARS-CoV-1 and SARS-CoV-2 were docked with the ezrin protein to understand the effect of amino acid substitution on their binding affinity. Endodomain proteins were targeted against the F1 lobe (6-93 amino acid residues) of the ezrin protein which were also considered to be the active/passive residues. After each run, the HADDOCK score and the buried surface area were taken into consideration for clustering to derive a maximum score. All the docked poses were subjected to a PRODIGY online server analysis Xue et al., 2016). All the selected poses were visualized using Discovery Studio (BIOVIA, 2019).

Protein-ligand docking studies
To understand the overall drug binding affinity in ezrin protein, forty-six stable chemical compounds were docked with the ezrin protein using the AutoDock version 1.5.6 software (Morris et al., 2009). Docking studies were initiated with the addition of Kollman and Gasteiger charges for proteins and ligands respectively. The grid consists of twelve residues based on literature review 29 (His48, Trp58, Tyr85, Gln105, Ile115, Tyr201, and Asn204, Trp278, Leu274, Phe250, Pro272, and Leu281) was placed within the binding sites. The size of the grid box was maintained at 74 Å, 60 Å and 58 Å for Â, y, and z respectively. The grid center was also set to À8.583 Å, 6.056 Å, and À1.056 Å for x, y, and z, respectively. The AutoGrid 4.0 and AutoDock 4.0 programs were used to generate grid maps, and 10 conformers were generated using a Lamarckian Genetic Algorithm (LGA) for each of the compounds. Additionally, the binding energy and inhibition constant for each compound were selected for visualization using Discovery Studio Visualizer.

Molecular dynamics simulation
The structural stability of the ezrin protein in association with the viral endodomains and the drug moieties were investigated under optimum thermobaric conditions using the Desmond v.3.6 package (Shaw, 2013). The TIP3P water model was applied for the simulation of water molecules. Furthermore, orthorhombic periodic conditions were considered to define the shape and size of the repeating unit at 10 Å distances. In total, 12 Cl À ions were added to neutralize the system. The overall concentration of Na was 50.9 mM and Cl was 66.2 mM. Later the solvated system was further subjected to energy minimization under NPT ensemble maintaining the default parameters. The simulation was performed with the periodic boundary conditions in the NPT ensemble using OPLS 2005 force field parameters (Jorgensen et al., 1983). The overall temperature was maintained at 300 K, and the atmospheric pressure was at 1 atmospheric pressure using Nose-Hoover temperature coupling and isotropic scaling (Nosé, 1984). After equilibration, the molecular dynamics was performed for 100 ns to analyze the trajectory of the protein-protein and protein-ligand complex through the system. The Root Mean Square Deviation (RMSD), Root Mean Square Fluctuation (RMSF), and the number of hydrogen bonds and other interactions between the ezrin-endodomain and ezrin-drug moieties were analyzed from the generated images.

3D protein modelling
A template search for human actin protein through the SWISS-MODEL server listed 4EFH of Acanthamoeba Actin Complex with Spir Domain D, with a 92.5% amino acid sequence identity and 100% query coverage. The modelled structure (shown in Fig. 2a) was subjected to structural validation using a PROCHECK server Ramachandran plot analysis. As per the report, it showed 88.4% of residues in the most favored region, and 11.6% in the additional

Table 1
Protein-protein docking of ezrin-actin proteins using HADDOCK online software with their interacting residues.

Human Ezrin
Human Actin Gly147 allowed regions. However, no residues were observed in the generously allowed and disallowed regions (shown in Supplementary  Fig. 1a & b). Then, the partially available human ezrin protein (shown in Fig. 2b) from the Protein Data Bank with PDB ID: 4RM8 was energy minimized and submitted for Ramachandran Plot analysis. 92.5% of these residues were in the most favored region, 7.2% in the additional allowed region, no residues were in the generously allowed region, and 0.3% in the disallowed region. Based on a ProSA-Web server report, the local model quality of the actin and ezrin proteins was stable, with the graph running >0 value in both instances (shown in Supplementary Fig. 1c & d).
Next, the modelled endodomain of the SARS-CoV-1 using the I-TASSER with a C-score and an estimated TM-score of À2.13 and 0.46 ± 0.15 respectively, was selected for further analysis. The energy minimized modelled endodomain reported 21.2% residues in the most favored regions, 60.6% in additional allowed regions, 0.0% in the generously allowed region, and 3.0% in the disallowed region according to the Ramachandran plot. A ProSA-Web report   also confirmed this short peptide region to be energetically stable (shown in Supplementary Fig. 1e & f). The 1-39 long generated endodomain model with its actual amino acid position of 1235-1273 was considered for bond analysis. As per the generated report, the 3D model of the viral endodomain displayed three disulphide bridges, between Cys9-Cys14, Cys2-Cys6 and Cys7-Cys19. Their images were generated using PyMol standalone software (PyMOL Molecular Graphics System, Schrödinger, LLC). These disulphide bridges were observed between Cl-1 and Cl-2 (Cys2-Cys6); Cl-2 and Cl-3 (Cys9 and Cys14) and Cl-2 and Cl-4 (Cys7 and Cys19). Thus, all 4 subdivisions of clusters interact with each other to give more stability to the endodomain from a structural perspective. Our 3D modelled viral endodomains of SARS-CoV-1 and SARS-CoV-2 are the first of their kind to be reported (shown in Fig. 2c). To confirm the disulphide bridge formation in the endodomains of SARS-CoV-1 and SARS-CoV-2 here we performed

Table 2
The interacting residues of human ezrin protein with SARS-CoV-1 and SARS-CoV-2 endodomains during protein-protein docking.

Protein-protein docking
Based on the docking studies ezrin and actin proteins generated ten clusters with 4 poses each. The cluster was assessed with a HADDOCK score of À54.0 ± 6.1 and a buried surface area of 2602. 8 ± 36.0 Å 2 . As per the PRODIGY report, the binding energy of ezrin with actin is À9.5 kcal/mol. The bond analysis revealed three salt bridges between ezrin and actin which include Arg569-Glu335, Glu582-Lys329, and Arg559-Glu168. Additionally, thirteen hydrogen bonds were observed between the ezrin and actin proteins which stabilized their interactions (shown in Fig. 3a and in Table 1). The ezrin-actin protein docking agrees with the literature wherein actin prefers to bind within the 34 amino acids (Ile553-Leu586) of the C-terminal region. Additionally, interactions expand further into C-ERMAD (Thr533, Ser536, Arg542, and Glu525). Three salt bridges assist in stabilizing these interactions which are located within the C-terminal region. With respect to the ezrin-SARS-CoV-1 docking, the HADDOCK score is À17.4 ± 15.6 with a buried surface area of 1934.4 ± 32.6 Å 2 . Furthermore, the HADDOCK score of the endodomain of SARS-CoV-2 with the ezrin protein is À24.5 ± 6.7 with a buried surface area of 1835.4 ± 74.5 Å 2 . To conclude, the binding affinity between the endodomains and ezrin revealed that the binding of the SARS-CoV-1 endodomain was stronger with a larger binding surface area than that of SARS-CoV-2. The 99 Å 2 difference in surface area may assist in binding the host with the viral protein. According to the available literature, a containment in infection at the entry stage was reported by Millet et al., 2012. This could be due to the higher binding affinity and larger binding surface area, as concluded by these docking studies. Thus, the substitution of alanine with cysteine between SARS-CoV-1 and SARS-CoV-2 due to the mutation plays a vital role in the overall change in the binding pattern, wherein ezrin prefers lesser interaction with the cysteine cluster residues in SARS-CoV-1, which could assist through a better protein-protein interaction with a larger surface area.
As per the PRODIGY report, the binding affinities of SARS-CoV-1 and SARS-CoV-2 were À14.2 kcal/mol and À13.7 kcal/mol respectively. Based on the docking report, the endodomain of SARS-CoV-1 exhibited a stronger affinity than that of SARS-CoV-2 (shown in Table 2). The substitution of Ala 13 in SARS-CoV-1 with Cys1247 in SARS-CoV-2 confers a considerable change in the binding pattern of viral endodomains with ezrin. From the analyses, it's observed that the Ala13 prefers to interact only with Asn23, which after substitution with Cys13 in SARS-CoV-2 binds with Lys64 apart from Asn23 (shown in Fig. 3b and c). An overall change in interaction patterns between the SARS-CoV-1 and SARS-CoV-2 endodomains against the ezrin protein due to the amino acid substitution is established through their amino acids' interaction via salt bridges and cysteine cluster interactions. Salt bridge analysis confirms that four salt bridges were observed between ezrin-SARS-CoV-1 and six salt bridges between ezrin-SARS-CoV-2. Additionally, cysteine clusters 1 and 2 of the endodomain of SARS-CoV-1 interact with the F1 lobe of the ezrin protein, whereas cysteine clusters 1, 2, and 3 of the endodomain of SARS-CoV-2 interact with the human ezrin protein. Compared to salt bridges, cysteine cluster interaction plays a crucial role in ezrin-endodomain interaction. Two cysteine clusters interacting between ezrin-SARS-CoV-1 and four cysteine clusters interacting between ezrin-SARS-CoV-2 have been observed, which may have positive effects on the interactions between the viral endodomain and the human ezrin protein (shown in Table 2). These additional cysteine cluster interactions and the ezrin protein may play an important role in reduced membrane fusions. In contrast, an increase in the number of cysteine cluster interactions between SARS-CoV-2 and ezrin has decreased the binding affinity between these two subunits, also considerably reducing their overall binding surface area. This lower level of affinity between the ezrin and the endodomain could negatively modulate the entry of the virus.

Protein-ligand docking
Current research provided further validation of the importance of ezrin regarding the infection processes in SARS-CoV-2, along with elucidation as to its interactions with drug molecules. It seemed prudent to attempt to discern the potential value of various drug repurposing candidates that may be applied as ezrintargeted therapies, to assist further research and clinical validation. Forty-six targeted compounds were docked against ezrin, of which seven showed promising binding affinities in conjunction with lower inhibition constants. The poses with a better binding affinity and inhibition constant (lM) value of <2 lM include ivermectin, minocycline, selamectin, calcifediol, calcitriol, ergocalciferol, and quercetin, with a binding affinity of À8.12, À8.16, À7.81, À7.83, À7.91, À7.81, and À7.85 kcal/mol respectively (shown in Fig. 4ag). Next, the chemical compounds with inhibition constant values (lM) of between >2 and <3 were selected. These include ber-   Quinoline is considered as a reference drug which has experimental approval as inhibitor for ezrin protein (Bulut et al., 2012). However, it showed a binding score of -5.76 kcal/mol with an inhibition constant of 59.7 (lM), much lower than that of the top seven drug molecules like ivermectin, minocycline, selamectin, calcifediol, calcitriol, ergocalciferol, and quercetin. Clinical studies and validation should follow up on these findings to further validate the potential efficacy of these ezrin-interacting drug candidates in treating COVID-19 infection.

Molecular dynamics simulation
Desmond molecular dynamics package was used to validate the structural stability of the docked complex of the ezrin-SARS-CoV-1 for the time scale of 100 ns. The overall RMSD of the SARS-CoV-1 endodomain-ezrin complex is maintained between 3.0 Å and 3.5 Å (Fig. 5a). The RMSD of the SARS-CoV-2 endodomain ezrin complex showed the RMSD score of 3.0-4.5 Å (Fig. 5b). The RMSF plot analysis of SARS-CoV-1 confirms the higher stability of the ezrin protein and higher flexibility of the endodomain (Fig. 5c). The endodomain of SARS-CoV-2 confirms a higher flexibility of the ezrin protein with the rigidity of the endodomain (Fig. 5d). The substitution of Ala with Cys in SARS-CoV-2 therefore conveys a dramatic effect on their stability. This is reflected in their protein-protein interaction wherein the ezrin-SARS-CoV-1 scored better with a larger interface area compared to that of the ezrin-SARS-CoV-2. During the protein-ligand simulation, the RMSD score of selamectin in complex with human ezrin protein maintained the stability of the protein starting from 5 ns to 70 ns at 1.8 Å. Calcitriol scored between 1.75 Å and 2.00 Å indicating their tightly bound state within the cavity till the end of the simulation. Ivermectin maintains 2.0 Å-2.5 Å. Similarly, minocycline maintains 2.4 Å with their strong binding starting from 25 ns till the end of the simulation. Even though, calcifediol fluctuates between 2.0 Å À2.8 Å whereas the drug maintains its stability within 2.0 Å from 25 ns till the end. Larger conformational change was observed with the binding of ergocalciferol. As a result, the drug could not maintain contact within the binding site. Thus, they showed a higher RMSD value compared to that of the ezrin protein. Quercetin exhibits stable conformation throughout the course of simulation at 1.75 Å (Supplementary Fig. S3a-g). Based on the RMSF plot of ezrin protein, the F1 lobe displayed rigidity after the binding of ivermectin which becomes highly flexible after the interaction of minocycline. Moderate flexibility of the F1 lobe is reported after the interaction of selamectin, calcitriol, calcifediol, quercetin, and ergocalciferol. F2 lobe shows higher flexibility after the interaction of selamectin, quercetin, and ivermectin. Regarding the F3 lobe, ergocalciferol binding enhances their flexibility which is followed by the binding of the calcifediol interaction. However, the binding of ivermectin, quercetin, and calcitriol restricts the mobility of the F3 lobe ( Supplementary Fig. S4a-g).
Ezrin interaction with the seven drug molecules can be monitored through the simulation. To begin with, calcifediol displays a positively charged interaction of Lys233 with ezrin. Their interaction with the protein during simulation is mentioned in percentage ( Fig. 6a-g). For Lys233 it is 94%. Two hydrophobic interactions of Ala82 and Tyr85 scored 49% and 41% respectively. Calcitriol shows a positive and negative charge-based interaction of Lys79 (41%) and Glu114 (26%). Ivermectin displayed a negative interaction wherein Glu114 (69%) followed by Arg279(56%) and Lys233 (40%). The protein-ligand contact report of minocycline with ezrin confirm two polar interactions of His48 (37%) and Asn204 (80%). Two hydrophobic interactions viz. Trp58 (73%) and Pro56 (34%) were observed. Also, a water bridge between Gly202 (35%) and ezrin was observed. Negatively charged interaction between quercetin and ezrin was observed between Glu108 (46%) and ezrin. Arg81(35%) showed a positive interaction with ezrin. Three hydrophobic interactions (Tyr85 (56%), Met200 (55%) and Ile203 (42%)) was observed. A single water bridge Gly202 (56%) was also observed. Selamectin showcased two positive charged interactions of Arg81 (51%) and Arg279 (51%). A negative interaction of Glu114 (71%) was observed. A polar interaction of Asn204 (45%) was also observed during the simulation. Ergocalciferol displayed a single hydrophobic interaction of Ile203 (45%) (Fig. 7a-g) (Supplementary Fig. S5a-g). The average total energy of the protein-ligand complex was calculated using the Simulation Quality Analysis tool which confirm quercetin to have a total average energy score of À134124.289 kcal/mol followed by minocycline to have the average total energy score of À134089.225 kcal/mol followed by calcifediol with an average total energy score of À134024.699 kcal/ mol. Calcitriol shows a score of À133997.913 kcal/mol followed by ergocalciferol with a score of À133984.27 kcal/mol. Both selamectin and ivermectin showed an average energy score of À133741.2 kcal/mol and À133613.193 kcal/mol respectively. Drugs like quercetin, minocycline, calcifediol, and calcitriol showed better total average scores as per their order.

Conclusions
A paucity of structural details of the SARS-CoV-1 and SARS-CoV-2 viral endodomains warranted the modelling and docking of these viral peptides with ezrin protein. Three disulphide bridges stabilize the endodomain. The SARS-CoV-1 endodomain demonstrates strong interaction with ezrin protein compared to SARS-CoV-2 due to the presence of alanine. This study establishes the role of a cysteine cluster of viral endodomain in interaction with ezrin protein, a property that likely assists SARS-CoV-2 with its observed unrestricted entry of fusion and S dependent infection. Finally, seven drugs were able to show strong binding with ezrin of which six of them bind within the pocket during simulation. This mechanism assists the stability of the ezrin protein and may block the entry of the viral endodomain. Future studies may further proceed to clinical trials aiming to identify potential drug molecules which can treat COVID-19 infection.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.