Computational design of medicinal compounds to inhibit RBD-hACE2 interaction in the Omicron variant: unveiling a vulnerable target site

The COVID-19 pandemic, caused by SARS-CoV-2, has globally affected both human health and economy. Several variants with a high potential for reinfection and the ability to evade immunity were detected shortly after the initial reported case of COVID-19. A total of 30 mutations in the spike protein (S) have been reported in the SARS-CoV-2 (BA.2) variant in India and South Africa, while half of these mutations are in the receptor-binding domain and have spread rapidly throughout the world. Drug repurposing offers potential advantages over the discovery of novel drugs, and one is that it can be delivered quickly without lengthy assessments and time-consuming clinical trials. In this study, computational drug design, such as pharmacophore-based virtual screening and MD simulation has been concentrated, in order to find a novel small molecular inhibitor that prevents hACE2 from binding to the receptor binding domain (RBD). three medicinal compound databases: North-East African, North African, and East African were screened and carried out a multi-step screening approach that identified three compounds, which are thymoquinol 2-O-beta-glucopyranoside (C1), lanneaflavonol (C2), and naringenin-4′-methoxy-7-O-Alpha-L-rhamnoside (C3), with excellent anti-viral properties against the RBD of the omicron variant. Furthermore, PAIN assay interference, computation bioactivity prediction, binding free energy, and dissociation constant were used to validate the top hits, which indicated good antiviral activity. The three compounds that were found may be useful against COVID-19, though more research is required. These findings could aid the development of novel therapeutic drugs against the emerging Omicron variant of SARS-CoV-2.


Introduction
SARS-CoV-2, which created the COVID-19 global pandemic, had a devastating effect on the world's economic system [1,2]. The COVID-19 pandemic has led to challenging circumstances all around the world. Epidemiology, testing, clinical care, prevention, management, and the rate of therapeutic drug discovery have all advanced at a faster pace [3][4][5][6][7]. However, a worldwide respiratory disease outbreak  brought on by SARS-CoV-2, a newly identified coronavirus variant, threatened human life by destroying 54.76 lakh lives until January 5, 2022 [8]. The current pandemic coronavirus 2019 (COVID-19) and the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) are both caused by the Bateacoronavirus genus, which belongs to the Coronaviridae family [9]. SARS-CoV2 was initially discovered in Bangladesh in March 2020 and spread quickly across the country (https://corona.gov. bd/?gclid) [10]. The virus spread incredibly quickly over a two-year period, resulting in a record number of cases and fatalities [11]. Many variants of the virus have been found since the first instance, increasing the risk of reinfection and immune evasion [12]. Numerous variants have been categorized to date into classes called variants of concern (VOCs) and variants of interest (VOIs) in order to address and treat public health issues according to their class [13]. The reported VOCs include the alpha, beta, gamma, delta, and omicron variants [14]. SARS-CoV-2 variants such as B.1.1.7 (501Y.V1/Alpha), B.1.351 (501Y. V2/Beta), B.1.1.28.1 (501Y.V3/Gamma), and B.1.617.2 (20A/S:478K/Delta) were among the VOCs which have been detected when COVID-19 cases quickly surged in the UK, South Africa, Brazil, India, and Bangladesh, respectively. Later, these VOC spread around the globe by infecting an increasing number of individuals.
The variant B.1.617.2 emerged for the first time in India in December 2020 [15]. Due to its rapid transmission, the World Health Organization (WHO) designated it as VOC Delta on May 11, 2021. Another variant P.1 (20J/501Y.V3) is a VOC and has three additional variants, namely K417T, E484K, and N501Y; these are similar, based on the RBD site, to the variant 20H/501Y.V2 of B.1.351 [16]. BA.1 and BA.2 have also been explored by the most recent evidence of the Omicron variant, which is being tracked by the Technical Advisory Group on SARS-CoV-2 Virus Evolution (TAG-VE) of the WHO [17]. Based on information related to transmission, severity, reinfection, diagnostics, treatments, and the effects of vaccines, the advisory group reiterated that the BA.2 subvariant will be regarded as a variant of concern and should continue to be classified as Omicron. (https://www.epicov. org/epi3/frontend#554769). A novel strategy for fighting the virus targets mutations in the receptor binding domain (RBD), which is also an immunodominant site and comprises spike glycoproteins of the N-terminal domain [14]. RBD of SARS-CoV-2 could be a critical therapeutic target because mutations occur rapidly here and it has some characteristics, such as viral affinity to likely bind with the host receptor and pathogenicity. Because of the omicron BA.2 variant's proclivity to connect with host variations, we used computational approaches to identify potential drug-like compounds. Mutations in the spike glycoprotein of RBD are common, making it a potential druggable target for new approaches and novel therapeutics to combat the pandemic [19][20][21]. Significant mutations occur in RBD of the SARS-CoV-2, which is also associated with the best viral affinity and pathogenicity, making it an ideal druggable target for omicron. Here in silico approaches were performed to molecular docking and modelling, along with free energy calculations using phytomedicines that are effective against SARS-CoV-2 from North Africa, East Africa, and North-East Africa. The findings provide important details regarding the screened drug-like compounds anti-viral effectiveness against the mutations in the RBD domain of the omicron variant. The research will assist in developing and discovering innovative medication treatments that could be used to combat the newly emerging omicron strain of SARS-CoV-2.

Structure retrieval and preparation
The SARS-CoV-2 genome was sequenced at the Genomic Research Laboratory, Bangladesh Council of Scientific and Industrial Research (BCSIR), Dhaka, Bangladesh. The extraction of viral RNA was performed using PureLink™ more specifically the PureLink™ Viral RNA/DNA Mini Kit (Cat.no: 12280050, Thermo Fisher Scientific, USA). Libraries were created using cDNA from the sample and the COVIDSeq assay, which reverse-transcribed the viral RNA that had been extracted into cDNA (Illumina Inc., San Diego, CA). The GoScriptTM Reverse Transcription System Protocol (www.promega.com) was used to generate cDNA from viral RNA, and the cDNA was used for library preparation with the COVIDSeq assay (Illumina Inc., San Diego, CA). According to the manufacturer's guidelines, the library was sequenced using the Illumina Miniseq instrument, according to the manufacturer's protocol (Read length 2 × 76 bp). The DNA sequence containing the code for the Delta and Omicron spike genes is converted into peptides by converting nucleotide sequences into amino acid sequences using the EMBL-EBI Transeq (EMBOSS) program (https://www.ebi.ac.uk/Tools/emboss/) [22]. The sequences of the variants were aligned to the Wuhan wild-type spike protein sequence downloaded from NCBI using the Clustal Omega software to validate the mutational profile of both mutants [23]. Furthermore, using the iterative threading assembly refinement server, 3D models were generated from the sequence information of the variants (i-TASSER) (https://zhanggroup.org/I-TASSER/) [24,25]. For the optimization purposes of the generated models, molecular mechanics force fields have been employed to minimize energy. The RBD-ACE2 of 6M0J was overlaid with the Omicron B.A2-Spike-RBD and WT. Additionally, the Molecular Operating Environment (MOE v2019. 01) [26] was used for complex generate using the protein preparation wizard. To prepare the protein complexes for further analysis, various modifications were made, including assigning correct bond orders, removing water molecules, adding terminal caps, adding missing atoms, minimizing energy, and assigning partial charges using the AMBER99 force field. Additionally, hydrogen atoms were added to achieve a pH 7 protonation state.

Structural based pharmacophore model generation
The crystallographic structure of the RBD-ACE2 protein was obtained from RCSB using PDB ID: 6MOJ. Meanwhile, the currently available antagonist Ribavirin was identified from a literature study. Ribavirin was first rigidly docked into the active site of RBD-ACE2, followed by pharmacophore generation with the Extended Hueckel Theory (EHT) approach. The total of 20 poses were generated, and the best poses were used for structure-based drug design. The selected target was then analyzed for the total number of hydrogen bond acceptors, donors, hydrophilic, and hydrophobic characters. The pharmacophore was generated with seven features, of which three were marked as essential. The generated model was validated by a test database containing SARS-CoV-2 along with Ribavirin. All compounds in the test database were screened on the seven-featured ligand-based pharmacophore, and their mapping modes were analyzed. For evaluation and assessment of drug-like features, the validated pharmacophore geometry was screened with chemical libraries like the ChemBridge database using the screening protocol implemented in MOE 2020.

Pharmacophore based virtual screening
To find the lead compound against Omicron BA.2-Spike-RBD, structure-based pharmacophore models were created based on the protein complex with the ligand. The selected proteins were analyzed using MOE software for the number of hydrogen-bond donors or acceptors, hydrophilicity, lipophilic characteristics, and ionizable charges. Six features were identified, consisting of two aromatic features, one acceptor feature, one donor and acceptor feature, and two hydrophobic features. For better results, we followed the Lipinski rule of five and added one positive feature from the protein for better model generation. In order to prepare for screening, North African, East African, and North-East African compounds were obtained in a 3D-SDF format from the African Natural Products Databases (ANPDB) (http://african-compo unds.org/anpdb/). Before the virtual screening of these databases, the FAF-Drugs4 webserver was used to obtain only non-toxic, drug-like molecules obeying Lipinski's rule of five. Then these databases were screened against the interface site of the RBD to obtain the final hits that could compete with the receptor [27]. Then, the MOE programme was used to choose the final hits based on the docking score, binding interactions, and binding energies. The final best hits were then examined for bioactivity, dissociation constant, and molecular dynamics simulation.

Molecular docking
The MMFF94x force field was used to protonate and energyminimize the Omicron BA.2 RBD structures while enabling all bound and non-bonded interactions at a gradient value of 0.05. Molecular docking was performed on the Omicron BA.2 RBD and the hit compounds to determine their binding mechanisms and variations in affinity. The conformations with the lowest docking scores in each docking were chosen. The free energy of a ligand from a specific pose is estimated via GBVI/WSA. The lower score indicates better favorable positions for all scoring functions [13]. To visualize the protein-hit binding interactions, the Pymol software was utilized.

Binding cavity measurement and grid preparation
The prepared structure of the selected protein was analyzed using UniProtKB and PrankWeb (accessed on November 1, 2022) to identify the active site of the protein. Furthermore, CASTp (http://sts.bioe.uic. edu/castp/; accessed on November 1, 2022) is a database of molecular surface properties and detailed 3D structural information of proteins. It provides information on molecular surfaces, cavities, and pockets, as well as other properties like hydrophobicity and electrostatic potential. The database helps researchers understand the relationships between protein structure, function, and interactions with other molecules. The online server, CASTp, was used to determine the changes in the volume of the binding cavity of native as well as mutant SMT. The alpha-shape method is used by this server to measure the volume and area of cavities. This server clearly distinguishes the inner and outer surface pockets. The default value of 1.4 Å for the probe radius was used.

Molecular dynamics simulation
The Amber20 package was used to examine the complexes for the evaluation of dynamic features. In order to achieve that, the force field (FF19SB) was used to parameterize complex [28,29]. Topology files were generated along with an antechamber module, and Cl+ and Na + ions were included to maintain system neutrality. The remaining energy was reduced in two steps (the steepest descent and conjugate gradient). After that, both steps (heating and equilibration) were completed. Finally, 100ns of production was run. However, covalent bonds and long-range electrostatic interactions were treated with the help of the SHAKE algorithm and the particle mesh Ewald algorithm, respectively. Finally, for trajectory analysis, the CPPTRAJ package was used, while simulation was run with the help of PMEMD. Cuda [30].

Binding free energy (BFE)
The MM/GBSA method was utilized to quantitatively estimate the binding energies between the protein and ligand. This approach is commonly used in combination with conventional MD simulations and free energy calculations to study small molecule binding. It has been widely employed in various research studies to evaluate binding energies. For this purpose, the following equation was used to estimate the BFE [31].

Pan assay interference
During the early stages of drug design, it is crucial to screen for compounds that possess the desired pharmacokinetic properties, such as ADMET. To ensure the selection of high-quality compounds, an electronic filter known as Pan Assay Interference Compounds (PAINS) is employed, which focuses on identifying compounds of superior quality in databases [33]. The PAINS filter is designed to scrutinize compounds that have a higher probability of interfering with assays, displaying chemical reactivity, being frequently hit, and not being recognized by toxicophoric filters. To predict the ADMET characteristics of the compounds, we utilized an online PAINS server (https://biosig.lab.uq.edu. au/pkcsm/prediction). Those substances that passed the PAINS filter and exhibited superior ADMET characteristics.

Bioactivity prediction and determination of dissociation constant (K D )
In order to predict the bioactivity of the shortlisted compounds, we employed cheminformatics tool, Molinspiration (https://www.molinspi ration.com/cgi-bin/properties), to estimate IC 50 values. Molinspiration have been utilized in many studies as a tool to forecast bioactivity values [34,35]. Furthermore, PRODIGY (PROtein binDIng enerGY) webserver, (https://wenmr.science.uu.nl/prodigy/), was also used to calculate the dissociation constant values for various biological complexes in order to offer convincing information of the strength of binding as K D [36,37].

Results and discussion
The emergence of COVID-19 and the subsequent appearance Omicron variant, has created a severe global situation. These mutations are occurring continuously, resulting in multiple variants, including Alpha, Beta, Gamma, Delta, and Omicron variant, and have posed significant challenges to public health worldwide. According to reports, the omicron variant is more contagious and raises the risk of reinfection. Despite efforts to control the spread of these new variants, the fact that they are able to evade most neutralizing antibodies and reduce the effectiveness of various vaccines poses a significant and alarming threat [38]. As a result, more in-depth research is required to tackle the problem of vaccine therapeutic evasion. Therefore, the development of new and potent therapeutics is very important for effective treatment.
In this study, we utilized computational tools such as pharmacophore-based screening and simulation techniques. We screened multiple databases, including those for North Africa, East Africa, and the North East, to identify potential drug candidate. The workflow of the design study is shown in Fig. 1. The spike glycoprotein is a critical multi-domain protein that plays a crucial role in COVID-19 infection. It is responsible for the initial binding and penetration of the virus into host cells. It is a multidomain protein with the most important RBD, which comes into contact with the host receptor directly and joins the cellular membrane for entry. It is considered a crucial target for the development of drugs and vaccines to prevent the infection. The active site residues were determined before docking. The SARS-CoV-2 Omicron BA.2-spike-RBD-Domain and SARS-CoV-2 Wuhan-spike-RBD-Domain structures were stacked to demonstrate the RMSD difference. The mutations in the RBD are shown in Fig. 2. This suggests that the new variant acquired structural alterations through mutations.

Pharmacophore modelling and validation
Pharmacophore modelling identifies and extracts ligand-receptor interactions. Standard electrostatic and electrical properties are needed to induce a biological reaction. By schematically representing the essential components of molecular recognition, which can be utilized to represent and distinguish molecules. Pharmacophore-based virtual screening is a commonly used method, and several approaches can be employed depending on the knowledge base available. A pharmacophore model was generated based on the recently published most potent inhibitor of SARS-CoV-2, i.e., ribavirin. An essential characteristic of a pharmacophore model for the most active compound was generated with the help of the MOE pharmacophore query command at the top of the windows. The resulting model consisted of six important characters, including two aromatic (F3 and F6), one acc (F2), one don and acc (F5), and one hydA (F4). Different colours represent the interaction between the ligand and receptor shown in Fig. 3.
The validated pharmacophore was used for virtual screening to determine the final hits of different compounds and their chemical nature. Prior to database screening, filtering of drug-like molecules using Lipinski's rule of five was performed. By focusing on the three databases, i.e., East Africa, North Africa, and North-Eastern Africa, virtual screening through MOE, which is based on pharmacophore, was carried out. This method is very important for finding the most potent drug in commercially available databases. Through virtual screening, among the total of 1871 compounds in East Africa, 145 were selected on the basis of LRO5, and the rest were excluded. In the North Africa databases, of a total of 4924 compounds, 230 were selected and 4694 were excluded, followed by R5. In the Northeast African databases, among 6512 compounds, 720 compounds were selected on the basis of R5. Using Lipinski's RO5, which states that (log p less than 5, log S less than 5, molecular weight less than 500 Da, H-bond acceptor less than 10, and donor less than 5), the assets of each hit ligand have been assessed.
The first and most important step in the pharmacophore-based virtual screening is molecular docking, and the compounds were selected on the basis of their best dock scores, which ranged from − 4.0 to − 2.5 kcal mol-1. Among the 50 compounds, only three were reported to have a docking score higher than − 3.5 kcal mol-1 while others showed less than − 3.5 kcal mol-1. Only 15% of the top compound was selected for the third round of screening, which showed a docking score ranging from − 6.0 kcal mol-1 to − 9.0 kcal mol-1. Among the African databases, thymoquinol-2-O-beta-glucopyranoside, lanneaflavonol, and naringenin-4 ′ -methoxy-7-O-Alpha-L-rhamnoside were selected for further analysis. The top three compounds, along with a reference compound, are given in Table 1.

Detailed interaction of final hits
Detailed interaction analysis of the top three compounds revealed information about hydrogen bonding, pie-pie interactions, salt bridges, and pie stacking. In the case of compound 1(C1), was found to establish a total of four hydrogen bonds with Leu222, Phe128, Arg186, Glu93, Trp101 and Asn118. It can be seen that this compound blocked the key mutated residues such as Arg186, Glu93, Trp101, and Asn118, which consequently halted the binding of RBD to hACE2. The docking score for thymoquinol 2-O-beta-glucopyranoside was reported to be − 7.1 kcal  mol-1. The 3D and 2D interaction pattern of C1 is given in Fig. 4. Compound 2 (C2), on the other hand, with the docking score of lanneaflavonol reported to be − 7.7 kcal mol-1 established three hydrogen bonds: Leu222, Phe188, Val123, and Trp101, also including pie-pie interactions. Herein, the same residues are targeted, which are novel mutations in the RBD of the omicron variant. The 2D and 3D interaction pattern of C2 is given in Fig. 5. Compound 3(C3), on the other hand with the docking score of naringenin-4 ′ -methoxy-7-O-Alpha-L-rhamnoside was reported to be − 8.5 kcal mol-1 established three hydrogen bond Asp442, Arg346 and Trp441 also including pie-pie interactions. The interaction pattern of C3 is similar to those of C1, C2, and C4, which also block these residues to disrupt the binding between RBD and hACE2.The 2D and 3D interaction pattern of C3 is given in Fig. 6.

Dynamic stability using MD trajectories
Identification of the effectiveness of a small molecular ligand in the binding site is essential because it determines the intensity or strength of the binding contact that also assists us; it is crucial to investigate dynamics and binding stability. AMBER was used for the molecular dynamics study of the Omicron BA.2 mutant strain and the wild-type SARS-CoV-2 strain. The solvated system was neutralised with the help of counterions and was used for comparison of the structural dynamics of mutant and wild-type strains. The FF19SB force field in the AMBER20 simulation package was used to achieve comprehensive molecular simulation for all atoms and energy refinement [39]. By using the force field FF19SB, it gave us very stable RBD and RMSD with no obvious change, as shown in Fig. 7. Although some minor fluctuation was initially observed in the stability of the complex on 30-60 ns, that was decreased once it reached 100 ns, where the stability of the complex was achieved. Furthermore, the calculation of RMSF was done for RBD domain residue to know the intrinsic and mutation-induced flexibility, where higher RMSF denotes flexibility while lower RMSF denotes a stable region. For all four systems of stimulation, almost the same pattern of fluctuation was observed, as shown in Fig. 8, where the same pattern of residual flexibility is illustrated. The 340-370 region of the C2 complex and the 400-430 region of the C3 complex were showing higher RMSF, while the 472-482 region of all complexes had also shown higher RMSF. The decrease in residual flexibility index of these loops minimizes the affinity of binding to hACE2, which can be seen in the graphs.
Thus, binding of these ligands results in various internal residual dynamics patterns due to which direct movement of loops helps in interacting with hACE2 and results in minimal coupling with the receptor.

Binding free energy calculation
The MM-PBSA/GBSA is used for the calculation of binding free energies. The MMPBSA/GBSA approach is based on the combination of continuous solvent fashions and molecular mechanical energies [40]. For these calculations, MMPBSA.py, a Python script, was used to calculate the free energy of the reference compound and selected compounds. In this study, the BFE (binding free energy) of the five systems

The Pan Assay Interference Compounds (PAINS) filter assay
PAINS electronically filters quality compounds in the database [33]. This assay is focused on substances that are likely to interfere with assays which are more chemically reactive, frequent hits, and unknown to toxicophoric filters [41]. During drug design, it is always recommended to examine multiple filtrations of compounds for desirable pharmacokinetic features, for instance, ADMET characteristics. Herein, on the online server PAINS (https://www.cbligand.org/PAINS/search_struct. php) highly reactive compounds are not encoded for in the PAINS filters because they were excluded from the high-throughput screening (HTS) library used to develop the filters and so were never present to provide indicting data (Table 3).

Bioactivity prediction analysis & dissociation constant
The estimation of IC 50 values for various druggable protein complexes is often carried out through in silico bioactivity prediction. By utilizing the Molinspiration server [42], we calculated the bioactivity scores as: C1 (0.21), C2 (− 0.32), and C3 (0.26), respectively, demonstrating the strongest bioactivity against proteins. In order to re-rank both protein and ligand complexes by utilizing K D and their inhibitory properties, the shortlisted final compounds validation was predicted as C1 (− 6.24), C2 (− 4.48), and C3 (− 6.15), respectively. The values demonstrated that the mentioned drugs have the ability to inhibit the target protein.

Conclusion
The SARS-CoV-2 outbreak has been associated with several mutations that have been classified as "variants of concern" because of their  increased infectivity and transmission rates. The identification of new drugs is crucial in the fight against the SARS-CoV-2 pandemic, particularly in light of the emergence of variants that are capable of evading the characteristics of existing antibodies. Computational methods can accelerate the discovery of drugs that could potentially hinder the SARS-CoV-2 variants. Therefore, promising drug candidates against the SARS-CoV-2 omicron (BA.2) variant are identified utilizing virtual screening and simulation techniques. Here, we aimed to perform a pharmacophore-based virtual screening and identify a novel drug to cope with the newly emerging variants. Hence, we used VS and MD simulation approaches to identify potential drugs that could inhibit SARS-CoV-2 Omicron BA.2. We then applied post-simulation analysis like free energy calculation, PAIN assay interference, dissociation constant, and bioactivity prediction to check the potency of the final hits. In summary, our study aimed to provide the basis for drug design to combat SARS-CoV-2 Omicron BA.2 variants.

Author's contributions
SA and MS performed conceptualized the study, overall guidance, and manuscript writing. MHS, BG, MS, SFC and SRN performed sample collection, preparation followed by disease detection. MHS, TAB, and BG carried out the NGS performing. AH, AAS, GZ, SA and SK supervised the project with funding. SA read, approved and submit the final manuscript.

Funding
The research is funded by the Government of the People's Republic of Bangladesh.

Availability of data and materials
The sequences of SARS-CoV-2 genome from Bangladesh were submitted in the GISAID database accession no. EPI_ISL_10708763. The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request also.

Declarations
The submitting research article "Computational Design of Medicinal Compounds to Inhibit RBD-hACE2 Interaction in Omicron Variant: Unveiling a Vulnerable Target Site" for publication in your journal of repute, is a unique article and nobody did it earlier. The authors also declared that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethics approval
Ethical approval is not required as samples from suspected COVID-19 patients were collected and tested from other places but not in BCSIR.

Consent for publication
Consents were taken from all the participates for this publication.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.