A multi-targeted approach to identify potential flavonoids against three targets in the SARS-CoV-2 life cycle

The advent and persistence of the Severe Acute Respiratory Syndrome Coronavirus – 2 (SARS-CoV-2)-induced Coronavirus Disease (COVID-19) pandemic since December 2019 has created the largest public health emergency in over a century. Despite the administration of multiple vaccines across the globe, there continues to be a lack of approved efficacious non-prophylactic interventions for the disease. Flavonoids are a class of phytochemicals with historically established antiviral, anti-inflammatory and antioxidative properties that are effective against cancers, type 2 diabetes mellitus, and even other human coronaviruses. To identify the most promising bioactive flavonoids against the SARS-CoV-2, this article screened a virtual library of 46 bioactive flavonoids against three promising targets in the SARS-CoV-2 life cycle: human TMPRSS2 protein, 3CLpro, and PLpro. By examining the effects of glycosylation and other structural-activity relationships, the presence of sugar moiety in flavonoids significantly reduces its binding energy. It increases the solubility of flavonoids leading to reduced toxicity and higher bioavailability. Through protein-ligand contact profiling, it was concluded that naringin formed more hydrogen bonds with TMPRSS2 and 3CLpro. In contrast, hesperidin formed a more significant number of hydrogen bonds with PLpro. These observations were complimented by the 100 ns molecular dynamics simulation and binding free energy analysis, which showed a considerable stability of docked bioflavonoids in the active site of SARS-CoV-2 target proteins. Finally, the binding affinity and stability of the selected docked complexes were compared with the reference ligands (camostat for TMPRSS2, GC376 for 3CLpro, and GRL0617 for PLpro) that strongly inhibit their respective SARS-COV-2 targets. Overall analysis revealed that the selected flavonoids could be potential therapeutic agents against SARS-CoV-2. Naringin showed better affinity and stability for TMPRSS2 and 3CLpro, whereas hesperidin showed a better binding relationship and stability for PLpro.


Introduction
In late December 2019, multiple cases of atypical pneumonia in Wuhan city, China, were reported to the World Health Organization (WHO); in less than three months, a global pandemic was declared [1]. The novel coronavirus (2019-nCoV, SARS-CoV-2) induced coronavirus disease (COVID-19) became the leading public health emergency worldwide, resulting in around 300 million infections and 5.5 million deaths as of January 2022 [2]. Clinical symptoms of this severe acute respiratory syndrome (SARS) manifest primarily in the upper and lower respiratory tracts. They include dry cough, fever, shortness of breath, sore throat, fatigue, nasal congestion, myalgia, chills, dizziness, and in some, loss of smell and taste [3]. The severity of these manifestations varies between individualsin particular, the worsening prognosis was associated with an increase in age and/or the presence of underlying chronic or immunodeficiency disorders [4]. Despite the accelerated development and mass roll-out of multiple COVID-19 vaccines leading to the administration of over 9 billion doses worldwide, in addition to their widely reported possible resultant adverse effects, there continues to be a lack of approved and effective therapeutic drugs as treatment or cure [5][6][7]. Of the existing interventions, the most used are prophylaxis using antiviral or antiretroviral therapy, supporting treatment using corticosteroids and immunosuppressants, breathing support such as mechanical ventilation, adjunctive therapy, and convalescent plasma therapy [5,8]. The ever-increasing number of infections despite mass vaccination, the circulation of new and more infectious strains, and the long-term effects of COVID-19 infection, even on individuals with no prior comorbidities, urgently demand the expedited screening, investigation, and development of target-specific anti-SARS-CoV-2 drug candidates to treat active cases [9].
Characterized by crown-like spike proteins on the virion surface, coronaviruses (CoVs) refer to a group of enveloped viruses with positivesense single-stranded RNA ((+)-ssRNA) belonging to the taxonomical subfamily Orthocoronavirinae/Coronavirinae, under the family Coronaviridae, and the order Nidovirales [10]. They are classified into the genera of alpha-, beta-, gamma-and delta coronaviruses based on their respective hosts: while alpha-and beta coronaviruses infect mammals (e.g., porcine epidemic diarrhea virus and SARS-CoV), the latter two cause diseases in birds (e.g., avian coronavirus and bulbul coronavirus HKU11) [11]. Six species of coronaviruses were identified as pathogens in humans, of whom three beta-CoVs have been responsible for significant outbreaks in the 21st century: the severe acute respiratory syndrome (SARS) epidemic in 2003, the Middle East respiratory syndrome (MERS) epidemic in 2012, and recently, the SARS-CoV-2-induced COVID-19. Multiple SARS-CoV-2 variants of concern are in circulation as the virus spreads, generating new mutations. Many such variants question the real-world effectiveness of existing vaccines and emphasize the importance of drug discovery efforts for this viral respiratory disease.
SARS-CoV-2 is characterized by multiple structural proteins including the Nucleocapsid (N), Envelope (E), Membrane matrix (M), and Spike (S) proteins (Fig. 1a), in addition to various other significant non-structural proteins (NSPs), including its RNA-dependent RNA polymerase (RdRp, nsp12), helicase (nsp13), exonuclease (nsp14), as well as two major cysteine proteases: the main protease (3CLpro or Mpro, nsp3) and the Papain-like protease (PLpro, nsp5) [12]. Following binding of the S1 subunit of spike protein to the Angiotensin-converting enzyme-2 (ACE2) receptor in human cells, the Transmembrane Protease Serine 2 (TMPRSS2) protease in human cell membranes cleaves the S protein. It activates the S2 domain, ultimately leading to cell fusion and viral entry. Following spike fusion through endocytosis and unloading of the viral genome into the cytoplasm, the released viral (+) ssRNA uses host ribosomal machinery to translate open reading frames (ORFs) 1a and 1b into polypeptides pp1a and pp1ab [13]. These proteins are post-translationally cleaved and modified by the proteases 3CLpro and PLpro into 16 non-structural proteins (NSPs) that give rise to proteins of the viral replication and transcription complex (RTC), which leads to viral genome replication and structural protein synthesis, followed by exocytosis of the newly formed virion progeny (Fig. 2) [14][15][16][17].

SARS-CoV-2 3CLpro
The SARS-CoV-2 3CLpro is a 306-amino acid, 33.8 kDa, a hydrophilic proteolytic enzyme belonging to the PA class of proteases. Its sequence identity is conserved with those of SARS-CoV (96.08%), MERS-CoV (87.00%), Human-CoV (90.00%), and Bovine-CoV (90.00%) homologs [18]. The enzyme cleaves the functional polypeptides at 11 specific sites (Fig. 1b). It is a homodimer consisting of two chains (protomers A & B) connected by an N-finger. Each chain consists of three domains (I, II & III) and an active catalytic dyad (Cys-145 and His-41) that cleaves at ↓: X-(L/F/M)-Q↓(G/A/S (Fig. 3a) [19]. Due to its vital role in the viral life cycle and the absence of alternative agents for proteolytic processing in human cells, this enzyme is a desirable target for drug development and subsequent trials.

SARS-CoV-2 PLpro
PLpro is a slightly basic, ~37 kDa, 315 residue enzyme of the more prominent 1945-amino acid, 212 kDa, Nsp3. It lies between the SARS unique domain (SUD/HVR) and a nucleic acid-binding domain (NAB). The enzyme cleaves the functional polypeptides at 3 different sites, but along with 3CLpro, it is considered a significant player in the viral life cycle (Fig. 1b). It acts at recognition sequence consensus LXGG ↓ XX. Like SARS-CoV-2 3CLpro, the PLpro sequence is conserved among other human beta coronaviruses, being 83% identical and 90% similar to SARS-CoV, and 31% identical and 49% similar to MERS-CoV [19]. It is a monomeric enzyme consisting of two domains, the Ubiquitin-like domain (Ubl) and the catalytic domain, which subdomains, thumb, palm, and fingers ( Fig. 3b) define its characteristics. The active site, consisting of the catalytic triad Cys111, His272, and Asp286, is located between the thumb and palm subdomains [17]. In addition to its role in polypeptide processing, the enzyme also cleaves the antiviral response-regulating ubiquitin and ubiquitin-like interferon-stimulated gene 15 protein (ISG15) from host proteins, subsequently leading to the reversal of post-translational modifications [20].

Human TMPRSS2 protein
Human TMPRSS2 proteins are 32-70 kDa, up to 530 amino acid transmembrane serine proteases coded for by the TMPRSS2 gene(s). They activate various endogenous substrates such as pro-hepatocyte growth factor, PAR-2, matriptase/ST14, and currently of interest, ACE2 [19,20]. TMPRSS2 cleaves the viral S protein at the S1/S2 site (RAR685↓) following interaction with the ACE2 receptor, ultimately leading to cell fusion and viral entry. The human DESC1 protease belongs to the type II transmembrane serine proteinase family. It consists of a 20 aa cytoplasmic region, a 14 aa transmembrane domain, and a 120 aa extracellular SEA domain, followed by a C-terminal trypsin-like serine proteinase domain, with the characteristic catalytic triad of Ser195, His57, and Asp102 (Fig. 3c) [21][22][23].
There is scope for blocking multiple stages of the viral life cycle via drug-induced inhibition. However, we focus on 3CLpro, PLpro, and TMPRSS2 given their undisputed roles in crucial stages of the viral machinery: viral protein processing, construction of RTC, disruption of host antiviral responses, and viral entry. De novo development of antiviral drugs is a long, expensive, and strenuous task, requiring interdisciplinary teams of researchers and precise safety considerations [24,25]. Phytochemicals are hailed for their effectiveness in combatting major public health concerns, including cancer and diabetes. Within the last two decades, coronaviruses such as those causing SARS and MERS [1,[26][27][28].
Given that the current treatment against COVID-19 is limited to prophylactic interventions such as monoclonal antibodies, peptides, ventilation support, and interferon therapies, we favor the exploration of natural, plant-based compounds, such as flavonoids, for their anti-SARS-CoV-2 potential. The role of the flavanone hesperidin against COVID-19 was investigated since the early phases of the pandemic [29]. Cheng et al. recently reported that the flavanone hesperidin and its aglycone, hesperetin, bind to hTMPRSS2 and hACE2. This leads to downstream cascade inhibition and reduces viral entry and infection [30]. A minor attraction was reported for the proteases PLpro and 3CLpro. They were using VeroE6 cells to investigate the high cytotoxic IC50 values of 1491 μM (hesperetin) and 1435 μM (hesperidin). Hesperetin also reduces the binding activity between hACE2 and the Spike protein, while both hesperetin and hesperidin, slightly decreased the enzyme activity of TMPRSS2. Interestingly, Cheng et al. report that both these flavanones possess the ability to downregulate the protein expression of hACE2 and hTMPRSS2 in normal (Beas 2B cell) and malignant lung (H460) cells post-transcriptionally through their interaction with heat-shock proteins (HSP70/90). Clinical trials such as NC T04452799 were launched to investigate the flavonoid's role in prophylactic and treatment mechanisms against the disease; this, along with other trials such as those reviewed by Kaul et al. [28] represent what clinicians must closely follow to discover possible therapeutics for the increasing number of patients.
In this study, a database containing 46 bioactive flavonoids was screened virtually via structure-based virtual screening to identify common potential therapeutic drugs which are potentially effective against SARS-CoV-2 3CLpro, SARS-CoV-2 PLpro, and human TMPRSS2 protein(s). Selected compounds were also assessed for their druglikeness and toxicity by SwissADME and ProTox-II online tools, respectively. In addition, the selected docked protein-ligand complexes were analyzed for their intermolecular interactions followed and their dynamic stability via molecular dynamics simulation. The pharmacokinetics and toxicity of the compounds chosen as well as the binding affinity and stability of the selected docked complexes were compared with the reference ligands reported inhibiting the three SARS-CoV-2 targets [31][32][33][34].

Structure-based virtual screening
The three-dimensional crystal structure of TMPRSS2 (PDB ID: 2OQ5) [23], 3CLpro (PDB ID: 6LU7) [35], and PLpro (PDB ID: 6W9C) [19] solved at 1.61 Å, 2.15 Å, and 2.70 Å resolution respectively by x-ray crystallography were selected as the target proteins. Flavonoids were screened against the selected target proteins via SBVS at the MTI-OpenScreen web server [36]. In this study, a total of 46 bioactive flavonoids (Table S1), previously reported as active against different infectious diseases. Their three-dimensional structures were retrieved from the PubChem online database (https://pubchem.ncbi.nlm.nih. gov/) [37]. All three target proteins were prepared for SBVS by removing the water molecules and heteroatoms, assigning the Gasteiger charges, adding the polar hydrogen atoms, and removing non-polar hydrogen atoms using the Dock Prep tool in UCSF Chimera-1.14 with default parameters [38]. Following protein preparation, the MTi-OpenScreen webserver was used to screen the collected bioflavonoids against the active site of the target proteins TMPRSS2, 3CLpro, and PLpro. As a result, two common bioflavonoids, hesperidin and naringin, which exhibited the highest negative docking scores respective to all three proteins were selected for ADMET (absorption, distribution, metabolism, excretion, and toxicity) analysis using the SwissADME (http://www.swissadme.ch/index.php) online tool (Fig. S1) [39].

Pharmacokinetic studies
The SwissADME online tool was used to evaluate predictions of the pharmacokinetic data of the newly suggested compounds, hesperidin, and naringin [39]. The canonical SMILES of these flavones plus three controls were extracted from PubMed (https://pubmed.ncbi.nlm.nih. gov/) and inserted into the software, which in turn returned the structures of the compounds and several data values used to evaluate their physicochemical properties, lipophilicity, water-solubility, pharmacokinetics, drug-likeliness, and medicinal chemistry (Table S2). "BOIL-ED-Egg" analysis representing the GI and BBB (blood-brain barrier) permeability of the control compounds is also reported (Fig. S2). At the same time, radar charts estimate their respective oral bioavailability's based on physicochemical factors (Fig. S3).

Toxicity studies
All compounds were further analyzed for toxicity using another online software tool known as ProTox-II (https://tox-new.charite.de/proto x_II/index.php?site=compound_input) [40]. The software returned several values, including the predicted median lethal dose (LD50), predicted toxicity class, average similarity, prediction accuracy, and toxicity against various targets represented in Tables 1, S3, and S4. The flavones naringin and hesperidin analyses were then compared to approved (control) molecules.

Redocking simulation and intermolecular interaction analysis
Molecular docking simulations of TMPRSS2, 3CLpro, and PLpro were performed with the selected bioflavonoids hesperidin and naringin using the Chimera-AutoDock Vina plugin setup to understand and decipher the active site residues that are involved in the interaction. The compounds camostat, GC376, and GRL0617 are known to inhibit TMPRSS2 [31,33], 3CLpro [32], and PLpro [34] respectively were used as the reference ligands in this study. Using default parameters in Chimera, the target proteins and the selected compounds were prepared for docking by removing the native ligands, thereby adding the charges and polar hydrogen atoms to the crystal structure. Docking was followed by the docking simulation in AutoDock Vina as a plugin under the default setting by repositioning and adjusting the grid size of (− 5.82 × 22.094 x 23.24) at the binding site of TMPRSS2 protein, the grid size of (− 13.09 × 13.99 x 69.30) at the binding site of 3CLpro protein, and grid size of (− 27.34 × 23.41 x 37.45) at the binding site of PLpro protein, along the X, Y, and Z-axis to cover all essential residues. The grid adjustment is required to provide sufficient space for the different conformations generated by ligands during the docking process. As a result of docking and docking simulation, at least 10 docked poses were collected based on the highest negative docking score, and least root mean square deviation (RMSD) (by default zero in AutoDock Vina). These were extracted to analyze the binding poses using the 2D interaction diagram in the free academic Maestro v12.9 package (Schrödinger Release 2021-3: Maestro, Schrödinger, LLC, New York, NY, 2021). Eventually, all the three-dimensional and two-dimensional interaction images were generated in the free academic Maestro v12.9 package, and similar steps were performed for the docking of reference inhibitors with the target proteins for the comparative analysis of binding pose against the selected bioflavonoid compounds.

Molecular dynamics simulation
The dynamic stability and intermolecular interactions of the docked complexes acquired from the docking experiments were analyzed by the molecular dynamics (MD) simulation, which was performed under the Linux system running on a HP Z2 Microwater workstation using the Desmond v5.6 [41] module of Schrödinger-Maestro v11.8 (Schrödinger Release 2018-4: Maestro, Schrödinger, LLC, New York, NY, 2018). Using the system builder module in Desmond, the MD system for all the docked complexes was constructed as an orthorhombic grid box (10 Å × 10 Å × 10 Å buffer) and then minimized by adding solvent as TIP4P (transferable intermolecular potential 4 points). The neutralization of the complete system followed the system minimization by adding the counterions (Na + or Cl − ) and further re-minimization using the minimization tool with default settings. Finally, the complete system was forced to simulate at 300K temperature and 1.01325 bar pressure. Along with the reference complex, all the prepared and minimized MD systems were analyzed with the 100 ns MD simulation with OPLS-2005 (force field) and 2000 frames collected at 50 ps interval during the stimulation interval. The trajectories generated via MD simulation were further analyzed for the RMSD, root mean square function (RMSF), and protein-ligand profiling using the simulation interaction diagram tool Desmond v5.6 module of Schrödinger-Maestro v11.8 [41].

Molecular mechanics/generalized born surface area (MM/GBSA) calculation
To calculate the binding free energy, MM/GBSA was performed on the last ten poses extracted at regular interval of 10 ns from the respective MD simulation trajectory under default parameters of Prime MMGBSA module in Schrödinger suite (Schrödinger Release 2018-3: Prime, Schrödinger, LLC, New York, NY, 2018). The solvent molecules and the ions were deleted for the refinement of extracted poses required for the MMGBSA calculation [42]. The net binding free energy (ΔG) was calculated by the following equation: In the above equation, binding free energy is denoted by ΔG bind , the free energy of complex is denoted by ΔG complex And the energy for the receptor and ligand is denoted by ΔG receptor and ΔG ligand , respectively.

Structure-based virtual screening
All the 46 compounds selected for structure-based virtual screening (SBVS) against the ligand-binding site of the three target proteins TMPRSS2, 3CLpro, and PLpro, belong to the flavonoids class. The binding energies of these 46 compounds were recorded in the considerable range of − 5.0 to − 10 kcal/mol against the active site residues of all three proteins but the common compounds, hesperidin and naringin, showed the highest docking scores (lower binding energies) against all three target proteins (Table S1). Based on this, they were selected for redocking and intermolecular interaction analysis (IMI analysis) with TMPRSS2, 3CLpro, and PLpro, compared with the reference ligands camostat, GC376, and GRL0617.

Role of glycosylation and structure-activity relationships
We observed a pattern between the glycosylation status of the most studied flavonoids retrieved from structure-based virtual screening and their corresponding binding energies to each target protein. Specifically, the top ten flavonoids with the lowest binding energies for TMPRSS2 were all glycosylated; similarly, eight out of the top ten flavonoids for 3CLpro were glycosylated, as were seven of the top ten flavonoids for PLpro. Therefore, we propose that glycosylation plays an important inhibitory role in the binding mechanism of flavonoids against these targets (Table S1). Such flavonoids could mechanistically block attachment and disrupt the entry of virions into host cells while interfering with various stages of viral replication, translation, and polyprotein processing in order to prevent the release of the viruses [43].

Properties of glycosylated flavonoids
Glycosylated flavonoids consist of a base flavan structure and subclass specific -R groups in addition to a sugar moiety such as arabinose, rhamnose, rutinose, galactose, or glucose; which impacts their respective biological activities [43]. Moreover, flavonoids without sugar moieties are seen to be largely insoluble in water. Thus clinical applications of such compounds have always been in doubt due to their inherent tendency to form polymers. However, glycosylation of flavonoids leads to higher bioavailability and reduces the potential for toxicity due to their newfound solubility provided by the sugar moieties [43]. The sugar moiety block and thereby protects the phenolic group from oxidative degradation. Biological activities of such flavonoids are enhanced against HIV and rotavirus, compared to their non-sugar and their corresponding aglycones (bottom row) based on their relative binding energies with respect to human TMPRSS2 protein derived from structure-based virtual screening. Substitutions and groups that increase/decrease binding affinities (decrease binding energy) are blue and red, respectively. Lower binding energy represents higher binding affinity. Refer to Table S1 for complete data of structurebased virtual screening. Structures adapted from PubChem.org. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 5.
Structure-Activity Relationships of select glycosylated flavonoids (top row) and their corresponding aglycones (bottom row) based on their relative binding energies for SARS-CoV-2 3CLpro, as derived from structure-based virtual screening. Substitutions and groups that increase/decrease binding affinities (decrease binding energy) are blue and red, respectively. Lower binding energy represents higher binding affinity. Refer to Table S1 for complete data of structure-based virtual screening. Structures adapted from PubChem.org. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.) precursors [44].
For decades, flavonoids were lauded for their antioxidant and antiinflammatory properties, and glycosylation complements this effect. For example, hesperidin, one of the molecules of interest in this study, has demonstrated high antioxidant activity through radical scavenging and an ability to reduce tissue damage via antioxidant cellular defenses using the ERK/Nrf2 signaling pathway [43,45]. The same glycosylated flavonoid downregulates inducible Nitric oxide synthase (iNOS) and Cyclooxygenase-2 (COX-2). Wherein lies its anti-inflammatory properties [46]. It modulates inflammatory markers such as (Interleukin) IL-6, IL-1β, and TNF-α, in addition to pro-inflammatory cytokines such as IFN-γ and IL-2, in various vital organs in animal models [47]. Moreover, by downregulating the abnormal levels of Angiotensin 2 in hypertensive rodents, hesperidin has been shown to attenuate inflammation. This model has the potential to be replicated in human SARS-CoV-2 patients, given the extensive role of the receptor in viral entry and pathogenesis of associated symptoms such as cytokine storm [43].
Similarly, naringin, a glycosylated flavonoid that is richly present in citrus fruits, is also known to have wide-ranging anti-inflammatory and antiviral activities that are potentially applicable for COVID-19 related therapy. By inhibiting the expression of COX-2, IL-1, IL-6, and iNOS via the suppression of the high mobility group box 1 (HMGB1), it offers antiinflammatory activities against COVID-19 pathogenesis [45]. Perhaps even more promisingly, naringin is widely investigated through live human clinical trials for its anti-hypercholesteraemic and anti-hypertensive activities. One study found a higher dosage of naringin reduces diastolic blood pressure significantly [48]. In addition to these two glycosylated flavonoids, others like baicalin and silymarin showed simultaneous anti-neuroinflammatory and antiviral potentials for combating SARS-CoV-2 neural complications [45].

Structure-activity relationships of glycosylated flavonoids
From the multiple flavonoids presented from structure-based virtual screening in Table S1, the structure-activity relationships of hesperidin, naringin, and a third glycosylated flavonoid were compared to their respective aglycones. From Figs. 4-6, it is evident that glycosylation generally leads to higher binding affinities (translated to lower binding energies) to all three targets presented in this study. Fig. 4 exhibits how the affinity for the human TMPRSS2 protein follows the order hesperidin > naringin > astragalin based on the relative presence/absence and positioning of hydroxyl groups on the backbone, in addition to the presence/absence of the C-ring C2-C3 double bond, which leads to lower binding energies when absent, as seen in hesperidin and naringin. However, the difference in affinity due to glycosylation can only be appreciated when flavonoids with sugar moieties are compared to their respective aglycones. We observe a staggering increase of up to 2.3 kcal/mol (hesperidin vs. hesperetin) due to the absence of glycosylation, highlighting its significant role in the binding mechanism.
Figs. 5 and 6 portray similar structure-activity relationships. The glycosylated and aglycosylated forms of myricetin and quercetin (under 3CLpro and PLpro, respectively) have much lower binding energies for the target proteins, further supporting the critical role of glycosylation of flavonoids in the SARS-CoV-2 protein binding mechanisms.
Our previous work supports the position that, overall, the presence of multiple hydroxyl groups on the B-ring of flavonoids reflects positively on the binding affinities of respective flavonoids to the viral targets [28]. Hesperidin and naringin are promising compounds against all the targets examined in this study, but especially against the human TMPRSS2 protein as per the structure-function tests. Glycosylated flavonoids were generally shown to bind more effectively to each of the targets than their aglycone counterparts. The analysis of structure-activity relationships plays a vital role in the initial stages of drug discovery and synthesis. It allows for the identification of particular structural aspects of compounds that enable efficacious binding and, later, activity.  6. Structure-Activity relationships of select glycosylated flavonoids (top row) and their corresponding aglycones (bottom row) based on their relative binding energies for SARS-CoV-2 PLpro, as derived from structure-based virtual screening. Substitutions and groups that increase/decrease binding affinities (decrease binding energy) are blue and red, respectively. Lower binding energy represents higher binding affinity. Refer to Table S1 for complete data of structure-based virtual screening. Structures adapted from PubChem.org.(For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Pharmacokinetic studies and toxicity studies
Hesperidin and naringin show several favorable biochemical properties, including good water solubility, no BBB permeability, and are substrates for P glycoprotein (Table S2). Solubility is one of the significant factors affecting the absorption of a drug into cells: being soluble is a requirement for drugs meant for parenteral usage, and enough of the drug for treatment can be delivered in a lower dosage. Hesperidin and naringin show even better solubility than the chosen control molecules [39]. Like the approved controls GC376 and camostat, they are not BBB permeable; also, like controls GRL0617 and GC376, they are P-glycoprotein (p-gp) substrates, suggesting good elimination from the body, reducing the potential for drug-induced toxicity (Fig. S2). The overall pharmacokinetic properties of the drugs are represented in the bioavailability radar in (Fig. S3). SwissADME is a helpful tool to ascertain the drug-likeliness of molecules before they proceed to pre-clinical trials. The drug-likeliness of a compound assesses the probability for a molecule to become an orally administered drug based on bioavailability [39]. The Lipinski Rule of Five is a parameter that is commonly used to evaluate the drug-likeliness of orally bioavailable drugs and states certain parameters that a drug must meet for the compound to be orally bioavailable. These predicted conditions include molecular mass less than 500 Dalton, high lipophilicity (expressed as consensus LogP <5), less than 5 hydrogen bond donors, less than 10 hydrogen bond acceptors, and a molar refractivity between 40 and 130 (Fig. S3).
As Table S2 shows, naringin and hesperidin, both have violations against the Lipinski Rule of Five. However, the Lipinski Rule of Five is only a tool that can help to predict the possibility of drug-likeliness based on orally bioavailable drugsit is not a parameter that must always be met to obtain approval standards [49]. This is illustrated by our control molecule GC376, an approved prodrug that readily converts to peptide aldehyde GC373 in cats and has activity against SARS-CoV-2 in humans [32,50]. Additionally, naringin and hesperidin's bioavailability score (defined as F>10% in rats) is 0.17, as shown in Table S2. This is the same as the bioavailability score for the approved drug GC376.
Additionally, while most flavonoids exhibit poor oral absorption and bioavailability due to the hydrophilic nature of the flavonoid glycoside group, several approaches improve the bioavailability of flavonoids including nanoformulations, encapsulation, and microemulsions [50]. Previous literature states that these molecules still display antiviral activities despite these problems. For example, naringin displays anti-adsorption effects against dengue virus type 2 [51], and hesperidin has also shown antiviral activity against rotavirus [51].
Additionally, several sources reported that the aglycones of these molecules had increased bioavailability. The aglycone of naringin is naringenin, which exerts therapeutic effects against SARS-CoV-2 through its action against the 3CLpro, main protease, and by reduction of ACE receptors [51]. Naringenin also displays anti-inflammatory activity against SARS-CoV-2 by reducing viral replication and cytokine production [51]. Naringenin and hesperetin (the aglycone of hesperidin) additionally show antiviral activity against the 17D strain of Yellow Fever Virus [52]. Generally, these flavonoids exhibit a high safety profile, with all three compounds having LD50 values of >2000 mg/kg to a minimum of 70% accuracy. Hesperidin appears to be the safest compound with a predicted LD50 of approximately 12,000 mg/kg, followed by naringin at 2300 mg/kg. All these values are higher or comparable to the LD50 for the approved control drugs (Table 1).

Redocking simulation and intermolecular interaction analysis
Through SBVS, the compounds with the best conformation and stability to target protein were predicted using the scoring function, which needed to be validated further via redocking and IMI analysis. To get the most appropriate binding poses of hesperidin and naringin in the binding pocket of TMPRSS2, 3CLpro, and PLpro, redockings were executed for each compound against all three proteins using AutoDock Vina with a minimum RMSD value (zero) and lesser docking score (highly negative). Both hesperidin and naringin exhibited binding energies in the acceptable range of − 7.7 to − 10 kcal/mol, and − 7.5 to − 9 kcal/mol respectively against TMPRSS2, 3CLpro, and PLpro.
Analyzing the intermolecular interactions of all the docked complexes, the TMPRSS2-hesperidin docked complex may be considered a more stable complex due to the higher number of H-bonds than with TMPRSS2-naringin. Similarly, of the PLpro-hesperidin and PLpronaringin complexes, the former docked complex may be considered more stable. In the case of 3CLpro, the docked complexes 3CLpro-hesperidin and 3CLpro-naringin appear to accommodate almost the same number of H-bonds, while the identical residues are involved in various other intermolecular interactions. These two complexes are considered to be almost equally stable.
As discussed in section 3.2.2, the glycosylation of the selected bioflavonoids reduces the binding energy and plays a significant role in the binding with the SRAS-CoV-2 target proteins (Figs. 4-6). The affinity between the bioflavonoids and the target proteins increases due to the sugar moiety's presence, which actively participates in the intermolecular interaction with the active site residues seen in the two-dimensional interaction diagrams (Fig. 7). In TMPRSS2-hesperidin docked complex, six out of seven hydrogen bonds were formed by the sugar moiety of hesperidin (Fig. 7a), and in TMPRSS2-narigin docked complex, only one hydrogen bond was seen, and it was formed by the sugar moiety of naringin (Fig. 7c). In the 3CLpro-hesperidin docked complex, two out of four hydrogen bonds were formed by the sugar moiety of hesperidin (Fig. 7e). In the 3CLpro-naringin docked complex, four out of five hydrogen bonds were created by the sugar moiety of naringin (Fig. 7g). Similarly, in PLpro-hesperidin docked complex, three out of six hydrogen bonds were formed by the sugar moiety of hesperidin (Fig. 7i). In contrast, all three hydrogen bonds were formed by the sugar moiety of naringin in the PLpro-naringin docked complex (Fig. 7k). Therefore, by analyzing the result of SAR and redocking altogether, it is clear that glycosylation plays a crucial role in the interaction between the selected bioflavonoids and SARS-CoV-2 target proteins.

Molecular dynamics simulation
Molecular dynamics (MD) simulation provides insights into a protein's dynamic behavior and the conformational changes that occur on binding with ligands by predicting the stability and intermolecular interactions of docked complexes with respect to time [52,53] Here, to check the stability of docked protein-ligand complexes, various parameters such as root mean square deviation (RMSD), root mean square fluctuation (RMSF), and protein-ligand contact profiling was recorded and tracked during the entire 100 ns simulation trajectories. Usually, the RMSD is used to monitor the structural fluctuations at the global level. It gives information about the deviation and conformational changes in the protein and ligand structure in the docked complex. In contrast, the RMSF monitors the fluctuation at the local level and measures the mean deviation of each residue and atom of protein and ligand, respectively, in each other's presence. Additionally, protein-ligand contact profiling assesses the interactions between the protein and ligand in the docked complexes during the simulation with respect to time, further exploring the stability of the docked ligand in the protein binding pocket.

RMSD and RMSF analysis
Primarily, the RMSD analysis of protein and ligand for each of the docked protein-ligand complexes was recorded with respect to the initial pose taken as a reference frame (Fig. 8). The RMSD for TMPRSS2 deviated <2 Å in both the TMPRSS2-hesperidin and TMPRSS2-naringin docked complexes. This indicated that the protein did not experience conformational changes when bound either to hesperidin or naringin. In contrast, the ligands showed significant fluctuations in both the docked complexes but remained in the binding pocket. The RMSD for hesperidin showed a very high deviation (− 9.0 Å) (Fig. 8a) that was observed in the case of the reference ligand camostat (Fig. 8c). This is supported by the RMSF values, where several atoms showed fluctuations close to 8 Å (Figs. S5a and S5c). However, unlike camostat, hesperidin remained buried inside the binding pocket. The RMSD for the naringin showed less deviation (− 5.0 Å) as compared to the reference ligand (− 9.0 Å) and remained inside the binding pocket of TMPRSS2 (Fig. 8b and c). Therefore, it is concluded that both the docked complexes exhibit higher stability and rigidity as compared to the reference TMPRSS-camostat complex. Additionally, the TMPRSS2-naringin docked complex showed higher dynamic stability with less deviation than the TMPRSS2hesperidin docked complex (Fig. 8a-c).
Similarly, the RMSD for the protein 3CLpro in the docked complexes 3CLpro-hesperidin and 3CLpro-naringin showed a mean deviation of <2.5 Å and <1.5 Å, respectively, considered highly acceptable. If we compare the RMSD of hesperidin and naringin in comparison to the reference ligand GC376 when bound to 3CLpro, both showed a higher mean deviation (>10 Å) than GC376 (<3.5 Å) (Fig. 8d-f). Their RMSF values support this observation, wherein hesperidin few atoms exhibited an internal fluctuation ≥8.0 Å, and in naringin the fluctuation was <3 Å (Figs. S5d and S5e). Additionally, hesperidin tended to leave the binding pocket between 60 and 70 ns of the 100 ns trajectory and again after 90 ns. This was not observed in the case of naringin, which remained buried inside the binding pocket during the entire trajectory of the simulation (100ns). Finally, although hesperidin and naringin showed more fluctuations than the reference ligand GC376, they remained associated with the protein trajectory ( Fig. 8d-f). Therefore, both bioflavonoids showed considerable dynamic stability with the 3CLpro, but naringin is a more suitable ligand than hesperidin due to less deviation in RMSD.
The RMSD for the protein PLpro in the docked complexes PLprohesperidin and PLpro-naringin showed mean deviation over a considerable range (<3.2 Å and <2.4 Å, respectively) throughout the 100 ns simulation. In comparison to the RMSD value of the reference ligand GRL0617 (~7.5 Å), naringin showed a higher deviation (~3.0 Å to ~11.0 Å) after 15 ns followed by the exhibition of a stable trajectory at 10.5 Å until ~95 ns and then seemed to fall again. A longer simulation would give a better understanding of the dynamic stability of naringin. Hesperidin, also showed deviation from ~3.0 Å to 7.0 Å after exactly 15 ns. Still, in contrast to naringin it returned to 4.0 Å after 60 ns and then exhibited equilibrium until the end of the 100 ns simulation (Fig. 8g-h). RMSF analysis showed that atoms in both the ligands showed maximum fluctuations of up to ~6.0 Å (Figs. S5g and S5h), suggesting that the naringin's high RMSD is not due to internal fluctuations but occurs upon interaction with PLpro. Overall, hesperidin is a more suitable ligand than naringin, as it deviated for only a short time before attaining a stable trajectory, which is then exhibited until the end of the 100 ns simulation.

Protein-ligand interaction profiling
The protein-ligand interaction profiling includes several interactions (H-bonds, hydrophobic interactions, ionic interactions, and salt bridges). It is a critical parameter to understand the modification in interactions between protein and ligand during the 100 ns MD simulation. The interaction of hesperidin and naringin with the active site residues of TMPRSS2, 3CLpro, and PLpro during the 100 ns simulation was further analyzed for various interactions occurring at the atomic and intermolecular level for time (Fig. 9).
In the TMPRSS2-hesperidin docked complex, mainly H-bonds and hydrophobic interactions were observed during the simulation. The residues Arg41, Asn146, Gly148, and Gln192 formed H-bonds more than 25% of the simulation time, and Cys219 created the H-bond for more than 50% of the time. The residues involved in hydrophobic interactions were Tyr149, Ala190, and Val213, but only Ala190 contributed to more than 30% of the simulation time. The residues His57, Asn146, Gln192 formed a water bridge for more than 30% of the simulation time in addition to developing H-bonds (Fig. 9a). In the TMPRSS2-naringin docked complex, the residues His40 and Gly148 formed H-bonds for more than 40% of the simulation time, and Asp189 and Ala190 formed H-bond for more than 30% of the simulation time. Hydrophobic interaction was not significantly involved in this complex. A water bridge was formed by His57, Thr62, Tyr63, Ser151, Gln192, and Ser195 residues for more than 35% of the simulation time and by Arg41 for more than 80% of the simulation time. Ionic interactions were formed by Asn146, Asp147, and Gln192 but only for a brief period of the simulation (Fig. 9b). The smaller number of interactions indicates the stability of the TMPRSS2-naringin docked complex is not suitable, as was also observed in the analysis of MD simulation data (Fig. 9c).
In the 3CLpro-hesperidin docked complex, H-bonds by Asp187, Gln189, Thr190, and Gln192, and water bridging by Glu166, Gln189 were involved in the interaction for more than 30% of the simulation. Comparatively, the 3CLpro-naringin docked complex showed a more significant number of H-bonds and hydrophobic interactions, suggesting the complex is dynamically more stable than the 3CLpro-hesperidin docked complex. H-bonds were formed by Thr26, His41, and Gly143 for more than 40% of the simulation, and by Thr25, Cys44, and His164, Glu166, and Gln192 for more than 30% of the simulation (Fig. 9d).
Among the PLpro-ligand docked complexes, the PLpro-hesperidin complex involved more H-bonds, hydrophobic interactions, and water bridges than PLpro-naringin and can be considered more stable. In the PLpro-hesperidin complex, H-bonds were formed by Arg3, Thr4, Pro248, and Tyr273 for more than 20% of the simulation. By Asn267 for more than 50% of the simulation, the hydrophobic interactions formed by Tyr264, and Tyr 267 for more than 50% of the simulation, and water bridges were formed by Thr301 for more than 60% of the simulation and by Glu51, Arg166, and Asn267 for more than 25% of the simulation (Fig. 9e and f). A similar observation was obtained through the MD simulation analysis of both the complexes where hesperidin was more stable in the binding pocket of PLpro.
The interactions between the three proteins and hesperidin and naringin were determined at a total 30% interval of 100 ns MD simulation. Interestingly, all the docked complexes exhibited the formation of H-bonds and water bridges, contributing to the dynamic stability of these bioflavonoids within the active site of target proteins. Consequently, combining MD simulation and protein-ligand interaction profiling to assess the stability of the docked complexes, we can conclude that: (i) the TMPRSS2-naringin docked complex seems to be more stable than TMPRSS2-hesperidin; (ii) 3CLpro-naringin is more stable than 3CLpro-hesperidin; and (iii) PLpro-hesperidin is more stable than PLpro-naringin (Fig. 10).

Binding free energy analysis
Using molecular mechanics/generalized born surface area calculation (MM/GBSA) approach, the binding free energy and the binding affinity of the selected protein-ligand complexes were evaluated. Followed by the MD simulation, the last ten poses of each docked complexes i.e., TMPRSS2-hesperidin, TMPRSS2-naringin, 3CLpro-hesperidin, 3CLpro-naringin, PLpro-hesperidin, and PLpro-naringin, were extracted from 100 ns MD simulation trajectory at a regular interval of 10 ns to calculate the average binding free energy against the reference complexes, viz. TMPRSS2-Camostat, 3CLpro-GC376, and PLpro-GRL0617 (Supplementary Table 5). MMGBSA calculation showed that naringin exhibited significant binding free energy with TMPRSS2 (− 65.83854 kcal/mol) and 3CLpro (− 64.76481) in comparison to the reference inhibitors i.e., camostat (− 79.84702 kcal/mol) and GC376 (− 77.88006 kcal/mol) respectively. In contrast, hesperidin exhibited lesser binding energy (− 43.14713) with the PLpro protein in comparison to the reference inhibitor i.e., GRL0617 (− 39.32349) ( Table 2). In addition, parameters related with the dissociation energy were also calculated for each of the docked complexes and exhibited substantial support of Van der Waals' interaction energy (ΔGBind vdw), and Lipophilic energy (ΔG Bind Lipo ) in the stability whereas Generalized Born electrostatic solvation energy (ΔG Bind Solv GB ) was found to contribute in the instability of the respective docked complexes (Fig. 11). Coulomb energy (Δ G Bind Coulomb ) also contributed significantly to the stability of all the docked complexes except TMPRSS2flavonoid complexes. Overall, the MMGBSA result suggested that the selected bioflavonoids exhibit significant binding free energy and substantial stability by forming Van der Waals' interactions with the  essential residues in the active site of SARS-CoV-2 target proteins.

Conclusion
This study adopted a molecular docking and simulation approach to identify common bioactive flavonoids potentially active against the SARS-CoV-2 targets 3CLpro, PLpro, and TMPRSS2. Two potent flavonoids, hesperidin, and naringin, were identified through virtual screening followed by evaluation for drug likeliness, toxicity, and dynamic stability in comparison to the reference ligands camostat (for TMPRSS2), GC376 (for 3CLpro), and GRL0617 (for PLpro). The importance of glycosylation to the anti-inflammatory and antiviral activities and its contribution to the structure-activity relationship was discussed. Molecular docking simulation showed that the binding energies of hesperidin and naringin were indicative of strong complex formation with the target proteins. Molecular dynamics simulation for 100 ns showed that the deviation in dynamic stability for all the docked complexes lies in an acceptable range. Therefore, these complexes can be considered stable due to the formation of multiple intermolecular interactions. Based on our analysis of the drug likeness, toxicity, binding energy, MD simulation trajectory, and protein-ligand interaction profiles, we conclude that the bioflavonoids hesperidin and naringin are potential therapeutic agents against SARS-CoV-2. Of the two, naringin showed better affinity and stability with TMPRSS2 and 3CLpro, whereas hesperidin showed better binding affinity and stability with PLpro. The binding free energy calculation further supported the stability and affinity of the docked flavonoids through MM/GBSA where naringin showed a significant binding affinity with TMPRSS2 and 3CL pro.
In contrast, hesperidin exhibited significant binding affinity with the PLpro protein of SARS-CoV-2. The screened flavonoids should be further validated through in vitro inhibition and neutralization assays to develop a therapeutic drug molecule against SARS-CoV-2. Our results add to existing knowledge concerning the antiviral effects of flavonoids and set the stage for potentially further studies in the field. Hopefully, they invite in vitro, in vivo, and clinical trials to investigate the effects of the currently mentioned flavonoids and other phytochemicals in this global pandemic against a common foe.

Author contributions
A.C. designed the study, critically supervised the project, revised, and reviewed the manuscript with the help of S.K.J and SSM; S.K., and P. Y. performed the experiments; S.K., and A.C. analyzed the data and wrote the manuscript; P.P. and R.K. wrote parts of the manuscript and generated some of the tables and the graphs. All authors have read and agreed to the published version of the manuscript.

Funding
This research received no external funding.

Institutional review board statement
Not applicable.

Informed consent statement
Not applicable.

Data availability statement
No new data were created or analyzed in this study.

Declaration of competing interest
The authors declare no conflict of interest.