Identification of lactoferrin-derived peptides as potential inhibitors against the main protease of SARS-CoV-2

COVID-19 is a global health emergency that causes serious concerns. A global effort is underway to identify drugs for the treatment of COVID-19. One possible solution to the present problem is to develop drugs that can inhibit SARS-CoV-2 main protease (Mpro), a coronavirus protein that been considered as one among many drug targets. In this work, lactoferrin from Bos taurus L. was in silico hydrolyzed. The bioactivity, water solubility, and ADMET properties of the generated peptides were predicted using various online tools. The molecular interactions between Mpro and the peptides were studied using molecular docking and molecular dynamic simulation. The results demonstrated that peptide GSRY was predicted to have better physicochemical properties, and the value of ‘-C DOCKER interaction energy’ between peptide GSRY and Mpro was 80.8505 kcal/mol. The interaction between the peptide GSRY and the native ligand N3 co-crystallized with Mpro had overlapped amino acids, i.e., HIS163, GlY143, GLU166, GLN189 and MET165. Molecular dynamic simulation revealed that Mpro/GSRY complexes were stable. Collectively, the peptide GSRY may be a potential candidate drug against Mpro of SARS-CoV-2.

Currently, there is no specific clinical treatment for SARS-CoV-2mediated infections (Hasan et al., 2021). One possible drug target is the major structural protein of the virus, and the development of drugs targeting this protein has been suggested (Zhou, Hou, Shen, Huang, & Martin, 2020). At the same time, a great deal of studies targeting non-structural viral proteins responsible for viral replication and maturation are also being conducted (Gupta et al., 2020). The genome of COVID-19 virus consists of ~30,000 nucleotides that encode two overlapping polyproteins necessary for viral replication and transcription . Functional proteins are released due to extensive proteolysis of the polyproteins by 33.8-kDa main protease (M pro ) (Hegyi, Friebe, Gorbalenya, & Ziebuhr, 2002). M pro plays a role in virus life cycle, and the lack of this protease in human makes it an intuitive choice for antiviral drug target (Pillaiyar, Manickam, Namasivayam, Hayashi, & Jung, 2016). Thus, M pro can be used as a target for new virus inhibitors.
Growing evidence suggests that existing antivirals have low or no efficacy due to the viral resistance and/or co-infections. Therefore, the demand for novel antiviral drugs has greatly increased. In recent years, antiviral peptides have attracted increasing attention due to their high specificity and effectiveness. Additionally, antiviral peptides possess broad-spectrum activity with minimum side effects (Heydari et al., 2021). Several antiviral peptides have been previously reported, including anti-neurovirulent enterovirus A71 peptide SP40 (Lalani, Gew, & Poh, 2021), anti-influenza virus peptide M2 AH (Jung et al., 2019), and anti-Japanese encephalitis virus peptide P1 (Yu et al., 2021). Antiviral peptides that exhibit anti-coronavirus activity have also been reported, such as antiviral peptides (AVPs) reported by (Hollmann, Cardoso, Espeche, & Maffía, 2021) and antiviral peptide P9R . Furthermore, a recent study has shown that in silico hydrolysis of gastrointestinal enzymes in marine fish generates active peptides. Some oligopeptides that have high binding affinity to SARS-CoV-2 M pro have also been identified. These oligopeptides may be used as potential inhibitors of SARS-CoV-2 (Heydari et al., 2021).
Lactoferrin (LF), a milk-derived 80-kDa glycoprotein , plays important roles in host defense mechanism. Lactoferrin is also a good material for producing a variety of peptides with antimicrobial, immunomodulating, and antihypertensive activities (Tu et al., 2020). Previous study has shown that lactoferrin exhibits antiviral activity against virus during its early infection (Berlutti et al., 2011). Furthermore, lactoferrin has in vitro antiviral efficacy against SARS-CoV that is closely related to SARS-CoV-2 (Chang, Ng, & Sun, 2020). Therefore, lactoferrin can be a good material for producing peptides that may potentially be used as M pro inhibitors to inhibit the replication of SARS-COV-2.
Molecular docking is a computational technique used to discover the interactions between proteins and drugs or peptides (Azam & Jupudi, 2019). The binding energy value obtained from the molecular docking can be used as a guide for screening ligands available in the ligand library (Vora et al., 2019). An effective inhibitor can be designed based on the crystal structures of protein/ligand complexes, and this can also be applied to design new structure-based inhibitors of SARS-COV-2 M pro (BN, 2020).
To find effective M pro inhibitors, lactoferrin from Bos taurus L. was in silico hydrolyzed. The generated peptides with a bioactivity score of higher than 0.50 and with good solubility were selected. Then, the peptides with good pharmacological properties were identified by ADMET analysis. The obtained peptides that could bind closely to the active sites of M pro were identified. Finally, molecular dynamic simulation was employed to evaluate the stability of peptide/M pro complexes.

Materials and methods
The schematic diagram of the method is shown in Fig. 1.

Virtual hydrolysis
The sequence of bovine lactoferrin was retrieved from the NCBI database. ExPASy PeptideCutter (http://web.expasy.org/peptidecutter) was used for virtual hydrolysis of lactoferrin. Pepsin (EC3.4.23.1) and trypsin (EC3.4.21.4), which are the typical gastrointestinal enzymes, were used in the hydrolysis (Z.  to generate a large number of peptides, and the peptide sequences were screened based on their properties and characteristics.

Bioactivity and water solubility
The bioactivity of all peptides obtained from virtual hydrolysis was predicted using PeptideRanker (http://bioware.ucd.ie/compass/biow areweb/Serverpages/Peptideranker.php). The water solubility of the selected active peptides was also predicted. The solubility of the peptides was predicted using peptide property calculator available at http: //www.innovagen.com (Lafarga, O'Connor, & Hayes, 2015). The sequences of water-soluble active peptides were determined.

Bioavailability and ADMET parameters screening of peptides
The SWISS-ADME (https://www.swissadme.ch) was used to predict the ADMET properties of the peptides. The molecular weight (MW), number of hydrogen bond acceptors (HBA), number of hydrogen bond donors (HBD), number of rotatable bonds, the octanol-water partition coefficient (log Po/w), and the topological polar surface area (TPSA) of the peptides were evaluated. Blood-brain barrier (BBB) permeability of each peptide was also predicted.

Selection and preparation of main protease as target
Three-dimensional structure of SARS CoV-2 M pro in a complex with an inhibitor N3 (PDB ID: 6LU7) was retrieved from the Protein Data Bank (Burley et al., 2018). Prior to molecular docking, the downloaded structure was processed as follows. First, water molecules were removed and hydrogen atoms were added. Second, the binding sites of M pro were determined based on the binding site of the inhibitor N3. Third, the inhibitor N3 was deleted. Finally, the energy of M pro was minimized using the CHARMm force field in Discovery Studio 2017 R2 (Dassault Systèmes, BIOVIA, San Diego, CA, USA).

Molecular docking
In the molecular docking, the cavity around the binding site of inhibitor N3 on M pro was investigated (McIntosh & Perlman, 2014). The coordinates of the binding site are X: 10.765, Y: 12.509, and Z: 68.969. The 3D structures of the peptides were constructed using Discovery Studio 2017 R2 and their energy was minimized and optimized using the CHARMm force field. CDOCKER module was used for the semi-flexible docking between M pro and the active peptide: M pro was set as rigid, whereas the active peptide molecule was set as flexible. Dock Ligands module was used for molecule docking. The number of random conformations generated from each molecule was set to 10, and other parameters were default values. Based on the docking results, the CDOCKER energy of the active peptide/M pro complex was calculated, and the binding ability between the active peptide and M pro was presented as '-C DOCKER interaction energy'. The peptide/M pro complexes with higher '-C DOCKER interaction energy' values were selected.

Molecular dynamic simulation
Molecular dynamic simulation (100 ns) was performed using Gromacs 2018 (GNU Lesser General Public License, Boston, USA) (Alamri et al., 2021). The CHARMM36 force field was used in all simulation. Each system was placed in the center of a cubic periodic cell, hydrated with water molecules, and then neutralized by ions (CL/NA atoms). The steepest descent algorithm was used to minimize the energy of each system. Before the simulation, the system was equilibrated with NVT ensemble, followed by NPT ensemble, each for 1 ns. The temperature of the NVT ensemble was maintained at 300 K using velocity rescaling thermostat (Xia et al., 2018). The pressure of the NPT ensemble was maintained at 1 bar using Berendsen barostat (Feng & He, 2016). The programs in GROMACS package including root mean square deviation (RMSD), root mean square fluctuation (RMSF), and radius of gyration (Rg) were employed to analyze the molecular dynamic trajectory.

Prediction and analysis of bioactivity and water solubility of peptides
The bioactivity and water solubility of 94 peptides were predicted. Peptides with bioactivity scores of higher than 0.50 are shown in Table 1. The higher the bioactivity score, the higher the probability that the peptide is more active. PeptideRanker is a server that can be used to predict bioactive peptides based on a novel N-to-1 neural network (Yu et al., 2018). PeptideRanker is trained at a threshold of 0.5; thus, a predicted peptide with threshold of over 0.5 is labeled as bioactive peptide (Mooney, Haslam, Pollastri, & Shields, 2012). Water solubility of bioactive peptides plays an important role in their physiological functions (Lee, Hong, Kim, & Lee, 2017). Water solubility of drugs is highly important for their absorption and distribution characteristics (Vijay Kumar, Indra, Kamlesh, & PRASHANT, 2020). Failure of drugs is generally caused by insufficient absorption due to poor solubility of drugs (Babić et al., 2019). As shown in Table 1, the peptides RC, KCRRWQW, SQSCAPGRDPKSRL, LRP, AAPRKNVRW, RL, DGG, CKGE-GENQCACSSREPY, CQ, GSPPGQRD, GSRY, KC, DGGMV, and GGRPTY had good water solubility and bioactivity, and only these peptides were further analyzed.

Prediction and analysis of ADMET characteristics of peptides
Optimizing drug-likeness properties is the crux of new drug development (Vijay Kumar et al., 2020). Pharmacological parameters are indicators for predicting the activity of new compounds. The drug-likeness properties of compounds were determined based on Lipinski's rule of five (RO5). The RO5 criteria include MW ≤ 500 g/mol, number of HBA ≤10, number of HBD ≤5, number of rotatable bonds ≤10, and log Po/w ≤ 5. To obtain better results, the compounds that were conformed to RO5 were selected as compounds with drug-likeness properties. Table 1 shows the ADMET properties of peptides predicted by SwissADME. Peptides GGRPTY, GSPPGQRD, AAPRKNVRW, KCRRWQW, SQSCAPGRDPKSRL, and CKGEGENQCACSSREPYF only conform to log Po/w ≤ 5. By contrast, the peptides RL, KC, DGG, CQ,  LRP, RC, DGGMV, and GSRY conformed to the RO5 criteria. The percent absorption (%ABS) was calculated by: %ABS = 109 × (0.345 × TPSA); where TPSA is a descriptor for the passive transport of molecules through the membrane (Kauthale, Tekale, Damale, Sangshetti, & Pawar, 2017). The percent absorption rate of the peptides RL, KC, DGG, CQ, LRP, RC, DGGMV, GSRY, and GGRPTY ranged from 4.68% to 55.76%. Among these eight peptides, the peptide GGRPTY had the lowest percent absorption of 4.68%. The peptides GGRPTY, GSPPGQRD, AAPRKNVRW, KCRRWQW, SQSCAPGRDPKSRL, and CKGE-GENQCACSSREPYF had low percent absorption.
The distribution of drugs in the body is one of the most important parameters that can be determined by pharmacokinetics (Di, Kerns, & Carter, 2009). It can be measured by blood-brain barrier penetration (BBB) (Vishvakarma, Kumari, & Singh, 2020). The central nervous system (CNS) mainly controls the overall activities of the body. The blood-brain barrier (BBB) separates the circulating blood in CNS from the extracellular fluid circulating to the rest of the body. Drugs can be divided into CNS-targeted and non-CNS-targeted. In the development of non-CNS-targeted drugs, researchers must ensure that the drugs do not cross the BBB (Tomlinson et al., 2009), as drugs that cross BBB are associated with a greater risk of side effects (Khan et al., 2020). The characteristics of drugs in this class (BBB-impermeable drugs) include MW < 400-500 g/mol, H-bonds < 8-10, and not acids (Vijay Kumar et al., 2020). As demonstrated in Table 2, the peptides RL, KC, DGG, CQ, LRP, RC DGGMV, GSRY, GGRPTY, GSPPGQRD, AAPRKNVRW, KCRRWQW, SQSCAPGRDPKSRL, and CKGEGENQCACSSREPYF were BBB-impermeable peptide.
Metabolism is the breakdown of compounds that enter into the body. Metabolism of drugs or other molecules is accomplished by oxidoreductases in the liver. The most common types of oxidoreductases are cytochrome P450 enzymes (Al-Hazmi, 2016;Sahu et al., 2019). Members of cytochrome P450 enzymes include: CYP2D6 and CYP3A4, which are the most important enzymes in the metabolism; CYP2C9, which is one of the most important CYP s and is involved in approximately 20% of P450-mediated drug oxidation reactions; and CYP2C19, which has been shown to cause toxicity and alter efficacy of many drugs (Hirota, Eguchi, & Ieiri, 2013). Eight peptides including RL, KC, DGG, CQ, LRP, RC, DGGMV, and GSRY had no effect on the metabolism by P450 enzymes, thus were not considered inhibitors. Moreover, the eight peptides had no effect on the oxidation reaction of the drug, nor changed its toxicity and efficacy. Owing to their enhanced pharmacokinetic and pharmacodynamic properties, the peptides RL, KC, DGG, CQ, LRP, RC, and DGGMV were selected in further evaluation.

Molecular interaction mechanism between peptide and main protease M pro
In molecular docking, the higher the '-C DOCKER interaction energy' value, the more favorable the binding . Table 3 shows the binding energies obtained from the molecular docking of peptides with SARS-CoV-2 M pro . The '-C DOCKER interaction energy' value of the binding between GSRY and M pro was 80.8505 kcal/mol. The value of '-C DOCKER interaction energy' value of the binding between the inhibitor N3 and M pro was 79.0202 kcal/mol. The '-C DOCKER interaction energy' value of the binding between RC, LRP, DGGMV, KC, RL, CQ, and DGG and M pro was lower than that of the binding between the inhibitor N3 and M pro . The inhibitor N3 is the original ligand of SARS-CoV-2 M pro . The binding of peptide GSRY with M pro had higher '-C DOCKER interaction energy' than that of inhibitor N3 with M pro , indicating that the peptide GSRY could bind more strongly to M pro than the inhibitor N3. According to the above results, the peptide GSRY could fit well in the active pocket of M pro .
The interaction of the inhibitor N3 with M pro is shown in Fig. 2. The inhibitor N3 binds to M pro at the following amino acid residues: CYS145, HIS163, GLY143, GLN189, GLU166, HIS164, and THR190 through conventional hydrogen bonds; GLN189, GLU166, HIS164, and MET165 through carbon hydrogen bonds; and CYS145 and MET49 through pisulfur interactions. The binding also involves alkyl interaction with MET165, pi-donor hydrogen bond interaction with THR25, and pi-alkyl interaction with HIS41.
The interaction between the peptide GSRY and M pro is also shown in Fig. 2. The peptide GSRY binds to M pro at the following amino acid residues: HIS163, GLY143, GLU166, and LEU167 through conventional hydrogen bonds; GLN189, ASN142, GLU166, PRO168, and MET165 through carbon hydrogen bonds. Charge interaction with GLU166 and pi-alkyl interaction with MET49 are also involved in the binding.
A study has shown that proline is the amino acid most commonly found near protein-protein interaction sites (Kini & Evans, 1995). However, the binding with a proline-containing peptide leads to a lower loss of entropy, thus has restricted mobility, unlike the binding with other types of peptides, which is more flexible (Williamson, 1994). Proline is the only residue with an aliphatic ring encompassing both of its main chain and side chain, and the peptide GSRY contains a benzene ring. The aliphatic ring of proline can bind an aromatic ring through  C-H⋅⋅⋅π interaction with substantially low binding energy (Bhattacharyya & Chakrabarti, 2003). From the docking results, the interactions of M pro with the peptide GSRY and that with the inhibitor N3 were overlapped at residues HIS163, GlY143, GLU166, GLN189 and MET165; and GlY143, GLU166, GLN189 and MET165 were the critical residues for the successful binding of M pro SARS-CoV-2 (Hasan et al., 2021). Thus, the peptide GSRY may be developed as a novel coronavirus inhibitor.

Molecular dynamic simulation
RMSD is a crucial parameter for evaluating the stability of molecular dynamic trajectories (Biswas et al., 2021). The RMSD plot of M pro /GSRY complexes is displayed in Fig. 3(a). As can be seen, the RMSD trajectory value ranged between 0.005 nm and 0.436 nm, and the average RMSD was 0.329 nm. The structure of GSRY was initially deviated up to around 48 ns of the simulation time, but became considerably more rigid thereafter.
Furthermore, the RMSF values for M pro /GSRY complex were calculated (Arooj, Shehadi, Nassab, & Mohamed, 2020). As shown in Fig. 3 (b), all residues were fluctuated at below 0.6 nm, and the fluctuations of amino acid residues 1-7, 142-154, 215-226, and 296-305 were slightly higher compared to those of other residues. The overall fluctuation was considered stable, except for certain regions; thus, we may conclude that the M pro /GSRY complex has acceptable stability.
Rg is an indicator of compactness and stability of protein structures (Lobanov, Bogatyreva, & Galzitskaya, 2008). The stability and degree of compactness of M pro /GSRY complexes were analyzed based on their Rg values. As shown in Fig. 3(c), the M pro /GSRY complexes had low Rg values between 2.209 and 2.294 nm, suggesting that the complexes were stable without large variations over the entire simulation time.

Conclusions
In summary, we in silico analyzed B. taurus lactoferrin and determined their various characteristics with an aim to find a novel SARS-CoV-2 inhibitor. We found that among all the peptide analyzed, the peptide GSRY was the most promising peptide; it had a bioactivity score of 0.5381 and good water solubility. ADMET analysis indicated that the percent absorption of the peptide GSRY was 21.72%. The peptide GSRY could bind to M pro through six conventional hydrogen bonds, seven carbon hydrogen bonds, one charge interaction and one pi-alkyl interaction. In addition, the binding of the peptide GSRY to M pro overlapped with the binding of the inhibitor N3 to M pro at HIS163, GLY143, GLU166, GLN189, and MET165, and GLY143, GLU166, GLN189 and MET165 were at the critical residues for the successful binding of M pro SARS-CoV-2. Moreover, molecular dynamic simulation revealed that M pro /GSRY complexes were stable. Overall, the peptide GSRY may be developed as a novel SARS-CoV-2 inhibitor.

Declaration of competing interest
The authors declare no conflict of interest.