Deep learning model for virtual screening of novel 3C-like protease enzyme inhibitors against SARS coronavirus diseases

https://doi.org/10.1016/j.compbiomed.2021.104317

Highlights

  • Develop a deep learning-based CNN model to predict the inhibitory activity of 3CLpro of SARS-CoV.

  • The proposed CNN model was compared with RF, NB, DT, and SVM.

  • Obtained an accuracy of 86% and a ROC of 71%.

  • Screened unknown compounds, i.e., phytochemical compounds, FDA-approved drugs, natural products from NCI divset IV, and natural compounds from the ZINC database.

  • Applied Lipinski's rule of five (RO5) to prioritize drug-like compounds.

  • Of the 10 phytochemical hit compounds, 9 anti-SARS-CoV agents belong to the flavonoid group.
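
The RO5 prioritization step mentioned in the highlights can be sketched as a simple descriptor filter. This is a minimal illustration, not the authors' implementation: the `Descriptors` class, the compound names, and all descriptor values are hypothetical, and the number of tolerated violations (`max_violations`) is an assumption, since the source does not state the threshold used.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for the four RO5 descriptors of one compound."""
    name: str
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def ro5_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four criteria the compound violates."""
    checks = [
        d.mol_weight > 500,   # molecular weight should not exceed 500 Da
        d.logp > 5,           # logP should not exceed 5
        d.h_donors > 5,       # at most 5 hydrogen-bond donors
        d.h_acceptors > 10,   # at most 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def passes_ro5(d: Descriptors, max_violations: int = 0) -> bool:
    """Assumption: a compound is kept if it has at most `max_violations` violations."""
    return ro5_violations(d) <= max_violations


# Hypothetical descriptor values, for illustration only (not data from the paper).
candidates = [
    Descriptors("compound_a", mol_weight=302.2, logp=1.5, h_donors=5, h_acceptors=7),
    Descriptors("compound_b", mol_weight=610.5, logp=-1.3, h_donors=10, h_acceptors=16),
]
drug_like = [d.name for d in candidates if passes_ro5(d)]
```

In practice the descriptor values would come from a cheminformatics toolkit rather than being typed in by hand; only the filtering logic is shown here.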

Abstract

In the context of the recently emerging COVID-19 pandemic, we developed a deep learning model that predicts whether unknown compounds inhibit the 3C-like protease (3CLpro) of severe acute respiratory syndrome coronavirus (SARS-CoV) during the virtual screening process. This paper proposes a novel deep learning-based method that implements virtual screening with a convolutional neural network (CNN) architecture. Chemical molecules are represented by descriptors, which are fed into the CNN framework to train a model and predict active compounds. The proposed CNN model was compared with other machine learning methods, including random forest, naive Bayes, decision tree, and support vector machine. On the test set, the CNN model achieved an accuracy of 0.86, a sensitivity of 0.45, a specificity of 0.96, a precision of 0.73, a recall of 0.45, an F-measure of 0.55, and a ROC of 0.71. The CNN model identified 17 out of 918 phytochemical compounds; 60 out of 423 compounds from the natural product NCI divset IV; 17,831 out of 112,267 from the ZINC natural product database; and 315 out of 1556 FDA-approved drugs as anti-SARS-CoV agents. Further, to prioritize drug-like compounds, Lipinski's rule of five was applied to the predicted anti-SARS-CoV compounds (excluding FDA-approved drugs), resulting in 10, 59, and 14,025 hit molecules from the phytochemical, NCI divset IV, and ZINC sets, respectively. Of the 10 phytochemical hits, 9 belonged to the flavonoid group. In conclusion, the proposed CNN model can prove useful for developing novel target-specific anti-SARS-CoV compounds.
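
The evaluation metrics listed in the abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The function name and the counts are hypothetical and chosen only for illustration; they are not the paper's test-set counts. Sensitivity and recall are the same quantity, which is why the abstract reports the same value for both.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one actual positive,
    one actual negative, and one predicted positive).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate, identical to recall
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # fraction of predicted actives that are active
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = classification_metrics(tp=30, fp=10, tn=140, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A high specificity combined with a low sensitivity, as reported in the abstract, is the pattern such a calculation produces when a classifier rejects most inactive compounds but also misses a substantial share of the active ones.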

Keywords

Deep learning
CNN Model
Convolutional neural network
COVID-19
SARS-CoV
3CLpro
Phytochemical compounds
Virtual screening
FDA-approved drugs

Madhulata Kumari is a research associate in the School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi, India. She holds a Ph.D. in Information Technology from Kumaun University, Nainital, Uttarakhand, India. Her main research interests are data mining, machine learning, deep learning, molecular docking, molecular dynamics simulation, pharmacophore modelling, 3D-QSAR modelling, lead optimization, in silico ADMET prediction, and drug design. Her work has been published in various peer-reviewed journals.

Naidu Subbarao is an Associate Professor in the School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India. He received his MSc and PhD from IIT Kanpur. His research interests include molecular modelling, molecular docking, molecular dynamics simulation, pharmacophore modelling, 3D-QSAR modelling, development of drug target databases of Plasmodium falciparum and Mycobacterium tuberculosis, computational biology, cooperativity in macromolecules, protein-protein interactions, and structure-based drug design. His work has been published in various peer-reviewed journals.
