Affinity prediction using deep learning based on SMILES input for D3R grand challenge 4

Journal of Computer-Aided Molecular Design

Abstract

Modern molecular docking comprises the prediction of pose and affinity. When the three-dimensional coordinates of the ligand are not provided, docking poses must be predicted before affinity can be predicted. However, existing methods require a large amount of feature engineering. In addition, sequentially combining pose and affinity prediction calls for a robust model, because the predicted ligand position is subject to probabilistic deviation. We propose a pipeline using a bipartite graph neural network and transfer learning, trained on a re-docking dataset. We evaluated our model on the data released for the Drug Design Data Resource Grand Challenge 4 (D3R GC4). The data for the two target proteins provided by the challenge show different patterns. The model outperformed the best participant by 9% on the BACE target protein from stage 2. Further, our model showed competitive performance on the CatS target protein.
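
The abstract does not describe how the bipartite graph is built. As a rough illustration only, the sketch below assumes (this is not stated in the source) that ligand atoms and protein atoms form the two node sets and that an edge joins a ligand atom to a protein atom when they lie within a distance cutoff. The function name `bipartite_edges`, the 4.0 Å cutoff, and the toy coordinates are all hypothetical.

```python
import math

def bipartite_edges(ligand_xyz, protein_xyz, cutoff=4.0):
    """Return (ligand_index, protein_index) pairs closer than `cutoff`.

    Edges only ever connect a ligand atom to a protein atom, never two
    atoms from the same set, which is what makes the graph bipartite.
    The cutoff value is an assumption chosen for illustration.
    """
    edges = []
    for i, lig_atom in enumerate(ligand_xyz):
        for j, prot_atom in enumerate(protein_xyz):
            # Euclidean distance between the two 3D points.
            if math.dist(lig_atom, prot_atom) < cutoff:
                edges.append((i, j))
    return edges

# Toy coordinates in angstroms (hypothetical, not taken from D3R GC4 data).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [(0.0, 3.0, 0.0), (10.0, 0.0, 0.0), (1.5, 0.0, 3.5)]

edges = bipartite_edges(ligand, protein)
print(edges)
```

In this toy case the second protein atom is too far from both ligand atoms, so it ends up with no edges. A graph neural network would then pass messages only along the remaining ligand-protein pairs.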

Data availability

All data are publicly available. Please refer to the code availability section for details.

Code availability

The model code and data are available at https://github.com/arwhirang/affinity_prediction_BGNN.

Funding

This study was supported by a National Research Council of Science & Technology (NST) grant from the Korea government (MSIP) (No. CAP-17-01-KIST Europe).

Author information

Contributions

The study was designed by SL and YL. SL wrote the code and performed the analysis. The original manuscript was written by SL and YL. All authors (SL, YL, JY, and YK) reviewed and edited the manuscript. YL and YK acquired the funding. All authors approved the final version of the manuscript.

Corresponding author

Correspondence to Sangrak Lim.

Ethics declarations

Conflict of interest

We declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 2774 KB)

About this article

Cite this article

Lim, S., Lee, Y.O., Yoon, J. et al. Affinity prediction using deep learning based on SMILES input for D3R grand challenge 4. J Comput Aided Mol Des 36, 225–235 (2022). https://doi.org/10.1007/s10822-022-00448-3

  • DOI: https://doi.org/10.1007/s10822-022-00448-3
