MathDL: mathematical deep learning for D3R Grand Challenge 4

Journal of Computer-Aided Molecular Design

Abstract

We present the performance of our mathematical deep learning (MathDL) models for D3R Grand Challenge 4 (GC4). This challenge involves pose prediction, affinity ranking, and free energy estimation for beta secretase 1 (BACE), as well as affinity ranking and free energy estimation for Cathepsin S (CatS). We have developed advanced mathematical tools, namely differential geometry, algebraic graph, and/or algebraic topology, to accurately and efficiently encode high-dimensional physical/chemical interactions into scalable low-dimensional representations that are invariant under rotation and translation. These representations are integrated with deep learning models, such as generative adversarial networks (GAN) for pose prediction and convolutional neural networks (CNN) for energy evaluation. Overall, our MathDL models achieved the top place in pose prediction for BACE ligands in Stage 1a. Moreover, our submissions obtained the highest Spearman correlation coefficient on the affinity ranking of 460 CatS compounds, and the smallest centered root mean square error on the free energy set of 39 CatS molecules. It is worth mentioning that our docking pose prediction method has improved significantly over our previous ones.

Notes

  1. This subcategory is the common list of the ligand-based and structure-based scoring subcategories.

Acknowledgements

This work was supported in part by NSF Grants DMS-1721024, DMS-1761320, and IIS-1900473 and NIH Grant GM126189. DDN and GWW are also funded by Bristol-Myers Squibb and Pfizer.

Author information

Corresponding author

Correspondence to Guo-Wei Wei.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Nguyen, D.D., Gao, K., Wang, M. et al. MathDL: mathematical deep learning for D3R Grand Challenge 4. J Comput Aided Mol Des 34, 131–147 (2020). https://doi.org/10.1007/s10822-019-00237-5
