Mixed learning algorithms and features ensemble in hepatotoxicity prediction

Journal of Computer-Aided Molecular Design

Abstract

Drug-induced liver injury, although infrequent, is an important safety concern that can lead to patient fatality and to failure in drug development. In this study, we used an ensemble of mixed learning algorithms and mixed features to develop a model for predicting hepatic effects. This robust method is based on the premise that no single learning algorithm is optimal for all modelling problems. An ensemble model of 617 base classifiers was built from a diverse set of 1,087 compounds. The ensemble model was validated internally with five-fold cross-validation and 25 rounds of y-randomization. In external validation on 120 compounds, the ensemble model achieved an accuracy of 75.0%, a sensitivity of 81.9% and a specificity of 64.6%. The model also identified 22 of 23 drugs that were withdrawn or that carry a black box warning for hepatotoxicity. Dronedarone, which is associated with severe liver injuries announced in a recent FDA drug safety communication, was predicted as hepatotoxic by the ensemble model. The ensemble model classified positive compounds (those with hepatic effects) well, but performed less well on negative compounds when they were structurally similar. The ensemble model built in this study is made available for public use.
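
As a rough illustration of the ideas in the abstract, the Python sketch below combines base-classifier votes and computes the three reported external-validation metrics. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the 617 base classifiers are combined, so simple majority voting is assumed here, and the function names are hypothetical. The confusion-matrix counts (59 true positives, 13 false negatives, 31 true negatives, 17 false positives) are not taken from the paper; they are one set of counts over 120 compounds that happens to reproduce the reported 75.0%, 81.9% and 64.6%.

```python
from typing import Dict, Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Combine base-classifier votes (1 = hepatotoxic, 0 = not).

    Assumption: plain majority voting with ties resolved as 0.
    The abstract does not state the actual combining rule.
    """
    return 1 if 2 * sum(votes) > len(votes) else 0


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Three hypothetical base classifiers voting on one compound.
ensemble_label = majority_vote([1, 0, 1])

# Illustrative counts only (not from the paper): 72 positives, 48 negatives.
y_true = [1] * 72 + [0] * 48
y_pred = [1] * 59 + [0] * 13 + [0] * 31 + [1] * 17
metrics = classification_metrics(y_true, y_pred)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
```

With these illustrative counts, `percentages` matches the rates quoted in the abstract, which shows how the gap between sensitivity and specificity corresponds to the model handling positive compounds better than negative ones.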

Acknowledgments

This study was supported by the NUS start-up grant R-148-000-105-133.

Author information

Corresponding author

Correspondence to Chun Wei Yap.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (XLS 215 kb)

Supplementary material 2 (XLS 67 kb)

About this article

Cite this article

Liew, C.Y., Lim, Y.C. & Yap, C.W. Mixed learning algorithms and features ensemble in hepatotoxicity prediction. J Comput Aided Mol Des 25, 855–871 (2011). https://doi.org/10.1007/s10822-011-9468-3

  • DOI: https://doi.org/10.1007/s10822-011-9468-3
