Abstract
Drug-induced liver injury, although infrequent, is an important safety concern that can lead to patient fatality and to failure in drug development. In this study, we used an ensemble of mixed learning algorithms and mixed features to develop a model for predicting hepatic effects. This robust method is based on the premise that no single learning algorithm is optimal for all modelling problems. An ensemble model of 617 base classifiers was built from a diverse set of 1,087 compounds. The ensemble model was validated internally with five-fold cross-validation and 25 rounds of y-randomization. In external validation on 120 compounds, the ensemble model achieved an accuracy of 75.0%, a sensitivity of 81.9% and a specificity of 64.6%. The model also identified 22 of 23 drugs that were withdrawn or carry a black box warning for hepatotoxicity. Dronedarone, which was associated with severe liver injury in a recent FDA drug safety communication, was predicted as hepatotoxic by the ensemble model. The ensemble model classified positive compounds (those with hepatic effects) well, but performed less well on negative compounds when they were structurally similar. The ensemble model built in this study is made available for public use.
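As a reading aid, the three external-validation figures can be related to one another through the standard confusion-matrix definitions. The sketch below is illustrative only: the function name `classification_metrics` and the four counts are assumptions, not values reported in the abstract. The counts were chosen simply because they sum to 120 compounds and round to the stated percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages.

    tp/fn: positives (compounds with hepatic effects) predicted correctly/incorrectly.
    tn/fp: negatives predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)   # fraction of positives recovered
    specificity = 100.0 * tn / (tn + fp)   # fraction of negatives recovered
    return accuracy, sensitivity, specificity


# Hypothetical counts (NOT taken from the paper): 72 positives and 48 negatives,
# picked only because they reproduce the percentages quoted in the abstract.
acc, sens, spec = classification_metrics(tp=59, fn=13, tn=31, fp=17)
print(f"accuracy={acc:.1f}% sensitivity={sens:.1f}% specificity={spec:.1f}%")
# prints: accuracy=75.0% sensitivity=81.9% specificity=64.6%
```

Under these assumed counts, the gap between sensitivity and specificity reflects the pattern the abstract describes: positives are recovered more reliably than negatives.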
Acknowledgments
This study was supported by the NUS start-up grant R-148-000-105-133.
About this article
Cite this article
Liew, C.Y., Lim, Y.C. & Yap, C.W. Mixed learning algorithms and features ensemble in hepatotoxicity prediction. J Comput Aided Mol Des 25, 855–871 (2011). https://doi.org/10.1007/s10822-011-9468-3
DOI: https://doi.org/10.1007/s10822-011-9468-3