
Consolidated trees versus bagging when explanation is required

Abstract

In some real-world problems solved by machine learning, the solution must be comprehensible so that the correct decision can be made. In this context, this paper compares bagging (one of the most widely used multiple classifier systems) with the consolidated trees construction (CTC) algorithm when the learning problem requires the classification to be accompanied by an explanation. To address the comprehensibility shortcomings of bagging, Domingos' proposal, combining multiple models (CMM), is used to extract a single comprehensible model from the ensemble. The two algorithms are compared from three main points of view: accuracy, the quality of the explanation that accompanies the classification, and computational cost. The results show that CTC is the better choice when an explanation is required: it has greater discriminating capacity than the explanation-extraction algorithm added to bagging, the explanation it provides is of higher quality, simpler and more reliable, and CTC is computationally more efficient.
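To make the comparison concrete, the sketch below illustrates the general idea behind the CMM-style approach applied to bagging: train an ensemble, then relabel the training data with the ensemble's own predictions and fit a single decision tree that mimics it, so that one comprehensible tree can stand in for the vote of many. This is only a minimal sketch assuming scikit-learn and a standard benchmark dataset; it does not reproduce the paper's experimental setup (nor the artificial examples Domingos generates to enlarge the relabelled set, nor the CTC algorithm itself).

```python
# Minimal CMM-style sketch (assumed setup: scikit-learn, breast cancer data):
# a bagged tree ensemble is accurate but opaque; a single tree trained on the
# ensemble's own predictions recovers a comprehensible surrogate model.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: 50 trees, each grown on a bootstrap sample, combined by voting.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                            random_state=0).fit(X_train, y_train)

# CMM-style surrogate: relabel the training set with the ensemble's
# predictions and fit one shallow tree to mimic it.  (Domingos' original CMM
# also adds artificially generated examples labelled by the ensemble;
# omitted here for brevity.)
surrogate = DecisionTreeClassifier(max_depth=5, random_state=0)
surrogate.fit(X_train, bagging.predict(X_train))

print("ensemble accuracy :", bagging.score(X_test, y_test))
print("surrogate accuracy:", surrogate.score(X_test, y_test))
# Fidelity: how often the single tree reproduces the ensemble's decision.
print("fidelity          :", surrogate.score(X_test, bagging.predict(X_test)))
```

By contrast, CTC builds a single tree directly, with several subsamples voting on each split of the same tree, so no separate surrogate-extraction step is needed to obtain an explainable classifier.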

References

  1. Aamodt A, Plaza E (1994) Case-based reasoning: foundational issues, methodological variations, and system approaches. Artif Intell Commun 7(1): 39–52

  2. Andonova S, Elisseeff A, Evgeniou T, Pontil M (2002) A simple algorithm for learning stable machines. In: Proceedings of the European conference on artificial intelligence, pp 513–517

  3. Asuncion A, Newman DJ (2007) UCI machine learning repository. University of California, School of Information and Computer Science, Irvine. http://www.ics.uci.edu/~mlearn/MLRepository.html

  4. Banfield RE, Hall LO, Bowyer KW, Bhadoria D, Kegelmeyer WP, Eschrich S (2004) A comparison of ensemble creation techniques. In: The fifth international conference on multiple classifier systems. Cagliari, Italy, pp 223–232

  5. Banfield RE, Hall LO, Bowyer KW, Kegelmeyer WP (2007) A comparison of decision tree ensemble creation techniques. IEEE Trans Pattern Anal Mach Intell 29: 173–180

  6. Bauer E, Kohavi R (1999) An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Mach Learn 36: 105–139

  7. Breiman L (1996) Bagging predictors. Mach Learn 24: 123–140

  8. Chawla NV, Hall LO, Bowyer KW, Kegelmeyer WP (2004) Learning ensembles from bites: a scalable and accurate approach. J Mach Learn Res 5: 421–451

  9. Craven WM (1996) Extracting comprehensible models from trained neural networks. PhD thesis, University of Wisconsin, Madison

  10. Wall R, Cunningham P, Walsh P (2002) Explaining predictions from a neural network ensemble one at a time. In: Proceedings of the 6th European conference on principles of data mining and knowledge discovery, pp 449–460

  11. Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7: 1–30

  12. Dietterich TG (1997) Machine learning research: four current directions. AI Mag 18(4): 97–136

  13. Dietterich TG (2000) An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Mach Learn 40: 139–157

  14. Domingos P (1997) Knowledge acquisition from examples via multiple models. In: Proceedings of 14th international conference on machine learning, Nashville, pp 98–106

  15. Drummond C, Holte RC (2000) Exploiting the cost (in)sensitivity of decision tree splitting criteria. In: Proceedings of the 17th international conference on Machine Learning, pp 239–246

  16. Dwyer K, Holte R (2007) Decision tree instability and active learning. In: Proceedings of the 18th European conference on machine learning, ECML, pp 128–139

  17. Elisseeff A, Evgeniou T, Pontil M, Kaelbling P (2005) Stability of randomized learning algorithms. J Mach Learn Res 6: 55–79

  18. Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: Proceedings of the 13th international conference on machine learning, pp 148–156

  19. García S, Herrera F (2008) An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons. J Mach Learn Res 9: 2677–2694

  20. Gurrutxaga I, Pérez JM, Arbelaitz O, Martín JI, Muguerza J (2006) Analysis of the performance of a parallel implementation for the CTC algorithm. In: Workshop on state-of-the-art in Scientific and Parallel Computing (PARA’06), Umea, Sweden

  21. Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, Berlin. ISBN 0-387-95284-5

  22. Johansson U, Niklasson L, König R (2004) Accuracy vs. comprehensibility in data mining models. In: The 7th international conference on information fusion, Stockholm, Sweden

  23. Mease D, Wyner AJ, Buja A (2007) Boosted classification trees and class probability/quantile estimation. J Mach Learn Res 8: 409–439

  24. Núñez H, Angulo C, Català A (2002) Rule extraction from support vector machines. In: ESANN’2002, proceedings of the European symposium on artificial neural networks, Bruges, Belgium, pp 107–112

  25. Opitz D, Maclin R (1999) Popular ensemble methods: an empirical study. JAIR 11: 169–198

  26. Paliouras G, Brée DS (1995) The effect of numeric features on the scalability of inductive learning programs. In: 8th European conference on machine learning (ECML), Greece. LNCS, vol 912, pp 218–231

  27. Pérez JM, Muguerza J, Arbelaitz O, Gurrutxaga I, Martín JI et al (2006) Consolidated trees: an analysis of structural convergence, LNAI 3755. In: Graham JW (eds) Data mining: theory, methodology, techniques, and applications. Springer, Berlin, pp 39–52

  28. Pérez JM (2006) Árboles consolidados: construcción de un árbol de clasificación basado en múltiples submuestras sin renunciar a la explicación [Consolidated trees: building a classification tree based on multiple subsamples without giving up the explanation]. PhD thesis, University of the Basque Country, Donostia

  29. Pérez JM, Muguerza J, Arbelaitz O, Gurrutxaga I, Martín JI (2007) Combining multiple class distribution modified subsamples in a single tree. Pattern Recognit Lett 28(4): 414–422

  30. Provost F, Jensen D, Oates T (1999) Efficient progressive sampling. In: Proceedings of 5th international conference on knowledge discovery and data mining. AAAI Press, Menlo Park, pp 23–32

  31. Quinlan JR (1993) C4.5: Programs for machine learning. Morgan Kaufmann, San Mateo

  32. Schank R (1982) Dynamic memory: a theory of learning in computers and people. Cambridge University Press, New York

  33. Setiono R, Leow WK, Zurada JM (2002) Extraction of rules from artificial neural networks for nonlinear regression. IEEE Trans Neural Netw 13(3): 564–577

  34. Skurichina M, Kuncheva LI, Duin RPW (2002) Bagging and boosting for the nearest mean classifier: effects of sample size on diversity and accuracy. In: Multiple classifier systems: proceedings of the 3rd international workshop, MCS. LNCS, vol 2364. Cagliari, Italy, pp 62–71

  35. Turney P (1995) Bias and the quantification of stability. Mach Learn 20: 23–33

  36. Windeatt T, Ardeshir G (2002) Boosted tree ensembles for solving multiclass problems. In: Multiple classifier systems: proceedings of the 3rd international workshop, MCS. LNCS, vol 2364. Cagliari, Italy, pp 42–51

  37. Xu L, Krzyzak A, Suen CY (1992) Methods of combining multiple classifiers and their applications to handwriting recognition. IEEE Trans Syst Man Cybern SMC-22(3): 418–435

  38. Yao YY, Zhao Y, Maguire RB (2003) Explanation oriented association mining using rough set theory. In: Proceedings of the 9th international conference rough sets, fuzzy sets, data mining, and granular computing, (RSFDGrC, 2003), LNAI, vol 2639, pp 165–172

  39. Yao YY, Zhao Y, Maguire RB (2003) Explanation oriented association mining using combination of unsupervised and supervised learning algorithms. In: Advances in artificial intelligence, proceedings of the 16th conference of the Canadian Society for Computational Studies of Intelligence (AI 2003), LNAI, vol 2671, pp 527–532

  40. Zenobi G, Cunningham P (2002) An approach to aggregating ensembles of lazy learners that supports explanation. In: Advances in case-based reasoning, 6th European conference ECCBR, pp 436–447

Author information

Corresponding author

Correspondence to Olatz Arbelaitz.

Additional information

Communicated by R. Neruda.

Cite this article

Pérez, J.M., Albisua, I., Arbelaitz, O. et al. Consolidated trees versus bagging when explanation is required. Computing 89, 113–145 (2010). https://doi.org/10.1007/s00607-010-0094-z
