DOI: 10.1145/1143844.1143921
Article

Pruning in ordered bagging ensembles

Published: 25 June 2006

ABSTRACT

We present a novel ensemble pruning method based on reordering the classifiers obtained from bagging and then selecting a subset for aggregation. Ordering the classifiers generated by bagging makes it possible to build subensembles of increasing size by including first those classifiers that are expected to perform best when aggregated. Ensemble pruning is achieved by halting the aggregation process before all of the generated classifiers are included in the ensemble. In the classification problems investigated, pruned subensembles containing between 15% and 30% of the initial pool of classifiers are not only smaller than the full bagging ensemble but also improve on its generalization performance.
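To make the procedure concrete, the following Python sketch implements one plausible instantiation, assuming a greedy reduce-error ordering evaluated on the training data; the ordering criterion, the scikit-learn bagging pool, and the 20% truncation point are illustrative assumptions, not necessarily the authors' exact method.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy data and an initial pool of 100 bagged decision trees.
X, y = make_classification(n_samples=600, random_state=0)
pool = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                         random_state=0).fit(X, y)
preds = np.array([est.predict(X) for est in pool.estimators_])
classes = np.unique(y)

# Greedy ordering: at each step append the classifier whose inclusion
# minimizes the majority-vote error of the growing subensemble
# (an assumed reduce-error heuristic, scored on the training data).
votes = np.zeros((len(classes), len(y)))  # running vote counts per class
order, remaining = [], list(range(len(preds)))
while remaining:
    errors = []
    for i in remaining:
        trial = votes + (preds[i][None, :] == classes[:, None])
        errors.append(np.mean(classes[trial.argmax(axis=0)] != y))
    best = remaining[int(np.argmin(errors))]
    votes += preds[best][None, :] == classes[:, None]
    order.append(best)
    remaining.remove(best)

# Prune by truncating the ordered sequence; keeping 20% falls inside
# the 15%-30% range the abstract reports as effective.
subensemble = [pool.estimators_[i] for i in order[: len(order) // 5]]

Truncating the ordered sequence at a fixed fraction is the simplest stopping rule; the ordering pass is quadratic in the pool size but runs only once, and a held-out selection set could replace the training data in the ordering step.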

Published in

ICML '06: Proceedings of the 23rd International Conference on Machine Learning
June 2006, 1154 pages
ISBN: 1595933832
DOI: 10.1145/1143844

        Copyright © 2006 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

ICML '06 paper acceptance rate (and overall acceptance rate): 140 of 548 submissions, 26%.
