ABSTRACT
This work presents a new evolutionary ensemble method for data classification. Inspired by the concepts of bagging and boosting, it aims to combine their strengths while avoiding their weaknesses. The approach is based on a distributed multiple-population genetic programming (GP) algorithm that exploits coevolution at two levels: on the inter-population level, the populations cooperate in a semi-isolated fashion, whereas on the intra-population level, the candidate classifiers coevolve competitively with the training data samples. The final classifier is a voting committee composed of the best members of all the populations. Experiments performed with varying numbers of populations show that our approach outperforms both bagging and boosting on a number of benchmark problems.
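The two-level scheme described above can be illustrated with a deliberately simplified sketch. This is not the authors' algorithm: random threshold "stumps" stand in for evolved GP trees, the evolutionary loop is a crude keep-the-fitter-half routine, and the sample reweighting rule is a hypothetical boosting-like stand-in for the competitive coevolution of classifiers and training samples. All names and parameters are illustrative assumptions.

```python
import random

random.seed(0)

# Toy 1-D dataset: the true label is 1 when x > 0.5.
X = [random.random() for _ in range(200)]
y = [1 if x > 0.5 else 0 for x in X]

def make_stump():
    """Random threshold classifier standing in for a GP individual."""
    t = random.random()
    return lambda x, t=t: 1 if x > t else 0

def weighted_accuracy(clf, weights):
    """Classifier fitness: accuracy where 'hard' samples count more."""
    hit = sum(w for x, lab, w in zip(X, y, weights) if clf(x) == lab)
    return hit / sum(weights)

def evolve_population(weights, pop_size=20, generations=10):
    """Crude evolutionary loop: keep the fitter half, refill randomly."""
    pop = [make_stump() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: weighted_accuracy(c, weights), reverse=True)
        pop = pop[: pop_size // 2] + [make_stump() for _ in range(pop_size // 2)]
    return max(pop, key=lambda c: weighted_accuracy(c, weights))

n_pops = 5
weights = [1.0] * len(X)   # sample "fitness": grows when a sample is missed
committee = []
for _ in range(n_pops):
    best = evolve_population(weights)
    committee.append(best)
    # Competitive coevolution of the training data (simplified): samples
    # the latest best classifier still misclassifies gain weight, so later
    # populations are pressured to handle them.
    for i, (x, lab) in enumerate(zip(X, y)):
        if best(x) != lab:
            weights[i] *= 1.5

def committee_predict(x):
    """Majority vote of the best member from each population."""
    votes = sum(clf(x) for clf in committee)
    return 1 if votes * 2 > len(committee) else 0

acc = sum(committee_predict(x) == lab for x, lab in zip(X, y)) / len(X)
```

The design point the sketch captures is the division of labor: each population contributes one committee member (the bagging-like part), while the shared sample weights couple the populations sequentially (the boosting-like part).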
Index Terms
- Coevolutionary multi-population genetic programming for data classification