Abstract
The aim of feature selection is to find the subset of features that maximizes classifier performance. Recently, we proposed a correlation-based feature selection method for classifier ensembles based on the Hellwig heuristic (CFSH).
In this paper we show that a further improvement in ensemble accuracy can be achieved by combining the CFSH method with the wrapper approach.
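The Hellwig heuristic behind CFSH scores a candidate feature subset by its integral information capacity: each feature's squared correlation with the target, discounted by its absolute correlations with the other features in the subset, so redundant features are penalised. The following is a minimal sketch of that scoring, not the paper's actual CFSH procedure; function names are illustrative, and the exhaustive search is only feasible for a handful of predictors.

```python
import numpy as np
from itertools import combinations

def hellwig_capacity(r0, R, subset):
    """Hellwig's integral information capacity of a predictor subset.

    r0[j]   -- correlation between predictor j and the target,
    R[j, k] -- correlation between predictors j and k.
    Inter-correlated (redundant) predictors inflate the denominator
    and therefore lower the subset's score.
    """
    h = 0.0
    for j in subset:
        denom = 1.0 + sum(abs(R[j, k]) for k in subset if k != j)
        h += r0[j] ** 2 / denom
    return h

def best_subset(r0, R):
    """Exhaustive search for the capacity-maximising subset
    (feasible only for a small number of predictors)."""
    p = len(r0)
    best, best_h = None, float("-inf")
    for size in range(1, p + 1):
        for s in combinations(range(p), size):
            h = hellwig_capacity(r0, R, s)
            if h > best_h:
                best, best_h = s, h
    return best, best_h

# Example: predictors 0 and 1 are strongly inter-correlated (0.9),
# so the heuristic prefers the single best predictor over the pair.
r0 = np.array([0.8, 0.7, 0.1])
R = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
subset, h = best_subset(r0, R)
```

In the ensemble setting, several high-capacity subsets can be used to train the individual tree classifiers; the wrapper refinement then re-evaluates candidate subsets by the actual classifier accuracy rather than by correlations alone.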
References
AMIT, Y. and GEMAN, D. (2001): Multiple Randomized Classifiers: MRCL. Technical Report, Department of Statistics, University of Chicago, Chicago.
BAUER, E. and KOHAVI, R. (1999): An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants. Machine Learning, 36, 105–142.
BLAKE, C., KEOGH, E. and MERZ, C. J. (1998): UCI Repository of Machine Learning Databases. Department of Information and Computer Science, University of California, Irvine.
BREIMAN, L. (1996): Bagging predictors. Machine Learning, 24, 123–140.
BREIMAN, L. (1998): Arcing classifiers. Annals of Statistics, 26, 801–849.
BREIMAN, L. (1999): Using adaptive bagging to debias regressions. Technical Report 547, Department of Statistics, University of California, Berkeley.
BREIMAN, L. (2001): Random Forests. Machine Learning, 45, 5–32.
DIETTERICH, T. and BAKIRI, G. (1995): Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2, 263–286.
FREUND, Y. and SCHAPIRE, R.E. (1997): A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 119–139.
GATNAR, E. (2005a): Dimensionality of Random Subspaces. In: C. Weihs and W. Gaul (Eds.): Classification - The Ubiquitous Challenge. Springer, Heidelberg, 129–136.
GATNAR, E. (2005b): A Diversity Measure for Tree-Based Classifier Ensembles. In: D. Baier, R. Decker, and L. Schmidt-Thieme (Eds.): Data Analysis and Decision Support. Springer, Heidelberg, 30–38.
GINSBERG, M.L. (1993): Essentials of Artificial Intelligence. Morgan Kaufmann, San Francisco.
HELLWIG, Z. (1969): On the problem of optimal selection of predictors. Statistical Review, 3–4 (in Polish).
HO, T.K. (1998): The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832–844.
KIRA, K. and RENDELL, L. (1992): A practical approach to feature selection. In: D. Sleeman and P. Edwards (Eds.): Proceedings of the 9th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, 249–256.
KOHAVI, R. and WOLPERT, D.H. (1996): Bias plus variance decomposition for zero-one loss functions. In: L. Saitta (Ed.): Proceedings of the 13th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, 275–283.
KOHAVI, R. and JOHN, G.H. (1997): Wrappers for feature subset selection. Artificial Intelligence, 97, 273–324.
PROVOST, F. and BUCHANAN, B. (1995): Inductive Policy: The pragmatics of bias selection. Machine Learning, 20, 35–61.
SINGH, M. and PROVAN, G. (1995): A comparison of induction algorithms for selective and non-selective Bayesian classifiers. Proceedings of the 12th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, 497–505.
THERNEAU, T.M. and ATKINSON, E.J. (1997): An introduction to recursive partitioning using the RPART routines, Mayo Foundation, Rochester.
TUMER, K. and GHOSH, J. (1996): Analysis of decision boundaries in linearly combined neural classifiers. Pattern Recognition, 29, 341–348.
WALESIAK, M. (1987): Modified criterion of explanatory variable selection to the linear econometric model. Statistical Review, 1, 37–43 (in Polish).
WOLPERT, D. (1992): Stacked generalization. Neural Networks, 5, 241–259.
© 2006 Springer Berlin Heidelberg
Cite this paper
Gatnar, E. (2006). A Wrapper Feature Selection Method for Combined Tree-based Classifiers. In: Spiliopoulou, M., Kruse, R., Borgelt, C., Nürnberger, A., Gaul, W. (eds) From Data and Information Analysis to Knowledge Engineering. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31314-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31313-7
Online ISBN: 978-3-540-31314-4
eBook Packages: Mathematics and Statistics