Elsevier

Pattern Recognition

Volume 41, Issue 5, May 2008, Pages 1676-1700

Genetic algorithm-based feature set partitioning for classification problems

https://doi.org/10.1016/j.patcog.2007.10.013

Abstract

Feature set partitioning generalizes the task of feature selection by partitioning the feature set into subsets of features that are collectively useful, rather than by finding a single useful subset of features. This paper presents a novel feature set partitioning approach that is based on a genetic algorithm. As part of this approach, a new encoding schema is proposed and its properties are discussed. We examine the effectiveness of using a Vapnik–Chervonenkis dimension bound for evaluating the fitness function of multiple oblivious tree classifiers. The new algorithm was tested on various datasets, and the results indicate that it outperforms other methods.

Section snippets

Introduction and motivation

An inducer aims to build a classifier (also known as a classification model) by learning from a set of pre-classified instances. The classifier can then be used for classifying unlabeled instances. It is well known that the required number of labeled instances for supervised learning increases as a function of dimensionality [1]. Fukunaga [2] showed that the required number of training instances for a linear classifier is linearly related to the dimensionality and for a quadratic classifier to

Related works

In this section we briefly review some of the central issues that have been addressed, and their treatment in the literature. The related work described in this section falls into three categories:

  • First, we discuss three feature oriented tasks (namely feature selection, feature set partitioning, and feature subset-based ensemble) in pattern recognition and the relations among them.

  • Then, we survey the usage of GAs for solving the above-mentioned tasks.

  • The oblivious decision tree (ODT) and its

Problem formulation

In a typical classification problem, a training set of labeled examples is given. The training set can be described in a variety of languages, most frequently as a collection of records that may contain duplicates. A vector of feature values describes each record. The notation A denotes the set of input features containing n features, $A=\{a_1,\ldots,a_i,\ldots,a_n\}$, and y represents the class variable or the target feature. Features (sometimes referred to as attributes) are typically one of two types:

A GA method for feature set partitioning

To solve the problem defined in Section 3, we suggest using a GA search procedure. Fig. 4 presents the proposed process schematically. The left side of Fig. 4 specifies the creation of the ODT ensemble based on feature set partitioning. The search for the best partitioning is governed by a GA. Each partitioning candidate is evaluated using a VC dimension-based evaluator. For this purpose, an ODT is generated for each feature partition. The ODT generator utilizes a caching
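The search loop outlined in this snippet can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the label-vector encoding (gene i holds the subset label of feature i, with 0 meaning "unused") is a generic choice and not the encoding schema proposed in the paper, the selection and crossover operators are standard textbook ones, and `toy_fitness` is a hypothetical placeholder standing in for the VC dimension-based evaluator.

```python
import random

def decode(chromosome):
    """Turn a label vector into disjoint feature subsets (label 0 = unused)."""
    groups = {}
    for feature, label in enumerate(chromosome):
        if label != 0:
            groups.setdefault(label, []).append(feature)
    return [tuple(members) for _, members in sorted(groups.items())]

def ga_search(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.1, seed=0):
    """Generic GA over label vectors; returns the best partitioning found."""
    rng = random.Random(seed)
    max_label = n_features  # at most one subset per feature

    def random_chromosome():
        return [rng.randint(0, max_label) for _ in range(n_features)]

    population = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda ch: fitness(decode(ch)),
                        reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the best unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: occasionally move a feature to a random subset label.
            child = [rng.randint(0, max_label)
                     if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = children
    best = max(population, key=lambda ch: fitness(decode(ch)))
    return decode(best)

def toy_fitness(partitioning):
    """Hypothetical score: count the subsets that hold exactly two features."""
    return sum(1 for subset in partitioning if len(subset) == 2)

print(ga_search(6, toy_fitness))
```

Because each gene carries a single label, every decoded candidate is mutually exclusive by construction. In a real run, the fitness callable would build one classifier per subset and score the resulting ensemble, which is where caching of already-built classifiers would pay off.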

Experimental study

In order to illustrate the potential of the feature set partitioning approach in classification problems and to evaluate the performance of the proposed GA, a comparative experiment was conducted on benchmark datasets. The following subsections describe the experimental setup and the results obtained.

Conclusions

In this paper, we have presented a novel genetic algorithm for finding the best mutually exclusive feature set partitioning. The basic idea is to decompose the original set of features into several subsets, build a decision tree for each projection, and then combine them. This paper examines whether genetic algorithms can be useful for discovering the appropriate partitioning structure.
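The decompose, build, and combine idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the paper's implementation: features are assumed to be discrete, each per-subset classifier is a plain count table over the value combinations of its features (a stand-in for a fully grown oblivious tree, where every level tests the same feature), and the members are combined by summing Laplace-smoothed class estimates, which is one common rule and not necessarily the one used in the paper. All function names are hypothetical.

```python
from collections import Counter, defaultdict

def project(row, subset):
    """Keep only the feature values whose indices are in `subset`."""
    return tuple(row[i] for i in subset)

def fit_table(rows, labels, subset):
    """Count class labels for every observed value combination of `subset`."""
    table = defaultdict(Counter)
    for row, label in zip(rows, labels):
        table[project(row, subset)][label] += 1
    return table

def class_probs(table, key, classes):
    """Laplace-smoothed class distribution for one value combination."""
    counts = table.get(key, Counter())
    total = sum(counts.values()) + len(classes)
    return {c: (counts[c] + 1) / total for c in classes}

def fit_ensemble(rows, labels, partitioning):
    """Train one table per feature subset of a disjoint partitioning."""
    classes = sorted(set(labels))
    tables = [(subset, fit_table(rows, labels, subset))
              for subset in partitioning]
    return classes, tables

def predict(ensemble, row):
    """Combine the members by summing their class probability estimates."""
    classes, tables = ensemble
    scores = {c: 0.0 for c in classes}
    for subset, table in tables:
        probs = class_probs(table, project(row, subset), classes)
        for c in classes:
            scores[c] += probs[c]
    return max(classes, key=lambda c: scores[c])
```

For example, with four binary features and a label equal to the XOR of the first two, the partitioning `[(0, 1), (2, 3)]` keeps the interacting features together, so the first table captures the concept while the second contributes only uninformative, evenly split estimates.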

For this purpose we suggested a new encoding schema and fitness function that were specially designed for

Acknowledgments

The author gratefully thanks the action editor and the anonymous reviewers, whose constructive comments helped improve the quality and accuracy of this paper.

References (58)

  • J. Hwang et al.

    Nonparametric multivariate density estimation: a comparative study

    IEEE Trans. Signal Process.

    (1994)
  • R. Bellman

    Adaptive Control Processes: A Guided Tour

    (1961)
  • I. Guyon et al.

    Feature Extraction, Foundations and Applications, Series Studies in Fuzziness and Soft Computing

    (2006)
  • D. Opitz et al.

    Popular ensemble methods: an empirical study

    J. Artif. Intell. Res.

    (1999)
  • S. Geman et al.

    Neural networks and the bias/variance dilemma

    Neural Comput.

    (1995)
  • K. Tumer et al.

    Linear and order statistics combiners for pattern classification

  • L. Breiman

    Bagging predictors

    Mach. Learn.

    (1996)
  • Y. Freund et al.

    Experiments with a new boosting algorithm. Machine Learning

  • K. Tumer et al.

    Input decimated ensembles

    Pattern Anal. Appl.

    (2003)
  • T.K. Ho

    The random subspace method for constructing decision forests

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1998)
  • A. Tsymbal et al.

    Ensemble feature selection with the simple Bayesian classification in medical diagnostics

  • Q.X. Wu et al.

    Multi-knowledge for decision making

    J. Knowl. Inf. Syst.

    (2005)
  • Y. Bao et al.

    Combining multiple K-nearest neighbor classifiers for text classification by reducts

  • Q.H. Hu et al.

    Constructing rough decision forests

  • P. Cunningham et al.

    Diversity versus quality in classification ensembles based on feature selection

  • G. Zenobi et al.

    Using diversity in preparing ensembles of classifiers based on different feature subsets to minimize generalization error

  • L. Rokach

    Decomposition methodology for classification tasks—a meta decomposer framework

    Pattern Anal. Appl.

    (2006)
  • A. Kusiak

    Decomposition in data mining: an industrial case study

    IEEE Trans. Electron. Packag. Manuf.

    (2000)
  • F.J. Provost, V. Kolluri, A survey of methods for scaling up inductive learning algorithms, in: Proceedings of the 3rd...

About the Author: LIOR ROKACH is an assistant professor in the Department of Information System Engineering and the Program of Software Engineering of Ben-Gurion University, Israel. His research interests include artificial intelligence, pattern recognition, data mining, control of production processes and medical informatics. Dr. Rokach is the co-author of the book “Decomposition Methodology for Knowledge Discovery and Data Mining: Theory and Applications” published by World Scientific Publishing and the co-editor of “The Data Mining and Knowledge Discovery Handbook” published by Springer. Dr. Rokach holds B.Sc., M.Sc. and Ph.D. in Industrial Engineering from Tel Aviv University.