A Genetic Algorithm for Feature Selection and Granularity Learning in Fuzzy Rule-Based Classification Systems for Highly Imbalanced Data-Sets

Villar, Pedro; Fernández, Alberto; Herrera, Francisco

doi:10.1007/978-3-642-14055-6_78

Part of the book series: Communications in Computer and Information Science (CCIS, volume 80)

Included in the following conference series:

International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems

Abstract

This contribution proposes a Genetic Algorithm that jointly performs feature selection and granularity learning for Fuzzy Rule-Based Classification Systems on data-sets with a high degree of imbalance. A data-set is imbalanced when its class distribution is not uniform, a situation that is present in many real application areas. The aim of this work is to obtain more compact and precise models by selecting the adequate variables and adapting the number of fuzzy labels for each problem.
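
As an illustration only, the joint search described in the abstract could be encoded with one integer gene per input variable. The sketch below is not the authors' implementation: the gene convention (0 drops the variable, any other value is its number of fuzzy labels), the granularity range 2 to 7, the uniform triangular partitions on inputs normalised to [0, 1], and all function names are assumptions made for this example.

```python
import random
from collections import Counter

# Assumed granularity range; the abstract does not state one.
MIN_LABELS, MAX_LABELS = 2, 7


def random_chromosome(n_features, rng):
    """One gene per feature: 0 drops the feature, otherwise the label count."""
    choices = [0] + list(range(MIN_LABELS, MAX_LABELS + 1))
    return [rng.choice(choices) for _ in range(n_features)]


def decode(chromosome):
    """Map each selected feature index to its number of fuzzy labels."""
    return {i: gene for i, gene in enumerate(chromosome) if gene != 0}


def triangular_memberships(x, n_labels):
    """Membership degrees of x in [0, 1] for a uniform triangular partition."""
    step = 1.0 / (n_labels - 1)
    return [max(0.0, 1.0 - abs(x - k * step) / step) for k in range(n_labels)]


def imbalance_ratio(labels):
    """Majority class size divided by minority class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    rng = random.Random(0)
    chromosome = random_chromosome(5, rng)
    print("chromosome:", chromosome)
    print("selected features and granularities:", decode(chromosome))
    print("memberships of 0.25 with 3 labels:", triangular_memberships(0.25, 3))
    print("imbalance ratio:", imbalance_ratio([0] * 9 + [1]))
```

With this encoding a single chromosome fixes both which variables are used and how finely each one is partitioned, so a fitness function can score the resulting classifier in one evaluation.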

Author information

Authors and Affiliations

Department of Software Engineering, Spain
Pedro Villar
Department of Computer Science and Artificial Intelligence, E.T.S. Ing. Informática y de Telecomunicación, University of Granada, Spain
Alberto Fernández & Francisco Herrera

Editor information

Editors and Affiliations

Fachbereich Mathematik und Informatik, Philipps-Universität Marburg, Marburg, Germany
Eyke Hüllermeier
Department of Knowledge Processing and Language Engineering, Otto-von-Guericke University of Magdeburg, Universitätsplatz 2, 39106, Magdeburg, Germany
Rudolf Kruse
Fakultät für Elektrotechnik und Informationstechnik, Technische Universität Dortmund, 44221, Dortmund, Germany
Frank Hoffmann

Cite this paper

Villar, P., Fernández, A., Herrera, F. (2010). A Genetic Algorithm for Feature Selection and Granularity Learning in Fuzzy Rule-Based Classification Systems for Highly Imbalanced Data-Sets. In: Hüllermeier, E., Kruse, R., Hoffmann, F. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Methods. IPMU 2010. Communications in Computer and Information Science, vol 80. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14055-6_78

DOI: https://doi.org/10.1007/978-3-642-14055-6_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14054-9
Online ISBN: 978-3-642-14055-6
eBook Packages: Computer Science, Computer Science (R0)