Theoretical Comparison between the Gini Index and Information Gain Criteria

Published in: Annals of Mathematics and Artificial Intelligence 41, 77–93 (2004)

Abstract

Knowledge Discovery in Databases (KDD) is an active and important research area with the promise of a high payoff in many business and scientific applications. One of the main tasks in KDD is classification, and a particularly efficient method for classification is decision tree induction. The selection of the attribute used at each node of the tree to split the data (the split criterion) is crucial for correctly classifying objects. Various split criteria have been proposed in the literature (Information Gain, Gini Index, etc.), and it is not obvious which of them will produce the best decision tree for a given data set. Many empirical tests have been conducted to answer this question, with no conclusive results. In this paper we introduce a formal methodology that allows us to compare multiple split criteria. This yields fundamental insights into the decision process and, furthermore, lets us give a formal description of how to select between split criteria for a given data set. As an illustration, we apply the methodology to two widely used split criteria: the Gini Index and Information Gain.
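
As a concrete illustration of the two criteria the paper compares, the following is a minimal sketch, not taken from the paper, of how the Gini Index and Information Gain score a candidate binary split. The toy labels and function names are illustrative assumptions rather than the paper's formal framework.

```python
# Minimal sketch (illustrative, not the paper's formalization) of the two
# split criteria: Gini Index and Information Gain.

from collections import Counter
from math import log2

def gini(labels):
    """Gini index: 1 - sum_j p_j^2 over the class frequencies p_j."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Entropy: -sum_j p_j * log2(p_j), the basis of Information Gain."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def split_score(left, right, impurity):
    """Impurity reduction achieved by splitting a node into two children.

    With impurity=entropy this is Information Gain; with impurity=gini it
    is the Gini reduction used by CART.
    """
    parent = left + right
    n, nl = len(parent), len(left)
    weighted = (nl / n) * impurity(left) + ((n - nl) / n) * impurity(right)
    return impurity(parent) - weighted

if __name__ == "__main__":
    # Hypothetical class labels on each side of a candidate split.
    left, right = ["a", "a", "a", "b"], ["b", "b", "a", "b"]
    print("Gini reduction:   %.4f" % split_score(left, right, gini))
    print("Information Gain: %.4f" % split_score(left, right, entropy))
```

Both functions measure the decrease in node impurity; whether they rank candidate splits differently for a given data set is exactly the question the paper's formal methodology addresses.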




Cite this article

Raileanu, L.E., Stoffel, K. Theoretical Comparison between the Gini Index and Information Gain Criteria. Annals of Mathematics and Artificial Intelligence 41, 77–93 (2004). https://doi.org/10.1023/B:AMAI.0000018580.96245.c6
