Abstract
This chapter describes an alternative, frequency-table-based method of calculating the average entropy of the training subsets resulting from splitting on an attribute. It is shown to be equivalent to the method used in Chapter 5 while requiring less computation. Two alternative attribute selection criteria, the Gini Index of Diversity and the \(\chi^{2}\) statistic, are illustrated, and it is shown how they too can be calculated using a frequency table.
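The chapter's worked examples are not reproduced here, but the frequency-table idea is easy to sketch. In the minimal Python sketch below, the attribute values and class counts are invented for illustration, and the formulas follow the standard definitions of entropy, the Gini Index and the \(\chi^{2}\) statistic rather than the chapter's own notation:

```python
import math

# Hypothetical frequency table for one attribute: each row is an
# attribute value, each column a class count (e.g. "yes" / "no").
# Names and numbers are made up for illustration.
freq = {
    "sunny":    [2, 3],
    "overcast": [4, 0],
    "rainy":    [3, 2],
}

total = sum(sum(row) for row in freq.values())  # all training instances

def entropy(counts):
    """Entropy, in bits, of a vector of class counts."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

# Average entropy of the subsets produced by splitting on the attribute:
# each row's entropy is weighted by its share of the instances.
e_new = sum(sum(row) / total * entropy(row) for row in freq.values())

# Gini Index of the same split: weighted sum over rows of 1 - sum(p^2).
gini = sum(
    sum(row) / total * (1 - sum((c / sum(row)) ** 2 for c in row))
    for row in freq.values()
)

# Chi-square statistic from the same table: observed counts compared with
# those expected if the class were independent of the attribute value.
col_totals = [sum(col) for col in zip(*freq.values())]  # class totals
chi2 = 0.0
for row in freq.values():
    for obs, ct in zip(row, col_totals):
        expected = sum(row) * ct / total
        chi2 += (obs - expected) ** 2 / expected

print(f"average entropy = {e_new:.3f}, gini = {gini:.3f}, chi2 = {chi2:.3f}")
```

All three criteria are read off the same table of counts, which is why the frequency-table formulation avoids recomputing per-subset statistics from the raw training data.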
The important issue of inductive bias is then introduced. This leads to a description of a further attribute selection criterion, Gain Ratio, which was devised to overcome the bias of the entropy minimisation method, a bias that is undesirable for some datasets.
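Again as an illustration rather than the chapter's own derivation: under the standard definition, Gain Ratio divides the information gain of a split by its "split information", penalising attributes that fragment the training set into many small subsets. The sketch below reuses the invented table from the previous example:

```python
import math

# Same hypothetical frequency table as in the previous sketch.
freq = {"sunny": [2, 3], "overcast": [4, 0], "rainy": [3, 2]}
total = sum(sum(row) for row in freq.values())

def entropy(counts):
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

# Entropy of the unsplit training set, from the class totals.
e_start = entropy([sum(col) for col in zip(*freq.values())])

# Average entropy after splitting, as before.
e_new = sum(sum(row) / total * entropy(row) for row in freq.values())

info_gain = e_start - e_new

# Split information: entropy of the subset sizes themselves; it grows
# when an attribute splits the data into many small subsets.
split_info = entropy([sum(row) for row in freq.values()])

gain_ratio = info_gain / split_info
print(f"gain = {info_gain:.3f}, gain ratio = {gain_ratio:.3f}")
```

An attribute with many distinct values (an identification-code-like attribute being the extreme case) tends to have high information gain but also high split information, so normalising by the latter counteracts the bias towards such attributes.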
Copyright information
© 2020 Springer-Verlag London Ltd., part of Springer Nature
About this chapter
Bramer, M. (2020). Decision Tree Induction: Using Frequency Tables for Attribute Selection. In: Principles of Data Mining. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-7493-6_6
Print ISBN: 978-1-4471-7492-9
Online ISBN: 978-1-4471-7493-6