Algorithmic speedups in growing classification trees by using an additive split criterion

  • Conference paper
Selecting Models from Data

Part of the book series: Lecture Notes in Statistics ((LNS,volume 89))

Abstract

We propose a new split criterion for building classification trees. This criterion, called weighted accuracy (wacc), has the advantage that it permits divide-and-conquer algorithms when minimizing the split criterion. This is useful when more complex split families, such as intervals, corners, and rectangles, are considered. The split criterion is derived to imitate the Gini function as closely as possible by comparing preference regions for the two functions. The wacc function is evaluated in a large empirical comparison and is found to be competitive with the traditionally used functions.
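The exact definition of wacc is given in the paper, not in this abstract. As an illustrative sketch only, the code below uses plain weighted accuracy (the majority-class count summed over the two children) as a stand-in for an additive criterion. Because each child's contribution is a sum of per-record terms, the score of every candidate threshold can be updated in O(1) after a single sort, which is the kind of algorithmic speedup additivity enables; a non-additive criterion such as Gini would not decompose this way for the more complex split families mentioned above. The function name and interface are hypothetical.

```python
def best_threshold_split(xs, ys):
    """Scan all thresholds on one numeric feature; ys are 0/1 labels.

    Returns (best_threshold, best_score), where score is the number of
    records classified correctly by majority vote within each child.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    n1_total = sum(ys)                # class-1 count overall
    n0_total = len(ys) - n1_total
    left1 = left0 = 0                 # class counts in the left child
    best_score, best_thr = -1, None
    for k in range(len(order) - 1):
        i = order[k]
        if ys[i] == 1:                # move one record into the left child
            left1 += 1
        else:
            left0 += 1
        # cannot place a threshold between equal feature values
        if xs[order[k]] == xs[order[k + 1]]:
            continue
        right1 = n1_total - left1
        right0 = n0_total - left0
        # additive criterion: majority count in left + majority count in right
        score = max(left1, left0) + max(right1, right0)
        if score > best_score:
            best_score = score
            best_thr = (xs[order[k]] + xs[order[k + 1]]) / 2
    return best_thr, best_score
```

For example, `best_threshold_split([1, 2, 3, 4], [0, 0, 1, 1])` finds the threshold 2.5, which classifies all four records correctly.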




Copyright information

© 1994 Springer-Verlag New York

About this paper

Cite this paper

Lubinsky, D. (1994). Algorithmic speedups in growing classification trees by using an additive split criterion. In: Cheeseman, P., Oldford, R.W. (eds) Selecting Models from Data. Lecture Notes in Statistics, vol 89. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-2660-4_44

  • DOI: https://doi.org/10.1007/978-1-4612-2660-4_44

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-94281-0

  • Online ISBN: 978-1-4612-2660-4

  • eBook Packages: Springer Book Archive
