
Optimal decision trees for categorical data via integer programming

  • S.I.: GERAD-40

Journal of Global Optimization

Abstract

Decision trees have been a popular class of predictive models for decades, owing to their interpretability and their good performance on categorical features. However, they are not always robust, tend to overfit the data, and, if allowed to grow large, lose interpretability. In this paper, we present a mixed-integer programming formulation for constructing optimal decision trees of a prespecified size. We take the special structure of categorical features into account and allow combinatorial decisions (based on subsets of feature values) at each node; numerical features can also be handled via thresholding. We show that very good accuracy can be achieved with small trees trained on moderately sized training sets, and that the resulting optimization problems are tractable with modern solvers.
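To make the central idea concrete, here is a minimal sketch of the core modeling step: learning a single combinatorial split, i.e. which subset of a categorical feature's values is routed to the left leaf, as a mixed-integer program. This is an illustration under our own assumptions, not the paper's actual formulation; the toy data, the variable names (s, zL, zR, e), and the choice of the PuLP modeling library are all ours, and the paper's model extends this idea to full trees of a prespecified size.

from pulp import LpBinary, LpMinimize, LpProblem, LpVariable, lpSum

# Toy data: one categorical feature and binary class labels.
values = ["a", "b", "a", "c", "b", "c", "a"]
labels = [1, 1, 1, 0, 0, 0, 1]
cats = sorted(set(values))

prob = LpProblem("categorical_stump", LpMinimize)

# s[v] = 1 if samples with feature value v are routed to the left leaf.
s = {v: LpVariable(f"s_{v}", cat=LpBinary) for v in cats}
# zL, zR = the 0/1 class label predicted at the left / right leaf.
zL = LpVariable("zL", cat=LpBinary)
zR = LpVariable("zR", cat=LpBinary)
# e[i] = 1 if sample i is misclassified (forced to 1 by the constraints).
e = [LpVariable(f"e_{i}", cat=LpBinary) for i in range(len(labels))]

# Objective: minimize the number of misclassified training samples.
prob += lpSum(e)

for i, (v, y) in enumerate(zip(values, labels)):
    if y == 1:
        # Wrong if routed left while the left leaf predicts 0,
        # or routed right while the right leaf predicts 0.
        prob += e[i] >= s[v] - zL
        prob += e[i] >= (1 - s[v]) - zR
    else:
        # Wrong if routed to a leaf that predicts 1.
        prob += e[i] >= s[v] + zL - 1
        prob += e[i] >= zR - s[v]

prob.solve()
left = [v for v in cats if s[v].value() > 0.5]
print("left subset:", left,
      "| left leaf predicts:", int(zL.value()),
      "| right leaf predicts:", int(zR.value()))

On this toy data the solver selects the value subset and leaf labels that jointly minimize training errors; here one training error is unavoidable, because value "b" appears with both class labels. The paper's formulation generalizes this pattern to every internal node of a tree of prespecified depth and handles numerical features through threshold-based splits.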



Author information

Corresponding author

Correspondence to Katya Scheinberg.

Additional information


The work of Katya Scheinberg was partially supported by NSF Grant CCF-1320137. Part of this work was performed while Katya Scheinberg was on sabbatical leave at IBM Research, Google, and the University of Oxford, partially supported by the Leverhulme Trust.


About this article

Cite this article

Günlük, O., Kalagnanam, J., Li, M. et al. Optimal decision trees for categorical data via integer programming. J Glob Optim 81, 233–260 (2021). https://doi.org/10.1007/s10898-021-01009-y
