Abstract
Machine learning algorithms often contain many hyperparameters whose values affect the predictive performance of the induced models in intricate ways. Due to the high number of possible hyperparameter configurations and their complex interactions, it is common to use optimization techniques to find settings that lead to high predictive performance. However, exploring this vast space of configurations efficiently and dealing with the trade-off between predictive and runtime performance remain challenging. Furthermore, there are cases where the default hyperparameters are already a suitable configuration. Additionally, for many reasons, including model validation and compliance with new legislation, there is an increasing interest in interpretable models, such as those created by decision tree (DT) induction algorithms. This paper provides a comprehensive approach for investigating the effects of hyperparameter tuning for the two most often used DT induction algorithms, CART and C4.5. DT induction algorithms offer high predictive performance and interpretable classification models, though many hyperparameters need to be adjusted. Experiments were carried out with different tuning strategies to induce models and to evaluate the relevance of hyperparameters using 94 classification datasets from OpenML. The experimental results indicate that different hyperparameter profiles for the tuning of each algorithm provide statistically significant improvements in most of the datasets for CART, but only in one-third for C4.5. Although different algorithms may present different tuning scenarios, the tuning techniques generally required few evaluations to find accurate solutions. Furthermore, Irace was the best technique for all the algorithms. Finally, we found that tuning a specific small subset of hyperparameters is a good alternative for achieving optimal predictive performance.
Notes
These techniques will be described in the following sections.
The original J48 nomenclature may also be consulted at http://weka.sourceforge.net/doc.dev/weka/classifiers/trees/J48.html.
Area under the ROC curve.
Given the stochastic nature of the commonly used tuning algorithms, experimenting with different seeds (for the random generator) is desirable.
For a complete survey on hyperparameter tuning techniques and perspectives, please consult Bischl et al. (2023).
Initially, there were 100 datasets, but 6 of them took too long to finish their tuning jobs: they had consumed over 1000 h when we interrupted them.
The budget size choice is discussed in more detail in Sect. 7.
A population size of 10 may seem small, but it proved to be enough to provide good and accurate results, as empirically evaluated in Mantovani et al. (2016).
These additional datasets are indicated in Appendix 2.
A complete list of the pymfe available meta-features can be found here: https://pymfe.readthedocs.io/en/latest/auto_pages/meta_features_description.html.
The BAC measure was preferred at the tuning level because the data collection contains both binary and multiclass classification problems.
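As an illustration of the BAC note above, the following minimal sketch assumes that BAC is computed as the unweighted mean of per-class recall, which is the usual multiclass reading of balanced accuracy. The function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the paper's experimental code.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition of BAC).

    Every class contributes equally to the score, regardless of how many
    examples it has, so the same formula works for binary and multiclass
    problems.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    total = defaultdict(int)    # number of examples per true class
    correct = defaultdict(int)  # correctly predicted examples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced three-class problem: 8 / 1 / 1 examples.
y_true = ["a"] * 8 + ["b"] + ["c"]
# A trivial model that always predicts the majority class.
y_pred = ["a"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bac = balanced_accuracy(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}, BAC = {bac:.3f}")
```

In this toy case plain accuracy rewards the majority-class predictor (8 of 10 correct), while BAC averages recalls of 1, 0 and 0, which is why a class-balanced measure is a reasonable tuning objective when class proportions differ widely across datasets.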
References
Abe S (2005) Support vector machines for pattern classification. Springer, London
Alcobaça E, Siqueira F, Rivolli A et al (2020) MFE: towards reproducible meta-feature extraction. J Mach Learn Res 21:111:1-111:5
Ali S, Smith-Miles KA (2006) A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing 70(13):173–186
Andradottir S (2015) A review of random search methods. In: Fu MC (ed) Handbook of simulation optimization, international series in operations research & management science, vol 216. Springer, New York, pp 277–292
Bache K, Lichman M (2013) UCI machine learning repository. http://archive.ics.uci.edu/ml
Bardenet R, Brendel M, Kégl B et al (2013) Collaborative hyperparameter tuning. In: Dasgupta S, Mcallester D (eds) Proceedings of the 30th international conference on machine learning (ICML-13), vol 28. JMLR workshop and conference proceedings, pp 199–207
Barella VH, Garcia LPF, de Souto MCP et al (2021) Assessing the data complexity of imbalanced datasets. Inf Sci 553:83–109. https://doi.org/10.1016/j.ins.2020.12.006
Barros R, Basgalupp M, de Carvalho A et al (2012) A survey of evolutionary algorithms for decision-tree induction. IEEE Trans Syst Man Cybern C Appl Rev 42(3):291–312
Barros RC, de Carvalho ACPLF, Freitas AA (2015) Automatic design of Decision-Tree induction algorithms. Springer Briefs in computer science. Springer, Berlin. https://doi.org/10.1007/978-3-319-14231-9
Bartz E, Zaefferer M, Mersmann O et al (2021) Experimental investigation and evaluation of model-based hyperparameter optimization. arXiv:2107.08761
Ben-Hur A, Weston J (2010) A user’s guide to support vector machines. In: Data mining techniques for the life sciences, methods in molecular biology, vol 609. Humana Press, pp 223–239
Bendtsen C (2012) pso: Particle Swarm Optimization. https://CRAN.R-project.org/package=pso, R package version 1.0.3
Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. J Mach Learn Res 13:281–305
Bergstra J, Yamins D, Cox DD (2013) Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In: Proceedings of the 30th international conference on machine learning, pp 1–9
Bergstra JS, Bardenet R, Bengio Y et al (2011) Algorithms for hyper-parameter optimization. In: Shawe-Taylor J, Zemel RS, Bartlett PL, et al (eds) Advances in neural information processing systems 24. Curran Associates, Inc., pp 2546–2554
Bermúdez-Chacón R, Gonnet GH, Smith K (2015) Automatic problem-specific hyperparameter optimization and model selection for supervised machine learning: Technical Report. Tech. rep, Zürich
Birattari M, Yuan Z, Balaprakash P et al (2010) F-race and iterated f-race: an overview. Springer, Berlin, pp 311–336. https://doi.org/10.1007/978-3-642-02538-9_13
Bischl B, Lang M, Kotthoff L et al (2016) mlr: machine learning in R. J Mach Learn Res 17(170):1–5
Bischl B, Binder M, Lang M et al (2023) Hyperparameter optimization: foundations, algorithms, best practices and open challenges. https://wires.onlinelibrary.wiley.com/doi/10.1002/widm.1484
Blanco-Justicia A, Domingo-Ferrer J (2019) Machine learning explainability through comprehensible decision trees. In: Machine learning and knowledge extraction: third IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 international cross-domain conference, CD-MAKE 2019, Canterbury, UK, August 26–29, 2019, Proceedings. Springer, Berlin, pp 15–26. https://doi.org/10.1007/978-3-030-29726-8_2
Blanco-Justicia A, Domingo-Ferrer J, Martínez S et al (2020) Machine learning explainability via microaggregation and shallow decision trees. Knowl Based Syst 194:105532. https://doi.org/10.1016/j.knosys.2020.105532
Brazdil P, Giraud-Carrier C, Soares C et al (2009) Metalearning: applications to data mining, 1st edn. Springer, Berlin
Breiman L, Friedman J, Olshen R et al (1984) Classification and regression trees. Chapman & Hall (Wadsworth, Inc.), London
Brodersen KH, Ong CS, Stephan KE et al (2010) The balanced accuracy and its posterior distribution. In: Proceedings of the 2010 20th international conference on pattern recognition. IEEE Computer Society, pp 3121–3124
Cawley GC, Talbot NLC (2010) On over-fitting in model selection and subsequent selection bias in performance evaluation. J Mach Learn Res 11:2079–2107
Clerc M (2012) Standard particle swarm optimization
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Eggensperger K, Hutter F, Hoos HH et al (2015) Efficient benchmarking of hyperparameter optimizers via surrogates. In: Proceedings of the twenty-ninth AAAI conference on artificial intelligence. AAAI Press, AAAI’15, pp 1114–1120. http://dl.acm.org/citation.cfm?id=2887007.2887162
Eitrich T, Lang B (2006) Efficient optimization of support vector machine learning parameters for unbalanced datasets. J Comp Appl Math 196(2):425–436
Esposito F, Malerba D, Semeraro G et al (1999) The effects of pruning methods on the predictive accuracy of induced decision trees. Appl Stoch Models Bus Ind 15:277–299
European Commission (2016) Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance). https://eur-lex.europa.eu/eli/reg/2016/679/oj
Falkner S, Klein A, Hutter F (2018) BOHB: robust and efficient hyperparameter optimization at scale. In: Dy J, Krause A (eds) Proceedings of the 35th international conference on Machine Learning, Proceedings of Machine Learning Research, vol 80. PMLR, pp 1437–1446
Fernández-Delgado M, Cernadas E, Barro S et al (2014) Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 15:3133–3181
Feurer M, Klein A, Eggensperger K et al (2015a) Efficient and robust automated machine learning. In: Cortes C, Lawrence ND, Lee DD, et al (eds) Advances in neural information processing systems 28. Curran Associates, Inc., pp 2944–2952
Feurer M, Springenberg JT, Hutter F (2015b) Initializing Bayesian hyperparameter optimization via meta-learning. In: Proceedings of the twenty-ninth AAAI conference on artificial intelligence, AAAI’15. AAAI Press, pp 1128–1135. http://dl.acm.org/citation.cfm?id=2887007.2887164
Feurer M, Eggensperger K, Falkner S et al (2020) Auto-sklearn 2.0: hands-free automl via meta-learning. arXiv:2007.04074 [cs.LG]
Garcia LPF, Lehmann J, de Carvalho ACPLF et al (2019) New label noise injection methods for the evaluation of noise filters. Knowl Based Syst 163:693–704. https://doi.org/10.1016/j.knosys.2018.09.031
Gascón-Moreno J, Salcedo-Sanz S, Ortiz-García EG et al (2011) A binary-encoded tabu-list genetic algorithm for fast support vector regression hyper-parameters tuning. In: International conference on intelligent systems design and applications, pp 1253–1257
Gijsbers P, Vanschoren J (2021) Gama: a general automated machine learning assistant. In: Dong Y, Ifrim G, Mladenić D et al (eds) Machine learning and knowledge discovery in databases. Applied data science and demo track. Springer, Cham, pp 560–564
Goldberg D (1989) Genetic algorithms in search, optimization and machine learning. Addison Wesley, London
Gomes TAF, Prudêncio RBC, Soares C et al (2012) Combining meta-learning and search techniques to select parameters for support vector machines. Neurocomputing 75(1):3–13
Gonzalez-Fernandez Y, Soto M (2014) copulaedas: an R package for estimation of distribution algorithms based on copulas. J Stat Softw 58(9):1–34
Hauschild M, Pelikan M (2011) An introduction and survey of estimation of distribution algorithms. Swarm Evol Comput 1(3):111–128
Haykin S (2007) Neural networks: a comprehensive foundation, 3rd edn. Prentice-Hall, Upper Saddle River
Hornik K, Buchta C, Zeileis A (2009) Open-source machine learning: R meets Weka. Comput Stat 24(2):225–232
Hothorn T, Hornik K, Zeileis A (2006) Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat 15(3):651–674
Huang BF, Boutros PC (2016) The parameter sensitivity of random forests. BMC Bioinform 17(1):331. https://doi.org/10.1186/s12859-016-1228-x
Hutter F, Hoos H, Leyton-Brown K (2014) An efficient approach for assessing hyperparameter importance. In: Proceedings of the 31th international conference on machine learning, ICML 2014, Beijing, China, 21–26 June 2014, pp 754–762. http://jmlr.org/proceedings/papers/v32/hutter14.html
Jankowski D, Jackowski K (2014) Evolutionary algorithm for decision tree induction. In: Saeed K, Snášel V (eds) Computer information systems and industrial management, vol 8838. Lecture notes in computer science. Springer, Berlin, pp 23–32
Jed Wing, Weston S, Williams A et al (2016) caret: classification and regression training. https://CRAN.R-project.org/package=caret, R package version 6.0-71
Kanda J, de Carvalho A, Hruschka E et al (2016) Meta-learning to select the best meta-heuristic for the traveling salesman problem: a comparison of meta-features. Neurocomputing 205:393–406. https://doi.org/10.1016/j.neucom.2016.04.027
Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of the IEEE international conference on neural networks, Perth, Australia, pp 1942–1948
Kohavi R (1996) Scaling up the accuracy of Naive–Bayes classifiers: a decision-tree hybrid. In: Second international conference on knowledge discovery and data mining, pp 202–207
Kotthoff L, Thornton C, Hoos HH et al (2016) Auto-weka 2.0: automatic model selection and hyperparameter optimization in weka. J Mach Learn Res 17:1–5
Krstajic D, Buturovic LJ, Leahy DE et al (2014) Cross-validation pitfalls when selecting and assessing regression and classification models. J Cheminform 6(1):1–15. https://doi.org/10.1186/1758-2946-6-10
Landwehr N, Hall M, Frank E (2005) Logistic model trees. Mach Learn 95(1–2):161–205
Lang M, Kotthaus H, Marwedel P et al (2015) Automatic model selection for high-dimensional survival analysis. J Stat Comput Simul 85(1):62–76. https://doi.org/10.1080/00949655.2014.929131
Lévesque JC, Gagné C, Sabourin R (2016) Bayesian hyperparameter optimization for ensemble learning. In: Proceedings of the thirty-second conference on uncertainty in artificial intelligence. AUAI Press, Arlington, Virginia, USA, UAI’16, pp 437–446. http://dl.acm.org/citation.cfm?id=3020948.3020994
Li L, Jamieson K, DeSalvo G et al (2018) Hyperband: a novel bandit-based approach to hyperparameter optimization. J Mach Learn Res 18(185):1–52
Liaw A, Wiener M (2002) Classification and regression by randomforest. R News 2(3):18–22
Lin SW, Chen SC (2012) Parameter determination and feature selection for c4.5 algorithm using scatter search approach. Soft Comput 16(1):63–75. https://doi.org/10.1007/s00500-011-0734-z
Loh WY (2014) Fifty years of classification and regression trees. Int Stat Rev 82(3):329–348
López-Ibáñez M, Dubois-Lacoste J, Cáceres LP et al (2016) The irace package: iterated racing for automatic algorithm configuration. Oper Res Perspect 3:43–58. https://doi.org/10.1016/j.orp.2016.09.002
Ma J (2012) Parameter tuning using Gaussian processes. Master’s thesis, University of Waikato, New Zealand
Mantovani RG, Horváth T, Cerri R et al (2016) Hyper-parameter tuning of a decision tree induction algorithm. In: 5th Brazilian conference on intelligent systems, BRACIS 2016, Recife, Brazil, October 9–12, 2016. IEEE Computer Society, pp 37–42. https://doi.org/10.1109/BRACIS.2016.018
Mantovani RG, Rossi AL, Alcobaça E et al (2019) A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers. Inf Sci 501:193–221. https://doi.org/10.1016/j.ins.2019.06.005
Massimo CM, Navarin N, Sperduti A (2016) Hyper-parameter tuning for graph kernels via multiple kernel learning. Springer, Cham, pp 214–223. https://doi.org/10.1007/978-3-319-46672-9_25
Mills KL, Filliben JJ, Haines AL (2015) Determining relative importance and effective settings for genetic algorithm control parameters. Evol Comput 23(2):309–342. https://doi.org/10.1162/EVCO_a_00137
Miranda P, Silva R, Prudêncio R (2014) Fine-tuning of support vector machine parameters using racing algorithms. In: Proceedings of the 22nd European symposium on artificial neural networks, computational intelligence and machine learning, ESANN 2014, pp 325–330
Molina MM, Luna JM, Romero C et al (2012) Meta-learning approach for automatic parameter tuning: a case study with educational datasets. In: Proceedings of the 5th international conference on educational data mining, EDM 2012, pp 180–183
Nakamura M, Otsuka A, Kimura H (2014) Automatic selection of classification algorithms for non-experts using meta-features. China-USA Bus Rev 13(3):199–205
Padierna LC, Carpio M, Rojas A et al (2017) Hyper-parameter tuning for support vector machines by estimation of distribution algorithms. Springer, Cham, pp 787–800
Pérez Cáceres L, López-Ibáñez M, Stützle T (2014) An analysis of parameters of irace. Springer, Berlin, pp 37–48. https://doi.org/10.1007/978-3-662-44320-0_4
Pilát M, Neruda R (2013) Multi-objectivization and surrogate modelling for neural network hyper-parameters tuning. Springer, Berlin, pp 61–66. https://doi.org/10.1007/978-3-642-39678-6_11
Podgorelec V, Karakatic S, Barros RC et al (2015) Evolving balanced decision trees with a multi-population genetic algorithm. In: IEEE congress on evolutionary computation, CEC 2015, Sendai, Japan, May 25–28, 2015. IEEE, pp 54–61. https://doi.org/10.1109/CEC.2015.7256874
Probst P, Boulesteix A, Bischl B (2019) Tunability: importance of hyperparameters of machine learning algorithms. J Mach Learn Res 20:53:1-53:32
Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Francisco
Reif M, Shafait F, Dengel A (2011) Prediction of classifier training time including parameter optimization. In: Bach J, Edelkamp S (eds) KI 2011: advances in artificial intelligence, vol 7006. Lecture notes in computer science. Springer, Berlin, pp 260–271
Reif M, Shafait F, Dengel A (2012) Meta-learning for evolutionary parameter optimization of classifiers. Mach Learn 87:357–380
Reif M, Shafait F, Goldstein M et al (2014) Automatic classifier selection for non-experts. Pattern Anal Appl 17(1):83–96
Ribeiro MT, Singh S, Guestrin C (2016) Model-agnostic interpretability of machine learning. arXiv:1606.05386
Ridd P, Giraud-Carrier C (2014) Using metalearning to predict when parameter optimization is likely to improve classification accuracy. In: Vanschoren J, Brazdil P, Soares C et al (eds) Meta-learning and algorithm selection workshop at ECAI 2014, pp 18–23
Rokach L, Maimon O (2014) Data mining with decision trees: theory and applications, 2nd edn. World Scientific, River Edge
Sabharwal A, Samulowitz H, Tesauro G (2016) Selecting near-optimal learners via incremental data allocation. In: Proceedings of the thirtieth AAAI conference on artificial intelligence. AAAI Press, AAAI’16, pp 2007–2015. http://dl.acm.org/citation.cfm?id=3016100.3016179
Sanders S, Giraud-Carrier CG (2017) Informing the use of hyperparameter optimization through metalearning. In: 2017 IEEE International conference on data mining, ICDM 2017, New Orleans, LA, USA, November 18–21, 2017, pp 1051–1056
Schauerhuber M, Zeileis A, Meyer D et al (2008) Benchmarking open-source tree learners in R/RWeka. Springer, Berlin, pp 389–396. https://doi.org/10.1007/978-3-540-78246-9_46
Scrucca L (2013) GA: a package for genetic algorithms in R. J Stat Softw 53(1):1–37. https://doi.org/10.18637/jss.v053.i04
Simon D (2013) Evolutionary optimization algorithms, 1st edn. Wiley, New York
Snoek J, Larochelle H, Adams RP (2012) Practical Bayesian optimization of machine learning algorithms. In: Pereira F, Burges C, Bottou L et al (eds) Advances in neural information processing systems, vol 25. Curran Associates, Inc., pp 2951–2959
Stiglic G, Kocbek S, Pernek I et al (2012) Comprehensive decision tree models in bioinformatics. PLoS ONE 7(3):1–13. https://doi.org/10.1371/journal.pone.0033812
Sun Q, Pfahringer B (2013) Pairwise meta-rules for better meta-learning-based algorithm ranking. Mach Learn 93(1):141–161. https://doi.org/10.1007/s10994-013-5387-y
Sureka A, Indukuri KV (2008) Using genetic algorithms for parameter optimization in building predictive data mining models. Springer, Berlin, pp 260–271. https://doi.org/10.1007/978-3-540-88192-6_25
Tan PN, Steinbach M, Kumar V (2005) Introduction to data mining, 1st edn. Addison-Wesley Longman Publishing Co., Inc, Boston
Tantithamthavorn C, McIntosh S, Hassan AE et al (2016) Automated parameter optimization of classification techniques for defect prediction models. In: Proceedings of the 38th international conference on software engineering. ACM, New York, NY, USA, ICSE’16, pp 321–332. https://doi.org/10.1145/2884781.2884857
Therneau T, Atkinson B, Ripley B (2015) rpart: recursive partitioning and regression trees. https://CRAN.R-project.org/package=rpart, R package version 4.1-10
Thornton C, Hutter F, Hoos HH et al (2013) Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: Proceedings of the KDD-2013, pp 847–855
van Rijn JN, Hutter F (2017) An empirical study of hyperparameter importance across datasets. In: Proceedings of the international workshop on automatic selection, configuration and composition of machine learning algorithms co-located with the european conference on machine learning & principles and practice of knowledge discovery in databases, AutoML@PKDD/ECML 2017, Skopje, Macedonia, September 22, 2017, pp 91–98. http://ceur-ws.org/Vol-1998/paper_09.pdf
Vanschoren J, van Rijn JN, Bischl B et al (2014) Openml: networked science in machine learning. SIGKDD Explor Newsl 15(2):49–60
Vieira CPR, Digiampietri LA (2020) A study about explainable artificial intelligence: using decision tree to explain SVM. Revista Brasileira de Computação Aplicada 12(1):113–121. https://doi.org/10.5335/rbca.v12i1.10247
Wainberg M, Alipanahi B, Frey BJ (2016) Are random forests truly the best classifiers? J Mach Learn Res 17(110):1–5
Wang L, Feng M, Zhou B et al (2015) Efficient hyper-parameter optimization for NLP applications. In: Màrquez L, Callison-Burch C, Su J et al (eds) Proceedings of the 2015 conference on empirical methods in natural language processing, EMNLP 2015, Lisbon, Portugal, September 17–21, 2015. The Association for Computational Linguistics, pp 2112–2117. http://aclweb.org/anthology/D/D15/D15-1253.pdf
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Wu X, Kumar V (2009) The top ten algorithms in data mining, 1st edn. Chapman & Hall/CRC, London
Yang XS, Cui Z, Xiao R et al (2013) Swarm intelligence and bio-inspired computation: theory and applications, 1st edn. Elsevier, Amsterdam
Zambrano-Bigiarini M, Clerc M, Rojas R (2013) Standard particle swarm optimisation 2011 at CEC-2013: a baseline for future PSO improvements. In: Proceedings of the IEEE congress on evolutionary computation, CEC 2013, Cancun, Mexico, June 20–23, 2013. IEEE, pp 2337–2344. https://doi.org/10.1109/CEC.2013.6557848
Acknowledgements
The authors would like to thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for the financial support, the Brazilian National Council for Scientific and Technological Development (CNPq) for the grant #409371/2021-1 (CNPq/MCTI/FNDCT No 18/2021), and especially the São Paulo Research Foundation (FAPESP) for the grants #2012/23114-9, #2013/07375-0 and #2015/03986-0. EFOP-3.6.3-VEKOP-16-2017-00001: Talent Management in Autonomous Vehicle Control Technologies. The Project is supported by the Hungarian Government and co-financed by the European Social Fund.
Author information
Authors and Affiliations
Contributions
RGM: conception of research, acquisition of data, prepared figures and tables, analysis, interpretation of data, drafted the work, reviewed the manuscript. TH: conception of research, acquisition of data, interpretation of data, drafted the work, critically revised the article for important intellectual content, reviewed the manuscript. ALDR: interpretation of data, prepared figures and tables, drafted the work, critically revised the article for important intellectual content, reviewed the manuscript. RC: drafted the work, critically revised the article for important intellectual content, reviewed the manuscript. SBJ: drafted the work, critically revised the article for important intellectual content, reviewed the manuscript. JV: conception of research, analysis, interpretation of data, critically revised the article for important intellectual content, reviewed the manuscript. ACPLFdeC: conception of research, analysis, interpretation of data, critically revised the article for important intellectual content, reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Responsible editor: Eyke Hüllermeier.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: List of abbreviations used in the paper
- AI: Artificial Intelligence
- ANN: Artificial Neural Network
- AUC: Area Under the ROC curve
- AutoML: Automated Machine Learning
- BAC: Balanced per class Accuracy
- BOHB: Bayesian Optimization with HyperBand
- CART: Classification and Regression Tree
- CASH: Combined Algorithm Selection and Hyper-parameter Optimization
- CD: Critical Difference
- CTree: Conditional Inference Trees
- CV: Cross-validation
- DL: Deep Learning
- DT: Decision Tree
- EDA: Estimation of Distribution Algorithm
- GA: Genetic Algorithm
- GDPR: General Data Protection Regulation
- GP: Gaussian Process
- GS: Grid Search
- HP: Hyperparameter
- Irace: Iterated F-race
- kNN: k-Nearest Neighbors
- LMT: Logistic Model Tree
- LR: Logistic Regression
- ML: Machine Learning
- MtL: Meta-learning
- NB: Naïve Bayes
- NBTree: Naïve-Bayes Tree
- OpenML: Open Machine Learning
- PD: Parametric Density
- PS: Pattern Search
- PSO: Particle Swarm Optimization
- REP: Reduced Error Pruning
- RF: Random Forest
- RS: Random Search
- SH: Shrinking Hypercube
- SMBO: Sequential Model-based Optimization
- SS: Scatter Search
- SVM: Support Vector Machine
- UCI: University of California Irvine
- VTJ48: Visual Tuning J48
Appendix 2: List of OpenML datasets used in experiments
This appendix presents the full table of datasets used in both the tuning and meta-learning experiments performed in this paper. For each dataset, the table shows: the OpenML dataset name and id, the number of attributes (D), the number of examples (N), the number of classes (C), the number of examples belonging to the majority and minority classes (nMaj, nMin), the proportion between them (P), and whether the dataset was added in the enrichment step for meta-learning (Tables 10, 11, 12, 13, 14).
Appendix 3: Hyperparameter distributions of the best solutions returned by the Irace tuning technique
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Gomes Mantovani, R., Horváth, T., Rossi, A.L.D. et al. Better trees: an empirical study on hyperparameter tuning of classification decision tree induction algorithms. Data Min Knowl Disc (2024). https://doi.org/10.1007/s10618-024-01002-5
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10618-024-01002-5