Abstract
In this paper, we present a non-deterministic strategy for searching for the optimal number of trees (NoTs) hyperparameter in Random Forest (RF). Hyperparameter tuning in Machine Learning (ML) algorithms optimizes the predictability of an ML algorithm and/or improves the utilization of computer resources. However, hyperparameter tuning is a complex and time-consuming optimization task. We set up experiments with the goals of maximizing predictability, minimizing NoTs and minimizing time of execution (ToE). Compared to the deterministic algorithm, e-greedy and a default-configured RF, our non-deterministic algorithm recorded an average accuracy (acc) of approximately 98%, an average NoTs improvement of 29.39%, an average ToE improvement ratio of 415.92 and an average improvement of 95% in the number of iterations. Moreover, evaluations using Jackknife Estimation showed stable and reliable results across several experiment runs of the non-deterministic strategy. The non-deterministic approach to selecting the hyperparameter showed significant acc and better utilization of computer resources (i.e. CPU and memory time). This approach can be adopted widely in hyperparameter tuning and in conserving computer resources, i.e. green computing.
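The abstract reports that Jackknife Estimation was used to check the stability of results across runs. As a rough illustration of the general technique only, the sketch below computes a leave-one-out jackknife estimate of the mean accuracy and its standard error. The function name `jackknife_mean` and the accuracy values are hypothetical and are not taken from the paper; the authors' actual evaluation procedure may differ.

```python
from math import sqrt


def jackknife_mean(values):
    """Leave-one-out jackknife estimate of the mean and its standard error.

    `values` is a list of per-run scores (hypothetical accuracies here).
    """
    n = len(values)
    if n < 2:
        raise ValueError("jackknife needs at least two observations")

    total = sum(values)
    # Mean of the sample with the i-th observation left out.
    leave_one_out = [(total - v) / (n - 1) for v in values]
    estimate = sum(leave_one_out) / n
    # Jackknife variance: (n - 1) / n times the sum of squared deviations
    # of the leave-one-out means from their average.
    variance = (n - 1) / n * sum((m - estimate) ** 2 for m in leave_one_out)
    return estimate, sqrt(variance)


# Hypothetical accuracies from five independent runs (not the paper's data).
accuracies = [0.97, 0.98, 0.99, 0.98, 0.98]
est, se = jackknife_mean(accuracies)
print(f"jackknife mean accuracy: {est:.4f}, standard error: {se:.4f}")
```

A small standard error relative to the estimate would suggest that the runs agree closely, which is the sense of "stable" that such a check usually aims at.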
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Senagi, K., Jouandeau, N. (2018). Confidence in Random Forest for Performance Optimization. In: Bramer, M., Petridis, M. (eds) Artificial Intelligence XXXV. SGAI 2018. Lecture Notes in Computer Science, vol 11311. Springer, Cham. https://doi.org/10.1007/978-3-030-04191-5_31
DOI: https://doi.org/10.1007/978-3-030-04191-5_31
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04190-8
Online ISBN: 978-3-030-04191-5
eBook Packages: Computer Science, Computer Science (R0)