Abstract
To improve decision performance using historical decision data, this paper proposes a data-driven decision model based on local two-stage weighted ensemble learning. The assessments of historical alternatives are collected within a multicriteria framework. For each new alternative, a set of similar historical alternatives is identified using the K-nearest-neighbor technique, and a set of base classifiers (BCs) is generated from the historical assessments. Based on the ensemble error and diversity of the BCs in predicting the new alternative's similar historical alternatives, a local two-stage weighted ensemble method is developed to learn the optimal BC weights for the new alternative. This learning process accounts for changes in the BCs' competence across different alternatives (instances) and avoids the dilemma of balancing the accuracy and diversity of the BCs. By combining the continuous outputs of the BCs with the learned BC weights, weighted ensemble outputs are obtained for the similar historical alternatives of the new alternative. From these outputs and the criterion assessments of those similar historical alternatives, a linear optimization model is constructed to learn criterion weights, with which an interpretable decision is made. The advantages of the proposed decision model over four traditional decision models are validated in a real case study on the diagnosis of thyroid nodules, and experiments on thirty real datasets compare the competence of the proposed weighted ensemble method with that of mainstream ensemble methods and combination rules.
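The local weighting pipeline described above can be illustrated with a minimal sketch. All names here are illustrative, and the paper's two-stage optimization over local ensemble error and diversity is replaced by a simple local-accuracy weighting, purely as a stand-in for the learned BC weights:

```python
# Minimal sketch of local weighted ensemble learning, assuming a simplified
# weighting rule (local accuracy) in place of the paper's two-stage
# error/diversity optimization. Dataset and base classifiers are arbitrary.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

# Historical alternatives (assessments + decisions) and one new alternative.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_hist, y_hist, x_new = X[:-1], y[:-1], X[-1:]

# Base classifiers (BCs) generated from the historical assessments.
bcs = [
    LogisticRegression(max_iter=1000).fit(X_hist, y_hist),
    DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_hist, y_hist),
    GaussianNB().fit(X_hist, y_hist),
]

# Local region: the K most similar historical alternatives to the new one.
K = 15
nn = NearestNeighbors(n_neighbors=K).fit(X_hist)
idx = nn.kneighbors(x_new, return_distance=False)[0]
X_loc, y_loc = X_hist[idx], y_hist[idx]

# Stand-in for the learned BC weights: normalized accuracy on the local
# region (the paper instead learns weights from ensemble error and diversity).
acc = np.array([bc.score(X_loc, y_loc) for bc in bcs])
w = acc / acc.sum()

# Weighted ensemble of the BCs' continuous outputs for the new alternative.
proba = sum(wi * bc.predict_proba(x_new) for wi, bc in zip(w, bcs))
pred = int(np.argmax(proba))
```

Because the weights are recomputed from each new alternative's own local region, the ensemble adapts to changes in the BCs' competence across instances, which is the key idea the model builds on.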
Notes
Usually, K denotes the number of data subsets in cross validation. Because K has already been defined in this paper as the size of the local region, we use Z to denote the number of data subsets in cross validation for distinction.
Acknowledgements
This research is supported by the National Natural Science Foundation of China (Grant Nos. 72101074, 72171066, and 72071061), and the Fundamental Research Funds for the Central Universities (JZ2021HGTA0139 and JZ2021HGQA0203).
Cite this article
Xu, C., Chang, W. & Liu, W. Data-driven decision model based on local two-stage weighted ensemble learning. Ann Oper Res 325, 995–1028 (2023). https://doi.org/10.1007/s10479-022-04599-2