Abstract
This chapter describes several fields, besides the feature selection preprocessing step (the theme of this book), in which ensembles have been successfully applied. First, Sect. 7.1 gives a brief review of the different application fields in which ensembles have been used, together with the basic levels at which different ensemble designs can be produced, and a sample taxonomy. Then, Sect. 7.2 presents the basic ideas of ensemble classifier design, one of the first machine learning areas in which the idea of ensembles was applied. Since there are already many thorough reference books on ensemble classification, we focus on the latest ideas in classification ensembles, addressing problems such as one-class classification, data streams, missing data and imbalanced data. Afterwards, the first attempts at applying the ensemble paradigm to the relatively new field of quantification are described in Sect. 7.3. Sect. 7.4 moves on to ensembles for clustering, another area in which ensembles have become increasingly popular. Sect. 7.5 describes an approach to ensembles for discretization and, finally, Sect. 7.6 summarizes and discusses the contents of this chapter.
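The idea unifying all of these applications is the same: train several diverse member models and aggregate their outputs. As a minimal sketch of that idea, the bagging-style majority-voting ensemble below trains simple one-feature decision "stumps" on bootstrap resamples of a toy 1-D dataset. It is purely illustrative (the function names, stump design and toy data are our own), not an algorithm taken from the chapter:

```python
# Minimal sketch of a majority-voting classification ensemble
# (bagging-style, with one-feature decision "stumps" as members).
# Purely illustrative: the chapter surveys many richer ensemble designs.
from collections import Counter
import random

def train_bagged_stumps(data, labels, n_members=11, seed=0):
    """Train n_members threshold stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    members, n = [], len(data)
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap resample
        xs = [data[i] for i in idx]
        ys = [labels[i] for i in idx]
        thr = sum(xs) / len(xs)                      # split at the sample mean
        above = [y for x, y in zip(xs, ys) if x >= thr]
        # Orient the stump toward the majority label above the threshold
        pos = Counter(above).most_common(1)[0][0] if above else 1
        members.append((thr, pos))
    return members

def predict(members, x):
    """Each member votes; the ensemble answers with the majority label."""
    votes = [pos if x >= thr else 1 - pos for thr, pos in members]
    return Counter(votes).most_common(1)[0][0]

# Toy 1-D data: class 0 below 5, class 1 above
data = [1, 2, 3, 4, 6, 7, 8, 9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = train_bagged_stumps(data, labels)
```

Here diversity comes from resampling the training data; boosting, random subspaces and the other designs reviewed in this chapter inject diversity differently, but all reduce to some aggregation of member outputs like `predict` above.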
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this chapter
Bolón-Canedo, V., Alonso-Betanzos, A. (2018). Other Ensemble Approaches. In: Recent Advances in Ensembles for Feature Selection. Intelligent Systems Reference Library, vol 147. Springer, Cham. https://doi.org/10.1007/978-3-319-90080-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90079-7
Online ISBN: 978-3-319-90080-3