Abstract
Instance-based classifiers, such as k-Nearest Neighbors, predict the class of a new observation from a distance or similarity measure between that observation and the stored training data. However, because of the required distance calculations, classifying new instances becomes computationally expensive as the number of training observations grows. Instance selection techniques have therefore been proposed to improve instance-based classifiers by reducing the number of training instances that must be stored to achieve adequate classification rates. Although other methods exist, an evolutionary algorithm has been used for instance selection with some of the best results in terms of data reduction and preservation of classification accuracy. Unfortunately, this performance comes at the cost of longer computation times than classic instance selection techniques require. In this work we introduce a new stopping criterion for the evolutionary algorithm that depends on the convergence of its fitness function. Experimentation shows that the new criterion reduces computation time while achieving comparable performance.
DISTRIBUTION A: Approved for public release: distribution unlimited: 02 May 2016. Case # 88ABW-2016-2258l.
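The abstract describes stopping an evolutionary instance selection run once its fitness function has converged, but it does not give the exact rule. The following is only a minimal sketch of that general idea, not the paper's method. It assumes a 1-NN classifier, a fitness that weights accuracy and data reduction equally (`alpha=0.5`), a simple single-bit-flip mutation search in place of a full evolutionary algorithm, and a hypothetical convergence test (`has_converged`) that stops when the best fitness has improved by no more than `tol` over the last `window` generations. All function and parameter names are illustrative.

```python
import random


def nearest_label(point, instances):
    """Return the label of the stored instance closest to `point` (1-NN, squared Euclidean)."""
    closest = min(
        instances,
        key=lambda inst: sum((a - b) ** 2 for a, b in zip(point, inst[0])),
    )
    return closest[1]


def fitness(mask, data, alpha=0.5):
    """Assumed fitness: alpha * accuracy + (1 - alpha) * fraction of instances removed.

    Accuracy is measured on the full training set using only the selected
    subset as the 1-NN reference set (a simplification: selected points are
    trivially classified correctly by themselves).
    """
    subset = [inst for inst, keep in zip(data, mask) if keep]
    if not subset:
        return 0.0
    correct = sum(nearest_label(x, subset) == y for x, y in data)
    accuracy = correct / len(data)
    reduction = 1 - len(subset) / len(data)
    return alpha * accuracy + (1 - alpha) * reduction


def has_converged(history, window=10, tol=1e-6):
    """Hypothetical rule: best fitness gained at most `tol` over the last `window` generations."""
    if len(history) <= window:
        return False
    return history[-1] - history[-1 - window] <= tol


def evolve(data, max_generations=500, window=10, tol=1e-6, seed=0):
    """Single-bit-flip search over selection masks with a convergence-based stop."""
    rng = random.Random(seed)
    best = [True] * len(data)          # start by keeping every instance
    best_fit = fitness(best, data)
    history = [best_fit]
    for generation in range(1, max_generations + 1):
        child = best[:]
        i = rng.randrange(len(child))
        child[i] = not child[i]        # mutation: toggle one instance in or out
        child_fit = fitness(child, data)
        if child_fit > best_fit:       # keep the child only if it is strictly better
            best, best_fit = child, child_fit
        history.append(best_fit)
        if has_converged(history, window, tol):
            break                      # stop early instead of running all generations
    return best, best_fit, generation


if __name__ == "__main__":
    toy = [((0.0, 0.0), 0), ((0.1, 0.0), 0), ((0.0, 0.1), 0),
           ((1.0, 1.0), 1), ((1.1, 1.0), 1), ((1.0, 1.1), 1)]
    mask, fit, gens = evolve(toy)
    print(mask, fit, gens)
```

In this sketch the fixed generation budget (`max_generations`) remains as an upper bound, and the convergence test only decides whether the run can end sooner. The values of `window` and `tol` trade computation time against the risk of stopping before further improvement is found.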
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Bennette, W.D. (2017). A Data Driven Stopping Criterion for Evolutionary Instance Selection. In: Angelov, P., Gegov, A., Jayne, C., Shen, Q. (eds) Advances in Computational Intelligence Systems. Advances in Intelligent Systems and Computing, vol 513. Springer, Cham. https://doi.org/10.1007/978-3-319-46562-3_26
DOI: https://doi.org/10.1007/978-3-319-46562-3_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46561-6
Online ISBN: 978-3-319-46562-3
eBook Packages: Engineering, Engineering (R0)