Abstract
Many data mining algorithms are sensitive to dataset size: large datasets can impose large memory requirements and very slow response times. Data reduction is therefore an important issue in data mining. This paper proposes a novel method for spatial point-data reduction. The main idea is to search the original training set for a small subset of border instances using a modified pulse coupled neural network (PCNN) model. The original training instances are mapped onto pulse coupled neurons, and a firing algorithm is presented that determines which instances lie in border regions and filters out noisy instances. The reduced set retains the main characteristics of the original dataset and avoids the influence of noise, so it can maintain or even improve the quality of data mining results. The proposed method is a general data reduction algorithm that can be used to improve classification, regression, and clustering algorithms. It achieves approximately linear time complexity and can be used to process large spatial datasets. Experiments demonstrate that the proposed method is efficient and effective.
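To make the idea of keeping border instances and discarding noise concrete, the following is a minimal sketch of a simplified stand-in, not the paper's modified PCNN or its firing algorithm, whose details the abstract does not give. It assumes a plain neighborhood heuristic: a point with too few neighbors within a radius is treated as noise, and a point whose neighbors lie mostly on one side is treated as a border point. The function name `select_border_points` and the parameters `eps`, `min_neighbors`, and `asym_threshold` are hypothetical, and the brute-force distance matrix is quadratic in the number of points, unlike the approximately linear complexity reported for the proposed method.

```python
import numpy as np

def select_border_points(points, eps, min_neighbors=3, asym_threshold=0.3):
    """Illustrative heuristic only (NOT the paper's PCNN firing algorithm).

    Returns two boolean masks: (is_border, is_noise).
    All parameter names and defaults are assumptions made for this sketch.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)

    # diffs[i, j] is the vector from point i to point j (O(n^2) memory).
    diffs = pts[None, :, :] - pts[:, None, :]
    dists = np.linalg.norm(diffs, axis=2)

    # Neighbors are points within eps, excluding the point itself.
    neighbor_mask = (dists <= eps) & ~np.eye(n, dtype=bool)
    counts = neighbor_mask.sum(axis=1)

    # Sparse neighborhoods are treated as noise.
    is_noise = counts < min_neighbors

    # Mean offset to neighbors: near zero when neighbors surround the point,
    # large when they lie mostly on one side (i.e. near a region boundary).
    offset_sums = (diffs * neighbor_mask[:, :, None]).sum(axis=1)
    mean_offset = offset_sums / np.maximum(counts, 1)[:, None]
    asymmetry = np.linalg.norm(mean_offset, axis=1) / eps

    is_border = ~is_noise & (asymmetry > asym_threshold)
    return is_border, is_noise

# Usage: a 7x7 unit grid plus one far-away outlier.
grid = np.array([(x, y) for x in range(7) for y in range(7)], dtype=float)
points = np.vstack([grid, [[100.0, 100.0]]])

is_border, is_noise = select_border_points(points, eps=1.5)
reduced = points[is_border]  # border instances kept as the reduced set
```

In this toy example the reduced set consists of the perimeter of the grid, and the isolated outlier is flagged as noise rather than kept. The thresholds are arbitrary choices for the illustration and would need tuning on real data.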
Cite this article
Sang, Y., Yi, Z. & Zhou, J. Spatial Point-Data Reduction Using Pulse Coupled Neural Network. Neural Process Lett 32, 11–29 (2010). https://doi.org/10.1007/s11063-010-9140-2
DOI: https://doi.org/10.1007/s11063-010-9140-2