Abstract
The k-means algorithm is a well-known clustering method. Although it was originally defined for data represented as vectors, the set median (the point of a set P that minimizes the sum of distances to the other points of P) can be used in place of the mean when no vector representation is available. Computing the set median costs O(|P|^2). Recently, a method that obtains an approximated median in O(|P|) was proposed. In this paper we use this approximated median in the k-median algorithm to speed it up.
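To make the definitions above concrete, the following is a minimal Python sketch of the exact set median and of a k-median loop built on it. It is an illustration under stated assumptions, not the authors' implementation: the function names (`set_median`, `k_median`, `edit_distance`), the `median` hook, and the choice of strings with edit distance as the non-vector data are all hypothetical. The linear-time approximated median mentioned in the abstract is not reproduced here; it would be supplied through the `median` parameter in place of the exact quadratic version.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Distance = Callable[[T, T], float]


def set_median(points: Sequence[T], dist: Distance) -> T:
    """Exact set median: the member of `points` with the smallest sum of
    distances to all members. Needs O(|P|^2) distance evaluations."""
    return min(points, key=lambda c: sum(dist(c, q) for q in points))


def k_median(
    points: Sequence[T],
    initial: Sequence[T],
    dist: Distance,
    median: Callable[[Sequence[T], Distance], T] = set_median,
    max_iter: int = 100,
) -> Tuple[List[T], List[List[T]]]:
    """k-means-style loop that uses a set median instead of a mean.

    `median` is a hypothetical hook: any faster approximated median with
    the same signature could be passed in place of `set_median`.
    """
    medians = list(initial)
    clusters: List[List[T]] = [[] for _ in medians]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest median.
        clusters = [[] for _ in medians]
        for p in points:
            nearest = min(range(len(medians)), key=lambda i: dist(p, medians[i]))
            clusters[nearest].append(p)
        # Update step: recompute each median; an empty cluster keeps its old one.
        updated = [median(c, dist) if c else m for c, m in zip(clusters, medians)]
        if updated == medians:
            break  # no median changed, so the assignment is stable
        medians = updated
    return medians, clusters


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here only as an example of a
    dissimilarity on data without a vector representation."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    words = ["cat", "cart", "car", "dog", "dot", "dig"]
    medians, clusters = k_median(words, ["cart", "dig"], edit_distance)
    print(medians, clusters)
```

Because the median is passed in as a function, the cost of the update step is the only part that changes when the exact set median is swapped for a cheaper approximation; the assignment step is unaffected.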
The authors thank the Spanish CICyT for partial support of this work through project TIC2000-1703-CO3-02.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gómez-Ballester, E., Micó, L., Oncina, J. (2002). A Fast Approximated k-Median Algorithm. In: Caelli, T., Amin, A., Duin, R.P.W., de Ridder, D., Kamel, M. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2002. Lecture Notes in Computer Science, vol 2396. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-70659-3_76
DOI: https://doi.org/10.1007/3-540-70659-3_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44011-6
Online ISBN: 978-3-540-70659-5
eBook Packages: Springer Book Archive