Abstract
A random sample of size N is divided into k clusters that minimize the within clusters sum of squares locally. Some large sample properties of this k-means clustering method (as k approaches ∞ with N) are obtained. In one dimension, it is established that the sample k-means clusters are such that the within-cluster sums of squares are asymptotically equal, and that the sizes of the cluster intervals are inversely proportional to the one-third power of the underlying density at the midpoints of the intervals. The difficulty involved in generalizing the results to the multivariate case is mentioned.
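The two one-dimensional claims in the abstract can be checked numerically. The following is a minimal sketch, not the paper's procedure: it assumes an illustrative density f(x) = 1/2 + x on [0, 1], uses a plain Lloyd-style iteration started from sample quantiles to reach a local minimum, and picks N = 20000 and k = 10 arbitrarily. The names `density`, `draw_sample`, `kmeans_1d`, and `summarize` are hypothetical helpers introduced here. For each interior cluster it prints the within-cluster sum of squares, the interval length, and the length multiplied by f(midpoint)^(1/3). If the stated asymptotics hold, the first and third columns should each be roughly constant.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def density(x):
    # Assumed illustrative density on [0, 1]: f(x) = 1/2 + x.
    return 0.5 + x


def draw_sample(n, seed=0):
    # Inverse-CDF sampling: F(x) = (x + x^2) / 2, so x = (-1 + sqrt(1 + 8u)) / 2.
    rng = random.Random(seed)
    return sorted((-1.0 + (1.0 + 8.0 * rng.random()) ** 0.5) / 2.0 for _ in range(n))


def kmeans_1d(xs, k, max_iter=5000):
    """Lloyd-style iteration on sorted 1-D data (k >= 2).

    Returns the k - 1 cut points between clusters and the slice indices
    idx, where cluster j is xs[idx[j]:idx[j + 1]].
    """
    n = len(xs)
    prefix = [0.0] + list(accumulate(xs))  # prefix sums give O(1) cluster means
    centers = [xs[(2 * j + 1) * n // (2 * k)] for j in range(k)]  # quantile start
    idx = None
    for _ in range(max_iter):
        # Each boundary is the midpoint between two adjacent cluster means.
        cuts = [(centers[j] + centers[j + 1]) / 2.0 for j in range(k - 1)]
        new_idx = [0] + [bisect_right(xs, c) for c in cuts] + [n]
        if new_idx == idx:  # assignments unchanged: local minimum reached
            break
        idx = new_idx
        centers = [
            (prefix[idx[j + 1]] - prefix[idx[j]]) / (idx[j + 1] - idx[j])
            if idx[j + 1] > idx[j] else centers[j]
            for j in range(k)
        ]
    return cuts, idx


def summarize(xs, cuts, idx):
    # Interior clusters only, so that both interval ends are k-means cut points.
    k = len(idx) - 1
    rows = []
    for j in range(1, k - 1):
        seg = xs[idx[j]:idx[j + 1]]
        mean = sum(seg) / len(seg)
        wss = sum((x - mean) ** 2 for x in seg)
        length = cuts[j] - cuts[j - 1]
        mid = (cuts[j] + cuts[j - 1]) / 2.0
        rows.append((wss, length, length * density(mid) ** (1.0 / 3.0)))
    return rows


xs = draw_sample(20000)
cuts, idx = kmeans_1d(xs, k=10)
rows = summarize(xs, cuts, idx)
for wss, length, scaled in rows:
    print(f"wss={wss:8.4f}  length={length:7.4f}  length*f(mid)^(1/3)={scaled:7.4f}")
```

Because the assumed density increases on [0, 1], the interval lengths should shrink from left to right, while the rescaled lengths stay close to one another. Agreement is only approximate at k = 10, since the results are asymptotic in k and N.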
Additional information
This research was supported in part by the National Science Foundation under Grant MCS75-08374. The author would like to thank John Hartigan and David Pollard for helpful discussions and comments.
Cite this article
Wong, M.A. Asymptotic properties of univariate sample k-means clusters. Journal of Classification 1, 255–270 (1984). https://doi.org/10.1007/BF01890126