Asymptotic properties of univariate sample k-means clusters

  • Author: M.A. Wong
  • Published: Journal of Classification, volume 1, pages 255–270 (1984)

Abstract

A random sample of size N is divided into k clusters that locally minimize the within-cluster sum of squares. Some large-sample properties of this k-means clustering method (as k approaches ∞ with N) are obtained. In one dimension, it is established that the sample k-means clusters are such that the within-cluster sums of squares are asymptotically equal, and that the sizes of the cluster intervals are inversely proportional to the one-third power of the underlying density at the midpoints of the intervals. The difficulty involved in generalizing the results to the multivariate case is mentioned.
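
The two univariate claims in the abstract can be checked numerically. The following is a minimal sketch under stated assumptions, not the paper's procedure: it uses plain Lloyd iterations as a stand-in for a local minimizer of the within-cluster sum of squares, and the density f(x) = (1 + 2x)/2 on [0, 1], the sample size, the number of clusters, and all function names are illustrative choices that do not come from the article. The outermost cluster intervals are closed off with the known support endpoints 0 and 1.

```python
import bisect
import math
import random
from itertools import accumulate


def density(x):
    """Illustrative density f(x) = (1 + 2x) / 2 on [0, 1] (an assumption)."""
    return (1.0 + 2.0 * x) / 2.0


def draw_sample(n, seed=0):
    """Sorted sample from `density`, obtained by inverting F(x) = (x + x^2) / 2."""
    rng = random.Random(seed)
    return sorted((math.sqrt(1.0 + 8.0 * rng.random()) - 1.0) / 2.0 for _ in range(n))


def cluster_means(prefix, cuts):
    """Mean of each cluster, where cluster j holds xs[cuts[j]:cuts[j + 1]]."""
    return [(prefix[hi] - prefix[lo]) / (hi - lo) for lo, hi in zip(cuts, cuts[1:])]


def lloyd_1d(xs, k, max_iter=20000):
    """Lloyd iterations on sorted 1-D data, started from equal-count clusters.

    Stops when the partition no longer changes, i.e. at a fixed point of the
    assign/update steps. Returns the cut indices and the cluster centers.
    """
    n = len(xs)
    prefix = [0.0] + list(accumulate(xs))  # prefix sums give O(1) cluster means
    cuts = [round(j * n / k) for j in range(k + 1)]
    for _ in range(max_iter):
        centers = cluster_means(prefix, cuts)
        # In one dimension the nearest-center rule splits at midpoints of centers.
        mids = [(a + b) / 2.0 for a, b in zip(centers, centers[1:])]
        new_cuts = [0] + [bisect.bisect_left(xs, m) for m in mids] + [n]
        if new_cuts == cuts:
            break
        cuts = new_cuts
    return cuts, cluster_means(prefix, cuts)


def summarize(xs, cuts, centers):
    """Per cluster: (interval length, density at interval midpoint, within-cluster SS)."""
    prefix = [0.0] + list(accumulate(xs))
    prefix_sq = [0.0] + list(accumulate(x * x for x in xs))
    bounds = [0.0] + [(a + b) / 2.0 for a, b in zip(centers, centers[1:])] + [1.0]
    rows = []
    for j, (lo, hi) in enumerate(zip(cuts, cuts[1:])):
        s = prefix[hi] - prefix[lo]
        s_sq = prefix_sq[hi] - prefix_sq[lo]
        wss = s_sq - s * s / (hi - lo)  # sum of squared deviations from the mean
        length = bounds[j + 1] - bounds[j]
        midpoint = (bounds[j] + bounds[j + 1]) / 2.0
        rows.append((length, density(midpoint), wss))
    return rows


def loglog_slope(rows):
    """Least-squares slope of log(interval length) on log(density at midpoint)."""
    u = [math.log(f) for _, f, _ in rows]
    v = [math.log(length) for length, _, _ in rows]
    u_bar, v_bar = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    return num / sum((a - u_bar) ** 2 for a in u)


if __name__ == "__main__":
    xs = draw_sample(400_000)
    cuts, centers = lloyd_1d(xs, k=12)
    rows = summarize(xs, cuts, centers)
    wss = [r[2] for r in rows]
    print("log-log slope of length vs density:", loglog_slope(rows))
    print("max/min within-cluster SS:", max(wss) / min(wss))
```

If the one-third power law holds, the fitted log-log slope should lie near -1/3, and the ratio of the largest to the smallest within-cluster sum of squares should stay close to 1. For a fixed number of clusters, both quantities carry sampling noise that shrinks as the number of points per cluster grows.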

Additional information

This research was supported in part by the National Science Foundation under Grant MCS75-08374. The author would like to thank John Hartigan and David Pollard for helpful discussions and comments.

About this article

Cite this article

Wong, M.A. Asymptotic properties of univariate sample k-means clusters. Journal of Classification 1, 255–270 (1984). https://doi.org/10.1007/BF01890126
