DOI: 10.1145/1014052.1014118

Kernel k-means: spectral clustering and normalized cuts

Published: 22 August 2004

ABSTRACT

Kernel k-means and spectral clustering have both been used to identify clusters that are non-linearly separable in input space. Despite significant research, these methods have remained only loosely related. In this paper, we give an explicit theoretical connection between them. We show the generality of the weighted kernel k-means objective function, and derive the spectral clustering objective of normalized cut as a special case. Given a positive definite similarity matrix, our results lead to a novel weighted kernel k-means algorithm that monotonically decreases the normalized cut. This has important implications: a) eigenvector-based algorithms, which can be computationally prohibitive, are not essential for minimizing normalized cuts, b) various techniques, such as local search and acceleration schemes, may be used to improve the quality as well as speed of kernel k-means. Finally, we present results on several interesting data sets, including diametrical clustering of large gene-expression matrices and a handwriting recognition data set.
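
The weighted kernel k-means iteration mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `weighted_kernel_kmeans`, its parameters, and the handling of empty clusters are all hypothetical choices made here. The sketch uses the standard feature-space distance from a point to a weighted cluster mean, computed from kernel entries only, and records the weighted objective after each pass so that its non-increasing behavior can be inspected when the kernel matrix is positive semidefinite.

```python
import numpy as np


def weighted_kernel_kmeans(K, w, labels, k, max_iter=100):
    """Batch weighted kernel k-means (illustrative sketch).

    K      : (n, n) symmetric kernel matrix, assumed positive semidefinite
    w      : (n,) positive point weights
    labels : (n,) initial cluster indices in range(k)
    Returns the final labels and the objective value seen at each pass.
    """
    K = np.asarray(K, dtype=float)
    w = np.asarray(w, dtype=float)
    labels = np.asarray(labels).copy()
    n = K.shape[0]
    diag = np.diag(K)
    history = []

    for _ in range(max_iter):
        # dist[i, c] = squared feature-space distance from point i
        # to the weighted mean of cluster c.
        dist = np.full((n, k), np.inf)
        for c in range(k):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                continue  # assumption: an empty cluster simply stays empty
            wc = w[idx]
            sc = wc.sum()
            cross = 2.0 * (K[:, idx] @ wc) / sc
            within = wc @ K[np.ix_(idx, idx)] @ wc / sc**2
            dist[:, c] = diag - cross + within

        # Weighted objective for the current assignment.
        history.append(float(np.sum(w * dist[np.arange(n), labels])))

        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # converged: no point changed cluster
        labels = new_labels

    return labels, history
```

Calling `weighted_kernel_kmeans(K, w, init_labels, k)` with a positive semidefinite `K` and positive weights `w` returns the final assignment together with a list of objective values that should not increase from one pass to the next. Which weights and kernel matrix reproduce the normalized cut objective is the subject of the paper itself and is not specified in this sketch.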

Published in

KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2004
874 pages
ISBN: 1581138881
DOI: 10.1145/1014052

Copyright © 2004 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
