Cluster Aggregate Inequality and Multi-level Hierarchical Clustering

Ding, Chris; He, Xiaofeng

doi:10.1007/11564126_12

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3721)

Included in the following conference series:

European Conference on Principles of Data Mining and Knowledge Discovery

Abstract

We show that (1) in hierarchical clustering, many linkage functions satisfy a cluster aggregate inequality, which allows an exact O(N²) multi-level implementation (using mutual nearest neighbors) of the standard O(N³) agglomerative hierarchical clustering algorithm; (2) a desirable close-friends cohesion of clusters can be translated into kNN consistency, which is guaranteed by the multi-level algorithm; and (3) for similarity-based linkage functions, the multi-level algorithm is naturally implemented as graph contraction. The effectiveness of our algorithms is demonstrated on a number of real-life applications.
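
To make the multi-level idea in point (1) concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes single linkage (minimum pairwise distance) as the linkage function and a precomputed symmetric distance matrix; the function names `mutual_nn_pairs` and `multilevel_single_linkage` are hypothetical. At each level it merges every pair of clusters that are mutual nearest neighbors at once, rather than only the single closest pair. It does not attempt to reproduce the O(N²) bookkeeping claimed in the abstract.

```python
import numpy as np


def mutual_nn_pairs(D):
    """Return pairs (i, j) with i < j that are each other's nearest neighbor."""
    M = np.array(D, dtype=float)        # copy so the caller's matrix is untouched
    np.fill_diagonal(M, np.inf)         # a cluster is never its own neighbor
    nn = M.argmin(axis=1)               # nearest neighbor of every cluster
    return [(i, int(nn[i])) for i in range(M.shape[0])
            if nn[nn[i]] == i and i < nn[i]]


def multilevel_single_linkage(D):
    """Merge all mutual-nearest-neighbor pairs per level (illustrative sketch).

    D is a symmetric distance matrix. Returns one list of clusters per level,
    each cluster being a sorted list of original point indices.
    """
    D = np.asarray(D, dtype=float)
    clusters = [[i] for i in range(D.shape[0])]
    levels = []
    while len(clusters) > 1:
        pairs = mutual_nn_pairs(D)
        paired = {i for p in pairs for i in p}
        # Groups of current cluster indices: merged pairs first, then leftovers.
        groups = [[i, j] for i, j in pairs]
        groups += [[i] for i in range(len(clusters)) if i not in paired]
        # Single linkage: distance between groups is the minimum member distance.
        k = len(groups)
        new_D = np.zeros((k, k))
        for a in range(k):
            for b in range(a + 1, k):
                d = D[np.ix_(groups[a], groups[b])].min()
                new_D[a, b] = new_D[b, a] = d
        clusters = [sorted(sum((clusters[i] for i in g), [])) for g in groups]
        D = new_D
        levels.append(clusters)
    return levels
```

For five points on a line at 0, 1, 3, 3.5 and 10, the first level merges {0, 1} and {2, 3} simultaneously, the second level joins those two clusters, and the third absorbs the outlier. This is the same hierarchy that one-pair-at-a-time single linkage produces, reached in fewer passes.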



Author information

Authors and Affiliations

Lawrence Berkeley National Laboratory, Berkeley, California, 94720, USA
Chris Ding & Xiaofeng He

Editor information

Editors and Affiliations

LIACC/FEP, Universidade do Porto, Portugal
Alípio Mário Jorge
LIAAD-INESC Porto LA / FEP, University of Porto, R. de Ceuta, 118, 6, 4050-190, Porto, Portugal
Luís Torgo
LIAAD-INESC Porto L.A./Faculty of Economics, University of Porto, Rua de Ceuta, 118-6, 4050-190, Porto, Portugal
Pavel Brazdil
Faculdade de Engenharia & LIAAD, Universidade do Porto, Portugal
Rui Camacho
Faculty of Economics of the University of Porto, Portugal
João Gama

About this paper

Cite this paper

Ding, C., He, X. (2005). Cluster Aggregate Inequality and Multi-level Hierarchical Clustering. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds) Knowledge Discovery in Databases: PKDD 2005. PKDD 2005. Lecture Notes in Computer Science, vol 3721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564126_12

DOI: https://doi.org/10.1007/11564126_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29244-9
Online ISBN: 978-3-540-31665-7
eBook Packages: Computer Science, Computer Science (R0)
