Assessing Cluster Quality Using Multiple Measures - A Decision Tree Based Approach

Osei-Bryson, Kweku-Muata

doi:10.1007/0-387-23529-9_24

Part of the book series: Operations Research/Computer Science Interfaces Series (ORCS, volume 29)

Abstract

Clustering is a popular data mining technique with applications in many areas. Although there are many clustering algorithms, none of them is superior on all datasets. Typically, these algorithms provide summary statistics on the generated set of clusters but do not provide easily interpretable, detailed descriptions of those clusters. Further, for a given dataset, different algorithms may give different sets of clusters, so it is never clear which algorithm and which parameter settings are the most appropriate. In this paper we propose a decision tree (DT) based approach that uses multiple performance measures to indirectly assess cluster quality in order to determine the most appropriate set of clusters.
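The general idea of assessing a clustering indirectly through a decision tree can be illustrated with a minimal sketch. It is not the chapter's actual procedure: the abstract does not name the performance measures, so the two used here (training accuracy of a tree that predicts cluster labels, and the number of leaves as a simplicity measure) are assumptions, as are the toy data, the two candidate clusterings, and every function name.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of cluster labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a small binary tree; a leaf is a label, a node is (feature, threshold, low, high)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    for f in range(len(rows[0])):
        # Every observed value except the largest is a valid threshold.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # all rows identical, nothing to split on
        return majority
    _, f, t, left, right = best
    low = build_tree([rows[i] for i in left], [labels[i] for i in left], depth + 1, max_depth)
    high = build_tree([rows[i] for i in right], [labels[i] for i in right], depth + 1, max_depth)
    return (f, t, low, high)

def predict(tree, row):
    """Follow the splits until a leaf (a plain label) is reached."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

def count_leaves(tree):
    """Number of leaves, used here as an assumed simplicity measure."""
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])

def assess(rows, labels, max_depth=3):
    """Fit a tree to the cluster labels and report two assumed measures."""
    tree = build_tree(rows, labels, max_depth=max_depth)
    hits = sum(predict(tree, r) == y for r, y in zip(rows, labels))
    return {"accuracy": hits / len(rows), "leaves": count_leaves(tree)}

# Hypothetical toy data with two hypothetical candidate clusterings.
rows = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
candidate_a = [0, 0, 0, 1, 1, 1]   # follows the two visible groups
candidate_b = [0, 1, 0, 1, 0, 1]   # arbitrary assignment

for name, labels in [("A", candidate_a), ("B", candidate_b)]:
    print(name, assess(rows, labels))
```

In this sketch a clustering that a small tree can reproduce accurately with few leaves is treated as easier to describe. How several such measures should be weighed against each other is left open here.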

Author information

Authors and Affiliations

Department of Information Systems & The Information Systems Research Institute, Virginia Commonwealth University, Richmond, VA, 23284
Kweku-Muata Osei-Bryson

Editor information

Editors and Affiliations

University of Maryland, USA
Bruce Golden & S. Raghavan
American University, USA
Edward Wasil

About this paper

Cite this paper

Osei-Bryson, KM. (2005). Assessing Cluster Quality Using Multiple Measures - A Decision Tree Based Approach. In: Golden, B., Raghavan, S., Wasil, E. (eds) The Next Wave in Computing, Optimization, and Decision Technologies. Operations Research/Computer Science Interfaces Series, vol 29. Springer, Boston, MA. https://doi.org/10.1007/0-387-23529-9_24

DOI: https://doi.org/10.1007/0-387-23529-9_24
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-23528-8
Online ISBN: 978-0-387-23529-5
eBook Packages: Computer Science, Computer Science (R0)
