Abstract
After analyzing previous methods for determining the number of clusters in probability-based clustering, this paper introduces an improved Monte Carlo Cross-Validation algorithm (iMCCV) that attempts to solve the posterior-probability spread problem, which the original Monte Carlo Cross-Validation algorithm cannot resolve. We further present a hybrid approach that determines the number of clusters by combining the iMCCV algorithm with parallel-coordinates visualization. The efficiency of the approach is discussed using experimental results.
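For orientation, the baseline Monte Carlo Cross-Validation procedure that the abstract refers to can be sketched as follows: repeatedly split the data at random into training and test parts, fit a mixture model with each candidate number of clusters on the training part, and keep the candidate with the highest average held-out log-likelihood. This is only a minimal sketch of that baseline idea, not the iMCCV algorithm, whose details the abstract does not give. It assumes one-dimensional data and Gaussian components fitted by EM; the function names (`fit_gmm_1d`, `mccv_select`), the quantile initialisation, the number of splits, and the test fraction of 0.5 are all illustrative assumptions.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(data, k, iters=30):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialisation: place the k means at evenly spaced quantiles.
    means = [ordered[int((2 * j + 1) * n / (2 * k))] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, 1e-6)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior membership probabilities of each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid collapsed components
    return weights, means, variances

def log_likelihood(data, params):
    """Total log-likelihood of the data under a fitted mixture."""
    weights, means, variances = params
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(density, 1e-300))
    return total

def mccv_select(data, k_values, n_splits=10, test_fraction=0.5, seed=0):
    """Return (best_k, scores), where scores[k] is the mean held-out log-likelihood."""
    rng = random.Random(seed)
    totals = {k: 0.0 for k in k_values}
    n_test = int(len(data) * test_fraction)
    for _ in range(n_splits):
        shuffled = list(data)
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        for k in k_values:
            totals[k] += log_likelihood(test, fit_gmm_1d(train, k))
    scores = {k: totals[k] / n_splits for k in k_values}
    best_k = max(scores, key=scores.get)
    return best_k, scores
```

Calling `mccv_select(data, [1, 2, 3, 4])` on a list of floats returns the selected number of clusters together with the per-candidate scores. Averaging over many random splits is what distinguishes this scheme from a single train/test split; the held-out likelihood penalises candidates that overfit the training part.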
References
Blake, C.L., Merz, C.J.: UCI repository of machine learning databases, University of California, Irvine, Dept. of Information and Computer Sciences (1998), http://www.ics.uci.edu/~mlearn/MLRepository.html
Cheeseman, P., Stutz, J.: Bayesian Classification (AutoClass): Theory and Results. In: Advances in Knowledge Discovery and Data Mining, pp. 153–180. AAAI Press/MIT Press (1995)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological) 39(1), 1–38 (1977)
Fraley, C., Raftery, A.E.: How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. Computer Journal 41, 578–588 (1998)
Hand, D., Mannila, H., Smyth, P.: Principles of Data Mining. Massachusetts Institute of Technology (2001)
Inselberg, A., Dimsdale, B.: Parallel Coordinates: A Tool for Visualizing Multidimensional Geometry. In: Proceedings of the First Conference on Visualization, San Francisco, California, pp. 361–378 (1990)
Smyth, P.: Clustering using Monte Carlo Cross-Validation. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD 1996), pp. 126–133. AAAI Press, Menlo Park (1996)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dai, T., Li, C., Sun, J. (2004). Determining the Number of Probability-Based Clustering: A Hybrid Approach. In: Chi, CH., Lam, KY. (eds) Content Computing. AWCC 2004. Lecture Notes in Computer Science, vol 3309. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30483-8_51
DOI: https://doi.org/10.1007/978-3-540-30483-8_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23898-0
Online ISBN: 978-3-540-30483-8
eBook Packages: Springer Book Archive