Topological Comparisons of Proximity Measures

Zighed, Djamel Abdelkader; Abdesselam, Rafik; Hadgu, Asmelash

doi:10.1007/978-3-642-30217-6_32

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7301)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

In many fields of application, the choice of proximity measure directly affects the results of data mining methods, whatever the task: clustering, comparing, or structuring a set of objects. In such fields, the user generally has to choose one proximity measure from many possible alternatives. Under a given notion of equivalence, such as the one based on pre-ordering, certain proximity measures are more or less equivalent, meaning that they should produce almost the same results. This information on equivalence can help in choosing a measure. However, the O(n⁴) complexity of the pre-ordering approach makes it intractable when the sample size n exceeds a few hundred. To cope with this limitation, we propose a new approach with lower complexity, O(n²). It is based on topological equivalence and exploits the concept of local neighbors: two proximity measures are considered equivalent if they induce the same neighborhood structure on the objects. We illustrate our approach by considering 13 proximity measures used on datasets with continuous attributes.
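
The contrast between the two notions of equivalence can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say which neighborhood structure or agreement score is used, so a k-nearest-neighbour graph and a simple ratio of matching adjacency entries are assumed here, and the pre-ordering check is assumed to compare how the two measures order every pair of object pairs. All function names (`knn_adjacency`, `topological_similarity`, `preorder_agreement`) are hypothetical.

```python
import heapq

def pairwise_distances(X, measure):
    """n x n table of proximities between the objects in X."""
    return [[measure(x, y) for y in X] for x in X]

def knn_adjacency(D, k):
    """Assumed neighborhood structure: A[i][j] = 1 if j is one of the
    k nearest neighbours of i under the proximity table D."""
    n = len(D)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        others = (j for j in range(n) if j != i)
        for j in heapq.nsmallest(k, others, key=lambda j: D[i][j]):
            A[i][j] = 1
    return A

def topological_similarity(A1, A2):
    """Fraction of the n*n adjacency entries on which two graphs agree.
    The comparison itself visits each entry once: O(n^2)."""
    n = len(A1)
    agree = sum(A1[i][j] == A2[i][j] for i in range(n) for j in range(n))
    return agree / (n * n)

def preorder_agreement(D1, D2):
    """Fraction of pairs of object pairs ((i, j), (k, l)) that both
    measures order the same way. Four nested indices: O(n^4)."""
    n = len(D1)
    idx = range(n)
    agree = sum(
        (D1[i][j] <= D1[k][l]) == (D2[i][j] <= D2[k][l])
        for i in idx for j in idx for k in idx for l in idx
    )
    return agree / n ** 4

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

# Three toy objects with two continuous attributes (made-up data).
X = [(0.0, 0.0), (4.0, 3.0), (5.0, 0.0)]
D1 = pairwise_distances(X, manhattan)
D2 = pairwise_distances(X, chebyshev)
A_manhattan = knn_adjacency(D1, k=1)
A_chebyshev = knn_adjacency(D2, k=1)

print("topological similarity:", topological_similarity(A_manhattan, A_chebyshev))
print("pre-order agreement:   ", preorder_agreement(D1, D2))
```

On this toy data the two measures disagree only on the nearest neighbour of the first object, so both scores fall below 1. In this sketch only the graph comparison is O(n²); building the k-nearest-neighbour graphs with `heapq.nsmallest` costs slightly more, and a different neighborhood graph would have its own construction cost.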

Author information

Authors and Affiliations

Department of Computer Science and Statistics, ERIC laboratory, University Lumière of Lyon 2, Campus Porte des Alpes, France
Djamel Abdelkader Zighed, Rafik Abdesselam & Asmelash Hadgu

Editor information

Editors and Affiliations

Department of Computer Science, Michigan State University, 428 S. Shaw Lane, 48824-1226, East Lansing, MI, USA
Pang-Ning Tan
School of Information Technologies, University of Sydney, 1 Cleveland St., 2006, Sydney, NSW, Australia
Sanjay Chawla
Faculty of Computing and Informatics, Jalan Multimedia, Multimedia University, 63100, Cyberjaya, Selangor, Malaysia
Chin Kuan Ho
Department of Computing and Information Systems, The University of Melbourne, 111 Barry Street, 3053, Melbourne, VIC, Australia
James Bailey

About this paper

Cite this paper

Zighed, D.A., Abdesselam, R., Hadgu, A. (2012). Topological Comparisons of Proximity Measures. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol 7301. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30217-6_32

DOI: https://doi.org/10.1007/978-3-642-30217-6_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30216-9
Online ISBN: 978-3-642-30217-6
eBook Packages: Computer Science, Computer Science (R0)