Abstract
Properties are used to describe entities, a part of which are likely to be clustered together to constitute an aspect. For example, first name, middle name and last name are usually gathered to describe a person’s name. However, existing automated approaches to property clustering remain far from satisfactory for an open domain like Linked Data. In this paper, we firstly investigated the relatedness between properties using five different measures. Then, we employed three clustering algorithms and two combination methods for property clustering. Based on a moderate-sized sample of Linked Data, we empirically studied the property clustering in Linked Data and found that a proper combination of different measures gave rise to the best result. Additionally, we showed how the property clustering can improve user experience in our entity browsing system.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
References
Ailon, N., Charikar, M., Newman, A.: Aggregating inconsistent information: ranking and clustering. J. ACM 55(5), 23 (2008)
Abedjan, Z., Naumann, F.: Synonym analysis for predicate expansion. In: Cimiano, P., Corcho, O., Presutti, V., Hollink, L., Rudolph, S. (eds.) ESWC 2013. LNCS, vol. 7882, pp. 140–154. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38288-8_10
Budanitsky, A., Hirst, G.: Evaluating WordNet-based measures of lexical semantic relatedness. Comput. Linguist. 32(1), 13–47 (2006)
Cheng, G., Gong, S., Qu, Y.: An empirical study of vocabulary relatedness and its application to recommender systems. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011, Part I. LNCS, vol. 7031, pp. 98–113. Springer, Heidelberg (2011). doi:10.1007/978-3-642-25073-6_7
Evert, S.: Corpora and collocations. In: Lüdeling, L., Kytö, M. (eds.) Corpus Linguistics: An International Handbook, pp. 1212–1248. Mouton de Gruyter, Berlin (2008)
Fleiss, J.: Measuring nominal scale agreement among many raters. Psychol. Bull. 76(5), 378–382 (1971)
Gracia, J., Mena, E.: Web-based measure of semantic relatedness. In: Bailey, J., Maier, D., Schewe, K.-D., Thalheim, B., Wang, X.S. (eds.) WISE 2008. LNCS, vol. 5175, pp. 136–150. Springer, Heidelberg (2008). doi:10.1007/978-3-540-85481-4_12
Hearst, M.: Clustering versus faceted categories for information exploration. Commun. ACM 49(4), 59–61 (2006)
Hu, W., Jia, C.: A bootstrapping approach to entity linkage on the semantic web. J. Web Semant. 34, 1–12 (2015)
Isele, R., Bizer, C.: Active learning of expressive linkage rules using genetic programming. J. Web Semant. 23, 2–15 (2013)
Lin, D.: An information-theoretic definition of similarity. In: ICML 1998, pp. 296–304. Morgan Kaufmann, San Francisco (1998)
Oren, E., Delbru, R., Decker, S.: Extending faceted navigation for RDF data. In: Cruz, I., Decker, S., Allemang, D., Preist, C., Schwabe, D., Mika, P., Uschold, M., Aroyo, L.M. (eds.) ISWC 2006. LNCS, vol. 4273, pp. 559–572. Springer, Heidelberg (2006). doi:10.1007/11926078_40
Quan, D., Karger, D.: How to make a semantic web browser. In: WWW 2004, pp. 255–265. ACM, New York (2004)
Shvaiko, P., Euzenat, J.: Ontology matching: state of the art and future challenges. IEEE Trans. Knowl. Data Eng. 25(1), 158–176 (2013)
Smith, T., Frank, E.: Introducing machine learning concepts with WEKA. In: Mathé, E., Davis, S. (eds.) Statistical Genomics, pp. 353–378. Springer, Heidelberg (2016)
Stoilos, G., Stamou, G., Kollias, S.: A string metric for ontology alignment. In: Gil, Y., Motta, E., Benjamins, V.R., Musen, M.A. (eds.) ISWC 2005. LNCS, vol. 3729, pp. 624–637. Springer, Heidelberg (2005). doi:10.1007/11574620_45
Wagner, S., Wagner, D.: Comparing clusterings: an overview. Universität Karlsruhe, Fakultät für Informatik (2007)
Zhang, Z., Gentile, A.L., Blomqvist, E., Augenstein, I., Ciravegna, F.: Statistical knowledge patterns: identifying synonymous relations in large linked datasets. In: Alani, H., et al. (eds.) ISWC 2013, Part I. LNCS, vol. 8218, pp. 703–719. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41335-3_44
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61370019). We appreciate our students’ participation in the experiments.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Gong, S., Li, H., Hu, W., Qu, Y. (2016). An Empirical Study on Property Clustering in Linked Data. In: Li, YF., et al. Semantic Technology. JIST 2016. Lecture Notes in Computer Science(), vol 10055. Springer, Cham. https://doi.org/10.1007/978-3-319-50112-3_6
Download citation
DOI: https://doi.org/10.1007/978-3-319-50112-3_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50111-6
Online ISBN: 978-3-319-50112-3
eBook Packages: Computer ScienceComputer Science (R0)