An Empirical Study on Property Clustering in Linked Data

Gong, Saisai; Li, Haoxuan; Hu, Wei; Qu, Yuzhong

doi:10.1007/978-3-319-50112-3_6

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10055)

Included in the following conference series:

Joint International Semantic Technology Conference

Abstract

Properties are used to describe entities, and some of them are likely to be clustered together to constitute an aspect. For example, first name, middle name and last name are usually gathered to describe a person’s name. However, existing automated approaches to property clustering remain far from satisfactory for an open domain like Linked Data. In this paper, we first investigated the relatedness between properties using five different measures. We then employed three clustering algorithms and two combination methods for property clustering. Based on a moderate-sized sample of Linked Data, we empirically studied property clustering and found that a proper combination of different measures gave the best result. Additionally, we showed how property clustering can improve the user experience in our entity browsing system.
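The general idea of combining relatedness measures before clustering can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not name the five measures, the three algorithms or the two combination methods. The sketch assumes two stand-in measures (a Jaccard overlap of the entities each property describes, and a Jaccard overlap of label tokens), a weighted average as the combination, and a simple single-link threshold clustering. All function names, the toy data, the weight and the threshold are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_relatedness(p, q, usage, labels, weight=0.5):
    """Weighted average of two illustrative measures (assumed, not the paper's)."""
    co_occurrence = jaccard(usage[p], usage[q])  # entities described by both properties
    lexical = jaccard(labels[p], labels[q])      # label tokens shared by both properties
    return weight * co_occurrence + (1 - weight) * lexical


def cluster_properties(usage, labels, threshold=0.4, weight=0.5):
    """Single-link clustering: merge every pair whose score reaches the threshold."""
    parent = {p: p for p in usage}

    def find(p):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p, q in combinations(sorted(usage), 2):
        if combined_relatedness(p, q, usage, labels, weight) >= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for p in usage:
        clusters.setdefault(find(p), set()).add(p)
    return sorted(sorted(members) for members in clusters.values())


# Toy data (hypothetical): which entities use each property, and its label tokens.
usage = {
    "firstName": {"e1", "e2", "e3"},
    "middleName": {"e1", "e2"},
    "lastName": {"e1", "e2", "e3"},
    "population": {"e4", "e5"},
}
labels = {
    "firstName": {"first", "name"},
    "middleName": {"middle", "name"},
    "lastName": {"last", "name"},
    "population": {"population"},
}

clusters = cluster_properties(usage, labels)
print(clusters)
```

On this toy input the three name properties end up in one cluster and `population` stays on its own, which mirrors the person's-name example in the abstract. Raising the threshold makes merges rarer, while the weight shifts the balance between the two stand-in measures.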

Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61370019). We appreciate our students’ participation in the experiments.

Author information

Authors and Affiliations

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China
Saisai Gong, Haoxuan Li, Wei Hu & Yuzhong Qu

Corresponding author

Correspondence to Wei Hu.

Editor information

Editors and Affiliations

Information Technology, Monash University, Melbourne, Victoria, Australia
Yuan-Fang Li
Computer Science and Technology, Nanjing University, Nanjing, China
Wei Hu
Computer Science, National University of Singapore, Singapore, Singapore
Jin Song Dong
University of Huddersfield, Huddersfield, United Kingdom
Grigoris Antoniou
Information and Communication Technology, Griffith University, Brisbane, Queensland, Australia
Zhe Wang
ISTD, Singapore University of Technology and Design, Singapore, Singapore
Jun Sun
Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
Yang Liu

About this paper

Cite this paper

Gong, S., Li, H., Hu, W., Qu, Y. (2016). An Empirical Study on Property Clustering in Linked Data. In: Li, YF., et al. Semantic Technology. JIST 2016. Lecture Notes in Computer Science, vol 10055. Springer, Cham. https://doi.org/10.1007/978-3-319-50112-3_6

DOI: https://doi.org/10.1007/978-3-319-50112-3_6
Published: 27 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50111-6
Online ISBN: 978-3-319-50112-3
eBook Packages: Computer Science, Computer Science (R0)
