
An Empirical Study on Property Clustering in Linked Data

  • Conference paper
Semantic Technology (JIST 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10055)

Abstract

Properties are used to describe entities, and some of them are likely to be clustered together to constitute an aspect. For example, first name, middle name and last name are usually grouped to describe a person’s name. However, existing automated approaches to property clustering remain far from satisfactory for an open domain like Linked Data. In this paper, we first investigated the relatedness between properties using five different measures. We then employed three clustering algorithms and two combination methods for property clustering. Based on a moderate-sized sample of Linked Data, we empirically studied property clustering in Linked Data and found that a proper combination of different measures gave the best result. Additionally, we showed how property clustering can improve the user experience in our entity browsing system.
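To make the idea in the abstract concrete, the following is a minimal sketch of property clustering on toy data. It is not the authors' implementation. The two relatedness measures (word overlap between property labels and co-occurrence of properties on the same entities), the equal-weight averaging, the threshold of 0.4, and the single-linkage grouping are all illustrative assumptions; the abstract does not name the five measures, three algorithms, or two combination methods that were studied.

```python
from itertools import combinations


def label_overlap(a, b):
    """Assumed label-based measure: Jaccard overlap of the words in two labels."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)


def cooccurrence(a, b, entities):
    """Assumed usage-based measure: Jaccard overlap of the entities using each property."""
    ea = {i for i, props in enumerate(entities) if a in props}
    eb = {i for i, props in enumerate(entities) if b in props}
    union = ea | eb
    return len(ea & eb) / len(union) if union else 0.0


def combined(a, b, entities, weight=0.5):
    """Hypothetical combination: weighted average of the two measures."""
    return weight * label_overlap(a, b) + (1 - weight) * cooccurrence(a, b, entities)


def cluster(properties, entities, threshold=0.4):
    """Single-linkage grouping: link every pair scoring >= threshold (union-find)."""
    parent = {p: p for p in properties}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for a, b in combinations(properties, 2):
        if combined(a, b, entities) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for p in properties:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


# Toy entity descriptions (invented for illustration, not from the paper's data set).
entities = [
    {"first name", "middle name", "last name", "birth date"},
    {"first name", "middle name", "last name"},
    {"first name", "last name", "birth date"},
    {"population", "area"},
    {"population", "area"},
]
properties = ["first name", "middle name", "last name",
              "birth date", "population", "area"]

clusters = cluster(properties, entities)
print(clusters)
```

On this toy input the three name properties end up in one cluster, "population" and "area" in another, and "birth date" stays on its own. Neither measure alone links all of these pairs at the chosen threshold: label overlap says nothing about "population" and "area", while co-occurrence alone would also pull "birth date" toward the name properties.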


Notes

  1. http://ws.nju.edu.cn/sview.

  2. http://km.aifb.kit.edu/projects/btc-2011/.

  3. http://ws.nju.edu.cn/sview/propcluster.zip.

  4. https://developers.google.com/freebase/guide/basic_concepts.

  5. http://dbpedia.org/resource/The_Pentagon.

  6. http://www.wikidata.org.

  7. http://www.freebase.com.

  8. http://tools.wmflabs.org/reasonator.


Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61370019). We appreciate our students’ participation in the experiments.

Corresponding author

Correspondence to Wei Hu.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Gong, S., Li, H., Hu, W., Qu, Y. (2016). An Empirical Study on Property Clustering in Linked Data. In: Li, YF., et al. Semantic Technology. JIST 2016. Lecture Notes in Computer Science, vol 10055. Springer, Cham. https://doi.org/10.1007/978-3-319-50112-3_6

  • DOI: https://doi.org/10.1007/978-3-319-50112-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50111-6

  • Online ISBN: 978-3-319-50112-3

  • eBook Packages: Computer Science, Computer Science (R0)
