DOI: 10.1145/3148011.3148020
research-article

Repairing Hidden Links in Linked Data: Enhancing the quality of RDF knowledge graphs

Published: 04 December 2017

ABSTRACT

Knowledge Graphs (KGs) are becoming core components of most artificial intelligence applications. Linked Data, as a method of publishing KGs, allows applications to traverse within, and even out of, the graph thanks to global, dereferenceable identifiers, in the form of IRIs, that denote entities. However, as we show in this work after analyzing several popular datasets (namely DBpedia, LOD Cache, and Web Data Commons JSON-LD data), many entities are represented by literal strings where IRIs should be used, which diminishes the advantages of using Linked Data. To remedy this, we propose an approach for identifying such strings and replacing them with their corresponding entity IRIs. The approach identifies relations between entities using both ontological axioms and data profiling information, and converts strings to entity IRIs based on the types of the entities linked by each relation. It achieved 98% recall and 76% precision in identifying such strings, and 97% precision in converting them to their corresponding IRIs in the considered KG. We also analyzed how the connectivity of the KG increases when the new links produced by our method are added to the entities. Our experiments on a subset of the Spanish DBpedia data show that the method could add 25% more links to the KG and improve overall connectivity by 17%.
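
The abstract does not give implementation details, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that each relation's expected object type is already known from a declared range (an ontological axiom), that entities carry a label, and that a string is replaced only when exactly one entity of the expected type has a matching label. The data-profiling step is omitted. All IRIs, property names, labels, and helper names (`Lit`, `RANGES`, `build_label_index`, `repair`) are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
RDFS_LABEL = "rdfs:label"


class Lit(str):
    """Marks an object value as a literal string rather than an IRI."""


# Assumed ontological axioms: property -> class expected in object position.
RANGES = {"ex:birthPlace": "ex:City", "ex:author": "ex:Person"}

# Toy graph; every IRI and label here is invented for illustration.
TRIPLES = [
    ("ex:springfield", RDF_TYPE, "ex:City"),
    ("ex:springfield", RDFS_LABEL, Lit("Springfield")),
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", RDFS_LABEL, Lit("Alice Example")),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:bob", RDFS_LABEL, Lit("Bob Example")),
    ("ex:bob", "ex:birthPlace", Lit("Springfield")),     # string hiding a link
    ("ex:book1", "ex:author", Lit("alice example ")),    # string hiding a link
    ("ex:book1", "ex:author", Lit("Unknown Writer")),    # no matching entity
    ("ex:book1", "ex:pages", Lit("320")),                # genuine literal
]


def build_label_index(triples):
    """Map (class, normalized label) to the set of entity IRIs carrying it."""
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
    index = defaultdict(set)
    for s, p, o in triples:
        if p == RDFS_LABEL:
            for cls in types[s]:
                index[(cls, o.strip().lower())].add(s)
    return index


def repair(triples, ranges):
    """Replace literal objects with IRIs when the match is unambiguous."""
    index = build_label_index(triples)
    repaired, added = [], 0
    for s, p, o in triples:
        expected = ranges.get(p)
        # Only properties with an entity-typed range are candidates.
        if expected is not None and isinstance(o, Lit):
            candidates = index.get((expected, o.strip().lower()), set())
            if len(candidates) == 1:
                o = next(iter(candidates))  # the literal becomes an IRI
                added += 1
        repaired.append((s, p, o))
    return repaired, added


repaired, added = repair(TRIPLES, RANGES)
print(added, "literal(s) replaced by IRIs")
```

On this toy graph, two strings are replaced by IRIs, while the unmatched author name and the page count stay literals. Requiring a unique candidate is a conservative design choice in this sketch that favors precision over recall.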

Published in

K-CAP '17: Proceedings of the 9th Knowledge Capture Conference
December 2017
271 pages
ISBN: 9781450355537
DOI: 10.1145/3148011

        Copyright © 2017 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

Acceptance Rates

Overall Acceptance Rate: 55 of 198 submissions, 28%
