DOI: 10.1145/1645953.1646084
research-article

A framework for semantic link discovery over relational data

Published: 02 November 2009

ABSTRACT

Discovering links between data items, whether within a single data source or across different data sources, is a challenging problem faced by many information systems today. In particular, the recent Linking Open Data (LOD) community project has highlighted the importance of establishing semantic links among web data sources. Currently, LOD sources provide billions of RDF triples, but only millions of links between data sources. Many of these data sources are published using tools that operate over relational data stored in a standard RDBMS. In this paper, we present a framework for discovering semantic links from relational data. The framework is based on a user's declarative specification of linkage requirements. We illustrate its use with several link discovery algorithms in a real-world scenario. The framework allows data publishers to easily find and publish high-quality links to other data sources, and could therefore significantly enhance the value of data in the next generation of the web.
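
To make the idea of a declarative linkage specification concrete, the following is a minimal sketch and not the paper's own language or algorithms. It assumes a hypothetical specification format (`LINK_SPEC`), hypothetical table and column names (`trials`, `concepts`), and uses Jaccard similarity over character q-grams as one possible string-matching measure. The user states which columns to compare and the threshold to apply; a generic routine then produces the candidate links.

```python
from itertools import product


def qgrams(text, q=3):
    """Return the set of lowercase character q-grams of text, padded with '#'."""
    padded = "#" * (q - 1) + text.lower() + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def jaccard(a, b):
    """Jaccard similarity of the q-gram sets of two strings (0.0 to 1.0)."""
    ga, gb = qgrams(a), qgrams(b)
    return len(ga & gb) / len(ga | gb)


# Hypothetical declarative specification: what to link, how, and how strictly.
LINK_SPEC = {
    "source": ("trials", "condition"),   # (table, column) -- assumed names
    "target": ("concepts", "label"),     # (table, column) -- assumed names
    "measure": jaccard,
    "threshold": 0.5,
}


def discover_links(tables, spec):
    """Compare every source row with every target row and keep pairs
    whose similarity meets the threshold. Quadratic; for illustration only."""
    src_table, src_col = spec["source"]
    tgt_table, tgt_col = spec["target"]
    links = []
    for s, t in product(tables[src_table], tables[tgt_table]):
        score = spec["measure"](s[src_col], t[tgt_col])
        if score >= spec["threshold"]:
            links.append((s["id"], t["id"], round(score, 2)))
    return links


# Toy rows standing in for relational tables (invented for this sketch).
tables = {
    "trials": [
        {"id": "t1", "condition": "Diabetes Mellitus"},
        {"id": "t2", "condition": "Asthma"},
    ],
    "concepts": [
        {"id": "c1", "label": "diabetes mellitus"},
        {"id": "c2", "label": "asthma, bronchial"},
    ],
}

links = discover_links(tables, LINK_SPEC)
```

In this toy data only the case-insensitive exact match (`t1`, `c1`) clears the 0.5 threshold. Lowering the threshold or swapping the `measure` entry changes the result without touching the matching routine, which is the appeal of keeping the requirements declarative.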


Published in

CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
November 2009
2162 pages
ISBN: 9781605585123
DOI: 10.1145/1645953

Copyright © 2009 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
