DOI: 10.1145/2766462.2767734
Research article

Relevance Scores for Triples from Type-Like Relations

Published: 09 August 2015

ABSTRACT

We compute and evaluate relevance scores for knowledge-base triples from type-like relations. Such a score measures the degree to which an entity "belongs" to a type. For example, Quentin Tarantino has various professions, including Film Director, Screenwriter, and Actor. The first two would get a high score in our setting, because those are his main professions. The third would get a low score, because he mostly had cameo appearances in his own movies. Such scores are essential when ranking results for entity queries, e.g., "American actors" or "Quentin Tarantino professions". These scores are different from scores for "correctness" or "accuracy" (all three professions above are correct and accurate). We propose a variety of algorithms to compute these scores. For our evaluation, we designed a new benchmark that includes a ground truth based on about 14K human judgments obtained via crowdsourcing. Inter-judge agreement is slightly over 90%. Existing approaches from the literature give results far from the optimum. Our best algorithms achieve an agreement of about 80% with the ground truth.
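The abstract does not spell out the proposed algorithms. As a minimal illustration of what a type-relevance score could look like, here is a toy count-based sketch: score each type by the fraction of an entity's mentions associated with it. All names and counts below are invented for illustration and are not from the paper.

```python
# Hypothetical sketch: estimate how strongly an entity "belongs" to each type
# by normalizing per-type mention counts into [0, 1] relevance scores.
# The counts are invented for illustration only.

def relevance_scores(mention_counts):
    """Normalize per-type mention counts into relevance scores that sum to 1."""
    total = sum(mention_counts.values())
    if total == 0:
        return {t: 0.0 for t in mention_counts}
    return {t: c / total for t, c in mention_counts.items()}

# Invented counts of text mentions pairing Quentin Tarantino with each profession.
counts = {"Film Director": 120, "Screenwriter": 90, "Actor": 15}
scores = relevance_scores(counts)

# Rank professions by relevance, highest first.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Under these invented counts, Film Director and Screenwriter dominate while Actor gets a low score, matching the behavior the abstract describes for the Tarantino example.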


    • Published in

      cover image ACM Conferences
      SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
      August 2015
      1198 pages
      ISBN:9781450336215
      DOI:10.1145/2766462

      Copyright © 2015 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


      Acceptance Rates

SIGIR '15 paper acceptance rate: 70 of 351 submissions, 20%. Overall acceptance rate: 792 of 3,983 submissions, 20%.
